To use these scripts you will need the aggregated Open Repair Data as well as the data in the dat directory.
A dataset comprising records that conform to the Open Repair Data Standard.
Download the core data from the Open Repair Alliance (ORA) and unzip the files into the dat/ords directory.
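As a quick check that the files landed in the right place, the following Python sketch lists the CSV files in dat/ords and reads the header row of each. The helper name `list_ords_files` is hypothetical and not part of this repository, and the file names you see will depend on the ORA release you downloaded.

```python
import csv
from pathlib import Path


def list_ords_files(directory="dat/ords"):
    """Return a dict mapping each CSV file name in `directory` to its header row.

    Hypothetical helper for illustration; not part of this repository.
    """
    headers = {}
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            # An empty file yields an empty header list instead of raising.
            headers[path.name] = next(reader, [])
    return headers
```

Calling `list_ords_files()` from the repository root should return one entry per unzipped CSV file. An empty dict suggests the archive was unzipped somewhere else.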
The data is recorded by volunteers and participants at community repair events and collected by networks of organisers who are members and allies of the Open Repair Alliance. It comes from hundreds of community groups in over two dozen countries, with multiple languages represented.
Once or twice a year The Restart Project gathers the contributed data and transforms it to comply with the Open Repair Data Standard. The results are published as Open Data, freely available for download by anyone. The data and the standard are held in a public Git repository.
To get an idea of the effort and challenges involved in recording repair data at live community events, take a look at Repair data collection tips and tools.
Some of the resources available to community repairers and data collectors:
Each item in our database represents a citizen who took hours out of their life to learn what went wrong with their device, and to learn how to fix it. This makes our data more powerful than any petition or online complaint.
See the Quests README.
These files contain data used by the scripts in this repository.
consumer_electronics_timeline.csv
Timeline of electronic inventions and commercial availability related to ORDS product categories.
iso_country_codes.csv
List of 3-character codes and country short names.
ords_lang_training_stopwords.txt
List of common abbreviations to filter out when training the repair language model.
ords_problem_translations.csv
Work-in-progress to translate the entire ORDS dataset into each of English, Dutch, German, French, Spanish, Danish and Italian.
ords_product_category_item_type_translations.csv
List of some of the device types per product_category. Translations in English, Dutch, German and French.
ords_product_category_unu_key_map.csv
Maps the ORDS product_category values to a set known as “UNU Keys” produced by UNITAR (United Nations Institute for Training and Research). These keys are commonly found in academic papers dealing with e-waste and trickle out to wider reports and news stories, e.g. OF 16 BILLION MOBILE PHONES POSSESSED WORLDWIDE, 5.3 BILLION WILL BECOME WASTE IN 2022. Note that UNITAR is reviewing their codes as part of a 4-year project due to conclude in December 2023.
The UNU average weights (kg) in the file are taken from E-waste statistics: Guidelines on classifications, reporting and indicators.
Not all ORDS categories can be directly mapped to other sets of product categories. Combining category maps with queries to find item types could produce better results.
The Restart Project compiles emissions and waste data that can be mapped to ORDS and UNU categories. For details, contact them directly.
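The mapping can be applied with a plain dictionary lookup, as in the minimal Python sketch below. The column name `unu_key`, both function names and the sample values (`X1`, `X2`) are assumptions made up for illustration; check the header of ords_product_category_unu_key_map.csv for the real column names. Categories without a direct mapping come back as `None`, which is where an item-type query could take over.

```python
import csv
import io


def load_category_map(handle, source_col="product_category", target_col="unu_key"):
    """Build a dict from ORDS category to UNU key from an open CSV file.

    The column names are assumptions; adjust them to the real header.
    """
    return {row[source_col]: row[target_col] for row in csv.DictReader(handle)}


def map_records(records, category_map):
    """Attach a UNU key to each record; unmapped categories get None."""
    return [
        {**record, "unu_key": category_map.get(record["product_category"])}
        for record in records
    ]


# Tiny in-memory sample with made-up values, standing in for the real files.
sample_map = io.StringIO("product_category,unu_key\nLaptop,X1\nToaster,X2\n")
category_map = load_category_map(sample_map)
records = [
    {"id": "r1", "product_category": "Laptop"},
    {"id": "r2", "product_category": "Misc"},
]
mapped = map_records(records, category_map)
```

With the real file, pass an open file handle to `load_category_map` instead of the `io.StringIO` sample.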
ords_testdata_common_products.csv
A list of common item types found in the ORDS data. Mostly English.
ords_testdata_multilingual_products.csv
A list of common item types found in the ORDS data. English, Dutch, French and German with a little Italian and Spanish.
product_category_regex_elements.csv
Lists of terms and regex snippets used by ords_regex_category_compiler.py. English, Dutch, French and German with a little Italian and Spanish.
product_category_regexes.csv
A set of Regular Expressions used by ords_regex_category_tester.py and compiled by ords_regex_category_compiler.py.
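To give a feel for how a per-category regular expression can be tested against free-text item types, here is a minimal Python sketch. The patterns and the function name `match_categories` are hypothetical; the real expressions live in product_category_regexes.csv and may be structured quite differently, and this is not the logic of ords_regex_category_tester.py.

```python
import re

# Hypothetical patterns for illustration only; the real ones are stored in
# product_category_regexes.csv.
CATEGORY_PATTERNS = {
    "Laptop": re.compile(r"\b(laptop|notebook|ordinateur portable)\b", re.IGNORECASE),
    "Toaster": re.compile(r"\b(toaster|broodrooster|grille-pain)\b", re.IGNORECASE),
}


def match_categories(item_type, patterns=CATEGORY_PATTERNS):
    """Return the names of all categories whose pattern matches `item_type`."""
    return [name for name, pattern in patterns.items() if pattern.search(item_type)]
```

An item type that matches no pattern returns an empty list, and one that matches several categories returns all of them, which makes overlapping expressions easy to spot.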
stopwords-english-repair.txt
List of repair-related stopwords extracted from the ORDS corpus using ords_extract_vocabulary.py.
stopwords-english.txt
List of English stopwords from a dataset published on Kaggle. “This dataset is Copyright (c) 2005, Jacques Savoy and distributed under the BSD License.”
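A stopword list like this is typically applied by tokenising the text and dropping any token found in the list. The Python sketch below assumes the file holds one word per line, which should be checked against the actual file; both function names are hypothetical and not taken from the scripts in this repository.

```python
import re


def load_stopwords(lines):
    """Build a set of lowercase stopwords from an iterable of lines.

    Assumes one word per line; blank lines are skipped.
    """
    return {line.strip().lower() for line in lines if line.strip()}


def remove_stopwords(text, stopwords):
    """Lowercase `text`, split it into word tokens and drop the stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in stopwords]
```

`load_stopwords` accepts any iterable of lines, so an open file handle on stopwords-english.txt can be passed in directly.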
tableschema_ords_mysql.sql
An SQL script for creating a MySQL table to hold the ORDS data. It is used by ords_database_setup.py and requires the ORDS filename to be set in the .env file.
tableschema_translations_mysql.sql
An SQL script for creating a MySQL table to hold the ORDS translations.