-
Download and install MySQL
-
Download and install Python (Version 2.7)
-
Download and install the required Python packages: Connector/Python (Version 2.1.3), pytz and pymongo. To test whether a package (e.g., pytz) is installed successfully, you may run python -c "import pytz" in the terminal.
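The import check above can be repeated for all three packages at once. The following is a minimal sketch and not part of the translation code: the helper name check_packages is hypothetical, and it assumes Connector/Python is importable under its usual module name, mysql.connector.

```python
import importlib


def check_packages(names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # Assumption: Connector/Python is imported as "mysql.connector".
    required = ["mysql.connector", "pytz", "pymongo"]
    for module, ok in sorted(check_packages(required).items()):
        print("%-16s %s" % (module, "OK" if ok else "MISSING"))
```

Any package reported as MISSING should be installed before continuing with the steps below.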
-
Create a folder named course_log and a folder named daily_logs under the course_log folder (i.e., the path for the daily_logs folder should be $PATH$/course_log/daily_logs/). After that, upload all of the daily log files (gzip-compressed, e.g., delftx-edx-events-201X-MM-DD.log.gz) to the daily_logs folder.
-
For each course, create a folder under the course_log folder named with its course_code (e.g., FP101x-3T2015 and EX101x-3T2015). Within each course folder:
- Create a folder named metadata and upload the following extracted course metadata files there:
  - DelftX-course_code-auth_user-prod-analytics.sql
  - DelftX-course_code-auth_userprofile-prod-analytics.sql
  - DelftX-course_code-certificates_generatedcertificate-prod-analytics.sql
  - DelftX-course_code-course_structure-prod-analytics.json
  - DelftX-course_code-courseware_studentmodule-prod-analytics.sql

  For courses starting prior to March 2015, the following file must also be put there:
  - DelftX-course_code-prod.mongo

  This step is very important.
- Create a folder named surveys and upload the following survey files there:
  - anon-ids.csv (storing the mapping between learners' anonymised IDs used in Qualtrics and their edX IDs)
  - pre-survey.csv (storing learners' responses to the pre-survey)
  - post-survey.csv (storing learners' responses to the post-survey)
-
After performing the above steps, the structure of the course_log folder should be:

    -- course_log
        -- daily_logs
            -- delftx-edx-events-201X-MM-DD.log.gz
            -- ...
        -- FP101x-3T2015
            -- metadata
                -- DelftX-FP101x-3T2015-auth_user-prod-analytics.sql
                -- ...
            -- surveys
                -- anon-ids.csv
                -- ...
        -- EX101x-3T2015
            -- metadata
                -- DelftX-EX101x-3T2015-auth_user-prod-analytics.sql
                -- ...
            -- surveys
                -- anon-ids.csv
                -- ...
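Before building the database, it may help to confirm that each course folder contains the files listed above. The following is a minimal sketch, not part of the translation code: the function name missing_course_files is hypothetical, and the optional prod.mongo file (only needed for courses starting prior to March 2015) is deliberately not checked.

```python
import os

# File name suffixes taken from the metadata list above.
METADATA_SUFFIXES = [
    "auth_user-prod-analytics.sql",
    "auth_userprofile-prod-analytics.sql",
    "certificates_generatedcertificate-prod-analytics.sql",
    "course_structure-prod-analytics.json",
    "courseware_studentmodule-prod-analytics.sql",
]
SURVEY_FILES = ["anon-ids.csv", "pre-survey.csv", "post-survey.csv"]


def missing_course_files(course_log, course_code):
    """List the expected metadata and survey files that are absent for one course."""
    course_dir = os.path.join(course_log, course_code)
    expected = [
        os.path.join(course_dir, "metadata", "DelftX-%s-%s" % (course_code, suffix))
        for suffix in METADATA_SUFFIXES
    ]
    expected += [os.path.join(course_dir, "surveys", name) for name in SURVEY_FILES]
    return [path for path in expected if not os.path.isfile(path)]
```

For example, calling missing_course_files with the path of your course_log folder and "FP101x-3T2015" returns an empty list when all eight files are in place.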
-
Go to the folder storing the translation code (i.e., $PATH$/DelftX-Daily-Database/), and run the following command to build the DelftX database:

    mysql -u root -p --local-infile=1 < moocdb.sql
-
Edit the config file (named config):

    [mysqld]
    user = root
    password = 123456
    host = 127.0.0.1
    database = DelftX

    [data]
    path = /.../course_log/
    remove_filtered_logs = 0
    log_update_list = ["EX101x-3T2015"]
    metadata_update_list = ["Calc001x-2T2015", "CTB3365DWx-3T2014","EX101x-3T2015"]
    survey_update_map = { "Calc001x-2T2015":["13","10"], "CTB3365DWx-3T2014":["108","139"], "EX101x-1T2015":["13","10"], "EX101x-3T2015":["10","10"], "FP101x-3T2014":["103","118"] }

- The [mysqld] section stores the information required to establish the database connection. Change the user and password to those of your own account.
- The [data] section stores the information required to process the course data, including:
  - path points to the course_log folder (e.g., $PATH$/course_log/).
  - remove_filtered_logs indicates whether the extracted daily log files generated when processing the data should be removed (set to 1) or kept (set to 0).
  - log_update_list (a list) stores the courses whose learners' daily activities (e.g., watching videos, solving quizzes, posting in the discussion forum) need to be re-imported into the database.
  - metadata_update_list (a list) stores the courses whose metadata (e.g., course structure, learners' demographics, learners' certification) needs to be processed for the first time or updated. In other words, to import a new course or to update the metadata of a course that has been processed before, put the course here.
  - survey_update_map (in JSON format) stores the courses whose survey data needs to be processed for the first time or updated, together with the indexes of the learner's ID in the pre/post surveys. In other words, to import a new course or to update the survey data of a course that has been processed before, put the course here.
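To make the value formats concrete, the sketch below reads such a file into plain Python values. How main.py actually parses the config is not shown in this document, so this is only an illustration under two assumptions: the file is a standard INI file, and the list and map values are valid JSON. The helper name load_config is hypothetical.

```python
import json

try:
    from configparser import RawConfigParser  # Python 3
except ImportError:
    from ConfigParser import RawConfigParser  # Python 2.7


def load_config(path):
    """Parse the config file into plain Python values (hypothetical helper)."""
    parser = RawConfigParser()
    if not parser.read(path):
        raise IOError("cannot read config file: %s" % path)
    data = dict(parser.items("data"))
    return {
        # Connection settings: user, password, host, database.
        "mysqld": dict(parser.items("mysqld")),
        "path": data["path"],
        # "1" means the extracted daily log files are removed after processing.
        "remove_filtered_logs": data["remove_filtered_logs"].strip() == "1",
        # Assumption: the two lists and the map are written as valid JSON.
        "log_update_list": json.loads(data["log_update_list"]),
        "metadata_update_list": json.loads(data["metadata_update_list"]),
        "survey_update_map": json.loads(data["survey_update_map"]),
    }
```

With the example file above, load_config("config")["survey_update_map"]["EX101x-3T2015"] would be the two-element list ["10", "10"], i.e., the learner ID indexes for the pre- and post-surveys of that course.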
-
Run the translation code using the following command:

    python main.py config

We recommend using MySQL Workbench to connect to the database, as detailed below:
- Fill in the connection name and choose Standard (TCP/IP) as the connection method;
- Fill in the hostname (i.e., the IP address of the server on which the database is built);
- Fill in the account information (i.e., username and password) used to access the database;
- Click the OK button to connect to the database.
Our database schema is adapted from the MOOCdb Model, which consists of 5 major modules, namely Video Mode, Quiz Mode, Forum Mode, Learner Mode and Survey Mode, as depicted in Figure 1.
The Video Mode has only one table, i.e., video_interaction.
The Quiz Mode has 4 tables, i.e., quiz_questions, submissions, assessments and quiz_sessions.
The Forum Mode has 2 tables, i.e., forum_interaction and forum_sessions.
The Survey Mode has 2 tables, i.e., survey_descriptions and survey_responses.
The Learner Mode has 6 tables, i.e., course_elements, courses, learner_index, sessions, learner_demographic and course_learner.
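The module-to-table breakdown above can be written down as a small lookup, for instance to check that a freshly built database contains every documented table. This is a minimal sketch: the names SCHEMA_MODULES and missing_tables are hypothetical, and the table names are copied from the list above rather than read from moocdb.sql.

```python
# Tables per module, as listed in the text above (15 tables in total).
SCHEMA_MODULES = {
    "Video Mode": ["video_interaction"],
    "Quiz Mode": ["quiz_questions", "submissions", "assessments", "quiz_sessions"],
    "Forum Mode": ["forum_interaction", "forum_sessions"],
    "Survey Mode": ["survey_descriptions", "survey_responses"],
    "Learner Mode": [
        "course_elements", "courses", "learner_index",
        "sessions", "learner_demographic", "course_learner",
    ],
}


def missing_tables(existing_tables):
    """Return, per module, the documented tables absent from existing_tables."""
    existing = set(existing_tables)
    result = {}
    for module, tables in SCHEMA_MODULES.items():
        absent = [name for name in tables if name not in existing]
        if absent:
            result[module] = absent
    return result
```

The argument would typically be the table names returned by running SHOW TABLES on the DelftX database; an empty result means every documented table is present.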