eaai17-cpr-recover's Introduction

CPR-Recover data

Release 1, 2017-01-10

This is the data used in "Recovering Concept Prerequisite Relations from University Course Dependencies" (Liang et al., 2017).

Data Description

  • cs_courses.csv: CS-related course information collected from 11 U.S. universities (Carnegie Mellon University, Stanford University, the Massachusetts Institute of Technology, Princeton University, the California Institute of Technology, Purdue University, University of Maryland, Michigan State University, Pennsylvania State University, University of Illinois, and University of Iowa). Each line is formatted as "<Course_id>,<Course_description>". Note that the course title is located at the beginning of each description.

  • cs_edges.csv: Course prerequisite information. Each line "<course_1>,<course_2>" means that <course_2> is a prerequisite for <course_1>.

  • cs_annotations.tsv: Annotation results for candidate concept pairs generated from the CS courses above. Please refer to the "Data Labeling" section for more details. Each line is formatted as "<Concept_A>,<Concept_B>,<Annotator_1>...<Annotator_13>". Each pair is labeled by three different annotators. Valid labels are: 1 = B is a prerequisite of A; 2 = A is a prerequisite of B; 3 = there is no prerequisite relation between A and B.

  • cs_preqs.csv: Concept prerequisite pairs derived from the annotations above by majority vote. Each line "<Concept_A>,<Concept_B>" means that B is a prerequisite of A.
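
The majority vote that turns annotator labels into prerequisite pairs can be sketched roughly as follows. This is an illustration only, not the authors' export script: the function names are hypothetical, annotators who did not label a pair are assumed to appear as empty or non-label cells, and returning no pair when no label wins a strict majority (for example a 1/2/3 split) is an assumption, since the description does not say how ties were handled.

```python
from collections import Counter

# Label meanings as given in the data description above.
B_PREREQ_OF_A = "1"
A_PREREQ_OF_B = "2"
NO_RELATION = "3"
VALID_LABELS = {B_PREREQ_OF_A, A_PREREQ_OF_B, NO_RELATION}


def majority_label(labels):
    """Return the label chosen by a strict majority of annotators, or None.

    Cells that are not a valid label (assumed: annotators who skipped
    the pair) are ignored.
    """
    votes = Counter(label for label in labels if label in VALID_LABELS)
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    # Strict majority: more than half of the valid votes.
    return label if count * 2 > sum(votes.values()) else None


def prerequisite_pair(concept_a, concept_b, labels):
    """Return (concept, prerequisite) in cs_preqs.csv order, or None."""
    label = majority_label(labels)
    if label == B_PREREQ_OF_A:
        return (concept_a, concept_b)
    if label == A_PREREQ_OF_B:
        # Swap so that the second element is always the prerequisite.
        return (concept_b, concept_a)
    return None
```

With placeholder concept names, `prerequisite_pair("A", "B", ["1", "", "1", "3"])` yields `("A", "B")`, which matches the "<Concept_A>,<Concept_B>" convention where B is the prerequisite of A.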

Note: As described in the paper, all Wikipedia concepts in this data were extracted with the help of Wikipedia-miner. You can also use other wikification or entity linking methods to extract Wikipedia concepts from the course descriptions. In that case, our labeled prerequisite pairs may not cover all of your candidate pairs, but we believe the annotations will still cover most of them and can save you a lot of time when collecting prerequisite labels.
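
If you extract concepts with a different tool, one way to see how much of your candidate set is already labeled is sketched below. The function name and the input shapes are hypothetical: it assumes you have your own candidate pairs and the first two columns of cs_annotations.tsv as lists of 2-tuples, and it treats pairs as unordered because the annotation label, not the column order, encodes the direction.

```python
def split_by_coverage(candidate_pairs, labeled_pairs):
    """Split candidate pairs into those already annotated and those not.

    Pairs are compared without regard to order, so ("A", "B") matches
    a labeled ("B", "A").
    """
    labeled = {frozenset(pair) for pair in labeled_pairs}
    covered, uncovered = [], []
    for pair in candidate_pairs:
        if frozenset(pair) in labeled:
            covered.append(pair)
        else:
            uncovered.append(pair)
    return covered, uncovered
```

Only the pairs in `uncovered` would then need new prerequisite labels.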

Citation

Please cite the following paper if you use this data.

@inproceedings{liang2017recovering,
  title={Recovering Concept Prerequisite Relations from University Course Dependencies.},
  author={Liang, Chen and Ye, Jianbo and Wu, Zhaohui and Pursel, Bart and Giles, C Lee},
  booktitle={AAAI},
  pages={4786--4791},
  year={2017}
}

If you have any problems, please contact Chen Liang at [email protected].

License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

eaai17-cpr-recover's Issues

Dataset

Hi, thanks for providing the dataset.
I have a question: the data as provided does not show which course a concept was extracted from, i.e. which course each concept belongs to or which concepts each course contains. Is there any way to find out? (You mention that the concepts are not taken verbatim from the course descriptions but are generated by a Wikipedia tool, so looking up a concept directly in the course descriptions would be ambiguous.)