Coder Social home page Coder Social logo

kengdic's Introduction

Kengdic

Kengdic is a large, open Korean/English dictionary database created by Joe Speigle. It was originally hosted by Joe at ezcorean.com.

The bulk of the usable data is in kengdic_2011.tsv. The raw directory contains data which still needs to be examined. The scripts directory contains some automated QA checks for the data; these are run whenever a change is made in this repository.

Contributors

The dictionary data is still quite dirty, and contributions are very welcome. Some ways you can help:

  • Fix some of the issues which have been automatically flagged. The list of these issues is available on the repository website: http://garfieldnate.github.io/kengdic/.

  • Check existing entries for correctness (흰색 means white, not gray, "the m prophets", lowercase "british", etc.)

  • Add new entries

  • Assess the current coverage. Are we missing any particularly basic words, or words related to any specific subject?

  • Fix grammatical, spelling or formatting issues

  • Help come up with editorial and style guidelines.

  • Come up with new automatic checks we can run on the dictionary to find possible issues (see scripts/lint.py).

By contributing data, you release it under the same license terms as Kengdic itself (see below).

Example Datapackage Usage

We've provided a datapackage.json for convenience. To retrieve and load the data in python:

  • pip install datapackage
    $ python
    >>> from datapackage import Package
    >>> package = Package('https://raw.githubusercontent.com/garfieldnate/kengdic/master/datapackage.json')
    >>> resource = package.get_resource('kengdic')
    >>> data = resource.read(keyed=True)
    >>> data[20977]
    {'id': 20978, 'surface': '급조', 'hanja': '急造', 'gloss': None, 'level': None, 'created': datetime.datetime(2009, 1, 1, 20, 23, 14), 'source': 'mr.hanja-213889@ezcorean:213889'}

License

The Kengdic data is released under dual licenses: users may choose to use MPL 2.0 or the LGPL, version 2.0 or later.

kengdic's People

Contributors

bmcd avatar garfieldnate avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.