kakaobrain / kortok

The code and models for "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks" (AACL-IJCNLP 2020)

License: Apache License 2.0

Python 100.00%
aacl korean machine-translation natural-language-understanding tokenizer

kortok's People

Contributors

kyubyong, noowad93, roomylee

kortok's Issues

Accessing Korean parallel dataset in United States

Dear Authors,

Thanks for providing the link to the Korean parallel dataset on AI Hub. However, users in the United States cannot pass AI Hub's phone verification, so we cannot download the dataset. Could you kindly send the downloaded data to my email, [email protected]? Alternatively, a download link would work just as well.

Thanks,
Ken Lou

Question about the Wiki and Namuwiki data

Hello!
Thank you very much for sharing the code.

We would like to evaluate our model on the datasets used in the paper,
but we could not determine the exact dates of the Wiki and Namuwiki dumps, so I am writing to ask.

  • The sample data in the dataset folder is labeled 200420. Did you train on the 200420 dump file?
  • What is the date of the Namuwiki dump file?

Have a nice day!
Thank you!
