robert1003 / adl-final

Named Entity Recognition (NER) on Japanese bidding documents with transfer learning and novel data processing and model training techniques

Document Information Extraction

This is the final project for the course ADL (Applied Deep Learning). My teammates and I built a machine learning model (with BERT as one component) that extracts a set of tags from a Japanese bidding document. The model was trained on only a small dataset (about a hundred samples) and won the contest by a large margin.

Project details

  • Link to project slide.
  • Link to dataset.
  • Link to Kaggle competition website.

Our performance

Ranked 1st on the private dataset with a score of 0.97904.

Execution details

Training

Train BERT on the DRCDv2 dataset (from 2020 ADL hw2)

First, prepare the DRCDv2 dataset (the data used in ADL hw2). You can download it here. Then run the following command:

python3.6 hw2/train.py [train_data_path]

The trained BERT model is saved to hw2/bert.pth.

Train the four models used in our final prediction

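Train the four models. The commands differ only in the directory (which is also passed as the run name) and the --ratio value (2.0, 4.0, 6.0, 8.0):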

cd parent_sampler_longer
bash run.sh conv [cuda_num] parent_sampler_longer --train_data [train_data path] --dev_data [dev_data path] --dev_ref_file [dev_ref.csv file path] --epochs 30 --hw2_QA_bert [bert from above training] --kernel_size 7 --learning_rate 5e-6 --round 2000 --ratio 2.0
cd ../parent_sampler_meow
bash run.sh conv [cuda_num] parent_sampler_meow --train_data [train_data path] --dev_data [dev_data path] --dev_ref_file [dev_ref.csv file path] --epochs 30 --hw2_QA_bert [bert from above training] --kernel_size 7 --learning_rate 5e-6 --round 2000 --ratio 4.0
cd ../parent_sampler_higher_ratio
bash run.sh conv [cuda_num] parent_sampler_higher_ratio --train_data [train_data path] --dev_data [dev_data path] --dev_ref_file [dev_ref.csv file path] --epochs 30 --hw2_QA_bert [bert from above training] --kernel_size 7 --learning_rate 5e-6 --round 2000 --ratio 6.0
cd ../parent_sampler_ratio_8.0
bash run.sh conv [cuda_num] parent_sampler_ratio_8.0 --train_data [train_data path] --dev_data [dev_data path] --dev_ref_file [dev_ref.csv file path] --epochs 30 --hw2_QA_bert [bert from above training] --kernel_size 7 --learning_rate 5e-6 --round 2000 --ratio 8.0
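
The four runs above share every flag except the run name and --ratio. As a minimal sketch, assuming the four directories sit side by side at the repository root, the commands could be generated from a small table instead of being typed by hand. The helper name build_commands and its parameters are hypothetical and not part of the repository. The flags and values are copied from the commands above, and the paths in the usage section are placeholders.

```python
import shlex

# Run name (also the directory name) -> --ratio value, copied from the commands above.
RUNS = {
    "parent_sampler_longer": 2.0,
    "parent_sampler_meow": 4.0,
    "parent_sampler_higher_ratio": 6.0,
    "parent_sampler_ratio_8.0": 8.0,
}


def build_commands(cuda_num, train_data, dev_data, dev_ref_file, bert_path):
    """Return one (directory, argv) pair per model. Hypothetical helper."""
    commands = []
    for name, ratio in RUNS.items():
        argv = [
            "bash", "run.sh", "conv", str(cuda_num), name,
            "--train_data", train_data,
            "--dev_data", dev_data,
            "--dev_ref_file", dev_ref_file,
            "--epochs", "30",
            "--hw2_QA_bert", bert_path,
            "--kernel_size", "7",
            "--learning_rate", "5e-6",
            "--round", "2000",
            "--ratio", str(ratio),
        ]
        commands.append((name, argv))
    return commands


if __name__ == "__main__":
    # Placeholder paths; substitute your own.
    runs = build_commands(0, "path/to/train", "path/to/dev",
                          "path/to/dev_ref.csv", "path/to/bert.pth")
    for directory, argv in runs:
        quoted = " ".join(shlex.quote(arg) for arg in argv)
        # Each command runs in a subshell, so the working directory is restored afterwards.
        print(f"(cd {shlex.quote(directory)} && {quoted})")
```

Printing the commands rather than executing them keeps the sketch safe to run; the output can be reviewed and pasted into a shell.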

After training, check the log for the best null threshold for prediction (each model has its own best null threshold).
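
This README does not say how the null threshold is applied or how it appears in the log. As an illustration only, the sketch below assumes a convention that is common in extractive question answering: each example gets a "null score" expressing how strongly the model believes there is no answer, and the model predicts "no answer" when that score exceeds the threshold. Under that assumption, the best threshold is the candidate with the highest accuracy on the dev set. The function names and the toy scores are hypothetical and are not taken from the repository.

```python
def predict_has_answer(null_score, threshold):
    """Assumed rule: predict an answer unless the null score exceeds the threshold."""
    return null_score <= threshold


def best_null_threshold(null_scores, has_answer, candidates):
    """Return (threshold, accuracy) for the candidate with the highest dev accuracy.

    Ties are resolved in favour of the earliest candidate.
    """
    if not null_scores:
        raise ValueError("need at least one dev example")
    best_threshold, best_accuracy = None, -1.0
    for threshold in candidates:
        correct = sum(
            predict_has_answer(score, threshold) == label
            for score, label in zip(null_scores, has_answer)
        )
        accuracy = correct / len(null_scores)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


if __name__ == "__main__":
    # Toy dev set with made-up numbers: a low null score means an answer exists.
    scores = [0.1, 0.2, 0.4, 0.7, 0.9]
    labels = [True, True, True, False, False]
    candidates = [0.1, 0.3, 0.5, 0.7, 0.9]
    print(best_null_threshold(scores, labels, candidates))
```

Because each model is trained with a different --ratio, its score distribution can differ, which is consistent with each model having its own best threshold.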

Testing

Change null_threshold on line 15 of test.py (the null threshold part) to the best null threshold found for each model. Then run python3 test.py [test file path].

Contributors

giver139, ltf0501, robert1003