
dlcjfgmlnasa / text-classification_with_word_based


Sentence classification using deep learning (word-based)

Python 100.00%
text-classification deep-learning pytorch attention textcnn textrnn bilstm self-attention

text-classification_with_word_based's Introduction

Text-Classification

Work in progress...

Sentence classification using deep learning

How to Use

  • GitHub Cloning
>> git clone https://github.com/dlcjfgmlnasa/Text-Classification.git --recursive
  • Installing Python packages (with a Python virtualenv)
>> python -m venv venv                          # create a Python virtualenv
>> source venv/bin/activate                     # activate the virtualenv
>> (venv) pip install -r requirements.txt       # install the required packages
  • Prepare Dataset
    Your dataset should look like this:

    • id: the review ID
    • document: the actual review text
    • label: the sentiment class of the review (0: negative, 1: positive)
    • the fields on each line are separated by a tab (\t)
    • example:

    id document label
    1 아 더빙.. 진짜 짜증나네요 목소리 0
    2 흠...포스터보고 초딩영화줄....오버연기조차 가볍지 않구나 1
    3 너무재밓었다그래서보는것을추천한다 1
    4 교도소 이야기구먼 ..솔직히 재미는 없다..평점 조정 0
    5 막 걸음마 뗀 3세부터 초등학교 1학년생인 8살용영화.ㅋㅋㅋ...별반개도 아까움. 0
    6 원작의 긴장감을 제대로 살려내지못했다. 0
    7 액션이 없는데도 재미 있는 몇안되는 영화 1
    8 재미없다 지루하고. 같은 음식 영화인데도 바베트의 만찬하고 넘 차이남....바베트의 만찬은 이야기도 있고 음식 보는재미도 있는데 ; 이건 볼게없다 음식도 별로 안나오고, 핀란드 풍경이라도 구경할랫는데 그것도 별로 안나옴 0
    ... ... ...
  • Training

  • Prediction
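
The tab-separated layout described under Prepare Dataset can be read with the standard library alone. The following is a minimal sketch, not the repository's own loader: the function name `read_dataset` is hypothetical, and it assumes a header row followed by exactly three tab-separated fields per line.

```python
import csv
import io


def read_dataset(handle):
    """Yield (id, document, label) tuples from a tab-separated file object."""
    # QUOTE_NONE keeps quotation marks inside reviews as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the "id / document / label" header row
    for row in reader:
        if len(row) != 3:
            continue  # ignore blank or malformed lines
        doc_id, document, label = row
        yield doc_id, document, int(label)


# Two rows taken from the example above, joined with real tab characters.
sample = (
    "id\tdocument\tlabel\n"
    "1\t아 더빙.. 진짜 짜증나네요 목소리\t0\n"
    "3\t너무재밓었다그래서보는것을추천한다\t1\n"
)
rows = list(read_dataset(io.StringIO(sample)))
```

For a file on disk, the same function can be given `open(path, encoding="utf-8", newline="")`, where `path` is whatever location the dataset is stored in.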

Requirements

  • Python 3.6 (may work with other versions, but I used 3.6)
  • PyTorch 1.2.0
  • konlpy 0.5.1

Datasets

Model

Table of Contents

  1. TextCNN
  2. TextRNN
  3. BiLSTM with Attention
  4. Self Attention

1. TextCNN

Parameters

epoch  batch_size  seq_len  embedding_dim  output_channels  dropout_rate  n_grams
20     500         20       512            50               0.8           [2,3,4]
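
To show how these parameters fit together, here is a minimal PyTorch sketch of a TextCNN-style classifier. It is an illustration under assumptions rather than the repository's model: the class name `TextCNNSketch`, the vocabulary size of 1000, and the two output classes are hypothetical, and `dropout_rate` is assumed to be the probability of dropping a unit.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextCNNSketch(nn.Module):
    """One 1-D convolution per n-gram width, max-pooled over time."""

    def __init__(self, vocab_size, num_classes=2, embedding_dim=512,
                 output_channels=50, n_grams=(2, 3, 4), dropout_rate=0.8):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embedding_dim, output_channels, kernel_size=n)
             for n in n_grams]
        )
        self.dropout = nn.Dropout(dropout_rate)
        # Each n-gram width contributes output_channels pooled features.
        self.fc = nn.Linear(output_channels * len(n_grams), num_classes)

    def forward(self, token_ids):
        # (batch, seq_len) -> (batch, embedding_dim, seq_len) for Conv1d
        x = self.embedding(token_ids).transpose(1, 2)
        # Max over the time axis keeps the strongest n-gram response.
        pooled = [F.relu(conv(x)).max(dim=2)[0] for conv in self.convs]
        features = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(features)


model = TextCNNSketch(vocab_size=1000)            # hypothetical vocabulary size
logits = model(torch.randint(0, 1000, (4, 20)))   # batch of 4, seq_len = 20
```

With three n-gram widths and 50 output channels each, the classifier layer in this sketch receives 150 features per sentence.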

Training Graph

TextCNN Result Image

Test

TextCNN Test Result Image

2. TextRNN

Parameters

epoch  batch_size  seq_len  embedding_dim  rnn_dim  rnn_num_layer  bidirectional
20     500         20       512            50       2              True
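
The parameters above could be wired into a recurrent classifier roughly as follows. This is a sketch under assumptions, not the repository's code: the class name `TextRNNSketch`, the choice of an LSTM cell, the vocabulary size, and the number of classes are all hypothetical.

```python
import torch
import torch.nn as nn


class TextRNNSketch(nn.Module):
    """Classify a sentence from the final hidden state of a stacked RNN."""

    def __init__(self, vocab_size, num_classes=2, embedding_dim=512,
                 rnn_dim=50, rnn_num_layer=2, bidirectional=True):
        super().__init__()
        self.bidirectional = bidirectional
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        # Assumption: an LSTM cell; the source does not name the cell type.
        self.rnn = nn.LSTM(embedding_dim, rnn_dim, num_layers=rnn_num_layer,
                           bidirectional=bidirectional, batch_first=True)
        self.fc = nn.Linear(rnn_dim * (2 if bidirectional else 1), num_classes)

    def forward(self, token_ids):
        # h_n: (num_layers * num_directions, batch, rnn_dim)
        _, (h_n, _) = self.rnn(self.embedding(token_ids))
        if self.bidirectional:
            # Last layer: h_n[-2] is the forward pass, h_n[-1] the backward pass.
            last = torch.cat([h_n[-2], h_n[-1]], dim=1)
        else:
            last = h_n[-1]
        return self.fc(last)


model = TextRNNSketch(vocab_size=1000)            # hypothetical vocabulary size
logits = model(torch.randint(0, 1000, (4, 20)))   # batch of 4, seq_len = 20
```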

Training Graph

TextRNN Result Image

Test

TextRNN Test Result Image

3. BiLSTM with Attention

Parameters

epoch  batch_size  seq_len  embedding_dim  rnn_dim  rnn_num_layer  bidirectional
20     500         20       512            50       2              True
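
One common way to add attention to a BiLSTM is to score every time step, normalize the scores with a softmax, and classify the weighted sum of the LSTM outputs. The sketch below follows that pattern with the parameters above. The scoring function, the class name `BiLSTMAttentionSketch`, the vocabulary size, and the number of classes are assumptions, and the repository's attention variant may differ.

```python
import torch
import torch.nn as nn


class BiLSTMAttentionSketch(nn.Module):
    """BiLSTM encoder followed by attention pooling over time steps."""

    def __init__(self, vocab_size, num_classes=2, embedding_dim=512,
                 rnn_dim=50, rnn_num_layer=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, rnn_dim, num_layers=rnn_num_layer,
                            bidirectional=True, batch_first=True)
        self.score = nn.Linear(2 * rnn_dim, 1)   # one scalar score per time step
        self.fc = nn.Linear(2 * rnn_dim, num_classes)

    def forward(self, token_ids):
        # outputs: (batch, seq_len, 2 * rnn_dim)
        outputs, _ = self.lstm(self.embedding(token_ids))
        # Softmax over the time axis turns scores into attention weights.
        weights = torch.softmax(self.score(torch.tanh(outputs)), dim=1)
        context = (weights * outputs).sum(dim=1)  # weighted sum over time
        return self.fc(context), weights.squeeze(2)


model = BiLSTMAttentionSketch(vocab_size=1000)    # hypothetical vocabulary size
logits, attn = model(torch.randint(0, 1000, (4, 20)))
```

Returning the weights alongside the logits makes it possible to inspect which tokens the model attended to for a given sentence.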

Training Graph

BiLSTM Result Image

Test

BiLSTM Test Result Image

4. Self Attention

Parameters

epoch  batch_size  seq_len  embedding_dim  self_attention_dim  self_attention_num_heads
20     500         20       512            64                  8
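
The values above are consistent with multi-head self-attention in which 8 heads of dimension 64 together span the 512-dimensional embedding (8 × 64 = 512). The sketch below assumes that reading and uses PyTorch's `nn.MultiheadAttention`. The class name `SelfAttentionSketch`, the mean pooling, the vocabulary size, and the number of classes are hypothetical, and positional encodings are left out for brevity.

```python
import torch
import torch.nn as nn


class SelfAttentionSketch(nn.Module):
    """Multi-head self-attention over token embeddings, then mean pooling."""

    def __init__(self, vocab_size, num_classes=2, embedding_dim=512,
                 self_attention_dim=64, self_attention_num_heads=8):
        super().__init__()
        # Assumption: self_attention_dim is the per-head dimension.
        assert self_attention_dim * self_attention_num_heads == embedding_dim
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.attention = nn.MultiheadAttention(embedding_dim,
                                               self_attention_num_heads)
        self.fc = nn.Linear(embedding_dim, num_classes)

    def forward(self, token_ids):
        # MultiheadAttention expects (seq_len, batch, embedding_dim) by default.
        x = self.embedding(token_ids).transpose(0, 1)
        attended, _ = self.attention(x, x, x)     # query = key = value
        return self.fc(attended.mean(dim=0))      # average over time steps


model = SelfAttentionSketch(vocab_size=1000)      # hypothetical vocabulary size
logits = model(torch.randint(0, 1000, (4, 20)))   # batch of 4, seq_len = 20
```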

Training Graph

Self Attention Result Image

Test

Self Attention Test Result Image
