
final_thesis

Research in Short Text Classification Based on Pre-trained Model BERT

This repo contains the code for my undergraduate thesis.

My thesis topic is BERT and news classification.

The code runs on Google Colab and AutoDL.

Google Colab is too unstable. TAT

Goals (before 2020.4.1):

  1. Rewrite datasets.py

    We notice that all the required datasets are available in the torchtext package, so we can write a general method/class to process them. By passing a dataset name, the running scripts can create the corresponding dataloaders; in that case, running.py should be rewritten too. A loader sketch is given after this list.

    F**K the internet error

  2. Run scripts to get results.

    Use an AutoDL server to get classification results on four datasets:

    1. IMDB
    2. Yelp
    3. AG news
    4. Sogou news

    By feeding them into the following models:

    1. TextRNN
    2. TextCNN
    3. Transformer Encoder
    4. Bert+Linear
    5. Bert+RNN
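
Here is a minimal sketch of the general loader described in goal 1. It is an assumption, not the repo's actual datasets.py: it relies on a recent torchtext in which the `IMDB`, `YelpReviewPolarity`, `AG_NEWS`, and `SogouNews` constructors exist (older torchtext versions use a different, Field-based API), and it leaves tokenization and padding to a caller-supplied `collate_fn`:

```python
from torch.utils.data import DataLoader
from torchtext import datasets as td

# Registry mapping a dataset name to its torchtext constructor,
# so running scripts can pick a dataset by name alone.
DATASETS = {
    "imdb": td.IMDB,
    "yelp": td.YelpReviewPolarity,
    "ag_news": td.AG_NEWS,
    "sogou_news": td.SogouNews,
}

def make_dataloader(name, split="train", batch_size=32, collate_fn=None):
    """Create a DataLoader for one of the four benchmark datasets by name."""
    raw = DATASETS[name](split=split)  # iterable of (label, text) pairs
    return DataLoader(list(raw), batch_size=batch_size,
                      shuffle=(split == "train"), collate_fn=collate_fn)
```

A running script can then call, e.g., `make_dataloader("ag_news")` and stay unchanged across datasets.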

Abstract

In the era of big data, short texts such as news articles and comments are growing rapidly on the Internet. It is thus important to design topic or sentiment classification models that automatically identify valuable information. Traditional text classification methods have problems with feature extraction and model structure, such as feature sparsity or a lack of semantic relations. The pre-trained BERT model provides an end-to-end paradigm: it consists of 12 bidirectional Transformer layers, and it achieves high classification performance after being pre-trained on a large corpus and fine-tuned.
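As an illustration of this fine-tuning paradigm (and of the Bert+Linear baseline listed earlier), here is a minimal sketch using the Hugging Face transformers library (4.x); this is an assumption about tooling, not necessarily how the thesis code is organized:

```python
import torch.nn as nn
from transformers import BertModel

class BertLinearClassifier(nn.Module):
    """BERT encoder with a single linear layer on the [CLS] vector."""
    def __init__(self, num_classes, pretrained="bert-base-uncased"):
        super().__init__()
        # 12-layer bidirectional Transformer encoder, pre-trained on a large corpus
        self.bert = BertModel.from_pretrained(pretrained)
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # representation of the [CLS] token
        return self.classifier(cls)        # logits over the classes
```

Fine-tuning then trains the encoder and the linear head end-to-end with a cross-entropy loss.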

This paper studies short text classification with the pre-trained model BERT. Text classification with BERT usually involves three phases: pre-processing, further pre-training, and fine-tuning. We improve the further pre-training algorithm, in particular to address the problem of class imbalance. We compare the classification performance of BERT against other baseline models and verify BERT's superior performance. Moreover, we observe and analyze the output of each layer of the BERT model. Our experiments verify that outputs of lower layers focus more on low-level features such as syntax, while outputs of higher layers focus more on high-level features such as semantics; when the layer weights are made learnable, the weights of higher-layer outputs tend to be greater. Finally, we focus on the setting where the training set is small. BERT can mitigate the disadvantage of limited training data to a certain extent, and the further pre-training phase contributes additional performance gains. We analyze the pre-training mechanism as well as the training process, and verify the existence of this phenomenon.
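One way to make the layer weights learnable, as described above, is a softmax-weighted sum of each layer's [CLS] vector. The sketch below (again assuming the transformers library; the exact weighting scheme in the thesis may differ) shows the idea:

```python
import torch
import torch.nn as nn
from transformers import BertModel

class LayerWeightedBert(nn.Module):
    """Combine every BERT layer's [CLS] output with learnable weights."""
    def __init__(self, num_classes, pretrained="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained,
                                              output_hidden_states=True)
        n_states = self.bert.config.num_hidden_layers + 1  # embeddings + 12 layers
        self.layer_weights = nn.Parameter(torch.zeros(n_states))
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # out.hidden_states: one (batch, seq, hidden) tensor per layer
        cls = torch.stack([h[:, 0] for h in out.hidden_states])  # (layers, batch, hidden)
        w = torch.softmax(self.layer_weights, dim=0)             # weights sum to 1
        pooled = (w[:, None, None] * cls).sum(dim=0)             # weighted sum over layers
        return self.classifier(pooled)
```

After fine-tuning, inspecting `torch.softmax(layer_weights, dim=0)` shows which layers the classifier relies on; per the observation above, the higher layers tend to end up with the larger weights.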

Full paper

Click here to get to the paper repo.
