
A PyTorch-based comprehensive toolkit for weight sharing in text classification settings.

License: MIT License

Topics: nlp, knowledge-distillation, pytorch, weight-sharing

joint-learn's Introduction


Joint Learn: A Python toolkit for task-specific weight sharing for sequence classification

Transfer learning has recently achieved state-of-the-art results in machine learning and especially in natural language processing tasks. However, low-resource corpora often lack suitable pre-trained model checkpoints. We propose Joint Learn, which leverages task-specific weight sharing to train multiple sequence classification tasks simultaneously, and we have empirically shown that this yields more generalizable models. Joint Learn is a PyTorch-based comprehensive toolkit for weight sharing in text classification settings.
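The core idea can be sketched in plain PyTorch: a single encoder whose weights are shared (and therefore jointly updated) across tasks, with task-specific embeddings and classification heads. The class and task names below are hypothetical illustrations, not part of the Joint Learn API:

```python
import torch
import torch.nn as nn


class SharedEncoderClassifier(nn.Module):
    """Hard parameter sharing: one LSTM encoder shared across tasks,
    with a separate embedding table and classification head per task.
    (Illustrative sketch only, not the Joint Learn implementation.)"""

    def __init__(self, vocab_sizes, embedding_size, hidden_size, num_classes):
        super().__init__()
        # per-task embeddings (vocabularies differ across datasets/languages)
        self.embeddings = nn.ModuleDict(
            {task: nn.Embedding(v, embedding_size) for task, v in vocab_sizes.items()}
        )
        # the shared encoder: its weights receive gradients from every task
        self.encoder = nn.LSTM(embedding_size, hidden_size, batch_first=True)
        # task-specific output heads
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden_size, n) for task, n in num_classes.items()}
        )

    def forward(self, token_ids, task):
        embedded = self.embeddings[task](token_ids)
        _, (hidden, _) = self.encoder(embedded)
        # last layer's final hidden state -> task-specific logits
        return self.heads[task](hidden[-1])


model = SharedEncoderClassifier(
    vocab_sizes={"hindi": 5000, "bengali": 6000},
    embedding_size=32,
    hidden_size=64,
    num_classes={"hindi": 2, "bengali": 2},
)
logits = model(torch.randint(0, 5000, (4, 10)), task="hindi")
print(logits.shape)  # torch.Size([4, 2])
```

Because every task's gradients flow through the same encoder, each task effectively regularizes the others, which is where the improved generalization comes from.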

Architecture diagrams: Joint Learn LSTM Self-Attention; Joint Learn Transformer LSTM Self-Attention.

Usage

LSTM

Full Example

    ## init jl lstm
    jl_lstm = JLLSTMClassifier(
        batch_size=batch_size,
        hidden_size=hidden_size,
        lstm_layers=lstm_layers,
        embedding_size=embedding_size,
        dataset_hyperparams=dataset_hyperparams,
        device=device,
    )

    ## define optimizer
    optimizer = torch.optim.Adam(params=jl_lstm.parameters())

    train_model(
        model=jl_lstm,
        optimizer=optimizer,
        dataloaders=jl_dataloaders,
        max_epochs=max_epochs,
        config_dict={"device": device, "model_name": "jl_lstm"},
    )
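The snippets on this page assume the scalar hyperparameters are already defined. The values below are illustrative placeholders only, not tuned settings; `dataset_hyperparams` is dataset-specific and omitted here, so see the linked full examples for its exact structure:

```python
import torch

# illustrative placeholder values only -- tune for your own datasets
batch_size = 64            # sequences per batch
hidden_size = 128          # LSTM hidden-state size
lstm_layers = 2            # stacked LSTM layers
embedding_size = 300       # token embedding dimension
max_epochs = 10            # training epochs

# the transformer-encoder variants additionally use:
nhead = 4                  # attention heads
transformer_hidden_size = 256
transformer_layers = 2
max_seq_length = 128

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```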

LSTM Transformer Encoder

Full Example

    ## init jl transformer lstm
    jl_lstm = JLLSTMTransformerClassifier(
        batch_size=batch_size,
        hidden_size=hidden_size,
        lstm_layers=lstm_layers,
        embedding_size=embedding_size,
        nhead=nhead,
        transformer_hidden_size=transformer_hidden_size,
        transformer_layers=transformer_layers,
        dataset_hyperparams=dataset_hyperparams,
        device=device,
        max_seq_length=max_seq_length,
    )

    ## define optimizer
    optimizer = torch.optim.Adam(params=jl_lstm.parameters())

    train_model(
        model=jl_lstm,
        optimizer=optimizer,
        dataloaders=jl_dataloaders,
        max_epochs=max_epochs,
        config_dict={"device": device, "model_name": "jl_lstm"},
    )

LSTM Self-Attention

Full Example

    ## init jl lstm self-attention
    jl_lstm = JLLSTMAttentionClassifier(
        batch_size=batch_size,
        hidden_size=hidden_size,
        lstm_layers=lstm_layers,
        embedding_size=embedding_size,
        dataset_hyperparams=dataset_hyperparams,
        bidirectional=bidirectional,
        fc_hidden_size=fc_hidden_size,
        self_attention_config=self_attention_config,
        device=device,
    )

    ## define optimizer
    optimizer = torch.optim.Adam(params=jl_lstm.parameters())

    train_model(
        model=jl_lstm,
        optimizer=optimizer,
        dataloaders=jl_dataloaders,
        max_epochs=max_epochs,
        config_dict={
            "device": device,
            "model_name": "jl_lstm_attention",
            "self_attention_config": self_attention_config,
        },
    )
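The `self_attention_config` dictionary configures the structured self-attention layer. Its exact schema lives in the full example; the keys below are hypothetical placeholders in the spirit of Lin et al.'s structured self-attentive sentence embedding (attention MLP width d_a, number of hops r, redundancy penalty):

```python
# hypothetical schema, shown for illustration only --
# consult the full example for the real keys
self_attention_config = {
    "hidden_size": 150,   # attention MLP width (d_a)
    "output_size": 20,    # number of attention hops (r)
    "penalty": 0.6,       # penalization coefficient for redundant hops
}
bidirectional = True      # run the LSTM in both directions
fc_hidden_size = 128      # width of the fully connected classifier layer
```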

LSTM Self-Attention with Transformer Encoder

Full Example

    ## init jl lstm self-attention with transformer encoder
    jl_lstm = JLLSTMTransformerAttentionClassifier(
        batch_size=batch_size,
        hidden_size=hidden_size,
        lstm_layers=lstm_layers,
        embedding_size=embedding_size,
        nhead=nhead,
        transformer_hidden_size=transformer_hidden_size,
        transformer_layers=transformer_layers,
        dataset_hyperparams=dataset_hyperparams,
        bidirectional=bidirectional,
        fc_hidden_size=fc_hidden_size,
        self_attention_config=self_attention_config,
        device=device,
        max_seq_length=max_seq_length,
    )

    ## define optimizer
    optimizer = torch.optim.Adam(params=jl_lstm.parameters())

    train_model(
        model=jl_lstm,
        optimizer=optimizer,
        dataloaders=jl_dataloaders,
        max_epochs=max_epochs,
        config_dict={
            "device": device,
            "model_name": "jl_lstm_attention",
            "self_attention_config": self_attention_config,
        },
    )

Citing & Authors

If you find this repository helpful, feel free to cite our publication Hindi/Bengali Sentiment Analysis Using Transfer Learning and Joint Dual Input Learning with Self Attention:

@article{khan2022hindi,
  title={Hindi/Bengali Sentiment Analysis Using Transfer Learning and Joint Dual Input Learning with Self Attention},
  author={Khan, Shahrukh and Shahid, Mahnoor},
  journal={BOHR International Journal of Research on Natural Language Computing (BIJRNLC)},
  year={2022}
}

