
nor_bert's Introduction

Language Modeling

This project contains a BERT model for Norwegian and the code to rebuild it. Feel free to reuse it.

The model is in the data directory. Use Git LFS when checking out the repository, as the model file is large.
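
If the repository was cloned without Git LFS, the model file in the data directory is only a small text pointer rather than the real weights. A minimal sketch for detecting that case, assuming the standard Git LFS pointer format (a file starting with a `version https://git-lfs.github.com/spec/v1` line); the function name `is_lfs_pointer` is hypothetical and not part of this project:

```python
from pathlib import Path

# First line of every Git LFS pointer file (standard pointer format).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if `path` looks like an unfetched Git LFS pointer file."""
    with Path(path).open("rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX
```

If this returns True for the model file, running `git lfs pull` inside the repository should fetch the actual weights.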

Why

This model works out of the box with, for example, the sentence-transformers library. It is built on top of multilingual BERT.

Getting Started

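from sentence_transformers import SentenceTransformer

# Load the checkpoint; adjust the path to wherever the checkpoint directory lives.
model = SentenceTransformer('checkpoint-33500')
# Encode a Norwegian sentence into a fixed-size embedding vector.
vector = model.encode("Ibsen er en norsk forfatter som ikke var redd for litt kinnskjegg..")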
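
The vectors returned by `encode` can be compared directly, for example to rank sentences by similarity. A minimal sketch of cosine similarity, assuming the vectors are plain sequences of floats (the arrays returned by `encode` should work the same way); `cosine_similarity` is a hypothetical helper written for illustration, not part of sentence_transformers:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors stand in for model output here; with the model loaded, one would
# pass model.encode(sentence_a) and model.encode(sentence_b) instead.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Values close to 1 indicate sentences the model places near each other; values near 0 indicate unrelated ones.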

Fine-tuning

BERT works best when fine-tuned on material from the relevant domain.

Put your data in a directory and follow the training part of the scripts here.

Notebooks

To use your module code (src/) in Jupyter notebooks (notebooks/) without running into import errors, make sure to install the source locally:

pip install -e .

This way, you'll always use the latest version of your module code in your notebooks via import language_modeling.
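
A quick way to confirm that the editable install took effect before opening a notebook is to ask Python whether it can locate the package. A small sketch using only the standard library; `is_importable` is a hypothetical helper name:

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate a top-level module or package."""
    return importlib.util.find_spec(module_name) is not None


# language_modeling is the package name mentioned above; this is expected to
# print False until `pip install -e .` has been run in the active environment.
print(is_importable("language_modeling"))
```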

Assuming you already have Jupyter installed, you can make your virtual environment available as a separate kernel by running:

pip install ipykernel
python -m ipykernel install --user --name="language-modeling"

Note that we mainly use notebooks for experiments, visualizations and reports. Every piece of functionality that is meant to be reused should go into module code and be imported into notebooks.

Distribution Package

To build a distribution package (wheel), please use:

python setup.py dist

This will clean up the build folder and then run the bdist_wheel command.

Contributions

Before contributing, please set up the pre-commit hooks to reduce errors and ensure consistency:

pip install -U pre-commit
pre-commit install

Contact

[email protected]

Thanks to Russ and Wassim for cooperation :)

License

FreeBSD
