This project forked from acdh-oeaw/spacyapp

spacyapp is a small web application that exposes spaCy's NLP functionality over HTTP.

Home Page: https://spacyapp.acdh.oeaw.ac.at/

License: MIT License

spacyapp's Introduction

nlp - An NLP-App/Service

spacyapp is an NLP service provided by ACDH. It is built around spaCy, extends spaCy's functionality, and provides an easy-to-use web service.

The service is currently under heavy development. So far it provides:

  • REST API endpoints for all services
  • an endpoint that provides a standard spaCy pipeline
  • an endpoint that uses spaCy to extract named entities
  • an endpoint that returns POS tags for the tokens provided
  • a pipeline endpoint that allows batch processing of TEI documents:
    • accepts a ZIP of TEI documents
    • uses the xtx tokenizer developed at ACDH to tokenize TEI documents while preserving existing tags
    • lets you choose between a TreeTagger-based service (also developed at ACDH) and spaCy for POS tagging
    • provides the processed files as zipped TEIs
    • notifies logged-in users by email when their job is finished (processing many TEI files can take a while)
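
Since the batch endpoint accepts a ZIP of TEI documents, a client has to build such an archive first. The following is a minimal sketch of that step using only the Python standard library. The helper name `build_tei_zip` and the sample documents are assumptions made for illustration and are not part of spacyapp. The upload URL and form fields are not documented here, so the sketch stops before the HTTP request.

```python
import io
import zipfile


def build_tei_zip(documents):
    """Pack {file name: TEI XML string} into an in-memory ZIP archive.

    Hypothetical helper for illustration; not part of spacyapp.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, xml in documents.items():
            # Store each document as UTF-8 encoded bytes under its file name.
            archive.writestr(name, xml.encode("utf-8"))
    return buffer.getvalue()


# Minimal TEI-like sample documents (made up for this example).
SAMPLE_DOCS = {
    "doc1.xml": '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>Hello world.</p></body></text></TEI>',
    "doc2.xml": '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>Second text.</p></body></text></TEI>',
}

if __name__ == "__main__":
    payload = build_tei_zip(SAMPLE_DOCS)
    # Re-open the archive to check which files it contains.
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        print(archive.namelist())
```

The resulting bytes can be written to disk or sent as a file upload with whichever HTTP client you prefer.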

See https://spacyapp.acdh.oeaw.ac.at/ for a running version.

install

  • clone the repo
  • set up a virtual environment (optional)
  • install the required packages (pip install -r requirements.txt)

customize settings

spacyapp uses modularized settings. To start the development server you'll need to pass a settings parameter, e.g. python manage.py runserver --settings=spacyapp.settings.dev

celery settings

Settings for Celery are stored in celery_settings.py. Celery depends on the Django settings. You can either

  • provide them as environment variables (TODO: add example)
  • or adapt the following line in celery_settings.py: os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'spacyapp.settings.dev_custom')
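
The two options interact through `os.environ.setdefault`, which only writes a value when the variable is not already set. The sketch below illustrates that precedence, assuming celery_settings.py contains the setdefault line quoted above. The function name `resolve_settings_module` is hypothetical and exists only for this example.

```python
import os


def resolve_settings_module(default="spacyapp.settings.dev_custom"):
    """Return the settings module in effect after the setdefault() call.

    Hypothetical helper: it mirrors the setdefault line from
    celery_settings.py to show which value wins.
    """
    # setdefault leaves an already exported DJANGO_SETTINGS_MODULE untouched.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", default)
    return os.environ["DJANGO_SETTINGS_MODULE"]


if __name__ == "__main__":
    # Case 1: nothing exported, so the default from the file is used.
    os.environ.pop("DJANGO_SETTINGS_MODULE", None)
    print(resolve_settings_module())

    # Case 2: a value set in the environment takes precedence over the default.
    os.environ["DJANGO_SETTINGS_MODULE"] = "spacyapp.settings.dev"
    print(resolve_settings_module())
```

In practice this means that setting DJANGO_SETTINGS_MODULE in the shell before starting the worker should be enough, and the line in celery_settings.py only needs editing if you want a different fallback.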

start the app

  • start spacyapp: python manage.py runserver --settings=spacyapp.settings.dev
  • start a Celery worker: celery -A celery_settings worker --loglevel=info
    • on Windows you'll need to add --pool=solo
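
If you prefer to launch both processes from a script, the commands above can be assembled programmatically. This is only a sketch: the function names `runserver_command` and `celery_worker_command` are made up for illustration, and the Windows check relies on `sys.platform` starting with "win".

```python
import sys


def runserver_command(settings_module="spacyapp.settings.dev"):
    """Build the development server command for the given settings module."""
    return [sys.executable, "manage.py", "runserver", f"--settings={settings_module}"]


def celery_worker_command(platform=sys.platform):
    """Build the Celery worker command, adding --pool=solo on Windows."""
    command = ["celery", "-A", "celery_settings", "worker", "--loglevel=info"]
    if platform.startswith("win"):
        # The solo pool is the Windows workaround mentioned above.
        command.append("--pool=solo")
    return command


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    print(" ".join(runserver_command()))
    print(" ".join(celery_worker_command()))
```

Each list can be passed to `subprocess.Popen` from the repository root to start the corresponding process.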

Spacy Active learning (spacyal)

spacyal is a Python package for training your own spaCy language models using active learning. To plug spacyal into spacyapp you'll need to

  • install the package: pip install spacyal
  • add spacyal to your project's INSTALLED_APPS, e.g. in spacyapp/settings/base.py
  • add spacyal.urls and spacyal.api_urls to your project's main URL definition in spacyapp/urls.py, something like
urlpatterns = [
    ...
    path('spacyal_api/', include('spacyal.api_urls')),
    path('spacyal/', include('spacyal.urls')),
    ...
]
  • run python manage.py migrate --settings=spacyapp.settings.your_custom_settings

For further information about spacyal, please refer to the spacyal project.

spacyapp's People

Contributors

csae8092, sennierer, aureon249, alexanderwatzinger, zxenia
