samueldobbie / markup

A web-based document annotation tool, powered by GPT-4 :rocket:

Home Page: https://getmarkup.com

License: MIT License

HTML 0.23% CSS 0.08% TypeScript 95.79% Dockerfile 0.15% PLpgSQL 3.75%
active-learning text-annotation natural-language-processing annotation-tool machine-learning sequence-to-sequence nlp data-science text-annotation-tool data-labeling

markup's Introduction

Markup Annotation Tool for ML and NLP

Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate to predict and suggest complex annotations, and also provides integrated access to common and custom ontologies for concept mapping.

Key Features

  • Predictive annotation - Markup's machine learning-powered predictive annotation feature suggests complex annotations as you work, making the process of annotating documents more efficient and saving you valuable time.

  • Integrated ontology access - Markup provides integrated access to a wide range of common ontologies (e.g. UMLS, SNOMED-CT, ICD-10), as well as the ability to upload custom ontologies, for concept mapping.

  • Predictive ontology mapping - Markup's predictive ontology mapping feature uses machine learning to suggest appropriate mappings to standard and custom terminologies based on the text you're annotating.

  • User-friendly interface - Whether you're a technical expert or a beginner, Markup's user-friendly interface makes it easy for anyone to start annotating documents with minimal setup.

Installation

To install and run Markup locally:

  1. Clone the repository and install dependencies: git clone https://github.com/samueldobbie/markup && cd markup && yarn install
  2. Install the Supabase CLI
  3. Start Supabase: supabase start. This generates and outputs an API URL and anon key. Add both to the .env.local file
  4. (Optional) Add an OpenAI API key to the .env.local file
  5. Run the development server: yarn start
  6. Open Markup in your web browser: http://localhost:3000

Usage

To get started with Markup, read the quick start guide.

Contributions

Contributions to Markup are appreciated. If you'd like to contribute, please follow these guidelines:

  1. Fork the repository
  2. Create a new branch for your feature
  3. Make your changes
  4. Submit a pull request for review

Support

If you have any questions or need assistance with Markup, you can contact me at [email protected].

markup's People

Contributors

arronlacey, dependabot[bot], huwstrafford, samueldobbie

markup's Issues

Build Document Query functionality

  • Users would be able to search documents for specific annotations in .ann files, e.g. "find all documents where someone has focal epilepsy or CUI 39192XXXXX"

  • Could also search non-annotated letters with a zero-shot (or other) model to retrieve letters, with users accepting or rejecting the results to improve it
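
The first bullet could be prototyped roughly as below. This is only a sketch, not Markup's implementation: it assumes brat-style text-bound lines ("T1", entity name, start, end, text) with a single continuous span, and the names TextAnnotation, parseTextAnnotations and findDocuments are hypothetical.

```typescript
// Hypothetical shape for one text-bound annotation (a "T" line in a .ann file).
interface TextAnnotation {
  id: string;
  entity: string;
  start: number;
  end: number;
  text: string;
}

// Parse the "T" lines of a brat-style .ann file; every other line type is skipped.
// Assumption: each annotation has a single continuous span.
function parseTextAnnotations(ann: string): TextAnnotation[] {
  const pattern = /^(T\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(.*)$/;
  const result: TextAnnotation[] = [];
  for (const line of ann.split(/\r?\n/)) {
    const match = pattern.exec(line.trim());
    if (match) {
      result.push({
        id: match[1],
        entity: match[2],
        start: Number(match[3]),
        end: Number(match[4]),
        text: match[5],
      });
    }
  }
  return result;
}

// Return the names of .ann files holding at least one annotation whose text
// contains the query (case-insensitive).
function findDocuments(annFiles: Record<string, string>, query: string): string[] {
  const needle = query.toLowerCase();
  return Object.keys(annFiles).filter((name) =>
    parseTextAnnotations(annFiles[name]).some(
      (annotation) => annotation.text.toLowerCase().indexOf(needle) !== -1
    )
  );
}
```

Searching by CUI would need the ontology mapping to be stored alongside each annotation, which this sketch does not model.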

Record multiple values for the same attribute

If I have an entity A with an attribute B, could we allow multiple values to be recorded for B? For example, if B denotes risk factors for heart disease, there might be a drop-down list that includes smoking and high BMI, and we would want to select both. Currently we can only pick one, and the workaround is to annotate two separate entities, one for each attribute value.

Change entity type of annotation

I find I end up in the following situation quite a bit:

I have just annotated an entity, and then I select some more text to annotate something else. The span gets entered as the entity I last used (it's still toggled as that entity on the left hand side) but if I wanted that span to be a different entity I would need to delete the annotation, toggle to the correct entity, then re-highlight.

Would it be possible for the highlighted span and the annotation on the right hand side to change automatically to the new entity when you toggle to the correct entity? This would save having to delete the annotation and re-highlight.

Customizing markup for a special use-case

Hello Samuel,

I apologize for raising this as an issue, but I couldn't find any other way to contact you.

I have a specific use-case for your brilliant annotation tool and would like to collaborate with you on it. Could you please provide some contact information, such as an email address or any other way to get in touch with you?

My email is: [email protected]

Cheers,

Integrate HuggingFace

  • Pull down a HuggingFace model and see annotations
  • Interact with HuggingFace Trainer?

Blank page on localhost:3000 after installation

Hi. I wanted to install the Markup tool locally on my M1 Mac (MacBook Pro 14" with Brave Browser). After the yarn install finishes, I always get the following error in the browser console:

caught Error: supabaseUrl is required.
at new SupabaseClient (SupabaseClient.ts:87:1)
at createClient (index.ts:38:1)
at ./src/utils/Supabase.ts (Supabase.ts:5:1)
at options.factory (react refresh:6:1)
at __webpack_require__ (bootstrap:24:1)
at fn (hot module replacement:62:1)
at ./src/providers/AuthProvider.tsx (Support.tsx:23:1)
at options.factory (react refresh:6:1)
at __webpack_require__ (bootstrap:24:1)
at fn (hot module replacement:62:1)

... and the page stays blank. I was not able to fix the issue myself. I would be very thankful for any advice :)

Thank you very much!
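
The message "supabaseUrl is required." appears to be what the Supabase client raises when it is created with an empty URL, which would be consistent with the API URL from installation step 3 missing from .env.local. A guard along the following lines could make that failure easier to diagnose. It is a sketch only: requireEnv and the variable name REACT_APP_SUPABASE_URL are hypothetical, and the names Markup actually reads may differ.

```typescript
// Hypothetical helper: read a required setting and fail with a readable message
// instead of passing an empty value on to the Supabase client.
function requireEnv(env: Record<string, string | undefined>, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing ${name}: add it to .env.local and restart the dev server`);
  }
  return value;
}

// Example values only; the real ones are printed by `supabase start`.
const exampleEnv: Record<string, string | undefined> = {
  REACT_APP_SUPABASE_URL: "http://localhost:54321", // assumed variable name
};

const exampleSupabaseUrl = requireEnv(exampleEnv, "REACT_APP_SUPABASE_URL");
```

In a real build the first argument would be the environment object exposed by the bundler rather than a hand-written record.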

Add more demo corpuses

  • We could add different domain types for a wider audience
  • CoNLL conference competitions have fully annotated data in brat format, which would look great on startup
  • Can use different domains to show different functionality, e.g. UMLS for health

Errors in .ann files

EXAMPLE .ANN FILE

T1 AnnotationName1 218 233 String1
A1 Feature1 T1 A
A2 Feature2 T1 B
A3 Feature3 T1 C
T2 AnnotationName2 262 272 String2
A2 Feature4 T2 Other

If you have the same identifier for features (A2) in one .ann file then GATE will only add one of the features.
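
One way to catch this before export is sketched below: detect repeated IDs and renumber the attribute lines. The function names are hypothetical, the sketch assumes the ID is the first whitespace-separated token on each line, and it assumes nothing else in the file refers to attribute IDs.

```typescript
// Return every annotation ID that appears on more than one line of a .ann file.
function findDuplicateIds(ann: string): string[] {
  const seen: Record<string, number> = Object.create(null);
  const duplicates: string[] = [];
  for (const line of ann.split(/\r?\n/)) {
    const id = line.trim().split(/\s+/)[0];
    if (!id) continue; // skip blank lines
    seen[id] = (seen[id] || 0) + 1;
    if (seen[id] === 2) duplicates.push(id); // report each duplicate once
  }
  return duplicates;
}

// Rewrite attribute ("A") IDs as A1, A2, ... in order of appearance.
function renumberAttributes(ann: string): string {
  let next = 1;
  return ann
    .split(/\r?\n/)
    .map((line) => (/^A\d+\s/.test(line) ? line.replace(/^A\d+/, `A${next++}`) : line))
    .join("\n");
}

const exampleAnn = [
  "T1 AnnotationName1 218 233 String1",
  "A1 Feature1 T1 A",
  "A2 Feature2 T1 B",
  "A3 Feature3 T1 C",
  "T2 AnnotationName2 262 272 String2",
  "A2 Feature4 T2 Other",
].join("\n");

const exampleDuplicates = findDuplicateIds(exampleAnn); // → ["A2"]
const exampleFixed = renumberAttributes(exampleAnn); // last line becomes "A4 Feature4 T2 Other"
```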

Map Workspaces to git repo

There seems to be an opportunity to map the following to git repos for version control purposes:

  1. config files
  2. annotation guidelines
  3. custom ontologies

If Workspaces could point to a repo and a selected branch, and automatically use the latest commit on that branch, users would have less work to do to keep their Workspace up to date.

You need to provide valid config file. ERROR

I am attempting to create a new page, similar to annotate, called analyse. Initially, I simply duplicated the annotate file and renamed all relevant annotate variables to analyse. Now, when I upload a folder and try to start a session, I receive the error shown below, which asks me to provide a valid config file. I am unsure why this happens: having followed the functions called when the button is clicked, everything for analyse is the same as for annotate. So I am uncertain whether the issue lies in another function, one that actually uploads the files selected on the setup page.

Any help with this issue would be greatly appreciated and please let me know if you need me to elaborate any more.

Thank you, William

Screenshot 2022-06-15 at 13 52 44

Enable bulk upload of annotations

Uploading annotations is currently slow as it has to be done file-by-file. Should update to allow for bulk uploading, where uploaded annotations will be matched to documents based on the file name (e.g. annotations from Demo-Doc-123.ann will map to Demo-Doc-123.txt)
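
The file-name matching described above could be sketched as follows. The names baseName, MatchResult and matchAnnotations are hypothetical, and the sketch assumes that names match exactly once the final extension is removed.

```typescript
// Strip the final extension: "Demo-Doc-123.ann" -> "Demo-Doc-123".
function baseName(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  return dot > 0 ? fileName.slice(0, dot) : fileName;
}

interface MatchResult {
  pairs: { annotation: string; document: string }[];
  unmatched: string[]; // annotation files with no corresponding document
}

// Pair each annotation file with the document that shares its base name.
function matchAnnotations(annotationFiles: string[], documentFiles: string[]): MatchResult {
  const documentsByBase: Record<string, string | undefined> = Object.create(null);
  for (const documentFile of documentFiles) {
    documentsByBase[baseName(documentFile)] = documentFile;
  }

  const result: MatchResult = { pairs: [], unmatched: [] };
  for (const annotationFile of annotationFiles) {
    const documentFile = documentsByBase[baseName(annotationFile)];
    if (documentFile !== undefined) {
      result.pairs.push({ annotation: annotationFile, document: documentFile });
    } else {
      result.unmatched.push(annotationFile);
    }
  }
  return result;
}
```

Returning the unmatched files separately lets the interface report which uploads were skipped instead of dropping them silently.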

config-creator Page not found 404

image

The error is in the screenshot above.

How to reproduce: click on config-creator on the docs page; this page not found error appears.

Configure settings

Page is currently a placeholder. Users should be able to update their email and password at a minimum.

"Live" config preview in the config editor

Currently the config editor populates a preview of the JSON on the fly as the form on the left hand side is filled in. For users who are not familiar with, or don't really care about, the underlying JSON, a view more in line with the aesthetics of the left hand annotation panel (i.e. coloured entities) might be better.

CRLF text documents

After exporting, the annotations are no longer lined up with the text. This is also the case visually when annotating.
No issue with LF text documents.
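
One possible cause, stated here as an assumption rather than a diagnosis, is that offsets are counted against text in which each CRLF has been collapsed to a single LF, so every preceding line break shifts the span by one character. Under that assumption, a conversion back to offsets in the original file might look like this (toOriginalOffset is a hypothetical name):

```typescript
// Map an offset measured in LF-normalised text to the matching offset in the
// original text, which may use CRLF line endings.
function toOriginalOffset(original: string, normalisedOffset: number): number {
  let originalIndex = 0;
  let normalisedIndex = 0;
  while (normalisedIndex < normalisedOffset && originalIndex < original.length) {
    // A CRLF pair counts as a single character in the normalised text.
    if (original[originalIndex] === "\r" && original[originalIndex + 1] === "\n") {
      originalIndex += 2;
    } else {
      originalIndex += 1;
    }
    normalisedIndex += 1;
  }
  return originalIndex;
}

const crlfText = "ab\r\ncd\r\nef";
// In the LF-normalised text "ab\ncd\nef", "cd" spans offsets 3 to 5.
const crlfStart = toOriginalOffset(crlfText, 3); // → 4
const crlfEnd = toOriginalOffset(crlfText, 5); // → 6
```

The alternative is to normalise line endings once on upload and export the normalised text alongside the annotations, so that both always share the same offsets.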
