lajavaness / annotto
Annotto is the only go to annotation tool to successfully annotate your documents at scale
License: Apache License 2.0
Add the audio datatype to the project as an available type, as well as to the items.
Initially, audio items will be displayed with the HTML audio element: https://developer.mozilla.org/fr/docs/Web/HTML/Element/audio.
This issue requires a refactoring of the Annotation page component (see issue #15).
There should be no impact on the labeling types.
This functionality must be added to allow classification and transcription of audio files.
As suggested by @timothee-LJN in this comment, we should add a generic implementation of react-i18next mocks so that it can be used easily in the tests of all components.
Is your feature request related to a problem? Please describe.
It is currently impossible to use Annotto for images without access to S3.
Describe the solution you'd like
Add an option for the image projects, allowing to fetch files from local storage.
E.g. create a Docker volume containing the images to use; item.data.url would then contain a path to an image within this volume.
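The idea of letting item.data.url point either at remote storage or at a file inside a mounted volume could look like the sketch below. Everything here is hypothetical: the `ImageSource` type, the `resolveImageSource` function and the `/data/images` mount point are illustrative names, not part of the Annotto codebase.

```typescript
// Hypothetical result type: either a remote URL or a file inside the volume.
type ImageSource =
  | { kind: "remote"; url: string }
  | { kind: "local"; path: string };

// Assumed mount point of the Docker volume inside the container.
const LOCAL_ROOT = "/data/images";

function resolveImageSource(url: string, root: string = LOCAL_ROOT): ImageSource {
  // Anything with a scheme (s3://, https://, ...) is left to the existing handling.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(url)) {
    return { kind: "remote", url };
  }
  // Otherwise treat the value as a path relative to the volume root.
  const segments = url.split("/").filter((s) => s !== "" && s !== ".");
  // Refuse paths that would climb out of the volume.
  if (segments.some((s) => s === "..")) {
    throw new Error("Path escapes the image volume: " + url);
  }
  return { kind: "local", path: [root.replace(/\/+$/, ""), ...segments].join("/") };
}
```

With these assumptions, `resolveImageSource("cats/1.png")` yields the local path `/data/images/cats/1.png`, while an `s3://` URL is returned unchanged as a remote source.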
Describe the bug
When a user has already been registered in Annotto, role changes made in Keycloak do not take effect.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Your role in Keycloak should reflect the one persisted in Annotto mongodb.
Desktop (please complete the following information):
Add the html datatype to the project as an available type, as well as to the items.
There should be no impact on the labeling types.
You should use the sanitize function of shared/utils/htmlUtils to clean up the HTML code and safely insert external content into the annotation page.
Currently the TextItemContainer component is used to display HTML content. This functionality should be removed from that component and moved to a new component, so that content is displayed as text or as HTML depending on the type of project.
Describe the bug
The Annotto container does not start correctly. It exits just after startup, with the exception listed below.
To Reproduce
Steps to reproduce the behavior:
docker run -d --name annotto -p 3000:3000 ljnrepo/annotto:latest
docker logs annotto
09:41:23,478 INFO [org.keycloak.services] (ServerService Thread Pool -- 59) KC-SERVICES0031: Import of realm 'annotto' requested. Strategy: IGNORE_EXISTING
09:41:23,688 FATAL [org.keycloak.services] (ServerService Thread Pool -- 59) Error during startup: java.lang.NullPointerException
at [email protected]//org.keycloak.services.managers.RealmManager.setupMasterAdminManagement(RealmManager.java:304)
at [email protected]//org.keycloak.services.managers.RealmManager.importRealm(RealmManager.java:525)
at [email protected]//org.keycloak.exportimport.util.ImportUtils.importRealm(ImportUtils.java:110)
at [email protected]//org.keycloak.exportimport.dir.DirImportProvider$4.runExportImportTask(DirImportProvider.java:138)
at [email protected]//org.keycloak.exportimport.util.ExportImportSessionTask.run(ExportImportSessionTask.java:35)
at [email protected]//org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:250)
at [email protected]//org.keycloak.exportimport.dir.DirImportProvider.importRealm(DirImportProvider.java:134)
at [email protected]//org.keycloak.exportimport.ExportImportManager.runImport(ExportImportManager.java:90)
at [email protected]//org.keycloak.services.resources.KeycloakApplication.bootstrap(KeycloakApplication.java:207)
at [email protected]//org.keycloak.services.resources.KeycloakApplication$1.run(KeycloakApplication.java:136)
at [email protected]//org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:250)
at [email protected]//org.keycloak.services.resources.KeycloakApplication.startup(KeycloakApplication.java:128)
at [email protected]//org.keycloak.provider.wildfly.WildflyPlatform.onStartup(WildflyPlatform.java:36)
at [email protected]//org.keycloak.services.resources.KeycloakApplication.<init>(KeycloakApplication.java:114)
Desktop (please complete the following information):
Description:
Currently, Annotto only allows for assigning a single tag to a word or group of words during annotation. However, there is a need to enhance Annotto's annotation capabilities by introducing the ability to handle overlaps in Named Entity Recognition (NER) annotations. This feature would enable users to assign multiple tags to a word, even if it is already part of a tagged group of words.
Implementation Details:
UI/UX: Update the user interface to support the viewing and management of overlapping annotations. This could involve incorporating visual cues or indicators to differentiate between different annotations on the same word.
Backend: No modification is expected on the backend, as the annotation management can remain the same. The existing data structure and storage mechanism can continue to handle annotations effectively.
Additional context:
Improved Annotation Flexibility: Enabling the assignment of multiple tags to a word provides users with increased flexibility in their annotations. This allows for more precise labeling of complex entities or multiple aspects within a single word.
Enhanced Training Data Quality: With the ability to assign overlapping tags, Annotto can produce higher-quality training datasets. This is particularly useful in cases where multiple entities coexist within the same text span.
Streamlined Annotation Process: By eliminating the need for manual workarounds or separate annotations for overlapping entities, this feature simplifies the annotation process, saving time and effort for users.
Compatibility: Ensure backward compatibility with existing annotated datasets and models, allowing for a smooth transition to the new overlapping annotation functionality.
The addition of overlap management to Annotto's NER annotations would significantly enhance its annotation capabilities, making it a more powerful tool for training machine learning models that require precise entity recognition.
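As a rough illustration of what overlapping annotations mean for the UI, the sketch below models each annotation as a character span with a tag and looks up every tag covering a given position. The `NerSpan` shape and both helper functions are assumptions made for this example; the issue does not describe how Annotto actually stores NER annotations.

```typescript
// Hypothetical shape of an NER annotation: character offsets plus a tag.
interface NerSpan {
  start: number; // inclusive character offset
  end: number;   // exclusive character offset
  tag: string;
}

// Returns every tag whose span covers the given character offset.
function tagsAt(spans: NerSpan[], offset: number): string[] {
  return spans
    .filter((s) => s.start <= offset && offset < s.end)
    .map((s) => s.tag);
}

// True when two spans share at least one character.
function overlaps(a: NerSpan, b: NerSpan): boolean {
  return a.start < b.end && b.start < a.end;
}

// Example text: "Bank of America Tower" (21 characters).
const exampleSpans: NerSpan[] = [
  { start: 0, end: 21, tag: "BUILDING" },     // the whole phrase
  { start: 0, end: 15, tag: "ORGANIZATION" }, // "Bank of America"
  { start: 8, end: 15, tag: "LOCATION" },     // "America"
];
```

Here `tagsAt(exampleSpans, 10)` returns all three tags, because the character at offset 10 lies inside "America", while `tagsAt(exampleSpans, 17)` returns only `BUILDING`. A lookup of this kind is what a UI would need in order to draw several indicators on the same word.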
Timeseries are structured data
t1 v1
t2 v2
...
The goal would be to select a span ti -> tj that would be labelled as an anomaly.
The span could cover the whole time series (from start to end); in that case, this would be a time series classification task.
There could be multiple time series at once (not sure how to handle this case).
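The span-labelling idea can be sketched with a small data model. The `Point` and `SpanLabel` types and the two helpers are hypothetical and only illustrate the single-series case; the multi-series case is left open, as in the description.

```typescript
// Hypothetical types: a series is a list of (t, v) points.
interface Point {
  t: number;
  v: number;
}

// A label applied to the span from..to (both bounds inclusive, expressed in t).
interface SpanLabel {
  from: number;
  to: number;
  label: string;
}

// Points of the series that fall inside the labelled span.
function pointsInSpan(series: Point[], span: SpanLabel): Point[] {
  return series.filter((p) => p.t >= span.from && p.t <= span.to);
}

// A span covering every point amounts to labelling the whole series.
function coversWholeSeries(series: Point[], span: SpanLabel): boolean {
  return series.length > 0 && series.every((p) => p.t >= span.from && p.t <= span.to);
}

const series: Point[] = [
  { t: 1, v: 0.2 },
  { t: 2, v: 0.3 },
  { t: 3, v: 9.7 },
  { t: 4, v: 9.9 },
  { t: 5, v: 0.1 },
];
const anomaly: SpanLabel = { from: 3, to: 4, label: "anomaly" };
```

With this sample data, `pointsInSpan(series, anomaly)` selects the two points at t = 3 and t = 4, and a span from 1 to 5 would make `coversWholeSeries` return true.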
Describe the bug
When trying to add items to a (new or existing) project according to the structure of the 'item file' that is given as an example in the section 'Create projet' of the documentation (https://lajavaness.github.io/annotto-docs/fr/docs/user-manual/create-project),
an error is returned to the user and the Annotto Docker container then exits.
Note that a possible fix is shared at the end of this issue.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Annotto should add three items to the selected project.
Screenshots
Desktop (please complete the following information):
Additional context
Server is on Docker 23.0.1 on Debian GNU/Linux 11 (bullseye) with Annotto release 1.2.7 (2023-03-30)
Files in JSONL format normally have no comma separator between lines (only an LF or CR+LF separator). The example 'items.jsonline' file does have such commas, so it is not surprising that it does not work.
After removing the commas between lines (at the end of each JSON line in the jsonline file, except the last one, which had no trailing comma) and keeping the LF or CR+LF separators, Annotto's item file import seems to work well.
One suggestion is to update the Annotto documentation by removing the comma separators between lines in the example(s)*.
*: This has not been tested yet, but the same problem could affect other jsonline examples such as images.jsonline, which also contain comma separators that might need to be removed.
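For reference, the cleanup described above can be expressed as a small tolerant parser. This is only a sketch of the workaround, not Annotto's actual import code; `parseJsonLines` is a hypothetical name.

```typescript
// Parses JSON Lines text, tolerating a single trailing comma at the end of a line
// (the layout used in the documentation example) and both LF and CR+LF separators.
function parseJsonLines(text: string): unknown[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    // A valid JSON value never ends with a comma, so stripping one is safe.
    .map((line) => JSON.parse(line.replace(/,$/, "")));
}
```

Strict JSONL needs no such tolerance, so correcting the documentation example remains the simpler fix.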
see also :
https://github.com/lajavaness/annotto-docs/blob/main/docs/user-manual/create-project.md
The goal would be to change the way configuration is handled. At the moment a merge system based on NODE_ENV is in place, but it does not correctly take into account the possible fallback to environment variables. More precisely, if NODE_ENV=development and ENV_TEST_XXXXX=titi, and the development.ts configuration file contains { test: 'toto' } while the parent config contains { test: process.env.ENV_TEST_XXXXX }, then the final merged configuration will have test=toto.
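The behaviour described above can be reproduced in a few lines. This is a sketch under assumptions: the merge function and its ordering are guesses at how a NODE_ENV-based merge typically works, and the environment is passed in as a plain object instead of reading process.env so the example stays self-contained.

```typescript
type EnvVars = Record<string, string | undefined>;
type PartialConfig = { test?: string };

// Parent config: reads the value from the environment.
const parentConfig = (env: EnvVars): PartialConfig => ({ test: env["ENV_TEST_XXXXX"] });

// Stand-in for development.ts: a hard-coded value.
const developmentConfig: PartialConfig = { test: "toto" };

// Assumed merge order: the NODE_ENV-specific file is applied last, so it wins.
function mergedConfig(env: EnvVars): PartialConfig {
  const overrides = env["NODE_ENV"] === "development" ? developmentConfig : {};
  return { ...parentConfig(env), ...overrides };
}

const merged = mergedConfig({ NODE_ENV: "development", ENV_TEST_XXXXX: "titi" });
// merged.test is "toto": the value "titi" from the environment is overridden.
```

The hard-coded value in the environment-specific file shadows the environment variable, which is the ambiguity the issue describes.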
Describe the solution you'd like
We should refactor this by removing all ["development.ts", "production.ts", ...] files and keeping only the parent config.ts file to remove any ambiguity, or use an existing, proven solution such as convict.
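One possible shape for the single-file approach is sketched below. The `AppConfig` interface, the `loadConfig` function and the choice of 'toto' as the default are assumptions for illustration; the environment is passed in as an argument so the precedence is explicit and easy to test.

```typescript
interface AppConfig {
  test: string;
}

// Single source of truth: the environment variable wins, and the default is
// used only when the variable is not set. No per-environment file can shadow it.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  return {
    test: env["ENV_TEST_XXXXX"] ?? "toto",
  };
}
```

In an application, the function would typically be called once at startup with process.env. A library such as convict covers the same ground with a declared schema, at the cost of an extra dependency.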
Describe the bug
When I open a project and click on an item, I get the error shown in the screenshot, with this stack trace:
RangeError [ERR_HTTP_INVALID_STATUS_CODE]: Invalid status code: undefined
at new NodeError (node:internal/errors:387:5)
at ServerResponse.writeHead (node:_http_server:314:11)
at ServerResponse.writeHead (/usr/src/app/node_modules/on-headers/index.js:44:26)
at ServerResponse._implicitHeader (node:_http_server:305:8)
at write_ (node:_http_outgoing:867:9)
at ServerResponse.end (node:_http_outgoing:977:5)
at ServerResponse.send (/usr/src/app/node_modules/express/lib/response.js:221:10)
at ServerResponse.json (/usr/src/app/node_modules/express/lib/response.js:267:15)
at errorHandlerMiddleware (/usr/src/app/dist/src/utils/error.js:28:22)
at Layer.handle_error (/usr/src/app/node_modules/express/lib/router/layer.js:71:5)
at trim_prefix (/usr/src/app/node_modules/express/lib/router/index.js:315:13)
at /usr/src/app/node_modules/express/lib/router/index.js:284:7
at Function.process_params (/usr/src/app/node_modules/express/lib/router/index.js:335:12)
at next (/usr/src/app/node_modules/express/lib/router/index.js:275:10)
at Layer.handle_error (/usr/src/app/node_modules/express/lib/router/layer.js:67:12)
at trim_prefix (/usr/src/app/node_modules/express/lib/router/index.js:315:13)
Expected behavior
The item should open correctly.
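The stack trace shows the response being sent from errorHandlerMiddleware with an undefined status code, which Node rejects. A defensive helper along the following lines would avoid that secondary failure. It is a sketch only: `safeStatus` is a hypothetical name, and the `status` / `statusCode` properties it inspects are an assumption about what the thrown errors may carry, since the actual middleware is not shown here.

```typescript
// Hypothetical helper: picks a status code that Node's writeHead will accept.
function safeStatus(err: unknown, fallback: number = 500): number {
  const candidate: { status?: unknown; statusCode?: unknown } =
    typeof err === "object" && err !== null ? err : {};
  const value = candidate.status ?? candidate.statusCode;
  // Node only accepts codes from 100 to 999, so undefined, strings and
  // out-of-range numbers all fall back to the default.
  return typeof value === "number" && Number.isInteger(value) && value >= 100 && value <= 999
    ? value
    : fallback;
}
```

In an Express error handler, the result could be passed to res.status(...) before calling res.json(...), so that an error without a status produces a 500 response instead of crashing the handler. This would not fix the underlying error, only make it visible to the client.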
Desktop (please complete the following information):
We would like to upgrade Keycloak to the latest version: 22.0.0
Keycloak 22 offers
Bug description
We retrieve and run the latest ljnrepo/annotto image as specified:
docker run --rm -d --name annotto -p 3000:3000 ljnrepo/annotto:latest
We can log in through http://localhost:3000/ and we get the following home page listing existing projects:
Accessing any project by clicking on its name or on the related Annotate link leads to a never-ending loader view with an "An error occured" notification.
To Reproduce
Steps to reproduce the behavior:
Sign In
DEMO Zone and Text : CV - Extraction (for instance, or any other project link to project/id/)
Expected behavior
We don't know yet what should happen here
Screenshots
N/A see previous sections
Desktop (please complete the following information):
Additional context
No particular suspicious logs from docker logs...
Did I miss something (at keycloak level or something)?
Describe the bug
When annotating an image, the annotation page opens but the image doesn't load.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The image should load
Additional context
On this project, all images are hosted on the s3 bucket "s3://pollentrack", which is accessible with the rnd-data AWS profile (but not the default one). Special credentials have been created that can only see and access this specific bucket, and these have been loaded in the config file.
Add the video datatype to the project as an available type, as well as to the items.
Initially, video items will be displayed with the HTML video element: https://developer.mozilla.org/fr/docs/Web/HTML/Element/video.
This issue requires a refactoring of the Annotation page component (see issue #15).
There should be no impact on the labeling types.
This functionality must be added in order to allow classification and transcription of videos.