heta-io / tap
Text Analytics Pipeline (TAP)
Home Page: https://heta-io.github.io/tap/
License: Apache License 2.0
This is different from the affective expressions, but could contribute to them.
From @andrewresearch on July 13, 2017 8:3
Copied from original issue: uts-cic/tap-api#22
From @andrewresearch on August 31, 2017 0:51
Copied from original issue: uts-cic/tap-graphql#5
From @andrewresearch on July 13, 2017 2:6
Allow for nlytx-commons dependency for travis CI build.
Copied from original issue: uts-cic/tap-api#18
As AWA may need to interact with TAP differently from other services, a dedicated AWA GraphQL endpoint would allow AWA-specific schema additions. It may also be possible to manage performance better by collecting separate stats for AWA and general requests.
This can be copied from the athanor server, but we also need to include basic connection integration tests.
We need to be able to load a text file that has common academic section headings (such as introduction, methodology/methods, results, discussion, conclusion) and retrieve the appropriate sections.
This would need to be able to pull text between major sections and strip out sub-headings.
It would also need to indicate a confidence level (i.e. if the sections were easily identifiable and therefore high confidence, or if there were complications which reduce the confidence of getting a good section).
It would be good for this code to be generalisable, so that 'profiles' could be created to suit different academic publishing formats.
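A minimal sketch of the section extraction described above, written in Python purely for illustration (TAP itself is not implemented this way). Everything named here is an assumption: `DEFAULT_PROFILE`, `extract_sections`, the sub-heading heuristic, and the confidence measure (the fraction of profile sections found exactly once) are hypothetical choices, not an agreed design.

```python
import re

# Hypothetical profile: canonical section name -> accepted heading variants.
# Other publishing formats could supply their own profile dictionary.
DEFAULT_PROFILE = {
    "introduction": ["introduction"],
    "methods": ["methodology", "methods", "method"],
    "results": ["results"],
    "discussion": ["discussion"],
    "conclusion": ["conclusion", "conclusions"],
}

# Strips single-level numbering such as "1. " or "2 " from a major heading.
MAJOR_NUMBERING = re.compile(r"^\d+\.?\s+")
# Multi-level numbering such as "2.1 " is treated as a sub-heading marker.
SUB_NUMBERING = re.compile(r"^\d+(\.\d+)+\s")


def is_subheading(line):
    """Crude heuristic (an assumption): numbered like 2.1, or a short line
    with no terminal punctuation."""
    if SUB_NUMBERING.match(line):
        return True
    return len(line.split()) <= 6 and line[-1] not in ".?!:;,"


def extract_sections(document, profile=DEFAULT_PROFILE):
    """Return ({section name: text}, confidence between 0 and 1)."""
    variants = {v: name for name, vs in profile.items() for v in vs}
    sections, counts, current = {}, {}, None
    for line in document.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        key = MAJOR_NUMBERING.sub("", stripped).lower()
        if key in variants:
            # A major heading starts (or resumes) a section.
            current = variants[key]
            counts[current] = counts.get(current, 0) + 1
            sections.setdefault(current, [])
        elif current is not None and not is_subheading(stripped):
            sections[current].append(stripped)
    # Confidence: share of expected sections that appeared exactly once.
    found_once = sum(1 for name in profile if counts.get(name) == 1)
    confidence = found_once / len(profile)
    return {n: "\n".join(ls) for n, ls in sections.items()}, confidence
```

Text before the first recognised heading is ignored, and repeated or missing headings lower the confidence value. A real implementation would likely need a richer heading detector than the line-length heuristic used here.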
Need to add NER to token annotations, but care needs to be taken that it is not too time consuming to produce.
We may need to set some additional input variables, so that this can be an option?
Need to be able to monitor TAP performance and retrieve near-real-time stats.
We need to automate the deployment of releases to ECS.
Suggested approach is blue/green deployment
Code is available on the awslabs github page
Can we have documentation on what features are output from TAP, please? We need information on what each module does and whether its output is at sentence or word level, since the modules work at different levels. Also, the TAP Athanor tags, for example, are different from the XIP parser analytical tags. Can we have documentation on what they mean?
Rather than wait for first query, ensure that Factorie has loaded all models at application start time.
Also ensure the language is loaded for LanguageTool.
It would be useful to be able to query the results of this startup process. Need to write this information to whatever gets used for logging system analytics and metadata.
The project is using paradox and sbt-site to create documentation as part of the build process.
The documentation is stored on GitHub pages.
We need the current project to be updated with stub documentation files in order to build structured documentation with sbt-site, and the GitHub pages updated with the stub pages to reflect the structure of the documentation.
From @andrewresearch on May 26, 2017 1:59
Can include sentence length
Copied from original issue: uts-cic/tap-api#6
Need to find the correct Factorie variables for representing the dependencies (as close to UD as possible) and ensure that these are correctly mapped to the token annotations.
This also needs documenting, as at the moment it could be a significant source of confusion (i.e. what is parent and child? What if there are multiple children?)
Apollo Engine Docs
Scala trace example
Possibly need to test the performance hit of this proxy to ensure it does not add too much overhead to the service.
Need a health endpoint that responds with a status 200 and basic json:
{
"message": "ok"
}
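As an illustration only, the health check could look like the following Python sketch built on the standard library's `http.server`. The `/health` path, the `HealthHandler` and `serve` names, and the 404 body are assumptions; the issue specifies only the 200 status and the JSON body.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_response():
    """Status code and JSON body requested in the issue."""
    return 200, json.dumps({"message": "ok"})


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # "/health" is an assumed path; the issue does not name one.
        if self.path == "/health":
            status, body = health_response()
        else:
            status, body = 404, json.dumps({"message": "not found"})
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Blocks forever; call this from an entry point to run the server."""
    HTTPServer(("127.0.0.1", port), HealthHandler).serve_forever()
```

Keeping the response in a plain function makes it easy to check without opening a socket.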
In particular, we need a common way of creating word and sentence vectors, so ideally we'll create a model or models up front, and then TAP can process new text using those existing models.
From @andrewresearch on May 24, 2017 4:13
Copied from original issue: uts-cic/tap-api#4
For example, HTML-encoded text prevents the sentence parser from processing sentences, so no results are returned. In this case, it would be helpful to return a sensible message stating that the query appears to be encoded rather than in raw text form.
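One possible way to detect this case, sketched in Python for illustration. The function name `check_raw_text`, the regular expressions, and the message wording are all assumptions; the idea is simply to look for HTML entities or tags before the text reaches the sentence parser.

```python
import html
import re

# Named, decimal, or hexadecimal character references such as &amp; or &#39;
ENTITY = re.compile(r"&(?:[a-zA-Z][a-zA-Z0-9]*|#\d+|#x[0-9a-fA-F]+);")
# Something that looks like an opening or closing HTML tag.
TAG = re.compile(r"</?[a-zA-Z][^>]*>")


def check_raw_text(text):
    """Return None if the text looks raw, otherwise a message for the client."""
    # html.unescape changes the text only if a recognised entity is present.
    if ENTITY.search(text) and html.unescape(text) != text:
        return ("The query appears to be HTML encoded rather than raw text; "
                "please decode it before submitting.")
    if TAG.search(text):
        return ("The query appears to contain HTML markup rather than raw "
                "text; please strip the tags before submitting.")
    return None
```

A bare ampersand or comparison such as `R&D` or `a < b` is not flagged, because neither forms a complete entity or tag.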
From @andrewresearch on May 26, 2017 1:58
Need to implement expression analysis and attach to endpoint
Copied from original issue: uts-cic/tap-api#5