Name: Xavier Tannier
Type: User
Company: Sorbonne Université, Inserm, Limics, Polytech Sorbonne
Bio: Professor at Sorbonne Université (formerly known as Univ. Pierre et Marie Curie, UPMC). Teaches at Polytech Sorbonne.
Researcher at LIMICS.
Location: Paris, France
Blog: http://xavier.tannier.free.fr/
Xavier Tannier's Projects
Extracts the title and creation time from a web page.
Tool for fast concept and rule-based extraction for dummies.
Hyper-parameter optimization for sklearn
A tokenizer for French
NCRF++, an Open-source Neural Sequence Labeling Toolkit. It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components. (code for COLING/ACL 2018 paper)
Natural language structuring library
"Python Rule-based feAture sTructure Analysis" or "Python Rule-bAsed Text Analysis"
Transformers made simple, with training, evaluation, and prediction possible in one line each. Currently supports Sequence Classification (binary, multiclass, multilabel, sentence pair), Token Classification (NER), Question Answering, Language Modeling, Regression, Conversational AI, and Multi-Modal tasks. Built on top of the Hugging Face Transformers library.
Term extraction
WebAnnotator is a tool for annotating Web pages. It is implemented as a Firefox extension (https://addons.mozilla.org/en-US/firefox/addon/webannotator/) and allows annotation of both offline and online pages. The HTML rendering is fully preserved, and all annotations consist of new HTML spans with specific styles. WebAnnotator provides an easy, general-purpose framework and is made available under the CeCILL free license (close to the GNU GPL; see the license text), so that use and further contributions are simple. All parts of an HTML document can be annotated: text, images, videos, tables, menus, etc. Annotations are created by selecting a part of the document and clicking on the relevant type and subtypes; the annotated elements are then highlighted in a specific color. Users define annotation schemas by creating a simple DTD that represents the types and subtypes to be highlighted. Finally, annotations can be saved (as HTML with the highlighted parts of documents) or exported (in a machine-readable format).
Yet Another SEquence Tagger