Leon Derczynski's Projects
Data and software for building the ACL Anthology.
Official style files for papers submitted to venues of the Association for Computational Linguistics
ACL Rolling Review website
CMU ARK Twitter Part-of-Speech Tagger
Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure"
An experimental open-source attempt to make GPT-4 fully autonomous.
autoredteam: code for training models that automatically red team other language models
A curated list of awesome resources for Danish language technology
Public repo for HF blog posts
branchLSTM model from Turing at SemEval-2017 Task 8: Sequential Approach to Rumour Stance Classification with Branch-LSTM
Track and predict the energy consumption and carbon footprint of training deep learning models.
Automatically exported from code.google.com/p/cavat
Preparation for a UIMA / GATE / etc. workshop at COLING 2014
Markdown-formatted Creative Commons licenses
Dataset of Teen Cyberbullying scenarios in French
The Danish Gigaword project
DaNLP is a repository for Natural Language Processing resources for the Danish Language.
🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Dataset for the Emerging & Novel Entity NER task (WNUT '17)
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
Framework for doing NER and other types of entity recognition in Python
Library for fast text representation and classification.
Fighting Fantasy gamebook sound board
LLM vulnerability scanner