Philip May's Projects
Useful extra functionality for TensorFlow 2.0 maintained by SIG-addons
✨ Argilla: Open-source platform empowering teams to build better language models through human feedback
Go ahead and axolotl questions
Prometheus exporter for Btrfs
Bitnami Helm Charts
German T5 Training corpus
Pruner for (nested) cross-validation
Native iOS app using the exposure notification framework from Apple.
🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Toolset to create a German NLP training set with labels from German Wikipedia as a reference for NLP experiments.
Python scripts to process the German Wikipedia dump and generate a German text corpus for supervised word representation learning, especially for training a BILM.
The corresponding code from our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!
Detectron2 is FAIR's next-generation platform for object detection and segmentation.
TensorFlow documentation
A PyTorch implementation of EfficientNet
Repository for my profile README
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Eniak Blog with Sphinx
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry.
Library for fast text representation and classification.
Experiments with Kaggle's Credit Card Fraud Detection
During my research, I happen to enjoy implementing algorithms from scratch. Through Fundamentals, I like to share my experiences with others.
The official Python client for the Huggingface Hub.
Distributed Asynchronous Hyperparameter Optimization in Python
Deep Learning for humans
Preprint: LESS: Selecting Influential Data for Targeted Instruction Tuning
LLM Training Data
LLM training code
Longformer: The Long-Document Transformer