Coder Social home page Coder Social logo

Gege Sun's Projects

- icon -

搜索所有中文NLP数据集,附常用英文NLP数据集

advancedliteratemachinery icon advancedliteratemachinery

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

aisystem icon aisystem

AISystem 主要是指AI系统,包括AI芯片、AI编译器、AI推理和训练框架等AI全栈底层技术

awesome-llm icon awesome-llm

Awesome-LLM: a curated list of Large Language Model

chatgpt-next-web icon chatgpt-next-web

A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). 一键拥有你自己的跨平台 ChatGPT/Gemini 应用。

coldataset icon coldataset

The official repository of the paper: COLD: A Benchmark for Chinese Offensive Language Detection

competition-baseline icon competition-baseline

数据挖掘、计算机视觉、自然语言处理、推荐系统竞赛知识、代码、思路

daily-interview icon daily-interview

Datawhale成员整理的面经,内容包括机器学习,CV,NLP,推荐,开发等,欢迎大家star

fairseq icon fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

flask-zh icon flask-zh

翻译自Miguel Grinberg的blog https://blog.miguelgrinberg.com 的2017年新版The Flask Mega-Tutorial教程

handson-ml icon handson-ml

⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.

hanlp icon hanlp

中文分词 词性标注 命名实体识别 依存句法分析 成分句法分析 语义依存分析 语义角色标注 指代消解 风格转换 语义相似度 新词发现 关键词短语提取 自动摘要 文本分类聚类 拼音简繁转换 自然语言处理

harvesttext icon harvesttext

文本挖掘和预处理工具(文本清洗、新词发现、情感分析、实体识别链接、关键词抽取、知识抽取、句法分析等),无监督或弱监督方法

hfl-anthology icon hfl-anthology

Collections of resources from Joint Laboratory of HIT and iFLYTEK Research (HFL)

jionlp icon jionlp

中文 NLP 预处理、解析工具包,准确、高效、易用 A Chinese NLP Preprocessing & Parsing Package www.jionlp.com

keyword_extraction icon keyword_extraction

利用Python实现中文文本关键词抽取,分别采用TF-IDF、TextRank、Word2Vec词聚类三种方法。

lit icon lit

The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.

llmsurvey icon llmsurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

ml-visuals icon ml-visuals

🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.

mnbvc icon mnbvc

MNBVC(Massive Never-ending BT Vast Chinese corpus)超大规模中文语料集。对标chatGPT训练的40T数据。MNBVC数据集不但包括主流文化,也包括各个小众文化甚至火星文的数据。MNBVC数据集包括新闻、作文、小说、书籍、杂志、论文、台词、帖子、wiki、古诗、歌词、商品介绍、笑话、糗事、聊天记录等一切形式的纯文本中文数据。

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.