
roshanson / textinfoexp


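Natural language processing experiments (Sogou dataset): TF-IDF, text classification, clustering, word vectors, sentiment recognition, relation extraction, and more.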

Python 28.88% C++ 17.56% HTML 13.77% Makefile 0.60% Java 16.70% M4 10.45% C 11.88% MATLAB 0.15%
nlp python

textinfoexp's People

Contributors

haibaoy


textinfoexp's Issues

Missing files

Could the author upload the complete set of files? The text files are missing whenever I run the code.

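Chinese synonym library

Hi, hello.

Sogou's open corpus is of good quality, and wikidata is good too. Below is a word2vec model I built.
https://github.com/huyingxi/Synonyms
Feel free to compare and use it, and let's improve it together. Thanks!

Regarding the similarity calculation method given here:
https://github.com/Roshanson/TextInfoExp/tree/master/Part4_Word_Similarity/get_similarity

We could evaluate the two together:
Synonyms uses https://github.com/fssqawj/SentenceSim/blob/master/train.txt to find the best model parameters, and then reaches 88% accuracy on https://github.com/fssqawj/SentenceSim/blob/master/dev.txt.
For details, see: chatopera/Synonyms#6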
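
The tune-on-train, score-on-dev procedure described in this issue could look roughly like the sketch below. It is only an illustration under assumptions: the issue does not say which parameters Synonyms tunes or how train.txt and dev.txt are laid out, so the sketch assumes a single decision threshold applied to precomputed similarity scores, with a binary label per sentence pair. The function names and the toy numbers are hypothetical.

```python
def accuracy(scores, labels, threshold):
    """Fraction of pairs whose thresholded score matches the gold label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def best_threshold(scores, labels, candidates):
    """Pick the candidate threshold with the highest accuracy on the given set."""
    return max(candidates, key=lambda t: accuracy(scores, labels, t))


# Hypothetical similarity scores and labels (1 = similar, 0 = not similar).
train_scores, train_labels = [0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]
dev_scores, dev_labels = [0.7, 0.4, 0.6], [1, 0, 0]

# Tune on the training split only, then report accuracy on the dev split.
threshold = best_threshold(train_scores, train_labels, [0.1, 0.5, 0.95])
dev_accuracy = accuracy(dev_scores, dev_labels, threshold)
```

Choosing the threshold on the training split and reporting accuracy only on the dev split keeps the two similarity methods comparable, since neither is tuned on the data used for the final number.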

About the dataset

I'd like to ask which Sogou dataset was used for training. Thanks.

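About the training set

The corpus used to train the word vectors in part4 is not documented at all. If it is convenient, please upload the corpus. If not, please at least describe it, for example which corpus was used and where to download it.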

About part2_text_classify

The code in this chapter's section on getting and labeling the data is as follows:
data = pd.read_table('Art.txt', header=None, sep=',')
data2 = pd.read_table('Computer.txt', header=None, sep=',')
data3 = pd.read_table('Sports.txt', header=None, sep=',')
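However, Art.txt and the other two files are nowhere to be found in the code or the related resources. Could these three files be uploaded? Thanks.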
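
For anyone who hits the same problem, a small guard around the three reads makes the failure explicit. This is a minimal sketch, not the repository's code: the helper name `load_labeled`, the label strings, and the added `label` column are assumptions made for illustration, and `pd.read_csv` is used in place of `pd.read_table` since the separator is a comma.

```python
from pathlib import Path

import pandas as pd


def load_labeled(files, data_dir="."):
    """Read each category file, tag its rows with the category label,
    and raise one error listing every file that is missing."""
    frames, missing = [], []
    for label, name in files.items():
        path = Path(data_dir) / name
        if not path.is_file():
            missing.append(name)
            continue
        df = pd.read_csv(path, header=None, sep=",")
        df["label"] = label  # hypothetical label column, not in the original code
        frames.append(df)
    if missing:
        raise FileNotFoundError("missing data files: " + ", ".join(missing))
    return pd.concat(frames, ignore_index=True)


# File names come from the issue; the label strings are assumptions.
CATEGORY_FILES = {"art": "Art.txt", "computer": "Computer.txt", "sports": "Sports.txt"}
```

Calling `load_labeled(CATEGORY_FILES, "path/to/data")` returns one combined DataFrame once all three files are present, and otherwise reports exactly which ones still need to be supplied.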
