Comments (7)
from gojieba.
Oh, the usage looks fine. With a dictionary of this order of magnitude, 64 GB of memory may really not be enough...
@yanyiwu OK, thanks 🙏
My suggestion is to clean up the dictionary; it looks like the dictionary was not built very sensibly.
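Step 1: split your dictionary, which is on the order of 50 million lines, into several batches. For example, split it into 500 files, so that each batch only has to process 100,000 lines.
Step 2: deduplicate the processed results.
Alternatively, open the document as a stream, read one line at a time, and segment each line as you go.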
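The streaming alternative, combined with the deduplication from step 2, could look roughly like the sketch below. It is only an illustration under stated assumptions: `SegmentFunc`, `UniqueTokens`, and `demo` are hypothetical names invented here, and `strings.Fields` stands in for the real segmenter so the example runs with the standard library alone. In a real program you would presumably pass a closure that calls gojieba's `Cut` instead.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// SegmentFunc is a hypothetical stand-in for a real segmenter call
// (for example, a closure around a gojieba instance).
type SegmentFunc func(line string) []string

// UniqueTokens reads r one line at a time, segments each line, and
// returns the distinct tokens in first-seen order. Only the current
// line and the set of unique tokens are held in memory.
func UniqueTokens(r io.Reader, segment SegmentFunc) ([]string, error) {
	seen := make(map[string]struct{})
	var tokens []string

	scanner := bufio.NewScanner(r)
	// Allow lines of up to 1 MiB; the scanner's default limit is 64 KiB.
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue // skip blank lines
		}
		for _, tok := range segment(line) {
			if _, ok := seen[tok]; ok {
				continue // already recorded, so this is a duplicate
			}
			seen[tok] = struct{}{}
			tokens = append(tokens, tok)
		}
	}
	return tokens, scanner.Err()
}

// demo runs UniqueTokens on a small in-memory input, using
// whitespace splitting as a placeholder segmenter.
func demo() ([]string, error) {
	input := "a b\nb c\n\nc d a\n"
	return UniqueTokens(strings.NewReader(input), strings.Fields)
}

func main() {
	tokens, err := demo()
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println(tokens) // prints [a b c d]
}
```

To process a file instead of a string, pass the `*os.File` returned by `os.Open` as the reader. Note that the set of unique tokens still has to fit in memory; if it does not, the batch approach from step 1, with one output file per batch and a final merge, is the safer choice.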