skywind3000 / ecdict
Free English to Chinese Dictionary Database
License: MIT License
File "D:\workspace\dictionary\ECDICT\stardict.py", line 570, in delete_all
with self.__conn as c:
AttributeError: __enter__
I'm a Python beginner. What causes this error?
There is no commit function, so convert_dict raises an error.
In delete_all:
self.__cursor.execute(sql1)
self.__conn.commit()
needs to be changed to
with self.__conn as c:
c.execute(sql1)
I don't know why.
Writing to a MySQL database produces several kinds of warnings:
Warning: Incorrect string value
Warning: Data truncated for column
sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in thread xxxx
The StarDict SQLite class raises the above exception when it is used from multiple threads.
As a simple workaround I added check_same_thread = False as the last parameter: "self.__conn = sqlite3.connect(self.__dbname, isolation_level = "IMMEDIATE", check_same_thread = False)".
This works for lookups and queries, but concurrent writes may still cause problems.
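The write concern mentioned above can be sketched as follows. This is a minimal illustration, not code from stardict.py: the class name SharedConnection and the table words are hypothetical. It relies only on the standard sqlite3 and threading modules, and assumes that a single connection is shared and that every access is serialized with a lock.

```python
import sqlite3
import threading


class SharedConnection:
    """Hypothetical wrapper: one SQLite connection shared by several threads."""

    def __init__(self, dbname):
        # check_same_thread=False only disables sqlite3's thread-ownership
        # check; it does not serialize access, so a lock is still needed.
        self._conn = sqlite3.connect(dbname, check_same_thread=False)
        self._lock = threading.Lock()
        with self._lock, self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS words (word TEXT PRIMARY KEY)")

    def add(self, word):
        # "with self._conn" commits on success and rolls back on error.
        with self._lock, self._conn:
            self._conn.execute(
                "INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))

    def count(self):
        with self._lock:
            row = self._conn.execute("SELECT COUNT(*) FROM words").fetchone()
            return row[0]


def demo():
    """Insert eight distinct words from eight threads and count the rows."""
    db = SharedConnection(":memory:")
    threads = [threading.Thread(target=db.add, args=("word%d" % i,))
               for i in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return db.count()
```

An alternative design is one connection per thread, which avoids the lock but leaves write contention to SQLite's own file locking.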
If you sort the ECDICT entries by line length, you find some extremely long entries that are plain nonsense, including:
This piece works well:
srcname = ur"bnc-words.csv"
dstname = ur"bnc-words.sqlite"
convert_dict(dstname, srcname)
However, with the file stardict.csv,
which was extracted from stardict.7z,
the same call:
srcname = ur"stardict.csv"
dstname = ur"stardict.sqlite"
convert_dict(dstname, srcname)
throws a MemoryError.
The same error is thrown when running from the command line:
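Is there an omission or a bug in the author's original data?
The current match function can only find entries that begin with a given word: be good at can be found by searching for be, but not by searching for good. I would like to be able to find phrases that contain a given word.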
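One way to get "contains a given word" behaviour is a LIKE query with a leading wildcard, followed by a whole-word filter. The sketch below is only an illustration under assumptions: the function match_containing is hypothetical (it is not part of stardict.py), and the table name stardict with columns id and word is assumed for the demo.

```python
import sqlite3


def match_containing(conn, word, limit=10):
    """Return (id, headword) rows whose headword contains `word` as a whole token."""
    # Escape LIKE wildcards so that user input is matched literally.
    escaped = (word.replace("\\", "\\\\")
                   .replace("%", "\\%")
                   .replace("_", "\\_"))
    cursor = conn.execute(
        "SELECT id, word FROM stardict "
        "WHERE word LIKE ? ESCAPE '\\' ORDER BY word",
        ("%" + escaped + "%",))
    target = word.lower()
    result = []
    for row_id, headword in cursor:
        # LIKE also matches inside longer words (goodness); keep whole tokens only.
        if target in headword.lower().split():
            result.append((row_id, headword))
            if len(result) >= limit:
                break
    return result


def demo_connection():
    """Build a tiny in-memory table for the demo (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stardict (id INTEGER PRIMARY KEY, word TEXT)")
    conn.executemany(
        "INSERT INTO stardict (word) VALUES (?)",
        [("be good at",), ("good",), ("goodness",), ("as good as",)])
    return conn
```

A leading-wildcard LIKE cannot use an ordinary index, so on the full dictionary this scans every row. A full-text index would be the usual remedy if that proves too slow.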
When I looked up submode, I found that the dictionary does not include it.
The translation should be 子模式.
Great work first of all! I was toying around with this.
To be honest I had trouble getting your example to work.
system: win7 x64
IDE: Wing IDE
python version: Python 2.7.10
An exception is thrown when running linguist.py.
See image below.
The file named "bnc-lemma.txt" doesn't exist.
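The first API is word lookup (String, the English word),
which returns all fields.
@skywind3000 If there are any caveats or suggestions, please let me know.
@Current dictionary users: if you have related needs, please share them so that they can be taken into account in the API and data-structure design.
Many thanks!
In lemma.en.txt the stem of of is have. Is that a mistake, or is it intended?
It would be great if that were possible.
Hello Wei,
I found on the forum that The little dict uses your data. A few entries need to be corrected so that the errors do not spread.
Change digitalrepresentati,,,on to digitalrepresentation: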
digitalrepresentati,,,on
n. 数位表示法,,,,,0,0,,,
Suggested corrections:
excitated,,,v.
vbl. 激发,,,,,0,0,,,
electroncephaloscop,,,e
n. 电脑波测量器,,,,,0,0,,,
frndamental,,,"n. 基本原理,基音,基谐波
a. 基本的,重要的,",,,,,0,0,,,
Suggested fix: the left parenthesis
aardvarks,ˈɑ:dvɑ:ks,-s form of aardvark\nn. nocturnal burrowing mammal of the grasslands of Africa that feeds on termites; sole extant representative of the order Tubulidentata,n. 土豚( aardvark的名词复数),,,,,0,0,0:aardvark/1:s3,,
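Suggested addition: https://github.com/leosongwei/cl-astrodict
Thanks! I really like your dictionary.
Hi, is there a way to export a word list from this dictionary, with one word per line?
For example: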
foo
bar
baz
...
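For example: book, look, go, mine
After 7 or 8 minutes it had only converted about 10%. Did I do something wrong?
The dictionary is too large, which makes it cumbersome to preview. Could you provide a preview version containing only a few dozen words, so that others can easily read the dictionary's format?
The rough idea is simple: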
say(mac only)
is used for pronunciation, and the implementation is simple:
~/Documents/dict/ecdict.db
mkdir ~/Documents/dict
sqlite3 ~/Documents/dict/ecdict.db
sqlite>.mode csv
sqlite>.import ecdict.csv dict
sqlite3 ~/Documents/dict/ecdict.db 'select word from dict;' > ~/Documents/dict/all.txt
Put the following in ~/bin/ecdict
and run chmod +x ~/bin/ecdict
#!/bin/sh
function __dict_search() {
echo $(sqlite3 ~/Documents/dict/ecdict.db "select word,phonetic,definition,translation,tag,frq,exchange from dict where word = '$1';" | awk -F '|' '{printf "%s [%s]\\n\\n%s\\n\\n%s\\n\\n%s\\n\\nfreq:%s\\n\\n%s", $1,$2,$3,$4,$5,$6,$7}')
}
__dict_search "$@"
Paste the following into ~/.bashrc
or ~/.profile
or a similar rc file, source it, and you can then use ec
to search.
function _ecdict() {
cmd='ecdict {}'
cat ~/Documents/dict/all.txt| fzf --bind "enter:execute(say {} &> /dev/null &)" --preview="$cmd" --preview-window=top:80%
}
alias ec=_ecdict
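It works well, and pressing Enter pronounces the word.
Thanks to the author for the large amount of groundwork! This is an excellent database!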
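function format($str){
    //default:
    if ($str) { // assumption: the guard was lost in extraction; this condition is a guess
        $str=preg_replace('/p:/', " 过去式: ", $str);   // past tense
        $str=preg_replace('/d:/', " 过去完成: ",$str);  // past participle
        $str=preg_replace('/i:/', " 现在分词: ",$str);  // present participle
        $str=preg_replace('/r:/', " 比较级: ",$str);    // comparative
        $str=preg_replace('/t:/', " 最高级: ",$str);    // superlative
        $str=preg_replace('/s:/', " 名词复数: ",$str);  // noun plural
        $str=preg_replace('/3:/', " 第3人单数: ",$str); // third-person singular
        $str=preg_replace('/\//', "",$str);             // drop the "/" separators
        return "\n词形变化:".$str;                       // "Inflections:"
    }
    return "";
}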
Strictly speaking, senario is a misspelling; the correct form is scenario.
import stardict
import os
my_dict = stardict.StarDict('stardict.db', verbose=True)
print('my_dict.match--------', my_dict.match('senario'))
returns
my_dict.match-------- [(2710465, 'senario'), (2710466, 'senarius'), (2710467, 'Senarmont'), (2710468, 'Senarmont compensator'), (2710469, 'senarmont plate'), (2710470, 'Senarmont polarimeter'), (2710471, 'Senarmont prism'), (2710472, 'senarmontite'), (2710473, 'senaru'), (2710474, 'senary')]
The correct form scenario is not in this list, so the result is not ideal. hunspell seems to do well here, as follows:
>>> from hunspell import Hunspell
>>> h = Hunspell();
>>> h.suggest('senario')
('scenario', 'arisen')
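For comparison, a dependency-free approximation of this kind of suggestion can be sketched with Python's standard difflib module. This is not what hunspell or stardict.py does internally; the helper suggest and the small vocabulary list are hypothetical, and in practice the vocabulary would be the dictionary's headwords.

```python
import difflib


def suggest(word, vocabulary, n=3, cutoff=0.7):
    """Return up to n vocabulary entries most similar to word, best first."""
    # get_close_matches ranks candidates by SequenceMatcher.ratio()
    # and drops those scoring below cutoff.
    return difflib.get_close_matches(word, vocabulary, n=n, cutoff=cutoff)


# Hypothetical mini vocabulary, taken from words that appear in this thread.
VOCABULARY = ["scenario", "senarius", "senary", "scenery", "stereo"]
```

With this vocabulary, suggest('senario', VOCABULARY) ranks scenario first. Comparing against every headword is linear in the size of the dictionary, so a real implementation would probably pre-filter candidates, for example by first letter or length.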
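Many thanks to the author!
My script found some exchange-related errors, a little over 100 entries, mainly:
oov exchange; possible causes:
1. Dictionary errors; for example, action is a noun and should not have inflected forms such as actioning.
2. An inflected form of a word coincides with an abbreviation; for example, aids, the plural of aid, is the same as AIDS.
case-sensitive exchange
3. Caused by inconsistent capitalization of the first letter.
bad exchange
4. An exchange item lists only the key and no value, for example the progressive form of bath: s:baths/3:baths/i:/p:bathed/d:bathed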
warning: case-sensitive exchange. word=aboriginal, exchange=s:aboriginals
error: oov exchange. word=action, exchange=i:actioning
error: oov exchange. word=action, exchange=p:actioned
error: oov exchange. word=ad, exchange=s:ads
error: oov exchange. word=aid, exchange=s:aids
error: oov exchange. word=aid, exchange=3:aids
error: oov exchange. word=aim, exchange=s:aims
error: oov exchange. word=aim, exchange=3:aims
warning: case-sensitive exchange. word=amalgamate, exchange=d:amalgamated
warning: case-sensitive exchange. word=amalgamate, exchange=p:amalgamated
warning: case-sensitive exchange. word=American, exchange=s:americans
warning: case-sensitive exchange. word=Arabian, exchange=s:arabians
warning: case-sensitive exchange. word=are, exchange=s:ares
warning: case-sensitive exchange. word=aria, exchange=s:arias
error: oov exchange. word=armored, exchange=0:armore
warning: case-sensitive exchange. word=bacchanal, exchange=s:bacchanals
error: bad exchange. word=bath, exchange=i:
error: oov exchange. word=bedclothes, exchange=0:bedclothe
warning: case-sensitive exchange. word=bishop, exchange=s:bishops
warning: case-sensitive exchange. word=bower, exchange=s:bowers
warning: case-sensitive exchange. word=bower, exchange=3:bowers
warning: case-sensitive exchange. word=bower, exchange=f:bowers
warning: case-sensitive exchange. word=brook, exchange=i:brooking
warning: case-sensitive exchange. word=brown, exchange=s:browns
warning: case-sensitive exchange. word=brown, exchange=i:browning
warning: case-sensitive exchange. word=brown, exchange=3:browns
warning: case-sensitive exchange. word=Buddhist, exchange=s:buddhists
warning: case-sensitive exchange. word=burrow, exchange=s:burrows
warning: case-sensitive exchange. word=burrow, exchange=3:burrows
warning: case-sensitive exchange. word=Byzantine, exchange=s:byzantines
warning: case-sensitive exchange. word=Canadian, exchange=s:canadians
error: bad exchange. word=can, exchange=i:
error: bad exchange. word=can, exchange=d:
warning: case-sensitive exchange. word=catholic, exchange=s:catholics
warning: case-sensitive exchange. word=chamber, exchange=s:chambers
warning: case-sensitive exchange. word=chamber, exchange=3:chambers
warning: case-sensitive exchange. word=Christmas, exchange=f:Christmases
warning: case-sensitive exchange. word=clement, exchange=s:clements
warning: case-sensitive exchange. word=Conservative, exchange=s:conservatives
warning: case-sensitive exchange. word=cozen, exchange=3:cozens
error: oov exchange. word=crossbones, exchange=0:crossbone
warning: case-sensitive exchange. word=Cuban, exchange=s:cubans
error: oov exchange. word=deforest, exchange=0:defore
warning: case-sensitive exchange. word=dive, exchange=3:dives
warning: case-sensitive exchange. word=dive, exchange=s:dives
warning: case-sensitive exchange. word=down, exchange=i:downing
warning: case-sensitive exchange. word=Easter, exchange=s:easters
error: oov exchange. word=evenhanded, exchange=0:evenhand
warning: case-sensitive exchange. word=eve, exchange=s:eves
error: oov exchange. word=frank, exchange=3:franks
warning: case-sensitive exchange. word=French, exchange=f:Frenches
warning: case-sensitive exchange. word=Friday, exchange=s:fridays
warning: case-sensitive exchange. word=gibbon, exchange=s:gibbons
warning: case-sensitive exchange. word=gold, exchange=i:golding
warning: case-sensitive exchange. word=Greek, exchange=s:greeks
warning: case-sensitive exchange. word=grove, exchange=s:groves
warning: case-sensitive exchange. word=hall, exchange=i:halling
warning: case-sensitive exchange. word=hawk, exchange=i:hawking
warning: case-sensitive exchange. word=headteacher, exchange=s:headteachers
warning: case-sensitive exchange. word=helm, exchange=s:helms
warning: case-sensitive exchange. word=higher, exchange=s:highers
warning: case-sensitive exchange. word=hilly, exchange=r:hillier
error: oov exchange. word=hip, exchange=s:hips
error: bad exchange. word=hyphen, exchange=i:
error: bad exchange. word=hyphen, exchange=d:
warning: case-sensitive exchange. word=Iceland, exchange=s:icelands
error: oov exchange. word=international, exchange=s:internationals
error: oov exchange. word=labored, exchange=0:labore
warning: case-sensitive exchange. word=lady, exchange=s:ladies
warning: case-sensitive exchange. word=lance, exchange=i:lancing
warning: case-sensitive exchange. word=Latin, exchange=s:latins
warning: case-sensitive exchange. word=mall, exchange=i:malling
warning: case-sensitive exchange. word=mar, exchange=3:mars
warning: case-sensitive exchange. word=Marxism, exchange=s:marxisms
warning: case-sensitive exchange. word=Marxist, exchange=s:marxists
warning: case-sensitive exchange. word=meadow, exchange=s:meadows
error: bad exchange. word=matter, exchange=i:
warning: case-sensitive exchange. word=mesh, exchange=p:meshed
warning: case-sensitive exchange. word=mesh, exchange=d:meshed
warning: case-sensitive exchange. word=Mexican, exchange=s:mexicans
warning: case-sensitive exchange. word=mire, exchange=d:mired
warning: case-sensitive exchange. word=mire, exchange=p:mired
warning: case-sensitive exchange. word=Monday, exchange=s:mondays
error: oov exchange. word=motley, exchange=b:motlier
error: oov exchange. word=motley, exchange=z:motliest
error: oov exchange. word=motley, exchange=f:motleys
error: oov exchange. word=neighboring, exchange=0:neighbore
warning: case-sensitive exchange. word=nut, exchange=i:nutting
error: oov exchange. word=odometer, exchange=0:odomete
warning: case-sensitive exchange. word=onion, exchange=s:onions
warning: case-sensitive exchange. word=penance, exchange=s:penances
error: oov exchange. word=pet, exchange=s:pets
error: oov exchange. word=pet, exchange=3:pets
warning: case-sensitive exchange. word=polished, exchange=0:polish
warning: case-sensitive exchange. word=pope, exchange=s:popes
error: oov exchange. word=pot, exchange=s:pots
error: oov exchange. word=pot, exchange=3:pots
warning: case-sensitive exchange. word=potter, exchange=s:potters
warning: case-sensitive exchange. word=potter, exchange=3:potters
warning: case-sensitive exchange. word=provisional, exchange=s:provisionals
error: oov exchange. word=purple, exchange=z:purplest
warning: case-sensitive exchange. word=regent, exchange=s:regents
warning: case-sensitive exchange. word=republican, exchange=s:republicans
error: oov exchange. word=replete, exchange=b:more replete
error: oov exchange. word=replete, exchange=z:most replete
error: bad exchange. word=reproof, exchange=d:
warning: case-sensitive exchange. word=rocket, exchange=s:rockets
warning: case-sensitive exchange. word=rocket, exchange=3:rockets
warning: case-sensitive exchange. word=rowdy, exchange=s:rowdies
error: oov exchange. word=rim, exchange=s:rims
error: oov exchange. word=rim, exchange=3:rims
warning: case-sensitive exchange. word=Saturday, exchange=s:saturdays
warning: case-sensitive exchange. word=scotsman, exchange=s:scotsmen
warning: case-sensitive exchange. word=Scottish, exchange=s:scottish
warning: case-sensitive exchange. word=Scottish, exchange=0:scottish
warning: case-sensitive exchange. word=seller, exchange=s:sellers
warning: case-sensitive exchange. word=shepherd, exchange=s:shepherds
warning: case-sensitive exchange. word=shepherd, exchange=3:shepherds
error: bad exchange. word=sick, exchange=i:
error: bad exchange. word=sick, exchange=p:
error: bad exchange. word=sick, exchange=d:
error: oov exchange. word=ski, exchange=d:ski'd
warning: case-sensitive exchange. word=Soviet, exchange=s:soviets
warning: case-sensitive exchange. word=spear, exchange=s:spears
warning: case-sensitive exchange. word=spear, exchange=3:spears
warning: case-sensitive exchange. word=staple, exchange=s:staples
warning: case-sensitive exchange. word=staple, exchange=3:staples
warning: case-sensitive exchange. word=state, exchange=s:states
warning: case-sensitive exchange. word=state, exchange=3:states
warning: case-sensitive exchange. word=summer, exchange=s:summers
warning: case-sensitive exchange. word=summer, exchange=3:summers
warning: case-sensitive exchange. word=Sunday, exchange=s:sundays
warning: case-sensitive exchange. word=synoptic, exchange=s:synoptics
warning: case-sensitive exchange. word=tar, exchange=i:tarring
warning: case-sensitive exchange. word=Thursday, exchange=s:thursdays
warning: case-sensitive exchange. word=transformer, exchange=s:transformers
warning: case-sensitive exchange. word=Trident, exchange=s:tridents
error: oov exchange. word=trip, exchange=s:trips
error: oov exchange. word=trip, exchange=3:trips
error: oov exchange. word=trumpeter, exchange=0:trumpete
warning: case-sensitive exchange. word=Tuesday, exchange=s:tuesdays
warning: case-sensitive exchange. word=twin, exchange=s:twins
warning: case-sensitive exchange. word=twin, exchange=3:twins
error: oov exchange. word=unblemished, exchange=0:unblemish
error: oov exchange. word=unexpected, exchange=0:unexpect
error: oov exchange. word=ungrudging, exchange=0:ungrudge
error: oov exchange. word=unnoticed, exchange=0:unnotice
error: oov exchange. word=unstinting, exchange=0:unstint
error: oov exchange. word=unthreatening, exchange=0:unthreaten
error: oov exchange. word=up, exchange=s:ups
error: oov exchange. word=upbraid, exchange=0:upbray
error: oov exchange. word=vine, exchange=s:vines
warning: case-sensitive exchange. word=vicar, exchange=s:vicars
warning: case-sensitive exchange. word=walkman, exchange=s:walkmans
warning: case-sensitive exchange. word=ware, exchange=i:waring
warning: case-sensitive exchange. word=Wednesday, exchange=s:wednesdays
warning: case-sensitive exchange. word=wilt, exchange=3:wilts
warning: case-sensitive exchange. word=winter, exchange=s:winters
warning: case-sensitive exchange. word=winter, exchange=3:winters
error: bad exchange. word=worst, exchange=i:
error: bad exchange. word=worst, exchange=p:
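The script that produced the report above is not shown. The following is only a sketch of one possible reading of the three categories, under these assumptions: "bad" means a key with an empty value, "case-sensitive" means the form exists as a headword only with different capitalization, and "oov" means the form is not a headword at all. The function names and the entries mapping (headword to exchange string) are hypothetical.

```python
def parse_exchange(exchange):
    """Split 's:baths/i:/p:bathed' into [('s', 'baths'), ('i', ''), ('p', 'bathed')]."""
    pairs = []
    for item in exchange.split("/"):
        if not item:
            continue
        key, _, value = item.partition(":")
        pairs.append((key, value))
    return pairs


def check_exchange(entries):
    """entries maps headword -> exchange string; returns a list of report lines."""
    exact = set(entries)
    lowered = {word.lower() for word in entries}
    report = []
    for word, exchange in entries.items():
        for key, value in parse_exchange(exchange):
            item = "%s:%s" % (key, value)
            if not value:
                kind = "error: bad exchange."
            elif key == "1" or value in exact:
                # '1' holds a transformation code rather than a word form.
                continue
            elif value.lower() in lowered:
                kind = "warning: case-sensitive exchange."
            else:
                kind = "error: oov exchange."
            report.append("%s word=%s, exchange=%s" % (kind, word, item))
    return report
```

Because the category definitions are guesses, the output of this sketch will not necessarily match the report line for line.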
For example, I looked up senario in the Bing online dictionary:
https://cn.bing.com/dict/search?q=senario&go=搜索&qs=ds&form=Z9LH5
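The word is misspelled, and Bing then offered the following "Did you mean ..." suggestions. What algorithm is this? Is it used in this project?
Did you mean
Similar-sounding words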
scenario
(可能发生的)情况, 前景; 剧情概要, 剧本;方案,方式,规划
scenarios
(可能发生的)情况, 前景; 剧情概要, 剧本;方案,方式,规划
stereo
立体(声)的
stereos
立体(声)的
scenery
风景;景色;风光;景致;背景;风景画;舞台面;scenery布景
Similar-looking words
scenario
(可能发生的)情况, 前景; 剧情概要, 剧本;方案,方式,规划
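Many thanks for making this excellent dictionary!
However, the phonetic field currently does not support multiple transcriptions for words with more than one pronunciation. Would you consider supporting this?
For example, the word wind has two pronunciations, [wind]
and [waind],
depending on its meaning, and the word record has two pronunciations, ['rekɔ:d]
and [ri'kɔ:d],
depending on its part of speech, and so on.
Do the junior-high and senior-high word lists have a finer breakdown by grade, such as first year or second year of junior high?
For example, does the so-called CET-4/CET-6 exam syllabus come from the 2016 word list? http://cet.neea.edu.cn/html1/folder/16113/1588-1.htm
What about the postgraduate entrance exam in English?
TOEFL and IELTS do not seem to have official word lists, so I don't know where that data came from.
The so-called 宝 tag refers to the GRE Red Book (红宝书); which edition is it?
It would be even better if the author could add pronunciation audio and example sentences. Thanks!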
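I want to use the 0 item of exchange to strip the inflection from a word.
The exchange of mirroring is 0:mirrore/1:i/s:mirrorings. Isn't this mirrore wrong?
Quite a few words have similar problems. For example, the exchange of strolled is 0:strol/1:pd, and strol is not a word.
Is this a bug?
I checked, and the Bing version has the same content, so all versions probably have similar problems.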
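Until such entries are corrected, a defensive way to use the 0 item is to accept the lemma only when it is itself a headword. The sketch below assumes a plain mapping from headword to exchange string; the function lemma_of and the sample data layout are hypothetical, with exchange strings taken from this thread.

```python
def lemma_of(word, entries):
    """Return the '0:' lemma of word if that lemma is itself a headword.

    entries maps headword -> exchange string. If the lemma is missing or
    is not a headword (e.g. 'mirrore'), the word is returned unchanged.
    """
    for item in entries.get(word, "").split("/"):
        key, _, value = item.partition(":")
        if key == "0" and value in entries:
            return value
    return word


# Exchange strings as quoted in this thread; 'polish' is given an empty
# exchange here purely so that it counts as a headword in the demo.
SAMPLE = {
    "mirroring": "0:mirrore/1:i/s:mirrorings",
    "strolled": "0:strol/1:pd",
    "polished": "0:polish",
    "polish": "",
}
```

With SAMPLE, lemma_of("polished", SAMPLE) yields polish, while mirroring and strolled are returned unchanged because mirrore and strol are not headwords. This hides the bad data rather than fixing it, so strolled is still not reduced to stroll.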
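Can this database be used commercially as it is? I want to build an English-learning app that uses this database.
Great work! A year has already passed since the latest release. Do you have any plans to release a new version? Thanks.
Is there a ready-made SQL file that you could share? I have tried many times without success.
For example, for the word malignant the exam coverage is "研托雅宝". The first three should be the postgraduate entrance exam, TOEFL, and IELTS, but I cannot guess the last one. Could it be the Red Book (红宝书)?
I searched the forum and the wiki and found nothing.
If some basic categories could be defined, as in some Chinese-English picture dictionaries (for example transportation or animals), and a category field added to each entry, the dictionary would be much more powerful. 😊
For example, when I look up incorrect, I would like incorrectly to be shown as well;
it would be even better if correct and correctly could also be shown.
This is how I use it: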
import stardict
dict = stardict.StarDict(r'C:\Users\i\Downloads\简明增强-css\concise-enhanced.mdx.db', verbose = True)
result = dict.query('hi')
print(result)
Why is there no query result?
The word treadmill also has the meanings 跑步机 (running machine) and 走步机 (walking machine).
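These are the dictionary formats I know of: mdx, eudic, stardict, CSV, SQLite, MySQL
Which format gives the shortest time from looking up a word to getting the definition back?
What I have learned so far from this project is:
Using CSV locally is painful because large files are slow to open. So for my own use I usually convert it to a local SQLite database, which is much faster: there is essentially no waiting, and looking up words is quick.
But the other formats are not discussed. Please share your views.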
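The CSV-to-SQLite conversion described above can be sketched as follows. This is not the project's convert_dict: the function names, the table name dict, and the reduced two-column schema are assumptions for illustration. Rows are streamed and inserted in fixed-size batches so that the whole CSV never has to sit in memory, and the primary key on word gives indexed lookups.

```python
import csv
import io
import sqlite3


def csv_to_sqlite(csv_file, conn, batch_size=1000):
    """Stream rows from an open CSV file into SQLite in fixed-size batches."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dict ("
        "word TEXT PRIMARY KEY COLLATE NOCASE, translation TEXT)")
    insert = "INSERT OR REPLACE INTO dict (word, translation) VALUES (?, ?)"
    batch, total = [], 0
    for row in csv.DictReader(csv_file):
        batch.append((row["word"], row["translation"]))
        if len(batch) >= batch_size:
            with conn:  # one transaction per batch
                conn.executemany(insert, batch)
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        with conn:
            conn.executemany(insert, batch)
        total += len(batch)
    return total


def lookup(conn, word):
    """Case-insensitive exact lookup via the primary-key index."""
    row = conn.execute(
        "SELECT translation FROM dict WHERE word = ?", (word,)).fetchone()
    return row[0] if row else None


def demo():
    """Convert a three-row in-memory CSV and return the connection."""
    sample = io.StringIO(
        "word,phonetic,translation\nfoo,,first\nbar,,second\nbaz,,third\n")
    conn = sqlite3.connect(":memory:")
    csv_to_sqlite(sample, conn, batch_size=2)
    return conn
```

For a real file, open it with open(path, newline="", encoding="utf-8") and pass the file object in place of the StringIO.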
This website says that you have to pay for it. Is there a free version?
https://www.wordfrequency.info/
A network error is reported partway through the download.
The forum dictionary and the version 26 dictionary have an sw field.
I see many pos values such as j:100 m:100.
What do they mean?
In "stardict.py", cursors are not closed after use, and I am not sure whether this causes a memory leak.
In the sqlite3 C source code, the "close" function, pysqlite_cursor_close, is not empty:
(void)pysqlite_statement_reset(self->statement);
Py_CLEAR(self->statement);
So I think it is better to close cursors as soon as possible, to save memory when running on cheaper cloud VMs.
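Closing cursors promptly can be sketched with contextlib.closing from the standard library. The function count_words and the table words are hypothetical and only stand in for a query method in stardict.py.

```python
import sqlite3
from contextlib import closing


def count_words(conn):
    """Run one query and close its cursor as soon as the result is read."""
    # closing() calls cursor.close() on exit, even if the query raises.
    with closing(conn.cursor()) as cursor:
        cursor.execute("SELECT COUNT(*) FROM words")
        return cursor.fetchone()[0]


def demo():
    """Build a two-row in-memory table and count its rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (word TEXT)")
    conn.executemany("INSERT INTO words VALUES (?)", [("foo",), ("bar",)])
    return count_words(conn)
```

sqlite3 cursors are not context managers themselves, which is why closing() is used here. Note that "with conn:" on a connection manages the transaction, not the lifetime of cursors.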
I also looked at the pdawiki download links, and this format is not there either.
Thank you very much!