Comments (12)
Another example is ようこそ.
from kakugo.
Also 方 (かた) in the meaning "person".
I used the words from JMdict for kakugo. The problem is that the dictionary is HUGE, so I had to filter it. I chose to keep only the words tagged "ichi1":
ichi1/2: appears in the "Ichimango goi bunruishuu", Senmon Kyouiku Publishing, Tokyo, 1998. (The entries marked "ichi2" were demoted from ichi1 because they were observed to have low frequencies in the WWW and newspapers.)
(source)
And even just that gives a lot of entries. It's an arbitrary choice, but it's hard to find a criterion that keeps only the words useful for a learner...
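As a rough illustration of this kind of filter (not the actual script used for kakugo), here is a minimal sketch that keeps only entries carrying a given priority tag. It assumes the standard JMdict XML layout, where priority tags sit in `ke_pri` and `re_pri` elements; the two sample entries are hand-written from the tags quoted in this thread, and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the JMdict layout, based on tags quoted in this thread.
SAMPLE = """<JMdict>
<entry>
<k_ele><keb>方</keb><ke_pri>ichi1</ke_pri></k_ele>
<r_ele><reb>ほう</reb><re_pri>ichi1</re_pri></r_ele>
<sense><gloss>direction</gloss><gloss>way</gloss></sense>
</entry>
<entry>
<k_ele><keb>要る</keb><ke_pri>news2</ke_pri><ke_pri>nf27</ke_pri><ke_pri>spec1</ke_pri></k_ele>
<r_ele><reb>いる</reb><re_pri>news2</re_pri></r_ele>
<sense><gloss>to need</gloss></sense>
</entry>
</JMdict>"""


def priority_tags(entry):
    """Collect every ke_pri / re_pri tag attached to an entry."""
    return {el.text for el in entry.iter() if el.tag in ("ke_pri", "re_pri")}


def filter_entries(root, wanted):
    """Return the first written form of each entry carrying a wanted tag."""
    kept = []
    for entry in root.iter("entry"):
        if priority_tags(entry) & wanted:
            # Fall back to the kana reading for entries without a kanji form.
            form = entry.findtext("k_ele/keb") or entry.findtext("r_ele/reb")
            kept.append(form)
    return kept


root = ET.fromstring(SAMPLE)
print(filter_entries(root, {"ichi1"}))  # ['方']
```

With this sample, 要る is dropped because none of its tags is "ichi1", which matches the behaviour reported in this thread.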
要る is also missing. I think there is something wrong with your method of filtering the dictionary.
Out of curiosity, I checked the given words in the latest English version of JMdict I could find at the source URL above.
ようこそ is listed as "ichi1" so not sure what happened there. Maybe the tags were different in the older JMDict.
The tags for 彼氏 are news2, nf36, spec2, so it doesn't seem to be that popular. nf36 means it falls in the 36th block of 500 words by newspaper frequency, i.e. within the top 18,000 words.
方 is ichi1 but the translations are "direction, way"
Tags for 要る are news2, nf27, spec1 so it makes sense why it's missing.
A case could be made for adding words that have the spec1 tag, and those tagged nf01 to nf10+, if they're not in the existing word list.
spec1 and spec2: a small number of words use this marker when they are detected as being common, but are not included in other lists.
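The rule proposed above could be sketched roughly as follows. This assumes the JMdict documentation's definition that nfXX marks the XX-th block of 500 words in the newspaper frequency list; the cutoff of 10 blocks and the function names are illustrative choices, not anything kakugo implements.

```python
import re

NF_BLOCK = 500  # each nfXX tag covers one block of 500 words in the frequency list


def nf_rank_limit(tag):
    """Return the highest rank covered by an nfXX tag, or None for other tags."""
    m = re.fullmatch(r"nf(\d{2})", tag)
    return int(m.group(1)) * NF_BLOCK if m else None


def is_candidate(tags, max_nf=10):
    """True for spec1 words, or words within the first `max_nf` nf blocks."""
    if "spec1" in tags:
        return True
    limits = [nf_rank_limit(t) for t in tags]
    return any(lim is not None and lim <= max_nf * NF_BLOCK for lim in limits)


print(nf_rank_limit("nf36"))                     # 18000
print(is_candidate({"news2", "nf27", "spec1"}))  # True, because of spec1
print(is_candidate({"news2", "nf36", "spec2"}))  # False
```

Under this rule 要る would be added through its spec1 tag, while 彼氏 would still be left out.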
Hello, I'm posting here as it kind of ties into this issue of "which data to use".
The Kanji 和 is listed in N3, but it should actually be in N1 according to jisho https://jisho.org/search/%E5%92%8C
The list used as input (https://www.tanos.co.uk/jlpt/) has this mistake.
I have no idea if there are more cases like this, as I don't systematically check the JLPT level on jisho, but I'll try to be more attentive about it and will report any I find.
@blastrock: Is the code you used to generate the dictionary also open-source? I poked around this repo and your other repos but I couldn't find it. I would really like to adapt it in a fork to generate a new dictionary which includes a lot more vocab. I know you would like to reduce the size as there are many "useless" words, but there are also many that I'm missing. I see the dictionary is in a gzipped sqlite db, but I'm hoping that I don't have to write my own scripts to add more vocab. I would just like to modify the filter you used.
I started out learning kanji and vocab just using Kakugo and I think it's by far the best app out there (thanks so much!). But now I'm attending Japanese classes and I realize that the book they use (いろどり, which is free) requires me to learn a lot of vocab that is missing. A couple of examples just from the current chapter:
- 再起動 (to restart)
- 変倍 (different size)
- 差出人 (sender)
If whatever scripts or code you used to generate the dictionary is open source, I can adapt it and make my own fork. I'd be happy to make pull-requests for any additional features I might also work on for myself (for example, I might add a different heuristic for auto-selecting vocab based on kanji, as the existing one selects 1000s of vocab words once you know a few 100 kanji).
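For anyone who wants to look inside a gzipped SQLite database like the one mentioned above before touching any generator script, a minimal sketch follows. It makes no assumption about kakugo's actual schema and only lists whatever tables exist; the function name and the file name in the usage note are hypothetical.

```python
import gzip
import shutil
import sqlite3
import tempfile
from pathlib import Path


def list_tables(gz_path):
    """Decompress a gzipped SQLite file to a temp file and return its table names."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "dict.sqlite"
        # SQLite cannot read gzip directly, so unpack to a plain file first.
        with gzip.open(gz_path, "rb") as src, open(db_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            ).fetchall()
        finally:
            conn.close()
        return [name for (name,) in rows]
```

Calling `list_tables("dict.sqlite.gz")` (a placeholder path) would return the table names, which is a reasonable first step before deciding whether extra vocab can be added directly or needs the original generator.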
The script to generate the dictionary is not open source because it is quite ugly. I don't mind sharing it with a few people though. I'll push it to a private repo and add you to it if you want.
@blastrock, it would really be great if you could grant me access to that script. Thank you!
Done. The repo is in a poor state, so don't hesitate to email me if you have any questions.
Thank you, I'll report back!
I recently worked on this. In the latest release, all ichi1 and news1 words are included. For each word, I included multiple translations (similar to the kanji test). It is also now possible to display words that are usually written in kana in their kanji form, like 下さい and many others. This doesn't completely solve the issue, but I think it greatly improves things.