Comments (3)
Please describe your experimental setup.
In our experimental setup, our model performs better than the other baselines.
from g2pm.
The pretrained g2pM model is even worse than pypinyin. There are too many polyphonic characters and the training corpus is too small. But even after we extended the corpus, the results were still not good.
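`
from pypinyin import lazy_pinyin, Style
from g2pM import G2pM

sentence = '然而,他红了20年以后,他在长沙长大,也在长沙退休。'
# Baseline: pypinyin with numeric tones, neutral tone written as 5
p1 = lazy_pinyin(sentence, style=Style.TONE3, neutral_tone_with_five=True)
print('pypinyin lazy')
print(p1)
# Pretrained g2pM model with numeric tones
model = G2pM()
p2 = model(sentence, tone=True, char_split=False)
print('g2m')
print(p2)`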
Here is an example I found where it performs worse than pypinyin:
`
然而,他红了20年以后,他在长沙长大,也在长沙退休。
pypinyin lazy
['ran2', 'er2', ',', 'ta1', 'hong2', 'le5', '20', 'nian2', 'yi3', 'hou4', ',', 'ta1', 'zai4', 'chang2', 'sha1', 'zhang3', 'da4', ',', 'ye3', 'zai4', 'chang2', 'sha1', 'tui4', 'xiu1', '。']
g2m
['ran2', 'er2', ',', 'ta1', 'hong2', 'le5', '20', 'nian2', 'yi3', 'hou4', ',', 'ta1', 'zai4', 'chang2', 'sha1', 'chang2', 'da4', ',', 'ye3', 'zai4', 'chang2', 'sha1', 'tui4', 'xiu1', '。']`
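To make the disagreement easier to spot, the two outputs can be compared position by position. This is only a minimal sketch: the helper name `find_disagreements` is hypothetical (it is not part of pypinyin or g2pM), and the two lists are copied verbatim from the output above.

```python
# Outputs copied verbatim from the run above.
pypinyin_out = ['ran2', 'er2', ',', 'ta1', 'hong2', 'le5', '20', 'nian2', 'yi3', 'hou4',
                ',', 'ta1', 'zai4', 'chang2', 'sha1', 'zhang3', 'da4', ',', 'ye3', 'zai4',
                'chang2', 'sha1', 'tui4', 'xiu1', '。']
g2pm_out = ['ran2', 'er2', ',', 'ta1', 'hong2', 'le5', '20', 'nian2', 'yi3', 'hou4',
            ',', 'ta1', 'zai4', 'chang2', 'sha1', 'chang2', 'da4', ',', 'ye3', 'zai4',
            'chang2', 'sha1', 'tui4', 'xiu1', '。']


def find_disagreements(reference, candidate):
    """Hypothetical helper: return (index, reference_token, candidate_token)
    for every position where the two token lists differ."""
    if len(reference) != len(candidate):
        raise ValueError("token lists must have the same length to compare by position")
    return [(i, r, c) for i, (r, c) in enumerate(zip(reference, candidate)) if r != c]


for index, ref, cand in find_disagreements(pypinyin_out, g2pm_out):
    print(f"token {index}: pypinyin={ref} g2pM={cand}")
```

For this sentence the only difference is the 长 in 长大: pypinyin gives `zhang3`, while g2pM gives `chang2`.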