julius-speech / dictation-kit
Japanese dictation kit using Julius
License: Other
I followed the tutorial in readme.md and ran run-osx-dnn.sh, but got the following errors:
STAT: include config: main.jconf
STAT: include config: am-dnn.jconf
STAT: parsing option string: "-htkconf model/dnn/config.lmfb -cvn -cmnload model/dnn/norm -cmnstatic"
Stat: para: parsing HTK Config file: model/dnn/config.lmfb
Warning: para: "SOURCEFORMAT" ignored (not supported, or irrelevant)
Warning: para: "SOURCEKIND" ignored (not supported, or irrelevant)
Stat: para: SOURCERATE=625
Warning: para: TARGETKIND skipped (will be determined by AM header)
Stat: para: TARGETRATE=100000.0
Warning: para: "SAVECOMPRESSED" ignored (not supported, or irrelevant)
Warning: para: "SAVEWITHCRC" ignored (not supported, or irrelevant)
Stat: para: WINDOWSIZE=250000.0
Stat: para: USEHAMMING=T
Stat: para: PREEMCOEF=0.97
Stat: para: NUMCHANS=40
Stat: para: ENORMALISE=F
Warning: para: "BYTEORDER" ignored (not supported, or irrelevant)
STAT: jconf successfully finalized
STAT: *** loading AM00 _default
Stat: init_phmm: Reading in HMM definition
Error: init_phmm: failed to read model/dnn/binhmm.SID
ERROR: m_fusion: failed to initialize AM
ERROR: Error in loading model
I also got the same error on Windows.
I installed git-lfs, but the download came to only 348 MB in total. I'm not sure whether this is related to the error.
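One way to test the git-lfs suspicion is to check whether the model file is still a Git LFS pointer stub rather than the real binary. The sketch below is only a diagnostic under that assumption: it relies on the fact that LFS pointer files start with a fixed `version https://git-lfs.github.com/spec/v1` line, and it takes the path from the error message. The helper name `is_lfs_pointer` is hypothetical, and a negative result does not rule out other causes.

```python
from pathlib import Path

# First line of every Git LFS pointer file (Git LFS pointer spec, v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if `path` looks like an un-fetched Git LFS pointer stub."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


if __name__ == "__main__":
    # Path copied from the error message; adjust it to your checkout.
    model = Path("model/dnn/binhmm.SID")
    if not model.exists():
        print(f"{model} not found")
    elif is_lfs_pointer(model):
        print(f"{model} is an LFS pointer stub; the real file was not fetched")
    else:
        print(f"{model} is a regular file of {model.stat().st_size} bytes")
```

If the file turns out to be a pointer stub, the model data was never downloaded, which would be consistent with a checkout that is much smaller than expected.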
I am on Ubuntu 16.04 and got:
$ bash run-linux-dnn.sh
STAT: include config: main.jconf
STAT: include config: am-dnn.jconf
STAT: parsing option string: "-htkconf model/dnn/config.lmfb -cvn -cmnload model/dnn/norm -cmnstatic"
Stat: para: parsing HTK Config file: model/dnn/config.lmfb
Warning: para: "SOURCEFORMAT" ignored (not supported, or irrelevant)
Warning: para: "SOURCEKIND" ignored (not supported, or irrelevant)
Stat: para: SOURCERATE=625
Warning: para: TARGETKIND skipped (will be determined by AM header)
Stat: para: TARGETRATE=100000.0
Warning: para: "SAVECOMPRESSED" ignored (not supported, or irrelevant)
Warning: para: "SAVEWITHCRC" ignored (not supported, or irrelevant)
Stat: para: WINDOWSIZE=250000.0
Stat: para: USEHAMMING=T
Stat: para: PREEMCOEF=0.97
Stat: para: NUMCHANS=40
Stat: para: ENORMALISE=F
Warning: para: "BYTEORDER" ignored (not supported, or irrelevant)
STAT: jconf successfully finalized
STAT: *** loading AM00 _default
Stat: init_phmm: Reading in HMM definition
Error: init_phmm: failed to read model/dnn/binhmm.SID
ERROR: m_fusion: failed to initialize AM
ERROR: Error in loading model
Can someone give me a hint or workaround?
Can you show me how to do speech-to-text on a 16 kHz, 16-bit WAV file?
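As a first step, it can help to confirm that the file really has the stated format before handing it to the recognizer. The following is a minimal sketch using Python's standard `wave` module; the function name `read_pcm` is hypothetical, and the mono default is an assumption, since the question only mentions the sample rate and bit depth. It does not run Julius itself.

```python
import wave


def read_pcm(path, rate=16000, sample_bytes=2, channels=1):
    """Return the raw PCM bytes of a WAV file, or raise if the format differs.

    Defaults: 16 kHz, 16-bit (2 bytes per sample). Mono is an ASSUMPTION.
    """
    with wave.open(path, "rb") as w:
        actual = (w.getframerate(), w.getsampwidth(), w.getnchannels())
        if actual != (rate, sample_bytes, channels):
            raise ValueError(
                f"expected (rate, bytes, channels)={(rate, sample_bytes, channels)}, "
                f"got {actual}"
            )
        return w.readframes(w.getnframes())
```

The returned bytes are headerless samples, which is the form a socket-based audio input would typically need.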
Lately, my CI has been doing a lot of git lfs pulls.
As a result, the repository has gone over its LFS quota and nobody can download the files any more.
I promise not to use git lfs pull too much after this.
I really apologize for any inconvenience this may cause you...
$ git clone https://github.com/julius-speech/dictation-kit.git
$ git lfs pull
Git LFS: (0 of 28 files) 0 B / 851.93 MB
batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.
error: failed to fetch some objects from 'https://github.com/julius-speech/dictation-kit.git/info/lfs'
I would like to send a raw audio stream over a network socket and receive the recognition result over the network, so this is how I set the Julius engine parameters:
julius ... -input adinnet -adport 5532 -module
With this, I am able to send an audio stream to Julius on port 5532 and receive results in XML format from port 10500.
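import struct

def send_audio(audiodata):
    # Prefix the audio bytes with a 4-byte length header, then send both.
    header = struct.pack('i', len(audiodata))
    julius_socket_sender.sendall(header + audiodata)

def recv_trans():
    while True:
        # Read up to 4096 bytes of module-mode output and decode it.
        trans = julius_trans_receiver.recv(4096).decode('utf-8')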
Julius is a huge system already, so I am not sure if there is any "correct" way to achieve my purpose.
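For longer audio, the same length-prefixed framing can be applied chunk by chunk. The sketch below is not taken from Julius documentation: the helper name `frame_chunks` is hypothetical, the `"<i"` format pins the header to little-endian (which matches native `'i'` on x86), and the final zero-length header as an end-of-segment marker is an assumption about the adinnet protocol that should be checked against the Julius sources.

```python
import struct


def frame_chunks(pcm, chunk_bytes=4096):
    """Yield length-prefixed packets for `pcm`, then an end-of-segment packet."""
    for start in range(0, len(pcm), chunk_bytes):
        chunk = pcm[start:start + chunk_bytes]
        # 4-byte signed length header followed by the raw samples.
        yield struct.pack("<i", len(chunk)) + chunk
    # ASSUMPTION: a header with length 0 tells the server the segment ended.
    yield struct.pack("<i", 0)
```

With the sender socket from the snippet above, usage would be `for packet in frame_chunks(audiodata): julius_socket_sender.sendall(packet)`.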
## 2. Couldn't receive the result on the first attempt
At the very beginning, I send a chunk of a raw audio stream to 5532, then try to receive a result from 10500:
send_audio(audio_stream)
recv_trans()
With this, I expected to receive the full result in XML format, but what I received was only part of it:
<STARTPROC>
.
<INPUT STATUS="LISTEN" TIME="132747392" />
.
However, after sending the audio stream a second time, I am able to receive the full result.
So my question is: is this expected behavior?
In fact, that is expected behavior, as mentioned in the connection message. Sorry about that.
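Because a single `recv(4096)` can return a partial message or several messages at once, a client usually needs to buffer the stream and split it itself. The sketch below assumes, based only on the output shown above, that each module-mode message ends with a line containing a single `.`; the helper name `split_messages` is hypothetical. It works on bytes and decodes only complete messages, so a multi-byte UTF-8 character split across two reads does not raise.

```python
def split_messages(buffer):
    """Split buffered bytes into complete messages plus the unfinished tail.

    ASSUMPTION: every message is terminated by a line holding only ".".
    Returns (list_of_decoded_messages, remaining_bytes).
    """
    terminator = b"\n.\n"
    messages = []
    while True:
        end = buffer.find(terminator)
        if end < 0:
            # No complete message left; keep the tail for the next recv().
            return messages, buffer
        messages.append(buffer[:end].decode("utf-8"))
        buffer = buffer[end + len(terminator):]
```

In a receive loop this would be used as `buffer += julius_trans_receiver.recv(4096)` followed by `messages, buffer = split_messages(buffer)`.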
I am trying to write a Python 3 client library. Is there any existing documentation for Julius's communication protocols, such as the data structures used by vecnet, adinnet, and module mode?
Hey there. I am trying to set up an API endpoint for the Julius system.
How can I enable the client to stream audio into the system without installing adintool?
I have downloaded the Japanese dictation Kit (Julius v4.5), and was trying to run:
./run-osx-dnn.sh
But there is an error:
Error: init_phmm: failed to read binhmm.SID
I am using the default jconf settings file as provided, so it should work, but it doesn't.
I am sure the binhmm.SID and logicalTri.bin files are in the right directory, so the program should be able to locate them.
What could cause the problem? And how should I fix it?
Thank you,
Xinlei
I tried dictation-kit with the default settings, but the output appears garbled.
I don't know which charset to use; I have tried chcp 65001 (utf-8), chcp 50220 (Japanese), and several other charsets recommended online.
The other day I used the kit with its default settings, but the output was garbled.
chcp 65001 (utf-8) and 50220 are different; which character set is correct?
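One way to narrow this down, independent of the console code page, is to capture the program's raw output bytes and see which encodings decode them cleanly. The sketch below uses only Python's built-in codecs; the candidate list and the helper name `decodings` are assumptions for illustration, and the source does not state which encoding the kit actually emits.

```python
# Common Japanese encodings, by their Python codec names (illustrative list).
CANDIDATES = ("utf-8", "shift_jis", "euc_jp", "iso2022_jp")


def decodings(raw, encodings=CANDIDATES):
    """Map each encoding that decodes `raw` without error to the decoded text."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # This encoding cannot represent the byte sequence; skip it.
            pass
    return results
```

More than one encoding may decode without error, so the decoded strings still need to be checked by eye for readable Japanese.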
I want to use this kit in my Java website to recognize users' voices in real time. I haven't done it yet. I have some questions, listed below: