
pannous / tensorflow-speech-recognition

2.2K stars · 190 watchers · 643 forks · 31.87 MB

🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks

License: Other

Python 98.02% Swift 1.98%
tensorflow speech-recognition neural-network deep-learning stt speech-to-text

tensorflow-speech-recognition's Introduction

Tensorflow Speech Recognition

Speech recognition using Google's TensorFlow deep learning framework with sequence-to-sequence neural networks.

Replaces caffe-speech-recognition; see that project for some background.

Update 2024: Use Whisper!

This (relatively) old project is NO LONGER UP TO DATE.
The TensorFlow 1.0 API it uses is no longer compatible with current releases, and the approach is no longer state of the art either.
We highly recommend you check out and use Whisper.

Update 2020: Mozilla released DeepSpeech

They achieve good error rates. Free speech recognition is in good hands; go there if you are an end user. For now this project is maintained only for educational purposes.

Ultimate goal

Create a decent standalone speech recognition engine for Linux etc. Some people say we have the models but not enough training data. We disagree: there is plenty of training data (100 GB here and 21 GB here on openslr.org, synthetic text-to-speech snippets, movies with transcripts, Gutenberg, YouTube with captions, etc.); we just need a simple yet powerful model. It's only a question of time...

Sample spectrogram, That's what she said, too laid?

Sample spectrogram, Karen uttering 'zero' with 160 words per minute.

Installation

Clone the code

git clone https://github.com/pannous/tensorflow-speech-recognition
cd tensorflow-speech-recognition
git clone https://github.com/pannous/layer.git
git clone https://github.com/pannous/tensorpeers.git

pyaudio

Requires PortAudio from http://www.portaudio.com/:

git clone https://git.assembla.com/portaudio.git
cd portaudio
./configure --prefix=/path/to/your/local
make
make install
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/your/local/lib
export LIBRARY_PATH=$LIBRARY_PATH:/path/to/your/local/lib
export CPATH=$CPATH:/path/to/your/local/include
source ~/.bashrc

install pyaudio

pip install pyaudio
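To verify the installation, a quick sanity check (a minimal sketch; device names and indices differ per machine) is to list the available audio input devices with PyAudio:

import pyaudio

pa = pyaudio.PyAudio()
for i in range(pa.get_device_count()):
    info = pa.get_device_info_by_index(i)
    if info.get('maxInputChannels', 0) > 0:
        print(i, info['name'])  # usable input devices, e.g. for record.py
pa.terminate()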

Getting started

Toy examples: ./number_classifier_tflearn.py ./speaker_classifier_tflearn.py

Some less trivial architectures: ./densenet_layer.py

Later: ./train.sh ./record.py

Sample spectrogram or record.py

Update: Nervana demonstrated that it is possible for 'independents' to build speech recognizers that are state of the art.

Fun tasks for newcomers

Extensions

Extensions to the current TensorFlow which are probably needed:

Even though this project is far from finished, we hope it gives you some starting points.

Looking for a TensorFlow collaboration / consultant / deep learning contractor? Reach out to [email protected]

tensorflow-speech-recognition's People

Contributors

camelshang, pannous


tensorflow-speech-recognition's Issues

Requirements File, Installation Guide

Hello,

The requirements.txt file is missing, so I was not able to install the project.
It would be great if we could have an installation guide.

Thanks in advance,
Regards

Speaker Classification Clarification

Hello, I was investigating your speaker classification example that uses TFLearn. I had a question about the audio sample that was used to test the model. I may be mistaken, but I believe that this sample is inside the training set, which would not be ideal for testing. Why is this done (or is it not, if I am wrong)?

Thank you in advance for your help!
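If the overlap is a concern, one workaround (a rough sketch, assuming `batch` is the generator returned by speech_data and `model` is the TFLearn DNN from the example; the exact loader calls may differ) is to slice off a held-out portion before training and evaluate only on that:

X, Y = next(batch)                      # one loaded batch of features and labels
split = int(0.9 * len(X))
X_train, Y_train = X[:split], Y[:split]
X_test,  Y_test  = X[split:], Y[split:]

model.fit(X_train, Y_train, n_epoch=100, show_metric=True)
print("held-out accuracy:", model.evaluate(X_test, Y_test))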

Everything works fine except that the pattern recognition index is not retrieved

Greetings

I did the following:

  • cloned the repository;
  • installed pre-requisites
  • trained my dataset using ./number_classifier_tflearn.py
  • ran ./record.py
  • erased the repository and repeated the steps copying the output to gist

as stated below
https://gist.github.com/tiagmoraismorgado/673ca5de5317a1583761a314e7d38ab1

Even though everything else works fine, the pattern recognition index is not retrieved: TensorFlow records my voice, but it doesn't return the recognized pattern index. Looking forward to your help.

Failed to find any matching files for tflearn.lstm.model in speech2text-tflearn.py

speech2text-tflearn.py fails with the error:
Failed to find any matching files for tflearn.lstm.model

or
ValueError: Restore called with invalid save path: 'tflearn.lstm.model'. File path is: 'tflearn.lstm.model'

on both linux and windows.

on lines:
model.load("tflearn.lstm.model")
and
model.save("tflearn.lstm.model")

thanks..
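That error usually just means the checkpoint has not been created yet; a minimal guard (a sketch, not the project's official flow, assuming the model and training arrays from that script) is to restore only when the checkpoint files exist and otherwise train and save first:

import os

if os.path.isfile("tflearn.lstm.model.index") or os.path.isfile("tflearn.lstm.model"):
    model.load("tflearn.lstm.model")        # restore a previously saved checkpoint
else:
    model.fit(trainX, trainY, n_epoch=10)   # train first, which creates the checkpoint
    model.save("tflearn.lstm.model")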

How to augment data for the spectrogram?

Any sample code to do data augmentation on the spectrograms?
I observed that the spectrogram words are mainly in the 160 variant. How do you get other variations such as 40, 60 and up? What kind of transformation is that?

Thanks!
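As far as I can tell, the 160 in the file names is the speaking rate (words per minute) of the synthetic voice, so the 40/60/... variants come from generating the TTS data at different rates rather than from transforming the 160-wpm spectrograms. There is no augmentation code in the repository that I know of; a simple starting point (a sketch in plain NumPy, assuming the spectrogram is a 2-D array of shape [freq_bins, time_steps] large enough for the mask sizes) is random time/frequency masking plus a small time shift:

import numpy as np

def augment_spectrogram(spec, max_freq_mask=8, max_time_mask=16, max_shift=10):
    """Return a randomly masked and time-shifted copy of a [freq, time] spectrogram."""
    spec = spec.copy()
    f0 = np.random.randint(0, spec.shape[0] - max_freq_mask)
    spec[f0:f0 + np.random.randint(1, max_freq_mask), :] = 0               # frequency mask
    t0 = np.random.randint(0, spec.shape[1] - max_time_mask)
    spec[:, t0:t0 + np.random.randint(1, max_time_mask)] = 0               # time mask
    return np.roll(spec, np.random.randint(-max_shift, max_shift), axis=1)  # time shift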

Any solution for the error -> Exception: Invalid objective: catagorical_crossentropy

I came across some errors. I solved some of them, but I can't solve this one.
speech_data.py produces the error below.

Looking for data spoken_numbers_pcm.tar in data/
Extracting data/spoken_numbers_pcm.tar to data/
'tar' is not recognized as an internal or external command,
operable program or batch file.
Data ready!
loaded batch of 2402 files
Traceback (most recent call last):
File "demo.py", line 15, in
net = tflearn.regression(net, optimizer='adam', learning_rate=learning_rate, loss='catagorical_crossentropy')
File "C:\python35\lib\site-packages\tflearn\layers\estimator.py", line 174, in regression
loss = objectives.get(loss)(incoming, placeholder)
File "C:\python35\lib\site-packages\tflearn\objectives.py", line 10, in get
return get_from_module(identifier, globals(), 'objective')
File "C:\python35\lib\site-packages\tflearn\utils.py", line 25, in get_from_module
raise Exception('Invalid ' + str(module_name) + ': ' + str(identifier))
Exception: Invalid objective: catagorical_crossentropy

What is this error and how do I resolve it?

Can someone tell me how to fix this issue, please?
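The objective name is simply misspelled: TFLearn only knows 'categorical_crossentropy', not 'catagorical_crossentropy'. A corrected version of the offending line from demo.py (the 'tar' warning is separate and just means tar is not on the Windows PATH):

net = tflearn.regression(net, optimizer='adam', learning_rate=learning_rate,
                         loss='categorical_crossentropy')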

TODOs

  • split test set(s) #28
  • make input->chars converge well (input->class works well already)
  • sliding window
  • merge WarpCTC or alternative
  • peer2peer training!

ValueError: No variables to save

I am getting an error while running the examples number_classifier_tflearn.py and speaker_classifier_tflearn.py. Details below:

Looking for data spoken_numbers_pcm.tar in data/
Extracting data/spoken_numbers_pcm.tar to data/
Data ready!
loaded batch of 2402 files
loaded batch of 2402 files
loaded batch of 2402 files
loaded batch of 2402 files
loaded batch of 2402 files
Traceback (most recent call last):
  File "number_classifier_tflearn.py", line 26, in <module>
    model = tflearn.DNN(net)
  File "C:\Program Files\Anaconda3\lib\site-packages\tflearn\models\dnn.py", line 57, in __init__
    session=session)
  File "C:\Program Files\Anaconda3\lib\site-packages\tflearn\helpers\trainer.py", line 125, in __init__
    keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours)
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 1000, in __init__
    self.build()
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 1021, in build
    raise ValueError("No variables to save")
ValueError: No variables to save

AND

15 speakers: ['Ralph', 'Albert', 'Vicki', 'Samantha', 'Junior', 'Kathy', 'Fred', 'Princess', 'Steffi', 'Alex', 'Daniel', 'Agnes', 'Victoria', 'Tom', 'Bruce']
speakers ['Ralph', 'Albert', 'Vicki', 'Samantha', 'Junior', 'Kathy', 'Fred', 'Princess', 'Steffi', 'Alex', 'Daniel', 'Agnes', 'Victoria', 'Tom', 'Bruce']
Looking for data spoken_numbers_pcm.tar in data/
Extracting data/spoken_numbers_pcm.tar to data/
Data ready!
15 speakers: ['Ralph', 'Albert', 'Vicki', 'Samantha', 'Junior', 'Kathy', 'Fred', 'Princess', 'Steffi', 'Alex', 'Daniel', 'Agnes', 'Victoria', 'Tom', 'Bruce']
loaded batch of 2402 files
Traceback (most recent call last):
  File "speaker_classifier_tflearn.py", line 27, in <module>
    model = tflearn.DNN(net)
  File "C:\Program Files\Anaconda3\lib\site-packages\tflearn\models\dnn.py", line 57, in __init__
    session=session)
  File "C:\Program Files\Anaconda3\lib\site-packages\tflearn\helpers\trainer.py", line 125, in __init__
    keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours)
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 1000, in __init__
    self.build()
  File "C:\Program Files\Anaconda3\lib\site-packages\tensorflow\python\training\saver.py", line 1021, in build
    raise ValueError("No variables to save")
ValueError: No variables to save

Thanks..
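This error means the Saver inside tflearn.DNN finds no TensorFlow variables in the current default graph; that can happen when the layers end up in a different graph, or when the TFLearn and TensorFlow versions are mismatched. One thing worth trying (a sketch only, with illustrative sizes rather than the repo's real values) is to reset the default graph and build all layers immediately before constructing the DNN:

import tensorflow as tf
import tflearn

width, height, classes = 20, 80, 10          # illustrative sizes, not the project's exact values

tf.reset_default_graph()                      # build into a clean default graph
net = tflearn.input_data(shape=[None, width, height])
net = tflearn.fully_connected(net, 64)
net = tflearn.fully_connected(net, classes, activation='softmax')
net = tflearn.regression(net)
model = tflearn.DNN(net)                      # the Saver should now find the layer variables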

What should dtype of placeholder y_ in training be?

From speech_encoder.py,
batch_xs, batch_ys = speech.train.next_batch(100)
batch_xs=[flatten(matrix) for matrix in batch_xs]
feed = {x: batch_xs, y_: batch_ys}

The above has the following error:

ValueError: invalid literal for float(): 2 14 68 6 32 14 73 6 47 14 73 3

What should the placeholder y_ be?

Thanks!
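The labels coming out of that batch are strings of space-separated character indices, while a float placeholder expects numeric arrays. A rough conversion sketch (assuming a fixed maximum label length, here hypothetically 20, padded with zeros; the real maximum depends on the dataset):

import numpy as np

def label_to_vector(label, max_len=20):
    """'2 14 68 6 ...' -> fixed-length float32 vector of character indices."""
    ids = [float(tok) for tok in label.split()]
    ids = ids[:max_len] + [0.0] * (max_len - len(ids))
    return np.array(ids, dtype=np.float32)

batch_ys = [label_to_vector(y) for y in batch_ys]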

The error when running densenet_layer.py

ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.CA0106.pcm.hdmi.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:4771:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM hdmi
ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.CA0106.pcm.hdmi.0:CARD=0,AES0=4,AES1=130,AES2=0,AES3=2'
ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:4771:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM hdmi
ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.CA0106.pcm.modem.0:CARD=0'
ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:4771:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline:CARD=0,DEV=0
ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.CA0106.pcm.modem.0:CARD=0'
ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:4771:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.phoneline:CARD=0,DEV=0
ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.CA0106.pcm.modem.0:CARD=0'
ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:4771:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM phoneline
ALSA lib confmisc.c:1286:(snd_func_refer) Unable to find definition 'cards.CA0106.pcm.modem.0:CARD=0'
ALSA lib conf.c:4292:(_snd_config_evaluate) function snd_func_refer returned error: No such file or directory
ALSA lib conf.c:4771:(snd_config_expand) Evaluate error: No such file or directory
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM phoneline
[Errno Input overflowed] -9981
Expression 'ret' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 1735
Expression 'AlsaOpen( &alsaApi->baseHostApiRep, params, streamDir, &self->pcm )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 1902
Expression 'PaAlsaStreamComponent_Initialize( &self->capture, alsaApi, inParams, StreamDirection_In, NULL != callback )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2166
Expression 'PaAlsaStream_Initialize( stream, alsaHostApi, inputParameters, outputParameters, sampleRate, framesPerBuffer, callback, streamFlags, userData )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2835
Traceback (most recent call last):
File "record.py", line 101, in record
dataraw = stream.read(CHUNK)
File "/usr/lib/python3/dist-packages/pyaudio.py", line 605, in read
return pa.read_stream(self._stream, num_frames)
OSError: [Errno Input overflowed] -9981

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "record.py", line 138, in
record()
File "record.py", line 104, in record
stream=get_audio_input_stream()
File "record.py", line 71, in get_audio_input_stream
input_device_index=INDEX)
File "/usr/lib/python3/dist-packages/pyaudio.py", line 747, in open
stream = Stream(self, *args, **kwargs)
File "/usr/lib/python3/dist-packages/pyaudio.py", line 442, in init
self._stream = pa.open(**arguments)
OSError: [Errno Device unavailable] -9985
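The ALSA messages are mostly harmless configuration warnings; the crash itself is the input overflow in record.py. With a reasonably recent PyAudio the read call can be told to ignore overflows (a sketch of how the read around line 101 of record.py could be changed; older PyAudio versions do not accept the keyword, and get_audio_input_stream is the helper already defined in record.py):

try:
    dataraw = stream.read(CHUNK, exception_on_overflow=False)  # don't raise on overflow
except OSError:
    stream = get_audio_input_stream()   # reopen the stream and skip this chunk
    dataraw = b""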

errors

There are many errors when I use TensorFlow 0.12. Could you give me a readme for speech recognition?
thanks!

Error When Run densenet_layer.py

Hello everybody,
I have a problem.
./number_classifier_tflearn.py and ./speaker_classifier_tflearn.py run successfully, but densenet_layer.py is not working.
I followed these steps on Docker.

docker run -it -v C:\WorkData\GitRespostory\tensorflow-speech-recognition:/tf_speech gcr.io/tensorflow/tensorflow:latest-devel

Then, in the container shell:

cd /tensorflow
git pull

Then run these steps:

apt-get update
apt-get install -y libasound-dev portaudio19-dev libportaudio2 libportaudiocpp0
cd /tf_speech
pip install -r requirements.txt
pip install h5py
pip install librosa

Note: the spoken_words.tar file was downloaded manually and copied into the folder.
And now:

python densenet_layer.py

But it shows this error; please help me.

Traceback (most recent call last):
  File "densenet_layer.py", line 69, in <module>
    net.train(data=batch,batch_size=10,steps=5000,dropout=0.6,display_step=10,test_step=100) # run
  File "/tf_speech/layer/net.py", line 385, in train
    loss,_= session.run([self.cost,self.optimizer], feed_dict=feed_dict)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 943, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (10, 262144) for Tensor u'data/Placeholder:0', which has shape '(?, 4096, 4096)'

record my own voice

Can anyone tell me how to record my own voice and where to put it so that I get speech-to-text conversion?
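A minimal recording sketch using PyAudio and the standard wave module (the 16 kHz rate, 3-second duration and output path are assumptions; the project's record.py uses its own settings):

import pyaudio, wave

RATE, CHUNK, SECONDS, OUT = 16000, 1024, 3, "my_voice.wav"   # assumed settings

pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                 input=True, frames_per_buffer=CHUNK)
frames = [stream.read(CHUNK) for _ in range(int(RATE / CHUNK * SECONDS))]
stream.stop_stream(); stream.close()

wf = wave.open(OUT, 'wb')
wf.setnchannels(1)
wf.setsampwidth(pa.get_sample_size(pyaudio.paInt16))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
pa.terminate()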

Data for CTC in lstm to chars.

The data directory for CTC data in lstm_to_chars.py is given as:
INPUT_PATH = '/data/ctc/sample_data/mfcc' # directory of MFCC nFeatures x nFrames 2-D array .npy files
Where can I find the data (since it is not available in speech_data.py)?

tflearn error: No variables to save

Hi,
I downloaded speech_data.py and speaker_classifier_tflearn.py.
When I run speaker_classifier_tflearn.py, I get the following errors:
Traceback (most recent call last):
  File "speaker_classifier_tflearn.py", line 28, in <module>
    model = tflearn.DNN(net)
  File "/root/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tflearn/models/dnn.py", line 57, in __init__
    session=session)
  File "/root/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tflearn/helpers/trainer.py", line 125, in __init__
    keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours)
  File "/root/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1000, in __init__
    self.build()
  File "/root/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1021, in build
    raise ValueError("No variables to save")
ValueError: No variables to save

Training Number

The training keeps going on and on no matter what... I set the training_iters value to 3000, but it still keeps going. What is the reason?
(screenshot: 2017-08-03, 11:07 pm)

License of the project

Hi!
What license does the project have? Also, who is meant to be the copyright owner of code committed to the project by third-party developers?

I am getting this issue

NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /home/hitesh/speec/tflearn.lstm.model
[[Node: save_1/RestoreV2_16 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save_1/Const_0, save_1/RestoreV2_16/tensor_names, save_1/RestoreV2_16/shape_and_slices)]]

could not run train.py

train.py uses a function prepare_data from speech_data, but no such function is defined in speech_data.

Broken dependency in densenet_layer.py?

Not able to run densenet_layer.py; getting the error below:

Traceback (most recent call last):
  File "densenet_layer.py", line 4, in <module>
    import layer
  File "D:\git\AI\tensorflow-speech-recognition\layer\__init__.py", line 1, in <module>
    from net import *
ImportError: No module named 'net'

I fixed the problem by changing line 1 of layer\__init__.py to:

from .net import *

Running on Windows 10, TensorFlow 0.12, Python 3.5.

Thanks..

Create Spectrograms

How did you create "Sample spectrogram, Karen uttering 'zero' with 160 words per minute."? How did you create that grayscale spectrogram?
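The original images come from the project's own generation scripts, which I have not traced in detail; a comparable grayscale spectrogram can be produced with librosa and matplotlib (a sketch; 'zero_Karen_160.wav' is a placeholder filename, and the FFT/hop sizes are assumptions):

import librosa
import numpy as np
import matplotlib.pyplot as plt

y, sr = librosa.load("zero_Karen_160.wav", sr=None)
S = np.abs(librosa.stft(y, n_fft=512, hop_length=128))   # magnitude spectrogram
S_db = librosa.amplitude_to_db(S, ref=np.max)             # log scale for visibility
plt.imsave("zero_Karen_160.png", S_db, cmap="gray", origin="lower")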

How to...

How does one use this code? More specifically: how does someone who doesn't have an NVIDIA GPU to train a model use the speech-to-text?

Train data is used to determine accuracy in dense_layer

I'm trying to use dense_layer, which uses spectro_batch_generator from speech_data.py to fetch batches of data. There it is already noted that the training and testing/validation sets need to be split:
# shuffle(files) # todo : split test_fraction batch here!

A bit further in dense_layer, the function train from layer/net.py is used. In the train function, currently around line 389, there is:

  feed_dict = {x: batch_xs, y: batch_ys, keep_prob: dropout, self.train_phase: True}
  loss,_= session.run([self.cost,self.optimizer], feed_dict=feed_dict)

Immediately followed by:

  if step % display_step == 0:
    # Calculate batch accuracy, loss
    feed = {x: batch_xs, y: batch_ys, keep_prob: 1., self.train_phase: False}
    acc , summary = session.run([self.accuracy,self.summaries], feed_dict=feed)

If I understand it correctly (and I am new to this, so it's likely that I am wrong), the data is first fed into the train step, after which the exact same data is used to determine the accuracy.
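Your reading looks right: the accuracy is computed on the same batch that was just used for the gradient step. A hedged sketch of how that display step in layer/net.py could instead evaluate on a separate batch (variable names follow the snippet above; test_batch is an assumed second generator built from a held-out file list, not something the repo currently provides):

if step % display_step == 0:
    # Pull a batch the optimizer has never seen instead of reusing batch_xs/batch_ys.
    test_xs, test_ys = next(test_batch)
    feed = {x: test_xs, y: test_ys, keep_prob: 1., self.train_phase: False}
    acc, summary = session.run([self.accuracy, self.summaries], feed_dict=feed)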

How to record my own speech?

I have tried to record, but unsuccessfully. There is the module record.py, but it doesn't save the speech.

What should I add to the code in order to recognize my own speech?

predict.py

I need a predict.py file. Can anyone please help me out with it?
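There is no predict.py in the repository as far as I can tell. A rough sketch of one (assuming a model trained and saved with TFLearn as in speech2text-tflearn.py; the feature extraction and the layer definitions below are placeholders and must be replaced with exactly what was used at training time, otherwise loading and prediction will fail):

import numpy as np
import librosa
import tflearn

def wav_to_mfcc(path, n_mfcc=20, max_len=80):
    """Load a wav and return a fixed-size MFCC matrix (padded/truncated in time)."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)[:, :max_len]
    return np.pad(mfcc, ((0, 0), (0, max_len - mfcc.shape[1])), mode='constant')

# Rebuild the network with EXACTLY the architecture used for training
# (the layers below are illustrative placeholders, not the repo's real definition).
net = tflearn.input_data([None, 20, 80])
net = tflearn.lstm(net, 128)
net = tflearn.fully_connected(net, 10, activation='softmax')
net = tflearn.regression(net)

model = tflearn.DNN(net)
model.load("tflearn.lstm.model")
print(model.predict([wav_to_mfcc("my_voice.wav")]))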

ImportError: No module named layer while running densenet_layer.py

Hi,

I could not run densenet_layer.py since it throws an import error for the module layer.

Traceback (most recent call last):
File "densenet_layer.py", line 6, in
import layer
ImportError: No module named layer

From my understanding, this is the layers module in TFLearn, but the model architecture defined here doesn't work:
net = layer.net(simple_dense, input_shape=(width,height), output_width=classes, learning_rate=0.01)

Thanks
Manishanker

ImportError: No module named core_rnn

Hi,

I'm trying to run ./number_classifier_tflearn.py and the following error occurs:
hdf5 is not supported on this machine (please install/reinstall h5py for optimal experience)
Traceback (most recent call last):
  File "./number_classifier_tflearn.py", line 3, in <module>
    import tflearn
  File "/usr/local/lib/python2.7/site-packages/tflearn/__init__.py", line 21, in <module>
    from .layers import normalization
  File "/usr/local/lib/python2.7/site-packages/tflearn/layers/__init__.py", line 10, in <module>
    from .recurrent import lstm, gru, simple_rnn, bidirectional_rnn, \
  File "/usr/local/lib/python2.7/site-packages/tflearn/layers/recurrent.py", line 8, in <module>
    from tensorflow.contrib.rnn.python.ops.core_rnn import static_rnn as _rnn, \
ImportError: No module named core_rnn

Any suggestions?

OS: OS X Yosemite

multiple problems with speech2text-seq2seq.py

Broken dependency: "sugartensor" is not in requirements.txt

AND

"tensorflow.examples.tutorials" is not available in the Windows install of TensorFlow; it needs to be copied manually from the TensorFlow git repo.

AND

Line 13 ("Update:") needs to be commented out in speech2text-seq2seq.py.

AND

Traceback (most recent call last):
  File "speech2text-seq2seq.py", line 65, in <module>
    z = x.sg_conv1d(size=1, dim=num_dim, act='tanh', bn=True)
AttributeError: 'list' object has no attribute 'sg_conv1d'

OS: Windows 10,
TensorFlow: 0.12
Python: 3.5

How to create custom 'train_words_index.txt'

I would like to know how to get this sequence of numbers (2 42 14 66 93 19 46 42 24 43 49 3).

In train_words_index.txt there are lines consisting of a word file and a sequence of numbers, like this:
'measurement_Victoria_160.wav.png 2 42 14 66 93 19 46 42 24 43 49 3'. I have tried to find out how to create this sequence of numbers in many places, but couldn't find it.

Thank you in advance,

newer paper?

Hello, Pannous,
You do great work!!! Really great!!!
I'm a newcomer to speech recognition. I noticed that you cite some papers from 2012 and 2014 in this project. Now it is 2017. If I want to reproduce state-of-the-art work, do you think I should read some recent papers beyond the ones you use in this project?
Please give me some suggestions. Thanks in advance.

Training is not using GPU capacity

Hey everyone!

I'm trying to train my model with the speech2text-tflearn code. Unfortunately it takes ages to train (a few days). I have installed TensorFlow with GPU support, but the code is not using any of the GPU's capacity. I have not changed any parameters. What am I getting wrong? Any suggestions?

Thanks!
Cheers
julitosm
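Before touching the training code it is worth confirming that TensorFlow can actually see the GPU (this works on the TF 1.x versions this project targets):

from tensorflow.python.client import device_lib

# Lists CPU and any visible GPU devices; if no GPU device shows up,
# the GPU build of TensorFlow (or the CUDA/cuDNN setup) is the problem,
# not this project's code.
print([d.name for d in device_lib.list_local_devices()])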

Error while reshaping tensors

I am facing a problem reshaping my tensors. Right now I am running train.py from your source, but I got the following error:

File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 625, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (100, 4096) for Tensor 'Placeholder:0', which has shape '(?, 262144)'

This is my code snippet:

for i in range(6000-1):
    batch_xs, batch_ys = speech.train.next_batch(100)
    # WTF, tensorflow can't do 3D tensor operations?
    # https://github.com/tensorflow/tensorflow/issues/406 =>

    batch_xs=[flatten(matrix) for matrix in batch_xs]

    #batch_ys = np.reshape(batch_ys, (100,4096))
    #batch_xs = np.reshape(batch_xs, (4096,100))

    #  you have to reshape to flat/matrix data? why didn't they call it matrixflow?
    feed = {x: batch_xs, y_: batch_ys}
    speech_step.run(feed) # better for encod_entropy too! (later)
    if(i%100==0):
        print("iteration %d"%i)#, end=' ')
        eval(feed)
    if((i+1)%7000==0):
      print("l_rate*=0.1")
      sess.run(tf.assign(l_rate,l_rate*0.1))
  print("Train")
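For what it's worth, 4096 = 64x64 while 262144 = 512x512, so the placeholder appears to be declared for a different spectrogram size than the one speech_data actually produces. A hedged sketch of defining the placeholders from shared constants instead of hard-coded sizes (width, height and classes below are illustrative values, not the repo's exact settings):

import tensorflow as tf

width = height = 64        # must match the spectrogram size speech_data generates (64*64 = 4096)
classes = 10               # illustrative number of target classes
x  = tf.placeholder(tf.float32, [None, width * height])
y_ = tf.placeholder(tf.float32, [None, classes])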

problems with tensorboard_util.py

tensorboard_util.py will not run on Windows; I made the following changes to make it work.

In layer/__init__.py:
changed "from tensorboard_util import *" to "from .tensorboard_util import *"

tensorboard_logs = '/tmp/tensorboard_logs/'
needs to be updated for Windows; I just changed it to tensorboard_logs = './tmp/tensorboard_logs/'

and changed
logs=subprocess.check_output(["ls", tensorboard_logs]).split("\n")
to
logs=subprocess.check_output(["ls", tensorboard_logs]).decode("utf-8").split("\n")

thanks..

How to classify an entire sentence

I have many sounds in WAV format, each about 10 seconds long. Every WAV is one of ten sentences, for example: "The number you have dialed is powered off" / "The dialed number does not exist" / some other sentences. Is it suitable to use your project for this?
