Label encoding about medaka HOT 7 CLOSED

nanoporetech commented on July 17, 2024
Label encoding

from medaka.

Comments (7)

cjw85 commented on July 17, 2024

Hi @vineeth-s

Thanks for your interest.

We haven't properly considered the case where a draft contains "N" but suspect the consensus inference should still work (though we will skip over regions containing "N" during training - the model will not learn to recall "N").

In your case I think the problem arises because you terminated the training early: when training ends we write some additional meta information that the consensus program requires for inference. You should, however, be able to rescue the situation by running the consensus with the additional option:

--encoding <train_name>/<train_name>_label_encodings.json

For future reference, the training program does use early stopping: this is hard-coded to terminate the training when the validation loss has not improved for 20 epochs.
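The stopping rule described above can be illustrated with a minimal sketch. This is not medaka's actual training code; the function name `early_stopping_epoch` and its arguments are hypothetical, and only the patience of 20 epochs comes from the description above.

```python
def early_stopping_epoch(val_losses, patience=20):
    """Return the 1-based epoch at which training would stop, or None.

    Hypothetical helper: stops once the validation loss has failed to
    improve on its best value for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best validation loss: reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # The loss kept improving often enough; no early stop was triggered.
    return None


# Best loss at epoch 2, followed by 25 epochs without improvement.
losses = [1.0, 0.9] + [0.95] * 25
stop_epoch = early_stopping_epoch(losses)
```

With a patience of 20, training in this example would stop 20 epochs after the last improvement, which is why interrupting a run by hand before that point skips whatever is written at the end of training.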

vineeth-s commented on July 17, 2024

Hi @cjw85

So I did that, and now it manages to read the labels, but it exits with another error:

[17:30:35 - root] Running network took 1.2875598339887802s for data of shape (1, 4639, 9)
[17:30:35 - root] Decoding took 0.001005782003630884s for 1 chunks.
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7ff24f7f1c88>>
Traceback (most recent call last):
File "/home/ngs/medaka/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 696, in __del__
File "/home/ngs/medaka/venv/lib/python3.5/site-packages/tensorflow/python/framework/c_api_util.py", line 30, in __init__
TypeError: 'NoneType' object is not callable

I think I am just going to let it run to completion instead of preempting it, and see what happens.

vineeth-s commented on July 17, 2024

Hi @cjw85,

How easy or difficult would it be to make the number of epochs a parameter that can be provided on the command line?

Also, thanks for developing and releasing this. If it works the way we see it working, it'd be a major boost to our work.

vineeth-s commented on July 17, 2024

Hi @cjw85

Just letting you know, when I let it run to the end of training, the consensus gets polished without any errors.

Cheers, and thanks much

cjw85 commented on July 17, 2024

That can be easily done. We plan to make a release with a pre-trained model soonish; these models are relatively simple so we're concerned about over-training. We'll see about adding more features to the command line interface.
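As a rough illustration of what exposing the epoch count on the command line could look like, here is a small `argparse` sketch. It is not medaka's interface: the option names `--epochs` and `--patience`, the function `build_parser`, and the default of 100 epochs are all hypothetical placeholders; only the patience default of 20 reflects the hard-coded value mentioned earlier in this thread.

```python
import argparse


def build_parser():
    """Build a hypothetical training CLI (not medaka's real interface)."""
    parser = argparse.ArgumentParser(
        description="Sketch of a training command with tunable epochs."
    )
    parser.add_argument(
        "--epochs", type=int, default=100,
        help="Maximum number of training epochs (placeholder default).",
    )
    parser.add_argument(
        "--patience", type=int, default=20,
        help="Epochs without validation-loss improvement before stopping.",
    )
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(["--epochs", "50"])
```

Passing an explicit list to `parse_args` keeps the sketch testable; a real entry point would call `parse_args()` with no arguments to read `sys.argv`.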

I've occasionally seen the error you report above whilst running keras. In those instances it appeared harmless. Can you check if an output .fasta file has been produced?

vineeth-s commented on July 17, 2024

I am looking at the output with the chunks, and only one segment of that particular draft has been written out. Would I be correct in assuming that only sequences, or segments of sequences, that get polished are written out?

PS: apologies for the comment/question bombardment

cjw85 commented on July 17, 2024

Yes, that is correct. This was simply a pragmatic choice to not over-complicate the code and keep the output transparent.

Happy to answer questions, it helps us improve these tools. :)
