
ifcb_classifier's People

Contributors

joefutrelle, sbatchelder

ifcb_classifier's Issues

Metadata integration

Allow the model to incorporate ROI dimensions and other metadata into the classification process.

Trailing newline characters on filter strings read from a text file prevent proper matching

Hi @sbatchelder and @joefutrelle. I've been working to integrate ifcb_classifier into our IFCB processing pipeline here at Axiom. Overall, it's been great to work with, but I ran into a bug when trying to pull in filter strings from a text file.

The issue I was seeing was that when I ran something like python neuston_net.py RUN ... --clobber --filter my-filters.txt, I would just immediately get RUN IS DONE as if none of my filter strings had matched any of the bin object filepaths, even though the same filters matched if I passed them directly to --filter.

I dug in and tracked the issue down to trailing newline characters resulting from this call to f.readlines(). Basically each newline-delimited filter string still had its newline character on the end of it (because readlines doesn't remove them), which prevented it from matching with any of the bin object filepaths. I changed that line to f.read().splitlines(), and that got file-based filtering working again for me.

I noticed a similar pattern being used in some places, but not in others, so I'm assuming the issue I ran into was just an oversight. I have a fix on the fork we're currently running here and would be happy to turn that into a pull request if you're open to contributions. If so, I can also fix the other occurrences of this readlines pattern I ran across in the codebase.
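The difference between the two calls can be reproduced in isolation. The snippet below is only a sketch: the filter strings, the bin path, and the substring-style match are assumptions made for illustration, not the matching logic neuston_net.py actually uses.

```python
import io

def read_filters_with_readlines(f):
    # readlines() keeps the trailing "\n" on every line it returns
    return f.readlines()

def read_filters_with_splitlines(f):
    # read().splitlines() drops the line terminators
    return f.read().splitlines()

# Hypothetical contents of a filter file and a hypothetical bin path
filter_file_text = "D2021\nD2022\n"
bin_path = "data/D2021/IFCB5_bin"

with_newlines = read_filters_with_readlines(io.StringIO(filter_file_text))
without_newlines = read_filters_with_splitlines(io.StringIO(filter_file_text))

# Assumed matching rule: a filter matches if it is a substring of the path
matches_with_newlines = [s for s in with_newlines if s in bin_path]
matches_without_newlines = [s for s in without_newlines if s in bin_path]

print(with_newlines)             # ['D2021\n', 'D2022\n']
print(matches_with_newlines)     # []
print(matches_without_newlines)  # ['D2021']
```

With the newline still attached, no filter can match a path that does not itself contain a newline, which is consistent with the run finishing immediately.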

json file from --type img option has repeat entries

Following the wiki documentation, running a command of this form:
./neuston_net.py RUN run-data/YOUR_PNG_FOLDER training-output/PATH/TO/MODEL.ptl YOUR_RUN_ID --type img
produces JSON result files. The files appear to contain two copies of all the output data: there are twice as many records as expected, and the first and second halves of each file appear to be identical.

SLURM Workflow Automation

Submitting HPC SLURM jobs is not very streamlined, especially for non-developer end users.
Implement a one-stop solution for submitting training and classification jobs on a SLURM-enabled system.

Transfer Learning

Transfer Learning is the process of training a pre-existing model for new output targets without having to retrain the whole network.

In this project this could look similar to the regular TRAIN subcommand, but where MODEL points to a previously trained .ptl model file.

neuston_net.py TRANSFER <optional_args> SRC MODEL TRAINING_ID

Selectively Train on multiple datasets

Labeled data may come from any number of datasets. To improve training experiment throughput, implement a feature by which training datasets can be dynamically combined.

The feature should support the aggregation of classes and images from any number of on-disk datasets and allow the user to specify which classes from which dataset should be included.

Furthermore, to account for the --class-max flag behavior, this feature should be able to prioritize certain datasets over others when truncating per-class sample sizes.
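One way the requested behavior could work is sketched below. Everything in it is hypothetical: the function name combine_datasets, its parameters, and the dataset and class names are invented for illustration. It also assumes that a per-class cap simply keeps the first N images, which may not be how --class-max actually selects samples.

```python
from collections import defaultdict

def combine_datasets(datasets, include=None, class_max=None):
    """Merge labeled datasets into one {class: [image paths]} mapping.

    datasets:  list of (name, {class: [image paths]}) pairs, ordered from
               highest to lowest priority (hypothetical structure).
    include:   optional {dataset name: set of class names}; a dataset not
               listed here contributes all of its classes.
    class_max: optional cap on the number of images kept per class.
    """
    combined = defaultdict(list)
    for name, classes in datasets:
        wanted = None if include is None else include.get(name)
        for label, images in classes.items():
            if wanted is not None and label not in wanted:
                continue  # this class was not requested from this dataset
            combined[label].extend(images)
    if class_max is not None:
        # Higher-priority datasets were appended first, so truncating the
        # tail discards images from lower-priority datasets first.
        for label in combined:
            combined[label] = combined[label][:class_max]
    return dict(combined)

datasets = [
    ("cruise_a", {"diatom": ["a1.png", "a2.png"], "ciliate": ["a3.png"]}),
    ("cruise_b", {"diatom": ["b1.png", "b2.png"], "detritus": ["b3.png"]}),
]
result = combine_datasets(datasets, include={"cruise_b": {"diatom"}}, class_max=3)
print(result)
# {'diatom': ['a1.png', 'a2.png', 'b1.png'], 'ciliate': ['a3.png']}
```

Expressing priority as list order keeps the interface small. A real implementation would presumably read class folders from disk rather than take in-memory dictionaries.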
