Comments (9)
Hi Rasmus,
Can you confirm the version of medaka that you are using? Apologies in advance for this, the command line program does not report its version so you would need to run:
python -c "import medaka; print(medaka.__version__)"
to print the version number.
Code to limit CPU resource was added in v0.4.0. The latest release is v1.4.3, available through the github release page and pypi.
With v0.4.0 and greater you should see a log message along the lines of:
Setting tensorflow threads to 60.
Can you confirm this is the case? If you can report the tensorflow version that is being used, that would be useful also.
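Both versions can be gathered in one go. The following is only a minimal sketch: the helper name `report_version` is made up for illustration, and it assumes each package exposes a `__version__` attribute, as the one-liner above does for medaka.

```python
import importlib


def report_version(module_name):
    """Return the module's __version__, or a short note if it is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return "not installed"
    # Assumption: the package follows the __version__ convention.
    return getattr(module, "__version__", "unknown")


for name in ("medaka", "tensorflow"):
    print(name, report_version(name))
```

Running this inside the same Python environment that medaka uses should show which tensorflow build is actually being picked up.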
from medaka.
Hi
The medaka version is 0.4.3 and the release is 1.4.3. Would be nice to have the command line program report the version number as well, it is very handy to keep track of all the updates.
I am afraid that medaka did not produce a log file and I killed the process after running for a day as it seemed to be stuck.
The tensorflow version on the system is 1.8.0.
from medaka.
Oh! Thank you for discovering a misnumbering of our github releases! The version in the code should be considered canonical (we'll have to get that put right).
We will see if we can reproduce this with tensorflow 1.8.0 (currently we've been using tensorflow 1.12.0).
Can I ask how big your draft sequence is? A day seems rather long, though admittedly the longest sequence on which we have tested is a human chromosome. I suspect that there is some threading occurring outside of tensorflow's control, e.g. in the BLAS library on your system.
If you know what BLAS library is being called can you try setting its threading behaviour through an environment variable, e.g. for openblas:
OPENBLAS_NUM_THREADS=1
I presume you see the 8000% use in top even if you ask medaka to use only a single thread? (Note you may see bursts above 100% even in this case since medaka performs some multithreaded IO).
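If it is not clear which BLAS library is in use, one option is to cap several common thread-count variables at once before launching the job. This is a rough sketch under assumptions: the helper `limited_thread_env` is hypothetical, and `OMP_NUM_THREADS` and `MKL_NUM_THREADS` are added here as a guess at other backends (OpenMP-based builds, Intel MKL), not something confirmed for your system.

```python
import os

# OPENBLAS_NUM_THREADS is from the suggestion above; the other two are
# assumptions covering OpenMP-based and MKL-based BLAS builds.
BLAS_THREAD_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def limited_thread_env(n_threads=1, base_env=None):
    """Return a copy of the environment with BLAS thread counts capped."""
    env = dict(os.environ if base_env is None else base_env)
    for var in BLAS_THREAD_VARS:
        env[var] = str(n_threads)
    return env


env = limited_thread_env(1)
print({var: env[var] for var in BLAS_THREAD_VARS})
```

The resulting dictionary can be passed as `env=` to `subprocess.run` when starting the consensus job, so the limits apply only to that process and not to the rest of the shell session.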
from medaka.
Okay. I will try to get my sysadmin to update our tensorflow installation.
The draft is a 400 Mbp metagenome. But the hardware I am running this on is rather dated. Have you tested if medaka can run on the promethion compute unit?
I have tried restarting the consensus calling with OPENBLAS_NUM_THREADS=1 in the environment.
I am used to seeing jumps a bit above 100%, but not from 6000% to 8000%.
Any chance that future releases of medaka will have a resume option? Currently it gives an error when run in the same folder without deleting the old output folder. It would be nice to avoid running e.g. minimap twice.
from medaka.
We generally recommend installing medaka into a virtual environment (if only to not bother our own sysadmins and manage versions of python packages per software). Depending on exactly how dated your hardware is, the tensorflow versions on pypi may work. My suggestion would be to set up a virtual environment and see.
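For what it is worth, a virtual environment can also be created from Python itself with the standard-library `venv` module. This is only a sketch: the helper name `make_env` and the directory name are placeholders, and whether the pypi tensorflow wheels run on your hardware is exactly the thing to be tested.

```python
import venv
from pathlib import Path


def make_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its config file."""
    venv.create(env_dir, with_pip=with_pip)
    # pyvenv.cfg is written by venv at the root of every environment it creates.
    return Path(env_dir) / "pyvenv.cfg"


# Example (placeholder directory name):
# make_env("medaka_env")
```

After that, running the environment's own pip (e.g. `medaka_env/bin/pip install medaka` on Linux) should pull medaka and its dependencies from pypi into the environment only.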
It is possible to run the medaka_consensus script and not have minimap2 run twice. Currently the part which is broken is that if medaka itself fails it leaves around a partial consensus_probs.hdf file in the output directory. You can delete this file rather than the whole directory to skip only the minimap2 step. We will try to make this easier in the next release.
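The cleanup described above could be scripted along these lines. It is a sketch only: the helper `remove_partial_probs` is hypothetical, and it assumes the partial file sits directly in the output directory under the name consensus_probs.hdf.

```python
from pathlib import Path


def remove_partial_probs(output_dir, filename="consensus_probs.hdf"):
    """Delete a leftover partial probabilities file; return True if one was removed."""
    partial = Path(output_dir) / filename
    if partial.is_file():
        partial.unlink()
        return True
    return False
```

Calling it on the old output directory before rerunning medaka_consensus leaves the other files in that directory untouched, so only the medaka step has to be repeated.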
Medaka will certainly run on a promethion compute unit (and can in principle make use of the GPUs if you install tensorflow-gpu into the medaka environment). However, naturally I cannot condone installation of additional software (even ONT software) onto the box which may interfere with its intended and warrantied use.
from medaka.
Our sysadmin prefers to install software in the module system, and I cannot install python3 packages in a virtual environment as we apparently have some installation issue.
I can confirm that even with OPENBLAS_NUM_THREADS=1 and with the log reporting "Setting tensorflow threads to 30." it requests many more resources than are available and thus ends up sleeping most of the time. This is probably the reason why it never finishes.
Very handy, I will just delete that file next time.
Okay that is a pity but I understand that it is risky to run more beta release software and could be a problem.
from medaka.
The misnumbered github releases have been fixed.
I've not been able to reproduce the higher than expected CPU usage with tensorflow 1.8.0 installed from pypi. On a server with 28 physical CPUs (56 threads), I see a maximum of ~3000% usage in top. With the -t option of medaka_consensus set to 8, I see usage fairly well pinned to ~800%, admittedly with a few short bursts up to 1000%.
We will endeavour to research and debug this further.
We are looking to provide releases validated for use on the PromethION compute box, though there is no timeline for this.
from medaka.
I installed on a different machine in a virtual environment and it ran all the way to the end so for now I am happy with that 😄 . Thank you for your time.
from medaka.
No worries, always good to know where we have issues on some environments and hardware.
from medaka.