Comments (7)
Hi David,
I would like to be able to specify the precise location and name of the root of the output directory. It's mildly annoying to have to move it after the OrthoFinder execution finishes in order to keep my pipeline results directory tidy.
Thanks again!
from orthofinder.
I think something like that should be possible. The consideration will be how to do it while balancing a couple of things:
- The ability to run modified analyses (i.e. adding/removing species or picking up from where a previous analysis left off) without it being too complicated to specify/find where all the files are etc.
- Allowing the user to specify where results files go so they can better keep track of them if they are running multiple analyses.
What were the other things you were thinking of beyond specifying the location of the output directory? And are you specifically interested in being able to put the results in a certain place or avoiding files/directories being created in the input directory?
All the best
David
from orthofinder.
+1
I read the current documentation but have not found a solution to this. Has any headway been made?
from orthofinder.
Hi
I'm looking at rearranging the output directory structure for OrthoFinder and I just wanted to check whether the planned changes would work for what you'd like to be able to do. The plan is that you would be able to specify a root directory for the OrthoFinder results (which could be anywhere). You would also be able to supply an optional name to describe this particular run (e.g. 'Initial_Run'). If you specified "/path/to/myresults/" and the name 'Initial_Run' then the results structure would look like this:
myresults/
    - Results_Mar01_Initial_Run/
        - Orthogroups/
        - Orthologues/
        - GeneTrees/
        - SpeciesTree/
        - GeneDuplications/
        - DubiousGenes/
        - ComparativeGenomicStatistics/
        - Log.txt
    - WorkingDirectory/
The 'NAME' part of the directory name ('Initial_Run' in this example) is optional and lets you describe this particular run. Each of the directories would contain the corresponding results files. The Log.txt file would say what species were included and what command line was given to OrthoFinder.
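To make the proposed layout concrete, here is a minimal Python sketch that builds it on disk. This is only an illustration of the plan described above, not OrthoFinder's actual code: the function `make_results_dir`, its parameters, and the exact Log.txt format are all hypothetical.

```python
import datetime
from pathlib import Path

# Subdirectory names taken from the proposed layout above.
RESULT_SUBDIRS = (
    "Orthogroups", "Orthologues", "GeneTrees", "SpeciesTree",
    "GeneDuplications", "DubiousGenes", "ComparativeGenomicStatistics",
)
# Fixed English abbreviations so the directory name does not depend on locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def make_results_dir(root, name=None, species=(), command="", today=None):
    """Hypothetical helper: create Results_<MonDD>[_<NAME>]/ under `root`."""
    today = today or datetime.date.today()
    label = f"Results_{MONTHS[today.month - 1]}{today.day:02d}"
    if name:
        label += f"_{name}"

    root = Path(root)
    results = root / label
    # Fail rather than overwrite an existing run with the same date and name.
    results.mkdir(parents=True, exist_ok=False)
    for sub in RESULT_SUBDIRS:
        (results / sub).mkdir()

    # One WorkingDirectory per root, shared by every run under that root.
    (root / "WorkingDirectory").mkdir(exist_ok=True)

    # Assumed log format: the command line, then one species per line.
    lines = [f"Command: {command}", "Species:"]
    lines += [f"  {s}" for s in species]
    (results / "Log.txt").write_text("\n".join(lines) + "\n")
    return results
```

Calling the helper with a root of "/path/to/myresults/" and the name 'Initial_Run' on 1 March would create the tree shown above.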
If you wanted to run a subsequent analysis reusing some of the previous results (e.g. removing one of the species and using the previous BLAST results) then you would have to use the same root directory as before, but you could specify a new 'NAME' to identify this one (e.g. 'Removed_Ecoli'). Your results directory structure would then look like this:
myresults/
    - Results_Mar01_Initial_Run/
        - Orthogroups/
        - Orthologues/
        - GeneTrees/
        - etc ...
        - Log.txt
    - Results_Mar01_Removed_Ecoli/
        - Orthogroups/
        - Orthologues/
        - GeneTrees/
        - etc ...
        - Log.txt
    - WorkingDirectory/
By default (if you didn't specify a location) the root directory would be called 'OrthoFinder' and be in the directory containing the FASTA files, as before. But you'd be able to choose any other location instead. Any subsequent runs that used previously calculated results would have to use the same root directory so that OrthoFinder would be able to find the files it needs. Of course, if you did a new analysis of the FASTA files right from the start (not reusing any of the previous work) then you'd be able to specify a different root directory (in fact, you'd have to use a different root directory, as it would need its own, new WorkingDirectory/ for all the files OrthoFinder needs).
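The root-directory rules in the previous paragraph could be sketched roughly as follows. Again, this is a hedged illustration of the plan rather than OrthoFinder's implementation: `resolve_root`, `check_working_dir`, and the `reuse_previous` flag are hypothetical names introduced for this example.

```python
from pathlib import Path


def resolve_root(fasta_dir, root=None):
    """Use the user-supplied root, else '<fasta_dir>/OrthoFinder' as before."""
    return Path(root) if root is not None else Path(fasta_dir) / "OrthoFinder"


def check_working_dir(root, reuse_previous):
    """Hypothetical check of the WorkingDirectory rule for a given root."""
    working = Path(root) / "WorkingDirectory"
    if reuse_previous and not working.is_dir():
        # A run that reuses earlier results must point at the same root.
        raise FileNotFoundError(f"No previous results found in {working}")
    if not reuse_previous and working.exists():
        # A run from scratch needs a root with its own, new WorkingDirectory.
        raise FileExistsError(f"{working} already exists; choose another root")
    return working
```

The point of the second function is simply that the WorkingDirectory ties all runs under one root together, so reuse and fresh analyses have opposite requirements.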
I haven't yet worked out all the implementation details necessary to make this work, and it will involve quite a bit of refactoring, but those changes should all be below the surface and not alter any of the above plan. Would you be able to let me know if these changes would let you do what you'd like to be able to do, and if you have any other feedback on the plan?
Thanks
David
from orthofinder.
This strategy would solve my problems. +1!
from orthofinder.
Thank you, David, this seems to address every concern that I had. Looking forward to seeing the implementation!
from orthofinder.
Forgot to say, this is now available in the latest version.
from orthofinder.