opal's People

Contributors

yunwilliamyu

opal's Issues

Vowpal wabbit spitting back warning: "label 5278 is not in {1,5277} This won't work right."

Hi Yun,

Thanks for your help. Your wrapper works beautifully, but now I've run into trouble with Vowpal Wabbit. It keeps telling me: "label 5278 is not in {1,5277} This won't work right."

I googled around, and found this:
VowpalWabbit/vowpal_wabbit#410

It says something about formatting with newlines in the --oaa format... I cannot make heads or tails of their discussion. Perhaps you can help me along a bit? Though I realise that this is not anything to do with your code...

I'm fishing about here, but looking at the formatting of my vw-dico.txt file I see an unusual line with a massive number (3rd line in this snippet):

1118379 4117
1519374198215   5274
1670446 2931
426355  189

Could this be part of the problem?
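
One way to check whether the dictionary file itself is the problem is to scan it for IDs outside the range Vowpal Wabbit was told to expect. The sketch below assumes that each line of vw-dico.txt is a label followed by whitespace and an integer class ID (which matches the snippet above), and that 5277 is the class count from the warning. The function name check_dico is made up for illustration and is not part of opal.

```python
def check_dico(lines, num_classes):
    """Return (line_number, label, vwid) for entries whose ID is outside 1..num_classes.

    Assumes each non-empty line is "<label><whitespace><integer id>".
    Malformed lines are reported with vwid set to None.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2 or not fields[1].isdigit():
            problems.append((lineno, line.strip(), None))  # malformed line
            continue
        label, vwid = fields[0], int(fields[1])
        if not 1 <= vwid <= num_classes:
            problems.append((lineno, label, vwid))
    return problems


# The four lines quoted above
sample = ["1118379 4117", "1519374198215   5274", "1670446 2931", "426355  189"]
in_range_result = check_dico(sample, 5277)

# A hypothetical out-of-range entry, for illustration only
out_of_range_result = check_dico(["12345\t5278"], 5277)

print(in_range_result)
print(out_of_range_result)
```

In the four quoted lines, the long label 1519374198215 maps to ID 5274, which is within 1..5277, so that line on its own would not trigger the warning. To check the whole file, pass open("vw-dico.txt") as the lines argument.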

vw-dico.txt file attached:
vw-dico.txt

Log file with error attached:
vw-model_vwps.log.txt

errors compiling

Hi

I am trying to install opal on a Mac OS X system and ran into the following errors:

file.c:59:2: error: non-void function '_gdl_runtime_fread_rng' should return a value [-Wreturn-type]
GDL_FREAD_STATUS (status, 1);
^
../gdl/gdl_errno.h:162:7: note: expanded from macro 'GDL_FREAD_STATUS'
{ GDL_ERROR_VOID ("fread failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:61:2: error: non-void function '_gdl_runtime_fread_rng' should return a value [-Wreturn-type]
GDL_FREAD_STATUS (rng!=0, 1);
^
../gdl/gdl_errno.h:162:7: note: expanded from macro 'GDL_FREAD_STATUS'
{ GDL_ERROR_VOID ("fread failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:83:2: error: non-void function '_gdl_runtime_fwrite_rng' should return a value [-Wreturn-type]
GDL_FWRITE_STATUS (status, 1);
^
../gdl/gdl_errno.h:167:7: note: expanded from macro 'GDL_FWRITE_STATUS'
{ GDL_ERROR_VOID ("fwrite failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:85:2: error: non-void function '_gdl_runtime_fwrite_rng' should return a value [-Wreturn-type]
GDL_FREAD_STATUS (status, GDL_SUCCESS);
^
../gdl/gdl_errno.h:162:7: note: expanded from macro 'GDL_FREAD_STATUS'
{ GDL_ERROR_VOID ("fread failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:100:3: error: non-void function 'gdl_runtime_fread' should return a value [-Wreturn-type]
GDL_FREAD_STATUS (T!=0, 1);
^
../gdl/gdl_errno.h:162:7: note: expanded from macro 'GDL_FREAD_STATUS'
{ GDL_ERROR_VOID ("fread failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:105:3: error: non-void function 'gdl_runtime_fread' should return a value [-Wreturn-type]
GDL_FREAD_STATUS (run->rng!=0, 1);
^
../gdl/gdl_errno.h:162:7: note: expanded from macro 'GDL_FREAD_STATUS'
{ GDL_ERROR_VOID ("fread failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:120:3: error: non-void function 'gdl_runtime_fwrite' should return a value [-Wreturn-type]
GDL_FWRITE_STATUS (status, GDL_SUCCESS);
^
../gdl/gdl_errno.h:167:7: note: expanded from macro 'GDL_FWRITE_STATUS'
{ GDL_ERROR_VOID ("fwrite failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:122:3: error: non-void function 'gdl_runtime_fwrite' should return a value [-Wreturn-type]
GDL_FWRITE_STATUS (status, GDL_SUCCESS);
^
../gdl/gdl_errno.h:167:7: note: expanded from macro 'GDL_FWRITE_STATUS'
{ GDL_ERROR_VOID ("fwrite failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
8 errors generated.
make[2]: *** [file.lo] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
Making install in gdl
make[2]: Nothing to be done for `install-exec-am'.
make[2]: Nothing to be done for `install-data-am'.
Making install in err
make[2]: Nothing to be done for `install-exec-am'.
.././install-sh -c -d '/Users/ksoh/Projects/opal/util/ext/gdl-1.1/GDL/include/gdl'
/usr/bin/install -c -m 644 gdl_errno.h gdl_message.h '/Users/ksoh/Projects/opal/util/ext/gdl-1.1/GDL/include/gdl'
Making install in run
/bin/sh ../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I.. -g -O2 -c -o file.lo file.c
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I.. -g -O2 -c file.c -fno-common -DPIC -o .libs/file.o
file.c:59:2: error: non-void function '_gdl_runtime_fread_rng' should return a value [-Wreturn-type]
GDL_FREAD_STATUS (status, 1);
^
../gdl/gdl_errno.h:162:7: note: expanded from macro 'GDL_FREAD_STATUS'
{ GDL_ERROR_VOID ("fread failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:61:2: error: non-void function '_gdl_runtime_fread_rng' should return a value [-Wreturn-type]
GDL_FREAD_STATUS (rng!=0, 1);
^
../gdl/gdl_errno.h:162:7: note: expanded from macro 'GDL_FREAD_STATUS'
{ GDL_ERROR_VOID ("fread failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:83:2: error: non-void function '_gdl_runtime_fwrite_rng' should return a value [-Wreturn-type]
GDL_FWRITE_STATUS (status, 1);
^
../gdl/gdl_errno.h:167:7: note: expanded from macro 'GDL_FWRITE_STATUS'
{ GDL_ERROR_VOID ("fwrite failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:85:2: error: non-void function '_gdl_runtime_fwrite_rng' should return a value [-Wreturn-type]
GDL_FREAD_STATUS (status, GDL_SUCCESS);
^
../gdl/gdl_errno.h:162:7: note: expanded from macro 'GDL_FREAD_STATUS'
{ GDL_ERROR_VOID ("fread failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:100:3: error: non-void function 'gdl_runtime_fread' should return a value [-Wreturn-type]
GDL_FREAD_STATUS (T!=0, 1);
^
../gdl/gdl_errno.h:162:7: note: expanded from macro 'GDL_FREAD_STATUS'
{ GDL_ERROR_VOID ("fread failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:105:3: error: non-void function 'gdl_runtime_fread' should return a value [-Wreturn-type]
GDL_FREAD_STATUS (run->rng!=0, 1);
^
../gdl/gdl_errno.h:162:7: note: expanded from macro 'GDL_FREAD_STATUS'
{ GDL_ERROR_VOID ("fread failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:120:3: error: non-void function 'gdl_runtime_fwrite' should return a value [-Wreturn-type]
GDL_FWRITE_STATUS (status, GDL_SUCCESS);
^
../gdl/gdl_errno.h:167:7: note: expanded from macro 'GDL_FWRITE_STATUS'
{ GDL_ERROR_VOID ("fwrite failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
file.c:122:3: error: non-void function 'gdl_runtime_fwrite' should return a value [-Wreturn-type]
GDL_FWRITE_STATUS (status, GDL_SUCCESS);
^
../gdl/gdl_errno.h:167:7: note: expanded from macro 'GDL_FWRITE_STATUS'
{ GDL_ERROR_VOID ("fwrite failed", GDL_EFAILED);}
^
../gdl/gdl_errno.h:131:8: note: expanded from macro 'GDL_ERROR_VOID'
return ;
^
8 errors generated.
make[1]: *** [file.lo] Error 1
make: *** [install-recursive] Error 1
gcc -O2 -Wall -Wshadow -I./ -Iext/gdl-1.1/GDL/include -c drawfrag.c
drawfrag.c:18:10: fatal error: 'gdl/gdl_common.h' file not found
#include <gdl/gdl_common.h>
^
1 error generated.
make: *** [drawfrag.o] Error 1

Any advice on resolving these errors? Thank you.

Update code for Python 3 (Python 2.7 no longer maintained after January 1, 2020)?

https://pythonclock.org/

It doesn't look like it would be too difficult:

bash-4.3$ 2to3-3.5 opal.py util/*.py 2> /dev/null
--- opal.py     (original)
+++ opal.py     (refactored)
@@ -21,7 +21,7 @@
 This pipeline depends on Python scikit-learn and on Vowpal Wabbit. Vowpal
 Wabbit must be properly installed in the system path.
 '''
-from __future__ import print_function
+
 __version__ = "0.9.1"

 import argparse
--- util/drawfrag.py    (original)
+++ util/drawfrag.py    (refactored)
@@ -4,7 +4,7 @@
 From each Fasta sequence in a file, draw random substrings of size k covering it c times, returning a new multi-fasta file with labels
 '''

-from __future__ import print_function
+
 __version__ = "0.0.1"
 import argparse
 import os
--- util/fasta2skm.py   (original)
+++ util/fasta2skm.py   (refactored)
@@ -6,7 +6,7 @@
     The C program has several bugs, including not accepting single character labels, and not being able to read non-numeric labels from the dico file, though it can generate them.
 '''

-from __future__ import print_function
+
 __version__ = "0.0.1"
 import argparse
 import os
@@ -39,7 +39,7 @@
             label2vwid[label] = vwid
             vwidset.add(vwid)
     with open(dico_file, "w") as df:
-        for label, vwid in label2vwid.items():
+        for label, vwid in list(label2vwid.items()):
             df.write("{}\t{}\n".format(label, vwid))
     return label2vwid

@@ -78,7 +78,7 @@
             if args.reverse:
                 feature_list.extend(gen_features(pattern_getters, reverse_complement(seq), args.kmer))
             features = " ".join(feature_list)
-            yield '{} | {}\n'.format(labels.next(), features)
+            yield '{} | {}\n'.format(next(labels), features)
     if args.taxid:
         taxid_file.close()

--- util/fasta_functions.py     (original)
+++ util/fasta_functions.py     (refactored)
@@ -46,7 +46,7 @@
     return dna[::-1].translate(trans)

 def get_all_substrings(input_string, k):
-    return [input_string[i:i+k] for i in xrange(len(input_string) - k + 1)]
+    return [input_string[i:i+k] for i in range(len(input_string) - k + 1)]

 pat = re.compile('^[ACGTacgt]*$')
 def check_acgt(s):

I don't think the "list(___)" change is necessary here.
But there might be an issue with the version of sklearn.
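
The substantive changes in the 2to3 output can be tried in isolation. The sketch below copies get_all_substrings from the diff and uses a made-up dictionary and label iterator (illustration only, not opal data structures). It shows range replacing xrange and next(labels) replacing labels.next(). It also shows that iterating label2vwid.items() directly works in Python 3 without list(), as long as the dictionary is not modified inside the loop.

```python
def get_all_substrings(input_string, k):
    # Python 3: range() is lazy, as xrange() was in Python 2.
    return [input_string[i:i+k] for i in range(len(input_string) - k + 1)]


# Hypothetical data, for illustration only
label2vwid = {"1118379": 4117, "426355": 189}

# Iterating over items() directly is fine when the dict is only read
rows = ["{}\t{}\n".format(label, vwid) for label, vwid in label2vwid.items()]

labels = iter(["4117", "189"])
first = next(labels)  # Python 3 iterators have no .next() method

substrings = get_all_substrings("ACGTA", 3)
print(substrings)
print(rows)
print(first)
```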

num_batches option in wrapper fails to split training fasta file

Hello Yun,

Lovely idea for Opal! I cannot use the wrapper because opal.py train will not split my training fasta file with ~5k bacterial genomes and paired taxids into batches. I can set --num-batches to 1000 and the wrapper continues to use a single batch. I looked at your train function and found this:

for i in range(num_batches):
    seed = seed + 1
    batch_prefix = os.path.join(model_dir, "train.batch-{}".format(i))
    fasta_batch = batch_prefix + ".fasta"
    gi2taxid_batch = batch_prefix + ".gi2taxid"
    taxid_batch = batch_prefix + ".taxid"

But my run (time ./opal.py train -c 10 -r --num-batches 1000 train/ model/) is killed at this snippet, I guess because I run out of memory or something like that:

print("Getting training set ...")
sys.stdout.flush()
skms = fasta2skm.main_generator(fasta2skm_namespace)
training_list = [line.rstrip('\n') for line in skms]

The wrapper only generates a single set of massive batch-0 files, including one 300 GB file of fasta frags!

I cannot see where between these two pieces of code the wrapper splits up the fasta training set. Am I meant to split up my fasta file myself first? The example datasets don't look like that...
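
For reference, a large FASTA file can be split into batches in a streaming fashion, so that only one line is held in memory at a time. The following is a sketch of that general idea, not opal's implementation. The function split_fasta and its round-robin assignment of records to batches are assumptions; only the train.batch-{i}.fasta naming is taken from the snippet above.

```python
import os
import tempfile


def split_fasta(fasta_path, out_dir, num_batches):
    """Stream records from fasta_path into num_batches files, round-robin.

    Hypothetical helper: record n goes to batch n % num_batches.
    """
    paths = [os.path.join(out_dir, "train.batch-{}.fasta".format(i))
             for i in range(num_batches)]
    handles = [open(p, "w") for p in paths]
    try:
        record_index = -1
        with open(fasta_path) as fasta:
            for line in fasta:
                if line.startswith(">"):
                    record_index += 1  # a header line starts a new record
                if record_index >= 0:
                    handles[record_index % num_batches].write(line)
    finally:
        for handle in handles:
            handle.close()
    return paths


# Small demonstration on a three-record file split into two batches
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "train.fasta")
    with open(src, "w") as f:
        f.write(">a\nACGT\n>b\nGGCC\nTTAA\n>c\nAAAA\n")
    contents = []
    for path in split_fasta(src, tmp, 2):
        with open(path) as f:
            contents.append(f.read())

print(contents)
```

This sketch keeps one file handle open per batch, so a large num_batches such as 1000 may exceed the operating system's limit on open files. A paired taxid file would also need to be split with the same record-to-batch assignment to stay aligned with the sequences.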
