When built with `make CFLAGS='-g -O0'`, the backtrace is:
```
Program received signal SIGSEGV, Segmentation fault.
0x000000000042c748 in GD::audit_quad (all=..., left_feature=<value optimized out>, left_audit=0x0, right_features=..., audit_right=..., results=std::vector of length 4, capacity 4 = {...}, ns_pre="", offset=<value optimized out>) at gd.cc:235
235       audit_features(all, right_features, audit_right, results, prepend, ns_pre, halfhash + offset, left_audit->x);
Missing separate debuginfos, use: debuginfo-install boost-program-options-1.41.0-18.el6.x86_64 glibc-2.12-1.132.el6.x86_64 libgcc-4.4.7-4.el6.x86_64 libstdc++-4.4.7-4.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) where
#0  0x000000000042c748 in GD::audit_quad (all=..., left_feature=<value optimized out>, left_audit=0x0, right_features=..., audit_right=..., results=std::vector of length 4, capacity 4 = {...}, ns_pre="", offset=<value optimized out>) at gd.cc:235
#1  0x000000000042e060 in GD::print_features (all=..., ec=...) at gd.cc:295
#2  0x000000000043f6cc in learn<true, true, true, 2, 0> (d=0x704330, base=..., ec=...) at gd.cc:620
#3  LEARNER::tlearn<GD::gd, &(GD::learn)> (d=0x704330, base=..., ec=...) at ./learner.h:67
#4  0x0000000000447c82 in learn (all=0x6ca4c0) at ./learner.h:109
#5  LEARNER::generic_driver (all=0x6ca4c0) at learner.cc:20
#6  0x0000000000409c68 in main (argc=<value optimized out>, argv=<value optimized out>) at main.cc:46
```
Hi Sam,
Thanks for the report. Unfortunately, I can't reproduce with latest source from github.
Can you provide a fully reproducible example (the full command line, and hopefully a small sample of the data set), please? To help trim the data size significantly, you can run with --progress 1 to find the specific example where the crash occurs.
Thanks again
```
vw --passes 100 --invert_hash 6289-hinge-100-qdk.tx^C--loss_function hinge --cache_file cache -q dk
```
I have a 5 MB data file; GitHub rejected the e-mail with it attached as too large.
OK, thanks. I managed to reproduce this with a small data set. There's no need for 100 passes (2 suffice), and no need for --loss_function hinge either. Reproducing requires a combination of three conditions:
- using a cache
- using -q
- using --invert_hash

It SEGVs after the 1st pass is complete, when it tries to write the cache.
My command line is:

```
vw -k -c --passes 2 --invert_hash zz.ih -q dk Regressions/q-segfault.dat
```

`Regressions/q-segfault.dat` is just 2 lines:

```
1 |domain x.com |keyword a b c
-1 |domain y.com |keyword d e f
```
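For reference, the crash can be reproduced end to end with a short script. The data file is created inline from the two lines above; the vw invocation is guarded (and allowed to fail) so the script still completes on machines where vw is missing, or on current builds that refuse the cache + --invert_hash combination instead of crashing:

```shell
# Create the 2-line data set from the comment above.
cat > q-segfault.dat <<'EOF'
1 |domain x.com |keyword a b c
-1 |domain y.com |keyword d e f
EOF

# Run the crashing combination (cache + -q + --invert_hash, 2 passes).
# Older builds segfault here; newer builds reject the flag combination,
# hence the `|| true` so the script itself exits cleanly either way.
if command -v vw >/dev/null 2>&1; then
  vw -k -c --passes 2 --invert_hash zz.ih -q dk q-segfault.dat || true
fi
```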
Is there a reason to allow cache and invert_hash together? It seems like a bad idea, so I tweaked the code to disallow the combination.
-John
On 05/15/2014 01:02 AM, Ariel Faigon wrote:
> Is there a reason to allow cache and invert_hash?

I need the cache to do more than 1 pass, and --invert_hash to produce a human-readable model.
invert_hash has a severe performance impact. You should only use it
sparingly. Why not instead save a learned regressor and then use
invert_hash in a single pass over the data to get a readable model?
-John
On 05/17/2014 09:27 PM, Sam Steingold wrote:
The single pass you are suggesting will change the regressor, so the resulting readable model will not be identical to the learned one.
Also, this one extra pass may be quite expensive: what if I trained the model on Hadoop?
Use '-t' to turn off training.
This one extra pass will be radically less expensive than keeping
invert_hash on for all the passes.
-John
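The workflow John describes can be sketched as a script. The file names (train.dat, model.vw, readable.txt) are placeholders, and a tiny stand-in data set is created inline so the sketch is self-contained; the vw calls are guarded so it degrades gracefully where vw is not installed:

```shell
# Placeholder training data; substitute your real data set.
cat > train.dat <<'EOF'
1 |domain x.com |keyword a b c
-1 |domain y.com |keyword d e f
EOF

if command -v vw >/dev/null 2>&1; then
  # Multi-pass training: cache enabled, no --invert_hash, save the regressor.
  vw -k -c --passes 100 -q dk -f model.vw train.dat || true
  # One extra pass with -t (no learning) just to dump a readable model;
  # the saved regressor is left unchanged, and --invert_hash only pays
  # its performance cost on this single pass.
  vw -t -i model.vw --invert_hash readable.txt -q dk train.dat || true
fi
```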
On 05/17/2014 09:34 PM, Sam Steingold wrote: