arboretum_benchmark's People

Contributors

sh1ng

arboretum_benchmark's Issues

Results

Criteo dataset

40M dataset - 5000 trees

python3 src/criteo_speed_test.py xgboost ; python3 src/criteo_speed_test.py lightgbm; python3 src/criteo_speed_test.py arboretum
reading data....
startring benchmark xgboost
2388.4275090694427
roc auc train:0.8894494708973515 cv:0.782368838905777
reading data....
startring benchmark lightgbm
[LightGBM] [Warning] Starting from the 2.1.2 version, default value for the "boost_from_average" parameter in "binary" objective is true.
This may cause significantly different results comparing to the previous versions of LightGBM.
Try to set boost_from_average=false, if your old models produce bad results
[LightGBM] [Info] Number of positive: 1031045, number of negative: 30968955
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 3872
[LightGBM] [Info] Number of data: 32000000, number of used features: 39
[LightGBM] [Info] Using GPU Device: GeForce GTX 1070, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 12
[LightGBM] [Info] 38 dense feature groups (1220.70 MB) transfered to GPU in 0.967707 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.032220 -> initscore=-3.402412
[LightGBM] [Info] Start training from score -3.402412
2522.5609214305878
roc auc train:0.8973754627179997 cv:0.7764168713748687
reading data....
startring benchmark arboretum
feature 0 has been reduced to 15 bits 
feature 1 has been reduced to 13 bits 
feature 2 has been reduced to 10 bits 
feature 3 has been reduced to 15 bits 
feature 4 has been reduced to 12 bits 
feature 5 has been reduced to 10 bits 
feature 6 has been reduced to 9 bits 
feature 7 has been reduced to 14 bits 
feature 8 has been reduced to 10 bits 
feature 9 has been reduced to 4 bits 
feature 10 has been reduced to 8 bits 
feature 11 has been reduced to 19 bits 
feature 12 has been reduced to 11 bits 
max feature size 19 
Total bytes 8513978368 available 2080964608 
Memory usage estimation 220 per record 7040000000 in total 
copied features data 13 from 13 
copied category features 1 from 26 


roc auc train:0.8108035778617945 cv:0.7864887547868098
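
The LightGBM lines above report `pavg=0.032220 -> initscore=-3.402412`. A minimal sketch of how that starting score relates to the class counts in the same log, assuming the initial score is the log-odds of the positive rate (the helper name `init_score` is hypothetical, not part of LightGBM):

```python
import math

def init_score(num_pos, num_neg):
    """Return (positive rate, log-odds of the positive rate).

    Assumption: the 'initscore' printed by LightGBM for the binary
    objective with boost_from_average is log(p / (1 - p)).
    """
    p = num_pos / (num_pos + num_neg)
    return p, math.log(p / (1.0 - p))

# Class counts copied from the 40M-run log above.
p, score = init_score(1031045, 30968955)
print(f"pavg={p:.6f} initscore={score:.4f}")
```

The same calculation with the 10M-run counts (247552 positives, 7752448 negatives) gives a positive rate of about 0.0309, which matches the `pavg=0.030944` line in that run's log.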

10M dataset - 5000 trees

reading data....
startring benchmark xgboost
662.0523405075073
roc auc train:0.965875942403954 cv:0.756494298707683
reading data....
startring benchmark lightgbm
[LightGBM] [Warning] Starting from the 2.1.2 version, default value for the "boost_from_average" parameter in "binary" objective is true.
This may cause significantly different results comparing to the previous versions of LightGBM.
Try to set boost_from_average=false, if your old models produce bad results
[LightGBM] [Info] Number of positive: 247552, number of negative: 7752448
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 3844
[LightGBM] [Info] Number of data: 8000000, number of used features: 39
[LightGBM] [Info] Using GPU Device: GeForce GTX 1070, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 12
[LightGBM] [Info] 38 dense feature groups (305.18 MB) transfered to GPU in 0.311981 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.030944 -> initscore=-3.444143
[LightGBM] [Info] Start training from score -3.444143
805.6181969642639
roc auc train:0.9904928384666543 cv:0.7458449429346047
reading data....
startring benchmark arboretum
feature 0 has been reduced to 14 bits 
feature 1 has been reduced to 13 bits 
feature 2 has been reduced to 9 bits 
feature 3 has been reduced to 14 bits 
feature 4 has been reduced to 12 bits 
feature 5 has been reduced to 9 bits 
feature 6 has been reduced to 9 bits 
feature 7 has been reduced to 13 bits 
feature 8 has been reduced to 9 bits 
feature 9 has been reduced to 4 bits 
feature 10 has been reduced to 8 bits 
feature 11 has been reduced to 19 bits 
feature 12 has been reduced to 10 bits 
max feature size 19 
Total bytes 8513978368 available 7124615168 
Memory usage estimation 180 per record 1440000000 in total 
copied features data 13 from 13 
copied category features 26 from 26 
roc auc train:0.8339605258327059 cv:0.7800383186995146
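
The arboretum "Memory usage estimation" lines in both runs can be checked by hand. A small sketch, assuming the total is bytes per record times the number of training rows, and that the training row counts equal the "Number of data" values LightGBM reports for the same datasets (32000000 and 8000000); the helper name `estimated_bytes` is hypothetical:

```python
def estimated_bytes(bytes_per_record, num_records):
    """Total memory estimate: per-record cost times the record count."""
    return bytes_per_record * num_records

# Figures copied from the arboretum logs; row counts are an assumption
# taken from LightGBM's "Number of data" lines for the same runs.
runs = {
    "40M": {"per_record": 220, "rows": 32_000_000, "available": 2_080_964_608},
    "10M": {"per_record": 180, "rows": 8_000_000, "available": 7_124_615_168},
}

for name, run in runs.items():
    total = estimated_bytes(run["per_record"], run["rows"])
    fits = total <= run["available"]
    print(f"{name}: estimate={total} available={run['available']} fits={fits}")
```

Under these assumptions the estimate exceeds the reported available bytes in the 40M run and fits within them in the 10M run. The 40M log also shows only 1 of 26 category features copied, while the 10M log shows 26 of 26; the logs do not state whether the two observations are related.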
