
jiwon4178 / dacon_jobcare


This project forked from yoonj98/dacon_jobcare


🥈 Dacon: JobCare Recommendation Algorithm Competition 🥈

Jupyter Notebook 100.00%

dacon_jobcare's Introduction

Dacon JobCare Recommendation Algorithm Competition

Team 훠궈 / 🥈 Excellence Award winner 🥈


  A project that builds a personalized content recommendation model, and proposes ways to put it to use,
  based on data from JobCare, the career management service built on job-posting and job-seeking big data and provided by the Korea Employment Information Service

📌 Overall workflow

1. Install CatBoost / Optuna

2. Load libraries

  • Check the development environment and library versions
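A minimal sketch of what the version check might look like, using only the standard library. The function name `environment_report` is hypothetical, and `importlib.metadata` needs Python 3.8 or later, which is newer than the 3.7.6 interpreter listed for this project.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("numpy", "pandas", "scikit-learn", "catboost", "optuna")):
    """Collect OS, architecture, Python, and package versions into a dict."""
    report = {
        "OS": platform.platform(),
        "Process Architecture": platform.machine(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key:<30}{value}")
```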

3. Load the data

4. Exploratory data analysis (EDA)

  • Check basic summary statistics
  • Check for missing values and class imbalance
  • Visualize the data
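The first two EDA checks can be sketched with pandas as follows. This is an illustration only: the helper name `summarize`, the label column name `target`, and the toy frame are assumptions, not taken from the project notebooks.

```python
import pandas as pd


def summarize(df, target_col="target"):
    """Return basic statistics, missing-value counts, and class balance."""
    stats = df.describe(include="all")   # basic statistics per column
    missing = df.isna().sum()            # number of missing values per column
    # Share of each label, to spot class imbalance.
    balance = df[target_col].value_counts(normalize=True)
    return stats, missing, balance


# Tiny made-up frame, for illustration only (not the competition data).
toy = pd.DataFrame({
    "person_rn": [1, 1, 2, 3],
    "contents_rn": [10, 11, 10, None],
    "target": [1, 0, 0, 0],
})
stats, missing, balance = summarize(toy)
print(missing)
print(balance)
```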

5. Data preprocessing

  • Label-encode Boolean variables
  • Create derived variables
    • Content open date-time variable → weekday and time-of-day variables (contents_open_wd, contents_open_hour, contents_weekday, contents_work_time)
    • Content-number frequency variable (contents_rn_cnt)
    • User-number frequency variable (person_rn_cnt)
    • Attribute D major-category match flags (d_1_l_match_yn, d_2_l_match_yn, d_3_l_match_yn)
    • Attribute D code match flags (d_1_s_match_yn, d_2_s_match_yn, d_3_s_match_yn)
  • Drop variables → remove variables that have only a single label, plus some of the variables used to create the derived variables:

id, person_prefer_f, person_prefer_g, person_rn, contents_rn, contents_open_dt, d_l_match_yn, d_m_match_yn, d_s_match_yn, h_m_match_yn, h_s_match_yn, person_prefer_d_1_l, person_prefer_d_2_l, person_prefer_d_3_l, contents_attribute_d_l
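The preprocessing steps above could be sketched roughly as below. Several details are assumptions rather than the team's actual definitions: `contents_weekday` is taken to mean Monday to Friday, `contents_work_time` to mean 09:00 to 17:59, frequencies are counted within the frame passed in, and the code-level columns `person_prefer_d_1` to `person_prefer_d_3` and `contents_attribute_d` are assumed names. The function names are hypothetical.

```python
import pandas as pd

# Columns listed above as removed after feature creation.
DROP_COLS = [
    "id", "person_prefer_f", "person_prefer_g", "person_rn", "contents_rn",
    "contents_open_dt", "d_l_match_yn", "d_m_match_yn", "d_s_match_yn",
    "h_m_match_yn", "h_s_match_yn", "person_prefer_d_1_l",
    "person_prefer_d_2_l", "person_prefer_d_3_l", "contents_attribute_d_l",
]


def add_derived_features(df):
    """Add time, frequency, and attribute-D match features (sketch)."""
    out = df.copy()

    # Label-encode Boolean columns as 0/1.
    for col in out.select_dtypes(include="bool").columns:
        out[col] = out[col].astype(int)

    # Time features from the content-open timestamp.
    opened = pd.to_datetime(out["contents_open_dt"])
    out["contents_open_wd"] = opened.dt.dayofweek   # 0 = Monday ... 6 = Sunday
    out["contents_open_hour"] = opened.dt.hour
    # Assumed definitions: Mon-Fri flag and 09:00-17:59 working-hours flag.
    out["contents_weekday"] = (out["contents_open_wd"] < 5).astype(int)
    out["contents_work_time"] = out["contents_open_hour"].between(9, 17).astype(int)

    # How often each content number and each user number appears.
    out["contents_rn_cnt"] = out["contents_rn"].map(out["contents_rn"].value_counts())
    out["person_rn_cnt"] = out["person_rn"].map(out["person_rn"].value_counts())

    # Match flags between the user's three preferences and the content's
    # attribute D, at major-category (_l) and code (_s) level.
    # The code-level column names are assumptions.
    for i in (1, 2, 3):
        out[f"d_{i}_l_match_yn"] = (
            out[f"person_prefer_d_{i}_l"] == out["contents_attribute_d_l"]
        ).astype(int)
        out[f"d_{i}_s_match_yn"] = (
            out[f"person_prefer_d_{i}"] == out["contents_attribute_d"]
        ).astype(int)
    return out


def drop_unused(df):
    """Drop the listed columns, ignoring any that are absent."""
    return df.drop(columns=DROP_COLS, errors="ignore")


# Two made-up rows, for illustration only.
toy = pd.DataFrame({
    "contents_open_dt": ["2020-01-06 10:30:00", "2020-01-11 22:00:00"],
    "person_rn": [1, 1],
    "contents_rn": [10, 11],
    "d_l_match_yn": [True, False],
    "person_prefer_d_1": [101, 101],
    "person_prefer_d_2": [202, 202],
    "person_prefer_d_3": [303, 303],
    "person_prefer_d_1_l": [1, 1],
    "person_prefer_d_2_l": [2, 2],
    "person_prefer_d_3_l": [3, 3],
    "contents_attribute_d": [101, 303],
    "contents_attribute_d_l": [1, 3],
})
features = add_derived_features(toy)
model_input = drop_unused(features)
```

Dropping happens in a separate step so that the columns needed to build the derived features are still present when `add_derived_features` runs.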

6. Modeling

  • Use CatBoost, because categorical variables make up a large share of the data
  • Search for the best hyperparameters with the Optuna library (maximize F1 score, 10 trials)
  • Run K-fold cross-validation (n_splits = 5)
  • Average the predicted probabilities across the CV folds to get the final predicted probability
  • Convert predicted probabilities to labels using threshold = 0.4
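The last three steps (5-fold training, averaging fold probabilities, thresholding at 0.4) can be sketched as below. To keep the example light, it uses scikit-learn's `LogisticRegression` on synthetic data as a stand-in for the tuned CatBoost model; the function names, the random seed, and the choice of `>=` at the threshold are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold


def kfold_average_proba(make_model, X, y, X_test, n_splits=5, seed=42):
    """Train one model per fold and average their test-set probabilities."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_probas = []
    for train_idx, _valid_idx in kf.split(X):
        model = make_model()                 # fresh model for every fold
        model.fit(X[train_idx], y[train_idx])
        fold_probas.append(model.predict_proba(X_test)[:, 1])  # P(label = 1)
    return np.mean(fold_probas, axis=0)


def to_labels(proba, threshold=0.4):
    """Convert probabilities to 0/1 labels at the given threshold."""
    return (np.asarray(proba) >= threshold).astype(int)


# Synthetic data and a stand-in classifier, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)
X_test = rng.normal(size=(20, 3))

proba = kfold_average_proba(lambda: LogisticRegression(), X, y, X_test)
labels = to_labels(proba)  # threshold = 0.4, as in the list above
```

Because `make_model` is just a factory, a function returning a `catboost.CatBoostClassifier` configured with the Optuna-selected parameters could be passed instead. A threshold below 0.5 labels more rows as positive, which tends to raise recall at some cost in precision.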

📌 Presentation

If you would like to know more about our project, please refer to the project presentation materials.

  • GoogleDrive Badge

📌 Structure

훠궈
├── README.md
├── Final_Code.ipynb
├── data
│    ├───train.csv
│    ├───test.csv
│    ├───result_submission.csv
│    ├───train_data.csv
│    └───test_data.csv
│
├── preprocess
│    ├───EDA.ipynb
│    └───preprocess.ipynb
│
└── model
     ├───hyper_parameter.ipynb
     ├───model.ipynb
     └─── model
           └───catboost_optuna_parameter.pkl

📌 Development environment and library versions

OS                            Linux-5.4.0-91-generic-x86_64-with-debian-buster-sid
Process information           x86_64
Process Architecture          x86_64
RAM                           252 GB

python                        3.7.6
numpy                         1.18.1
pandas                        1.0.1
scikit-learn                  0.22.1
catboost                      1.0.4
optuna                        2.10.0

📌 Contributors

이윤정 박지원 박지현 양지우

