
twtrubiks / google-play-store-spider-bs4-excel


A Google Play Store spider that uses Beautiful Soup in Python and exports to Excel

License: MIT License

Python 100.00%
python beautifulsoup pyexcel sql-database xlsx crawler google-play-store

google-play-store-spider-bs4-excel's Introduction

Google-Play-Store-spider-bs4-excel

Scrape Google Play Store data using Beautiful Soup in Python 📝

The data is stored in a SQLite database and can also be exported to Excel.

Features

Installing packages

After making sure Python is installed on your computer,

clone my simple example:

git clone https://github.com/twtrubiks/Google-Play-Store-spider-bs4-excel.git

Then enter the following command in cmd (Command Prompt):

pip install -r requirements.txt

Usage and run output

Scrape the first 100 entries of Google Play Store data (top charts).

python app.py

Run output


When the run finishes, the data is stored in app.db, which you can inspect with SQLiteBrowser.


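The Item column takes 6 values in total:

  • Android apps: top free downloads
  • Android apps: top paid downloads
  • Android apps: top grossing
  • Games: top free downloads
  • Games: top paid downloads
  • Games: top grossing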

Each category holds 100 entries, so every run of app.py yields 600 entries (unless something is wrong with the data).
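To check that a run really produced 100 entries per category, the rows in app.db can be counted per Item value. The following is a minimal sketch using only the standard sqlite3 module. The README does not state the table name, so the helper (count_by_item, a hypothetical name) looks up every table that has an Item column instead of assuming one.

```python
import sqlite3


def count_by_item(db_path="app.db"):
    """Return {table_name: {item_value: row_count}} for tables with an Item column."""
    conn = sqlite3.connect(db_path)
    try:
        # Discover the tables instead of assuming a name the README does not give.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        result = {}
        for table in tables:
            # PRAGMA table_info yields one row per column; index 1 is the column name.
            columns = [row[1].lower() for row in conn.execute(
                'PRAGMA table_info("{}")'.format(table))]
            if "item" not in columns:
                continue
            rows = conn.execute(
                'SELECT "Item", COUNT(*) FROM "{}" GROUP BY "Item"'.format(table))
            result[table] = dict(rows.fetchall())
        return result
    finally:
        conn.close()
```

Calling count_by_item("app.db") after one run of app.py should show six Item values with about 100 rows each, if the data came back without problems.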

If you need to save the data as an Excel file,

you can then run

python SQL_Database_To_Excel.py

When it finishes, a new file named Excel-data.xlsx is created.
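If only a quick look in a spreadsheet is needed, a table can also be dumped with the standard library alone. The sketch below is not what SQL_Database_To_Excel.py does: it writes CSV rather than .xlsx, and the function name and the table argument are assumptions, since the README does not show the schema.

```python
import csv
import sqlite3


def export_table_to_csv(db_path, table, csv_path):
    """Write every row of `table` to `csv_path` and return the number of data rows."""
    conn = sqlite3.connect(db_path)
    try:
        # `table` is treated as trusted input; it is only quoted, not validated.
        cursor = conn.execute('SELECT * FROM "{}"'.format(table))
        header = [col[0] for col in cursor.description]
        # utf-8-sig adds a BOM so Excel detects UTF-8 and shows Chinese text correctly.
        with open(csv_path, "w", newline="", encoding="utf-8-sig") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            count = 0
            for row in cursor:
                writer.writerow(row)
                count += 1
        return count
    finally:
        conn.close()
```

The resulting file opens directly in Excel; use the project's own script when a real .xlsx workbook is required.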


update 2017/2/27

python app_category.py

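Scrapes the first 600 entries of Google Play Store topselling_new_free.

Note: a single POST can fetch at most 120 entries, and asking for more than 120 causes problems. My current guess is that this is a limit built into the API (not sure).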
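Given the 120-entry limit per POST, fetching 600 entries means splitting the work into several requests. A minimal sketch of that arithmetic follows. The function name and the (start, count) pair are hypothetical: the README does not show which request parameters app_category.py actually sends.

```python
def batch_ranges(total, max_per_request=120):
    """Split `total` entries into (start, count) pairs of at most max_per_request."""
    if total < 0 or max_per_request <= 0:
        raise ValueError("total must be >= 0 and max_per_request must be > 0")
    ranges = []
    start = 0
    while start < total:
        # The last batch may be smaller than the limit.
        count = min(max_per_request, total - start)
        ranges.append((start, count))
        start += count
    return ranges
```

For 600 entries this gives five requests of 120 each, starting at offsets 0, 120, 240, 360 and 480.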

Runtime environment

  • Python 3.4.3

Reference

License

MIT license


google-play-store-spider-bs4-excel's Issues

A problem when running

Hello,
I ran into the following problem when running it:

File "app.py", line 91
SyntaxError: Non-ASCII character '\xe5' in file app.py on line 91, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Turns out the file was just missing # -*- coding: UTF-8 -*- at the top XD
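For anyone hitting the same error: this message comes from Python 2, which treats source files as ASCII unless told otherwise (see PEP 263), while Python 3 reads source as UTF-8 by default. A minimal sketch of a file that both versions accept is below; the variable names are only for illustration.

```python
# -*- coding: UTF-8 -*-
# The declaration above must be on the first or second line of the file.

# Non-ASCII text in the source is now accepted by Python 2 as well.
CATEGORY = u"熱門免費下載"  # "top free downloads"
CHAR_COUNT = len(CATEGORY)
```

Running the script with Python 3, the version listed under the runtime environment, avoids the problem even without the declaration.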
