richhorace / elastic-stack-pocket

Import your Pocket API Data into Elastic Stack

License: MIT License

Python 99.61% Shell 0.39%
docker docker-compose elastic elasticsearch logstash kibana elasticstack python pocket-api getpocket

elastic-stack-pocket's Introduction

Pocket Data with Elastic Stack & Docker

This repository retrieves your data from the Pocket API and prepares it for ingestion into the Elastic Stack (Elasticsearch, Logstash, Kibana), which runs on Elastic's official Docker images.

Instead of this

Wouldn't this be better?

Get the most out of your Pocket data!

  • Date Added
  • Unique URLs
  • Unique Given Domain
  • Unique Resolved Domain
  • Tag Cloud

As you can see, I've been a long-time user of Pocket, since before it was rebranded from Read It Later.

Tested Versions

This example has been tested with the following versions:

  • Python 3.8
  • Elasticsearch 7.8.0
  • Filebeat 7.8.0
  • Kibana 7.8.0
  • Docker 19.03.8
  • Docker Compose 1.25.5

Requirements Pocket App

This guide assumes that you have already created a Pocket App and have its credentials.

If you have an account but do not have an App, follow these instructions:

If you do not have a Pocket account, jump down to Launch Stack to Ingest Data.

Getting Started - Data Prep

  1. Update Credentials: Update config-example.py with your Pocket App credentials, then save it as config.py.

  2. Retrieve and Prep Pocket Data

    retrieve-prep-pocket-data.py retrieves data from the Pocket API and prepares it for ingestion. By default it goes one day back.

    • Removes images and videos
    • Creates list for tags and authors while removing item_id
    • Dumps JSON lines to log file ready for Logstash

    Usage:

    usage: retrieve-prep-pocket-data.py [-h] [-d DAYS_BACK]
    
    Pass number of days back to start from
    
    optional arguments:
      -h, --help            show this help message and exit
      -d DAYS_BACK, --days_back DAYS_BACK
                            Number of days back
    

    Example:

     ``` 
     cd scripts
     python retrieve-prep-pocket-data.py -d 10
     ```
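The prep steps listed above can be sketched in Python roughly as follows. This is a minimal illustration, not the code in retrieve-prep-pocket-data.py: the helper names (`since_timestamp`, `prep_item`, `to_json_lines`) are hypothetical, the raw item shape (`images`, `videos`, `tags`, and `authors` as nested objects that each carry an `item_id`) is an assumption about the Pocket API response, and reducing tags to a list of tag names is one possible reading of "creates list for tags".

```python
import json
import time


def since_timestamp(days_back, now=None):
    """Return the UNIX timestamp for `days_back` days before `now`."""
    now = time.time() if now is None else now
    return int(now - days_back * 86400)


def prep_item(item):
    """Return a cleaned copy of one Pocket item; the input is left untouched."""
    # Drop the bulky media sections (assumed key names).
    cleaned = {k: v for k, v in item.items() if k not in ("images", "videos")}

    # Assumed raw shape: {"elastic": {"item_id": "42", "tag": "elastic"}, ...}
    tags = cleaned.get("tags") or {}
    cleaned["tags"] = sorted(tags)  # keep only the tag names

    # Assumed raw shape: {"7": {"item_id": "42", "author_id": "7", ...}, ...}
    authors = cleaned.get("authors") or {}
    cleaned["authors"] = [
        {k: v for k, v in author.items() if k != "item_id"}
        for author in authors.values()
    ]
    return cleaned


def to_json_lines(items):
    """Serialize items as one JSON document per line."""
    return "\n".join(json.dumps(prep_item(i), sort_keys=True) for i in items)


# Made-up sample item, for illustration only.
raw = {
    "item_id": "42",
    "given_url": "https://example.com/post",
    "time_added": "1593561600",
    "images": {"1": {"item_id": "42", "src": "https://example.com/a.png"}},
    "tags": {"elastic": {"item_id": "42", "tag": "elastic"}},
    "authors": {"7": {"item_id": "42", "author_id": "7", "name": "Jane Doe"}},
}

if __name__ == "__main__":
    print(to_json_lines([raw]))
```

Writing one JSON document per line means the ingest side can treat each line as a single event, without any multi-line parsing.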
    

Launch Stack to Ingest Data

  1. Launch Containers and Test Connections

    The docker-compose-ingest.yml file launches Elasticsearch, Logstash, and Kibana from the official Elastic images.

    docker-compose -f docker-compose-ingest.yml up

  2. Ingest Data with Logstash

    • Logstash will ingest *.logs in ./data/logs
    • Creates the given_domain and resolved_domain fields from the URIs
    • Converts UNIX timestamps to ISO dates for time_added and time_updated
    • Outputs to the Pocket index in Elasticsearch, setting document_id to item_id
  3. Import Visualizations

    • Launch Kibana from browser: http://localhost:5601
    • From Side Navigation Bar
      • Select Stack Management
      • Under Kibana select Saved Objects
      • Select Import
      • Navigate to local repo then elastic-stack/config/kibana/Kibana780-pocket-dashboards.ndjson
      • Select Import
      • Select Confirm Changes
      • Select Done
    • From Side Navigation Bar
      • Select Dashboard
      • Select Pocket Overview
  4. Shutdown Stack: You can stop the stack without losing data. The ingested data persists until you remove the volume.

    docker-compose -f docker-compose-ingest.yml down
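To make the Logstash step under "Ingest Data with Logstash" more concrete, here is a Python illustration of the same transformations. It is not the Logstash pipeline shipped in this repository. The function names are hypothetical, the source fields are assumed to be `given_url` and `resolved_url`, and "domain" is taken to mean the URL's hostname, which may differ from what the actual pipeline extracts.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse


def domain_of(url):
    """Return the hostname part of a URL, or an empty string if there is none."""
    return urlparse(url).hostname or ""


def to_iso(unix_seconds):
    """Convert a UNIX timestamp (int or numeric string) to an ISO 8601 UTC date."""
    return datetime.fromtimestamp(int(unix_seconds), tz=timezone.utc).isoformat()


def transform(event):
    """Return (document_id, document) for one prepared Pocket item."""
    doc = dict(event)
    doc["given_domain"] = domain_of(event.get("given_url", ""))
    doc["resolved_domain"] = domain_of(event.get("resolved_url", ""))
    for field in ("time_added", "time_updated"):
        if field in doc:
            doc[field] = to_iso(doc[field])
    # Reusing item_id as the document id means re-ingesting the same item
    # overwrites the existing document instead of creating a duplicate.
    return doc["item_id"], doc


# Made-up sample event, for illustration only.
sample = {
    "item_id": "42",
    "given_url": "https://example.com/post?utm=x",
    "resolved_url": "https://www.example.com/post",
    "time_added": "1593561600",
}

if __name__ == "__main__":
    doc_id, doc = transform(sample)
    print(doc_id, doc["resolved_domain"], doc["time_added"])
    # prints: 42 www.example.com 2020-07-01T00:00:00+00:00
```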

Launch Stack to Review Data

You can start the Stack with only Elasticsearch and Kibana to view existing data.

Start: `docker-compose -f docker-compose.yml up`
Stop: `docker-compose -f docker-compose.yml down`
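
After starting either compose file, a small script like the one below can confirm that the services respond before you open the dashboards. This is only a sketch: `is_up` is a hypothetical helper, and while the Kibana URL comes from the steps above, the Elasticsearch URL assumes the compose file publishes the default port 9200 on localhost.

```python
import urllib.request


def is_up(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return True if `url` answers with a success status after redirects."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    services = {
        "Elasticsearch": "http://localhost:9200",  # assumed default port
        "Kibana": "http://localhost:5601",
    }
    for name, url in services.items():
        print(f"{name}: {'up' if is_up(url) else 'not reachable'} ({url})")
```

The `opener` parameter exists only so the check can be exercised without a running stack. In normal use the default `urllib.request.urlopen` is enough.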
