Kalamari

Kalamari is a convenience wrapper for Python's built-in json module that makes extracting data from JSON a lot easier. Instead of typing out absolute paths to your desired keys, you can just pass the keys as parameters to kalamari methods.

Installation

This package is not available through PyPI yet, so you have to install it manually. Clone the repository and run the following command from its root directory:

python3 setup.py sdist

This creates a sub-directory called dist containing a compressed archive file. On POSIX systems the archive is a tar.gz file; older Python versions defaulted to .zip on Windows, while Python 3.6 and later produce tar.gz on every platform.

After creating this archive file, you can install the package from inside the dist directory via pip3:

sudo pip3 install kalamari-0.1dev.tar.gz

Usage

Initialization

All major extraction methods are available through instances of the smartJSON class.

>>> from kalamari import smartJSON
>>> import requests
>>> r = requests.get('https://yourapi.io/someendpoint') # returns some JSON
>>> data = smartJSON(r.content)

Fetching data

Say that the above GET request returns the following JSON, and you'd like to extract the highest view count of any video:

{
  "videos": {
    "0": {
      "title": "Pytest tutorial (1/5)",
      "url": "https://myvid.com/454F5gK9700e",
      "author": "pythonguy226",
      "email": "[email protected]",
      "total_views": "4561452"
    },
    "1": {
      "title": "JavaScript async await",
      "url": "https://myvid.com/784F5gF9800e",
      "author": "jsguy995",
      "email": "[email protected]",
      "total_views": "784569"
    }
  }
}

With kalamari, this takes only a few lines of code:

>>> _views = data.get_attrs("total_views")
>>> views = max(map(int, _views["total_views"]))
>>> views
4561452

Pretty cool, right? You can also fetch more than one attribute at a time.

>>> views_w_authors = data.get_attrs("author", "total_views")
>>> views_w_authors
{'author': ['pythonguy226', 'jsguy995'], 'total_views': ['4561452', '784569']}
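For comparison, here is a rough sketch of the same lookup using only the built-in json module, where the path to each value has to be written out by hand. The payload is a trimmed copy of the example above (the e-mail and url fields are left out for brevity), and the variable names are illustrative only.

```python
import json

# Trimmed copy of the example payload (url and email fields omitted).
raw = """
{
  "videos": {
    "0": {"title": "Pytest tutorial (1/5)", "author": "pythonguy226", "total_views": "4561452"},
    "1": {"title": "JavaScript async await", "author": "jsguy995", "total_views": "784569"}
  }
}
"""

payload = json.loads(raw)

# With plain json, every level of nesting must be named explicitly:
# payload -> "videos" -> each numbered entry -> "total_views".
view_counts = [int(video["total_views"]) for video in payload["videos"].values()]
max_views = max(view_counts)
print(max_views)  # 4561452
```

If the nesting of the response changes, the hard-coded path above has to change with it, which is the inconvenience kalamari is meant to remove.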

Extraction methods

  • get_attrs()
  • This is the simplest method. It accepts the names of all the attributes that you wish to extract and returns a dict. It is the method used in the example above.
  • get_attrs_by()
  • This method accepts a boolean function and the names of attributes. It applies the function to every Node object in the tree and returns only the values of nodes that satisfy the condition. The boolean function should accept two arguments, depth (int) and node (Node). Say you want to extract the resident IDs and house IDs separately from the following JSON:
{
 "houses": {
   "0": {
     "id": "451478",
     "location": "Plattsburgh, NY",
     "zip": "12901",
     "owner": "John Doe",
     "residents": {
       "0": {
         "id": "7004",
         "name": "Alan Turing",
         "occupation": "Computer Scientist"
       },
       "1": {
         "id": "6004",
         "name": "Grace Hopper",
         "occupation": "Software Engineer"
       }
     }
   },
   "1": {
     "id": "451648",
     "location": "Albany, NY",
     "zip": "12901",
     "owner": "Alex Turner",
     "residents": {
       "0": {
         "id": "6549",
         "name": "Liam Gallagher",
         "occupation": "Musician"
       },
       "1": {
         "id": "5470",
         "name": "Noel Gallagher",
         "occupation": "Musician"
       }
     }
   }
 }
}

You can pass a boolean function to get_attrs_by to achieve this:

>>> house_ids = data.get_attrs_by(lambda depth, node: node.get_parent().get_parent().data == "houses", "id")
>>> resident_ids = data.get_attrs_by(lambda depth, node: node.get_parent().get_parent().data == "residents", "id")
>>> house_ids
{'id': ['451478', '451648']}
>>> resident_ids
{'id': ['7004', '6004', '6549', '5470']}
  • get_attrs_by_key()
  • This method accepts a regular expression and returns all attributes whose names match that regular expression.
  • get_attrs_by_value()
  • This method accepts a regular expression and returns all attributes whose values match that regular expression. For example, with the videos JSON from the Fetching data section:
>>> attrs_w_numbers = data.get_attrs_by_value("[0-9]")
>>> attrs_w_numbers
{'title': ['Pytest tutorial (1/5)'], 'url': ['https://myvid.com/454F5gK9700e', 'https://myvid.com/784F5gF9800e'], 'author': ['pythonguy226', 'jsguy995'], 'email': ['[email protected]', '[email protected]'], 'total_views': ['4561452', '784569']}
  • get_attrs_by_parent()
  • This method accepts a regular expression and returns all attributes whose immediate parent matches that regular expression.
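To make the filtering rules above concrete, here is a minimal standalone sketch in plain Python. It does not use kalamari and is not its implementation: collect() is a hypothetical helper, its predicate takes (depth, key, value, parent) rather than kalamari's (depth, node), and it assumes every leaf value is a string. It only illustrates what filtering by key, by value, by immediate parent, and by depth means on a cut-down version of the houses JSON.

```python
import re


def collect(obj, predicate, depth=0, parent=None, found=None):
    """Walk nested dicts and gather leaf values that satisfy the predicate.

    predicate(depth, key, value, parent) -> bool, where parent is the key
    of the dict that directly contains the leaf (None at the top level).
    """
    if found is None:
        found = {}
    for key, value in obj.items():
        if isinstance(value, dict):
            collect(value, predicate, depth + 1, key, found)
        elif predicate(depth, key, value, parent):
            found.setdefault(key, []).append(value)
    return found


# Cut-down version of the houses example: one house, one resident.
houses = {
    "houses": {
        "0": {
            "id": "451478",
            "zip": "12901",
            "residents": {"0": {"id": "7004", "name": "Alan Turing"}},
        }
    }
}

# Key matches a regular expression.
by_key = collect(houses, lambda d, k, v, p: re.search(r"^id$", k) is not None)
# Value matches a regular expression (here: contains a letter).
by_value = collect(houses, lambda d, k, v, p: re.search(r"[A-Za-z]", v) is not None)
# Immediate parent key matches a regular expression.
by_parent = collect(houses, lambda d, k, v, p: p == "0" and k != "id")
# Depth separates house ids (depth 2) from resident ids (depth 4).
house_ids = collect(houses, lambda d, k, v, p: k == "id" and d == 2)
resident_ids = collect(houses, lambda d, k, v, p: k == "id" and d == 4)

print(by_key)        # {'id': ['451478', '7004']}
print(by_value)      # {'name': ['Alan Turing']}
print(by_parent)     # {'zip': ['12901'], 'name': ['Alan Turing']}
print(house_ids)     # {'id': ['451478']}
print(resident_ids)  # {'id': ['7004']}
```

As in the kalamari examples, each result maps an attribute name to the list of values found for it, in the order they were encountered.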

TODO

  • Add method chaining
    • All methods should return a smartJSON object.
    • A smartJSON object should be convertible to a dict
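One possible shape for this, sketched under assumptions: ChainableResult below is a hypothetical stand-in, not the smartJSON class, and the only() and map_values() methods are invented purely for illustration. The point is that each method returns a new object of the same type (so calls can be chained), and that implementing the Mapping protocol is enough for dict() to convert it.

```python
from collections.abc import Mapping


class ChainableResult(Mapping):
    """Hypothetical chainable result object; not part of kalamari."""

    def __init__(self, data):
        self._data = dict(data)

    # Mapping protocol: these three methods are enough for dict(obj) to work.
    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def only(self, *keys):
        """Return a new ChainableResult restricted to the given keys."""
        return ChainableResult({k: v for k, v in self._data.items() if k in keys})

    def map_values(self, func):
        """Return a new ChainableResult with func applied to every list element."""
        return ChainableResult({k: [func(x) for x in v] for k, v in self._data.items()})


result = ChainableResult({
    "author": ["pythonguy226", "jsguy995"],
    "total_views": ["4561452", "784569"],
})

# Each call returns another ChainableResult, so calls can be chained,
# and dict() converts the final object back to a plain dictionary.
views = dict(result.only("total_views").map_values(int))
print(views)  # {'total_views': [4561452, 784569]}
```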


kalamari's Issues

Finish head() and reveal() methods

The Tree class has two methods, head() and reveal() which are supposed to help visualize the underlying JSON structure. head() should only print the first few levels, depending on how deep the tree is, and reveal() should print the entire tree structure. You don't have to implement the search yourself because the __iter__ method already lets you traverse the tree level by level. Also, feel free to suggest better names for these methods!
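As a starting point for discussion, here is a rough sketch of how the two methods could behave. Everything in it is an assumption: the Node class is a simplified stand-in for kalamari's own, levels() imitates a level-by-level iterator instead of using the real __iter__, and the max_depth parameter is just one way to decide how much head() shows.

```python
from collections import deque


class Node:
    """Hypothetical tree node; kalamari's real Node class may differ."""

    def __init__(self, data, children=None):
        self.data = data
        self.children = children or []


def levels(root):
    """Yield (depth, node) pairs in breadth-first (level-by-level) order."""
    queue = deque([(0, root)])
    while queue:
        depth, node = queue.popleft()
        yield depth, node
        queue.extend((depth + 1, child) for child in node.children)


def reveal(root, max_depth=None):
    """Return one indented line per node; max_depth=None means the whole tree."""
    lines = []
    for depth, node in levels(root):
        if max_depth is not None and depth > max_depth:
            break  # breadth-first order: every later node is at least this deep
        lines.append("  " * depth + str(node.data))
    return lines


def head(root, max_depth=1):
    """Return only the first few levels of the tree."""
    return reveal(root, max_depth=max_depth)


tree = Node("videos", [
    Node("0", [Node("title"), Node("total_views")]),
    Node("1", [Node("title"), Node("total_views")]),
])

print("\n".join(head(tree)))
# videos
#   0
#   1
```

Because the traversal is level by level, reveal() lists all nodes of one depth before moving to the next, so it shows the depth of each node but does not nest children under their parent; a depth-first walk would be needed for that.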
