Coder Social home page Coder Social logo

panda_post's Introduction

PandaPost

These extensions allow you to represent Python NumPy/Pandas objects in Postgres.

Warning
This is very much a work in progress. Expect the APIs and what is in each extension to change!

Currently you can think of PandaPost more as an experiment than anything else. It seems like it would be useful to be able to natively store ndarrays, DataFrames, etc in a Postgres column. Then again, maybe not. :)

If you can make use of any of this, please drop a line to our Google Group. We’d love any feedback!

Installation

Install the pgxn client, and then pgxn install panda_post --unstable. After that you can CREATE EXTENSION "PandaPost"; in your database. Currently it installs everything in your default schema; I suggest creating a panda schema and installing there. Note that it also requires plpythonu.

To wrap all that together:

pgxn install panda_post --unstable
psql <database name>
CREATE EXTENSION plpythonu;
CREATE SCHEMA panda;
CREATE EXTENSION panda_post WITH SCHEMA panda;

Current Status

There are now casts between ndarray and all the plpython supported types (boolean, int*, float, real, numeric and text). This allows things like SELECT panda.str(array[1,1,2,2]::panda.ndarray);.

Table 1. Table General Functions
Function Description

T

Apply ndarray T() function to input

eval

Returns the results of running python’s eval() on the input, as an ndarray. IE: panda.repr(panda.eval('range(4)'))

repr

Returns the python repr() of the input.

str

Returns the python str() of the input.

Table 2. Table Set Functions
Function Description

ediff1d

Returns difference between elements in array

in1d

Returns boolean array of whether each element in i is present in ar2. See numpy.in1d.

intersect1d

Return unique list of values that are present in both i and ar2, similar to INTERSECT in SQL. See numpy.intersect1d.

ndunique

Return unique elements in ar, plus other potential output. See numpy.unique.

ndunique1

Same as ndunique() but only return the array of unique elements.

setdiff1d

Return the difference between i and ar2, similar to EXCEPT in SQL. See numpy.setdiff1d.

setxor1d

Return the values that are in only one (but not both) i and ar2. See numpy.setxor1d.

union1d

Return the unique values that are in either i or ar2. See numpy.union1d.

Table 3. Table Special Functions
Function Description

create_cast

Creates casts to and from a Postgres data type and ndarray

Next up: porting the more common ndarray functions:

np.ndarray.byteswap      np.ndarray.cumsum        np.ndarray.flat          np.ndarray.min           np.ndarray.ravel         np.ndarray.shape         np.ndarray.tobytes
np.ndarray.all           np.ndarray.choose        np.ndarray.data          np.ndarray.flatten       np.ndarray.nbytes        np.ndarray.real          np.ndarray.size          np.ndarray.tofile
np.ndarray.any           np.ndarray.clip          np.ndarray.diagonal      np.ndarray.getfield      np.ndarray.ndim          np.ndarray.repeat        np.ndarray.sort          np.ndarray.tolist
np.ndarray.argmax        np.ndarray.compress      np.ndarray.dot           np.ndarray.imag          np.ndarray.newbyteorder  np.ndarray.reshape       np.ndarray.squeeze       np.ndarray.tostring
np.ndarray.argmin        np.ndarray.conj          np.ndarray.dtype         np.ndarray.item          np.ndarray.nonzero       np.ndarray.resize        np.ndarray.std           np.ndarray.trace
np.ndarray.argpartition  np.ndarray.conjugate     np.ndarray.dump          np.ndarray.itemset       np.ndarray.partition     np.ndarray.round         np.ndarray.strides       np.ndarray.transpose
np.ndarray.argsort       np.ndarray.copy          np.ndarray.dumps         np.ndarray.itemsize      np.ndarray.prod          np.ndarray.searchsorted  np.ndarray.sum           np.ndarray.var
np.ndarray.astype        np.ndarray.ctypes        np.ndarray.fill          np.ndarray.max           np.ndarray.ptp           np.ndarray.setfield      np.ndarray.swapaxes      np.ndarray.view
np.ndarray.base          np.ndarray.cumprod       np.ndarray.flags         np.ndarray.mean          np.ndarray.put           np.ndarray.setflags      np.ndarray.take

Examples

Basic examples
CREATE EXTENSION IF NOT EXISTS plpythonu;
CREATE EXTENSION panda_post;

CREATE TEMP TABLE s AS SELECT array['a','b','c']::ndarray AS s1, array['c','d']::ndarray AS s2;

-- python repr() of an ndarray of strings
SELECT repr(s1) FROM s;
          repr
-------------------------
 array(['a', 'b', 'c'], +
       dtype='|S1')
(1 row)

-- python str() of same array
SELECT str(s1) FROM s;
      str
---------------
 ['a' 'b' 'c']
(1 row)

-- exclusive-or of two ndarrays
SELECT str(setxor1d(s1, s2)) FROM s;
      str
---------------
 ['a' 'b' 'd']
(1 row)

-- Intersection
SELECT str(intersect1d(s1, s2)) FROM s;
  str
-------
 ['c']
(1 row)

This more complicated example uses the lambad extension to create a dataframe, which can currently be returned as an ndarray.

Note
Eventually there will be an actual DataFrame Postgres data type
Basic ndarray type storing a subclass (in this case, a Pandas DataFrame)
CREATE EXTENSION IF NOT EXISTS lambda;
\set df pd.DataFrame.from_dict([{"a":1,"b":"a"},{"a":2,"b":"b"}])
SELECT repr(
    lambda(
      $l$(
        ndarray
      ) RETURNS ndarray
      LANGUAGE plpythonu
      TRANSFORM FOR TYPE ndarray
      AS $body$
        import pandas as pd

        return $l$ || :'df' || $l$
      $body$
      $l$
      , NULL::ndarray
    )
  );
  repr
---------
    a  b+
 0  1  a+
 1  2  b
(1 row)

PandaPost is released under a BSD liscense.

Copyright (c) 2016 Jim Nasby <[email protected]>.

panda_post's People

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Forkers

decibel

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.