risingwavelabs / ibis

This project forked from ibis-project/ibis

The flexibility of Python with the scale and performance of modern SQL.

Home Page: https://ibis-project.org

License: Apache License 2.0

Shell 0.07% JavaScript 0.12% C++ 0.95% Python 98.39% R 0.04% Nix 0.25% CMake 0.04% Dockerfile 0.01% Visual Basic 6.0 0.04% Just 0.11%

Ibis

What is Ibis?

Ibis is the portable Python dataframe library.

See the documentation on "Why Ibis?" to learn more.

Getting started

You can pip install Ibis with a backend and example data:

pip install 'ibis-framework[duckdb,examples]'

Tip

See the installation guide for more installation options.

Then use Ibis:

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.examples.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │           NULL │          NULL │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
>>> g = t.group_by(["species", "island"]).agg(count=t.count()).order_by("count")
>>> g
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Dream     │    56 │
│ Chinstrap │ Dream     │    68 │
│ Gentoo    │ Biscoe    │   124 │
└───────────┴───────────┴───────┘

Tip

See the getting started tutorial for a full introduction to Ibis.

Python + SQL: better together

For most backends, Ibis works by compiling its dataframe expressions into SQL:

>>> ibis.to_sql(g)
SELECT
  "t1"."species",
  "t1"."island",
  "t1"."count"
FROM (
  SELECT
    "t0"."species",
    "t0"."island",
    COUNT(*) AS "count"
  FROM "penguins" AS "t0"
  GROUP BY
    1,
    2
) AS "t1"
ORDER BY
  "t1"."count" ASC
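
To make the idea of compiling an expression into SQL more concrete, here is a toy sketch that uses only the Python standard library. It is not how Ibis is implemented: the `Table`, `GroupCount`, and `OrderBy` classes and the `compile_sql` function are hypothetical names invented for this illustration, and SQLite stands in for a real backend. Each operation wraps the previous one, so the compiler emits nested subqueries, much like the `t0`/`t1` structure above.

```python
import sqlite3
from dataclasses import dataclass


# Hypothetical expression nodes (not Ibis classes): each wraps its input.
@dataclass(frozen=True)
class Table:
    name: str


@dataclass(frozen=True)
class GroupCount:
    source: object
    keys: tuple


@dataclass(frozen=True)
class OrderBy:
    source: object
    key: str


def compile_sql(node) -> str:
    """Recursively turn an expression tree into a SQL string."""
    if isinstance(node, Table):
        return f'SELECT * FROM "{node.name}"'
    if isinstance(node, GroupCount):
        cols = ", ".join(f'"{k}"' for k in node.keys)
        inner = compile_sql(node.source)
        return f'SELECT {cols}, COUNT(*) AS "count" FROM ({inner}) AS t GROUP BY {cols}'
    if isinstance(node, OrderBy):
        inner = compile_sql(node.source)
        return f'SELECT * FROM ({inner}) AS t ORDER BY "{node.key}" ASC'
    raise TypeError(f"unsupported node: {node!r}")


# Build the expression first; nothing runs until it is compiled and executed.
expr = OrderBy(GroupCount(Table("penguins"), ("species", "island")), "count")
sql = compile_sql(expr)

# Execute the generated SQL on an in-memory SQLite database with made-up rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE penguins (species TEXT, island TEXT)")
con.executemany(
    "INSERT INTO penguins VALUES (?, ?)",
    [
        ("Adelie", "Torgersen"),
        ("Adelie", "Torgersen"),
        ("Adelie", "Biscoe"),
        ("Gentoo", "Biscoe"),
        ("Gentoo", "Biscoe"),
        ("Gentoo", "Biscoe"),
    ],
)
result = con.execute(sql).fetchall()
print(sql)
print(result)
```

The point of the sketch is the separation of steps: the expression is plain data, the SQL text is derived from it, and only the last step touches a database.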

You can mix SQL and Python code:

>>> a = t.sql("SELECT species, island, count(*) AS count FROM penguins GROUP BY 1, 2")
>>> a
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Dream     │    56 │
│ Gentoo    │ Biscoe    │   124 │
│ Chinstrap │ Dream     │    68 │
└───────────┴───────────┴───────┘
>>> b = a.order_by("count")
>>> b
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species   ┃ island    ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string    │ string    │ int64 │
├───────────┼───────────┼───────┤
│ Adelie    │ Biscoe    │    44 │
│ Adelie    │ Torgersen │    52 │
│ Adelie    │ Dream     │    56 │
│ Chinstrap │ Dream     │    68 │
│ Gentoo    │ Biscoe    │   124 │
└───────────┴───────────┴───────┘

This allows you to combine the flexibility of Python with the scale and performance of modern SQL.

Backends

Ibis supports 20+ backends.

How it works

Most Python dataframes are tightly coupled to their execution engine. And many databases only support SQL, with no Python API. Ibis solves this problem by providing a common API for data manipulation in Python, and compiling that API into the backend’s native language. This means you can learn a single API and use it across any supported backend (execution engine).

Ibis supports three types of backend:

  1. SQL-generating backends
  2. Expression-generating backends
  3. Naïve execution backends

Figure: Ibis backend types
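
As a rough illustration of why one API can sit on top of different kinds of backend, the sketch below runs the same query description through two toy executors: one that generates SQL (run here on SQLite) and one that executes naively in pure Python. This is an assumption-laden teaching example, not Ibis's internals; `QUERY`, `run_sql`, and `run_naive` are hypothetical names, and the rows are made up.

```python
import sqlite3
from collections import Counter

# A backend-independent description of the work: count rows per group.
QUERY = {"table": "penguins", "group_by": ("species",)}

# Made-up sample rows shared by both toy backends.
ROWS = [
    {"species": "Adelie", "island": "Torgersen"},
    {"species": "Adelie", "island": "Biscoe"},
    {"species": "Gentoo", "island": "Biscoe"},
    {"species": "Gentoo", "island": "Biscoe"},
    {"species": "Gentoo", "island": "Biscoe"},
    {"species": "Chinstrap", "island": "Dream"},
]


def run_sql(con, query):
    """SQL-generating style: translate the query to SQL and let the engine run it."""
    cols = ", ".join(f'"{c}"' for c in query["group_by"])
    table = query["table"]
    sql = f'SELECT {cols}, COUNT(*) FROM "{table}" GROUP BY {cols}'
    return sorted(con.execute(sql).fetchall())


def run_naive(tables, query):
    """Naive-execution style: walk the rows directly in Python."""
    rows = tables[query["table"]]
    counts = Counter(tuple(row[c] for c in query["group_by"]) for row in rows)
    return sorted(key + (n,) for key, n in counts.items())


# Load the same data into each toy backend.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE penguins (species TEXT, island TEXT)")
con.executemany(
    "INSERT INTO penguins VALUES (?, ?)",
    [(r["species"], r["island"]) for r in ROWS],
)

sql_result = run_sql(con, QUERY)
naive_result = run_naive({"penguins": ROWS}, QUERY)
print(sql_result)
print(naive_result)
```

Both executors return the same grouped counts, which is the property that lets user code stay unchanged while the engine underneath is swapped.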

Portability

To use different backends, you can set the backend Ibis uses:

>>> ibis.set_backend("duckdb")
>>> ibis.set_backend("polars")
>>> ibis.set_backend("datafusion")

Typically, you'll create a connection object:

>>> con = ibis.duckdb.connect()
>>> con = ibis.polars.connect()
>>> con = ibis.datafusion.connect()

And work with tables in that backend:

>>> con.list_tables()
['penguins']
>>> t = con.table("penguins")

You can also read from common file formats like CSV or Apache Parquet:

>>> t = con.read_csv("penguins.csv")
>>> t = con.read_parquet("penguins.parquet")

This allows you to iterate locally and deploy remotely by changing a single line of code.

Tip

Check out the blog on backend agnostic arrays for one example using the same code across DuckDB and BigQuery.

Community and contributing

Ibis is an open source project and welcomes contributions from anyone in the community.

Join our community by interacting on GitHub or chatting with us on Zulip.

For more information visit https://ibis-project.org/.
