jjmontesl / cubetl

CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)

License: MIT License

Python 99.46% Makefile 0.50% Dockerfile 0.04%
etl etl-framework olap database sql sdmx csv python

cubetl's Issues

Unexpected error in examples/sql

I ran examples/sql/sql2olap.sh, then ran the cubes server, and got the error below. I don't know exactly which "Title" is the problem.

  1. To replicate:
# Apply https://github.com/jjmontesl/cubetl/pull/8 to open port 5005
docker-compose run --service-ports cubetls /bin/bash
pip install cubes click flask
cd examples/sql
sh sql2olap.sh
# Edit chinook.cubes-config.ini to use host 0.0.0.0 to listen for connections outside docker
slicer serve chinook.cubes-config.ini >& slicer.log &

From outside the docker container, connect to http://localhost:5005/cube/Invoice/facts.

  2. The error (from slicer.log):
2019-11-22 20:28:26,018 DEBUG Exception stack trace:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/cubes/sql/query.py", line 575, in column
    column = table.columns[mapping.column]
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/_collections.py", line 194, in __getitem__
    return self._data[key]
KeyError: 'Title'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/lib/python3.7/site-packages/cubes/server/decorators.py", line 118, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/cubes/server/decorators.py", line 167, in wrapper
    retval = f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/cubes/server/blueprint.py", line 414, in cube_facts
    page_size=g.page_size)
  File "/usr/local/lib/python3.7/site-packages/cubes/sql/browser.py", line 249, in facts
    include_fact_key=True)
  File "/usr/local/lib/python3.7/site-packages/cubes/sql/browser.py", line 486, in denormalized_statement
    context = self._create_context(context_attributes)
  File "/usr/local/lib/python3.7/site-packages/cubes/sql/browser.py", line 472, in _create_context
    safe_labels=self.safe_labels)
  File "/usr/local/lib/python3.7/site-packages/cubes/sql/query.py", line 883, in __init__
    bases = {attr:self.star_schema.column(attr) for attr in base_names}
  File "/usr/local/lib/python3.7/site-packages/cubes/sql/query.py", line 883, in <dictcomp>
    bases = {attr:self.star_schema.column(attr) for attr in base_names}
  File "/usr/local/lib/python3.7/site-packages/cubes/sql/query.py", line 579, in column
    % (mapping.column, mapping.table, avail))
cubes.sql.query.SchemaError: Unknown column 'Title' in table 'Invoice' possible: Invoice.InvoiceId, Invoice.CustomerId, Invoice.InvoiceDate, Invoice.BillingAddress, Invoice.BillingCity, Invoice.BillingState, Invoice.BillingCountry, Invoice.BillingPostalCode, Invoice.Total

Recent Installation of cubetl raises error with SQLAlchemy

Reporting Error and Fix

Following the installation procedure provided here (on Raspberry Pi OS), running

cubetl -h

raises an error.

ERROR (last lines of traceback)
  File "/home/<user>/cubetl/cubetl/sql/sql.py", line 33, in <module>
    from sqlalchemy.types import Integer, String, Float, Boolean, Unicode, Date, Time, DateTime, Binary
ImportError: cannot import name 'Binary' from 'sqlalchemy.types' (/home/<user>/cubetl/env/lib/python3.7/site-packages/sqlalchemy/types.py)

FIX
Edit cubetl/sql/sql.py and change the imported names to match the case used in SQLAlchemy's types.py:

from sqlalchemy.types import Integer, String, Float, Boolean, Unicode, Date, Time, DateTime, Binary

to

from sqlalchemy.types import INTEGER, String, FLOAT, BOOLEAN, Unicode, DATE, Time, DATETIME, BINARY

I hope this fix helps anyone else who encounters the error and lets them get to using this brilliant tool.

Phil
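
Only the name Binary fails in the traceback above. As far as I know, SQLAlchemy renamed that type to LargeBinary and dropped the old alias in 1.4, so an alternative that leaves the other imports unchanged is to try the new name first and fall back to the old one. A minimal sketch; import_first is a hypothetical helper (not part of cubetl or SQLAlchemy), and the sqlalchemy line is left as a comment so the snippet runs without SQLAlchemy installed:

```python
import importlib


def import_first(module_name, candidates):
    """Return the first attribute in `candidates` that `module_name` defines."""
    module = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        "none of %s found in %s" % (", ".join(candidates), module_name)
    )


# Assumed usage in cubetl/sql/sql.py (requires SQLAlchemy to be installed):
# Binary = import_first("sqlalchemy.types", ["LargeBinary", "Binary"])

# Stdlib demonstration of the same fallback: the first name does not exist,
# so the second one is returned.
getargs = import_first("inspect", ["no_such_function", "getfullargspec"])
print(getargs.__name__)
```

Binding the result to the old name Binary means the rest of the module would not need to change, assuming it only refers to the type by that name.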

cubetl/geoip uses incf.countryutils that doesn't support python 3

I am using python 3.7.4.

To reproduce:

$ docker-compose run cubetls python -c "import cubetl.geoip"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/app/cubetl/geoip/__init__.py", line 26, in <module>
    from incf.countryutils import transformations
  File "/usr/local/lib/python3.7/site-packages/incf/countryutils/transformations.py", line 151
    raise KeyError, code
                  ^
SyntaxError: invalid syntax

That package was last updated in 2009. A more recently updated version exists, and it does not appear to have the same error:

$ git clone https://github.com/wyldebeast-wunderliebe/incf.countryutils.git
$ cd incf.countryutils
$ virtualenv .venv
$ source .venv/bin/activate
$ python setup.py install
$ python -c "from incf.countryutils import transformations"
$ 
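
The SyntaxError comes from the Python 2 form of the raise statement (raise KeyError, code), which Python 3 rejects; the Python 3 spelling is raise KeyError(code). A small sketch that checks both forms with the built-in compile(), without executing them; parses is a hypothetical helper name:

```python
def parses(source):
    """Return True if `source` is valid syntax for the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


py2_style = "code = 'XX'\nraise KeyError, code\n"
py3_style = "code = 'XX'\nraise KeyError(code)\n"

# On a Python 3 interpreter only the second form compiles.
print(parses(py2_style))
print(parses(py3_style))
```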

cubetl not working after successful installation

After following every step in the installation instructions, I am not able to run the command "cubetl -h".
The following error is shown:

Traceback (most recent call last):
  File "/root/cubetl/env/bin/cubetl", line 11, in <module>
    load_entry_point('cubetl', 'console_scripts', 'cubetl')()
  File "/root/cubetl/env/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 542, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/root/cubetl/env/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2569, in load_entry_point
    return ep.load()
  File "/root/cubetl/env/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2229, in load
    return self.resolve()
  File "/root/cubetl/env/local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2235, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/root/cubetl/cubetl/core/bootstrap.py", line 23, in <module>
    import importlib.util
ImportError: No module named util
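
The paths in the traceback show a Python 2.7 virtualenv (env/local/lib/python2.7), and the import of importlib.util fails there. A minimal sketch of an early version guard that would turn this into a clearer message, assuming cubetl is meant to run on Python 3 only; require_python is a hypothetical helper, not part of cubetl:

```python
import sys


def require_python(minimum, version_info=None):
    """Raise RuntimeError if the interpreter is older than `minimum`.

    `version_info` defaults to the running interpreter; it is a parameter
    only so the check can be exercised with other versions.
    """
    current = tuple((version_info or sys.version_info)[:2])
    wanted = tuple(minimum)
    if current < wanted:
        raise RuntimeError(
            "Python %d.%d or newer is required, found %d.%d" % (wanted + current)
        )
    return current


# Passes on any Python 3 interpreter.
require_python((3, 0))

# Simulating the 2.7 environment from the report:
try:
    require_python((3, 0), (2, 7))
except RuntimeError as exc:
    print(exc)
```

The message uses % formatting rather than an f-string so that the guard itself still parses on the old interpreter it is meant to detect.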

Uses deprecated inspect.getargspec

  /app/cubetl/core/context.py:177: DeprecationWarning: inspect.getargspec() is deprecated since Python 3.0, use inspect.signature() or inspect.getfullargspec()
    spec = getargspec(value)

I'm using python 3.7.4.
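
The warning names two replacements. A minimal sketch of both, assuming the calling code only needs the positional argument names (what getargspec(value).args returned); positional_arg_names and sample are hypothetical names, and this is not the actual change made in cubetl/core/context.py:

```python
import inspect


def positional_arg_names(func):
    """Names of positional parameters, via inspect.signature()."""
    positional = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    return [
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind in positional
    ]


def sample(a, b=1, *args, **kwargs):
    return a, b


# getfullargspec() is the closest drop-in for getargspec(): same .args field.
print(inspect.getfullargspec(sample).args)
print(positional_arg_names(sample))
```

getfullargspec() also accepts functions with keyword-only arguments or annotations, which getargspec() rejected.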

Error in jinja2 __init__.py

I get this error when running cubetl -h:

Traceback (most recent call last):
  File "/var/www/html/cubetl/cub_env/bin/cubetl", line 9, in <module>
    load_entry_point('cubetl', 'console_scripts', 'cubetl')()
  File "/var/www/html/cubetl/cub_env/lib/python3.5/site-packages/pkg_resources/__init__.py", line 542, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/var/www/html/cubetl/cub_env/lib/python3.5/site-packages/pkg_resources/__init__.py", line 2569, in load_entry_point
    return ep.load()
  File "/var/www/html/cubetl/cub_env/lib/python3.5/site-packages/pkg_resources/__init__.py", line 2229, in load
    return self.resolve()
  File "/var/www/html/cubetl/cub_env/lib/python3.5/site-packages/pkg_resources/__init__.py", line 2235, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/var/www/html/cubetl/cubetl/core/bootstrap.py", line 36, in <module>
    from cubetl.util import config, log
  File "/var/www/html/cubetl/cubetl/util/config.py", line 27, in <module>
    from cubetl.template.jinja import JinjaTemplateRenderer
  File "/var/www/html/cubetl/cubetl/template/jinja.py", line 23, in <module>
    from jinja2 import Template
  File "", line 969, in _find_and_load
  File "", line 958, in _find_and_load_unlocked
  File "", line 664, in _load_unlocked
  File "", line 634, in _load_backward_compatible
  File "/var/www/html/cubetl/cub_env/lib/python3.5/site-packages/Jinja2-3.0.0a1-py3.5.egg/jinja2/__init__.py", line 8, in <module>
  File "", line 969, in _find_and_load
  File "", line 954, in _find_and_load_unlocked
  File "", line 896, in _find_spec
  File "", line 1139, in find_spec
  File "", line 1115, in _get_spec
  File "", line 1096, in _legacy_get_spec
  File "", line 444, in spec_from_loader
  File "", line 533, in spec_from_file_location
  File "/var/www/html/cubetl/cub_env/lib/python3.5/site-packages/Jinja2-3.0.0a1-py3.5.egg/jinja2/bccache.py", line 207
    dirname = f"_jinja2-cache-{os.getuid()}"
                                           ^
SyntaxError: invalid syntax

Are you looking for contributors?

We're looking for a project that we can build upon to ETL and visualize real estate data. It should also allow CRUD functionality for creating and updating dimensions, and for creating custom ranges with filters at the user level. If we started a proof of concept using cubetl, would we be able to get support for our questions?

Uses deprecated logger.warn (instead of logger.warning)

I'm using python 3.7.4.

This would be pretty easy for me to fix if you wished.

Example:

  /app/cubetl/sql/sql.py:372: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead

All the lines:

$ git grep 'logger.warn('
cubetl/core/components.py:            logger.warn("Finalized a non initialized component: %s" % comp)
cubetl/core/components.py:                logger.warn("Unfinalized component %s" % comp_desc.comp)
cubetl/fs/__init__.py:            logger.warn("Could not stat file: %s", path)
cubetl/http/__init__.py:                logger.warn("Could not retrieve HTTP document (attempt %d/%d): %s " % (attempt_count, self.attempts, e))
cubetl/olap/sql.py:            logger.warn("%s has multiple primary keys mapped: %s (ignoring)" % (self, pk_mappings))
cubetl/olap/sql.py:                            logger.warn("%s looked up an entity which exists with different attributes (field=%s, existing_value=%r, tried_value=%r) (reported only once per field)" % (self, mapping.sqlcolumn, v1, v2))
cubetl/olap/sqlschema.py:                        logger.warn("Ignoring foreign key reference to self: %s", dbcol.name)
cubetl/olap/sqlschema.py:                        logger.warn("Ignoring foreign key reference from %s.%s to not available entity: %s", dbcol.sqltable.name, dbcol.name, related_fact_name)
cubetl/olap/sqlschema.py:                logger.warn("Multiple primary key found in table %s (not supported, ignoring table)", sqltable.name)
cubetl/olap/sqlschema.py:                logger.warn("No primary key found in table %s (not supported, ignoring table)", sqltable.name)
cubetl/pcaxis/__init__.py:                logger.warn("PCAxisIterator could not parse value: %r (cell: %s)", value, container)
cubetl/sql/schemaimport.py:                        logger.warn("Skipped foreign key %s in table %s, as foreign key column (%s.%s) was not found.", dbcol.name, dbtable.name, list(dbcol.foreign_keys)[0].column.table.name, list(dbcol.foreign_keys)[0].column.name)
cubetl/sql/sql.py:                                logger.warn("%s updating an entity that exists with different attributes, overwriting (field=%s, existing_value=%s, tried_value=%s)" % (self, c.name, v1, v2))
cubetl/sql/sql.py:                        logger.warn("Unicode column %r received non-unicode string: %r " % (column.name, row[column.name]))
cubetl/text/__init__.py:                    logger.warn("Failed to match regular expresion %s on value: %s", self.regexp, data)
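
Each of the calls listed above can be changed by renaming the method only, since Logger.warning() takes the same arguments as the deprecated Logger.warn(). A minimal sketch based on the cubetl/pcaxis line; the logger name and the report_unparsed wrapper are hypothetical and only there to make the snippet runnable:

```python
import logging

# Hypothetical logger name, used only for this example.
logger = logging.getLogger("cubetl.example")


def report_unparsed(value, container):
    # Before: logger.warn("PCAxisIterator could not parse value: %r (cell: %s)", ...)
    # After: same message and arguments, only the method name changes.
    logger.warning(
        "PCAxisIterator could not parse value: %r (cell: %s)", value, container
    )


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    report_unparsed("n/a", "row 3")
```

Passing the values as arguments, as several of the listed lines already do, lets the logging module skip the string formatting when the message is filtered out; the lines that format with % inline would work unchanged too.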
