xzkostyan / clickhouse-sqlalchemy
ClickHouse dialect for SQLAlchemy
Home Page: https://clickhouse-sqlalchemy.readthedocs.io
License: Other
Hi,
The package is not properly installed when using pip install -e .
(it works fine when not using -e)
This is because of the following part in the setup.py:
packages=[
p.replace('src', 'clickhouse_sqlalchemy')
for p in find_packages(exclude=['tests'])
if p.startswith('src')
],
This is only a limitation when working on the clickhouse-sqlalchemy code itself, so it's more of a developer-experience issue.
May I suggest renaming src/ to clickhouse_sqlalchemy/ (and changing setup.py to match) to make it work? This should have no side effects (I successfully tried the refactor in my local repo).
Let me know if you're fine with this and if you want me to create the PR.
Thanks,
Timothé
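For illustration, here is a stand-alone simulation of the remapping that setup.py currently performs (the package names below are made up), showing what the rename would make unnecessary:

```python
# Simulates the packages=[...] logic from setup.py: package paths discovered
# under src/ are rewritten to the installable name clickhouse_sqlalchemy.
def remap(found_packages):
    return [
        p.replace('src', 'clickhouse_sqlalchemy')
        for p in found_packages
        if p.startswith('src')
    ]

# With the current src/ layout, find_packages() returns names like these:
print(remap(['src', 'src.drivers', 'tests']))
# ['clickhouse_sqlalchemy', 'clickhouse_sqlalchemy.drivers']
```

After the rename, find_packages(exclude=['tests']) would already return the right names, and this rewrite (which pip install -e . does not apply) could be dropped.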
Sorry if this is not the right place to ask; I don't know where else to ask.
When I use pandas.DataFrame.to_sql to insert an empty date value from a DataFrame into ClickHouse, I get the following error.
import pandas as pd
import sqlalchemy as sqla

def main():
    d2 = pd.DataFrame({
        'date': [pd.to_datetime('2018-01-01'), None, pd.to_datetime('2018-01-03')]
    })
    print(d2)
    """
            date
    0 2018-01-01
    1        NaT
    2 2018-01-03
    """
    ch_engine = sqla.create_engine(Const.CH_URI)
    d2.to_sql(name='t_date', con=ch_engine,
              if_exists='append', index=False)
.............
File "/anaconda3/envs/study/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 80, in write_column
column.write_data(items, buf)
File "/anaconda3/envs/study/lib/python3.6/site-packages/clickhouse_driver/columns/base.py", line 79, in write_data
self._write_data(items, buf)
File "/anaconda3/envs/study/lib/python3.6/site-packages/clickhouse_driver/columns/base.py", line 82, in _write_data
prepared = self.prepare_items(items)
File "/anaconda3/envs/study/lib/python3.6/site-packages/clickhouse_driver/columns/base.py", line 69, in prepare_items
x = before_write(x)
File "/anaconda3/envs/study/lib/python3.6/site-packages/clickhouse_driver/columns/datetimecolumn.py", line 26, in before_write_item
if value.tzinfo is None:
AttributeError: 'NoneType' object has no attribute 'tzinfo'
So I edited clickhouse_driver/columns/datetimecolumn.py as follows:
def before_write_item(self, value):
    if self.timezone:
        # Set server's timezone for offset-naive datetime.
        # if value.tzinfo is None:
        #     value = self.timezone.localize(value)
        # value = value.astimezone(utc)
        # return int(timegm(value.timetuple()))

        # my change:
        if hasattr(value, 'tzinfo'):
            if value.tzinfo is None:
                value = self.timezone.localize(value)
            value = value.astimezone(utc)
            return int(timegm(value.timetuple()))
        else:
            return 0
    else:
        # If datetime is offset-aware, use its timezone.
        if value.tzinfo is not None:
            value = value.astimezone(utc)
            return int(timegm(value.timetuple()))
        return int(mktime(value.timetuple()))
With this change, a null date is inserted into ClickHouse as 0000-00-00 00:00:00.
py3.7
clickhouse-driver 0.0.19
clickhouse-sqlalchemy 0.0.10
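A stand-alone sketch of the guarded conversion (the function name and the use of datetime.timezone here are mine, not the driver's): values without a tzinfo attribute, such as None, map to 0, which ClickHouse renders as the zero date.

```python
from calendar import timegm
from datetime import datetime, timezone

def to_clickhouse_timestamp(value, server_tz=timezone.utc):
    # None (or anything that isn't datetime-like) has no tzinfo: write zero date.
    if not hasattr(value, 'tzinfo'):
        return 0
    # Offset-naive datetime: assume the server's timezone.
    if value.tzinfo is None:
        value = value.replace(tzinfo=server_tz)
    # Normalize to UTC and emit a Unix timestamp.
    value = value.astimezone(timezone.utc)
    return int(timegm(value.timetuple()))

print(to_clickhouse_timestamp(None))  # 0
print(to_clickhouse_timestamp(datetime(2018, 1, 1, tzinfo=timezone.utc)))  # 1514764800
```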
from sqlalchemy import Column
from clickhouse_sqlalchemy.ext.declarative import declarative_base
from clickhouse_sqlalchemy.types import Date, String, UInt8, UInt32

class BaseModel(declarative_base()):
    __abstract__ = True
    __table_args__ = {'extend_existing': True}

class MachineOp(BaseModel):
    __tablename__ = 'machine_op'

    date = Column(Date)
    time = Column(UInt32)
    ip = Column(String)
    behavior_type = Column(UInt8)
    file_name = Column(String)
    file_path = Column(String)
    detail = Column(String)
ERROR
sqlalchemy.exc.ArgumentError: Mapper mapped class MachineOp->machine_op could not assemble any primary key columns for mapped table 'machine_op'
class BaseModel(declarative_base()):
    __abstract__ = True
    # __table_args__ = {'extend_existing': True}

class MachineOp(BaseModel):
    __tablename__ = 'xxxxdfaf'

    date = Column(Date)
    time = Column(UInt32)
    ip = Column(String)
    behavior_type = Column(UInt8)
    file_name = Column(String)
    file_path = Column(String)
    detail = Column(String)
I don't have any table named 'xxxxdfaf', which is strange.
ERROR: Failure: InvalidRequestError (Table 'xxxxdfaf' is already defined for this MetaData instance. Specify 'extend_existing=True' to redefine options and columns on an existing Table object.)
My table definition:
CREATE TABLE machine_op (
`date` Date,
`time` UInt32,
`ip` String,
`behavior_type` UInt8,
`file_name` String,
`file_path` String,
`detail` String
) ENGINE = MergeTree()
PARTITION BY date
ORDER BY date;
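SQLAlchemy's ORM refuses to map any class whose table has no primary key, even though ClickHouse itself has no primary key constraint of this kind. A common workaround is to mark one or more columns primary_key=True on the Python side only. A minimal sketch with plain SQLAlchemy types standing in for the ClickHouse ones, assuming the dialect ignores the key when emitting DDL:

```python
from sqlalchemy import Column, Date, String, MetaData
from sqlalchemy.orm import declarative_base

Base = declarative_base(metadata=MetaData())

class MachineOp(Base):
    __tablename__ = 'machine_op'

    # Python-side key only: it satisfies the mapper; no PRIMARY KEY
    # constraint needs to exist in ClickHouse itself.
    date = Column(Date, primary_key=True)
    ip = Column(String, primary_key=True)

print([c.name for c in MachineOp.__table__.primary_key])  # ['date', 'ip']
```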
When I updated clickhouse-sqlalchemy from the repo, I got an exception on the select query OrmObject.query.filter_by(id=1).first():
File "/home/vasily/PycharmProjects/np_diff/venv/lib/python3.6/site-packages/clickhouse_sqlalchemy/drivers/base.py", line 232, in _compose_select_body
if select._array_join is not None:
AttributeError: 'Select' object has no attribute '_array_join'
If I just comment out these two lines in drivers/base.py
if select._array_join is not None:
    text += select._array_join._compiler_dispatch(self, **kwargs)
or add a check
if "_array_join" in select.__dict__.keys() and select._array_join is not None:
the exception disappears. But I'm sure that's the wrong way to fix it. Do I have to update SQLAlchemy (I use 1.3.4), or something else?
Hi,
Trying select count() from the ClickHouse database raises the error "generator raised StopIteration":
clickhouse_sqlalchemy/drivers/http/connector.py", line 193, in _process_response
self._columns = next(response, None)
RuntimeError: generator raised StopIteration
I have a column of type Nullable(Float32). When I query this value, it comes back as a string for this column.
The converters
at
Getting an error when calling the function clickhouse_sqlalchemy.drivers.base.ClickHouseDialect#get_columns:
packages/clickhouse_sqlalchemy/drivers/base.py", line 424, in get_columns
for name, type_, default_type, default_expression in rows:
ValueError: too many values to unpack (expected 4)
This happens in newer versions of ClickHouse because of the newly added comment_expression column:
- https://github.com/yandex/ClickHouse/blame/7beb25eaaf6a31399c8ad9355fd93789ea068b6f/dbms/src/Interpreters/InterpreterDescribeQuery.cpp#L48
Is it possible to apply the FINAL modifier to a query?
https://clickhouse.tech/docs/ru/query_language/select/#select-from-final
Hello,
I was looking to support migrations in my app. I was wondering if this is on the roadmap, or if you have any suggestions for creating migrations based on changes in my SQLAlchemy Base classes.
Thanks!
When I execute code like this
engine = create_engine(db_opt)
Session = sessionmaker(bind=engine)
session = Session()
item = Item(data='test')
session.add(item)
session.flush()
the following error appears:
RuntimeError: generator raised StopIteration
I'm using Python 3.7. On 3.6 it works fine.
Looks like it's because of how Python 3.7 handles StopIteration:
https://www.python.org/dev/peps/pep-0479/
Full error log: error.log
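The PEP 479 change can be reproduced in a few lines, independent of this library (a minimal illustration):

```python
# Under Python 3.7+, a StopIteration that escapes a generator body is
# converted into RuntimeError instead of silently terminating iteration.
def gen():
    yield 1
    raise StopIteration  # pre-3.7 behavior: end of iteration; 3.7+: error

try:
    list(gen())
    outcome = 'finished'
except RuntimeError:
    outcome = 'RuntimeError'

print(outcome)  # RuntimeError
```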
This is my code:
import sqlalchemy as sqla
import pymssql
import pandas as pd
SQL_DB_CONN_STR = 'mssql+pymssql://sa:[email protected]:1433/UFDATA_008_2015'
CH_DB_CONN_STR = 'clickhouse://default:@192.168.1.188/zy'
table_name = 'AA_PinYin'
sql = 'select * from dbo.{}'.format(table_name)
sql_engine = sqla.create_engine(SQL_DB_CONN_STR)
ch_engine = sqla.create_engine(CH_DB_CONN_STR)
sql_df = pd.read_sql(sql, sql_engine, chunksize=5000)
num = 1
for i in sql_df:
    if num == 1:
        i.to_sql(name=table_name, con=ch_engine, if_exists='replace')
    else:
        i.to_sql(name=table_name, con=ch_engine, if_exists='append')
    num = num + 1
---------------------------------------------------------------------------
CompileError Traceback (most recent call last)
<ipython-input-7-efbf2225776b> in <module>()
18 print(j)
19 if num == 1:
---> 20 i.to_sql(name=table_name, con=ch_engine, if_exists='replace')
21 else:
22 i.to_sql(name=table_name, con=ch_engine, if_exists='append')
.......
c:\users\jeff\appdata\local\programs\python\python35\lib\site-packages\sqlalchemy\sql\compiler.py in visit_create_table(self, create)
2481 text += separator + "\t" + const
2482
-> 2483 text += "\n)%s\n\n" % self.post_create_table(table)
2484 return text
2485
c:\users\jeff\appdata\local\programs\python\python35\lib\site-packages\clickhouse_sqlalchemy\drivers\base.py in post_create_table(self, table)
287
288 if not engine:
--> 289 raise exc.CompileError("No engine for table '%s'" % table.name)
290
291 return ' ENGINE = ' + self.process(engine)
CompileError: No engine for table 'AA_PinYin'
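The failure is the dialect refusing to compile a CREATE TABLE without a ClickHouse ENGINE clause, which pandas' to_sql never supplies. Below is a simplified stand-in for that check (names are illustrative, not the library's internals); the practical workaround is to create the table in ClickHouse yourself, with its ENGINE, and then call to_sql(..., if_exists='append').

```python
class CompileError(Exception):
    pass

def post_create_table(engine_clause):
    # Mirrors the described behavior: DDL without an ENGINE is rejected.
    if not engine_clause:
        raise CompileError("No engine for table")
    return ' ENGINE = ' + engine_clause

print(post_create_table('MergeTree()'))  #  ENGINE = MergeTree()
try:
    post_create_table(None)  # what pandas-generated DDL effectively does
except CompileError as exc:
    print('compile failed:', exc)
```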
Hi,
I've started to play with your lib and I've noticed one major bug when using the HTTP driver:
When calling the execute() method of a Cursor, the SQL will not be appended with ' FORMAT TabSeparatedWithNamesAndTypes' as you could initially expect from the drivers/http/base.py file. This causes the result not to include the column names, and therefore the TSV post-processing (drivers/http/connector.py#195) goes wrong, because the first two rows are actually data samples.
It seems to me that the following method of ClickHouseDialect_http
def _execute(self, connection, sql):
    sql += ' FORMAT TabSeparatedWithNamesAndTypes'
    return connection.execute(sql)
is actually never called (Cursor or not), but I could be wrong.
One quick and easy fix could be to move this into the execute() method of Cursor.
Thanks,
Timothé
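A rough sketch of the proposed fix, with hypothetical class shapes (this is not the library's actual code): append the FORMAT clause inside the cursor's execute() so it applies on every code path.

```python
class FakeTransport:
    # Stand-in for the HTTP transport; just echoes the SQL it would send.
    def send(self, sql):
        return sql

class Cursor:
    def __init__(self, transport):
        self._transport = transport

    def execute(self, sql):
        # Ask ClickHouse for TSV with a header of names and types, so the
        # connector can recover column metadata from the first two rows.
        sql += ' FORMAT TabSeparatedWithNamesAndTypes'
        return self._transport.send(sql)

cur = Cursor(FakeTransport())
print(cur.execute('SELECT 1'))  # SELECT 1 FORMAT TabSeparatedWithNamesAndTypes
```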
Hi, it seems the current join support only applies to the ORM API. I'm using the SQLAlchemy Core API for a query-generation task, and I need to use joins in my queries. But I always get this error in drivers/base.py:
join has no attribute global_
Could you add support for Core API queries? Thanks!
Hello, I'm trying to create a relationship between two tables, as I would with MySQL for example, and I ran into the following issue. I think it's more of a newbie question than a real issue.
Code
#!/usr/bin/env python3
# coding=utf-8
from sqlalchemy import create_engine, Column, MetaData, literal, ForeignKey
from clickhouse_sqlalchemy import Table, make_session, get_declarative_base, types, engines
from sqlalchemy.orm import relationship, sessionmaker
uri = 'clickhouse://default:@localhost/test'
engine = create_engine(uri)
session = make_session(engine)
metadata = MetaData(bind=engine)
Base = get_declarative_base(metadata=metadata)
class Tableas(Base):
    __tablename__ = "tableas"

    uid = Column(types.Int64, primary_key=True)
    item = Column(types.String)

    __table_args__ = (
        engines.Memory(),
    )

class Datas(Base):
    __tablename__ = "datas"

    uid = Column(types.Int64, primary_key=True)
    insert_date = Column(types.Date, primary_key=True)

    # Relation to the tables
    tablea_uid = Column(types.Int64, ForeignKey('tableas.uid'))
    tablea = relationship(Tableas)

    __table_args__ = (
        engines.Memory(),
    )

if __name__ == "__main__":
    Base.metadata.create_all(engine)
Error:
clickhouse_sqlalchemy.exceptions.DatabaseException: Orig exception: Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 77 (line 6, col 2): ,
FOREIGN KEY(tablea_uid) REFERENCES tableas (uid)
) ENGINE = Memory
. Expected one of: identifier, column declaration, e.what() = DB::Exception
The tableas table is created correctly:
clickhouse :) describe tableas;
DESCRIBE TABLE tableas
┌─name─┬─type───┬─default_type─┬─default_expression─┬─comment_expression─┬─codec_expression─┐
│ uid │ Int64 │ │ │ │ │
│ item │ String │ │ │ │ │
└──────┴────────┴──────────────┴────────────────────┴────────────────────┴──────────────────┘
2 rows in set. Elapsed: 0.009 sec.
Is this a limitation, or a user error?
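ClickHouse has no FOREIGN KEY support, so the ForeignKey() column makes create_all() emit a REFERENCES clause that the server rejects. One pattern (sketched here with plain SQLAlchemy types, nothing ClickHouse-specific) is to keep the join purely on the Python side with an explicit primaryjoin, so no constraint DDL is generated:

```python
from sqlalchemy import BigInteger, Column, MetaData, String
from sqlalchemy.orm import configure_mappers, declarative_base, relationship

Base = declarative_base(metadata=MetaData())

class Tableas(Base):
    __tablename__ = 'tableas'
    uid = Column(BigInteger, primary_key=True)
    item = Column(String)

class Datas(Base):
    __tablename__ = 'datas'
    uid = Column(BigInteger, primary_key=True)
    tablea_uid = Column(BigInteger)  # plain column, no ForeignKey()
    # foreign() marks the join column in Python only; no REFERENCES in DDL.
    tablea = relationship(
        'Tableas',
        primaryjoin='foreign(Datas.tablea_uid) == Tableas.uid',
        viewonly=True,
    )

configure_mappers()  # resolves the string primaryjoin
print(len(Datas.__table__.foreign_key_constraints))  # 0
```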
The bot created this issue to inform you that pyup.io has been set up on this repo.
Once you have closed it, the bot will open pull requests for updates as soon as they are available.
SQLAlchemy can't reflect a database if it has a table with an Array field.
CREATE TABLE test_db.test(
id Int64,
array_field Array(Float64)
) ENGINE = Memory()
import sqlalchemy as sa
from clickhouse_sqlalchemy import make_session
engine = sa.create_engine('clickhouse+native://user:password@host:9000/test_db')
ch_session = make_session(engine)
metadata = sa.MetaData(bind=engine, quote_schema='')
metadata.reflect()
Raises:
TypeError Traceback (most recent call last)
<ipython-input-81-1a8e5f1aa1a4> in <module>()
2 ch_session = make_session(engine)
3 metadata = sa.MetaData(bind=engine, quote_schema='')
----> 4 metadata.reflect()
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in reflect(self, bind, schema, views, only, extend_existing, autoload_replace, **dialect_kwargs)
3907
3908 for name in load:
-> 3909 Table(name, self, **reflect_opts)
3910
3911 def append_ddl_listener(self, event_name, listener):
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in __new__(cls, *args, **kw)
437 except:
438 with util.safe_reraise():
--> 439 metadata._remove_table(name, schema)
440
441 @property
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py in __exit__(self, type_, value, traceback)
64 self._exc_info = None # remove potential circular references
65 if not self.warn_only:
---> 66 compat.reraise(exc_type, exc_value, exc_tb)
67 else:
68 if not compat.py3k and self._exc_info and self._exc_info[1]:
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/util/compat.py in reraise(tp, value, tb, cause)
185 if value.__traceback__ is not tb:
186 raise value.with_traceback(tb)
--> 187 raise value
188
189 else:
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in __new__(cls, *args, **kw)
432 metadata._add_table(name, schema, table)
433 try:
--> 434 table._init(name, metadata, *args, **kw)
435 table.dispatch.after_parent_attach(table, metadata)
436 return table
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in _init(self, name, metadata, *args, **kwargs)
512 self._autoload(
513 metadata, autoload_with,
--> 514 include_columns, _extend_on=_extend_on)
515
516 # initialize all the column, etc. objects. done after reflection to
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in _autoload(self, metadata, autoload_with, include_columns, exclude_columns, _extend_on)
525 autoload_with.dialect.reflecttable,
526 self, include_columns, exclude_columns,
--> 527 _extend_on=_extend_on
528 )
529 else:
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/engine/base.py in run_callable(self, callable_, *args, **kwargs)
1532
1533 """
-> 1534 return callable_(self, *args, **kwargs)
1535
1536 def _run_visitor(self, visitorcallable, element, **kwargs):
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/engine/default.py in reflecttable(self, connection, table, include_columns, exclude_columns, **opts)
370 insp = reflection.Inspector.from_engine(connection)
371 return insp.reflecttable(
--> 372 table, include_columns, exclude_columns, **opts)
373
374 def get_pk_constraint(self, conn, table_name, schema=None, **kw):
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/engine/reflection.py in reflecttable(self, table, include_columns, exclude_columns, _extend_on)
596
597 for col_d in self.get_columns(
--> 598 table_name, schema, **table.dialect_kwargs):
599 found_table = True
600
~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/engine/reflection.py in get_columns(self, table_name, schema, **kw)
372 coltype = col_def['type']
373 if not isinstance(coltype, TypeEngine):
--> 374 col_def['type'] = coltype()
375 return col_defs
376
TypeError: __init__() missing 1 required positional argument: 'item_type'
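The traceback boils down to reflection instantiating a reflected type class with no arguments, which cannot work for a parameterized type like Array(item_type). A distilled stand-in (these classes are illustrative, not the library's):

```python
class Float64:
    pass

class Array:
    # Parameterized type: needs the element type to be constructed.
    def __init__(self, item_type):
        self.item_type = item_type

coltype = Array  # reflection hands back the class, not an instance
try:
    coltype()    # what get_columns post-processing effectively does
except TypeError as exc:
    print('TypeError:', exc)

# Constructing with the inner type is what the dialect would need to do.
print(Array(Float64).item_type is Float64)  # True
```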
At the moment, the DatabaseException constructor looks like this:
class DatabaseException(Exception):
    def __init__(self, orig):
        self.orig = orig
        super(DatabaseException, self).__init__()
This can cause problems when DatabaseException is thrown from a multiprocessing process (e.g. while working with big data in ClickHouse in parallel): ForkingPickler complains about the missing orig parameter. So I suggest passing orig into the super constructor, because orig is non-optional:
class DatabaseException(Exception):
    def __init__(self, orig):
        self.orig = orig
        super(DatabaseException, self).__init__(orig)
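The problem is easy to reproduce without multiprocessing, because pickle re-creates an exception from its args (a minimal illustration with stand-in class names):

```python
import pickle

class Broken(Exception):
    def __init__(self, orig):
        self.orig = orig
        super().__init__()        # args == (): orig is lost for pickling

class Fixed(Exception):
    def __init__(self, orig):
        self.orig = orig
        super().__init__(orig)    # args == (orig,): round-trips cleanly

try:
    pickle.loads(pickle.dumps(Broken('boom')))
except TypeError:
    print('Broken cannot be unpickled')

print(pickle.loads(pickle.dumps(Fixed('boom'))).orig)  # boom
```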
I'm using the native driver with ClickHouse 20.1.4.14 and the latest clickhouse-sqlalchemy from pip.
It seems the issue came with the ClickHouse update (we were using 19.1.x before), but this is a guess, because when we switched to 20.1 we also changed the driver from sqlalchemy-clickhouse to clickhouse-sqlalchemy due to some other issues.
So when running a very basic selection of a varchar field, we get an error:
select description from my_database.creatives;
The corresponding error is attached below with the trace log. Interestingly, I found that the toString() function makes the query work:
select toString(description) from my_database.creatives;
It might somehow be related to Nullable columns, as this column is defined as follows. Note the fields returned by the query are not null in this case, so there is real data coming back!
CREATE TABLE IF NOT EXISTS `my_database`.`creatives` (
`id` Int64,
`description` Nullable(varchar),
...
)
ENGINE = MergeTree
PRIMARY KEY (`id`)
ORDER BY id;
I'd be happy about any feedback. Thanks!
PS: Sorry for the bad formatting of the trace
2020-02-19 17:49:44,069:DEBUG:clickhouse_driver.connection:Query: select version()
2020-02-19 17:49:44,069:DEBUG:clickhouse_driver.connection:Block send time: 0.000063
2020-02-19 17:49:44,071:INFO:root:Running statement 1 out of 1
2020-02-19 17:49:44,083:INFO:root:Parsing with sqlparse statement select description from my_database.creatives
2020-02-19 17:49:44,085:INFO:root:Parsing with sqlparse statement select description from my_database.creatives
2020-02-19 17:49:44,086:INFO:root:Running query:
select description from spearad_data.creatives
LIMIT 1000
2020-02-19 17:49:44,086:DEBUG:clickhouse_driver.connection:Connecting. Database: my_database. User: default
2020-02-19 17:49:44,089:DEBUG:clickhouse_driver.connection:Connected to ClickHouse server version 20.1.4, revision: 54431
2020-02-19 17:49:44,089:DEBUG:clickhouse_driver.connection:Query: select description from my_database.creatives
LIMIT 1000
2020-02-19 17:49:44,089:DEBUG:clickhouse_driver.connection:Block send time: 0.000060
2020-02-19 17:49:44,091:DEBUG:root:[stats_logger] (timing) sqllab.query.time_executing_query | 4.578125
2020-02-19 17:49:44,091:ERROR:root:Orig exception: Code: 50. Unknown type varchar
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 74, in get_column_by_spec
cls = column_by_type[spec]
KeyError: 'varchar'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/clickhouse_sqlalchemy/drivers/native/connector.py", line 156, in execute
**execute_kwargs
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/client.py", line 214, in execute
columnar=columnar
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/client.py", line 337, in process_ordinary_query
columnar=columnar)
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/client.py", line 83, in receive_result
return result.get_result()
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/result.py", line 48, in get_result
for packet in self.packet_generator:
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/client.py", line 95, in packet_generator
packet = self.receive_packet()
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/client.py", line 109, in receive_packet
packet = self.connection.receive_packet()
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/connection.py", line 386, in receive_packet
packet.block = self.receive_data()
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/connection.py", line 449, in receive_data
return self.block_in.read()
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/streams/native.py", line 75, in read
self.fin)
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 83, in read_column
column = get_column_by_spec(column_spec, column_options=column_options)
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 63, in get_column_by_spec
return create_nullable_column(spec, create_column_with_options)
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/nullablecolumn.py", line 5, in create_nullable_column
nested = column_by_spec_getter(inner)
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 45, in create_column_with_options
return get_column_by_spec(x, column_options)
File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 78, in get_column_by_spec
raise errors.UnknownTypeError('Unknown type {}'.format(e.args[0]))
clickhouse_driver.errors.UnknownTypeError: Code: 50. Unknown type varchar
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/superset/superset/sql_lab.py", line 204, in execute_sql_statement
db_engine_spec.execute(cursor, sql, async_=True)
File "/home/superset/superset/db_engine_specs/base.py", line 493, in execute
cursor.execute(query)
File "/usr/local/lib/python3.6/site-packages/clickhouse_sqlalchemy/drivers/native/connector.py", line 160, in execute
raise DatabaseException(orig)
clickhouse_sqlalchemy.exceptions.DatabaseException: Orig exception: Code: 50. Unknown type varchar
10.192.10.21 - - [19/Feb/2020 17:49:44] "POST /superset/sql_json/ HTTP/1.1" 500 -
2020-02-19 17:49:44,111:INFO:werkzeug:10.192.10.21 - - [19/Feb/2020 17:49:44] "POST /superset/sql_json/ HTTP/1.1" 500 -
2020-02-19 17:49:44,255:DEBUG:root:[stats_logger] (incr) queries
10.192.10.21 - - [19/Feb/2020 17:49:44] "GET /superset/queries/1582128376000 HTTP/1.1" 200 -
2020-02-19 17:49:44,263:INFO:werkzeug:10.192.10.21 - - [19/Feb/2020 17:49:44] "GET /superset/queries/1582128376000 HTTP/1.1" 200 -
127.0.0.1 - - [19/Feb/2020 17:49:46] "GET /health HTTP/1.1" 200 -
2020-02-19 17:49:46,382:INFO:werkzeug:127.0.0.1 - - [19/Feb/2020 17:49:46] "GET /health HTTP/1.1" 200 -
10.192.11.13 - - [19/Feb/2020 17:50:03] "GET / HTTP/1.1" 302 -
2020-02-19 17:50:03,099:INFO:werkzeug:10.192.11.13 - - [19/Feb/2020 17:50:03] "GET / HTTP/1.1" 302 -
10.192.10.21 - - [19/Feb/2020 17:50:03] "GET / HTTP/1.1" 302 -
2020-02-19 17:50:03,223:INFO:werkzeug:10.192.10.21 - - [19/Feb/2020 17:50:03] "GET / HTTP/1.1" 302 -
127.0.0.1 - - [19/Feb/2020 17:50:16] "GET /health HTTP/1.1" 200 -
2020-02-19 17:50:16,818:INFO:werkzeug:127.0.0.1 - - [19/Feb/2020 17:50:16] "GET /health HTTP/1.1" 200 -
10.192.11.13 - - [19/Feb/2020 17:50:33] "GET / HTTP/1.1" 302 -
2020-02-19 17:50:33,104:INFO:werkzeug:10.192.11.13 - - [19/Feb/2020 17:50:33] "GET / HTTP/1.1" 302 -
10.192.10.21 - - [19/Feb/2020 17:50:33] "GET / HTTP/1.1" 302 -
2020-02-19 17:50:33,230:INFO:werkzeug:10.192.10.21 - - [19/Feb/2020 17:50:33] "GET / HTTP/1.1" 302 -
127.0.0.1 - - [19/Feb/2020 17:50:47] "GET /health HTTP/1.1" 200 -
2020-02-19 17:50:47,249:INFO:werkzeug:127.0.0.1 - - [19/Feb/2020 17:50:47] "GET /health HTTP/1.1" 200 -
10.192.11.13 - - [19/Feb/2020 17:51:03] "GET / HTTP/1.1" 302 -
I ran into a problem with query compilation.
engine_clickhouse = create_engine(config['sqlalchemy']['conn_string_clickhouse'],
                                  convert_unicode=True,
                                  echo=config.getboolean('sqlalchemy', 'ECHO_DB'))
SessionClickhouse = ScopedSession(sessionmaker(bind=engine_clickhouse))
from clickhouse_sqlalchemy import types, engines
from clickhouse_sqlalchemy.ext.declarative import get_declarative_base
from sqlalchemy import Column
CHBase = get_declarative_base()
# a piece of the model, nothing unusual
class PixelStats(CHBase):
    ...
    ts_spawn = Column(types.UInt32, primary_key=True)
    ...
Here is what I get in the console:
>>> from proj.core.model.meta import SessionClickHouse
>>> from proj.core.model_clickhouse import PixelStats
>>> from proj.core.orm_extensions.compiler import Compiler
>>> SessionClickHouse.bind.dialect.name
'clickhouse'
>>> print Compiler.compile_query(SessionClickHouse.query(PixelStats.ts_spawn - PixelStats.ts_spawn % 3600), dialect=SessionClickHouse.bind.dialect)
SELECT ts_spawn - ts_spawn %(ts_spawn_1)s AS anon_1
FROM pixel_stats
>>> SessionClickHouse.query(PixelStats.ts_spawn - PixelStats.ts_spawn % 3600).first()
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2700, in first
ret = list(self[0:1])
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2498, in __getitem__
return list(res)
File "/home/anton/Projects/proj/core/proj/core/model/cachingquery.py", line 106, in __iter__
return Query.__iter__(self)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2802, in __iter__
return self._execute_and_instances(context)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2817, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 914, in execute
return meth(self, multiparams, params)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
compiled_sql, distilled_params
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
context)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1344, in _handle_dbapi_exception
util.reraise(*exc_info)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
context)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/clickhouse_sqlalchemy/drivers/base.py", line 435, in do_execute
cursor.execute(statement, parameters, context=context)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/clickhouse_sqlalchemy/drivers/http/connector.py", line 105, in execute
self._process_response(response_gen)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/clickhouse_sqlalchemy/drivers/http/connector.py", line 197, in _process_response
self._columns = next(response, None)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/clickhouse_sqlalchemy/drivers/http/transport.py", line 39, in execute
r = self._send(query, params=params, stream=True)
File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/clickhouse_sqlalchemy/drivers/http/transport.py", line 77, in _send
raise DatabaseException(orig)
DatabaseException: Orig exception: Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 43 (line 1, col 43): AS anon_1
FROM pixel_stats
LIMIT 1 FORMAT TabSeparatedWithNamesAndTypes. Expected one of: Comma, INTO OUTFILE, FROM, WITH, HAVING, SETTINGS, GROUP BY, ORDER BY, UNION ALL, PREWHERE, WHERE, token, LIMIT, FORMAT, e.what() = DB::Exception
The manual "compiler":
from psycopg2.extensions import adapt
from sqlalchemy.dialects import postgresql
_pg_dialect = postgresql.dialect()
class Compiler(object):
    encoding = 'utf8'

    @classmethod
    def compile_statement(cls, statement, dialect=_pg_dialect):
        comp = statement.compile(dialect=dialect)
        params = {}
        for k, v in comp.params.items():
            if isinstance(v, unicode):
                v = v.encode(cls.encoding)
            if dialect.name == _pg_dialect.name:
                params[k] = adapt(v)
        return (comp.string.encode(cls.encoding) % params).decode(cls.encoding)

    @classmethod
    def compile_query(cls, query, dialect=_pg_dialect):
        return cls.compile_statement(query.statement, dialect=dialect)
The same operation works fine with PostgreSQL (with a different model, the standard SQLAlchemy declarative_base(), and a PostgreSQL engine):
>>> from proj.core.model import Stat
>>> from proj.core.model.meta import Session
>>> Session.bind.dialect.name
'postgresql'
>>> print Compiler.compile_query(Session.query(Stat.ts_spawn - Stat.ts_spawn % 3600), dialect=Session.bind.dialect)
SELECT stats.ts_spawn - stats.ts_spawn % 3600 AS anon_1
FROM stats
>>> Session.query(Stat.ts_spawn - Stat.ts_spawn % 3600).first()
(1508162400,)
I understand that there is the mod function in PostgreSQL and modulo in ClickHouse, but I'd like to use the % operator, precisely because it is the same across different DBMSs.
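As a side note on the manual compiler above: SQLAlchemy can render bound values itself via literal_binds, which avoids interpolating comp.string with Python's % operator (where a SQL % and the pyformat paramstyle collide). A small sketch with the default compiler:

```python
from sqlalchemy import column, select

# ts_spawn - ts_spawn % 3600, as in the query above, built with Core.
expr = select(column('ts_spawn') - column('ts_spawn') % 3600)

# literal_binds asks SQLAlchemy to inline the 3600 instead of emitting a
# paramstyle placeholder next to the SQL % operator.
compiled = expr.compile(compile_kwargs={'literal_binds': True})
print(compiled)
```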
Hi,
We're having some issues with the driver in a use case that uses a raw connection.
How to reproduce:
from sqlalchemy import create_engine, Column, MetaData, literal
from datetime import date, timedelta
from clickhouse_sqlalchemy import make_session, get_declarative_base, types, engines
uri = 'clickhouse+native://default:xxx@localhost/test'
engine = create_engine(uri)
session = make_session(engine)
metadata = MetaData(bind=engine)
Base = get_declarative_base(metadata=metadata)
class Rate(Base):
    day = Column(types.Date, primary_key=True)
    value = Column(types.Int32)

    __table_args__ = (
        engines.Memory(),
    )
table = Rate.__table__
table.create()
today = date.today()
rates = [{'day': today - timedelta(i), 'value': 200 - i} for i in range(100)]
# Emits single INSERT statement.
session.execute(table.insert(), rates) # This is fine
session.execute("SELECT * FROM rate") # Still OK
# now using a raw connection
raw = engine.raw_connection()
cur = raw.cursor()
qs = cur.execute("SELECT * FROM rate") # Boum
Traceback (most recent call last):
File "/home/achille/.virtualenvs/clickhouse/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3267, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-14-392e6e940555>", line 1, in <module>
qs = cur.execute("SELECT * FROM rate")
File "/home/achille/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/native/connector.py", line 142, in execute
external_tables, execute, settings = self._prepare(context)
File "/home/achille/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/native/connector.py", line 115, in _prepare
execution_options = context.execution_options
AttributeError: 'NoneType' object has no attribute 'execution_options'
So the problem is in fact simple and related to the last update of drivers/native/connector.py: the execute and executemany methods of Cursor can take context=None as an argument, but _prepare does not handle this case, and its very first line raises the exception:
def _prepare(self, context):
    execution_options = context.execution_options  # Here
    external_tables = self.make_external_tables(
        context.dialect, execution_options
    )
    transport = self._connection.transport
    execute = transport.execute
    execute_iter = getattr(transport, 'execute_iter', None)
    self._stream_results = execution_options.get('stream_results', False)
    settings = execution_options.get('settings')
    if self._stream_results and execute_iter:
        execute = execute_iter
        settings = settings or {}
        settings['max_block_size'] = execution_options['max_row_buffer']
    return external_tables, execute, settings

def execute(self, operation, parameters=None, context=None):
    self._reset_state()
    self._begin_query()
    try:
        external_tables, execute, settings = self._prepare(context)
I have a working patch for this (well, it's just a check on context, no big deal), but as I'm not using any external tables, I cannot test their usage with a raw_connection. However, I think this particular usage may be ignored, so I'll make a PR to fix this.
Thanks for your work
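The check can be sketched as follows (a simplified stand-in, not the actual patch): fall back to empty execution options, and skip external tables, when there is no statement context.

```python
def prepare(context, transport_execute):
    # Raw-connection cursors call execute() with context=None.
    execution_options = context.execution_options if context is not None else {}
    # External tables need a context (dialect access); skip without one.
    external_tables = None
    settings = execution_options.get('settings')
    return external_tables, transport_execute, settings

print(prepare(None, 'execute'))  # (None, 'execute', None)
```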
I found this code
pos = rv.rfind('VALUES (')
# Remove (%s)-templates from VALUES clause if exists.
# ClickHouse server since version 19.3.3 parses the query after VALUES and
# allows inplace parameters.
# Example: INSERT INTO test (x) VALUES (1), (2).
if pos != -1:
    rv = rv[:pos + 6]
in base.py, in the visit_insert function. It truncates rv
from INSERT INTO data (param) VALUES (%(param)s)
to INSERT INTO data (param) VALUES
And I didn't find a test that checks this.
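A minimal check of that truncation, essentially the missing test (the helper name strip_values_templates is mine; the body mirrors the snippet above):

```python
def strip_values_templates(rv):
    # Keep everything up to and including the literal 'VALUES',
    # dropping the (%s)-style templates after it, as visit_insert does.
    pos = rv.rfind('VALUES (')
    if pos != -1:
        rv = rv[:pos + 6]  # len('VALUES') == 6
    return rv


print(strip_values_templates('INSERT INTO data (param) VALUES (%(param)s)'))
# INSERT INTO data (param) VALUES
print(strip_values_templates('SELECT 1'))  # unchanged: SELECT 1
```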
Can you provide an example of how to integrate with flask-sqlalchemy?
Best regards.
subj
Code to reproduce:
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('clickhouse://.../testing')
df = pd.DataFrame({'value': [0, 1], 'str': ['qwer', 'asdf']})
conn = engine.connect()
df.to_sql('test_drop', conn, if_exists='replace', index=False)
conn.close()
Clickhouse:
CREATE TABLE testing.test_drop (value Int32, str String) ENGINE = TinyLog;
Full exception:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-20-56f4dbb057f1> in <module>()
1 with DBConnection('ch_credentials.pkl', dbname='testing') as conn:
----> 2 test_df.to_sql('test_drop', conn, if_exists='replace', index=False)
/usr/local/lib/python3.6/site-packages/pandas/core/generic.py in to_sql(self, name, con, flavor, schema, if_exists, index, index_label, chunksize, dtype)
1532 sql.to_sql(self, name, con, flavor=flavor, schema=schema,
1533 if_exists=if_exists, index=index, index_label=index_label,
-> 1534 chunksize=chunksize, dtype=dtype)
1535
1536 def to_pickle(self, path, compression='infer',
/usr/local/lib/python3.6/site-packages/pandas/io/sql.py in to_sql(frame, name, con, flavor, schema, if_exists, index, index_label, chunksize, dtype)
471 pandas_sql.to_sql(frame, name, if_exists=if_exists, index=index,
472 index_label=index_label, schema=schema,
--> 473 chunksize=chunksize, dtype=dtype)
474
475
/usr/local/lib/python3.6/site-packages/pandas/io/sql.py in to_sql(self, frame, name, if_exists, index, index_label, schema, chunksize, dtype)
1153 if_exists=if_exists, index_label=index_label,
1154 schema=schema, dtype=dtype)
-> 1155 table.create()
1156 table.insert(chunksize)
1157 if (not name.isdigit() and not name.islower()):
/usr/local/lib/python3.6/site-packages/pandas/io/sql.py in create(self)
592 raise ValueError("Table '%s' already exists." % self.name)
593 elif self.if_exists == 'replace':
--> 594 self.pd_sql.drop_table(self.name, self.schema)
595 self._execute_create()
596 elif self.if_exists == 'append':
/usr/local/lib/python3.6/site-packages/pandas/io/sql.py in drop_table(self, table_name, schema)
1203 if self.has_table(table_name, schema):
1204 self.meta.reflect(only=[table_name], schema=schema)
-> 1205 self.get_table(table_name, schema).drop()
1206 self.meta.clear()
1207
/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in drop(self, bind, checkfirst)
792 bind._run_visitor(ddl.SchemaDropper,
793 self,
--> 794 checkfirst=checkfirst)
795
796 def tometadata(self, metadata, schema=RETAIN_SCHEMA,
/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py in _run_visitor(self, visitorcallable, element, **kwargs)
1547 def _run_visitor(self, visitorcallable, element, **kwargs):
1548 visitorcallable(self.dialect, self,
-> 1549 **kwargs).traverse_single(element)
1550
1551
/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/visitors.py in traverse_single(self, obj, **kw)
119 meth = getattr(v, "visit_%s" % obj.__visit_name__, None)
120 if meth:
--> 121 return meth(obj, **kw)
122
123 def iterate(self, obj):
/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py in visit_table(self, table, drop_ok, _is_metadata_operation)
953 self.traverse_single(column.default)
954
--> 955 self.connection.execute(DropTable(table))
956
957 table.dispatch.after_drop(
/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py in execute(self, object, *multiparams, **params)
946 raise exc.ObjectNotExecutableError(object)
947 else:
--> 948 return meth(self, multiparams, params)
949
950 def _execute_function(self, func, multiparams, params):
/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py in _execute_on_connection(self, connection, multiparams, params)
66
67 def _execute_on_connection(self, connection, multiparams, params):
---> 68 return connection._execute_ddl(self, multiparams, params)
69
70 def execute(self, bind=None, target=None):
/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py in _execute_ddl(self, ddl, multiparams, params)
1001 dialect=dialect,
1002 schema_translate_map=self.schema_for_object
-> 1003 if not self.schema_for_object.is_default else None)
1004 ret = self._execute_context(
1005 dialect,
<string> in <lambda>(self, bind, dialect, **kw)
/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/elements.py in compile(self, default, bind, dialect, **kw)
440 else:
441 dialect = default.StrCompileDialect()
--> 442 return self._compiler(dialect, bind=bind, **kw)
443
444 def _compiler(self, dialect, **kw):
/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py in _compiler(self, dialect, **kw)
24 Dialect."""
25
---> 26 return dialect.ddl_compiler(dialect, self, **kw)
27
28
/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py in __init__(self, dialect, statement, bind, schema_translate_map, compile_kwargs)
217 if self.can_execute:
218 self.execution_options = statement._execution_options
--> 219 self.string = self.process(self.statement, **compile_kwargs)
220
221 @util.deprecated("0.7", ":class:`.Compiled` objects now compile "
/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py in process(self, obj, **kwargs)
243
244 def process(self, obj, **kwargs):
--> 245 return obj._compiler_dispatch(self, **kwargs)
246
247 def __str__(self):
/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/visitors.py in _compiler_dispatch(self, visitor, **kw)
79 raise exc.UnsupportedCompilationError(visitor, cls)
80 else:
---> 81 return meth(self, **kw)
82 else:
83 # The optimization opportunity is lost for this case because the
~/Projects/Git/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/base.py in visit_drop_table(self, drop)
293 text = '\nDROP TABLE '
294
--> 295 if drop.if_exists:
296 text += 'IF EXISTS '
297
AttributeError: 'DropTable' object has no attribute 'if_exists'
Lib version - '0.0.5', installed from master-branch.
Hi, I am trying to insert values from a select statement into a table. Let's say I have one table Table1
with columns A
and B
. I want to select all records from Table1
, add a new column C
with some value, and insert this into a second table Table2
.
But I don't know how to add the new column C
.
My raw SQL query would look like
INSERT INTO Table2
SELECT A, B, 'some_value' AS C
FROM Table1
My python code:
from sqlalchemy import cast
# Labels must be present.
select_query = session.query(Table1).filter(Table.A.in_(ids)).subquery() # How to add new column C?
session.execute(
Table2.insert()
.from_select(['A', 'B', 'C'], select_query)
)
Thanks in advance.
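One way to add the constant column on the SQLAlchemy side is to label a literal(). A sketch with a stand-in table (SQLAlchemy 1.4+ select() style; Table1/Table2 are the user's models, so only the shape is shown here):

```python
from sqlalchemy import column, literal, select, table

# Stand-in for Table1; the point is attaching a constant as a
# labeled extra column so from_select() can map it onto C.
t1 = table('Table1', column('A'), column('B'))
sel = select(t1.c.A, t1.c.B, literal('some_value').label('C'))
print(sel)  # the literal compiles as a labeled bind parameter
```

With the ORM the equivalent would be session.query(Table1.A, Table1.B, literal('some_value').label('C')).filter(Table1.A.in_(ids)), passed to Table2.insert().from_select(['A', 'B', 'C'], ...).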
Same as mymarilyn/clickhouse-driver#101
session.query(...).count()
That code generates something like this:
select count(*) from (
select ...
from table
) anon_1
If I try to use sample, the query will look like this:
session.query(...).sample(0.1)
select count(*) from (
select ...
from table
sample 0.1
) anon_1
sample 0.1
The same problem appears with ARRAY JOIN
clickhouse_sqlalchemy/drivers/base.py:583
It looks like this code is never called.
Code: type_enum = enum.Enum('%s_enum' % name, options)
Enum
accepts only one arg.
In 1.1.54310 the ability to specify an arbitrary partitioning key was added; it would be great if DDL generation for it were implemented.
Possible ENGINE variants:
... ) ENGINE = MergeTree
PARTITION BY toDate(ts_spawn)
ORDER BY (user_id, country_id);
... ) ENGINE = MergeTree
PARTITION BY ts_spawn
ORDER BY user_id;
... ) ENGINE = MergeTree
PARTITION BY ts_spawn - ts_spawn % 86400
ORDER BY (user_id, country_id);
Trying to inspect schema and table names from an engine fails as the column name seems to be missing from the ResultProxy
instance. Accessing it via the integer position works, but it seems accessing via name doesn't work as described by the API: https://docs.sqlalchemy.org/en/13/core/connections.html#sqlalchemy.engine.ResultProxy
Example code:
from sqlalchemy import create_engine, inspect
uri = 'clickhouse://default:@localhost'
engine = create_engine(uri)
print('Running manually')
res = engine.execute('SHOW DATABASES')
for row in res:
print(row[0]) # has to be referenced by integer position to work
print('\nRunning via inspector')
insp = inspect(engine)
insp.get_schema_names()
Result:
Running manually
test
Running via inspector
Traceback (most recent call last):
File "click.py", line 16, in <module>
insp.get_schema_names()
File "/Users/ville/.local/share/virtualenvs/incubator-superset-Q6iPzijF/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 160, in get_schema_names
self.bind, info_cache=self.info_cache
File "<string>", line 2, in get_schema_names
File "/Users/ville/.local/share/virtualenvs/incubator-superset-Q6iPzijF/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 56, in cache
ret = fn(self, con, *args, **kw)
File "/Users/ville/.local/share/virtualenvs/incubator-superset-Q6iPzijF/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/base.py", line 560, in get_schema_names
return [row.name for row in connection.execute('SHOW DATABASES')]
File "/Users/ville/.local/share/virtualenvs/incubator-superset-Q6iPzijF/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/base.py", line 560, in <listcomp>
return [row.name for row in connection.execute('SHOW DATABASES')]
AttributeError: Could not locate column in row for column 'name'
Is there a way to add ARRAY JOIN
clause to my query?
Hi!
I have the next table:
class Rate(Base):
day = Column(types.Date, primary_key=True)
value = Column(types.Int32, nullable=True)
__table_args__ = (
engines.MergeTree(date_col='day', key_expressions=('day',)),
)
Then I try to insert data into this table:
session.execute(Rate.__table__.insert(), [{'day':datetime.now(), 'value': None}, {'day':datetime.now(), 'value': 1}])
And it causes the following error:
---------------------------------------------------------------------------
error Traceback (most recent call last)
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/base.py in write_items(self, items, buf)
141 try:
--> 142 buf.write(s.pack(*items))
143
error: required argument is not an integer
During handling of the above exception, another exception occurred:
StructPackException Traceback (most recent call last)
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/service.py in write_column(context, column_name, column_spec, items, buf, types_check)
90 column.write_state_prefix(buf)
---> 91 column.write_data(items, buf)
92
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/base.py in write_data(self, items, buf)
78
---> 79 self._write_data(items, buf)
80
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/base.py in _write_data(self, items, buf)
82 prepared = self.prepare_items(items)
---> 83 self.write_items(prepared, buf)
84
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/base.py in write_items(self, items, buf)
144 except struct_error as e:
--> 145 raise exceptions.StructPackException(e)
146
StructPackException: required argument is not an integer
During handling of the above exception, another exception occurred:
TypeMismatchError Traceback (most recent call last)
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/native/connector.py in executemany(self, operation, seq_of_parameters, context)
164 operation, params=seq_of_parameters,
--> 165 external_tables=external_tables, settings=settings
166 )
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/client.py in execute(self, query, params, with_column_types, external_tables, query_id, settings, types_check, columnar)
204 query, params, external_tables=external_tables,
--> 205 query_id=query_id, types_check=types_check
206 )
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/client.py in process_insert_query(self, query_without_data, data, external_tables, query_id, types_check)
355 if sample_block:
--> 356 self.send_data(sample_block, data, types_check=types_check)
357 packet = self.connection.receive_packet()
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/client.py in send_data(self, sample_block, data, types_check)
379 types_check=types_check)
--> 380 self.connection.send_data(block)
381
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/connection.py in send_data(self, block, table_name)
478
--> 479 self.block_out.write(block)
480 self.block_out.reset()
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/streams/native.py in write(self, block)
40 write_column(self.context, col_name, col_type, items,
---> 41 self.fout, types_check=block.types_check)
42
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/service.py in write_column(context, column_name, column_spec, items, buf, types_check)
106 'Column {}: {}'.format(
--> 107 column_name, str(error)
108 )
TypeMismatchError: Code: 53. Type mismatch in VALUES section. Repeat query with types_check=True for detailed info. Column value: required argument is not an integer
During handling of the above exception, another exception occurred:
DatabaseException Traceback (most recent call last)
<ipython-input-210-be61f4dbb946> in <module>
----> 1 session.execute(Rate.__table__.insert(), [{'day':datetime.now(), 'value': None}, {'day':datetime.now(), 'value': 1}])
~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py in execute(self, clause, params, mapper, bind, **kw)
1267
1268 return self._connection_for_bind(bind, close_with_result=True).execute(
-> 1269 clause, params or {}
1270 )
1271
~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py in execute(self, object_, *multiparams, **params)
986 raise exc.ObjectNotExecutableError(object_)
987 else:
--> 988 return meth(self, multiparams, params)
989
990 def _execute_function(self, func, multiparams, params):
~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/sql/elements.py in _execute_on_connection(self, connection, multiparams, params)
285 def _execute_on_connection(self, connection, multiparams, params):
286 if self.supports_execution:
--> 287 return connection._execute_clauseelement(self, multiparams, params)
288 else:
289 raise exc.ObjectNotExecutableError(self)
~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _execute_clauseelement(self, elem, multiparams, params)
1105 distilled_params,
1106 compiled_sql,
-> 1107 distilled_params,
1108 )
1109 if self._has_events or self.engine._has_events:
~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
1246 except BaseException as e:
1247 self._handle_dbapi_exception(
-> 1248 e, statement, parameters, cursor, context
1249 )
1250
~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _handle_dbapi_exception(self, e, statement, parameters, cursor, context)
1466 util.raise_from_cause(sqlalchemy_exception, exc_info)
1467 else:
-> 1468 util.reraise(*exc_info)
1469
1470 finally:
~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py in reraise(tp, value, tb, cause)
151 if value.__traceback__ is not tb:
152 raise value.with_traceback(tb)
--> 153 raise value
154
155 def u(s):
~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
1222 if not evt_handled:
1223 self.dialect.do_executemany(
-> 1224 cursor, statement, parameters, context
1225 )
1226 elif not parameters and context.no_parameters:
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/base.py in do_executemany(self, cursor, statement, parameters, context)
584
585 def do_executemany(self, cursor, statement, parameters, context=None):
--> 586 cursor.executemany(statement, parameters, context=context)
587
588 def do_execute(self, cursor, statement, parameters, context=None):
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/native/connector.py in executemany(self, operation, seq_of_parameters, context)
167
168 except DriverError as orig:
--> 169 raise DatabaseException(orig)
170
171 self._process_response(response, context)
DatabaseException: Orig exception: Code: 53. Type mismatch in VALUES section. Repeat query with types_check=True for detailed info. Column value: required argument is not an integer
Could you please tell me what I am doing wrong? I just want to insert a None value into nullable columns, like I would do with the original Python clickhouse_driver.
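The root cause is visible at the bottom of the driver's traceback: a non-Nullable Int32 column is packed with struct, which rejects None. A minimal reproduction of just that step:

```python
import struct

int32 = struct.Struct('<i')  # Int32, the way clickhouse-driver packs it

try:
    int32.pack(None)
except struct.error as e:
    print('struct.error:', e)  # e.g. "required argument is not an integer"

print(int32.pack(1))  # b'\x01\x00\x00\x00'
```

On the ClickHouse side the column must actually be Nullable; with this dialect that appears to mean wrapping the type, e.g. types.Nullable(types.Int32), rather than relying on nullable=True alone (my assumption from the dialect's type module, not a confirmed fix).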
Our project uses a database extended from ClickHouse. In our database, there is an engine named Graph to support graph data, and we can use a statement like this to create a graph table:
create table graph1 (w Float64) engine=Graph;
Is it possible to use clickhouse-sqlalchemy in this case?
Hey @xzkostyan !
It misses enum34
, even on Python 3.6.
AFAIK it should not, according to the driver's setup.py.
Ideas?
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 387, in create_engine
return strategy.create(*args, **kwargs)
File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 56, in create
entrypoint = u._get_entrypoint()
File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/sqlalchemy/engine/url.py", line 139, in _get_entrypoint
cls = registry.load(name)
File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 214, in load
return impl.load()
File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2315, in load
self.require(*args, **kwargs)
File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2338, in require
items = working_set.resolve(reqs, env, installer, extras=self.extras)
File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/pkg_resources/__init__.py", line 858, in resolve
raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'enum34' distribution was not found and is required by clickhouse-driver
Currently, it doesn't appear to be possible to specify column compression codecs when using declarative tables. For many columns, monotonically increasing DateTime
columns in particular, such codecs can result in extreme compression-rate and performance improvements.
Would you in general like to support these? I'd be willing to contribute the required changes, however my experience with SQLAlchemy is rather limited, so I'd need some guidance on how that would have to be designed. It appears to be possible to handle extra arguments passed to the Column
constructor, and adding an argument such as codec=['DoubleDelta', 'ZSTD']
looks like the way to go to me personally, but I might be horribly wrong.
I am trying to run this with ClickHouse, starting with just a select 1
and a select * from mytable limit 1
.
When I run these or even more complex queries in Postman using the HTTP API directly, it works:
POST /?database=mydatabase&query_id=a HTTP/1.1
Host: localhost:8123
Authorization: Basic .....
Cache-Control: no-cache
Postman-Token: 2fa39919-4195-42e7-a182-b61ba28596c5
select * from mytable limit 1
But when I do this it just gives me an empty set:
In [5]: import sqlalchemy
In [6]: eng = sqlalchemy.create_engine("clickhouse://default:@localhost:8123/mydatabase")
In [7]: res = eng.execute("select 1")
In [8]: list(res)
Out[8]: []
On further checking, when I do:
LIMIT 2 -> I get back 0 records
LIMIT 3 -> I get back 1 record
LIMIT 4 -> I get back 2 records
LIMIT 5 -> I get back 3 records
When I check using the APIs, I can see that the first 2 rows of any query I write seem to be missing. I even did a show tables
and validated this.
When I do res.keys()
, I can see that the first row is being treated as the header or column names.
ClickHouse server version 1.1.54380
python 2.7.14
clickhouse-sqlalchemy (0.0.5)
sqlalchemy (1.2.6)
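The off-by-two pattern is consistent with the server replying in TabSeparatedWithNamesAndTypes, where the first two lines are column names and column types rather than data; a parser that treats only line one as a header loses a row and mangles the rest. A sketch of parsing that format correctly (the payload below is made up):

```python
# Simulated TabSeparatedWithNamesAndTypes response body.
raw = (
    'id\tname\n'        # line 1: column names
    'UInt32\tString\n'  # line 2: column types
    '1\talice\n'
    '2\tbob\n'
)

lines = raw.rstrip('\n').split('\n')
names = lines[0].split('\t')    # header line
types_ = lines[1].split('\t')   # types line: must be skipped, not returned as data
rows = [line.split('\t') for line in lines[2:]]

print(names)   # ['id', 'name']
print(types_)  # ['UInt32', 'String']
print(rows)    # [['1', 'alice'], ['2', 'bob']]
```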
When joining two tables, I found that columns are not prefixed with the table name. This does not work for ClickHouse and fails with the following error:
org.jkiss.dbeaver.model.sql.DBSQLException: SQL Error [352]:
ClickHouse exception, code: 352, host: 172.0.0.1, port: xxxx; Code: 352,
e.displayText() = DB::Exception: Column 'user_id' is ambiguous
(version 19.16.7.24 (official build))
This is because my SQL has the same column name in the two joined tables.
The related code in this repository is in /clickhouse-sqlalchemy/drivers/base.py:
def visit_column(self, column, include_table=True, **kwargs):
# Columns prefixed with table name are not supported
return super(ClickHouseCompiler, self).visit_column(
column, include_table=False, **kwargs
)
The include_table argument is always forced to False in this function.
Hi 👋, I wrote the https://github.com/cloudflare/sqlalchemy-clickhouse dialect and was wondering if it made sense to merge both of these projects? I see you use pretty much the same dialect, but with a way nicer connector. The HTTPS and support for native driver is neat. I'd very much appreciate help maintaining the project instead of having to cherry-pick patches between two very similar projects.
What do you think?
Hi,
Sorry for such a basic question: I am building a setup using the native interface (non-HTTP) with a ClickHouse server running 18.12.17 revision 54407.
I get this message in return while testing connectivity (using Superset):
Source db config looks like:
clickhouse+native://[email protected]
ERROR: {"error": "Connection failed!\n\nThe error message returned was:\n'NoneType' object has no attribute 'encode'"}
On the server side, I do have (hiding my IP address):
2018.09.21 09:26:16.777616 [ 732 ] <Trace> TCPHandlerFactory: TCP Request. Address: [::ffff:XX.XXX.XXX.XXX]:55812
2018.09.21 09:26:16.777857 [ 732 ] <Warning> TCPHandler: Client has gone away.
2018.09.21 09:26:16.777933 [ 732 ] <Information> TCPHandler: Done processing connection.
When I do a basic telnet on port 9000 to check connectivity I do have this:
telnet side:
Escape character is '^]'.
�e�DB::NetException/DB::NetException: Unexpected packet from clientusr/bin/clickhouse-server(StackTrace::StackTrace()+0x16) [0x9491e46]
1. /usr/bin/clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)+0x22) [0x3019952]
2. /usr/bin/clickhouse-server(DB::TCPHandler::receiveHello()+0xce) [0x3023fae]
3. /usr/bin/clickhouse-server(DB::TCPHandler::runImpl()+0x1c4) [0x3026e64]
4. /usr/bin/clickhouse-server(DB::TCPHandler::run()+0x2b) [0x302836b]
5. /usr/bin/clickhouse-server(Poco::Net::TCPServerConnection::start()+0xf) [0x96107df]
6. /usr/bin/clickhouse-server(Poco::Net::TCPServerDispatcher::run()+0x16a) [0x9610bba]
7. /usr/bin/clickhouse-server(Poco::PooledThread::run()+0x77) [0x972d6b7]
8. /usr/bin/clickhouse-server(Poco::ThreadImpl::runnableEntry(void*)+0x38) [0x9729228]
9. /usr/bin/clickhouse-server() [0x9e659ef]
10. /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f2ea6e166ba]
11. /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f2ea643741d]
Connection closed by foreign host.
server side:
2018.09.21 09:35:27.908995 [ 29 ] <Error> ServerErrorHandler: Code: 101, e.displayText() = DB::NetException: Unexpected packet from client, e.what() = DB::NetException, Stack trace:
0. /usr/bin/clickhouse-server(StackTrace::StackTrace()+0x16) [0x9491e46]
1. /usr/bin/clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)+0x22) [0x3019952]
2. /usr/bin/clickhouse-server(DB::TCPHandler::receiveHello()+0xce) [0x3023fae]
3. /usr/bin/clickhouse-server(DB::TCPHandler::runImpl()+0x1c4) [0x3026e64]
4. /usr/bin/clickhouse-server(DB::TCPHandler::run()+0x2b) [0x302836b]
5. /usr/bin/clickhouse-server(Poco::Net::TCPServerConnection::start()+0xf) [0x96107df]
6. /usr/bin/clickhouse-server(Poco::Net::TCPServerDispatcher::run()+0x16a) [0x9610bba]
7. /usr/bin/clickhouse-server(Poco::PooledThread::run()+0x77) [0x972d6b7]
8. /usr/bin/clickhouse-server(Poco::ThreadImpl::runnableEntry(void*)+0x38) [0x9729228]
9. /usr/bin/clickhouse-server() [0x9e659ef]
10. /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f2ea6e166ba]
11. /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f2ea643741d]
2018.09.21 09:35:29.654793 [ 732 ] <Trace> TCPHandlerFactory: TCP Request. Address: [::ffff:XX.XXX.XXX.XXX]:55822
Which tends to demonstrate that connectivity is fine...
Do you think it could be a version support issue?
Thanks, Matt.
If I run any query using an HTTP session's execute
method, I get this traceback:
File "/home/anton/Projects/clickhouse-sqlalchemy/tests/sql/test_schema.py", line 51, in test_reflect
session.execute(text('select 1'))
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 1187, in execute
bind, close_with_result=True).execute(clause, params or {})
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 1051, in _connection_for_bind
engine, execution_options)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 409, in _connection_for_bind
conn = bind.contextual_connect()
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 2123, in contextual_connect
self._wrap_pool_connect(self.pool.connect, None),
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 2158, in _wrap_pool_connect
return fn()
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 355, in connect
return _ConnectionFairy._checkout(self)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 743, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 484, in checkout
rec = pool._do_get()
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/impl.py", line 128, in _do_get
self._dec_overflow()
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/util/langhelpers.py", line 66, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/util/compat.py", line 249, in reraise
raise value
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/impl.py", line 125, in _do_get
return self._create_connection()
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 302, in _create_connection
return _ConnectionRecord(self)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 429, in __init__
self.__connect(first_connect_check=True)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 636, in __connect
exec_once(self.connection, self)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/event/attr.py", line 274, in exec_once
self(*args, **kw)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/event/attr.py", line 284, in __call__
fn(*args, **kw)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/util/langhelpers.py", line 1336, in go
return once_fn(*arg, **kw)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/strategies.py", line 184, in first_connect
dialect.initialize(c)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/default.py", line 264, in initialize
self._get_server_version_info(connection)
File "/home/anton/Projects/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/base.py", line 698, in _get_server_version_info
version = connection.scalar('select version()')
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 880, in scalar
return self.execute(object, *multiparams, **params).scalar()
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 942, in execute
return self._execute_text(object, multiparams, params)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1104, in _execute_text
statement, parameters
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1200, in _execute_context
context)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1416, in _handle_dbapi_exception
util.reraise(*exc_info)
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/util/compat.py", line 249, in reraise
raise value
File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1193, in _execute_context
context)
File "/home/anton/Projects/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/base.py", line 689, in do_execute
cursor.execute(statement, parameters, context=context)
File "/home/anton/Projects/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/http/connector.py", line 105, in execute
self._process_response(response_gen)
File "/home/anton/Projects/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/http/connector.py", line 193, in _process_response
self._columns = next(response, None)
RuntimeError: generator raised StopIteration
It happens with both session.execute(text('select 1'))
and session.execute('select 1')
calls.
The reason is here: clickhouse_sqlalchemy.drivers.http.transport.RequestsTransport#execute
.
Looks like the statement format TabSeparatedWithNamesAndTypes
must be specified for all queries.
I see two options:
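The RuntimeError itself is PEP 479 behaviour: on Python 3.7+ a StopIteration that escapes a generator body is converted to RuntimeError, so the transport's generator has to return cleanly instead of letting a bare next() bubble up. A generic sketch of the pattern (not the actual transport code; parse_lines is a made-up name):

```python
def parse_lines(chunks):
    """Yield transformed items from an iterable; PEP 479-safe."""
    it = iter(chunks)
    while True:
        try:
            chunk = next(it)
        except StopIteration:
            # Letting StopIteration escape would raise
            # "RuntimeError: generator raised StopIteration" on 3.7+.
            return
        yield chunk.upper()


print(list(parse_lines(['a', 'b'])))  # ['A', 'B']
print(list(parse_lines([])))          # []
```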
For some reason (unknown to me), using pd.read_sql
drops the second record and uses the first one as a header. Any idea why this is happening? Here's sample code using the Rate
example from earlier:
import pandas as pd

query = """
select day, value from rate limit 2;
"""
# engine: a clickhouse-sqlalchemy engine created elsewhere
df = pd.read_sql(query, engine)
print(df)
returns
Empty DataFrame
Columns: [2017-12-01, 200]
Index: []
Thanks!
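A guess at the mechanism, as a pure-Python sketch (the parser below is a hypothetical simplification, not the actual driver code): if the cursor always strips the first two response lines as the names/types header, but the server answered in plain TabSeparated, the two data rows from LIMIT 2 are consumed as metadata and nothing is left for pandas:

```python
def parse_names_and_types(lines):
    # Hypothetical simplification of a cursor that expects the
    # TabSeparatedWithNamesAndTypes layout: line 1 = names, line 2 = types.
    names = lines[0].split('\t')
    types = lines[1].split('\t')
    rows = [tuple(line.split('\t')) for line in lines[2:]]
    return names, types, rows

# The server actually replied in plain TabSeparated (no header lines):
plain = ['2017-12-01\t100', '2017-12-02\t200']
names, types, rows = parse_names_and_types(plain)
print(names, rows)  # ['2017-12-01', '100'] [] -- data became "column names"
```

This would explain an empty DataFrame whose column labels are data values.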
Ok, so here comes a nice one: we are using the HTTP API mode, not native, with ClickHouse 20.1.4.1; clickhouse-sqlalchemy was installed today with pip, so it should be the latest official version available there.
When querying the DB, data seems to be missing and column headers are corrupt:
The first 2 rows are broken and are not interpreted as real data, so results with fewer than 2 rows show no data. This does not depend on the query, table, or data; it is the overall behaviour. As a cross-check I used JDBC/DBeaver, and the results look fine there.
The following table has 7 entries with IDs from 1 to 7:
The count even shows no data at all:
Verification with DBeaver and the JDBC driver looks good:
Also, sometimes the column header is broken and seems to "merge" some columns. Note that "1_1", "0_1", "0_2" and "1_2" are int values from the first two rows of the result set concatenated with an underscore!
Some of this seems to be related to varchars (these are also shown as NullType in the column overview, so some metadata seems broken or at least affected). Hash is a varchar:
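The "merged" labels are consistent with the same off-by-two parsing: if the first two data rows are mistaken for the names/types header lines and joined per column, you get exactly labels like "1_1" and "0_2". A hypothetical illustration (the sample values are invented):

```python
# Two real data rows (ints arriving as strings) wrongly taken as header rows:
first_two_rows = [('1', '0'), ('1', '2')]

# Joining the two "header" rows column-by-column with an underscore:
merged_headers = ['_'.join(col) for col in zip(*first_two_rows)]
print(merged_headers)  # ['1_1', '0_2']
```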
I'm trying to reflect the existing table using SQL Expression Language:
from sqlalchemy import create_engine, MetaData, Table
engine = create_engine('clickhouse://default:@localhost:8123/MyDatabase')
metadata = MetaData()
MyTable = Table('MyTable', metadata, autoload=True, autoload_with=engine)
But the newly created Table() object does not contain any info about the table engine (the engine of the actual table is ReplacingMergeTree). I tried to get the engine info with the reflection Inspector:
from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.engine import reflection
engine = create_engine('clickhouse://default:@localhost:8123/MyDatabase')
insp = reflection.Inspector.from_engine(engine)
print(insp.get_table_options('MyTable'))
It's possible to get, e.g., the MySQL table engine this way (get_table_options() returns {'mysql_engine': 'InnoDB'}), but I can't do the same for ClickHouse. Is there any way?
I'm new to SQLAlchemy and clickhouse-sqlalchemy, so please forgive me if I misunderstood something.
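As a workaround (not the Inspector API), the engine can be read directly from ClickHouse's system.tables, whose engine and engine_full columns are part of the standard system schema. A sketch that only builds the query string, for trusted identifiers; executing it via conn.execute(text(...)) on the engine from the snippet above is left to the caller:

```python
def table_engine_query(database, table):
    """Build a query against ClickHouse's system.tables to fetch a table's
    engine. system.tables, engine and engine_full are standard ClickHouse
    names; database/table are assumed to be trusted identifiers."""
    return (
        "SELECT engine, engine_full FROM system.tables "
        "WHERE database = '{0}' AND name = '{1}'".format(database, table)
    )

print(table_engine_query('MyDatabase', 'MyTable'))
```

For a ReplacingMergeTree table, the first column of the result would be 'ReplacingMergeTree'.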
Hello.
With the HTTP version of the protocol I get data in reverse order. The problem appears in version 0.1.0 and above.
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker
engine = create_engine(uri)
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
print(list(session.execute(text('select x from test_table order by x'))))
Result with version 0.0.10:
[(1,), (2,), (3,), (4,)]
Result with version 0.1.2:
[(4,), (3,), (2,), (1,)]
Table definition:
create table test_table
(
x Int32
)
engine = Memory;
The likely cause is a change in the method clickhouse_sqlalchemy.drivers.http.connector.Cursor._process_response: previously the response from ClickHouse was reversed, and newer versions removed that.
python3.6
SQLAlchemy==1.3.13
Thanks.
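The regression can be shown without a server. This pure-Python sketch is a hypothetical simplification of buffering the streamed rows with and without a final reversal, and it reproduces both observed outputs:

```python
streamed = ['1', '2', '3', '4']  # rows exactly as the server sends them

# Behaviour reported for 0.1.2: rows come back reversed.
rows_0_1_2 = [(int(v),) for v in reversed(streamed)]
# Behaviour reported for 0.0.10: rows keep the ORDER BY order.
rows_0_0_10 = [(int(v),) for v in streamed]

print(rows_0_0_10)  # [(1,), (2,), (3,), (4,)]
print(rows_0_1_2)   # [(4,), (3,), (2,), (1,)]
```

A regression test asserting the 0.0.10 ordering would catch this class of change.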