Comments (11)
Cool. I'm going to PyCon this week, so it may be a couple of weeks. :)
from databricks-sql-python.
Thanks for the insight and links. Our focus when writing the initial SQLAlchemy dialect for this package was to functionally replace the community-made sqlalchemy-databricks PyPI package, which is no longer maintained. Full implementation of the SQLAlchemy API wasn't our target, although long-term that's desirable. For now, development has been driven by specific customer use cases.
Would you like to have a PR that sets up the dialect-compliance tests?
Yes that would be a terrific contribution.
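For context, SQLAlchemy's dialect-compliance suite is conventionally wired up through a [sqla_testing] section in setup.cfg (plus a conftest.py that re-exports SQLAlchemy's pytest plugin). A sketch of the config side; the dotted class path and profile file location are placeholders, not the actual layout of this repo:

```ini
[sqla_testing]
# Dotted path to this dialect's Requirements class (placeholder path).
requirement_cls = databricks.sqlalchemy.requirements:Requirements
# Where the suite records per-backend test profiles (placeholder path).
profile_file = test/profiles.txt
```

The Requirements class named there subclasses sqlalchemy.testing.requirements.SuiteRequirements and toggles individual compliance tests on or off for the backend.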
I started working on this. With a clean checkout, the unit tests run fine. When I ran the end-to-end tests, which by default only run the SQLAlchemy tests, I got lots of errors like:

Catalog 'None' plugin class not found: spark.sql.catalog.None is not defined

Running export catalog=hive_metastore got me past those errors, after which I got a lot of:

Database 'none' not found

which were fixed by export schema=default.

I'll update CONTRIBUTING.md to mention setting catalog and schema.
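For anyone following along, the two environment variables above can be set in one go before invoking the suite. The values here are just the ones from this thread; any reachable catalog/schema pair should work, subject to the Unity Catalog caveat discussed later:

```shell
# Environment read by the e2e/SQLAlchemy tests discussed in this thread.
# hive_metastore/default are the values that got past the initial errors.
export catalog=hive_metastore
export schema=default
echo "catalog=$catalog schema=$schema"  # prints: catalog=hive_metastore schema=default
```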
Finally, I'm getting:
self = <databricks.sql.thrift_backend.ThriftBackend object at 0x7f28145012a0>
op_handle = TOperationHandle(operationId=THandleIdentifier(guid=b'RJ"M\x82\x19G\xea\x92\xb3\x80\xfa\xc1R\x02\x15', secret=b'\xb7\x...x8dM\xb4\x92\xcb\xa9\x83\x03i\xefC', executionVersion=None), operationType=0, hasResultSet=True, modifiedRowCount=None)
get_operations_resp = TGetOperationStatusResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=No...5)\n\t... 16 more\n", responseValidation=None, idempotencyType=None, statementTimeout=None, statementTimeoutLevel=None)
def _check_command_not_in_error_or_closed_state(
self, op_handle, get_operations_resp
):
if get_operations_resp.operationState == ttypes.TOperationState.ERROR_STATE:
if get_operations_resp.displayMessage:
> raise ServerOperationError(
get_operations_resp.displayMessage,
{
"operation-id": op_handle and op_handle.operationId.guid,
"diagnostic-info": get_operations_resp.diagnosticInfo,
},
)
E sqlalchemy.exc.DatabaseError: (databricks.sql.exc.ServerOperationError)
E [PARSE_SYNTAX_ERROR] Syntax error at or near '('(line 6, pos 13)
E
E == SQL ==
E
E CREATE TABLE `PySQLTest_1683167898` (
E name STRING NOT NULL,
E episodes INT,
E some_bool BOOLEAN,
E PRIMARY KEY (name)
E -------------^^^
E ) USING DELTA
E
E [SQL:
E CREATE TABLE `PySQLTest_1683167898` (
E name STRING NOT NULL,
E episodes INT,
E some_bool BOOLEAN,
E PRIMARY KEY (name)
E ) USING DELTA
E
E ]
E (Background on this error at: https://sqlalche.me/e/14/4xp6)
src/databricks/sql/thrift_backend.py:484: DatabaseError
Any suggestions?
Hi @unj1m, sorry for the late response. Yes, that error is happening because you're running Unity Catalog tests against a hive_metastore catalog. In the Unity Catalog world, the hive_metastore catalog is dedicated to maintaining backwards compatibility with pre-UC metastores. It doesn't implement any of the new features of UC, including primary keys. For this to work, you need to set the catalog to any other catalog in a UC-enabled SQL warehouse or endpoint.
[edit] Just not the samples catalog, since it's read-only.
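To make the catalog choice concrete, here is a sketch of building the dialect's connection URL with an explicit UC catalog. The host, token, http_path, and catalog name are all placeholder values; the query-parameter names follow the databricks://token:...@host?http_path=...&catalog=...&schema=... pattern the connector documents:

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Placeholder credentials and paths for illustration only.
host = "dbc-xxxxxxxx-xxxx.cloud.databricks.com"
token = "dapiXXXX"
params = {
    "http_path": "/sql/1.0/warehouses/abc123",
    "catalog": "main",    # any UC catalog; not hive_metastore, not samples
    "schema": "default",
}

# Assemble the SQLAlchemy URL for the databricks dialect.
url = f"databricks://token:{token}@{host}?{urlencode(params)}"

# Sanity-check that the catalog made it into the query string.
assert parse_qs(urlparse(url).query)["catalog"] == ["main"]
```

Passing this URL to sqlalchemy.create_engine should then run DDL like the failing CREATE TABLE above against a catalog that actually supports PRIMARY KEY constraints.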
No worries. Thanks for clarifying.
Let me know if you need a hand with anything @unj1m!
> Let me know if you need a hand with anything @unj1m!
@susodapop Thanks!
Here's a PR that sets up the tests to run.
See the note there about updating requirements.py. It would be good for someone who knows Databricks well to update that.
Current status:
73 failed, 201 passed, 277 skipped, 1 warning, 63 errors in 2550.73s (0:42:30)
which is a pretty good start. :)
Yes that's a very good start. I will be reviewing this PR carefully, particularly combing out the many references to bigquery. More soon. Thanks!
> particularly combing out the many references to bigquery
Yeah, sorry about that. :(