Coder Social home page Coder Social logo

Support for duckdb about optuna HOT 3 OPEN

jerome-f avatar jerome-f commented on May 28, 2024
Support for duckdb

from optuna.

Comments (3)

not522 avatar not522 commented on May 28, 2024

Thank you for your feedback! I'm not familiar with duckdb, but I think it already works using URL (duckdb:///...) as the storage url if the SQLAlchemy support is implemented.

from optuna.

jerome-f avatar jerome-f commented on May 28, 2024

@not522 thanks for looking into it and yes sqlalchemy does support duckdb but I got errors in API call. duckdb is same convention as sqlite but allows for seamless integration with many python libraries (pandas, parquet I/O) and avoids database lock issue and it is more efficient, fast and small disk footprint. I ran the following code

import optuna

def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2
storage = optuna.storages.RDBStorage(url=f"duckdb:///test.db")
study = optuna.load_study(study_name="distributed-example", storage=storage)
study.optimize(objective, n_trials=100)

and it generates the following error which is related to the SERIAL keyword in db creation. In DuckDB the keyword INTEGER or BIGINT along with the AUTO_INCREMENT keyword to achieve similar functionality as SERIAL. But I am not familiar with api calls to sqlalchemy so posted this here.

---------------------------------------------------------------------------
CatalogException                          Traceback (most recent call last)
File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1971, in Connection._exec_single_context(self, dialect, context, statement, parameters)
   1970     if not evt_handled:
-> 1971         self.dialect.do_execute(
   1972             cursor, str_statement, effective_parameters, context
   1973         )
   1975 if self._has_events or self.engine._has_events:

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/default.py:919, in DefaultDialect.do_execute(self, cursor, statement, parameters, context)
    918 def do_execute(self, cursor, statement, parameters, context=None):
--> 919     cursor.execute(statement, parameters)

File /mambaforge/lib/python3.11/site-packages/duckdb_engine/__init__.py:163, in ConnectionWrapper.execute(self, statement, parameters, context)
    162     else:
--> 163         self.__c.execute(statement, parameters)
    164 except RuntimeError as e:

CatalogException: Catalog Error: Type with name SERIAL does not exist!
Did you mean "real"?

The above exception was the direct cause of the following exception:

ProgrammingError                          Traceback (most recent call last)
Cell In[3], line 6
      4     x = trial.suggest_float("x", -10, 10)
      5     return (x - 2) ** 2
----> 6 storage = optuna.storages.RDBStorage(url=f"duckdb:///test.db")
      7 study = optuna.load_study(study_name="distributed-example", storage=storage)
      8 study.optimize(objective, n_trials=100)

File /mambaforge/lib/python3.11/site-packages/optuna/storages/_rdb/storage.py:229, in RDBStorage.__init__(self, url, engine_kwargs, skip_compatibility_check, heartbeat_interval, grace_period, failed_trial_callback, skip_table_creation)
    225 self.scoped_session = sqlalchemy_orm.scoped_session(
    226     sqlalchemy_orm.sessionmaker(bind=self.engine)
    227 )
    228 if not skip_table_creation:
--> 229     models.BaseModel.metadata.create_all(self.engine)
    231 self._version_manager = _VersionManager(self.url, self.engine, self.scoped_session)
    232 if not skip_compatibility_check:

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/sql/schema.py:5825, in MetaData.create_all(self, bind, tables, checkfirst)
   5801 def create_all(
   5802     self,
   5803     bind: _CreateDropBind,
   5804     tables: Optional[_typing_Sequence[Table]] = None,
   5805     checkfirst: bool = True,
   5806 ) -> None:
   5807     """Create all tables stored in this metadata.
   5808 
   5809     Conditional by default, will not attempt to recreate tables already
   (...)
   5823 
   5824     """
-> 5825     bind._run_ddl_visitor(
   5826         ddl.SchemaGenerator, self, checkfirst=checkfirst, tables=tables
   5827     )

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/base.py:3255, in Engine._run_ddl_visitor(self, visitorcallable, element, **kwargs)
   3248 def _run_ddl_visitor(
   3249     self,
   3250     visitorcallable: Type[Union[SchemaGenerator, SchemaDropper]],
   3251     element: SchemaItem,
   3252     **kwargs: Any,
   3253 ) -> None:
   3254     with self.begin() as conn:
-> 3255         conn._run_ddl_visitor(visitorcallable, element, **kwargs)

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/base.py:2461, in Connection._run_ddl_visitor(self, visitorcallable, element, **kwargs)
   2449 def _run_ddl_visitor(
   2450     self,
   2451     visitorcallable: Type[Union[SchemaGenerator, SchemaDropper]],
   2452     element: SchemaItem,
   2453     **kwargs: Any,
   2454 ) -> None:
   2455     """run a DDL visitor.
   2456 
   2457     This method is only here so that the MockConnection can change the
   2458     options given to the visitor so that "checkfirst" is skipped.
   2459 
   2460     """
-> 2461     visitorcallable(self.dialect, self, **kwargs).traverse_single(element)

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/sql/visitors.py:664, in ExternalTraversal.traverse_single(self, obj, **kw)
    662 meth = getattr(v, "visit_%s" % obj.__visit_name__, None)
    663 if meth:
--> 664     return meth(obj, **kw)

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/sql/ddl.py:918, in SchemaGenerator.visit_metadata(self, metadata)
    916 for table, fkcs in collection:
    917     if table is not None:
--> 918         self.traverse_single(
    919             table,
    920             create_ok=True,
    921             include_foreign_key_constraints=fkcs,
    922             _is_metadata_operation=True,
    923         )
    924     else:
    925         for fkc in fkcs:

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/sql/visitors.py:664, in ExternalTraversal.traverse_single(self, obj, **kw)
    662 meth = getattr(v, "visit_%s" % obj.__visit_name__, None)
    663 if meth:
--> 664     return meth(obj, **kw)

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/sql/ddl.py:956, in SchemaGenerator.visit_table(self, table, create_ok, include_foreign_key_constraints, _is_metadata_operation)
    947 if not self.dialect.supports_alter:
    948     # e.g., don't omit any foreign key constraints
    949     include_foreign_key_constraints = None
    951 CreateTable(
    952     table,
    953     include_foreign_key_constraints=(
    954         include_foreign_key_constraints
    955     ),
--> 956 )._invoke_with(self.connection)
    958 if hasattr(table, "indexes"):
    959     for index in table.indexes:

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/sql/ddl.py:314, in ExecutableDDLElement._invoke_with(self, bind)
    312 def _invoke_with(self, bind):
    313     if self._should_execute(self.target, bind):
--> 314         return bind.execute(self)

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1422, in Connection.execute(self, statement, parameters, execution_options)
   1420     raise exc.ObjectNotExecutableError(statement) from err
   1421 else:
-> 1422     return meth(
   1423         self,
   1424         distilled_parameters,
   1425         execution_options or NO_OPTIONS,
   1426     )

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/sql/ddl.py:180, in ExecutableDDLElement._execute_on_connection(self, connection, distilled_params, execution_options)
    177 def _execute_on_connection(
    178     self, connection, distilled_params, execution_options
    179 ):
--> 180     return connection._execute_ddl(
    181         self, distilled_params, execution_options
    182     )

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1533, in Connection._execute_ddl(self, ddl, distilled_parameters, execution_options)
   1528 dialect = self.dialect
   1530 compiled = ddl.compile(
   1531     dialect=dialect, schema_translate_map=schema_translate_map
   1532 )
-> 1533 ret = self._execute_context(
   1534     dialect,
   1535     dialect.execution_ctx_cls._init_ddl,
   1536     compiled,
   1537     None,
   1538     exec_opts,
   1539     compiled,
   1540 )
   1541 if self._has_events or self.engine._has_events:
   1542     self.dispatch.after_execute(
   1543         self,
   1544         ddl,
   (...)
   1548         ret,
   1549     )

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1850, in Connection._execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw)
   1848     return self._exec_insertmany_context(dialect, context)
   1849 else:
-> 1850     return self._exec_single_context(
   1851         dialect, context, statement, parameters
   1852     )

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1990, in Connection._exec_single_context(self, dialect, context, statement, parameters)
   1987     result = context._setup_result_proxy()
   1989 except BaseException as e:
-> 1990     self._handle_dbapi_exception(
   1991         e, str_statement, effective_parameters, cursor, context
   1992     )
   1994 return result

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/base.py:2357, in Connection._handle_dbapi_exception(self, e, statement, parameters, cursor, context, is_sub_exec)
   2355 elif should_wrap:
   2356     assert sqlalchemy_exception is not None
-> 2357     raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
   2358 else:
   2359     assert exc_info[1] is not None

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1971, in Connection._exec_single_context(self, dialect, context, statement, parameters)
   1969                 break
   1970     if not evt_handled:
-> 1971         self.dialect.do_execute(
   1972             cursor, str_statement, effective_parameters, context
   1973         )
   1975 if self._has_events or self.engine._has_events:
   1976     self.dispatch.after_cursor_execute(
   1977         self,
   1978         cursor,
   (...)
   1982         context.executemany,
   1983     )

File /mambaforge/lib/python3.11/site-packages/sqlalchemy/engine/default.py:919, in DefaultDialect.do_execute(self, cursor, statement, parameters, context)
    918 def do_execute(self, cursor, statement, parameters, context=None):
--> 919     cursor.execute(statement, parameters)

File /mambaforge/lib/python3.11/site-packages/duckdb_engine/__init__.py:163, in ConnectionWrapper.execute(self, statement, parameters, context)
    161         self.__c.execute(statement)
    162     else:
--> 163         self.__c.execute(statement, parameters)
    164 except RuntimeError as e:
    165     if e.args[0].startswith("Not implemented Error"):

ProgrammingError: (duckdb.duckdb.CatalogException) Catalog Error: Type with name SERIAL does not exist!
Did you mean "real"?
[SQL: 
CREATE TABLE studies (
	study_id SERIAL NOT NULL, 
	study_name VARCHAR(512) NOT NULL, 
	PRIMARY KEY (study_id)
)

]
(Background on this error at: https://sqlalche.me/e/20/f405)

from optuna.

not522 avatar not522 commented on May 28, 2024

Sorry for my late response and thank you for your error report. In my understanding, this issue should be addressed on the duckdb side, and I think it's been discussed recently. ref:

from optuna.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.