elixir-ecto / db_connection Goto Github PK
View Code? Open in Web Editor NEWDatabase connection behaviour
Home Page: http://hexdocs.pm/db_connection/DBConnection.html
Database connection behaviour
Home Page: http://hexdocs.pm/db_connection/DBConnection.html
See discussion here: https://elixirforum.com/t/phoenix-testing-with-ecto-2-sandbox-access-from-processes/9174/9
@fishcakez is this something we can do?
We've seen this error a couple of times in production. Stack trace:
Elixir.ErlangError: Erlang error: {:timeout, {:gen_server, :call, [#PID<0.2180.0>, {:checkout, #Reference<0.2496162780.1105199106.71425>, true, 15000}, 5000]}}
File "lib/db_connection/poolboy.ex", line 112, in DBConnection.Poolboy.checkout/3
File "lib/db_connection.ex", line 920, in DBConnection.checkout/2
File "lib/db_connection.ex", line 742, in DBConnection.run/3
File "lib/db_connection.ex", line 636, in DBConnection.execute/4
File "lib/ecto/adapters/postgres/connection.ex", line 98, in Ecto.Adapters.Postgres.Connection.execute/4
File "lib/ecto/adapters/sql.ex", line 256, in Ecto.Adapters.SQL.sql_call/6
File "lib/ecto/adapters/sql.ex", line 436, in Ecto.Adapters.SQL.execute_or_reset/7
File "lib/ecto/repo/queryable.ex", line 133, in Ecto.Repo.Queryable.execute/5
Any ideas why this might be happening and what we can do to resolve it?
I am considering if we should move backoff to the repository as db_connection_backoff.erl for two reasons:
Thoughts?
DB Connection version: 0.2.5
Opening this issue to better track it.
Using manual checkout mode. Haven't been able to track down exactly how or when this happens yet.
11:35:19.122 [error] Postgrex.Protocol (#PID<0.1074.0>) disconnected: ** (DBConnection.Error) client down
11:35:19.142 [error] GenServer #PID<0.1101.0> terminating
** (FunctionClauseError) no function clause matching in anonymous fn/1 in DBConnection.Ownership.Owner.handle_cast/2
(db_connection) lib/db_connection/ownership/owner.ex:170: anonymous fn({{#Reference<0.0.2.4968>, #Reference<0.0.2.4970>}, 15000, {#PID<0.1100.0>, #Reference<0.0.2.4969>}}) in DBConnection.Ownership.Owner.handle_cast/2
(stdlib) queue.erl:369: :queue.filter_f/2
(stdlib) queue.erl:353: :queue.filter/2
(db_connection) lib/db_connection/ownership/owner.ex:178: DBConnection.Ownership.Owner.handle_cast/2
(stdlib) gen_server.erl:615: :gen_server.try_dispatch/4
(stdlib) gen_server.erl:681: :gen_server.handle_msg/5
(stdlib) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
11:35:19.142 [error] #PID<0.1098.0> running SchedWeb.Endpoint terminated
Server: localhost:4001 (http)
Request: POST /login
** (exit) exited in: GenServer.call(#PID<0.1101.0>, {:checkout, #Reference<0.0.5.2433>, true, 15000}, 5000)
** (EXIT) time out
11:35:19.143 [error] Process #PID<0.1100.0> raised an exception
** (Postgrex.Error) tcp recv: closed
(ecto) lib/ecto/adapters/sql.ex:405: Ecto.Adapters.SQL.struct/6
(ecto) lib/ecto/repo/schema.ex:373: Ecto.Repo.Schema.apply/5
(ecto) lib/ecto/repo/schema.ex:179: anonymous fn/11 in Ecto.Repo.Schema.do_insert/4
(ecto) lib/ecto/repo/schema.ex:584: anonymous fn/3 in Ecto.Repo.Schema.wrap_in_transaction/6
(ecto) lib/ecto/adapters/sql.ex:450: anonymous fn/3 in Ecto.Adapters.SQL.do_transaction/3
(db_connection) lib/db_connection.ex:1079: DBConnection.transaction_run/4
(db_connection) lib/db_connection.ex:1009: DBConnection.run_begin/3
(db_connection) lib/db_connection.ex:766: DBConnection.transaction/3
Appears to be a bug in DBConnection.
Running into this issue with 1.1.2
:
λ MIX_ENV=test mix ecto.migrate
** (CaseClauseError) no case clause matching: {:error, :invalid_message}
(db_connection) lib/db_connection/ownership.ex:58: DBConnection.Ownership.ownership_checkout/2
(ecto) lib/ecto/adapters/sql/sandbox.ex:477: Ecto.Adapters.SQL.Sandbox.checkout/2
(ecto) lib/mix/tasks/ecto.migrate.ex:79: anonymous fn/4 in Mix.Tasks.Ecto.Migrate.run/2
(elixir) lib/enum.ex:645: Enum."-each/2-lists^foreach/1-0-"/2
(elixir) lib/enum.ex:645: Enum.each/2
(mix) lib/mix/task.ex:294: Mix.Task.run_task/3
(mix) lib/mix/project.ex:313: Mix.Project.in_project/4
(elixir) lib/file.ex:1162: File.cd!/2
It works fine for ecto.create
but blows up on ecto.migrate
. Here's my configuration:
# test/test_helper.exs
ExUnit.start()
Ecto.Adapters.SQL.Sandbox.mode(Myapp.Repo, :manual)
# config/test.exs
config :myapp, Myapp.Repo,
adapter: Ecto.Adapters.Postgres,
...
pool: Ecto.Adapters.SQL.Sandbox
Related Deps
¯\_(ツ)_/¯
As in @petrikero's talk.
This issue is to record changes required for Ecto integration.
pool
is called pool_mod
in Ecto. Solution: rename in DBConnection to pool
and proxy_mod
to proxy
.pool_timeout
is called queue_timeout
. Solution: rename in DBConnection as is more consistent and should be used in poolboy pool for sync stoppingmax_overflow
is called pool_overflow
. Consider (soft?) deprecate in Ecto as the naming is inconsistent with other options.A query should be carried out as a prepare and a execute so that parameter encoding does not have to be done inside handle_query
where an error will crash the connection. It also means adapters do not have to handle the prepare/encode/execute logic themselves.
However this leads to an issue where un-named queries won't be closed after a query without adding a round trip to explicitly close with handle_close
. One way to solve this is to introduce a handle_execute_close
and execute_close
that allows an adapter to handle the execute and close in an efficient manor. The default implementation provided by __using__
would call handle_execute
and handle_close
.
According to the Ecto docs, the :ownership_timeout
can be configured in config.exs
:
@moduledoc """
# ...
If you have a long running test (or you're debugging with IEx.pry), the timeout for the connection ownership may
be too short. You can increase the timeout by setting the
`:ownership_timeout` options for your repo config in `config/config.exs` (or preferably in `config/test.exs`):
config :my_app, MyApp.Repo,
ownership_timeout: NEW_TIMEOUT_IN_MILLISECONDS
The `:ownership_timeout` option is part of
[`DBConnection.Ownership`](https://hexdocs.pm/db_connection/DBConnection.Ownership.html)
and defaults to 15000ms. Timeouts are given as integers in milliseconds.
# ...
"""
I tried this and could not get it to work. Looking at DBConnection.Ownership.Proxy.start_link/4
and DBConnection.Ownership.Proxy.init/2
, I can see why:
def start_link(manager, caller, pool, pool_opts) do
GenServer.start_link(__MODULE__, {manager, caller, pool, pool_opts}, [])
end
def init(proxy, opts) do
ownership_timeout = opts[:ownership_timeout] || @ownership_timeout
case GenServer.call(proxy, {:init, ownership_timeout}, :infinity) do
:ok -> :ok
{:error, _} = error -> error
end
end
start_link/4
is hardcoded to pass []
as the opts to init/2
.
At a minimum, it would be great if it could be configured via config.exs
as the Ecto docs state. In addition, I would love it if there was a way to configure it for a particular ownership checkout. 15 seconds is reasonable in general for my test suite (excluding a handful of smoke tests I don't usually run locally, my suite runs over 2K tests in 30 seconds) but for the smoke tests--which actually hit S3 to fetch data instead of using a test fake--it can take longer than 15 seconds we often hit the timeout. I would like to be able to pass an alternate timeout in the Ecto.Adapters.SQL.Sandbox.checkout
call (or in Ecto.Adapters.SQL.Sandbox.mode
call, if that makes more sense) so that the longer timeout is only used for a particular test.
When a socket is checked in to the DBConnection.Connection process and no check outs requests are waiting in the internal queue it sets the socket into active: :once
. If there is a client queue in Poolboy's central process a client will immediately be assigned to the connection process, a checkout request sent to the connection, the socket will be set to active: false
immediately and checked out again. Given that idle connections are pinged at :idle_timeout
we could investigate if delaying the check in/ active: :once
until the first idle timeout improves performance under load.
Hi, we are using a module using poolboy based on this project, I notice Sojourn process being started anyway.
From
db_connection/lib/db_connection/app.ex
Line 10 in 188e991
Is it intended? How to prevent Sojourn pool to start if using poolboy? Could we have the pool module as parameter or so instead?
Thanks
Its possible parallel preloads to occur using explicit preloading when the calling process has not used the Ecto sandbox so does not have an associated proxy. In this situation (automatic/shared) the caller should gain ownership or an allowance on a proxy and the preloading process use would then use that proxy for its query.
This timeout should be randomized to prevent sync disconnects.
Proxy modules do not provide a good abstraction for logging because it is devilishly hard to log at the right moments. The two main goals of DBConnection is to separate concerns and make calls to orthogonal parts at the optimum moment. It has failed to do that for logging. It will be much cleaner and more efficient to support logging or metric measuring in DBConnection directly.
@josevalim I agree with your comment on IRC and I think this is the way to go for logging. However if you have other ideas or thoughts please give them.
Hi,
I was trying to follow docs (I was implementing an after_connect
hook) and I noticed example is incorrect:
{ok, query} = DBConnection.prepare(pid, "SELECT id FROM table")
{:ok, result} = DBConnection.execute(pid, query, [])
:ok = DBConnection.close(pid, query)
DBConnection.prepare
must be something that implements DBConnection.Query (e.g. Postgrex.Query
), not a stringDBConnection.execute(pid, query, [], [])
pid
could be called conn
to match function headI'm happy to open up a PR with docs fixes (for some or all points) - what would be the best example of a query
argument to prepare/2
? Not sure it makes to assume Postgrex.Query
, but at the same time it would be good to be concrete.
Hi, i have the next issue with a database name with "-"
config :my_app, MyApp.Repo,
adapter: Tds.Ecto,
username: "sa",
password: "password",
database: "db-name",
hostname: "local.db.sqlserver",
pool_size: 10
The dependencies are:
Resolving Hex dependencies...
Dependency resolution completed:
connection 1.0.4
db_connection 1.1.3
ecto 2.2.8
postgrex 0.13.3
tds 1.0.11
tds_ecto 2.0.7
And the error is:
[error] Tds.Protocol (#PID<0.4038.0>) failed to connect: ** (Tds.Error) Line 10 (Error 102): Incorrect syntax near '-'.
The database is en SQL Server 2008 R2
Elixir 1.4.5
Erlang/OTP 18 [erts-7.3.1] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Eshell V7.3.1
Thanks for your time !
I see in #50, you've added the logging. I am sure that is a good thing! However, now when I run my mix test
, I get a ton of debug log messages:
[08:07:07.584 debug] line=202 module=DBConnection.Ownership.Manager function=owner_down/2
#PID<0.2377.0>, #PID<0.2384.0>, #PID<0.2382.0>, #PID<0.2380.0>, #PID<0.1538.0> lose proxy #PID<0.2378.0>
Perhaps the compiled package on hex can exclude the debug level messages?
I got this very cryptic (to me) error because I missed starting postgrex: http://stackoverflow.com/questions/38010058/shutdown-failed-to-start-child-dbconnection-ownership-manager-after-updating
José got this error, seemingly for similar reasons: elixir-ecto/ecto#1491
Do you think it would be technically feasible and reasonable to improve the error messages in situations like this? As an Elixir newbie, I feel cryptic errors are one of the rougher edges of the ecosystem.
EDIT: Or is this maybe something that Postgrex could/should improve, rather than db_connection? I guess what's happening is roughly that some Postgrex function is called and exits cryptically in the absence of certain running Postgrex processes?
I was talking with @fishcakez in elixir's slack and lamenting the difficulty in making use of Prepared statements when not using Ecto. It would be nice if DBConnection provided some of the facilities and part of the implementation to make this work across drives.
I think it's a common case that people want to prepare statements on startup and then execute those statements at runtime. We've handled this internally be having a macro register the name + sql, on startup, preparing the sql and storing the query into ETS. We can then execute a query by name which pulls the query from ETS.
I could see people struggling with setting this up. A part of the challenge is defining the prepared statement but deferring the actual preparation until the DB is up (and what if you have multiple databases, you need a way to specify which one you mean).
I am using db_connection 0.2.5 with postgrex 0.11.1 and elixir 1.2.5. I am aware that there's a newer version of db_connection available but the issue is so significant that I have decided to report it anyway.
Today our app located on heroku has silently stopped working. It apparently happened after postgresql on heroku was temporarily unavailable. After certain amount of retries it just reported "Postgrex.Protocol (#PID<0.428.0>) disconnected: ** (DBConnection.Error) client down".
It is rather critical issue as because the app is not crashing totally, heroku does not restart it so it continues to operate without notification about issue besides logs.
The log file is attached.
radiokit-agenda-prod_2016-06-20_022140_2016-06-20_022240.log.gz
Is there an ETA on the latest release from the master branch? The new error handling for connections {:error, msg}
is needed for a DB driver using this repo!
Are DBConnection.prepare/3
and DBConnection.execute/4
expected to raise errors? If this is the case then why there are prepare!/3
and execute!/4
functions there?
We should do this here or in Ecto. Keep in mind we don't need to keep Ecto API, we could for example ask a tuple of {Mod, fun, args}
to be given on configuration.
Hey
Just had problem with PostgreSQL not running correctly in the background. Error I got was the same as in title which to be honest wasn't really easy to understand. Maybe it would be good to change this message a little?
From what I see this message is used in two places in this repo
Message could include possible solutions ("check if database is working" for example).
Cheers
Today we get:
** (DBConnection.Error) client timeout
Could we say explicitly it is a transaction timeout?
When doing similar async tasks to parallel preloads in Ecto it would be useful to use the caller option. We may need to check/test that caller works if a child task uses parallel preloads in Ecto.
/cc @josevalim @nathany
Hi,
I am working on a project which uses Ecto and MariaDB in the backend. Our MariaDB instance is provided by Amazon RDS. Recently, we have enabled multi az on our RDS instances which enables automatic failover to a secondary. When this happens, Amazon switches over the DNS record for the RDS instance to the secondary, which takes about 2 minutes.
Having said that, when that happens our application fails anyway and requires a restart because the db connections don't move over with the DNS record. We believe that the ip address is cached which causes this behaviour.
Looking around, this SO post has a comment stating we can configure TTL behaviour for dns caching through the erlang library inet. Can you please confirm that db_connection (and therefore Ecto) use inet under the hood? Will configuring inet's cache_refresh
option help here?
The Ownership pool starts its internal pool via the Watcher, and Sojourn starts its internal pool via the Watcher. The Watcher blocks when starting an internal pool so can't receive the request from Sojourn to start its internal pool as it blocking starting the Sojourn pool itself (Ownership's internal pool).
As mentioned in elixir-ecto/ecto#2219 (comment)
The code is awkward and inconsistent as we can not close on a decode
error on prepare_execute
as the connection is checked in at that point.
At some point in future we may need to provide helpers to limit prepared queries at adapter level. It should not be done automatically as adapter sits between db_connection and database.
It would be great if we could tune the pool size using an environment variable - it doesn't currently seem to work. We can do this with other configuration parameters like url
in the form of {:system, "ENV_VAR_NAME"}
https://github.com/elixir-ecto/db_connection/blob/master/lib/db_connection.ex#L991
defp parse(call, query, params, opts) do
try do
DBConnection.Query.parse(query, opts)
catch
kind, reason ->
pre_log(call, query, params, opts, kind, reason, System.stacktrace())
end
end
DBConnection.Query.parse - this function does not use the option params
so the code is not working:
q = "copy (SELECT timestamp,open,high,low,volume FROM ohlcv
WHERE interval=$1
ORDER BY timestamp DESC) To STDOUT With CSV"
{:ok, res} = Ecto.Adapters.SQL.query(Repo, q, [ interval_sec])
raise error:
** (ArgumentError) parameters must be of length 0 for query %Postgrex.Query{columns: nil, name: "", param_formats: [], param_oids: [], param_types: [], ref: #Reference<0.0.1.267>, result_formats: [], result_oids: nil, result_types: nil, statement: "copy (SELECT timestamp,open,high,low,volume FROM ohlcv\n WHERE interval=$1\n ORDER BY timestamp DESC) To STDOUT With CSV", types: {Ecto.Adapters.Postgres.TypeModule, 438313}}
(postgrex) lib/postgrex/query.ex:50: DBConnection.Query.Postgrex.Query.encode/3
(db_connection) lib/db_connection.ex:1073: DBConnection.describe_run/5
(db_connection) lib/db_connection.ex:1144: anonymous fn/4 in DBConnection.run_meter/5
(db_connection) lib/db_connection.ex:1201: DBConnection.run_begin/3
(db_connection) lib/db_connection.ex:586: DBConnection.prepare_execute/4
(ecto) lib/ecto/adapters/postgres/connection.ex:93: Ecto.Adapters.Postgres.Connection.execute/4
(ecto) lib/ecto/adapters/sql.ex:243: Ecto.Adapters.SQL.sql_call/6
lib/mix/tasks/trading.calculate_indicators.ex:22: Mix.Tasks.Trading.CalculateIndicators.run/1
(mix) lib/mix/task.ex:296: Mix.Task.run_task/3
(mix) lib/mix/cli.ex:58: Mix.CLI.run_task/2
(elixir) lib/code.ex:363: Code.require_file/2
@josevalim @fishcakez Is it necessary?
See: robconery/moebius#37
If the owner dies, the connection process should exit.
When a connection errors due to a query timeout a helpful message is logged to errors like
Postgrex.Protocol (#PID<0.603.0>) disconnected: ** (DBConnection.ConnectionError) client #PID<0.14026.0> timed out because it checked out the connection for longer than 15000ms
Whatever process was using that connection will also exit if it wasn't trapping, which the error_logger will pick up and print but with a far less helpful message
#PID<0.14026.0> running ... terminated
(exit) an exception was raised: ** (DBConnection.ConnectionError) ssl recv: closed
When reporting exceptions to a service, only this second, less useful message gets reported: "ssl recv: closed". Devs then have to open up the logs and go searching for a Postgrex error logged sometime before the exception that contains a matching pid.
Would it be possible to have these timeout exits maintain the useful "timed out because..." message as their exception message? It'd be a big quality of life improvement as a developer, both to cut down on the exception dereferencing but also to differentiate timeout errors from actual ssl errors (handshake, network, ...)
This logic should be moved to the adapters themselves.
This is a pretty classic 3-way exit signal race condition where we technically log the wrong reason:
OR
Fortunately we disconnect in either case so the pool is secure. However it might be that we can give better information or at least try to prevent a user from being mislead that a different process caused the crash.
When we have 2 monitors active (one on owner and one on client) and we receive a :DOWN
we could use Process.alive?
to see if the other process is still alive and use this in the message. If the other process is not alive we can block to receive their :DOWN
and log both. If the process is alive it might be helpful to add that information to (e.g. "(with client #PID<..>)"). We may also want to special case the situation where the owner is the client.
During client check in:
Similar can happen on disconnect check in with queue: false
but that is likely desired as connection not immediately available. However if queue: true
:
However if we the old client sends disconnect to Connection before checking in to poolboy (not current behaviour) the disconnect/connect would always arrive before the new client checkout. This means the new client would always be blocked waiting for the connection, which may or may not succeed.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.