PGPool is a PostgreSQL client that automatically uses connection pools and handles reconnections in case of errors.
PGPool also optimizes all of your statements by preparing them and caching them for you under the hood.
It uses poolboy for connection pooling and epgsql as the PostgreSQL driver.
If you're using rebar3, add pgpool as a dependency in your project's rebar.config file:
{pgpool, {git, "git://github.com/ostinelli/pgpool.git", {tag, "1.1.1"}}}
Or, if you're using Hex.pm as package manager (with the rebar3_hex plugin):
{pgpool, "1.1.1"}
Then, compile:
$ rebar3 compile
Make sure to start PGPool from your application. This can be done either by listing it as a dependency in your .app file, or by starting it manually:
pgpool:start().
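To have PGPool started automatically, it can be listed in the applications section of your application resource file; a minimal sketch, where my_app is a hypothetical application name:

```erlang
%% my_app.app.src -- `my_app` is a placeholder application name
{application, my_app, [
  {description, "An application using PGPool"},
  {vsn, "0.1.0"},
  {registered, []},
  {applications, [
    kernel,
    stdlib,
    pgpool  %% ensures PGPool is started before my_app
  ]},
  {env, []}
]}.
```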
Databases can be set in the pgpool application environment. You're probably best off using an application configuration file (in releases, sys.config):
{pgpool, [
  {databases, [
    {db1_name, [
      {pool, [
        %% poolboy options <https://github.com/devinus/poolboy>
        %% The `name` and `worker_module` options here will be ignored.
        {size, 10},         %% maximum pool size
        {max_overflow, 20}, %% maximum number of workers created if pool is empty
        {strategy, lifo}    %% can be lifo or fifo (default is lifo)
      ]},
      {connection, [
        {host, "localhost"},
        {user, "postgres"},
        {pass, ""},
        {options, [
          %% epgsql connect_options() <https://github.com/epgsql/epgsql>
          {port, 5432},
          {ssl, false},
          {database, "db1"}
        ]}
      ]}
    ]},
    {db2_name, [
      {pool, [
        {size, 10},
        {max_overflow, 20},
        {strategy, lifo}
      ]},
      {connection, [
        {host, "localhost"},
        {user, "postgres"},
        {pass, ""},
        {options, [
          {port, 5432},
          {ssl, false},
          {database, "db2"}
        ]}
      ]}
    ]}
  ]}
]}
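If you prefer not to use a config file, the same settings can be injected programmatically before starting PGPool, since they are read from the pgpool application environment. A minimal sketch for a single database (the connection values are illustrative):

```erlang
%% Set the `databases` key of the pgpool application environment at
%% runtime, then start PGPool. Equivalent to a sys.config entry.
ok = application:set_env(pgpool, databases, [
  {db1_name, [
    {pool, [{size, 10}, {max_overflow, 20}, {strategy, lifo}]},
    {connection, [
      {host, "localhost"},
      {user, "postgres"},
      {pass, ""},
      {options, [{port, 5432}, {ssl, false}, {database, "db1"}]}
    ]}
  ]}
]),
pgpool:start().
```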
Please refer to the epgsql README for how to perform queries. Currently, PGPool supports the following operations.
pgpool:squery(DatabaseName, Sql) -> Result
Types:
DatabaseName = atom()
Sql = string() | iodata()
Result = {ok, Count} | {ok, Count, Rows} | {error, no_connection}
Count = non_neg_integer()
Rows = (see epgsql for more details)
For example:
pgpool:squery(db1_name, "SELECT * FROM users;").
Simple queries cannot be optimized by PGPool, since they cannot be prepared. If you want to optimize and cache your queries, consider using equery/3,4 or batch/2 instead.
In case there's no available connection to the database, the standard squery/2 function will return {error, no_connection}. If you want to keep retrying until a connection is available, you can use squery/3. Note, however, that this is a blocking call, and it should be used only if needed.
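Since {error, no_connection} is part of the documented return type, callers of squery/2 can also handle it explicitly instead of blocking; a hypothetical sketch:

```erlang
%% Handle the no-connection case explicitly rather than retrying
%% with the blocking squery/3.
case pgpool:squery(db1_name, "SELECT * FROM users;") of
    {ok, _Count, Rows} ->
        %% SELECT returns the affected count and the rows themselves
        {ok, Rows};
    {error, no_connection} ->
        %% e.g. surface a domain-specific error to the caller
        {error, database_unavailable}
end.
```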
pgpool:squery(DatabaseName, Sql, RetryTimeout) -> Result
Types:
DatabaseName = atom()
Sql = string() | iodata()
RetryTimeout = non_neg_integer() | infinity
Result = {ok, Count} | {ok, Count, Rows} | {error, no_connection}
Count = non_neg_integer()
Rows = (see epgsql for more details)
RetryTimeout specifies how much time (in milliseconds) will be spent waiting to retry (that is, excluding the time taken to call the database). Set it to infinity if you want the call to block forever until a connection becomes available.
For example:
pgpool:squery(db1_name, "SELECT * FROM users;", 60000).
pgpool:equery(DatabaseName, Statement, Params) -> Result
Types:
DatabaseName = atom()
Statement = string()
Params = list()
Result = {ok, Count} | {ok, Count, Rows} | {error, no_connection}
Count = non_neg_integer()
Rows = (see epgsql for more details)
For example:
pgpool:equery(db1_name, "SELECT * FROM users WHERE id = $1;", [3]).
PGPool will prepare your statements and cache them for you, which results in considerable speed improvements. If you use a lot of different statements, consider memory usage because the statements are not garbage collected.
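Reusing the same statement string across calls lets the cached prepared statement be reused, assuming the cache is keyed on the statement text:

```erlang
%% Bind one statement string so it is prepared and cached once,
%% then executed with different parameters.
Stmt = "SELECT * FROM users WHERE id = $1;",
{ok, _, RowsA} = pgpool:equery(db1_name, Stmt, [1]),
{ok, _, RowsB} = pgpool:equery(db1_name, Stmt, [2]).
```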
In case there's no available connection to the database, the standard equery/3 function will return {error, no_connection}. If you want to keep retrying until a connection is available, you can use equery/4. Note, however, that this is a blocking call, and it should be used only if needed.
pgpool:equery(DatabaseName, Statement, Params, RetryTimeout) -> Result
Types:
DatabaseName = atom()
Statement = string()
Params = list()
RetryTimeout = non_neg_integer() | infinity
Result = {ok, Count} | {ok, Count, Rows} | {error, no_connection}
Count = non_neg_integer()
Rows = (see epgsql for more details)
RetryTimeout specifies how much time (in milliseconds) will be spent waiting to retry (that is, excluding the time taken to call the database). Set it to infinity if you want the call to block forever until a connection becomes available.
For example:
pgpool:equery(db1_name, "SELECT * FROM users WHERE id = $1;", [3], 60000).
To execute a batch:
pgpool:batch(DatabaseName, StatementsWithParams) -> Result
Types:
DatabaseName = atom()
StatementsWithParams = [{Statement, Params}]
Statement = string()
Params = list()
Result = [{ok, Count} | {ok, Count, Rows}]
Count = non_neg_integer()
Rows = (see epgsql for more details)
For example:
S = "INSERT INTO users (name) VALUES ($1);",
[{ok, 1}, {ok, 1}] = pgpool:batch(db1_name, [
{S, ["Hedy"]},
{S, ["Roberto"]}
]).
PGPool will prepare your statements and cache them for you, which results in considerable speed improvements. If you use a lot of different statements, consider memory usage because the statements are not garbage collected.
So you want to contribute? That's great! Please follow the guidelines below; it will make it easier to get your changes merged in.
Before implementing a new feature, please submit a ticket to discuss what you intend to do. Your feature might already be in the works, or an alternative implementation might have already been discussed.
Do not commit to master in your fork. Provide a clean branch without merge commits. Every pull request should have its own topic branch. This way, any additional adjustments to the original pull request can easily be made and squashed with git rebase -i. The updated branch will be visible in the same pull request, so there is no need to open new pull requests when changes need to be applied.
Make sure to include proper tests. To run PGPool's tests, you need to create the database pgpool_test for the user postgres with no password, and then simply run from the project's root directory:
$ make tests
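The test database can be created with the standard PostgreSQL command-line tools; for example, assuming a local server with the default postgres superuser:

```shell
$ createdb -U postgres pgpool_test
```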