supabase / pg_net
A PostgreSQL extension that enables asynchronous (non-blocking) HTTP/HTTPS requests with SQL
Home Page: https://supabase.github.io/pg_net
License: Apache License 2.0
If the url is null, the database crashes
supabase init
supabase start
docker restart <your-db-container-name>
(see supabase/cli#137)
Go to the sql editor and run the following code
create extension if not exists pg_net;

create table if not exists test (
  id uuid primary key default gen_random_uuid()
);

create or replace function test_trigger()
returns trigger
language plpgsql
security invoker
as $$
begin
  perform net.http_post(
    url := null,
    body := '{"hello": "world"}'::jsonb
  );
  return new;
end
$$;

create trigger call_test_trigger
after insert on test
for each row execute procedure test_trigger();

insert into test default values;
Result: Connection terminated unexpectedly
Expected behavior: an exception should be raised telling the user that the url cannot be null.
The same should happen for unsupported protocols.
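Until that lands, a defensive wrapper can reject bad urls before they ever reach the worker. A minimal sketch (the function name and checks are assumptions, not part of pg_net):

```sql
-- Hypothetical guard around net.http_post: raise instead of crashing the worker.
create or replace function safe_http_post(url text, body jsonb default '{}'::jsonb)
returns bigint
language plpgsql
as $$
begin
  if url is null then
    raise exception 'url cannot be null';
  elsif url !~ '^https?://' then
    raise exception 'unsupported protocol in url: %', url;
  end if;
  return net.http_post(url := url, body := body);
end
$$;
```

Calling it from the trigger instead of net.http_post directly turns the crash into an ordinary plpgsql exception.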
We want to send a delete request to a serverless function within a Nextjs api to trigger the deletion of the entity at an external provider after it was deleted in our database.
Support http delete in addition to post and get.
A workaround is adding another endpoint, e.g. /delete, and sending a POST request there.
IIRC, this issue happened because a user created a foreign key to our queue pk
Lines 11 to 12 in 5ff8b6b
Then the TTL would not work because the DELETEs failed; this would also kill the worker on every restart.
Maybe the correct behavior in this case is to warn instead of crashing - though dying might also be good to avoid having excessive rows in the queue.
Originally posted by @steve-chavez in #43 (comment)
When a large number of requests starts at the same time (> 1024) the worker crashes
Run a webserver locally:
Process 1
python3 -m http.server
SQL
create extension pg_net;
select net.http_get('http://0.0.0.0:8000') from generate_series(1,5000);
Error
postgres=# 2021-07-23 16:20:16.307 CDT [32736] ERROR: CURL error code: 56
Currently, if 100 requests are made in a single transaction (or at roughly the same time), then when the worker polls for new work it sends all 100 to the multi curl handle at once and doesn't write any of the responses to the net._http_response table until all 100 are complete. If any one of those servers is slow to respond, it may take up to 30 seconds (the current local timeout) before any responses are available.
Ideally, each response would be written to net._http_response as soon as it's available.
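Until responses are written individually, submitting work in smaller batches keeps one slow server (or the >1024 fd limit above) from blocking everything. A sketch of chunked submission (the batch size and sleep interval are arbitrary assumptions):

```sql
-- Submit 5000 requests in batches of 500 instead of all at once,
-- giving the background worker time to drain each batch.
do $$
begin
  for i in 1..10 loop
    perform net.http_get('http://0.0.0.0:8000')
    from generate_series(1, 500);
    perform pg_sleep(1);  -- let the worker pick up and finish the batch
  end loop;
end
$$;
```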
I have an HTTP webhook set in my supabase instance (cloud) making a call to a next.js app. The request goes through when making calls to my .vercel.app domain. But when using my domain example: domain.com, it failed. I realized it was a redirection issue when I changed the webhook URL from domain.com to www.domain.com. Vercel redirects domain.com => www.domain.com and returns 308 to the HTTP client.
Steps to reproduce the behavior, please provide code snippets or a repository:
Try a request to any URL that makes a 308 redirect.
A call to a URL that makes a 308 should redirect properly
https://www.loom.com/share/33ce73d2de094113b31a43d993e0eec7
Put a shim around the HTTP response to differentiate between the status of the request and the remote server's response.
-----------
-- TYPES --
-----------
-- A response from an HTTP server (abbreviated)
create type http_response as (
status_code integer,
headers json,
content text
);
-- Potential states of an HTTP Request
create type request_status as enum ('PENDING', 'SUCCESS', 'ERROR');
-- State wrapper around (any protocol) responses
create type net_result as (status request_status, response http_response, msg text);
--------------
-- Examples --
--------------
-- A (non-http) error has occurred, e.g. a timeout
select ('ERROR', null, 'client side timeout occurred')::net_result;
select ('ERROR', null, 'invalid protocol')::net_result;
-- Has not been processed
select ('PENDING', null, 'request has not been processed')::net_result;
-- Successful request resulting in http 404
select ('SUCCESS', (404, '{}', 'Not Found')::http_response, 'ok')::net_result;
-- Successful request resulting in http 200
select ('SUCCESS', (200, '{}', 'Some data')::http_response, 'ok')::net_result;
When trying to send a request from a trigger function, the request doesn't reach the server. In the _http_response table, the error message is:
Couldn't connect to server
Do the following locally:
begin
  perform net.http_post(
    url := 'url_to_your_localhost_edge_function',
    body := '{}'::jsonb
  );  -- http_post returns a request_id when called via select
  return new;
end;
The request should go through.
When I curl my edge function, it works.
I have a webhook configured to watch a table for inserts. On an insert a POST request should be fired to an API hosted on Vercel.
This works when one insert operation is carried out. If multiple inserts are carried out: .insert([{message: 1}, {message: 2}, {message: 3}]) then the webhooks aren't fired at all. If instead we carry out multiple inserts by calling the Supabase API multiple times then the webhook is fired for only some of the inserts.
Steps to reproduce the behavior, please provide code snippets or a repository:
Configure a webhook on a table to watch for insert operations and call an external API
Use the Supabase JS library to perform multiple inserts like: let { data, error } = await supabase.from("example_table").insert([{message: 1}, {message: 2}, {message: 3}, {message: 4}, {message: 5}, {message: 6}, etc]).select("message");
Check logs for external API to confirm it didn't receive any requests
See error
Something to note is that I was inserting 32 things at once.
For each insert, I would expect the external API to be called. It would also help if there was a log in Supabase that showed what happens when the webhook is run.
When doing:
select net.http_get('localhost:3000/projects'); -- no protocol specified, it should be http://localhost:3000/projects
We get a non-descriptive error
ERROR: curl_url returned: 3
expand pg_net functionality with more operations and other data types
Yes. I would like to use pg_net to access REST micro-services in an asynchronous manner, when those services rely on other HTTP methods besides just GET, such as PUT, PATCH, and DELETE. Moreover, sometimes those services work with payloads that are not JSON and therefore cannot be passed to a PostgreSQL function as a json or jsonb data type.
In addition to the existing net.http_get(url text, params jsonb, headers jsonb, timeout_milliseconds int) and net.http_post(url text, params jsonb, headers jsonb, timeout_milliseconds int) functions, I would like for there to be a master function net.http(request http_request, timeout_milliseconds int), similar to the http.http(request http_request) function in the psql-http extension. Like in that extension, http_request would be a data type that has both a method and a content attribute, the latter being varchar. This would be enough to support other HTTP methods and other payloads.
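A rough sketch of what that could look like (the type layout and defaults are assumptions modeled on the psql-http extension, not an existing pg_net API):

```sql
-- Hypothetical request type for a proposed master net.http() function.
create type net.http_request as (
  method  text,    -- 'GET', 'POST', 'PUT', 'PATCH', 'DELETE', ...
  url     text,
  headers jsonb,
  content varchar  -- arbitrary payload, not restricted to JSON
);

-- Proposed signature; the body is elided since it would live in the
-- extension's background worker:
--   net.http(request net.http_request, timeout_milliseconds int default 5000)
--   returns bigint  -- the request id
```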
I have considered and even used the synchronous http
extension in conjunction with custom tables and the pg_cron
extension to (re)implement a pseudo-async processor, but it's cumbersome and duplicative of the work that's in the pg_net
extension.
No other context is relevant.
After installing the pg_net extension via the Database -> Extensions GUI, queries do not get executed, as the background worker is not running. I tested this on self-hosted and it works as intended there.
Steps to reproduce the behavior, please provide code snippets or a repository:
Enable pg_net, then run select net.check_worker_is_up().
Run select net.http_get('https://filesamples.com/samples/code/json/sample2.json') and take note of the responding id.
Run select net.http_collect_response(id) with id = the result from the query before.
check_worker_is_up should return the process id of the running worker.
http_collect_response should return the status success and the data along with it.
(this is on my local supabase cli instance)
As you can see, the response is not collected and instead it says the request failed.
As you can see here the worker is not running.
This works on local.
The error "request matching request_id not found" only arises because the request was never executed and instead "vanished".
Enabling the pg_net extension either from SQL or from the studio results in the following error:
[42883] ERROR: function net.http_collect_response(bigint, boolean) does not exist Where: SQL statement "ALTER function net.http_collect_response(request_id bigint, async boolean) SECURITY DEFINER" PL/pgSQL function grant_pg_net_access() line 24 at SQL statement
Steps to reproduce the behavior, please provide code snippets or a repository:
The PG_NET extension is enabled
This issue can also be seen when enabling the extension from SQL
create extension pg_net;
Whenever any function related to pg_net fails (e.g. this error: #38), the extension stops trying to send any requests. It inserts the request into net.http_request_queue, but nothing processes it after that.
Re-enabling pg_net and recreating my triggers fixes the error.
Expected behavior: Supabase recovers from the error gracefully instead of exploding and requiring me to re-enable the pg_net extension and my triggers.
Requests cannot be safely cancelled after they have been submitted to the queue. Remove the net.cancel_request function to avoid confusion.
When having a TTL of:
show pg_net.ttl;
6 hours
New requests succeed, but their response rows get deleted quickly (after 1 second):
select net.http_get('https://webhook.site/449fddab-5642-4645-9a5e-6c71f7fe9a8e');
select * from net._http_response;
-- 0 rows
This is because the worker inserts rows using a now() default value:
Lines 339 to 340 in 639b049
Line 41 in 639b049
And this now() is not getting updated; every new insert uses an "old" now() value.
Use SetCurrentStatementStartTimestamp() before:
Lines 205 to 206 in 639b049
To make now() use the current time.
required due to #9
Only happens on Supabase prod instances which have pg 15, when doing:
CREATE DATABASE foo;
DROP DATABASE foo;
DROP DATABASE foo
gets locked with the following message:
still waiting for backend with PID 2576 to accept ProcSignalBarrier
PID 2576 was pg_net in this case.
To reproduce locally, we need a dummy function handling SIGUSR1. (Otherwise the problem is not reproducible locally, as pg_net doesn't handle SIGUSR1 explicitly; on Supabase production, other extensions in shared_preload_libraries are likely triggering this effect.)
static void
handle_sigusr1(SIGNAL_ARGS)
{
    elog(WARNING, "foo");
}

void
worker_main(Datum main_arg)
{
    // ...
    pqsignal(SIGUSR1, handle_sigusr1);
}
$ net-with-pg-15 psql
postgres=# create database foo;
CREATE DATABASE
postgres=# drop database foo;
2023-01-19 02:36:38.827 -05 [463963] LOG: checkpoint starting: immediate force wait
2023-01-19 02:36:38.830 -05 [463963] LOG: checkpoint complete: wrote 92 buffers (0.6%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.003 s, sync=0.001 s, total=0.003 s; sync files=0, l
ongest=0.000 s, average=0.000 s; distance=4714 kB, estimate=4714 kB
2023-01-19 02:36:38.830 -05 [463968] WARNING: foo
2023-01-19 02:36:43.835 -05 [463973] LOG: still waiting for backend with PID 463968 to accept ProcSignalBarrier
2023-01-19 02:36:43.835 -05 [463973] STATEMENT: drop database foo;
2023-01-19 02:36:48.840 -05 [463973] LOG: still waiting for backend with PID 463968 to accept ProcSignalBarrier
2023-01-19 02:36:48.840 -05 [463973] STATEMENT: drop database foo;
To solve this we'd need to handle ProcSignalBarrier, which is already done by procsignal_sigusr1_handler
here. We just need to add:
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
Add a line/limitations to the docs
This will be fixed in #68. Opening the issue to make the bug clear.
When having
postgres=# select count(*) from net._http_response;
-[ RECORD 1 ]
count | 50001
postgres=# select count(*) from net.http_request_queue;
-[ RECORD 1 ]
count | 0
Usage stays mostly at 2.6% CPU. Also notice the 6.7% MEM, which indicates there's a memory leak.
This is due to this DELETE:
Lines 442 to 445 in 70ea4cd
The clock_timestamp() predicate generates a full table scan each time it runs.
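One way to avoid the repeated full scan would be an index on the timestamp column the TTL DELETE filters on. A sketch (the column name "created" is an assumption about the response table's schema):

```sql
-- Hypothetical fix: let the TTL DELETE use an index scan
-- instead of scanning all of net._http_response each time.
create index if not exists _http_response_created_idx
  on net._http_response (created);
```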
Differences laid out here: https://curl.se/libcurl/c/libcurl-multi.html.
Could potentially help with #13.
Docker install of supabase/postgres:15.1.0.109_arm64.
pg_net installs successfully, creates the net schema etc, and functions seemingly operate as expected; however, requests do not execute. They do not time out either.
supabase/postgres:15.1.0.109_arm64
create extension pg_net;
SELECT net.http_get (
'https://postman-echo.com/get?foo1=bar1&foo2=bar2'
) AS request_id;
SELECT * FROM net.http_request_queue => responds with one row corresponding to the request.
SELECT * FROM net._http_response => responds with no rows.
SELECT net.check_worker_is_up() reports: ERROR: the pg_net background worker is not up.
SELECT * from pg_stat_activity shows a worker is running - but the backend_type from this query reports "pg_net 0.7.1 worker", where the query in the check_worker_is_up() function expects "pg_net worker".
At this point the request stays in net.http_request_queue and never times out.
NB. A test hosted on Supabase sends requests successfully and populates net._http_response with a response, but also shows ERROR: the pg_net background worker is not up when check_worker_is_up() is executed.
The request should execute, the row in the request table should no longer be present and a response should be in the response table.
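A version-tolerant check would match the versioned backend_type with a prefix instead of strict equality. A sketch, not the function's actual implementation:

```sql
-- Matches both "pg_net worker" and "pg_net 0.7.1 worker".
select pid
from pg_stat_activity
where backend_type like 'pg_net%worker';
```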
supabase/postgres:15.1.0.109_arm64
I noticed that my tests unexpectedly failed despite only making changes to the documentation. The root cause appears to be timeout errors resulting from the httpbin endpoints being excessively slow (over 2 seconds for response). These timeout errors are not accounted for in the tests, causing them to fail even with functional code.
To replicate, simply run the tests. A failure typically occurs due to the unanticipated timeout error.
The tests should retry requests if there is a timeout error.
Remove the content_type field from net._http_response, since it is a header and there is already a field for that:
https://github.com/supabase/pg_net/blob/master/sql/pg_net--0.1.sql#L22
Reduce the 10-second delay between polling for new requests; suggest 250 ms.
Once this error is returned, the worker stops processing new requests.
(Needs reproducing first; I need to set up an HTTP server that produces this invalid response.)
We also need to enable more tracing to debug the worker state in production.
CREATE EXTENSION IF NOT EXISTS pg_net;
select net.http_post(
  url := 'https://httpbin.org/post',
  body := '{"hello": "world"}'::jsonb
);

select * from net.http_collect_response(1);
pg_net: Timeout was reached - Watch Video
Notice how the status says SUCCESS, but upon inspecting the net._http_response table, it says timeout was reached. Additionally, it appears that the request was never made, according to the remote server.
https://supabase.com/docs/guides/database/extensions/pgnet
Documentation is incorrect
The sql to enable pg_net extension in the doc is:
create schema if not exists net;
create extension pg_net with schema net;
but I was getting this error:
{"Type":"ErrorResponse","Severity":"ERROR","SeverityUnlocalized":"ERROR","Code":"55000","Message":"schema net is not a member of extension \"pg_net\"","Detail":"An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one that it already owns.","Hint":"","Position":0,"InternalPosition":0,"InternalQuery":"","Where":"","SchemaName":"","TableName":"","ColumnName":"","DataTypeName":"","ConstraintName":"","File":"pg_depend.c","Line":274,"Routine":"checkMembershipInCurrentExtension","UnknownFields":null}
https://github.com/supabase/pg_net/blob/master/docs/installation.md had the sql I needed to enable the extension by migration:
create extension pg_net;
Thanks @GaryAustin1
An insert into the _http_response table must be done for every request, which reduces throughput. There are cases where the client doesn't care about the response, so it doesn't need to be persisted; for example, function hooks don't do anything with the response. Also, http_request_queue can grow big with many requests and cause production issues (internal link).
Proposal: drop the _http_response and http_request_queue tables and instead use an in-memory queue, plus add two callbacks:
create or replace function net.http_get(
  url text,
  -- ...,
  success_cb text default '',
  error_cb text default ''
)
returns void -- no id is returned now
as $$
  -- ...
$$ language sql;
Which can be used like:
select net.http_get(
  url := 'http://domain.com'
, success_cb := $$ insert into my_table values ($1, $2, $3) $$ -- $1=status, $2=headers, $3=body
, error_cb := $$ do $_$begin raise exception 'Failed request on %: "%"', $1, $2; end$_$; $$ -- $1=url, $2=error_message
);
error_cb can also be an insert/update on a table, so the request can be retried if needed. Currently it's text.
On disabling and re-enabling the extension PG_NET, i am getting:
Failed to toggle PG_NET: failed to create pg.extensions: function net.http_collect_response(bigint, boolean) does not exist
I paused and restored my Supabase project (hosted) to get the latest Postgres and whatnot. Then I disabled PG_NET because I wanted to track down some issues I had with my migrations script.
I dug in deeper: the error is kind of crazy, because PG_NET can't be installed because it expects a function which the extension itself should install.
The function is defined in pg_net/sql/pg_net--0.2--0.3.sql, but somehow my re-enablement doesn't run through this script again, or the statement didn't run at all, because the real error is in pg_net--0.3--0.4.sql on line 3:
alter function net.http_collect_response(request_id bigint, async boolean) security definer;
Funnily enough, with 0.7 the function was renamed to "_http_collect_response".
Don't know if people can reproduce the bug. For me it was just disabling and re-enabling the extension after I did a pause/restore of the Supabase project today.
I should be able to enable the extension again.
Postgres: PostgreSQL 15.1 on aarch64-unknown-linux-gnu, compiled by gcc (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0, 64-bit
Don't know where I can check the version of the Supabase hosted image.
select * from pg_available_extensions where name = 'pg_net';
--> 0.7
Can we optionally log something based on the status code when the response is handled?
Defaulting to logging 4xx-5xx response codes would be great; then everyone would automatically get this info in their Postgres logs.
After pg_net got upgraded, it throws a permission denied for table http_request_queue when trying to make a request as an authenticated user.
Upgrade to the latest version and attempt to make a http request using the authenticated role.
Assumption is that any supabase extension works out of the box with the role and auth setup in the database. If it is expected behaviour, it should be documented somewhere.
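If the upgrade dropped the table grants, re-applying them manually should restore access. A sketch (the role names are Supabase defaults, and whether these exact grants existed before the upgrade is an assumption):

```sql
-- Restore queue access for API roles after a pg_net upgrade.
grant usage on schema net to authenticated, service_role;
grant all on table net.http_request_queue to authenticated, service_role;
```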
There are cases when net._http_response would have a subset of the records from net.http_request_queue, even though records from both tables should ideally match.
There should be a record in net._http_response for every record in net.http_request_queue, and requests should go through successfully or be marked as failed in net._http_response.
Related issue: supabase/supabase#2997
Currently it just makes it a GET request.
README.md in the repo
It is unclear whether one is supposed to clean up the net._http_response table or if there is a task that does so. I'm assuming not, because every use case would have a different requirement for retention of that data. The docs clearly state that net.http_request_queue is self-cleaning.
State whether that table should be periodically cleaned by the user.
none.
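If manual cleanup does turn out to be the user's job, a scheduled job is one way to sketch it (this assumes the pg_cron extension is available and that the response table has a "created" timestamp column; the retention window is arbitrary):

```sql
-- Prune responses older than one day, every hour on the hour.
select cron.schedule(
  'prune-pg-net-responses',
  '0 * * * *',
  $$ delete from net._http_response where created < now() - interval '1 day' $$
);
```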
Only happens when there's a "connection refused" error like:
curl localhost:8888
curl: (7) Failed to connect to localhost port 8888 after 0 ms: Connection refused
And after some tries
select net.http_get('http://localhost:8888') from generate_series(1,10);
free(): invalid pointer
2022-09-05 20:17:45.207 -05 [180720] LOG: background worker "pg_net worker" (PID 180727) was terminated by signal 6: Aborted
2022-09-05 20:17:45.207 -05 [180720] LOG: terminating any other active server processes
# The error message varies among these
# munmap_chunk(): invalid pointer
# free(): double free detected in tcache 2
# free(): invalid pointer
Other types of error don't cause the worker to crash.
Happens somewhere in:
Lines 76 to 104 in 824b163
If a user already has the http extension on their database and wants to create the pg_net extension, or vice versa, then they will get a type "http_method" already exists error.
create extension http;
create extension pg_net;
Hey,
we are making quite heavy use of pg_net, mainly to make a request to an external api from within a trigger. For some of these, we will now migrate to a postgres-based queue (postgraphile-worker / pg-boss) to have better observability.
During the migration, I began to wonder how pg_net fits into the supabase architecture, for what use-cases it is supposed to be used and what its limitations are, especially regarding performance.
As an example, we are using Segment to collect usage data and want to track events from within the database. One option would be to use pg_net to make a request to the Segment API. We created a simple segment.track function which makes the request using pg_net. When our usage grows, we might get a few events per second. How many requests is pg_net supposed to handle? As an alternative, we could either send the requests batch-wise or use a queueing solution and drop pg_net. Further, a webhook or queuing plugin for Realtime would probably be a more scalable alternative.
Is there any official guidance on the scalability and intent of pg_net?
Best
Philipp
Hypothetical scenario: pg_net causes requests to accumulate in http_request_queue.
I haven't tried reproducing the above scenario, but we should take preventive measures against this. Some possible actions: TRUNCATE all requests on worker restart (so it doesn't enter a crashloop).
pg_net not working in self-hosted supabase
Running this query does nothing, no errors; the http_request_queue and _http_response tables are empty. Although, it returns an incrementing request_id on each run.
select net.http_get('https://news.ycombinator.com') as request_id;
Using pg_net 0.6