devrimgunduz / pagila Goto Github PK
View Code? Open in Web Editor NEWPostgreSQL Sample Database
License: Other
PostgreSQL Sample Database
License: Other
When you run the following command as postgres user on v9.6, I get the following errors:
psql pagila < pagila-schema.sql
[...]
ALTER TABLE
ALTER TABLE
ERROR: relation "payment_p2017_01" does not exist
ERROR: relation "payment_p2017_02" does not exist
ERROR: relation "payment_p2017_03" does not exist
ERROR: relation "payment_p2017_04" does not exist
ERROR: relation "payment_p2017_05" does not exist
ERROR: relation "payment_p2017_06" does not exist
This error is present on CentOS / Debian / Arch / Ubuntu.
Add an MV example to pagila-schema.sql .
At the very least we want to quickly see the table relations in an ERD diagram
Any plan to include like a screenshot to quickly see how the table relationships look like?
For an example schema design, i think that should be obvious
Hi,
Thanks for doing this, great stuff!
I'd to include Pagila as part of some test fixtures in my current project... would that be OK?
If it's not too much of a pain, might it be possible to slap a LICENSE.md
file or similar in there please?
Thanks again!
In pagila-schema.sql
, the definition for the customer table has the default value for create_date
as ('now'::text)::date
. Is there a reason why it can't be now()::date
?
https://github.com/devrimgunduz/pagila/blob/master/pagila-schema.sql#L276
In pagila-data.sql
there are several (182, I think) rentals that occur on 2020-02-14 15:16:03+00.
Here is a sample:
12114 2005-08-17 23:02:00+01 1405 65 2005-08-26 18:02:00+01 1 2020-02-16 02:30:53+00
12115 2005-08-17 23:04:15+01 1228 457 2005-08-20 22:25:15+01 2 2020-02-16 02:30:53+00
12116 2020-02-14 15:16:03+00 3082 560 \N 2 2020-02-16 02:30:53+00
12117 2005-08-17 23:11:12+01 4140 303 2005-08-22 23:56:12+01 1 2020-02-16 02:30:53+00
12118 2005-08-17 23:14:25+01 158 89 2005-08-26 22:26:25+01 1 2020-02-16 02:30:53+00
12119 2005-08-17 23:16:44+01 4298 567 2005-08-20 02:13:44+01 2 2020-02-16 02:30:53+00
12120 2005-08-17 23:16:46+01 2912 323 2005-08-19 00:11:46+01 2 2020-02-16 02:30:53+00
12121 2005-08-17 23:20:40+01 3423 69 2005-08-22 21:30:40+01 2 2020-02-16 02:30:53+00
12122 2005-08-17 23:20:45+01 4030 375 2005-08-25 04:23:45+01 2 2020-02-16 02:30:53+00
12123 2005-08-17 23:22:18+01 361 497 2005-08-19 23:36:18+01 2 2020-02-16 02:30:53+00
12124 2005-08-17 23:22:46+01 2036 22 2005-08-21 01:40:46+01 1 2020-02-16 02:30:53+00
12125 2005-08-17 23:24:25+01 136 573 2005-08-25 03:08:25+01 2 2020-02-16 02:30:53+00
12126 2005-08-17 23:25:21+01 2304 302 2005-08-23 21:51:21+01 1 2020-02-16 02:30:53+00
12127 2020-02-14 15:16:03+00 4218 582 \N 2 2020-02-16 02:30:53+00
12128 2005-08-17 23:31:09+01 2252 415 2005-08-24 05:07:09+01 2 2020-02-16 02:30:53+00
12129 2005-08-17 23:31:25+01 891 146 2005-08-26 19:10:25+01 2 2020-02-16 02:30:53+00
12130 2020-02-14 15:16:03+00 1358 516 \N 2 2020-02-16 02:30:53+00
12131 2005-08-17 23:34:16+01 3380 21 2005-08-26 01:18:16+01 1 2020-02-16 02:30:53+00
I'm trying to follow along with an online course. Their pagila db seems to have the same number of rows in rental
, but their dates span 5 months. So, it seems like the rentals on 2020-02-14 should be in the database, but maybe they were modified accidentally?
TODO: Add partitioning with pg_partman for educational purposes.
I noticed that the .sql files are in PGDMP format. This is fun, but the extension of .SQL is pretty much always in ASCII text or UTF-8 text format.
My suggestion is to rename the PGDMP formatted .sql files to .pgdmp to make it obvious what format these files are in.
In pagila-schema.sql there is this:
CREATE SEQUENCE customer_customer_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
ALTER TABLE customer_customer_id_seq OWNER TO postgres;
SET default_tablespace = '';
SET default_with_oids = false;
CREATE TABLE customer (
customer_id SERIAL PRIMARY KEY,
...
Unfortunately what you end up with is:
pagila_(postgres)# \d customer
Table "public.customer"
Column | Type | Collation | Nullable | Default
-------------+--------------------------+-----------+----------+------------------------------------------------
customer_id | integer | | not null | nextval('customer_customer_id_seq1'::regclass)
pagila_(postgres)# \d customer_customer_id_seq1
Sequence "public.customer_customer_id_seq1"
Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
---------+-------+---------+------------+-----------+---------+-------
integer | 1 | 1 | 2147483647 | 1 | no | 1
Owned by: public.customer.customer_id
The SERIAL as 'type' for customer_id causes Postgres to create a new sequence bound to the field with a Start value of 1. This means inserting new data fails with duplicate key errors.
A quick scan through the rest of pagila-schema.sql shows that this is the only place this done. The rest of the CREATE TABLE statements follow the pattern:
id_fld fld_type DEFAULT nextval('some_seq'::regclass) NOT NULL,
It could be a good idea to have a link to Sakila in the readme, for people who are also interested in the MySQL version.
I noticed this in language.name
but maybe there are others
postgres=# select * from lang limit 2;
language_id | name | updated
-------------+----------------------+------------------------
1 | English | 2022-02-15 10:02:19+00
2 | Italian | 2022-02-15 10:02:19+00
2018-09-30 18:42:09.856 UTC [38] ERROR: cannot create index on partitioned table "payment"
2018-09-30 18:42:09.856 UTC [38] STATEMENT: CREATE INDEX idx_fk_customer_id ON payment USING btree (customer_id);
psql:/docker-entrypoint-initdb.d/pagila.sql:1102: ERROR: cannot create index on partitioned table "payment"
docker run --rm aa8y/postgres-dataset:pagila
The current examples for JSONB (yum
and apt
) are outside the domain of the DVD rental example. It might be more beneficial to add JSONB columns to either the existing view or to create a new (materialized) view that has these json columns. An example below.
-- actor_info view
SELECT a.actor_id,
a.first_name,
a.last_name,
jsonb_object_agg(c.name, (SELECT array_agg(f.title) AS array_agg
FROM film f
JOIN film_category fc_1 ON f.film_id = fc_1.film_id
JOIN film_actor fa_1 ON f.film_id = fa_1.film_id
WHERE fc_1.category_id = c.category_id
AND fa_1.actor_id = a.actor_id
GROUP BY fa_1.actor_id)) AS film_info
FROM actor a
LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
LEFT JOIN film_category fc ON fa.film_id = fc.film_id
LEFT JOIN category c ON fc.category_id = c.category_id
GROUP BY a.actor_id, a.first_name, a.last_name;
This would enable users to get more familiar with different jsonb operator when filtering on the view. For instance this simple example below
SELECT actor_id, first_name, last_name, film_info
FROM actor_info
WHERE film_info -> 'Games' ? 'FEATHERS METAL'
Would you welcome pull requests to diversify the data a bit?
For example:
language
table, only English (id 1) appears in the film
table.More variety would make practice query results more interesting and help people see the need to GROUP BY etc.
ERROR: syntax error at or near "AS"
LINE 580: AS integer
^
SQL state: 42601
Character: 16059
Second try reporting #32
Maybe it's intended to have some "legacy" data in a fixed length format, but in case not I thought I would report it as it surprised me and it's the only instance in the schema
I can't execute the file with inserts because the queries order violates the FKs.
For example, the city and country tables are inserted after address.
Please, reorder the inserts.
In the original Sakila schema, the payment_id
column in the payment
table has a PRIMARY KEY constraint. Pagila does not.
I found this by accident when I restored a backup of pagila over itself and duplicated all the rows in the partitioned tables.
Because payment_date
is the partition key, it would have to be defined as
ALTER TABLE payment ADD PRIMARY KEY (payment_date, payment_id)
No matter which SQL file I try to run (...-data
, ...-insert-data
) on empty database without any tables theres a whole lot of errors and exceptions happening. Some of them are constraint errors like:
[42710] ERROR: constraint "store_address_id_fkey" for relation "store" already exists
Others are relation errors:
[42P07] ERROR: relation "idx_fk_address_id" already exists
And some are saying that apparently some data was not loaded before relating as Foreign keys:
[23503] Batch entry 0 INSERT INTO public.rental VALUES (7780, '2005-07-28 07:11:55+01', 3069, 236, '2005-08-06 05:41:55+01', 1, '2020-02-16 02:30:53+00') was aborted: ERROR: insert or update on table "rental" violates foreign key constraint "rental_customer_id_fkey"
Detail: Key (customer_id)=(236) is not present in table "customer". Call getNextException to see other errors in the batch.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.