devrimgunduz / pagila

PostgreSQL Sample Database

License: Other

pagila's Introduction

Pagila

Pagila started as a port of the Sakila example database available for MySQL, which was originally developed by Mike Hillyer of the MySQL AB documentation team. It is intended to provide a standard schema that can be used for examples in books, tutorials, articles, samples, etc.

Pagila has been tested against PostgreSQL 12 and above.

All the tables, data, views, and functions have been ported; some of the changes made were:

Changed char(1) true/false fields to true boolean fields
Added triggers that keep the last_update columns up to date
Added foreign keys
Removed 'DEFAULT 0' on foreign keys since it's pointless with real FKs
Used PostgreSQL's built-in full-text search for the fulltext index, which removes the need for the film_text table
Ported the rewards_report function to a simple set-returning function (SRF)
Added JSONB data

The pagila database is made available under the PostgreSQL license.

EXAMPLE QUERY

Find late rentals:

SELECT
	CONCAT(customer.last_name, ', ', customer.first_name) AS customer,
	address.phone,
	film.title
FROM
	rental
	INNER JOIN customer ON rental.customer_id = customer.customer_id
	INNER JOIN address ON customer.address_id = address.address_id
	INNER JOIN inventory ON rental.inventory_id = inventory.inventory_id
	INNER JOIN film ON inventory.film_id = film.film_id
WHERE
	rental.return_date IS NULL
	AND rental_date < CURRENT_DATE
ORDER BY
	title
LIMIT 5;

FULLTEXT SEARCH

Fulltext functionality is built into PostgreSQL, so the parts of the schema that support it live in the main schema file.

Example usage:

SELECT * FROM film WHERE fulltext @@ to_tsquery('fate&india');

pgAdmin is included in the docker-compose setup.

Navigate to http://localhost:5050/
Default username: [email protected]
Default password: root

PARTITIONED TABLES

The payment table is designed as a partitioned table, range-partitioned by date with one partition per month over a seven-month span.
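
As a rough illustration of that layout, the sketch below generates one partition definition per calendar month. The partition names follow the payment_pYYYY_MM pattern visible in the \dt listing in this README. The calendar-month bounds and the helper name monthly_partition_ddl are assumptions for illustration and are not copied from pagila-schema.sql.

```python
from datetime import date


def monthly_partition_ddl(parent: str, start: date, months: int) -> list[str]:
    """Return one CREATE TABLE ... PARTITION OF statement per calendar month.

    Hypothetical helper: the bounds are assumed to be whole calendar months.
    """
    statements = []
    year, month = start.year, start.month
    for _ in range(months):
        # The upper bound is exclusive: the first day of the following month.
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        lower = date(year, month, 1)
        upper = date(next_year, next_month, 1)
        name = f"{parent}_p{year}_{month:02d}"
        statements.append(
            f"CREATE TABLE {name} PARTITION OF {parent} "
            f"FOR VALUES FROM ('{lower.isoformat()}') TO ('{upper.isoformat()}');"
        )
        year, month = next_year, next_month
    return statements


for statement in monthly_partition_ddl("payment", date(2022, 1, 1), 7):
    print(statement)
```

Called with seven months starting in January 2022, it produces the names payment_p2022_01 through payment_p2022_07.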

INSTALL NOTE

The pagila-data.sql and pagila-insert-data.sql files contain the same data: the former uses COPY commands and the latter uses INSERT commands, so you only need to load one of them. Both formats are provided for those who have trouble with one version or the other, and for instructors who want to point out the longer data loading time of the INSERT version. You can load them via psql, pgAdmin, etc.

Because the JSONB data is quite large to store on GitHub, its backup is not a plain SQL file. You can still use psql, pgAdmin, etc. to load pagila-schema-jsonb.sql, but please use pg_restore to load the JSONB data files:

pg_restore -U postgres -d pagila /usr/share/pagila/pagila-data-yum-jsonb.sql
pg_restore -U postgres -d pagila /usr/share/pagila/pagila-data-apt-jsonb.sql

VERSION HISTORY

Version 3.0.0

Add JSONB sample data (based on the packages at apt.postgresql.org and yum.postgresql.org)
Add docker compose support (contributed by https://github.com/theothermattm) #16
Add steps to create pagila database on docker by @dedeco in #13
Add missing user argument by @zOxta in #14
Update dates to 2022
Fix various issues reported on GitHub

Version 2.1.0

Replace varchar(n) with text (David Fetter)
Match foreign key and primary key data type in some tables (Ganeshan Venkataraman)
Change CREATE TABLE statement for customer table to use DEFAULT nextval('customer_customer_id_seq'::regclass) for customer_id field instead of SERIAL (Adrian Klaver).

Version 2.0

Update schema for newer PostgreSQL versions
Remove RULE for partitioning, add trigger support.
Update years in sample data.
Remove ARTICLES section from README, all links are dead.

Version 0.10.1

Add pagila-data-insert.sql file, add articles section

Version 0.10

Support for built-in fulltext. Add enum example

Version 0.9

Add table partitioning example

Version 0.8

First release of pagila

CREATE DATABASE ON DOCKER

In a terminal, pull the latest postgres image:

 docker pull postgres

Run the image:

 docker run --name postgres -e POSTGRES_PASSWORD=secret -d postgres

Start psql inside the container and create the database:

docker exec -it postgres psql -U postgres

psql (13.1 (Debian 13.1-1.pgdg100+1))
Type "help" for help.

postgres=# CREATE DATABASE pagila;
CREATE DATABASE
postgres=# \q

Create all schema objects (tables, etc.), replacing <local-repo> with the path to your local clone:

cat <local-repo>/pagila-schema.sql | docker exec -i postgres psql -U postgres -d pagila

Insert all data:

cat <local-repo>/pagila-data.sql | docker exec -i postgres psql -U postgres -d pagila

Done! Just use:

docker exec -it postgres psql -U postgres

psql (13.1 (Debian 13.1-1.pgdg100+1))
Type "help" for help.

postgres=# \c pagila
You are now connected to database "pagila" as user "postgres".
pagila=# \dt
                    List of relations
 Schema |       Name       |       Type        |  Owner
--------+------------------+-------------------+----------
 public | actor            | table             | postgres
 public | address          | table             | postgres
 public | category         | table             | postgres
 public | city             | table             | postgres
 public | country          | table             | postgres
 public | customer         | table             | postgres
 public | film             | table             | postgres
 public | film_actor       | table             | postgres
 public | film_category    | table             | postgres
 public | inventory        | table             | postgres
 public | language         | table             | postgres
 public | payment          | partitioned table | postgres
 public | payment_p2022_01 | table             | postgres
 public | payment_p2022_02 | table             | postgres
 public | payment_p2022_03 | table             | postgres
 public | payment_p2022_04 | table             | postgres
 public | payment_p2022_05 | table             | postgres
 public | payment_p2022_06 | table             | postgres
 public | payment_p2022_07 | table             | postgres
 public | rental           | table             | postgres
 public | staff            | table             | postgres
 public | store            | table             | postgres
(21 rows)

pagila=#

CREATE DATABASE ON DOCKER-COMPOSE

Run:

docker-compose up

Done! Just use:

docker exec -it pagila psql -U postgres

psql (13.1 (Debian 13.1-1.pgdg100+1))
Type "help" for help.

postgres=# \c pagila
You are now connected to database "pagila" as user "postgres".
pagila=# \dt
                    List of relations
 Schema |       Name       |       Type        |  Owner
--------+------------------+-------------------+----------
 public | actor            | table             | postgres
 public | address          | table             | postgres
 public | category         | table             | postgres
 public | city             | table             | postgres
 public | country          | table             | postgres
 public | customer         | table             | postgres
 public | film             | table             | postgres
 public | film_actor       | table             | postgres
 public | film_category    | table             | postgres
 public | inventory        | table             | postgres
 public | language         | table             | postgres
 public | payment          | partitioned table | postgres
 public | payment_p2022_01 | table             | postgres
 public | payment_p2022_02 | table             | postgres
 public | payment_p2022_03 | table             | postgres
 public | payment_p2022_04 | table             | postgres
 public | payment_p2022_05 | table             | postgres
 public | payment_p2022_06 | table             | postgres
 public | payment_p2022_07 | table             | postgres
 public | rental           | table             | postgres
 public | staff            | table             | postgres
 public | store            | table             | postgres
(21 rows)

pagila=#

pagila's Issues

Where is screenshot showing schema design?

At the very least, we want to quickly see the table relations in an ER diagram.
Is there any plan to include a screenshot that shows what the table relationships look like?
For an example schema design, I think that should be obvious.

License?

Hi,

Thanks for doing this, great stuff!

I'd like to include Pagila as part of some test fixtures in my current project... would that be OK?
If it's not too much of a pain, might it be possible to slap a LICENSE.md file or similar in there please?

Thanks again!

Insert statements are disordered

I can't execute the file with inserts because the order of the statements violates the FKs.
For example, the city and country tables are inserted after address.
Please, reorder the inserts.
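
One way to derive a safe load order is to sort the tables topologically by their foreign-key references, so that every referenced table is loaded before the tables that point at it. The sketch below uses Python's standard graphlib module (Python 3.9+). The dependency map is a small, hypothetical subset chosen for illustration and is not extracted from pagila-schema.sql.

```python
from graphlib import TopologicalSorter

# Hypothetical subset of foreign-key dependencies:
# each table maps to the set of tables it references.
FK_DEPENDENCIES = {
    "country": set(),
    "city": {"country"},
    "address": {"city"},
    "customer": {"address"},
}


def load_order(dependencies):
    """Return table names so that every referenced table comes first."""
    return list(TopologicalSorter(dependencies).static_order())


print(load_order(FK_DEPENDENCIES))
```

If the real dependency graph contains a cycle, static_order raises graphlib.CycleError, which is a useful signal that constraints would need to be deferred or added after the load.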

Add a link to Sakila?

It could be a good idea to have a link to Sakila in the readme, for people who are also interested in the MySQL version.

language.name is `character(20)` type, which probably isn't idiomatic/modern

Second try reporting #32

Maybe it's intended to have some "legacy" data in a fixed-length format, but in case not, I thought I would report it, as it surprised me and it's the only instance in the schema.

Strange rental dates

In pagila-data.sql there are several (182, I think) rentals that occur on 2020-02-14 15:16:03+00.

Here is a sample:

12114	2005-08-17 23:02:00+01	1405	65	2005-08-26 18:02:00+01	1	2020-02-16 02:30:53+00
12115	2005-08-17 23:04:15+01	1228	457	2005-08-20 22:25:15+01	2	2020-02-16 02:30:53+00
12116	2020-02-14 15:16:03+00	3082	560	\N	2	2020-02-16 02:30:53+00
12117	2005-08-17 23:11:12+01	4140	303	2005-08-22 23:56:12+01	1	2020-02-16 02:30:53+00
12118	2005-08-17 23:14:25+01	158	89	2005-08-26 22:26:25+01	1	2020-02-16 02:30:53+00
12119	2005-08-17 23:16:44+01	4298	567	2005-08-20 02:13:44+01	2	2020-02-16 02:30:53+00
12120	2005-08-17 23:16:46+01	2912	323	2005-08-19 00:11:46+01	2	2020-02-16 02:30:53+00
12121	2005-08-17 23:20:40+01	3423	69	2005-08-22 21:30:40+01	2	2020-02-16 02:30:53+00
12122	2005-08-17 23:20:45+01	4030	375	2005-08-25 04:23:45+01	2	2020-02-16 02:30:53+00
12123	2005-08-17 23:22:18+01	361	497	2005-08-19 23:36:18+01	2	2020-02-16 02:30:53+00
12124	2005-08-17 23:22:46+01	2036	22	2005-08-21 01:40:46+01	1	2020-02-16 02:30:53+00
12125	2005-08-17 23:24:25+01	136	573	2005-08-25 03:08:25+01	2	2020-02-16 02:30:53+00
12126	2005-08-17 23:25:21+01	2304	302	2005-08-23 21:51:21+01	1	2020-02-16 02:30:53+00
12127	2020-02-14 15:16:03+00	4218	582	\N	2	2020-02-16 02:30:53+00
12128	2005-08-17 23:31:09+01	2252	415	2005-08-24 05:07:09+01	2	2020-02-16 02:30:53+00
12129	2005-08-17 23:31:25+01	891	146	2005-08-26 19:10:25+01	2	2020-02-16 02:30:53+00
12130	2020-02-14 15:16:03+00	1358	516	\N	2	2020-02-16 02:30:53+00
12131	2005-08-17 23:34:16+01	3380	21	2005-08-26 01:18:16+01	1	2020-02-16 02:30:53+00

I'm trying to follow along with an online course. Their pagila db seems to have the same number of rows in rental, but their dates span 5 months. So, it seems like the rentals on 2020-02-14 should be in the database, but maybe they were modified accidentally?
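
In the sample above, every row dated 2020-02-14 also has \N (NULL) as its return date. The sketch below shows one way to check that pattern by splitting rows on their return state. It uses three rows copied from the sample; the column order (rental_id, rental_date, inventory_id, customer_id, return_date, staff_id, last_update) and the function name are assumptions.

```python
# Three rows copied from the sample above (tab-separated, \N means NULL).
SAMPLE_ROWS = (
    "12115\t2005-08-17 23:04:15+01\t1228\t457\t2005-08-20 22:25:15+01\t2\t2020-02-16 02:30:53+00\n"
    "12116\t2020-02-14 15:16:03+00\t3082\t560\t\\N\t2\t2020-02-16 02:30:53+00\n"
    "12127\t2020-02-14 15:16:03+00\t4218\t582\t\\N\t2\t2020-02-16 02:30:53+00\n"
)

NULL_MARKER = "\\N"


def split_by_return_state(copy_text):
    """Return (returned, unreturned) lists of (rental_id, rental_date)."""
    returned, unreturned = [], []
    for line in copy_text.splitlines():
        fields = line.split("\t")
        entry = (int(fields[0]), fields[1])
        # Assumed layout: the fifth field is return_date.
        if fields[4] == NULL_MARKER:
            unreturned.append(entry)
        else:
            returned.append(entry)
    return returned, unreturned


returned, unreturned = split_by_return_state(SAMPLE_ROWS)
print("returned:", returned)
print("unreturned:", unreturned)
```

Running the same check over the whole rental section of pagila-data.sql would show whether the 2020-02-14 rows are exactly the unreturned rentals.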

More diverse data

Would you welcome pull requests to diversify the data a bit?

For example:

Even though there are 6 languages in the language table, only English (id 1) appears in the film table.
No film has more than one category assigned even though it's a many-to-many relationship.
Only one rental has more than one payment.
There are only 2 stores and 2 staff.

More variety would make practice query results more interesting and help people see the need to GROUP BY etc.

Add pg_partman support

TODO: Add partitioning with pg_partman for educational purposes.

Neither of the SQL files works

No matter which SQL file I try to run (...-data, ...-insert-data) on an empty database without any tables, there are a whole lot of errors and exceptions. Some of them are constraint errors like:

[42710] ERROR: constraint "store_address_id_fkey" for relation "store" already exists

Others are relation errors:

[42P07] ERROR: relation "idx_fk_address_id" already exists

And some say that some data was apparently not loaded before being referenced by foreign keys:

[23503] Batch entry 0 INSERT INTO public.rental VALUES (7780, '2005-07-28 07:11:55+01', 3069, 236, '2005-08-06 05:41:55+01', 1, '2020-02-16 02:30:53+00') was aborted: ERROR: insert or update on table "rental" violates foreign key constraint "rental_customer_id_fkey"
Detail: Key (customer_id)=(236) is not present in table "customer".  Call getNextException to see other errors in the batch.

payment.payment_id missing PRIMARY KEY constraint

In the original Sakila schema, the payment_id column in the payment table has a PRIMARY KEY constraint. Pagila does not.
I found this by accident when I restored a backup of pagila over itself and duplicated all the rows in the partitioned tables.

Because payment_date is the partition key, and a primary key on a partitioned table must include all of the partition key columns, it would have to be defined as

ALTER TABLE payment ADD PRIMARY KEY (payment_date, payment_id)

Importing Error on PostgreSQL 9.6

When I run the following command as the postgres user on v9.6, I get the following errors:

psql pagila < pagila-schema.sql
[...]
ALTER TABLE
ALTER TABLE
ERROR:  relation "payment_p2017_01" does not exist
ERROR:  relation "payment_p2017_02" does not exist
ERROR:  relation "payment_p2017_03" does not exist
ERROR:  relation "payment_p2017_04" does not exist
ERROR:  relation "payment_p2017_05" does not exist
ERROR:  relation "payment_p2017_06" does not exist

This error is present on CentOS / Debian / Arch / Ubuntu.

Strange default value on customer table

In pagila-schema.sql, the definition for the customer table has the default value for create_date as ('now'::text)::date. Is there a reason why it can't be now()::date?

https://github.com/devrimgunduz/pagila/blob/master/pagila-schema.sql#L276

Import error

ERROR: syntax error at or near "AS"
LINE 580: AS integer
^
SQL state: 42601
Character: 16059

Cannot run pagila-schema.sql in Postgres 10.5

Error

2018-09-30 18:42:09.856 UTC [38] ERROR:  cannot create index on partitioned table "payment"
2018-09-30 18:42:09.856 UTC [38] STATEMENT:  CREATE INDEX idx_fk_customer_id ON payment USING btree (customer_id);
psql:/docker-entrypoint-initdb.d/pagila.sql:1102: ERROR:  cannot create index on partitioned table "payment"

How to reproduce

Install Docker.
Run docker run --rm aa8y/postgres-dataset:pagila

Code

See this and this

JSONB Columns in the domain tables

The current examples for JSONB (yum and apt) are outside the domain of the DVD rental example. It might be more beneficial to add JSONB columns to either the existing view or to create a new (materialized) view that has these json columns. An example below.

-- actor_info view
SELECT a.actor_id,
       a.first_name,
       a.last_name,
       jsonb_object_agg(c.name, (SELECT array_agg(f.title) AS array_agg
                                             FROM film f
                                                      JOIN film_category fc_1 ON f.film_id = fc_1.film_id
                                                      JOIN film_actor fa_1 ON f.film_id = fa_1.film_id
                                             WHERE fc_1.category_id = c.category_id
                                               AND fa_1.actor_id = a.actor_id
                                             GROUP BY fa_1.actor_id)) AS film_info
FROM actor a
         LEFT JOIN film_actor fa ON a.actor_id = fa.actor_id
         LEFT JOIN film_category fc ON fa.film_id = fc.film_id
         LEFT JOIN category c ON fc.category_id = c.category_id
GROUP BY a.actor_id, a.first_name, a.last_name;

This would let users get more familiar with the different jsonb operators when filtering on the view. For instance:

SELECT actor_id, first_name, last_name, film_info
FROM actor_info
WHERE film_info -> 'Games' ? 'FEATHERS METAL'

Add materialized view example

Add an MV example to pagila-schema.sql .

Some table data have weird trailing whitespace

I noticed this in language.name but maybe there are others

postgres=# select * from lang limit 2;
 language_id |         name         |        updated         
-------------+----------------------+------------------------
           1 | English              | 2022-02-15 10:02:19+00
           2 | Italian              | 2022-02-15 10:02:19+00
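
The padding is what a fixed-length character(n) column produces: PostgreSQL pads shorter values with spaces up to n characters. A small Python sketch of that behavior follows; as_character_n is a hypothetical helper for illustration, not part of any PostgreSQL client library, and the width of 20 assumes the column is declared as character(20).

```python
def as_character_n(value: str, n: int) -> str:
    """Mimic how a character(n) column stores a shorter string.

    Illustrative stand-in only: shorter values are space-padded to n.
    """
    if len(value) > n:
        raise ValueError(f"value too long for type character({n})")
    return value.ljust(n)


stored = as_character_n("English", 20)
print(repr(stored))           # padded to 20 characters
print(repr(stored.rstrip()))  # trailing spaces removed on the client side
```

Declaring the column as text, as was done for the varchar(n) columns, would avoid the padding altogether.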

Suggestion for filenaming

I noticed that the .sql files are in PGDMP format. This is fun, but a .sql extension pretty much always indicates ASCII or UTF-8 text.

My suggestion is to rename the PGDMP formatted .sql files to .pgdmp to make it obvious what format these files are in.

Customer table sequence and duplicate key errors

In pagila-schema.sql there is this:

--
-- Name: customer_customer_id_seq; Type: SEQUENCE; Schema: public; Owner: postgres

CREATE SEQUENCE customer_customer_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;

ALTER TABLE customer_customer_id_seq OWNER TO postgres;

SET default_tablespace = '';

SET default_with_oids = false;

--
-- Name: customer; Type: TABLE; Schema: public; Owner: postgres

CREATE TABLE customer (
customer_id SERIAL PRIMARY KEY,
...

Unfortunately what you end up with is:

Using SERIAL as the type for customer_id causes Postgres to create a new sequence bound to the column, with a start value of 1. This means inserting new data fails with duplicate key errors.

A quick scan through the rest of pagila-schema.sql shows that this is the only place this is done. The rest of the CREATE TABLE statements follow the pattern:

id_fld fld_type DEFAULT nextval('some_seq'::regclass) NOT NULL,
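
The failure mode can be sketched without a database. In the toy model below, Sequence is a tiny stand-in for a database sequence (not PostgreSQL's implementation), and the existing ids are hypothetical. A fresh sequence that starts at 1 hands out ids that already exist in the loaded data, while a sequence advanced past the highest loaded id does not.

```python
class Sequence:
    """Tiny stand-in for a database sequence, for illustration only."""

    def __init__(self, start=1):
        self._next = start

    def nextval(self):
        value = self._next
        self._next += 1
        return value

    def setval(self, value):
        # Modeled on setval(seq, value): the following nextval() gives value + 1.
        self._next = value + 1


def insert(existing_ids, seq):
    """Insert a row whose id comes from seq, rejecting duplicates."""
    new_id = seq.nextval()
    if new_id in existing_ids:
        raise KeyError(f"duplicate key value: {new_id}")
    existing_ids.add(new_id)
    return new_id


loaded_ids = {1, 2, 3}             # hypothetical ids already loaded from the data file

fresh = Sequence()                 # what an unsynchronized sequence looks like
try:
    insert(loaded_ids, fresh)
except KeyError as error:
    print("insert failed:", error)

synced = Sequence()
synced.setval(max(loaded_ids))     # advance past the loaded data
print("inserted id:", insert(loaded_ids, synced))
```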