Coder Social home page Coder Social logo

anthonydb / practical-sql Goto Github PK

View Code? Open in Web Editor NEW
645.0 43.0 398.0 113.81 MB

Code and Data for the First Edition of "Practical SQL" by Anthony DeBarros, published by No Starch Press (2018).

License: Other

PLpgSQL 100.00%
sql postgresql data-analysis

practical-sql's Introduction

Practical SQL: A Beginner's Guide to Storytelling with Data

Practical SQL by journalist and data analyst Anthony DeBarros is a beginner-friendly guide to the database programming language SQL. Starting with SQL basics, you'll work through intermediate and advanced topics including statistics, aggregation, cleaning data, GIS and automating tasks. Along the way, you'll use real-world data from the U.S. Census and other government agencies and learn the fundamentals of good database design. This book is ideal for beginners as well as those who know some SQL and want to go deeper.

A Note About Editions

This repository is for the First Edition of Practical SQL, published in 2018. If you're reading the Second Edition, published in 2022, you must use the code and data in the second edition repository. If the cover of your copy does not say, "2nd Edition," then you're using the first edition.

Who Is This Book For?

Practical SQL is ideal for beginners as well as those who know some SQL and want to go deeper.

Which Database Does The Book Use?

We use PostgreSQL, which is free and open source. PostgreSQL is used by some of the world's largest companies. Its SQL syntax adheres closely to the ANSI SQL standard, and the concepts you learn will apply to most database management systems, including MySQL, Oracle, SQLite, and others. Note that Microsoft SQL Server employs a variant of SQL called T-SQL, which is not covered by Practical SQL.

What's In This Repository?

Code: All the SQL statements and command-line listings used in each chapter, organized by chapter folders.

Data: CSV files and GIS shapefiles for you to import, also organized by chapter. NOTE! See the warning below about opening CSV files with Excel or text editors in the section on Getting the Code and Data.

Exercises: The "Try It Yourself" questions and answers for each chapter, listed separately. Try working through the questions before peeking at the answers!

FAQ, Updates, and Errata: Answers to frequently asked questions, updates, and corrections are noted at faq-updates-errata.md.

Resources: Updates to the book's Appendix on Additional PostgreSQL Resources at resources.md.

Chapters

  • Chapter 1: Creating Your First Database and Table
  • Chapter 2: Beginning Data Exploration with SELECT
  • Chapter 3: Understanding Data Types
  • Chapter 4: Importing and Exporting Data
  • Chapter 5: Basic Math and Stats with SQL
  • Chapter 6: Joining Tables in a Relational Database
  • Chapter 7: Table Design That Works for You
  • Chapter 8: Extracting Information by Grouping and Summarizing
  • Chapter 9: Inspecting and Modifying Data
  • Chapter 10: Statistical Functions In SQL
  • Chapter 11: Working With Dates and Times
  • Chapter 12: Advanced Query Techniques
  • Chapter 13: Mining Text to Find Meaningful Data
  • Chapter 14: Analyzing Spatial Data with PostGIS
  • Chapter 15: Saving Time with Views, Functions, and Triggers
  • Chapter 16: Using PostgreSQL from the Command Line
  • Chapter 17: Maintaining Your Database
  • Chapter 18: Identifying and Telling the Story Behind Your Data
  • Appendix: Additional PostgreSQL Resources

Getting the Code and Data on Your Computer

Non-GitHub Users

You can obtain all the code and data at once by downloading this repository as a .zip file. To do that:

  • Click the Code button at top right.
  • Click Download ZIP
  • Unzip the file on your computer. Place it in a directory that's easy to remember so you can reference it during the exercises that include importing data to PostgreSQL.

Warning about CSV files!: Opening CSV files with Excel could lead to data loss. Excel will remove leading zeros from numbers that are intended to be stored as text, such as zip codes. If you wish to view the contents of a CSV file, only do so with a plain-text editor and be careful not to save the file in an encoding other than UTF-8 (e.g., Notepad.exe on Windows defaults to ANSI).

GitHub Users

GitHub users may want to clone the repository locally and occasionally perform a git pull to receive any updates.

Where Can I Buy the Book?

Practical SQL is published by No Starch Press and available in PDF, .mobi, .epub, and classic print formats.

How Can I Get Help?

Questions? Please email [email protected]. You can also open an issue in this repository.

practical-sql's People

Contributors

anthonydb avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

practical-sql's Issues

Chapter 8: Using GROUP BY with count()

Inconsistencies between code in listing 8 - 9 github respository and textbook:
github code uses order by count(stabr) while textbook lists order by count(*)

Ch. 4 ERROR: missing data for column "state_us_abbreviation"

Listing 4.3 query:

COPY us_counties_2010
FROM '/Users/Hipolito/Downloads/us_counties_2010.csv'
WITH (FORMAT CSV, HEADER);

error:

ERROR: missing data for column "state_us_abbreviation"
CONTEXT: COPY us_counties_2010, line 2: ""
SQL state: 22P04

Mac OS, installed PostgreSQL exactly as suggested in the book, ran the query using pgAdmin4.

regexp_matches queries returning no data

Taking this issue up based on chapter notes.

SELECT crime_id,
       regexp_matches(original_text, '-\d{1,2}\/\d{1,2}\/\d{2}')
FROM crime_reports;

Returned no data when I ran it, and checking the underlying data it was correct not to return any data.

More details to follow.

Chapter 13 tsvector

Hi,
I'm working on Win 10
I've installed PosgreSQL as described in book
I'm using version 4.28
Yes, I did download it from GitHub

When Running Listing 13-15 on page 231, chapter 13,
SELECT to_tsvector('I am walking across the sitting room to sit with you.');

pg gives me result:
'across':4 'am':2 'i':1 'room':7 'sit':9 'sitting':6 'the':5 'to':8 'walking':3 'with':10 'you':11
As long as I know, it is not, how tsvector should work, but instead should provide shortened version, for example: 'sit' :6,9

Im I doing something wrong?

Ch. 4 Copy | Mac OS | Error 42P01

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

Ch. 4: Importing and Exporting Data

What's your operating system (e.g. Windows 10, macOS Catalina, Ubuntu, etc.)?

MacOS Ventura 13.1

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.

Yes.

Which versions of PostgreSQL and pgAdmin 4 are you using? If you aren't sure, for PostgreSQL run the SQL command SELECT version(); and for pgAdmin 4, navigate to "About pgAdmin 4" under the "Help" menu.

PostgreSQL is 15.1 and pgAdmin is 6.18

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.

Yes, I followed the instructions.

Issue or Question

Chapter, page and code listing number:

Ch. 4, pg 47, listing 4-3

Describe the issue or your question:

This seems like the most simple action we've been asked to perform, and yet...I keep getting the 42P01 error..Please help....I've scoured the internet (and all the prior questions) and cannot resolve this :( Thank you

Please paste the code and error message here. It's OK to alter identifying info such as a folder name

ERROR: relation "us_counties_2010" does not exist
SQL state: 42P01

missing "," in chapter 6

"
➌ SELECT c2010.geo_name,
c2010.state_us_abbreviation AS state,
c2010.p0010001 AS pop_2010,
c2000.p0010001 AS pop_2000
c2010.p0010001 - c2000.p0010001 AS raw_change,
"

missing "," in line 4

im reading the book on learning.oreilly

Pg. 50; Listing 4-6 returning with "42601" parameter error

Practical SQL Issue Template

Having trouble with data, code, or exercises in Practical SQL? I'm glad to help. Please answer these questions, and I'll typically reply within one business day.

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

General

What's your operating system (e.g. Windows 10, macOS Mojave, Ubuntu, etc.)?
Windows10
Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.
Yes
Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.
Yes

Issue or Question

4.6 returning with "42601" parameter error Pg. 50
Chapter, page and code listing number:

Describe the issue or your question. Please include any error messages:

Chapter 13: Datestyle Question

Having trouble with data, code, or exercises in Practical SQL? I'm glad to help. Please answer these questions, and I'll typically reply within one to two business days.

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

General

Please answer all of these as they're essential for troubleshooting.

What's your operating system (e.g. Windows 10, macOS Catalina, Ubuntu, etc.)?
Windows10

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.
Installed as per instructions.

Which versions of PostgreSQL and pgAdmin 4 are you using? If you aren't sure, for PostgreSQL run the SQL command SELECT version(); and for pgAdmin 4, navigate to "About pgAdmin 4" under the "Help" menu.

PostgreSQL14 and pdAdmin 6.1

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.

Downloaded as per instruction.

Issue or Question

Chapter, page and code listing number:
Chapter 13, Listing 13-9

Describe the issue or your question:
So this is not the first time this has come up, and it has to do with loading dates formatted differently from UTC into my database. In the Regexp Chapter we match American style dates in the "MM/DD/YY" format. However when I attempt to update my table, I get the error below. After some research it seems I have to change my date style settings, however is there a better way for PostgreSQL to have a collection of tolerated datestyle settings instead of constantly changing it? I am still getting to terms with the settings of PostgreSQL and considering I am going to be dealing with UTC later, I'm worried about changing something I won't know how to change back. Any recommendations?

Please paste the code and error message here. It's OK to alter identifying info such as a folder name
ERROR: date/time field value out of range: "4/16/17 2100 US/Eastern"
HINT: Perhaps you need a different "datestyle" setting.
SQL state: 22008

Extra data after last expected column

You reference this error in the book when trying to import the US census but do not indicate how to resolve. I am receiving this error: extra data after last expected column
Context: copy us_counties_2010, line 2: Autauga county
Error proceed with exact info in your book.

How do you resolve ?

Inclusion of COPY statements in DDL files prevents readers from simply loading them.

Hey, your tech reviewer here.

Example: https://github.com/anthonydb/practical-sql/blob/master/Chapter_08/Chapter_08.sql

The COPY statements are interspersed with table definition statements. The problem with this is that the COPY statements are going to be specific the the reader's OS and where they saved the files. This prevents the reader from simply running the file using PSQL or pgAdmin.

I suggest instead putting the example COPY statements in there, commented out.

Chapter 4 csv usa_counties2010.csv

Having trouble with data, code, or exercises in Practical SQL? I'm glad to help. Please answer these questions, and I'll typically reply within one to two business days.

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

General

What's your operating system (e.g. , macOS Catalina, Ubuntu, etc.)?

Windows 10
Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.

yes
Which versions of PostgreSQL and pgAdmin 4 are you using? If you aren't sure, for PostgreSQL run the SQL command SELECT version(); and for pgAdmin 4, navigate to "About pgAdmin 4" under the "Help" menu.

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.
yes

Issue or Question

Chapter, page and code listing number:
ch 4 usa_counties2010.csv

Describe the issue or your question:

Please answer all of these as they're essential for troubleshooting.
Greetings!
Could you provide the correct CSV file( usa_counties2010.csv), please?
Here's an error which was described in the chapter 4, that there're too many columns.
I tried to fix it by deleting extra columns but that didn't work for me right.
I'd really appreciate a fixed CSV table!
Thanks in advance!
Please paste the code and error message here. It's OK to alter identifying info such as a folder name

Installation Instructions for PL/Python Extension

Over time, various distributions of PostgreSQL have modified the steps for installing PL/Python support. This issue will track updates (which are reflected in the 2nd Edition of Practical SQL).

macOS

If you are using Postgres.app as described in the book, please see its PL/Python support documentation.

Windows 10 and 11

First, make sure you have installed the EDB Language Pack as part of your PostgreSQL installation.

Then follow these steps:

  1. Open the Windows Control Panel by clicking the Search icon on the Windows taskbar, entering Control Panel, and then clicking the Control Panel icon.

  2. In the Control Panel app, enter Environment in the search box. In the list of search results displayed, click Edit the System Environment Variables. A System Properties dialog will appear.

  3. In the System Properties dialog, on the Advanced tab, click Environment Variables. The dialog that opens has two sections: User variables and System variables. In the System variables section, if you don’t see a PATH variable, continue to step a to create a new one. If you do see an existing PATH variable, continue to step b to modify it.

    a. If you don’t see PATH in the System variables section, click New to open a New System Variable dialog.
    In the Variable name box, enter PATH. In the Variable value box, enter C:\edb\languagepack\v2\Python-3.9. (Instead of typing, you can click Browse Directory and navigate to the directory in the Browse For Folder dialog.) When you’ve either entered the path manually or browsed to it, click OK on the dialog to close it.

    b. If you do see an existing PATH variable in the System variables section, highlight it and click Edit. In the list of variables that displays, click New and enter C:\edb\languagepack\v2\Python-3.9. Once you’ve added the Language Pack path, highlight it in the list of variables and click Move Up until the path is at the top of the variables list. That way, PostgreSQL will find the correct Python version if you have additional Python installations.

  4. Finally, in the System variables section, click New. In the New System Variable dialog, enter PYTHONHOME in the Variable name box. In the Variable value box, enter C:\edb\languagepack\v2\Python-3.9.

When you’re finished, click OK in all dialogs to close them. Note that these Python path settings will take effect the next time you restart your system.

File access denied on MacBook Air using COPY

Please include the chapter number in your issue title. For example: "Chapter 3: Listing 3-1

General

Please answer all of these as they're essential for troubleshooting.

**What's your operating system : macOS Catalina

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.
Yes

*Versions of PostgreSQL and pgAdmin 4 are you using? Version 5.5 (4280.88)

**Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? Yes

Issue or Question

**Chapter, page and code listing number:3 page 25

Describe the issue or your question:
I cannot get permission to write to the textfile on macbook Air. I have given permission for PGadmin to acces the file

ERROR: could not open file "/Users/dasc/OneDrive - Myname/typetest.txt" for writing: Permission denied
HINT: COPY TO instructs the PostgreSQL server process to write a file. You may want a client-side facility such as psql's \copy.
SQL state: 42501

I dont understand whats is wrong

CH 4 Try it Yourself #2

Hi!

For #2 Try it yourself example in Chapter 4, what would be the statement to file the 20 counties in the United States that have the most housing units. I am confused to what statement would be used to only show the highest 20 without using DESC.

Thank you!

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.

Which versions of PostgreSQL and pgAdmin 4 are you using? If you aren't sure, for PostgreSQL run the SQL command SELECT version(); and for pgAdmin 4, navigate to "About pgAdmin 4" under the "Help" menu.

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.

Issue or Question

Chapter, page and code listing number:

Describe the issue or your question:

Please paste the code and error message here. It's OK to alter identifying info such as a folder name

Chapter 4: COPY us_counties_2010...

COPY us_counties_2010
FROM 'C:\YourDirectory\us_counties_2010.csv'
WITH (FORMAT CSV, HEADER);

will give the error
ERROR: invalid byte sequence for encoding "UTF8": 0xf1 0x61 0x20 0x41
CONTEXT: COPY us_counties_2010, line 1804
SQL state: 22021

This is because line 1804 contains a tilde in the NAME column "Dona Ana County"

Error in executing factorial part of listing 5-3

Having trouble with data, code, or exercises in Practical SQL? I'm glad to help. Please answer these questions, and I'll typically reply within one to two business days.

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

General

Please answer all of these as they're essential for troubleshooting.

What's your operating system (e.g. Windows 10, macOS Catalina, Ubuntu, etc.)?
Windows 10

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.
Yes

Which versions of PostgreSQL and pgAdmin 4 are you using? If you aren't sure, for PostgreSQL run the SQL command SELECT version(); and for pgAdmin 4, navigate to "About pgAdmin 4" under the "Help" menu.
PostgreSQL = 14
pgAdmin 4 = 6.1

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.

Issue or Question

Chapter, page and code listing number:
Chapter 5, listing 5-3

Describe the issue or your question:
Code mentioned in book for calculating factorial of a number is not executing successfully. It is showing an error at the end of the syntax. However, for this I am able to use inbulit factorial function. SELECT FACTORIAL(4);

Please paste the code and error message here. It's OK to alter identifying info such as a folder name
SELECT 4 !; -- factorial
D

ERROR: syntax error at or near ";"
LINE 1: SELECT 4 !; -- factorial
^
SQL state: 42601
Character: 11

Chapter 4 Exporting

When I try Listing 4-8 on page 52

COPY us_counties_2010
TO 'C:\us_counties_export.txt'
WITH (FORMAT CSV, HEADER, DELIMITER '|')

I get the following error message:
ERROR: could not open file "C:\us_counties_export.txt" for writing: Permission denied
HINT: COPY TO instructs the PostgreSQL server process to write a file. You may want a client-side facility such as psql's \copy. SQL state: 42501

I have no idea how to rectify this situation.

Chapter 4: Missing data column and unable to import CSV

Hello,

I downloaded the CSV file of census data but am having trouble importing into the table.
When trying to copy the document using the sql syntax, the following error occurs:
ERROR: missing data for column "county_fips"
CONTEXT: COPY us_counties_pop_est_2019, line 2: ""01,001,3,Alabama,Autauga County,1539602137,25706961,+32.5322367,-086.6464395,55533,55869,624,541,-1..."
SQL state: 22P04

Additionally when attempting to import the data it supplies a 'columns' error. Please advise, thanks!

Chapter 13 regexp error

Hi,
I'm working on Win 10
I've installed PosgreSQL as described in book
I'm using version 4.28
Yes, I did download it from GitHub

When Running Listing 13-9 on page 225, chapter 13,
UPDATE crime_reports
SET date_1 =
(
(regexp_match(original_text, '\d{1,2}/\d{1,2}/\d{2}'))[1]
|| ' ' ||
(regexp_match(original_text, '/\d{2}\n(\d{4})'))[1]
||' US/Eastern'
)::timestamptz;

pg announces error:
ERROR: date/time field value out of range: "4/16/17 2100 US/Eastern"
HINT: Perhaps you need a different "datestyle" setting.
SQL state: 22008

Code is definitely right, I've copied it and all the previous steps as well

Chapter 5 Try it yourself

Hi,

I was doing Task 3 from Chapter 5 Try it Yourself. And I wanted to try and find a median value for all of the states. However, my code doesn't seem to be working when I try to add the geo_name column in the select function (works well without it).

Would you be able to help me amend the SQL statement so that it brings back the medians of each state with the county name next to it?

SELECT DISTINCT state_us_abbreviation, geo_name,
percentile_cont(0.5)
WITHIN GROUP (ORDER BY p0010001) AS "pct"
from us_counties_2010
GROUP BY 1, 2; 

chapter 4 retrieve census .csv file

Practical SQL Issue Template

Having trouble with data, code, or exercises in Practical SQL? I'm glad to help. Please answer these questions, and I'll typically reply within one business day.

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

General

What's your operating system (e.g. Windows 10, macOS Mojave, Ubuntu, etc.)?

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.

Issue or Question

Chapter, page and code listing number:

Describe the issue or your question. Please include any error messages:

When attempting to copy the census file from my desktop i get the following error message:

Here is the SQL statement: COPY us_counties_2010
FROM 'C:\users\Michael\Desktop\us_counties_2010.csv'
WITH (FORMAT CSV, HEADER);

Here is the result:

ERROR: could not open file "C:\users\Michael\Desktop\us_counties_2010.csv" for reading: Permission denied HINT: COPY FROM instructs the PostgreSQL server process to read a file. You may want a client-side facility such as psql's \copy. SQL state: 42501

I thought it was a Windows 10 permissions issue and set up my folders for access by everyone but it still errored out. Then i asked Windows support and they said it was an issue with PostgreSQL.

Thanks for your help.

Chapter 9 (Inspecting and modifying data) sample code of list 9-1 has a minor glitch

Greeting

Hi Anthony, my name is Benny Lin, I ask this question because I am translating your book to traditional Chinese, it will be published in Taiwan in Q2 2019. I found a minor issue which may need to fix your sample code of list 9-1.

I used an old Windows 7 64-bit (where I translated ^^)

I'd followed steps described in Intro chapter to install PostgreSQL and pgAdmin; the only difference I made was I chose Chinese (Traditional)_Taiwan.950 as my locale. Which caused my issue, I would explain how it happened below.

I did download the book's code example from GitHub.

The issue is: from the code of list 9-1, if I used original csv file, there would be error below:

ERROR: date/time field value out of range: "6/27/2016" HINT: Perhaps you need a different "datestyle" setting. CONTEXT: COPY meat_poultry_egg_inspect, line 2, column grant_date: "6/27/2016" SQL state: 22008

It seems that date format caused the error above. Since I am not sure which PostgreSQL function could convert format mm/dd/yyyy to yyyy/mm/dd (though I remember Mysql does), so I tried to amend the csv file in Excel and changed date format to yyyy/mm/dd.

Then I ran COPY again & it went well.

Of course I could try to set datestyle from YMD (coz my local is Chinese) to MDY (with ALTER DATABASE "analysis" SET datestyle TO "ISO, MDY"; then restart posgresql service) so your csv file could work smoothly.

I added a comment in my script to remind local readers about this special case. FYI.

Code Listing 5-7 returns duplicate rows

The query as listed on p. 62 and below:

SELECT geo_name,
state_us_abbreviation AS "st",
(CAST(p0010006 AS numeric(8,1)) / p0010001) * 100 AS "pct_asian"
FROM us_counties_2010
ORDER BY "pct_asian" DESC;

returns duplicate rows. One can suppress these with with a Select Distinct, but it's not clear why one should have to.

Results obtained under Postgres 2.3.3c (66) Mac and pgAdmin 4.17

Chapter 7 Listing 8-1 Import Error

Practical SQL Issue Template

Having trouble with data, code, or exercises in Practical SQL? I'm glad to help. Please answer these questions, and I'll typically reply within one business day.

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

General

What's your operating system (e.g. Windows 10, macOS Mojave, Ubuntu, etc.)?
Windows 10

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.
Yes.

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.
Yes

Issue or Question

Chapter, page and code listing number:
Chapter 7 Listing 8-1 on page 115

Describe the issue or your question. Please include any error messages:
When trying to run the code for listing 8-1 I get the following error message:

ERROR: invalid byte sequence for encoding "UTF8": 0xc9 0x52 CONTEXT: COPY pls_fy2014_pupld14a, line 7400 SQL state: 22021

I believe the issue is because of the accents over certain names like Munoz will have Muñoz. I tried to change some of the names to remove the accents, but there were too many.

I was wondering if there is a way to change them all somehow or if I can do something on PostgreSQL to accept the accented names.

Chapter 4: Try it Yourself #1

Using the following code in pgAdmin4 v3.4:

COPY actor
FROM '/Users/myusername/Desktop/movies.txt'
WITH (FORMAT CSV, HEADER, DELIMITER ':', QUOTE '#');

returns the following error:

ERROR: relation "actor" does not exist 
SQL state: 42P01

My txt file looks like this:

id:movie:actor
50:#Mission: Impossible#:Tom Cruise

Please help on a resolution. TIA!

Error in importing pls_fy2015_pupld14a

Apparently you didn't size the text columns correctly:

postgres=# \copy pls_fy2014_pupld14a from pls_fy2014_pupld14a.csv with csv header
ERROR: value too long for type character varying(60)
CONTEXT: COPY pls_fy2014_pupld14a, line 7400, column libname: "FLORIDA MUNICIPAL ELECTRONIC LIBRARY (LUIS G. PÉREZ VÁZQUEZ)"
Time: 77.828 ms

Chapter 6: SELECTing w/ JOIN by state_fips or county_fips gives inaccurate results

SELECT c2010.geo_name,
/snip
FROM us_counties_2010 c2010 INNER JOIN us_counties_2000 c2000
ON c2010.state_fips = c2000.state_fips
AND c2010.county_fips = c2000.county_fips
AND c2010.p0010001 <> c2000.p0010001
ORDER BY pct_change DESC;

Verify the problem by...

SELECT geo_name, state_us_abbreviation
FROM us_counties_2000
WHERE state_fips = '1';

produces 0 results, but

SELECT geo_name, state_us_abbreviation
FROM us_counties_2010
WHERE state_fips = '1';

produces 67 results

The problem arises with single-digit fips codes. Need to add a leading "0" ('01') to produce correct results for us_counties_2000 database. I copied/pasted the code to produce/query the databases.

Chapter 5 Median Aggregator Code Incompatible with PostgreSQL14

Having trouble with data, code, or exercises in Practical SQL? I'm glad to help. Please answer these questions, and I'll typically reply within one to two business days.

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

General

Please answer all of these as they're essential for troubleshooting.

What's your operating system (e.g. Windows 10, macOS Catalina, Ubuntu, etc.)?
I am running Windows 10

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.
I installed PostgreSQL and pgAdmin exactly according to the steps in the book.

Which versions of PostgreSQL and pgAdmin 4 are you using? If you aren't sure, for PostgreSQL run the SQL command SELECT version(); and for pgAdmin 4, navigate to "About pgAdmin 4" under the "Help" menu.
I am using PostgreSQL14 and pgAdmin version 6.1.

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.
I did download the instructions, in particular for this issue.

Issue or Question

Chapter, page and code listing number:
The problem isn't with the author but one due to PostgreSQL 14 in general, namely the median aggregator from the wiki listed in Chapter 5, page 131/132 (the entire code block).

Describe the issue or your question:
When I run the code mentioned I get the following error message, and considering I am still a beginner in SQL don't know how to fix it, and have opted to continue using the percentile_cont(0.5) WITHIN GROUP (ORDER BY _column name_) function instead

Please paste the code and error message here. It's OK to alter identifying info such as a folder name
Here is the error message:

ERROR:  function array_append(anyarray, anyelement) does not exist
SQL state: 42883

Chapter 4: ERROR: value too long for type character varying(3) CONTEXT: COPY us_counties_2010, line 2, column state_us_abbreviation: "STUSAB"

What's your operating system (e.g. Windows 10, macOS Catalina, Ubuntu, etc.)?
Arch Linux

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.

I installed PostgreSQL and pgAdmin through the official Arch Linux package manager (pacman) and the Arch User Repository respectively. I have mostly been using the command line via psql, and have not used pgAdmin as much.

Which versions of PostgreSQL and pgAdmin 4 are you using? If you aren't sure, for PostgreSQL run the SQL command SELECT version(); and for pgAdmin 4, navigate to "About pgAdmin 4" under the "Help" menu.

PostgreSQL 14.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 12.1.0, 64-bit

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.

Yes

Issue or Question

Chapter, page and code listing number:

Chapter 4, listing 4-3

Describe the issue or your question:

I seem to be unable to copy the US_counties_2010 csv file, despite copying the sql code and downloading the csv file.

Please paste the code and error message here. It's OK to alter identifying info such as a folder name

CREATE TABLE us_counties_2010 (
    geo_name varchar(90),                    -- Name of the geography
    state_us_abbreviation varchar(2),        -- State/U.S. abbreviation
    summary_level varchar(3),                -- Summary Level
    region smallint,                         -- Region
    division smallint,                       -- Division
    state_fips varchar(2),                   -- State FIPS code
    county_fips varchar(3),                  -- County code
    area_land bigint,                        -- Area (Land) in square meters
    area_water bigint,                        -- Area (Water) in square meters
    population_count_100_percent integer,    -- Population count (100%)
    housing_unit_count_100_percent integer,  -- Housing Unit count (100%)
    internal_point_lat numeric(10,7),        -- Internal point (latitude)
    internal_point_lon numeric(10,7),        -- Internal point (longitude)

    -- This section is referred to as P1. Race:
    p0010001 integer,   -- Total population
    p0010002 integer,   -- Population of one race:
    p0010003 integer,       -- White Alone
    p0010004 integer,       -- Black or African American alone
    p0010005 integer,       -- American Indian and Alaska Native alone
    p0010006 integer,       -- Asian alone
    p0010007 integer,       -- Native Hawaiian and Other Pacific Islander alone
    p0010008 integer,       -- Some Other Race alone
    p0010009 integer,   -- Population of two or more races
    p0010010 integer,   -- Population of two races:
    p0010011 integer,       -- White; Black or African American
    p0010012 integer,       -- White; American Indian and Alaska Native
    p0010013 integer,       -- White; Asian
    p0010014 integer,       -- White; Native Hawaiian and Other Pacific Islander
    p0010015 integer,       -- White; Some Other Race
    p0010016 integer,       -- Black or African American; American Indian and Alaska Native
    p0010017 integer,       -- Black or African American; Asian
    p0010018 integer,       -- Black or African American; Native Hawaiian and Other Pacific Islander
    p0010019 integer,       -- Black or African American; Some Other Race
    p0010020 integer,       -- American Indian and Alaska Native; Asian
    p0010021 integer,       -- American Indian and Alaska Native; Native Hawaiian and Other Pacific Islander
    p0010022 integer,       -- American Indian and Alaska Native; Some Other Race
    p0010023 integer,       -- Asian; Native Hawaiian and Other Pacific Islander
    p0010024 integer,       -- Asian; Some Other Race
    p0010025 integer,       -- Native Hawaiian and Other Pacific Islander; Some Other Race
    p0010026 integer,   -- Population of three races
    p0010047 integer,   -- Population of four races
    p0010063 integer,   -- Population of five races
    p0010070 integer,   -- Population of six races

    -- This section is referred to as P2. HISPANIC OR LATINO, AND NOT HISPANIC OR LATINO BY RACE
    p0020001 integer,   -- Total
    p0020002 integer,   -- Hispanic or Latino
    p0020003 integer,   -- Not Hispanic or Latino:
    p0020004 integer,   -- Population of one race:
    p0020005 integer,       -- White Alone
    p0020006 integer,       -- Black or African American alone
    p0020007 integer,       -- American Indian and Alaska Native alone
    p0020008 integer,       -- Asian alone
    p0020009 integer,       -- Native Hawaiian and Other Pacific Islander alone
    p0020010 integer,       -- Some Other Race alone
    p0020011 integer,   -- Two or More Races
    p0020012 integer,   -- Population of two races
    p0020028 integer,   -- Population of three races
    p0020049 integer,   -- Population of four races
    p0020065 integer,   -- Population of five races
    p0020072 integer,   -- Population of six races

    -- This section is referred to as P3. RACE FOR THE POPULATION 18 YEARS AND OVER
    p0030001 integer,   -- Total
    p0030002 integer,   -- Population of one race:
    p0030003 integer,       -- White alone
    p0030004 integer,       -- Black or African American alone
    p0030005 integer,       -- American Indian and Alaska Native alone
    p0030006 integer,       -- Asian alone
    p0030007 integer,       -- Native Hawaiian and Other Pacific Islander alone
    p0030008 integer,       -- Some Other Race alone
    p0030009 integer,   -- Two or More Races
    p0030010 integer,   -- Population of two races
    p0030026 integer,   -- Population of three races
    p0030047 integer,   -- Population of four races
    p0030063 integer,   -- Population of five races
    p0030070 integer,   -- Population of six races

    -- This section is referred to as P4. HISPANIC OR LATINO, AND NOT HISPANIC OR LATINO BY RACE
    -- FOR THE POPULATION 18 YEARS AND OVER
    p0040001 integer,   -- Total
    p0040002 integer,   -- Hispanic or Latino
    p0040003 integer,   -- Not Hispanic or Latino:
    p0040004 integer,   -- Population of one race:
    p0040005 integer,   -- White alone
    p0040006 integer,   -- Black or African American alone
    p0040007 integer,   -- American Indian and Alaska Native alone
    p0040008 integer,   -- Asian alone
    p0040009 integer,   -- Native Hawaiian and Other Pacific Islander alone
    p0040010 integer,   -- Some Other Race alone
    p0040011 integer,   -- Two or More Races
    p0040012 integer,   -- Population of two races
    p0040028 integer,   -- Population of three races
    p0040049 integer,   -- Population of four races
    p0040065 integer,   -- Population of five races
    p0040072 integer,   -- Population of six races

    -- This section is referred to as H1. OCCUPANCY STATUS
    h0010001 integer,   -- Total housing units
    h0010002 integer,   -- Occupied
    h0010003 integer    -- Vacant
);

When I use the COPY command it says permission denied, so I have been using the \copy command instead, which worked for me in chapter 3.

\copy us_counties_2010 FROM '/home/xxxx/xxxxx/xxxx/us_counties_2010.csv' WITH (FORMAT CSV, HEADER)

I get the following error:

ERROR:  value too long for type character varying(2)
CONTEXT:  COPY us_counties_2010, line 2, column state_us_abbreviation: "STUSAB"

I have tried changing the character length limit for STUSAB but I still get the same error.

Regex match to newline characters on windows10

Hi Anthony,

Thank you for the wonderful book.

I'm working through Ch. 13 on regular expressions on Windows10 using postgreSQL10 in pgAdmin4

I was having trouble getting the regular expressions to work for example in code listing 13-7.

I believe that the issue is related to the way new line characters are handled on windows.

This also may be related to the following issues (I am using a clean version of the imported csv file from the crime data):

#4 and #10

I was able to solve this issue with this SO answer: https://stackoverflow.com/a/20056634. Apparently windows may match newlines to \r\n

Here is my sql code for the crime time and the output, where crime_type_orig is the original from the book and the other crime_type2 and crime_type3 are based on the above SO answer:

select 
	regexp_match(original_text, '\n(?:\w+ \w+|\w+)\n(.*):') as crime_type_orig,
	-- See https://stackoverflow.com/a/20056634
	regexp_match(original_text, '\r\n(?:\w+ \w+|\w+)\r\n(.*):') as crime_type2,
	-- Based on https://stackoverflow.com/a/20056634
	regexp_match(original_text, '(?:\r\n|\r|\n)(?:\w+ \w+|\w+)(?:\r\n|\r|\n)(.*):') as crime_type3
from crime_reports;

Here is the output from pgAdmin

image


"Chapter 4: COPY us_counties_2010 Permission Denied Question"

Having trouble with data, code, or exercises in Practical SQL? I'm glad to help. Please answer these questions, and I'll typically reply within one to two business days.

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

General

Please answer all of these as they're essential for troubleshooting.

What's your operating system (e.g. Windows 10, macOS Catalina, Ubuntu, etc.)?
WINDOWS 10

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.
Yes and everything was smooth sailing!

Which versions of PostgreSQL and pgAdmin 4 are you using? If you aren't sure, for PostgreSQL run the SQL command SELECT version(); and for pgAdmin 4, navigate to "About pgAdmin 4" under the "Help" menu.
Version 14

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.
Yes :-)

Issue or Question

Chapter, page and code listing number:
Chapter 4, pages 44-45

Describe the issue or your question:
How do I fix the permission denied error for copying over a CSV?
Note: I could not write to a CSV either in Chapter 3.

Please paste the code and error message here. It's OK to alter identifying info such as a folder name
ERROR: could not open file "C:\Users\xxxxxx\Desktop\Practical SQL\us_counties_2010.csv" for reading: Permission denied
HINT: COPY FROM instructs the PostgreSQL server process to read a file. You may want a client-side facility such as psql's \copy.
SQL state: 42501

CHAPTER 5 Try It Yourself no.3 Answer Review

SELECT
  percentile_cont(.5)
  WITHIN GROUP (ORDER BY (
    SELECT
      p0010001
    WHERE state_us_abbreviation = 'CA'
  )) AS "CA State Median",
  percentile_cont(.5)
  WITHIN GROUP (ORDER BY (
    SELECT p0010001
    WHERE state_us_abbreviation = 'NY'
  )) AS "NY State Median"
FROM us_counties_2010;

Is this a valid solution? Do you have a better solution in mind?

regexp_match( ) do not work on PostgreSQL 11 for these cases

SELECT crime_id,
regexp_match(original_text, '(?:Sq.|Plz.|Dr.|Ter.|Rd.)\n(\w+ \w+|\w+)\n')
AS city
FROM crime_reports;

SELECT crime_id,
regexp_match(original_text, 'hrs.\n(\d+ .+(?:Sq.|Plz.|Dr.|Ter.|Rd.))') AS street
FROM crime_reports;

SELECT crime_id,
regexp_match(original_text, '\n(?:\w+ \w+|\w+)\n(.*):') AS crime_type
FROM crime_reports;

Median taxi trip calculation too complicated

For chapter 11, in the NYC median taxi trip calculation, you use a dodge with epoch which is unnecessary and makes the query more complicated to read. Please replace with the following:

SELECT
date_part('hour', tpep_pickup_datetime),
percentile_cont(.5)
WITHIN GROUP (ORDER BY
tpep_dropoff_datetime - tpep_pickup_datetime) AS "median_trip"
FROM nyc_yellow_taxi_trips_2016_06_01
GROUP BY date_part('hour', tpep_pickup_datetime)
ORDER BY date_part('hour', tpep_pickup_datetime);

CHAPTER 8

Hello i have problem with Chapter 8: Table Design That Works for You
i cant create index , I see this message :

ERROR: invalid byte sequence for encoding "UTF8": 0x00
CONTEXT: COPY pls_fy2014_pupld14a, line 1
SQL state: 22021

Can you help me ? thanks a lot

Chapter 4: Performing the census import with COPY.

Having trouble with data, code, or exercises in Practical SQL? I'm glad to help. Please answer these questions, and I'll typically reply within one to two business days.

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

General

Please answer all of these as they're essential for troubleshooting.

What's your operating system (e.g. Windows 10, macOS Catalina, Ubuntu, etc.)?

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.

Which versions of PostgreSQL and pgAdmin 4 are you using? If you aren't sure, for PostgreSQL run the SQL command SELECT version(); and for pgAdmin 4, navigate to "About pgAdmin 4" under the "Help" menu.

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.

Issue or Question

Chapter, page and code listing number:

Describe the issue or your question:

Please paste the code and error message here. It's OK to alter identifying info such as a folder name

Windows File/Folder Permissions for COPY

If you're a Windows 10 user, you may need to set permissions for the folder holding the code and data for the book. If you don't, you may see an error similar to this:

ERROR: could not open file "C:\YourDirectory\us_counties_2010.csv" for reading: Permission denied
HINT: COPY FROM instructs the PostgreSQL server process to read a file. You may want a client-side facility such as psql's \copy.
SQL state: 42501

To correct this, follow the directions in the note on page xxvii of the Introduction (Practical SQL 1st edition), in the section "Using the Book's Code Examples":

After downloading data, Windows users might need to provide permission for the 
database to read files. To do so, right-click the folder containing the code and data, 
select Properties, and click the Security tab. Click Edit, then Add. Type the name 
Everyone into the object names box and click OK. Highlight Everyone in the user 
list, select all boxes under Allow, and then click Apply and OK.

Chapter 15: Python program keeps closing database connection

Having trouble with data, code, or exercises in Practical SQL, 1st Edition? I'm glad to help. Please answer these questions, and I'll typically reply within one to two business days.

Please include the chapter number in your issue title. For example: "Chapter 13: Regular Expression Question"

General

Please answer all of these as they're essential for troubleshooting.

What's your operating system (e.g. Windows 10, macOS Catalina, Ubuntu, etc.)?
Windows 10

Did you install PostgreSQL and pgAdmin according to the steps on page xxviii of the book's Introduction? If not, please describe your installation.
I have.

Which versions of PostgreSQL and pgAdmin 4 are you using? If you aren't sure, for PostgreSQL run the SQL command SELECT version(); and for pgAdmin 4, navigate to "About pgAdmin 4" under the "Help" menu.
PostgreSQL 14.1, pgAdmin 4 version 6.1

Did you download the book's code examples and data from GitHub using the directions on page xxvii of the Introduction? If not, please describe how you loaded the material on your computer.
Yes

Issue or Question

Chapter, page and code listing number:
Chapter 15, Code listing 15-15

Describe the issue or your question:
My issue is primarily with the Python program which had been a bit of a nightmare from the get-go. Currently the issue is that every time I try to execute the function in the listing (or anything ending with LANGUAGE plpython3u since I am using Python3) I get the error below.

My steps to get here: originally Python didn't seem to be installed, so I reran the EDB language pack installer and ran the StackBuilder installer selecting Language Pack version 2. Everything went smoothly from there and I was able to import Python through CREATE EXTENSION plpython3u. However afterwards when attempting to run the function creation above, I get the error below. I have scoured multiple forums and reporting sites, and have done the following.

First, I checked in the directory C:\Program Files\PostgreSQL\14\doc\installation-notes.html to see the language version compatibility. For PostgreSQL 14 it is Python 3.9, and that is both the version installed by the language pack as well as the default on my system (separately installed from the Python website for previous projects). Originally the error was that a python3.dll module could not be found. I solved this by copying the python39.dll from the default language pack directory C:\edb\languagepack\v2\Python-3.9\python39.dll to C:\Windows\System32\ and that seemed to do the trick. However then came the current connection closed error message below (which keeps on reappearing even after you press Continue). I searched a bit online and tried to add the EDB Language Pack directory mentioned above for python to a new System Variable called PYTHONHOME to no avail. I also added the Language Pack directory and %PYTHONHOME% to my PATH variable. Again nothing. At this point I have probably spent too many hours debugging instead of SQL, and would appreciate any help from anyone who has successfully managed to use Python in their SQL system. Thanks again.

Please paste the code and error message here. It's OK to alter identifying info such as a folder name

The application has lost the database connection:
⁃ If the connection was idle it may have been forcibly disconnected.
⁃ The application server or database server may have been restarted.
⁃ The user session may have timed out.
Do you want to continue and establish a new session?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.