The aim of this project is to create a reporting tool that answers three questions from the news database:
- What are the most popular three articles of all time?
- What are the most popular article authors of all time?
- On which days did more than 1% of requests lead to errors?
The reporting tool is a Python 3 program that uses the `psycopg2` database adapter to connect to the database.
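As a rough illustration of how a Python 3 program can connect to the `news` database through `psycopg2`, here is a minimal sketch. The helper names `fetch_all` and `format_rows` are hypothetical and are not taken from `logs_analysis.py`. The sample query in the note after the block assumes only the `log` table and its `status` column, which the views for question 3 also rely on.

```python
def fetch_all(query, dbname="news"):
    """Run one read-only query and return all rows as a list of tuples."""
    # Imported lazily so the formatting helper below can be used
    # even on a machine where psycopg2 is not installed.
    import psycopg2

    conn = psycopg2.connect(dbname=dbname)
    try:
        with conn.cursor() as cur:
            cur.execute(query)
            return cur.fetchall()
    finally:
        # Always release the connection, even if the query fails.
        conn.close()


def format_rows(title, rows, unit):
    """Turn (label, value) rows into a small plain-text report."""
    lines = [title]
    for label, value in rows:
        lines.append("  {} - {} {}".format(label, value, unit))
    return "\n".join(lines)
```

Inside the virtual machine, with the database loaded, calling `print(format_rows("Requests by status", fetch_all("SELECT status, count(*) FROM log GROUP BY status;"), "requests"))` should print one line per status value.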
To run this program, the following should be downloaded on your machine:
- [VirtualBox](https://www.virtualbox.org/wiki/Download_Old_Builds_5_1). To bring the virtual machine online use `vagrant up`, and to log in use `vagrant ssh`.
- The data provided by Udacity. Unzip the file to extract `newsdata.sql`, and place this file inside the Vagrant folder.
- Bring the virtual machine online using `vagrant up`.
- Log in using `vagrant ssh`.
- Load the database using `psql -d news -f newsdata.sql`.
- Connect to the database using `psql -d news`.
- Create the views for question 3 using the two `CREATE VIEW` statements below.
- Exit psql by typing `\q`.
- Execute the Python program using `python3 logs_analysis.py`.
-- Successful requests per day
CREATE VIEW right_entries AS
SELECT to_char(time, 'DD-MON-YYYY') AS rightdate, count(*) AS rightentries
FROM log
WHERE status = '200 OK'
GROUP BY rightdate;

-- Failed requests per day
CREATE VIEW errors_entries AS
SELECT to_char(time, 'DD-MON-YYYY') AS errordate, count(*) AS errorentries
FROM log
WHERE status = '404 NOT FOUND'
GROUP BY errordate;
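One way the two views might be combined to answer question 3 is sketched below. This is not necessarily how `logs_analysis.py` does it: the names `ERROR_RATE_QUERY` and `days_over_threshold` are hypothetical, and the calculation assumes that `'200 OK'` and `'404 NOT FOUND'` are the only status values in `log`, so that the two counts add up to the day's total requests.

```python
# Joins the two views on the formatted date. Days with no errors are
# dropped by the inner join, which is fine: they cannot exceed 1%.
ERROR_RATE_QUERY = """
    SELECT r.rightdate, r.rightentries, e.errorentries
    FROM right_entries AS r
    JOIN errors_entries AS e ON r.rightdate = e.errordate;
"""


def days_over_threshold(rows, threshold_percent=1.0):
    """Return (day, error_percent) for days above the threshold.

    rows: iterable of (day, ok_count, error_count) tuples, e.g. the
    result of running ERROR_RATE_QUERY against the news database.
    """
    result = []
    for day, ok_count, error_count in rows:
        # Assumption: total requests = successful + failed requests.
        total = ok_count + error_count
        if total == 0:
            continue
        percent = 100.0 * error_count / total
        if percent > threshold_percent:
            result.append((day, round(percent, 2)))
    return result


# Made-up counts for illustration only, not real results.
sample = [("01-JAN-2000", 9800, 200), ("02-JAN-2000", 9950, 50)]
print(days_over_threshold(sample))
```

With the made-up sample, only the first day is reported, because 200 errors out of 10,000 requests is 2%, while 50 out of 10,000 is 0.5%.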