This is a Python script that creates three reports from a PostgreSQL database.
Prints a table with the article title in the first column and the number of views of that article in the second column, sorted by views in descending order.
Title | Views |
---|---|
Princess Shellfish Marries Prince Handsome | 1201 |
Baltimore Ravens Defeat Rhode Island Shoggoths | 915 |
Political Scandal Ends In Political Scandal | 600 |
Article author names and the total views for all articles written by each author, sorted by views in descending order and limited to 5 results for performance.
Name | Views |
---|---|
Ursula La Multa | 2304 |
Rudolf von Treppenwitz | 1985 |
Markoff Chaney | 1723 |
Anonymous Contributor | 1023 |
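The per-author totals can be illustrated with a small pure-Python sketch. It is not the code used by the script, which does this work in SQL. The function name `top_authors` and the sample rows are hypothetical; the tuple layout simply mirrors the title, author and num_views columns of popular_articles_view.

```python
from collections import Counter


def top_authors(article_rows, limit=5):
    """Sum views per author and return the `limit` most viewed authors.

    article_rows is an iterable of (title, author, num_views) tuples,
    mirroring the columns of popular_articles_view.
    """
    totals = Counter()
    for _title, author, num_views in article_rows:
        totals[author] += num_views
    # most_common sorts by total views in descending order.
    return totals.most_common(limit)


# Hypothetical rows; titles, authors and counts are made up for illustration.
rows = [
    ('Article A', 'Author X', 700),
    ('Article B', 'Author Y', 900),
    ('Article C', 'Author X', 400),
]
ranking = top_authors(rows)
```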
Prints each date (format: YYYY-MM-DD) and the percentage of total requests that returned 404 NOT FOUND on that day, for days where this percentage was greater than 1%.
Date | Error % |
---|---|
2016-07-29 | 2.5% |
Two views were created in order to reduce code repetition: popular_articles_view and percent_error_view. Both can be found in the file create_views.sql, but for convenience they are reproduced below:
CREATE VIEW popular_articles_view AS
SELECT title, author, count(*) AS num_views
FROM articles, log
WHERE log.path LIKE concat('/article/',articles.slug)
GROUP BY title, author;
CREATE VIEW percent_error_view AS
SELECT time::date as log_time, ROUND(
(100*SUM(CASE log.status WHEN '404 NOT FOUND' THEN 1 ELSE 0 END))::numeric/
count(log.status), 2) as error_percent
FROM log
GROUP BY log_time;
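To make the logic of percent_error_view concrete, here is a minimal pure-Python sketch of the same calculation over in-memory rows. The function name `percent_errors_by_day` and the sample data are hypothetical and are not part of the script; in the project itself the view does this work inside the database.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def percent_errors_by_day(rows):
    """Return {day: percent of requests with status '404 NOT FOUND'}.

    rows is an iterable of (day, status) pairs, standing in for the
    time::date and status columns of the log table.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for day, status in rows:
        totals[day] += 1
        if status == '404 NOT FOUND':
            errors[day] += 1
    result = {}
    for day, total in totals.items():
        percent = Decimal(100 * errors[day]) / Decimal(total)
        # Round to two decimal places, like ROUND(..., 2) in the view.
        result[day] = percent.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
    return result


# Hypothetical sample data: 1 error out of 40 requests on the first day,
# no errors on the second day.
sample = [(date(2016, 7, 29), '200 OK')] * 39
sample += [(date(2016, 7, 29), '404 NOT FOUND')]
sample += [(date(2016, 7, 30), '200 OK')] * 10

percents = percent_errors_by_day(sample)
# Keep only days above the 1% threshold used by the third report.
bad_days = {day: p for day, p in percents.items() if p > 1}
```

The sketch uses `Decimal` rather than floats so that the rounding step behaves like rounding on a numeric value instead of picking up binary floating point error.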
You need PostgreSQL running on your machine or in a virtual machine in order to create the database and interact with it using Python. A database must exist and its name must be news.
Python is the language used by the log data analysis script.
You can download the necessary file Here
Once PostgreSQL has been installed on your machine or the virtual machine you are using, and the newsdata.sql file has been downloaded and unzipped, you can load the data into the database named news by typing the following in your command line:
psql -d news -f newsdata.sql
For ease of use and to save time, we have included the SQL code needed to create the views in a file named create_views.sql. This code checks for existing views with the same names as the ones that will be created and drops them if they exist, in order to avoid error messages while creating the views. The user can run the code by typing the following command from the command line:
psql -d news -f create_views.sql
In order to execute the report the user needs to type:
python newsdata_report.py
This will print out the report to the command line. If the user wants the output to be in a text file, the user can type the following command:
python newsdata_report.py > {my_output_file}.txt
The output for the report can be found in report_result.txt
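For readers curious how a report like this could be produced, the following is a rough sketch and not the contents of newsdata_report.py. It assumes the psycopg2 driver, which this README does not name, and the helper names `format_table` and `fetch_rows` as well as the query text are hypothetical. The driver is imported inside `fetch_rows` so that the formatting part can be tried without a database.

```python
def format_table(headers, rows):
    """Format rows as a pipe-separated table like the samples in this README."""
    lines = [' | '.join(headers) + ' |', '---|' * len(headers)]
    for row in rows:
        lines.append(' | '.join(str(value) for value in row) + ' |')
    return '\n'.join(lines)


def fetch_rows(query, dbname='news'):
    """Run a query against the database and return all result rows."""
    import psycopg2  # assumption: the README does not say which driver is used
    conn = psycopg2.connect(dbname=dbname)
    try:
        cur = conn.cursor()
        cur.execute(query)
        return cur.fetchall()
    finally:
        conn.close()


# Assumed query against popular_articles_view; the real script may differ.
TOP_ARTICLES = """
    SELECT title, num_views
    FROM popular_articles_view
    ORDER BY num_views DESC;
"""

# Formatting demonstrated on one row taken from the sample output above,
# so that no database connection is needed to try it.
sample_rows = [('Political Scandal Ends In Political Scandal', 600)]
table = format_table(['Title', 'Views'], sample_rows)
print(table)
```

With a populated news database, `format_table(['Title', 'Views'], fetch_rows(TOP_ARTICLES))` would build the first report in the same layout.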