news_report's Introduction

Log Analysis tool

This is a Python script that creates three reports from a PostgreSQL database.

Reports created

Three most popular articles

Prints a table with one column for the article title and a second column for the number of views of that article, sorted by views in descending order.

Example Output for Three Most Popular articles: NOT THE ACTUAL REPORT

Title Views
Princess Shellfish Marries Prince Handsome 1201
Baltimore Ravens Defeat Rhode Island Shoggoths 915
Political Scandal Ends In Political Scandal 600
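
A table like this can be rendered from (title, views) rows with plain string formatting. The following is a minimal sketch, not the script's actual code; the function name format_report is a hypothetical choice, and the sample rows are copied from the example output above rather than from real data.

```python
def format_report(headers, rows):
    """Return a left-aligned, two-column text table as a single string."""
    # Width of the first column: the longest label, header included.
    width = max(len(str(r[0])) for r in [headers] + list(rows))
    lines = ["{:<{w}}  {}".format(headers[0], headers[1], w=width)]
    for label, value in rows:
        lines.append("{:<{w}}  {}".format(label, value, w=width))
    return "\n".join(lines)


# Sample rows copied from the example output; not real report data.
rows = [
    ("Princess Shellfish Marries Prince Handsome", 1201),
    ("Baltimore Ravens Defeat Rhode Island Shoggoths", 915),
    ("Political Scandal Ends In Political Scandal", 600),
]
print(format_report(("Title", "Views"), rows))
```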

Most popular article authors of all time

Prints each author's name and the total number of views across all articles written by that author, sorted by views in descending order and limited to 5 results for performance.

Example Output for Most Popular Article Authors: NOT THE ACTUAL REPORT

Name Views
Ursula La Multa 2304
Rudolf von Treppenwitz 1985
Markoff Chaney 1723
Anonymous Contributor 1023
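
The aggregation behind this report (sum the views per author, sort descending, keep the top 5) can be sketched in plain Python. This illustrates the logic only and is not taken from the script; the function name top_authors and the sample rows are hypothetical, and the rows are assumed to have the (title, author, num_views) shape of popular_articles_view. If the author column holds an ID rather than a name, a lookup against a names table would also be needed, which is not shown.

```python
from collections import defaultdict


def top_authors(article_rows, limit=5):
    """Sum views per author and return the `limit` most viewed authors."""
    totals = defaultdict(int)
    for _title, author, num_views in article_rows:
        totals[author] += num_views
    # Sort by total views, highest first, then keep the first `limit`.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Hypothetical (title, author, views) rows, for illustration only.
sample = [
    ("Article A", "Ursula La Multa", 1201),
    ("Article B", "Ursula La Multa", 1103),
    ("Article C", "Rudolf von Treppenwitz", 1985),
]
print(top_authors(sample))
```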

Days on which more than 1% of requests returned an error status

Prints the date (format: YYYY-MM-DD) and the percentage of that day's requests that returned 404 NOT FOUND, for each day on which this percentage was greater than 1%.

Example output: NOT THE ACTUAL REPORT

Date Error %
2016-07-29 2.5%

Views used when creating report

Two views were created in order to reduce code repetition. The first is popular_articles_view and the second is percent_error_view. Both can be found in the file create_views.sql, but for convenience they are described below:

View: Popular articles

CREATE VIEW popular_articles_view AS 
    SELECT title, author, count(*) AS num_views 
    FROM articles, log 
    WHERE log.path LIKE concat('/article/',articles.slug)
    GROUP BY title, author; 
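
To make the join condition concrete: the view counts one view for every log row whose path equals '/article/' followed by an article's slug (a LIKE pattern without wildcards behaves as an exact match, assuming slugs contain no % or _ characters). The Python sketch below mirrors that counting on in-memory data; the function name count_article_views and the sample values are hypothetical.

```python
from collections import Counter


def count_article_views(articles, log_paths):
    """Count log entries per article, mirroring the view's join condition.

    `articles` is an iterable of (title, author, slug) tuples and
    `log_paths` an iterable of request paths. Both are illustrative
    stand-ins for the `articles` and `log` tables.
    """
    # Map '/article/<slug>' to the (title, author) pair it belongs to.
    by_path = {"/article/" + slug: (title, author)
               for title, author, slug in articles}
    views = Counter()
    for path in log_paths:
        if path in by_path:  # exact match, like LIKE without wildcards
            views[by_path[path]] += 1
    return views


# Hypothetical data, for illustration only.
articles = [("Some Title", 1, "some-title")]
paths = ["/article/some-title", "/", "/article/some-title", "/article/other"]
print(count_article_views(articles, paths))
```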

View: Percent error

CREATE VIEW percent_error_view AS
    SELECT time::date as log_time, ROUND(
        (100*SUM(CASE log.status WHEN '404 NOT FOUND' THEN 1 ELSE 0 END))::numeric/
        count(log.status), 2) as error_percent
    FROM log
    GROUP BY log_time;
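
The arithmetic in this view (100 times the number of 404 responses, divided by all requests for the day, rounded to two decimals) can be checked with a small Python sketch. The function name percent_errors, the (date, status) row shape, and the sample data are assumptions made for illustration; Python's float rounding can also differ from PostgreSQL's numeric rounding in edge cases.

```python
from collections import defaultdict


def percent_errors(log_rows):
    """Return {date: percent of requests with status '404 NOT FOUND'}.

    `log_rows` is an iterable of (date_string, status) pairs; the
    shape is an assumption made for this illustration.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for day, status in log_rows:
        totals[day] += 1
        if status == "404 NOT FOUND":
            errors[day] += 1
    # Same arithmetic as the view: 100 * errors / total, rounded to 2 places.
    return {day: round(100 * errors[day] / totals[day], 2) for day in totals}


# Hypothetical data: 1 error out of 40 requests on one day, none on another.
rows = [("2016-07-29", "200 OK")] * 39 + [("2016-07-29", "404 NOT FOUND")]
rows += [("2016-07-30", "200 OK")] * 10
result = percent_errors(rows)
# Keep only the days above the 1% threshold used by the report.
print({day: pct for day, pct in result.items() if pct > 1})
```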

Requirements

PostgreSQL

You need PostgreSQL running in a virtual machine or on your own machine in order to create the database and interact with it using Python. A database must exist, and its name must be news.

Python

Python is the language used by the log data analysis script.

Data for the database

You can download the necessary file Here

Usage

Once you have PostgreSQL installed and have downloaded and unzipped the newsdata.sql file, run the following commands:

Populate the PostgreSQL database using the newsdata.sql file

With the news database in place, load the tables and data from newsdata.sql into it by typing the following on your command line:

psql -d news -f newsdata.sql

Create views using the create_views.sql from this repo

For ease of use and to save time, the SQL code needed to create the views is included in a file named create_views.sql. This code checks for existing views with the same names as the ones to be created and drops them if they exist, which avoids error messages while creating the views. The user can run the code by typing the following command on the command line:

psql -d news -f create_views.sql
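
The drop-then-create pattern described above can be illustrated by generating the statements in Python. This is a sketch under assumptions: the actual contents of create_views.sql are not reproduced here, DROP VIEW IF EXISTS is one standard PostgreSQL way to get this behavior, and the helper name drop_and_create, the view name example_view, and the placeholder query are all hypothetical.

```python
def drop_and_create(view_name, select_sql):
    """Build SQL that drops a view if present, then recreates it."""
    return (
        "DROP VIEW IF EXISTS {name};\n"
        "CREATE VIEW {name} AS\n{body};"
    ).format(name=view_name, body=select_sql.strip().rstrip(";"))


# Placeholder view name and query, for illustration only.
print(drop_and_create("example_view", "SELECT title, author FROM articles;"))
```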

Executing the log analysis report

In order to execute the report the user needs to type:

python newsdata_report.py

This will print out the report to the command line. If the user wants the output to be in a text file, the user can type the following command:

python newsdata_report.py > {my_output_file}.txt

The output for the report can be found in report_result.txt
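
For orientation, a script of this kind typically runs one query per report through a database connection. The sketch below is an assumption about the general shape, not the contents of newsdata_report.py. The function name top_articles is hypothetical, and conn stands for any Python DB-API 2.0 connection; for PostgreSQL this is commonly obtained from the psycopg2 package, which this README does not name explicitly.

```python
def top_articles(conn, limit=3):
    """Return the `limit` most viewed (title, num_views) rows."""
    # int() guards the only interpolated value; the rest is a fixed string.
    query = (
        "SELECT title, num_views FROM popular_articles_view "
        "ORDER BY num_views DESC LIMIT {}".format(int(limit))
    )
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        # Release the cursor even if the query fails.
        cursor.close()
```

Assuming psycopg2 is installed and the views exist, calling top_articles(psycopg2.connect(dbname="news")) would return the rows for the first report.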

License

Contributors

marodrig