ideal-memory

Database Laboratory project files. The project models data on higher education.

Download

The complete database from Inep can be downloaded here (258 MB zip).

Structure

The files are structured as follows:

  • ANEXOS/ contains the files that describe the data;
  • DADOS/ contains the database files as zipped CSVs (zip->csv);
  • FILTROS/ contains one file with tips on how to model the data in SQL; and
  • LEIA-ME/ contains a file that explains how to open the database files in R, SPSS, and SAS.

An important file is ANEXOS/ANEXO I - Dicionário de Dados e Tabelas Auxiliares/Dicionário_de_Dados.xlsx, which describes every field in the database.

Modeling

This model shows how all the data is structured.

Diagram

Execute

The next steps describe how to set up the whole environment.

Prepare

Execute the following commands:

git clone https://github.com/danielventurini/ideal-memory
cd ideal-memory/
curl download.inep.gov.br/microdados/microdados_educacao_superior_2017.zip --output microdados.zip
unzip microdados.zip
mv Microdados_Educacao_Superior_2017 microdados  # rename path
cd microdados/DADOS/
find . -name "*.zip" -exec unzip {} \;	# extract all files
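
After extraction, a quick sanity check can confirm that the CSV files are in place. The sketch below defines a hypothetical helper, list_csv (not part of the repository), that prints the line count and path of every CSV file under a directory. It assumes a find that supports -iname, as GNU and BSD find do, so that both .csv and .CSV names match.

```sh
# Hypothetical helper: print "<line count><TAB><path>" for each CSV file
# found under the given directory (default: current directory).
list_csv() {
    dir=${1:-.}
    # -iname matches .csv and .CSV alike.
    find "$dir" -type f -iname '*.csv' | sort | while IFS= read -r f; do
        printf '%s\t%s\n' "$(wc -l < "$f" | tr -d ' ')" "$f"
    done
}
```

Running list_csv . from microdados/DADOS/ would list the extracted files; an empty result suggests the unzip step did not run where expected.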

Reproduce

Using PostgreSQL, execute the query files in the following order:

  1. create_types.sql
  2. insert_types.sql
  3. create_temps.sql

Before executing create_temps.sql, open it and, at lines 406-411, change /[absolute/path]/ to your absolute path.

  4. create_tables.sql
  5. insert_tables.sql
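
The steps above can be scripted. The following is a minimal sketch, not something shipped with the repository: fill_data_path and run_sql_files are hypothetical helpers, the database name is whatever you pass in, and psql is assumed to be on the PATH with connection settings already configured. It also assumes the placeholder /[absolute/path]/ appears literally in create_temps.sql and that your path contains no | or & characters, which would confuse sed.

```sh
# Hypothetical helper: print a copy of an SQL file with the literal
# placeholder /[absolute/path]/ replaced by the given directory.
fill_data_path() {
    src=$1
    data_dir=${2%/}    # drop one trailing slash, if any
    sed "s|/\[absolute/path]/|${data_dir}/|g" "$src"
}

# Hypothetical helper: run SQL files in the given order against one
# database and stop at the first file that fails.
run_sql_files() {
    db=$1
    shift
    for f in "$@"; do
        echo "running $f"
        psql -v ON_ERROR_STOP=1 -d "$db" -f "$f" || return 1
    done
}

# Example (not run here; "mydb" and the path are placeholders):
#   fill_data_path create_temps.sql /your/absolute/path > create_temps.local.sql
#   run_sql_files mydb create_types.sql insert_types.sql \
#       create_temps.local.sql create_tables.sql insert_tables.sql
```

Writing the substituted file under a new name keeps the original create_temps.sql untouched. ON_ERROR_STOP makes psql abort a file on its first error, so a failure in an early step does not cascade into the later ones.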

Tip: before executing the files, you can reset the schema with the following commands:

drop schema if exists public cascade;
create schema public;

Observation

  1. The DM_CURSO.csv file contains a single invalid record. At line 6302, column T, the field TP_ATRIBUTO_INGRESSO holds a value that does not exist in the TP_ATRIBUTO_INGRESSO table: the table only has the values 0, 1, and 2, but the value at line 6302 is 3. For this reason, the last line of create_temps.sql deletes this record from curso_temp.

  2. The file insert_tables.sql has an UPDATE at line 995 that takes many hours to run.
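
The invalid record from the first observation can be located before loading the data. The sketch below uses a hypothetical helper, find_bad_ingresso, that looks up the TP_ATRIBUTO_INGRESSO column by its header name and reports every row whose value is not 0, 1, or 2. The | field separator is an assumption about the CSV layout; pass a different separator as the second argument if the file uses one. Empty values are skipped rather than reported.

```sh
# Hypothetical helper: print "<line number>: <value>" for each row whose
# TP_ATRIBUTO_INGRESSO value is not 0, 1 or 2. Line 1 is the header.
find_bad_ingresso() {
    csv=$1
    sep=${2:-|}    # assumed separator; override if the file differs
    awk -F"$sep" '
        { sub(/\r$/, "") }    # tolerate Windows line endings
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                h = $i
                gsub(/"/, "", h)    # tolerate quoted header names
                if (h == "TP_ATRIBUTO_INGRESSO") col = i
            }
            if (!col) {
                print "TP_ATRIBUTO_INGRESSO column not found" > "/dev/stderr"
                exit 2
            }
            next
        }
        { v = $col; gsub(/"/, "", v) }
        v != "" && v !~ /^[012]$/ { print NR ": " v }
    ' "$csv"
}
```

Calling find_bad_ingresso DM_CURSO.csv from the directory that holds the extracted file prints the offending rows, which makes it easy to confirm that only one record needs to be deleted.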
