pymap / ssyt
Rental price viewer (Visor de precios de alquiler)
License: GNU General Public License v3.0
Parametrize the "base_period" + the IPC category ("rubro ipc")
See the available periods and adjust the data to the selected one
We are reading wages from here https://github.com/PyMap/ssyt/blob/main/ssyt/ssyt_container.py#L66.
A more dynamic way to get that data would be to explore the CKAN API (see an example of how to consume it below):
https://colab.research.google.com/drive/1ssTNVKWTLHOxSd8VUSSrnGPx3lYrZ-QL?usp=sharing
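As a sketch of that approach: datos.gob.ar exposes the standard CKAN Action API, so a dataset's downloadable resources can be listed from a `package_show` response and fetched on demand. The dataset id and helper names below are placeholders, not confirmed identifiers from the project.

```python
# Hypothetical sketch of consuming the CKAN Action API (datos.gob.ar).
# `package_show` returns {"success": ..., "result": {"resources": [...]}}.
import json
from urllib.request import urlopen

CKAN_BASE = "https://datos.gob.ar/api/3/action"  # standard CKAN endpoint

def package_resources(package: dict) -> list:
    """Return (name, url) pairs for every resource in a package_show payload."""
    if not package.get("success"):
        return []
    return [(r.get("name"), r.get("url"))
            for r in package["result"].get("resources", [])]

def fetch_package(dataset_id: str) -> dict:
    """Fetch package metadata over HTTP (network call, shown for completeness)."""
    with urlopen(f"{CKAN_BASE}/package_show?id={dataset_id}") as resp:
        return json.load(resp)
```

The app could then read the CSV resource URL directly instead of hardcoding it in `ssyt_container.py`.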
TODO:
Explore that data source and check that it does not degrade Streamlit performance
TODO:
Try an animation frame by combining scatter points with offered rents and plotting them over a heatmap to show concentrations
density scatter mapbox:
https://plotly.com/python-api-reference/generated/plotly.express.density_mapbox
Check this post:
https://stackoverflow.com/questions/69396009/add-us-county-boundaries-to-a-plotly-density-mapbox
... to: 1. review the overlays in the current map (prices over time) + 2. add a scatter mapbox & density combination
To get point geometries for the density layer it will be necessary to download the data from the cloud query console
TODO:
Adjust map prices for inflation using this method https://github.com/PyMap/ssyt/blob/main/ssyt/visprev.py#L220-L267 and render constant values instead here https://github.com/PyMap/ssyt/blob/main/ssyt/ssyt_container.py#L135. monto is the column to be updated
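A minimal sketch of that adjustment, assuming a monthly IPC series indexed by period label and a DataFrame with a period column next to monto (the column named in the TODO); the index values below are illustrative, not real IPC data:

```python
# Deflate nominal `monto` values to constant prices of `base_period`
# using a price index (e.g. the IPC). Sketch only; column/series names
# other than `monto` are assumptions.
import pandas as pd

def to_constant_prices(df: pd.DataFrame, ipc: pd.Series,
                       base_period: str) -> pd.DataFrame:
    """Return a copy of df with `monto` expressed in `base_period` prices."""
    out = df.copy()
    # per-row deflator: index at base period / index at the row's period
    coef = ipc.loc[base_period] / out["period"].map(ipc)
    out["monto"] = out["monto"] * coef
    return out
```

With this in place, rendering constant values in `ssyt_container.py` is just a matter of calling the deflator before plotting.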
Currently, period selection for the IPC-adjusted series only allows selecting months within the same year:
TODO:
The year / first_month / last_month parameters are selected from st.selectbox; change this so that a different month/year combination can be set as the start/end points of the filter, for example from 08-2018 to 03-2020.
The rent price series from Properati data starts in 2015. Both the IPC and ICL base periods are later (the IPC starts on 01-2017 and the ICL on 07-2020).
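A possible sketch for the cross-year filter, assuming periods are labelled "MM-YYYY" as in the example above; in the app the two endpoints would come from st.selectbox, here they are plain arguments:

```python
# Enumerate every "MM-YYYY" period between two endpoints, so the filter
# can span different years (e.g. 08-2018 to 03-2020). Pure function,
# independent of Streamlit.
def period_range(start: str, end: str) -> list:
    """All "MM-YYYY" labels between start and end, inclusive."""
    sm, sy = (int(x) for x in start.split("-"))
    em, ey = (int(x) for x in end.split("-"))
    periods, m, y = [], sm, sy
    while (y, m) <= (ey, em):          # compare (year, month) tuples
        periods.append(f"{m:02d}-{y}")
        m += 1
        if m > 12:
            m, y = 1, y + 1
    return periods
```

The two st.selectbox widgets would then just pick elements of this list, and the series is filtered to the labels in between.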
IPC: read here https://github.com/PyMap/ssyt/blob/main/ssyt/visprev.py#L194-L218 and downloaded from https://www.indec.gob.ar/indec/web/Nivel4-Tema-3-5-31
To go back before 2017, it would be necessary to explore this source https://www.indec.gob.ar/indec/web/Institucional-Indec-InformacionDeArchivo and check how to fill the missing months of 2015 (Nov/Dec) and the whole of 2016. Another important item is checking how to link both periods (what coefficient should be used?) - 2015 backward + 2017 onward
ICL: read here https://github.com/PyMap/ssyt/blob/main/ssyt/visprev.py#L177-L187, with base period 30.6.20 = 1. Downloaded from http://bcra.gob.ar/PublicacionesEstadisticas/Principales_variables_datos.asp; this is a day-by-day index.
To go backwards, we would also need to define whether a coefficient can be used to link the onward and backward periods. Additionally, we could pick only one day per month (values seem to be pretty similar from day to day) and use it the same way we use the IPC.
The setup.py install_requires installs:
```python
install_requires=[
    #'google-cloud-bigquery >= 1.9.0',
    #'google-cloud-bigquery-storage >= 2.9.1',
    #'google-cloud-bigquery[bqstorage,pandas]',
    'numpy >= 1.21.3',
    'pandas >= 1.3.4',
    'streamlit >= 1.1.0',
    'matplotlib >= 3.4.3',
]
```
but google-cloud-bigquery has a dependency conflict with pyarrow:
```
error: pyarrow 7.0.0 is installed but pyarrow<7.0dev,>=3.0.0 is required by {'google-cloud-bigquery', 'db-dtypes'}
```
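If BigQuery is ever reinstated, one possible fix (an assumption, not tested against these packages) is to pin pyarrow to the range the error message reports as acceptable:

```python
# Hypothetical install_requires that satisfies the reported constraint
# (pyarrow<7.0dev,>=3.0.0 required by google-cloud-bigquery / db-dtypes).
install_requires = [
    'google-cloud-bigquery[bqstorage,pandas] >= 2.9.1',
    'pyarrow >= 3.0.0, < 7.0',  # range taken from the error message above
    'numpy >= 1.21.3',
    'pandas >= 1.3.4',
    'streamlit >= 1.1.0',
    'matplotlib >= 3.4.3',
]
```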
This was initially used to consume data from the Properati marketplace, in the vivienda.py module.
Since it is not being used now, I left it commented out, because BigQuery may not be used anymore.
Currently, the following method https://github.com/PyMap/ssyt/blob/main/ssyt/visprev.py#L14-L73 gets series of nominal prices from the Properati database.
I decided to set it aside because of the query's poor time performance. Going through Streamlit is considerably more time-consuming than the BigQuery workspace.
TODO:
Explore a way to improve time performance when connecting Streamlit with BigQuery to query the database. The idea of the method above is to avoid parsing a CSV with the time series.
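The core idea can be sketched without Streamlit: cache the query result so app reruns reuse it instead of hitting BigQuery again. In the real app this role would be played by Streamlit's caching decorator; here a plain dict keeps the example self-contained, and `run_query` is a placeholder for the actual BigQuery call:

```python
# Minimal query-result cache: the expensive call runs once per distinct
# SQL string; later calls return the stored result. In Streamlit the
# equivalent is decorating the query function with its caching decorator.
_cache = {}

def cached_query(sql: str, run_query):
    """Return a cached result for `sql`, calling `run_query` only on a miss."""
    if sql not in _cache:
        _cache[sql] = run_query(sql)
    return _cache[sql]
```

This does not speed up the first query, but it removes the repeated cost on every widget interaction, which is where the Streamlit/BigQuery combination hurts most.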
Wages are read from this source https://datos.gob.ar/dataset/sspm-indice-salarios-base-octubre-2016/archivo/sspm_149.1 and loaded here https://github.com/PyMap/ssyt/blob/main/ssyt/ssyt_container.py#L67.
They are then formatted here https://github.com/PyMap/ssyt/blob/main/ssyt/visprev.py#L150-L175 when prices are adjusted using the IPC.
Currently we only use the general index, but we could parametrize it and select from the others:
```python
Index(['indice_tiempo', 'indice_salarios', 'indice_salarios_registrado',
       'indice_salarios_registrado_sector_privado',
       'indice_salarios_registrado_sector_publico',
       'indice_salarios_no_registrado_sector_privado'],
      dtype='object')
```
TODO 1: add a new selector to choose from the options above here https://github.com/PyMap/ssyt/blob/main/ssyt/ssyt_container.py#L67
The current wages version we use goes up to 2021-07, but the IPC ends on 2021-10.
TODO 2: set a cut-off point to use prices based on salary availability, or complete the salary series based on its historical behavior
In this notebook https://colab.research.google.com/drive/1-NSwwSANfATUxjYpyGDogZw8xZHjyv1T?usp=sharing, I take the mean of the last available year and progressively extend it up to the IPC limit.
It would be nice to discuss this further and write an automatic way to do it (outside a notebook) without losing Streamlit performance while loading the data.
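A simple variant of the notebook's approach, sketched under the assumption that wages and IPC are monthly Series indexed by "YYYY-MM" labels, with the wage index ending before the IPC's: missing wage months are filled by pushing the last observed wage forward with the IPC's month-over-month growth.

```python
# Extend a wage series up to the last IPC period using IPC growth rates.
# Sketch only; the notebook uses the mean of the last available year as
# its anchor, here the last observed month is used instead.
import pandas as pd

def extend_wages(wages: pd.Series, ipc: pd.Series) -> pd.Series:
    """Fill wage months missing at the tail with IPC-implied growth."""
    out = wages.reindex(ipc.index)          # align to the longer IPC index
    last = wages.index[-1]                  # last month with observed wages
    for period in out.index[out.index.get_loc(last) + 1:]:
        prev = out.index[out.index.get_loc(period) - 1]
        out.loc[period] = out.loc[prev] * ipc[period] / ipc[prev]
    return out
```

Moving this out of the notebook and into visprev.py, behind the same caching used for the other series, should keep the Streamlit load time unaffected.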
TODO:
For the maps section here https://github.com/PyMap/ssyt/blob/main/ssyt/charts.py#L130-L179
... we use scattermapbox to draw polygon limits. Another option for rendering this map:
for the polygons, we can try to optimize rendering by using the TopoJSON format instead of GeoJSON. To go from one format to the other:
```shell
# GeoJSON to TopoJSON (geo2topo ships in topojson-server; toposimplify
# and topoquantize live in topojson-simplify and topojson-client)
sudo npm install -g topojson-server topojson-simplify topojson-client
geo2topo das_4326.geojson > das_4326.topojson
toposimplify -p 1 -f < das_4326.topojson > das_4326_simple.topojson
topoquantize 1e5 < das_4326_simple.topojson > das_4326_quantized.topojson
```
reference: https://medium.com/@mbostock/command-line-cartography-part-3-1158e4c55a1e
```python
import json
import folium

# load the TopoJSON produced with the commands above
def get_toronto_das_topodata():
    with open('data/das_toronto.json') as f:
        das_topo = json.load(f)
    return das_topo

topodas = get_toronto_das_topodata()

# `coords` and `das_toronto` (the indicators DataFrame) are defined elsewhere
folium_map = folium.Map(location=coords,
                        zoom_start=8,
                        tiles="CartoDB dark_matter")

folium.Choropleth(geo_data=topodas,  # was `topo`, which was undefined
                  topojson='objects.das_toronto_data_4326',
                  key_on='feature.properties.DAUID',
                  data=das_toronto[['mean_hh_income', 'average_dwelling_value', 'DAUID']],
                  columns=['DAUID', 'mean_hh_income'],  # key column first, then the value to color by
                  fill_color='GnBu',
                  fill_opacity=0.7,
                  line_opacity=0.5).add_to(folium_map)

folium.TopoJson(topodas,
                object_path='objects.das_toronto').add_to(folium_map)

folium_map.add_child(folium.GeoJson(data=open("data/topo_das_toronto_data_4326.json")))
```
GeoJSON-to-TopoJSON references:
https://medium.com/tech-carnot/topojson-based-choropleth-visualization-using-folium-471113fa5964#dd2f
https://github.com/carnot-technologies/MapVisualizations