pymap / ssyt
Rental price viewer (Visor de precios de alquiler)
License: GNU General Public License v3.0
Parametrize the "base_period" + the IPC category ("rubro ipc")
See the available periods and adjust the data to the selected one
We are reading wages from here https://github.com/PyMap/ssyt/blob/main/ssyt/ssyt_container.py#L66.
A more dynamic way to get that data would be to explore the CKAN API (see an example of how to consume it below):
https://colab.research.google.com/drive/1ssTNVKWTLHOxSd8VUSSrnGPx3lYrZ-QL?usp=sharing
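As a sketch of that approach: datos.gob.ar exposes the standard CKAN Action API, so a dataset's downloadable resources can be listed from a `package_show` response and fetched on demand. The dataset id and helper names below are placeholders, not confirmed identifiers from the project.

```python
# Hypothetical sketch of consuming the CKAN Action API (datos.gob.ar).
# `package_show` returns {"success": ..., "result": {"resources": [...]}}.
import json
from urllib.request import urlopen

CKAN_BASE = "https://datos.gob.ar/api/3/action"  # standard CKAN endpoint

def package_resources(package: dict) -> list:
    """Return (name, url) pairs for every resource in a package_show payload."""
    if not package.get("success"):
        return []
    return [(r.get("name"), r.get("url"))
            for r in package["result"].get("resources", [])]

def fetch_package(dataset_id: str) -> dict:
    """Fetch package metadata over HTTP (network call, shown for completeness)."""
    with urlopen(f"{CKAN_BASE}/package_show?id={dataset_id}") as resp:
        return json.load(resp)
```

The app could then read the CSV resource URL directly instead of hardcoding it in `ssyt_container.py`.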
TODO:
Explore that data source and check that it does not degrade Streamlit performance
TODO:
Try an animation frame by combining scatter points with offered rents and plotting them over a heatmap to show concentrations
density scatter mapbox:
https://plotly.com/python-api-reference/generated/plotly.express.density_mapbox
Check this post:
https://stackoverflow.com/questions/69396009/add-us-county-boundaries-to-a-plotly-density-mapbox
... to: 1. review the overlays in the current map (prices over time) + 2. add a scatter mapbox & density combination
To get point geometries for the density layer it will be necessary to download the data from the cloud query console
TODO:
Adjust map prices for inflation using this method https://github.com/PyMap/ssyt/blob/main/ssyt/visprev.py#L220-L267 and render constant values instead here https://github.com/PyMap/ssyt/blob/main/ssyt/ssyt_container.py#L135. monto is the column to be updated
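A minimal sketch of that adjustment, assuming a monthly IPC series indexed by period label and a DataFrame with a period column next to monto (the column named in the TODO); the index values below are illustrative, not real IPC data:

```python
# Deflate nominal `monto` values to constant prices of `base_period`
# using a price index (e.g. the IPC). Sketch only; column/series names
# other than `monto` are assumptions.
import pandas as pd

def to_constant_prices(df: pd.DataFrame, ipc: pd.Series,
                       base_period: str) -> pd.DataFrame:
    """Return a copy of df with `monto` expressed in `base_period` prices."""
    out = df.copy()
    # per-row deflator: index at base period / index at the row's period
    coef = ipc.loc[base_period] / out["period"].map(ipc)
    out["monto"] = out["monto"] * coef
    return out
```

With this in place, rendering constant values in `ssyt_container.py` is just a matter of calling the deflator before plotting.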
Currently, period selection for the IPC-adjusted series only allows selecting months within the same year:
TODO:
The year / first_month / last_month parameters are selected from st.selectbox; change this so that a different month/year combination can be set as the start/end points of the filter, for example from 08-2018 to 03-2020.
The rent price series from Properati data starts in 2015. Both the IPC and ICL base periods are later (the IPC starts on 01-2017 and the ICL on 07-2020).
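A possible sketch for the cross-year filter, assuming periods are labelled "MM-YYYY" as in the example above; in the app the two endpoints would come from st.selectbox, here they are plain arguments:

```python
# Enumerate every "MM-YYYY" period between two endpoints, so the filter
# can span different years (e.g. 08-2018 to 03-2020). Pure function,
# independent of Streamlit.
def period_range(start: str, end: str) -> list:
    """All "MM-YYYY" labels between start and end, inclusive."""
    sm, sy = (int(x) for x in start.split("-"))
    em, ey = (int(x) for x in end.split("-"))
    periods, m, y = [], sm, sy
    while (y, m) <= (ey, em):          # compare (year, month) tuples
        periods.append(f"{m:02d}-{y}")
        m += 1
        if m > 12:
            m, y = 1, y + 1
    return periods
```

The two st.selectbox widgets would then just pick elements of this list, and the series is filtered to the labels in between.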
IPC: read here https://github.com/PyMap/ssyt/blob/main/ssyt/visprev.py#L194-L218 and downloaded from https://www.indec.gob.ar/indec/web/Nivel4-Tema-3-5-31
To go back before 2017, it would be necessary to explore this source https://www.indec.gob.ar/indec/web/Institucional-Indec-InformacionDeArchivo and check how to fill the missing months of 2015 (Nov/Dec) and the whole of 2016. Another important item is checking how to link both periods (what coefficient should be used?) - 2015 backward + 2017 onward
ICL: read here https://github.com/PyMap/ssyt/blob/main/ssyt/visprev.py#L177-L187, with base period 30.6.20 = 1. Downloaded from http://bcra.gob.ar/PublicacionesEstadisticas/Principales_variables_datos.asp; this is a day-by-day index.
To go backwards, we would also need to define whether a coefficient can be used to link the onward and backward periods. Additionally, we could pick only one day per month (values seem to be pretty similar from day to day) and use it the same way we use the IPC.
The setup.py install_requires installs:
```python
install_requires=[
    #'google-cloud-bigquery >= 1.9.0',
    #'google-cloud-bigquery-storage >= 2.9.1',
    #'google-cloud-bigquery[bqstorage,pandas]',
    'numpy >= 1.21.3',
    'pandas >= 1.3.4',
    'streamlit >= 1.1.0',
    'matplotlib >= 3.4.3',
]
```
but google-cloud-bigquery has a dependency conflict with pyarrow:
```
error: pyarrow 7.0.0 is installed but pyarrow<7.0dev,>=3.0.0 is required by {'google-cloud-bigquery', 'db-dtypes'}
```
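If BigQuery is ever reinstated, one possible fix (an assumption, not tested against these packages) is to pin pyarrow to the range the error message reports as acceptable:

```python
# Hypothetical install_requires that satisfies the reported constraint
# (pyarrow<7.0dev,>=3.0.0 required by google-cloud-bigquery / db-dtypes).
install_requires = [
    'google-cloud-bigquery[bqstorage,pandas] >= 2.9.1',
    'pyarrow >= 3.0.0, < 7.0',  # range taken from the error message above
    'numpy >= 1.21.3',
    'pandas >= 1.3.4',
    'streamlit >= 1.1.0',
    'matplotlib >= 3.4.3',
]
```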
This was initially used to consume data from the Properati marketplace, in the vivienda.py module.
Since it is not being used now, I left it commented out, because BigQuery may not be used anymore.
Currently, the following method https://github.com/PyMap/ssyt/blob/main/ssyt/visprev.py#L14-L73 gets series of nominal prices from the Properati database.
I decided to set it aside because of the query's poor time performance. Going through Streamlit is considerably more time-consuming than the BigQuery workspace.
TODO:
Explore a way to improve time performance when connecting Streamlit with BigQuery to query the database. The idea of the method above is to avoid parsing a CSV with the time series.
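The core idea can be sketched without Streamlit: cache the query result so app reruns reuse it instead of hitting BigQuery again. In the real app this role would be played by Streamlit's caching decorator; here a plain dict keeps the example self-contained, and `run_query` is a placeholder for the actual BigQuery call:

```python
# Minimal query-result cache: the expensive call runs once per distinct
# SQL string; later calls return the stored result. In Streamlit the
# equivalent is decorating the query function with its caching decorator.
_cache = {}

def cached_query(sql: str, run_query):
    """Return a cached result for `sql`, calling `run_query` only on a miss."""
    if sql not in _cache:
        _cache[sql] = run_query(sql)
    return _cache[sql]
```

This does not speed up the first query, but it removes the repeated cost on every widget interaction, which is where the Streamlit/BigQuery combination hurts most.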
Wages are read from this source https://datos.gob.ar/dataset/sspm-indice-salarios-base-octubre-2016/archivo/sspm_149.1 and loaded here https://github.com/PyMap/ssyt/blob/main/ssyt/ssyt_container.py#L67.
They are then formatted here https://github.com/PyMap/ssyt/blob/main/ssyt/visprev.py#L150-L175 when prices are adjusted using the IPC.
Currently we only use the general index, but we could parametrize it and select from the others:
```python
Index(['indice_tiempo', 'indice_salarios', 'indice_salarios_registrado',
       'indice_salarios_registrado_sector_privado',
       'indice_salarios_registrado_sector_publico',
       'indice_salarios_no_registrado_sector_privado'],
      dtype='object')
```
TODO 1: add a new selector to choose from the options above here https://github.com/PyMap/ssyt/blob/main/ssyt/ssyt_container.py#L67
The current wages version we use goes up to 2021-07, but the IPC ends on 2021-10.
TODO 2: set a cut-off point to use prices based on salary availability, or complete the salary series based on its historical behavior
In this notebook https://colab.research.google.com/drive/1-NSwwSANfATUxjYpyGDogZw8xZHjyv1T?usp=sharing, I take the mean of the last available year and progressively extend it up to the IPC limit.
It would be nice to discuss this further and write an automatic way to do it (outside a notebook) without losing Streamlit performance while loading the data.
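A simple variant of the notebook's approach, sketched under the assumption that wages and IPC are monthly Series indexed by "YYYY-MM" labels, with the wage index ending before the IPC's: missing wage months are filled by pushing the last observed wage forward with the IPC's month-over-month growth.

```python
# Extend a wage series up to the last IPC period using IPC growth rates.
# Sketch only; the notebook uses the mean of the last available year as
# its anchor, here the last observed month is used instead.
import pandas as pd

def extend_wages(wages: pd.Series, ipc: pd.Series) -> pd.Series:
    """Fill wage months missing at the tail with IPC-implied growth."""
    out = wages.reindex(ipc.index)          # align to the longer IPC index
    last = wages.index[-1]                  # last month with observed wages
    for period in out.index[out.index.get_loc(last) + 1:]:
        prev = out.index[out.index.get_loc(period) - 1]
        out.loc[period] = out.loc[prev] * ipc[period] / ipc[prev]
    return out
```

Moving this out of the notebook and into visprev.py, behind the same caching used for the other series, should keep the Streamlit load time unaffected.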
TODO:
For the maps section here https://github.com/PyMap/ssyt/blob/main/ssyt/charts.py#L130-L179
... we use scattermapbox to draw polygon limits. Another option for rendering this map:
for the polygons, we can try to optimize rendering by using the TopoJSON format instead of GeoJSON. To go from one format to the other:
```shell
# GeoJSON to TopoJSON (geo2topo ships in topojson-server; toposimplify
# and topoquantize live in topojson-simplify and topojson-client)
sudo npm install -g topojson-server topojson-simplify topojson-client
geo2topo das_4326.geojson > das_4326.topojson
toposimplify -p 1 -f < das_4326.topojson > das_4326_simple.topojson
topoquantize 1e5 < das_4326_simple.topojson > das_4326_quantized.topojson
```
reference: https://medium.com/@mbostock/command-line-cartography-part-3-1158e4c55a1e
```python
import json
import folium

# load the TopoJSON produced with the commands above
def get_toronto_das_topodata():
    with open('data/das_toronto.json') as f:
        das_topo = json.load(f)
    return das_topo

topodas = get_toronto_das_topodata()

# `coords` and `das_toronto` (the indicators DataFrame) are defined elsewhere
folium_map = folium.Map(location=coords,
                        zoom_start=8,
                        tiles="CartoDB dark_matter")

folium.Choropleth(geo_data=topodas,  # was `topo`, which was undefined
                  topojson='objects.das_toronto_data_4326',
                  key_on='feature.properties.DAUID',
                  data=das_toronto[['mean_hh_income', 'average_dwelling_value', 'DAUID']],
                  columns=['DAUID', 'mean_hh_income'],  # key column first, then the value to color by
                  fill_color='GnBu',
                  fill_opacity=0.7,
                  line_opacity=0.5).add_to(folium_map)

folium.TopoJson(topodas,
                object_path='objects.das_toronto').add_to(folium_map)

folium_map.add_child(folium.GeoJson(data=open("data/topo_das_toronto_data_4326.json")))
```
GeoJSON-to-TopoJSON references:
https://medium.com/tech-carnot/topojson-based-choropleth-visualization-using-folium-471113fa5964#dd2f
https://github.com/carnot-technologies/MapVisualizations