$ git clone https://github.com/observatory-economic-complexity/oec-etl
$ cd oec-etl
Use the following as a guide/template for a .env file:
export CLICKHOUSE_URL="127.0.0.1"
export CLICKHOUSE_DATABASE="oec_test"
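Before running a pipeline, it can help to confirm that these variables are actually visible to Python. The snippet below is a minimal sketch and not part of the repository: the helper name missing_settings is hypothetical, and it assumes only the two variable names shown in the template.

```python
import os

# Variable names taken from the .env template; nothing else is assumed.
REQUIRED = ("CLICKHOUSE_URL", "CLICKHOUSE_DATABASE")


def missing_settings(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; pass a dict to test the helper.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("ClickHouse settings found.")
```

Because the template uses export, one way to load it is to run source .env in the same shell before starting Python.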
The countries dimension pipeline is super fast to run and a great way to test that your setup works.
$ python etl/dim_countries_pipeline.py
When adding a new pipeline script, please use the following naming convention:
Format (fact tables): <type>_<depth>_<identifier>_<frequency>_<classification>
Params:
type
: What the fact table represents (trade, tariffs, services, etc.).
depth
: i for international and s for subnational data.
identifier
: For subnational data, this should be the iso3 id for the reporter country. For international data, this should be the organization reporting the data.
frequency
: a for annual and m for monthly.
classification
: The classification used by this table.
Examples:
trade_s_bra_a_hs for annual Brazilian subnational trade data using the HS classification
trade_i_comtrade_m_hs for monthly international Comtrade trade data using the HS classification
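The fact table convention can be checked mechanically. The following is a rough sketch rather than anything shipped in the repository: the name parse_fact_name is hypothetical, and the pattern assumes that every part is lowercase and contains no underscores, which holds for the two examples but is not stated by the convention itself.

```python
import re

# Assumed pattern for <type>_<depth>_<identifier>_<frequency>_<classification>.
# Lowercase parts without underscores are an assumption, not a documented rule.
FACT_NAME = re.compile(
    r"^(?P<type>[a-z]+)_"
    r"(?P<depth>[is])_"
    r"(?P<identifier>[a-z0-9]+)_"
    r"(?P<frequency>[am])_"
    r"(?P<classification>[a-z0-9]+)$"
)


def parse_fact_name(name):
    """Split a fact table name into its parts, or return None if it does not fit."""
    match = FACT_NAME.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    # Subnational tables are identified by a three-letter iso3 id.
    if parts["depth"] == "s" and not re.fullmatch(r"[a-z]{3}", parts["identifier"]):
        return None
    return parts
```

With this sketch, parse_fact_name("trade_s_bra_a_hs") yields a dict whose identifier is bra, while a name with an unknown depth letter such as trade_x_bra_a_hs returns None.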
Format (dimension tables): dim_<identifier>_<dimension>
Params:
identifier
: For subnational data, this should be the iso3 id for the reporter country. For international data, this should say shared.
dimension
: The name of the dimension held in this table.
Examples:
dim_shared_countries for a shared countries table
dim_rus_regions for a Russia dimension table representing national regions
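Dimension table names can be built the same way. This is an illustrative sketch only: dim_table_name is a hypothetical helper, and lowercasing the iso3 id is an assumption based on the examples above.

```python
def dim_table_name(dimension, iso3=None):
    """Build a name of the form dim_<identifier>_<dimension>.

    Omit `iso3` for international data, which uses the identifier "shared".
    """
    if iso3 is None:
        identifier = "shared"
    else:
        # Assumption: iso3 ids are written in lowercase, as in dim_rus_regions.
        identifier = iso3.lower()
        if len(identifier) != 3 or not identifier.isalpha():
            raise ValueError(f"expected a three-letter iso3 id, got {iso3!r}")
    return f"dim_{identifier}_{dimension}"
```

Calling dim_table_name("countries") gives dim_shared_countries, and dim_table_name("regions", "RUS") gives dim_rus_regions.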