Comments (7)
I don't see a reason why there needs to be a translation to "Variables" at the end, in particular because the `groupby vintage_year` step probably does not always apply. An easier way would be to identify each of the types 'primitives', 'derivatives' and 'aggregates' by an IAMC-compliant variable.
from message_ix.
Another observation: the linearity does not always hold. As an example, take energy intensity of GDP. To compute that at the global level, you need to first aggregate, then derive.
> I don't see a reason why there needs to be a translation to "Variables" at the end, in particular because the `groupby vintage_year` step probably does not always apply. An easier way would be to identify each of the types 'primitives', 'derivatives' and 'aggregates' by an IAMC-compliant variable.
My initial thought on the "Variables" section is that there we "lose" information from the model. For example, in order to accurately calculate some aggregate, we may need to weight by vintage year. The IAMC data template makes no explicit mention of this (i.e., it is message_ix model specific). Thus this would be an interface/processing step, invoked after all operations that need message_ix-specific information are performed.
In any case, I completely agree that we should plan to support "optimization" of the process to calculate only those data that users are interested in. Using this proposal, one way would be to:
- take a list of IAMC variables of interest as input
- identify all necessary root nodes (Cost|Fossil Fuel in this case)
- work up the tree to identify necessary aggregates/derivatives/primitives
- initiate the process listing only necessary data
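The steps above can be sketched as a traversal of a dependency table. Everything here (the `DEPENDS` mapping, key names, and the `required` helper) is illustrative only, assuming some table that records what each reported quantity is computed from; it is not the message_ix API.

```python
from typing import Dict, List, Set

# Hypothetical dependency table: each reported quantity lists the
# quantities (aggregates, derivatives, or primitive model data) that
# it is computed from. Names are made up for illustration.
DEPENDS: Dict[str, List[str]] = {
    "Cost|Fossil Fuel": ["Cost|Fossil Fuel|Coal", "Cost|Fossil Fuel|Gas"],
    "Cost|Fossil Fuel|Coal": ["var_cost:coal"],  # primitive
    "Cost|Fossil Fuel|Gas": ["var_cost:gas"],    # primitive
}


def required(variables: List[str]) -> Set[str]:
    """Collect every quantity needed to report *variables*."""
    needed: Set[str] = set()
    stack = list(variables)
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.add(name)
        # Work up the tree: add whatever this quantity is computed from.
        stack.extend(DEPENDS.get(name, []))
    return needed


# Only data reachable from the requested variables would be processed.
subset = required(["Cost|Fossil Fuel"])
```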
> Another observation: the linearity does not always hold. As an example, take energy intensity of GDP. To compute that at the global level, you need to first aggregate, then derive.
Point taken that this proposal may not fit all use cases (including ones I haven't thought of!), but (as discussed in person) this example could be computed as follows:
- regional EI: energy / GDP
- global EI: GDP-weighted sum of regional EI
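A small numeric check of the two-step calculation above, with made-up numbers: regional EI first, then a GDP-weighted combination (normalized by total GDP) to get the global value. This agrees exactly with the "aggregate first, then derive" result.

```python
# Toy regional data (values and units are illustrative only).
energy = {"R1": 100.0, "R2": 50.0}  # e.g. EJ/yr
gdp = {"R1": 40.0, "R2": 10.0}      # e.g. trillion USD/yr

# Step 1 (derive, per region): EI = energy / GDP
ei = {r: energy[r] / gdp[r] for r in energy}

# Step 2 (aggregate): GDP-weighted combination of regional EI
total_gdp = sum(gdp.values())
ei_global = sum(gdp[r] * ei[r] for r in ei) / total_gdp

# Same as aggregating first, then deriving: total energy / total GDP
assert ei_global == sum(energy.values()) / total_gdp  # 150 / 50 = 3.0
```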
> In any case, I completely agree that we should plan to support "optimization" of the process to calculate only those data that users are interested in. Using this proposal, one way would be to:
>
> - take a list of IAMC variables of interest as input
> - identify all necessary root nodes (Cost|Fossil Fuel in this case)
> - work up the tree to identify necessary aggregates/derivatives/primitives
> - initiate the process listing only necessary data
The machine learning world has spawned a lot of packages for managing "workflows". Here's an example from Google's TensorFlow (look at the flowchart). The idea is that each item to be calculated depends on its inputs, in a directed acyclic graph (DAG). Identifying a subset of the nodes and edges is called "pruning".

Usually this is done so that, in big-data applications, the different operations can be farmed out/parallelized across multiple nodes/languages. MESSAGE doesn't need those features, so e.g. Luigi is overkill. But we could locate and take advantage of existing packages that implement DAGs (including pruning)… e.g. dask implements graphs; I just stumbled across Pinball, and there are likely others.
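To make the pruning idea concrete, here is a dependency-free toy in the spirit of dask's task-graph format (a dict mapping keys to either values or `(function, *argument_keys)` tuples). The `cull` and `get` functions below are minimal sketches, not dask's implementation (dask exposes similar functionality, e.g. `dask.optimization.cull`).

```python
from operator import add, mul

# Toy reporting graph; names are illustrative only.
dsk = {
    "coal": 1.0,
    "gas": 2.0,
    "fossil": (add, "coal", "gas"),   # aggregate
    "doubled": (mul, "fossil", 2.0),  # derivative
    "unused": (mul, "gas", 10.0),     # not needed for "doubled"
}


def cull(dsk, keys):
    """Prune: keep only the tasks reachable from *keys*."""
    out, stack = {}, list(keys)
    while stack:
        k = stack.pop()
        if k in out:
            continue
        out[k] = dsk[k]
        if isinstance(dsk[k], tuple):
            stack.extend(a for a in dsk[k][1:] if a in dsk)
    return out


def get(dsk, key):
    """Tiny recursive scheduler for the toy graph."""
    task = dsk[key]
    if isinstance(task, tuple):
        func, *args = task
        vals = [get(dsk, a) if isinstance(a, str) and a in dsk else a
                for a in args]
        return func(*vals)
    return task


pruned = cull(dsk, ["doubled"])   # "unused" is dropped
result = get(pruned, "doubled")   # (1.0 + 2.0) * 2.0 = 6.0
```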
> Another observation: the linearity does not always hold. As an example, take energy intensity of GDP. To compute that at the global level, you need to first aggregate, then derive.
There are numerous different operations required for calculating the global variable:

- In some cases, for example with capital costs, no value is reported.
- For prices, we often use the max value; in some cases, e.g. for reporting the global price of gas, we actually report the global LNG price, therefore requiring a combination of two commodities reported as the same variable.
- Some variables need to be calculated using population data, i.e. anything that is reported per capita.

Deriving global values is very diverse, and what I did is to always recalculate the `glb` variable in a separate step, after calculating regional values.
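The cases above suggest a per-variable rule table for the global step. This is a hypothetical sketch of that idea; the variable names, rule names, and numbers are all made up and do not correspond to any actual reporting configuration.

```python
# Illustrative regional values and population weights.
regional = {
    "Price|Gas": {"R1": 6.0, "R2": 9.0},
    "Capital Cost|Coal": {"R1": 1500.0, "R2": 1800.0},
    "Emissions per Capita": {"R1": 8.0, "R2": 4.0},
}
population = {"R1": 1.0, "R2": 3.0}  # e.g. billions

# Hypothetical rule table: how each variable's global value is produced.
GLOBAL_RULE = {
    "Price|Gas": "max",           # report the max regional value
    "Capital Cost|Coal": "skip",  # no global value is reported
    "Emissions per Capita": "pop_weighted",
}


def world_value(var):
    """Recalculate the global value in a separate, rule-based step."""
    rule, values = GLOBAL_RULE[var], regional[var]
    if rule == "skip":
        return None
    if rule == "max":
        return max(values.values())
    if rule == "pop_weighted":
        total = sum(population.values())
        return sum(population[r] * v for r, v in values.items()) / total
    raise ValueError(f"unknown rule: {rule}")
```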
Closing; we will centralize continued discussion on #149 and the linked wiki page.