Comments (12)

francois-rozet avatar francois-rozet commented on June 15, 2024

Hello @zengjice1991, thank you for the kind words. There is a tutorial on setting up an embedding network for the observation $x$. In the tutorial, $x$ is an image of shape $(C, H, W)$, but it should be fairly easy to adapt!

from lampe.

zengjice1991 avatar zengjice1991 commented on June 15, 2024

Thanks for the prompt reply. I will take a close look at the example and try to adapt it to time-series data.

francois-rozet avatar francois-rozet commented on June 15, 2024

Hello @zengjice1991, did the tutorial help you?

zengjice1991 avatar zengjice1991 commented on June 15, 2024

Hi @francois-rozet, thanks for checking in. I am trying to adapt the embedding network for FMPE, which relaxes the strict constraints of NPE. Does the framework train FMPE and the embedding network concurrently? I tried to apply FMPE to an SBI benchmark problem like SIR, which has multivariate time-dependent data (time steps x channels), but it cannot be trained directly. Do you have any experience with a similar case in which FMPE is trained on multivariate time-dependent data?

francois-rozet avatar francois-rozet commented on June 15, 2024

Is the thing you are trying to predict ($\theta$ usually) a time series or is it the condition ($x$)?

zengjice1991 avatar zengjice1991 commented on June 15, 2024

I am trying to estimate the posterior of physical parameters conditioned on data, i.e. $p(\theta \mid x)$, where $x$ is time-series data of shape $m \times n$, with $m$ the number of time steps and $n$ the number of channels.

francois-rozet avatar francois-rozet commented on June 15, 2024

Ok, then the embedding approach should work. You just have to replace the CNN embedding with another network that takes a time series as input and returns a vector (for example, an RNN).
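As a minimal sketch of such a swap (hypothetical: the class name and layer sizes below are placeholders, not part of lampe's API), a GRU-based embedding that maps a batch of time series to vectors could look like:

```python
import torch
import torch.nn as nn

class GRUEmbedding(nn.Module):
    """Compresses a time series of shape (T, C) into a fixed-size vector.

    Hypothetical sketch: name and sizes are placeholders, not lampe defaults.
    """

    def __init__(self, channels: int, features: int = 64):
        super().__init__()
        self.gru = nn.GRU(channels, features, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, T, C); the last hidden state summarizes the series
        _, h = self.gru(x)
        return h[-1]  # (batch, features)

x = torch.randn(8, 100, 3)  # 8 series, 100 time steps, 3 channels
emb = GRUEmbedding(channels=3)(x)
print(emb.shape)  # torch.Size([8, 64])
```

Any network with the same signature (time series in, vector out) should work in its place.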

If you already tried but failed, you can paste your code here for us to debug.

zengjice1991 avatar zengjice1991 commented on June 15, 2024

Hi @francois-rozet,
I applied FMPE to my case, in which the data is time dependent. I defined a 1D CNN as the embedding network to compress the data from [15000, 3] to [1024]. I plotted the recovery plot:
[recovery plot image]
It works well.
I have a few questions about FMPE; can I discuss them with you?

  1. Defining the time 't' in FMPE:
    In the FMPE model, I see the forward function defined as:
    def forward(self, theta: torch.Tensor, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.fmpe(theta, self.embedding(x), t)

I am confused about how to define the time parameter 't'. In my current implementation, I have not explicitly defined 't'. I suspect it is related to the freqs argument, which has a default value of 3. Could you explain what 't' represents and how it should be defined if we use the default freqs?

  2. I tried to check the summary information about the estimator using torchsummary(estimator, [(4,), (1, 1024), (10,)]), but I got the error: RuntimeError: Given groups=1, weight of size [64, 3, 2], expected input[2, 1024, 1] to have 3 channels, but got 1024 channels instead.
    It seems there is a mismatch in the input shapes. Could you help me understand how to correctly specify the input sizes for the summary?

  3. Do you have any suggestions on how to effectively tune the FMPE model? Specifically, which hyperparameters are most critical for FMPE? For instance, how do parameters like freqs, the tolerance, and eta impact the model's performance, and what should I consider when adjusting them?

Thank you so much for the excellent toolkit. I look forward to your response.

francois-rozet avatar francois-rozet commented on June 15, 2024
  1. Where NPE relies on a normalizing flow with a discrete number of transformations, FMPE relies on a continuous normalizing flow, such that the transformation from $\theta$ to the latent space is defined as an integral from $t = 0$ to $t = 1$ of a time-dependent vector field $v_\phi(\theta, x, t)$. In FMPE, the vector field is an MLP, and to condition it on the scalar time $t$, the latter has to be embedded as a vector, which we do using a typical frequency embedding.

lampe/lampe/inference.py

Lines 626 to 627 in 8eac904

t = self.freqs * t[..., None]
t = torch.cat((t.cos(), t.sin()), dim=-1)

  2. Without a look at the code, I don't know how to help you.

  3. In my experiments, using soft activation functions (e.g. activation=nn.ELU) and normalization (normalize=True) helps. Using residual connections (e.g. build=lampe.nn.ResMLP) can also be better, and deeper/wider networks (hidden_features) help as well.

However, I would strongly recommend that you compare your results with NPE using a zuko.flows.NSF flow, because it is often better and faster than FMPE when $\theta$ is low dimensional.
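To make the quoted frequency embedding concrete, here is a small standalone sketch of those two lines; the exact frequency values (arange(1, freqs + 1) * pi) are an assumption about lampe's internals, not a quotation of its source:

```python
import math
import torch

# Sketch of the frequency embedding of the scalar time t (assumed
# frequencies: multiples of pi, with the default freqs=3).
freqs = torch.arange(1, 4) * math.pi  # shape (3,)
t = torch.rand(5)                     # batch of 5 times in [0, 1]
t = freqs * t[..., None]              # shape (5, 3)
emb = torch.cat((t.cos(), t.sin()), dim=-1)
print(emb.shape)  # torch.Size([5, 6])
```

Each scalar time thus becomes a vector of 2 * freqs features that the MLP can condition on.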

zengjice1991 avatar zengjice1991 commented on June 15, 2024

I am very grateful for your explanation.

  1. Does the value of the frequency embedding affect FMPE a lot? The default is 3; is a lower or higher value better? Any suggestions on it?
  2. In the expression $t$ = self.freqs * $t$[..., None], besides the number of frequencies (set to 3), we also need to specify $t$. Do we need to sample $t$ from a uniform distribution on $[0, 1]$, or do we just take $t$ as 0 and 1?
  3. I did apply NPE to my case; the results are similar to those from FMPE. Probably my case is relatively simple.

Thanks!

francois-rozet avatar francois-rozet commented on June 15, 2024
  1. More frequencies means that the network can discriminate between values of $t$ more easily. This can improve the granularity of the learned vector field, but in my experiments, it usually makes the numerical integration of the vector field less stable.
  2. During training, $t$ is sampled uniformly between $0$ and $1$ (see FMPELoss). During inference, $t$ is handled by the ODE integrator. You should not have to provide $t$ by hand.
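As a rough illustration of the training-time sampling described above, here is a sketch of conditional flow matching with a simple linear interpolation path. This is an assumption for illustration, not lampe's FMPELoss, whose exact path and parametrization may differ:

```python
import torch

# Hypothetical flow-matching training step (linear path assumption):
theta = torch.randn(16, 4)        # batch of parameters theta
t = torch.rand(theta.shape[0])    # t ~ U(0, 1), one time per sample
eps = torch.randn_like(theta)     # latent noise (the t = 0 end)
t_ = t[:, None]

theta_t = (1 - t_) * eps + t_ * theta  # point on the path at time t
target = theta - eps                   # regression target for v_phi(theta_t, x, t)
```

At inference, no such sampling happens: the ODE integrator supplies $t$ as it integrates the learned vector field from $t = 0$ to $t = 1$.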

zengjice1991 avatar zengjice1991 commented on June 15, 2024

I understand now. Thank you for your detailed explanation of the method!
I feel more confident in applying the method moving forward. Thank you again for your help!
