Arxiv/Blog/Paper Link: https://arxiv.org/pdf/2306.06079v2.

Add MetNet-3 about metnet HOT 9 OPEN

openclimatefix commented on September 2, 2024 3

Add MetNet-3

from metnet.

Comments (9)

Raahul-Singh commented on September 2, 2024 3

Found an implementation of Metnet 3: https://github.com/lucidrains/metnet3-pytorch

meteoDaniel commented on September 2, 2024 2

Here is another implementation, already finished:
https://github.com/kyegomez/metnet3

JackKelly commented on September 2, 2024 1

Sounds great! Well done for spotting this publication!

MetNet-3 uses a modified MaxViT model in the centre of the U-Net. Here's the MaxViT paper. The MaxViT authors have also released TensorFlow code. But, TBH, MaxViT sounds so simple that it's probably easier to re-implement MaxViT in PyTorch directly from the MaxViT paper 🙂

jacobbieker commented on September 2, 2024 1

Found a website with easily downloadable weather station data for the whole world, including the UK and other countries: https://mesonet.agron.iastate.edu/request/download.phtml?network=GB__ASOS. There is an example scraper script for it at https://github.com/akrherz/iem/blob/main/scripts/asos/iem_scraper_example.py

jacobbieker commented on September 2, 2024

They use dense and sparse inputs, as well as outputs.

They use HRRR output as an extra training target, but don't actually look at those predictions
Uses a very large center context of 2500 km and an extra-large area of 5000 km, and forecasts out to 24 hours
Masks out 25% of sites per example, to help with densification
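
The site masking in the last note could look roughly like the sketch below. It assumes that masked stations are hidden from the model inputs but kept as targets (my reading of "densification"); the function name `mask_stations` and its parameters are hypothetical, not taken from the paper or any implementation.

```python
import random


def mask_stations(station_ids, mask_fraction=0.25, seed=None):
    """Split stations into (kept_inputs, masked_out) for one training example.

    Assumption: masked stations are dropped from the inputs only, so the
    model still has to predict at locations it did not observe.
    """
    rng = random.Random(seed)
    ids = list(station_ids)
    n_masked = round(len(ids) * mask_fraction)
    masked = set(rng.sample(ids, n_masked))
    kept = [s for s in ids if s not in masked]
    return kept, sorted(masked)


# 942 is the station count quoted in this thread; IDs here are just integers.
kept, masked = mask_stations(range(942), seed=0)
```

A fresh mask would be drawn for every example, so each station is an input most of the time and a held-out target some of the time.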

jacobbieker commented on September 2, 2024

More notes:

They use 942 weather stations, and train by assigning each station's observations to the 4x4 km pixel in which it lies. If there are multiple weather stations in a single pixel, they average the values.
For evaluation, they don't give the model any past data from the eval weather stations, so an eval station is the same as any other grid point to compare against. They then also give the eval weather station history to produce hyperlocal forecasts, which are a decent improvement (could be very useful for site-level forecasts)
Inputs include a topographical embedding, instead of directly giving a topo map or land/sea mask: a grid with 4 km stride and 20 parameters per grid point: "For each input example, we calculate the topographical embedding of each input pixel center by bilinearly interpolating the embeddings from the grid. The embedding parameters are trained together with other model parameters similarly to embeddings used in NLP."
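
The station-to-pixel assignment from the first note above can be sketched as follows. This is a minimal illustration, assuming station positions are already in a projected coordinate system measured in km (the projection is not described in these notes); `rasterize_stations` is a hypothetical helper name.

```python
from collections import defaultdict

PIXEL_KM = 4.0  # grid stride quoted in the notes


def rasterize_stations(observations, pixel_km=PIXEL_KM):
    """Average station values per grid pixel.

    observations: iterable of (x_km, y_km, value) tuples (assumed layout).
    Returns {(row, col): mean value} for every pixel with >= 1 station.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x_km, y_km, value in observations:
        pixel = (int(y_km // pixel_km), int(x_km // pixel_km))
        sums[pixel] += value
        counts[pixel] += 1
    return {pixel: sums[pixel] / counts[pixel] for pixel in sums}


# Two stations share pixel (0, 0); the third sits alone in pixel (0, 2).
obs = [(1.0, 1.0, 10.0), (3.0, 2.0, 14.0), (9.0, 1.0, 7.0)]
grid = rasterize_stations(obs)
```

Pixels without any station are simply absent from the result, which keeps the sparse nature of the targets explicit.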

Architecture:

"Data is then processed by a U-Net backbone, which starts with applying two convolutional ResNet blocks [9] and downsampling the data to 8 km resolution. We then pad the internal representation spatially with zeros to 4992 km by 4992 km square and concatenate with the low-resolution, large-context inputs. Afterward, we again apply two convolutional ResNet blocks and downsample the representation to 16 km resolution. Convolutional ResNet blocks can only handle local interactions and for longer lead times close to 24 hours, the targets may depend on the entire input. In order to facilitate that, we process the data at 16 km resolution using a modified version of MaxVit [22] network. MaxVit is a version of Vision Transformer (ViT, [6]) with attention over local neighbourhood as well as global gridded attention. We modify the MaxVit architecture by removing all MLP sub-blocks, adding skip connections (to the MaxVit output) after each MaxVit sub-block, and using normalized keys and queries in attention [5]. Afterwards, we take the central crop of size 768 km by 768 km, and gradually upsample the representation to 4 km resolution using skip connections from the downsampling path, at which point we again take a central crop, this time of size 512 km by 512 km. The network outputs a categorical distribution over 256 bins for each of 6 ground weather variables and a deterministic prediction for each of 617 assimilated weather state channels using an MLP with one hidden layer applied to the representation at 4 km resolution. For precipitation (both instantaneous rate and hourly accumulation), we upsample the representation to 1 km resolution and output for each pixel a categorical distribution over 512 bins."
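
To sanity-check the spatial sizes in that quote, here is a small sketch that converts each extent and resolution into a pixel count and shows what a central crop does. The helper names and the stage labels are made up for illustration, and the 1 km precipitation size assumes the head works on the final 512 km crop.

```python
def pixels(extent_km, resolution_km):
    """Pixels along one side of a square extent at a given resolution."""
    assert extent_km % resolution_km == 0
    return extent_km // resolution_km


def central_crop(grid, size):
    """Take the central size x size window from a square 2D list."""
    offset = (len(grid) - size) // 2
    return [row[offset:offset + size] for row in grid[offset:offset + size]]


stages = {
    "padded context at 8 km": pixels(4992, 8),
    "MaxViT input at 16 km": pixels(4992, 16),
    "768 km crop at 16 km": pixels(768, 16),
    "768 km crop upsampled to 4 km": pixels(768, 4),
    "512 km crop at 4 km": pixels(512, 4),
    "512 km precipitation head at 1 km (assumed)": pixels(512, 1),
}

# Crop a dummy 312 x 312 feature map (cells hold their own coordinates).
side = stages["MaxViT input at 16 km"]
feature = [[(r, c) for c in range(side)] for r in range(side)]
crop = central_crop(feature, stages["768 km crop at 16 km"])
```

So the MaxViT part sees a 312 x 312 grid, and only the central 48 x 48 cells are carried into the upsampling path.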

Lead time is included by applying a time embedding both additively and multiplicatively to blocks, same as MetNet-2
Lead times aren't sampled uniformly during training: sampling follows an exponential drop-off, with t=0 being 10 times as likely to be shown as t=24h
Trained with cross-entropy loss, after rescaling the losses to be of similar magnitudes. MSE is used for the forecast of the HRRR assimilation state, although those predictions weren't looked at, they just helped training
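
The lead-time sampling note could be implemented along these lines. The exact functional form is not given in these notes, so the sketch assumes weights of the form w(t) = 10^(-t/24), which gives exactly a 10:1 ratio between t=0 and t=24h; the hourly spacing and the function names are also assumptions.

```python
import random


def lead_time_weights(lead_times_h, max_lead_h=24.0, ratio=10.0):
    """Unnormalised sampling weights that decay exponentially with lead time.

    Assumed form: w(t) = ratio ** (-t / max_lead_h), so w(0) / w(max) == ratio.
    """
    return [ratio ** (-t / max_lead_h) for t in lead_times_h]


def sample_lead_time(lead_times_h, rng):
    """Draw one lead time for a training example."""
    weights = lead_time_weights(lead_times_h)
    return rng.choices(lead_times_h, weights=weights, k=1)[0]


lead_times = list(range(0, 25))  # hourly steps, hypothetical spacing
weights = lead_time_weights(lead_times)
rng = random.Random(0)
draws = [sample_lead_time(lead_times, rng) for _ in range(1000)]
```

With this scheme short lead times dominate each batch, while long lead times still show up regularly.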

Author Notes:

There is a tradeoff in performance between the precipitation forecast and the ground variables: improving one decreased performance for the other
To work around this, they first trained a primarily precipitation-focused model, then "afterwards we increase the weight of the OMO loss by 100x compared to the precipitation model and finetune the model. Moreover, we disable topographical embedding (fix them to zeros) for this OMO-specific model because topographical embedding may hinder transfer between different locations, which is crucial for learning only from targets present at a sparse set of locations."
Loss scaling

jacobbieker commented on September 2, 2024

ASOS 1 minute weather data (public and freely accessible): https://madis.ncep.noaa.gov/madis_OMO.shtml

jacobbieker commented on September 2, 2024

Also, they mention that MetNet-3 is being used for operational forecasts in Google Search already

jacobbieker commented on September 2, 2024

Yeah, timm also has a PyTorch implementation of MaxViT; we could either use it directly or base ours on it
