
EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024 1

@JessicaMeixner-NOAA Glad to see you are getting non-zeroes for C768mx025. Were the C768mx025 test cases also based on the HR3 tag (not just the C48mx500 test case)? Also, did you run the C768mx025 test case using both your old build and your new build?

Also, I ran an old version of ocnicepost offline. I got non-zeroes in the interpolated NetCDF output. In this test, however, the resolution of the NetCDF input (MOM6) data was mx025.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024 1

@EricSinsky-NOAA It is nice to see some non-zero values, for sure!!

For the tests I ran with the HR3 tag, I ran both the old build and the new build, and both had non-zeros.

jiandewang avatar jiandewang commented on August 31, 2024

@JessicaMeixner-NOAA we need to check the regular-grid ocean nc files (which are used as input for converting to grib2), but they were erased in the g-w runs. For example, the following doesn't exist anymore:
/scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.2629439

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

@jiandewang I'll rewind and re-run one of them and save the rundir. I'll post back here when I have that.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

Here's the saved output, @jiandewang:

TMP: /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.4064953
LOG: /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/cold03/COMROOT/cold03/logs/2021070306/gfsocean_prod_f234-f240.log
COM: /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/cold03/COMROOT/cold03/gfs.20210703/06/products/ocean

jiandewang avatar jiandewang commented on August 31, 2024

@JessicaMeixner-NOAA quick check for these three files:
ocean.nc: ocean native grid master file, looks good
ocean.0p25.nc: regular grid, all zero
ocean.1p00.nc: regular grid, all zero

So the problem happened in the tripolar-to-regular step; let me go through the log file to see if there is any clue.
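
For anyone double-checking, here is a minimal sketch of how one might confirm an all-zero field with NCO ("SST" is an illustrative variable name, not necessarily one present in these files):

ncwa -O -y min -v SST ocean.0p25.nc min.nc && ncdump min.nc | tail -n 3   # global minimum
ncwa -O -y max -v SST ocean.0p25.nc max.nc && ncdump max.nc | tail -n 3   # global maximum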

jiandewang avatar jiandewang commented on August 31, 2024

@JessicaMeixner-NOAA can you re-run it but set debug to true?
See the last line of /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.4064953/ocnicepost.nml
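
For anyone repeating this, the toggle is that last namelist line; something like the following would flip it (a sketch, assuming the flag is written exactly as debug = .false. in ocnicepost.nml):

grep -n debug ocnicepost.nml                               # confirm the current setting
sed -i 's/debug = .false./debug = .true./' ocnicepost.nml  # flip it on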

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

@jiandewang here's the output with debug=true:
/scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181

aerorahul avatar aerorahul commented on August 31, 2024

The output with debug = .true. traces the code execution.
I ran ncview and ncdump on the intermediate files, e.g. ocean.0p25.rdbilin3d.nc, etc., but I am unable to get any clues from them.
I wondered if there had been a change in the interpolation weights.
So I looked at /scratch1/NCEPDEV/global/glopara/fix/mom6/20240416/post/mx025/
and the timestamp on these files is 20240403, which seems reasonable.

If needed, I can dig deeper into the interpolation code.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

@GwenChen-NOAA do you have an idea as to what is going on? We'd appreciate your help in determining the issue here.

jiandewang avatar jiandewang commented on August 31, 2024

I am trying to understand the run sequence for this post job: the fcst step generates the native ocean netCDF file, which is then copied as ocean.nc; key variables are further cut out and saved as ocean_subset.nc. Which one is used as input for post? ocean.nc or ocean_subset.nc?

ls -l /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181/ocean*nc

-rw-r--r-- 1 Jessica.Meixner climate 1328960900 May 22 10:46 ocean.0p25.nc
-rw-r--r-- 1 Jessica.Meixner climate 83412020 May 22 10:45 ocean.1p00.nc
-rw-r--r-- 1 Jessica.Meixner climate 2090477767 May 21 13:06 ocean.nc
-rw-r--r-- 1 Jessica.Meixner climate 1959785283 May 22 10:46 ocean_subset.nc

ocean.1p00.nc is generated 1 minute before ocean_subset.nc

Looking at line 74 of /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181/ocean.post.log:
it shows the min/max before and after the interpolation, and the numbers there are totally fine. But somehow the final products are all zero. Really puzzled here.
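
(The "line 74" above refers to a line inside the log file, not this thread; e.g., to view that region:)

sed -n '70,80p' ocean.post.log   # prints the pre/post-interpolation min/max lines around line 74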

GwenChen-NOAA avatar GwenChen-NOAA commented on August 31, 2024

@JessicaMeixner-NOAA, can you provide the number of the sea-ice PR that just merged? It would be helpful to look at the code changes.

GwenChen-NOAA avatar GwenChen-NOAA commented on August 31, 2024

> I am trying to understand the run sequence for this post job: the fcst step generates the native ocean netCDF file, which is then copied as ocean.nc; key variables are further cut out and saved as ocean_subset.nc. Which one is used as input for post? ocean.nc or ocean_subset.nc?

@jiandewang, the ocean.nc files are used to generate grib2 files. The ocean_subset.nc files are moved to the /products directory as the netcdf products to be distributed through NOMADS.
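
As a minimal sketch of that subsetting step (not necessarily how the workflow implements it; the variable list here is purely illustrative):

ncks -O -v SST,SSS,speed ocean.nc ocean_subset.nc   # keep only the key variables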

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

@jiandewang I think ocean.nc is used to create ocean_subset.nc - I could be wrong... let me look into that more.

@GwenChen-NOAA - The PR is #2584. I did confirm that output on Hera from before this PR was merged also had the issue where the grib files were all zeros, so the sea-ice analysis PR is not the cause of this problem. I'm not sure how long this issue has been in the develop branch, or whether it's just a Hera issue or something else.

GwenChen-NOAA avatar GwenChen-NOAA commented on August 31, 2024

> @GwenChen-NOAA - The PR is #2584. I did confirm that output on Hera from before this PR was merged also had the issue where the grib files were all zeros, so the sea-ice analysis PR is not the cause of this problem. I'm not sure how long this issue has been in the develop branch, or whether it's just a Hera issue or something else.

@JessicaMeixner-NOAA, can you run it on WCOSS2? I know the downstream package can only run on WCOSS2.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

@GwenChen-NOAA The ocean post products should be able to be generated on RDHPCS, not just WCOSS2. I don't have a workflow set up there right now, so it would be great if you could try that out to see if it works.

I did find an old run from when I was trying to update the ufs-weather-model to a more recent version, and it has non-zero fields (/scratch1/NCEPDEV/climate/Jessica.Meixner/testgw2505/test02/COMROOT/test02/gfs.20191203/00/products/ocean/grib2/1p00/gfs.ocean.t00z.1p00.f072.grib2, for example). The g-w version included updates from an April 17th commit. We could also look into whether the Hera modules were updated within the ufs-weather-model between those updates, as I do think this job uses the ufs-weather-model modules.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

Okay, I did confirm that the ufs-weather-model modules have not changed on Hera, so it's not just that.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

@EricSinsky-NOAA I see that you've been running some ocean/ice post recently. Thought I'd ping you here to see if you've noticed whether the ocean grib files were all zeros or constant in any of your testing.

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@JessicaMeixner-NOAA I just ran the C48_S2SWA_gefs CI test case today using the most recent hash (7d2c539). I also see all zeroes in the gridded (5 degree) ocean data. The gridded NetCDF data is all zeroes as well (not just the gridded grib2 data).

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

> @JessicaMeixner-NOAA I just ran the C48_S2SWA_gefs CI test case today using the most recent hash (7d2c539). I also see all zeroes in the gridded (5 degree) ocean data. The gridded NetCDF data is all zeroes as well (not just the gridded grib2 data).

@EricSinsky-NOAA thanks for the info! what machine was that on?

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

> @EricSinsky-NOAA thanks for the info! what machine was that on?

@JessicaMeixner-NOAA This test was on Cactus.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

Thanks @EricSinsky-NOAA, seems like this is not just a Hera issue then.

I'm re-running my case on Hera where I went back and found I had the output I expected. I'm then going to merge in develop and see how that goes as well. Hopefully I will have an update on that this afternoon.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

Okay, my re-run of a case where I thought I had previously gotten non-zero grib2 output did not give me non-zeros this time... I believe that should rule out the model version, but I'm not sure what to look at now...

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

@GwenChen-NOAA when you tested #2611, did you get non-zero grib2 output files?

GwenChen-NOAA avatar GwenChen-NOAA commented on August 31, 2024

> @GwenChen-NOAA when you tested #2611, did you get non-zero grib2 output files?

@JessicaMeixner-NOAA, my test used an old version of the ocean.0p25.nc file (i.e., the latlon netCDF file output from ocnicepost) and worked fine. I saw that the ocean.0p25.nc file under /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181 also contains all zeros. I found a recently closed issue (#2483) that updated fix files for CICE and MOM6/post. Perhaps @DeniseWorthen can provide some clues here.

aerorahul avatar aerorahul commented on August 31, 2024

> @GwenChen-NOAA when you tested #2611, did you get non-zero grib2 output files?

> @JessicaMeixner-NOAA, my test used an old version of the ocean.0p25.nc file (i.e., the latlon netCDF file output from ocnicepost) and worked fine. I saw that the ocean.0p25.nc file under /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181 also contains all zeros. I found a recently closed issue (#2483) that updated fix files for CICE and MOM6/post. Perhaps @DeniseWorthen can provide some clues here.

Issue #2483 only added/corrected the 5-degree fix file. It did not alter the 0.25-degree or 1.0-degree fix files.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

Thanks @aerorahul for that information!

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

I just ran the C48_S2SW CI test case on Cactus using the 5/13/2024 commit hash (6ca106e). The gridded ocean data still consists of all zeroes as of the 5/13/2024 g-w version. I will keep going back to earlier commit hashes to get a better idea of when and why this issue started.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

I updated to the latest version of the ufs-weather-model on Hera, ran another test, and still got all zeros in the gribs. @EricSinsky-NOAA we know at least the HR3 tag 6f9afff from Feb 21st has non-zero gribs on WCOSS2. On Hera, the furthest back we can take g-w would be the Rocky 8 transition commit.

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

Thank you @JessicaMeixner-NOAA for confirming that we still had non-zero gribs as of Feb 21st. @jiandewang When you checked PR #2484 on April 17th (this PR added a stricter dependency to the ocean_prod rocoto task), do you remember whether the gridded netcdf/grib2 data was non-zero? I just completed a test (C48_S2SW on Cactus) from an April 16th hash, and the netcdf gridded data consists of all zeroes.

jiandewang avatar jiandewang commented on August 31, 2024

@EricSinsky-NOAA I recall Rahul asked me to test it based on HR3 but with a manually modified xml file. That worked fine, and I checked the ocean regular-grid and grib2 files at that time. They were fine.

jiandewang avatar jiandewang commented on August 31, 2024

@EricSinsky-NOAA can you repeat the C48 CI (or whatever test case you have) but add something like a 5-minute sleep before the post job is triggered?

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

> @EricSinsky-NOAA can you repeat the C48 CI (or whatever test case you have) but add something like a 5-minute sleep before the post job is triggered?

I don't think that should be the cause, as my re-runs yesterday were after the full forecast was completed, so there would be no issue of files not being there completely, I would think?

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

Thank you @jiandewang! Sure, I am increasing the sleep time to 5 minutes and am rerunning the C48 CI. However, based on what @JessicaMeixner-NOAA said about her re-runs, this might not be the reason for the zeroes in the gridded data.

jiandewang avatar jiandewang commented on August 31, 2024

> Thank you @jiandewang! Sure, I am increasing the sleep time to 5 minutes and am rerunning the C48 CI. However, based on what @JessicaMeixner-NOAA said about her re-runs, this might not be the reason for the zeroes in the gridded data.

agree, 99% chance this is not the reason

aerorahul avatar aerorahul commented on August 31, 2024

FWIW, I cloned the hash d6be3b5 corresponding to PR #2421.
I set up and ran a C48_S2S test on Hera.

The model output contains reasonable values.
[screenshot: model output]

The interpolation output, however, contains zeros.
[screenshot: interpolated output]

If there is someone willing to re-do this exact test on Orion/WCOSS2, we could narrow the issue down to either the software stack on Hera or the interpolation code.

edit: One does not need to re-run the entire experiment, just clone and build this hash and re-run the ocean post code with the model output from Hera.
Everything needed is in: /scratch1/NCEPDEV/stmp2/Rahul.Mahajan/RUNDIRS/zeros/oceanice_products.3127422
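
A sketch of that offline re-run, pieced together from the steps discussed later in this thread (the clone path is a placeholder):

rsync -a /scratch1/NCEPDEV/stmp2/Rahul.Mahajan/RUNDIRS/zeros/oceanice_products.3127422/ ./rerun/
cd ./rerun
export HOMEgfs=/path/to/global-workflow          # your clone of the hash under test
source "${HOMEgfs}/ush/load_fv3gfs_modules.sh"   # load the runtime modules
./ocnicepost.x                                   # regenerates ocean.0p25.nc etc.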

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@aerorahul I just ran C48_S2SW on WCOSS2 using hash fa855ba from March 18th (prior to the Rocky 8 hash that you tested). The raw model ocean output contains reasonable values, but the interpolated ocean output is all zeroes.

aerorahul avatar aerorahul commented on August 31, 2024

I ran ocnicepost.x on Orion with the output from Hera, and the interpolated output has zeros!

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

Okay, I'm going to run the C48_S2SW CI test with the HR3 tag on WCOSS2; hopefully that works as we expect...

jiandewang avatar jiandewang commented on August 31, 2024

> Okay, I'm going to run the C48_S2SW CI test with the HR3 tag on WCOSS2; hopefully that works as we expect...

I am repeating one of the HR3 runs on WCOSS2 now.

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

Thank you for testing the HR3 tag, @JessicaMeixner-NOAA. I just tested the g-w hash (9608852) from 2/26/2024 on WCOSS2 (C48_S2SW CI test). I am still getting all zeroes in the interpolated ocean output. I also wonder if this same issue would occur for a case initialized at 00Z.

jiandewang avatar jiandewang commented on August 31, 2024

Just had one ocean post done (HR3 tag on WCOSS2); the gridded ocean file looks fine. See on Cactus:
/lfs/h2/emc/ptmp/jiande.wang/HR3-work/RUNDIRS/HR3-20191203/ocean.1p00.nc

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

Thank you @jiandewang. It looks like the case you tested is a 00Z run (2019120300). It will be interesting to see if @JessicaMeixner-NOAA also gets reasonable gridded ocean output for the C48 CI test case (12Z run).

jiandewang avatar jiandewang commented on August 31, 2024

One more clue:
In my just-finished WCOSS2 HR3 run,
/lfs/h2/emc/ptmp/jiande.wang/HR3-work/RUNDIRS/HR3-20191203/oceanice_products.73074
jiande.wang@clogin02:/lfs/h2/emc/ptmp/jiande.wang/HR3-work/RUNDIRS/HR3-20191203/oceanice_products.73074> ls -l tr*nc
-rw-r--r-- 1 jiande.wang emc 443660244 Oct 25 2023 tripole.mx025.Bu.to.Ct.bilinear.nc
-rw-r--r-- 1 jiande.wang emc 322230100 Oct 25 2023 tripole.mx025.Ct.to.rect.0p25.bilinear.nc
-rw-r--r-- 1 jiande.wang emc 344591848 Oct 25 2023 tripole.mx025.Ct.to.rect.0p25.conserve.nc
-rw-r--r-- 1 jiande.wang emc 165958772 Oct 25 2023 tripole.mx025.Ct.to.rect.1p00.bilinear.nc
-rw-r--r-- 1 jiande.wang emc 193551336 Oct 25 2023 tripole.mx025.Ct.to.rect.1p00.conserve.nc
-rw-r--r-- 1 jiande.wang emc 410574804 Oct 25 2023 tripole.mx025.Cu.to.Ct.bilinear.nc
-rw-r--r-- 1 jiande.wang emc 443660244 Oct 25 2023 tripole.mx025.Cv.to.Ct.bilinear.nc

But in Jessica's run yesterday on Hera, it was /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181, which I think Jessica has since deleted. Luckily I made a copy of it yesterday, so on Hera see /scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181

/scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181[119]ll tr*nc
-r--r--r-- 1 Jiande.Wang climate 443660268 Apr 3 14:09 tripole.mx025.Bu.to.Ct.bilinear.nc
-r--r--r-- 1 Jiande.Wang climate 322230132 Apr 3 14:09 tripole.mx025.Ct.to.rect.0p25.bilinear.nc
-r--r--r-- 1 Jiande.Wang climate 344591884 Apr 3 14:09 tripole.mx025.Ct.to.rect.0p25.conserve.nc
-r--r--r-- 1 Jiande.Wang climate 165958804 Apr 3 14:09 tripole.mx025.Ct.to.rect.1p00.bilinear.nc
-r--r--r-- 1 Jiande.Wang climate 193551372 Apr 3 14:09 tripole.mx025.Ct.to.rect.1p00.conserve.nc
-r--r--r-- 1 Jiande.Wang climate 410574828 Apr 3 14:09 tripole.mx025.Cu.to.Ct.bilinear.nc
-r--r--r-- 1 Jiande.Wang climate 443660268 Apr 3 14:10 tripole.mx025.Cv.to.Ct.bilinear.nc

Are they supposed to be the same size?

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

@jiandewang /scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181 is still there? Not sure what was going on.

Sometimes different machines will calculate size a little differently.

jiandewang avatar jiandewang commented on August 31, 2024

@JessicaMeixner-NOAA I scp-ed the fix files to Hera; they are not the same. You can do a cmp between
/scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181/fixed-file-wcoss2
and
/scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181
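
For example (a sketch; run from the Hera copy above):

cd /scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181
for f in tripole.mx025.*.nc; do
  cmp -s "$f" fixed-file-wcoss2/"$f" && echo "SAME: $f" || echo "DIFF: $f"
done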

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

> Thank you @jiandewang. It looks like the case you tested is a 00Z run (2019120300). It will be interesting to see if @JessicaMeixner-NOAA also gets reasonable gridded ocean output for the C48 CI test case (12Z run).

@EricSinsky-NOAA - I did look at the output from the cycled test that sparked this issue, and the 00Z output is all zeros for that run as well.

jiandewang avatar jiandewang commented on August 31, 2024

@EricSinsky-NOAA will you be able to repeat your run on Hera but using the fix files I just copied from WCOSS2?
They are at /scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181/fixed-file-wcoss2

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@jiandewang Sure, I'll rerun using the fix files from /scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181/fixed-file-wcoss2.

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@jiandewang Do you have equivalent fix files for mx500 resolution? The CI test case I have been running has MOM6 at mx500.

jiandewang avatar jiandewang commented on August 31, 2024

@EricSinsky-NOAA No, I don't. HR3 is only for the mx025 ocean.

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@jiandewang After looking more closely at the fix files you are using for your HR3 runs, it looks like you are using an older version of the fix files, from 20231219. Your fix files are identical in size to those found in glopara here: /scratch1/NCEPDEV/global/glopara/fix/mom6/20231219/post/mx025/ (/lfs/h2/emc/global/noscrub/emc.global/FIX/fix/mom6/20231219/post/mx025/ on WCOSS2)

In my test runs (and I believe @JessicaMeixner-NOAA's test runs too), I have been using a newer version of the fix files, from 20240416, which are found in glopara here: /scratch1/NCEPDEV/global/glopara/fix/mom6/20240416/post/mx025/

jiandewang avatar jiandewang commented on August 31, 2024

I compared tripole.mx025.Ct.to.rect.1p00.conserve.nc between the two; it looks like there is a 360-degree offset in longitude between them:
xc_a = -299.718339695101, -299.47037035674, -299.22239891217 <-- HERA
xc_a = 60.2816603048989, 60.5296296432605, 60.7776010878256 <--wcoss2

jiandewang avatar jiandewang commented on August 31, 2024

@EricSinsky-NOAA can we re-run the executable here offline?
/scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181
Which modules do we need to load?
I tried but got this error:
./ocnicepost.x: symbol lookup error: ./ocnicepost.x: undefined symbol: netcdf_mp_nf90_open_

but I do have the netcdf4 and hdf5 modules loaded

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@jiandewang Good find. It looks like there is a 360-degree offset between the 20231219 version and the 20240416 version of these fix files. Both can be found on Hera:

Version used in HR3 (20231219): /scratch1/NCEPDEV/global/glopara/fix/mom6/20231219/post/mx025/tripole.mx025.Ct.to.rect.1p00.conserve.nc

Newer version (20240416): /scratch1/NCEPDEV/global/glopara/fix/mom6/20240416/post/mx025/tripole.mx025.Ct.to.rect.1p00.conserve.nc
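
A quick way to spot-check the offset is to dump the longitude variable from each version (xc_a is the variable named above; ncdump prints the data values at the end of its output):

ncdump -v xc_a /scratch1/NCEPDEV/global/glopara/fix/mom6/20231219/post/mx025/tripole.mx025.Ct.to.rect.1p00.conserve.nc | tail -n 3
ncdump -v xc_a /scratch1/NCEPDEV/global/glopara/fix/mom6/20240416/post/mx025/tripole.mx025.Ct.to.rect.1p00.conserve.nc | tail -n 3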

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

> @EricSinsky-NOAA can we re-run the executable here offline? /scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181 Which modules do we need to load? I tried but got this error: ./ocnicepost.x: symbol lookup error: ./ocnicepost.x: undefined symbol: netcdf_mp_nf90_open_
>
> but I do have the netcdf4 and hdf5 modules loaded

I have run ocnicepost.x offline before, but it has been a couple of months.

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@jiandewang I would start by executing source ush/load_fv3gfs_modules.sh before running ocnicepost.x offline.

jiandewang avatar jiandewang commented on August 31, 2024

@EricSinsky-NOAA what's wrong with what I did below? Why did it add an extra "/" before "ush"?

cd /scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/global-workflow
source ush/load_fv3gfs_modules.sh
Loading modules quietly...
-bash: /ush/detect_machine.sh: No such file or directory
-bash: /ush/module-setup.sh: No such file or directory
-bash: /versions/run.ver: No such file or directory
WARNING: UNKNOWN PLATFORM
No modules loaded

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@jiandewang I am getting the same error when I try to load modules using load_fv3gfs_modules.sh. However, I did a quick test in /lfs/h2/emc/stmp/eric.sinsky/RUNDIRS/gw_ocnbugfix2/oceanice_products.242828 and was able to execute ocnicepost.x offline. These are the modules I have loaded:
[screenshot of module list]

WalterKolczynski-NOAA avatar WalterKolczynski-NOAA commented on August 31, 2024

> @EricSinsky-NOAA what's wrong with what I did below? Why did it add an extra "/" before "ush"?
>
> cd /scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/global-workflow
> source ush/load_fv3gfs_modules.sh
> Loading modules quietly...
> -bash: /ush/detect_machine.sh: No such file or directory
> -bash: /ush/module-setup.sh: No such file or directory
> -bash: /versions/run.ver: No such file or directory
> WARNING: UNKNOWN PLATFORM
> No modules loaded

Do this first:

export HOMEgfs="/scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/global-workflow"

jiandewang avatar jiandewang commented on August 31, 2024

> @jiandewang I am getting the same error when I try to load modules using load_fv3gfs_modules.sh. However, I did a quick test in /lfs/h2/emc/stmp/eric.sinsky/RUNDIRS/gw_ocnbugfix2/oceanice_products.242828 and was able to execute ocnicepost.x offline. These are the modules I have loaded: [screenshot]

@EricSinsky-NOAA can you copy and paste your module list here so that I can copy and paste it?

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

craype-x86-rome
libfabric/1.11.0.0.
craype-network-ofi
envvar/1.0
intel/19.1.3.304
PrgEnv-intel/8.1.0
imagemagick/7.0.8-7
subversion/1.14.0
libjpeg/9c
grib_util/1.2.2
wgrib2/2.0.8_wmo
GrADS/2.2.2
ecflow/5.6.0.11
cdo/1.9.8
udunits/2.2.28
ncview/2.1.7
python/3.8.6
proj/7.1.0
geos/3.8.1
prod_util/2.0.14
w3nco/2.4.1
core/rocoto/1.3.5
hdf5/1.10.6
netcdf/4.7.4

jiandewang avatar jiandewang commented on August 31, 2024

@EricSinsky-NOAA I see you are testing on WCOSS2. Can you repeat your testing on Hera but use the following as a template?
/scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@jiandewang I just ran ocnicepost.x offline on Hera using your template. The interpolated output can be found here: /scratch2/NCEPDEV/ensemble/noscrub/Eric.Sinsky/ocnpost_bugfix/oceanice_products.3448181/ocean.0p25.nc

jiandewang avatar jiandewang commented on August 31, 2024

> @jiandewang I just ran ocnicepost.x offline on Hera using your template. The interpolated output can be found here: /scratch2/NCEPDEV/ensemble/noscrub/Eric.Sinsky/ocnpost_bugfix/oceanice_products.3448181/ocean.0p25.nc

Can you share your module list on Hera?

Also, can you replace the fix files with /scratch2/NCEPDEV/ensemble/noscrub/Eric.Sinsky/ocnpost_bugfix/oceanice_products.3448181/fixed-file-wcoss2 and re-run it?

WalterKolczynski-NOAA avatar WalterKolczynski-NOAA commented on August 31, 2024

> > @jiandewang I just ran ocnicepost.x offline on Hera using your template. The interpolated output can be found here: /scratch2/NCEPDEV/ensemble/noscrub/Eric.Sinsky/ocnpost_bugfix/oceanice_products.3448181/ocean.0p25.nc
>
> Can you share your module list on Hera?
>
> Also, can you replace the fix files with /scratch2/NCEPDEV/ensemble/noscrub/Eric.Sinsky/ocnpost_bugfix/oceanice_products.3448181/fixed-file-wcoss2 and re-run it?

@jiandewang If you export HOMEgfs first (see above), load_fv3gfs_modules.sh should work

jiandewang avatar jiandewang commented on August 31, 2024

> > > @jiandewang I just ran ocnicepost.x offline on Hera using your template. The interpolated output can be found here: /scratch2/NCEPDEV/ensemble/noscrub/Eric.Sinsky/ocnpost_bugfix/oceanice_products.3448181/ocean.0p25.nc
> >
> > Can you share your module list on Hera?
> > Also, can you replace the fix files with /scratch2/NCEPDEV/ensemble/noscrub/Eric.Sinsky/ocnpost_bugfix/oceanice_products.3448181/fixed-file-wcoss2 and re-run it?
>
> @jiandewang If you export HOMEgfs first (see above), load_fv3gfs_modules.sh should work

@WalterKolczynski-NOAA no more module loading errors after I did export HOMEgfs=... Thanks!

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

Thanks @WalterKolczynski-NOAA. Adding HOMEgfs to my environment allowed me to successfully execute load_fv3gfs_modules.sh.

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@jiandewang After replacing the fix files with /scratch2/NCEPDEV/ensemble/noscrub/Eric.Sinsky/ocnpost_bugfix/oceanice_products.3448181/fixed-file-wcoss2 and rerunning, I am still getting all zeroes.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

My test run of C48 on WCOSS2 did not do well: /lfs/h2/emc/couple/noscrub/jessica.meixner/testoceanpost/hr3/test01/COMROOT/c48t01/gfs.20210323/12/products/ocean/grib2/5p00

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

Thank you, @JessicaMeixner-NOAA. It sounds like this might be an issue with the build of ocnicepost.x on WCOSS2 and Hera. @jiandewang When you ran your HR3 test and got reasonable interpolated ocean output, did you rebuild ocnicepost.x (as well as the other executables related to HR3) during your test?

jiandewang avatar jiandewang commented on August 31, 2024

> Thank you, @JessicaMeixner-NOAA. It sounds like this might be an issue with the build of ocnicepost.x on WCOSS2 and Hera. @jiandewang When you ran your HR3 test and got reasonable interpolated ocean output, did you rebuild ocnicepost.x (as well as the other executables related to HR3) during your test?

No, I just used my original *.x from several months ago.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

I did a new build, but I did have an old build too... I'll try the 0.25 case with the new build, and I'll also try using my old build on a C48 case and see what happens.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

Update:

  • With the HR3 g-w tag 6f9afff from Feb 21, 2024 on WCOSS2, using both old and new builds, I get zeros for the C48mx500 test cases, and I get actual non-zero answers for the C768mx025 test case I ran.

Therefore, I think there are likely issues with all of the 5-degree cases, so we should not be using those to check whether things are working or not.

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

This is my understanding of what we know so far:

  • The C768mx025 case with the HR3 tag results in non-zero values in the interpolated ocean output for both the new build and the old build on WCOSS2. This means that building ocnicepost in the present-day WCOSS2 environment should be ok.
  • The C48mx500 case with the HR3 tag results in all-zero values in the interpolated ocean output for both the new build and the old build on WCOSS2.
  • The C384mx025 case using the most recent g-w hash results in all-zero values in the interpolated ocean output (based on Jessica's tests on Hera). This means we still get zero values for mx025 cases from hashes newer than the HR3 tag.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

@EricSinsky-NOAA I'd say that we get zeros with the newest hashes; where the mx025 issues come in between now and the HR3 tag is an open question, I think. Since most of our previous testing was based on mx500, I'm not sure we have a lot of information about the in-between parts. I'm going to run a few tests on WCOSS2 to see if we can narrow down issues there.

aerorahul avatar aerorahul commented on August 31, 2024

Thank you @EricSinsky-NOAA for the summary and @JessicaMeixner-NOAA for the additional information.

A few questions:

  • For the C768mx025 case with the HR3 tag can you drop the date of the fix files (interpolation weights) being used?
  • If we re-run the executable ocnicepost.x by replacing these weights w/ the develop version, do we get non-zero result?

I'd say we need to find a baseline that works first; I think we have that in the C768mx025 case with the HR3 tag. Unfortunately, C48mx500 with the HR3 tag resulted in zeros.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

For the HR3 tag on WCOSS2 the mom6 fix files are:
mom6 -> /lfs/h2/emc/global/noscrub/emc.global/FIX/fix/mom6/20231219

I'm currently trying to test the commit before the fix file change on WCOSS2 with mx025 to see if that works. I did find an experiment on Hera where a case using the old fix files and mx025 still gave me zeros...

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

I ran with mx025 on WCOSS2 for commit hashes 6ca106e and d5366c6 (the one that changed the mom6 fix files), and they both give me non-zero output for the grib2 files...

I can share paths if that's helpful. Has anyone tried anything with mx025 on Orion?

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

So some random thoughts before the weekend:

  • Is it possible that there could be a module mismatch issue causing problems on Hera, where gfs_utils is using spack-stack 1.6.1 but the ocean post job is loading the ufs-weather-model module files, which use 1.5.1? (One way to check is sketched after this list.)
  • Did we ever confirm whether the diffs between WCOSS2 and Hera that @jiandewang saw were because of version numbers, or whether there were actual differences?
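
(A rough way to check which spack-stack each component loads from a g-w clone; the modulefile locations below are my assumption and may differ by version:)

grep -ri "spack-stack" sorc/gfs_utils.fd/modulefiles sorc/ufs_model.fd/modulefiles | grep -i hera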

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

> Did we ever confirm whether the diffs between WCOSS2 and Hera that @jiandewang saw were because of version numbers, or whether there were actual differences?

@JessicaMeixner-NOAA The diffs between WCOSS2 and Hera are because the comparisons were between two different versions of the fix files. The fix files compared on WCOSS2 are the 20231219 version, while the fix files compared on Hera are the 20240416 version. Both fix file versions exist on both WCOSS2 and Hera. When fix files of the same version are compared between WCOSS2 and Hera, the file sizes are identical.

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

@EricSinsky-NOAA thanks for confirming that!

jiandewang avatar jiandewang commented on August 31, 2024

Some further testing results:

(1) The 20231219 version vs. the 20240416 version of the fix files: there is a 360-degree offset in longitude between them. The results generated by them are not identical, but the differences are at roundoff level (~1E-8). So this is not the reason for the zero values in the regular grid file.

(2) In the HR3 run on WCOSS2, which gave us correct results, the ocean master files are on 40 levels. However, in Jessica's Hera run (/scratch1/NCEPDEV/climate/Jessica.Meixner/cycling/iau_06/C384iaucold03/TMP/RUNDIRS/cold03/oceanice_products.3448181) and Eric's run, ocean.nc is on 75 levels because it is set up for DA;
see https://github.com/NOAA-EMC/global-workflow/blob/develop/parm/config/gfs/config.ufs#L454-L459
I used Jessica's run dir as a template but replaced ocean.nc with the one from the HR3 run (40 levels), and it then generated a correct regular grid file.
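
(A quick way to confirm the level count in a given master file; z_l is the usual layer dimension name in MOM6 output, though that name is an assumption here:)

ncdump -h ocean.nc | grep -i 'z_l'   # 40 levels in the HR3 run vs 75 in the DA-configured runs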

jiandewang avatar jiandewang commented on August 31, 2024

More testing results:
It is the missing value that messed up the results. In the HR3 run it is -1E34, while in DA it is set to 0.
After I re-set the missing value to -1E34 in ocean.nc from Jessica's run dir, the interpolated results are correct. I think this missing value is embedded in the fix files, which were generated using output from one of the previous HRx runs, where it is -1E34.
I did my test on WCOSS2; somehow I had trouble running it on Hera due to module loading.

@EricSinsky-NOAA : you may repeat your run but use my modified input file at /scratch1/NCEPDEV/climate/Jiande.Wang/working/scratch/ocean-zero-value/oceanice_products.3448181-JM/NCO2/ocean.nc-JM-75L-E34, or you can simply repeat your C48mx500 run but set https://github.com/NOAA-EMC/global-workflow/blob/develop/parm/config/gfs/config.ufs#L456C9-L456C31 to -1e34

EricSinsky-NOAA avatar EricSinsky-NOAA commented on August 31, 2024

@jiandewang Thank you very much for finding the issue! I just ran the C48_S2SWA_gefs CI test case (MOM6 is set to mx500) using the most recent hash. I have set MOM6_DIAG_MISVAL to -1e34 in parm/config/gefs/config.ufs and this fixed the issue (non-zeroes in the interpolated ocean output).

EDIT: My test was on WCOSS2.
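
For reference, the change amounts to a one-line edit in parm/config/gefs/config.ufs (an illustrative diff; the surrounding case block and exact original value are assumptions based on this thread):

-    MOM6_DIAG_MISVAL="0"
+    MOM6_DIAG_MISVAL="-1e34"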

JessicaMeixner-NOAA avatar JessicaMeixner-NOAA commented on August 31, 2024

The exception value will need to be resolved with @guillaumevernieres and others, as DA might need the missing value to be set to 0.

@jiandewang what module issues did you have on Hera? I was curious on Friday whether module mismatch was a possible cause.

jiandewang avatar jiandewang commented on August 31, 2024

@JessicaMeixner-NOAA I followed Walter's method (the g-w I used is the cycled one you asked me to run). No errors popped up after I did source ush/..., but when I ran ocnicepost.x it crashed when writing the 3D mask file.

jiandewang avatar jiandewang commented on August 31, 2024

A quick and dirty solution: apply this command in the script after the DA ocean files are generated (ocean.nc here stands in for whichever file is being fixed):
ncatted -a missing_value,,m,f,-1E34 ocean.nc
That will make the ocean post happy.
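
And to verify the attribute actually changed (a quick check):

ncdump -h ocean.nc | grep -i missing_value   # every variable should now report -1E34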

DeniseWorthen avatar DeniseWorthen commented on August 31, 2024

Apologies for being late to the party. Am I understanding correctly that the missing value is defined as 0.0 in the history file? A missing value of 0.0 makes no sense to me, since it is also a valid value. How do you distinguish where Temp=0 because it really is 0.0C and where it is 0 because it is a land point?

jiandewang avatar jiandewang commented on August 31, 2024

> Apologies for being late to the party. Am I understanding correctly that the missing value is defined as 0.0 in the history file? A missing value of 0.0 makes no sense to me, since it is also a valid value. How do you distinguish where Temp=0 because it really is 0.0C and where it is 0 because it is a land point?

@DeniseWorthen see https://github.com/NOAA-EMC/global-workflow/blob/develop/parm/config/gfs/config.ufs#L456C9-L456C31

DeniseWorthen avatar DeniseWorthen commented on August 31, 2024

@jiandewang Thanks, but that doesn't really answer my question. How is a missing value of 0.0 being distinguished from a physical value of 0.0?

guillaumevernieres avatar guillaumevernieres commented on August 31, 2024

> @jiandewang Thanks, but that doesn't really answer my question. How is a missing value of 0.0 being distinguished from a physical value of 0.0?

@DeniseWorthen , you just don't construct your mask based on the fill value.

DeniseWorthen avatar DeniseWorthen commented on August 31, 2024

@guillaumevernieres Thanks. So where does your mask come from?

edit: I mean, which file? Are you retrieving it from the model output or are you using something else?

guillaumevernieres avatar guillaumevernieres commented on August 31, 2024

> @guillaumevernieres Thanks. So where does your mask come from?
>
> edit: I mean, which file? Are you retrieving it from the model output or are you using something else?

We use the MOM6 grid generation functionality, but this is overkill for this issue. The mask could simply be constructed using the layer thicknesses (a sketch follows below).
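
For instance, a hedged sketch of a thickness-based mask with NCO, assuming the layer-thickness variable is named h and the vertical dimension z_l (both names are assumptions):

ncap2 -O -s 'wet = (h.total($z_l) > 0)' ocean.nc mask.nc   # wet=1 where any layer has nonzero thickness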
