ajdawson / eofs
EOF analysis in Python
Home Page: http://ajdawson.github.io/eofs/
License: GNU General Public License v3.0
When I run the following example use case
from eofs.xarray import Eof
solver = Eof(data)  # data is the DataArray described above
reconstr = solver.reconstructedField(solver.neofs)
The reconstructedField method call raises an AttributeError, pointing to the following section of the code in /lib/python3.10/site-packages/eofs/standard.py, within reconstructedField:
# Determine how the PCs and EOFs will be selected.
if isinstance(neofs, collections.Iterable):
modes = [m - 1 for m in neofs]
else:
modes = slice(0, neofs)
Error raised: AttributeError: module 'collections' has no attribute 'Iterable'
It looks like collections.Iterable has been deprecated and moved into the collections.abc submodule of abstract base classes (a similar issue was raised here).
Changing the above to
# Determine how the PCs and EOFs will be selected.
if isinstance(neofs, collections.abc.Iterable):
modes = [m - 1 for m in neofs]
else:
modes = slice(0, neofs)
solved the issue for now.
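For reference, a version-agnostic variant of the same patch could look like the sketch below (only a sketch against the excerpt above, not the official fix; neofs is the argument of the surrounding function, as in the original snippet):

try:
    from collections.abc import Iterable
except ImportError:  # fallback for very old Pythons without collections.abc
    from collections import Iterable

# Determine how the PCs and EOFs will be selected.
if isinstance(neofs, Iterable):
    modes = [m - 1 for m in neofs]
else:
    modes = slice(0, neofs)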
I don't mean to be dense, but it is a bit difficult to figure out what EOF is from just the repository. The tagline, the readme, and the webpage all use EOF endlessly, but not once is it spelled out in full. You have to go all the way to the overview in the documentation to find it. This is confusing for new (or prospective) users.
PS, I do know what it means; I'm just saying it's hard to figure out if you don't know already.
A recent PR (#55) made the source Python 3 compatible, and therefore the tests can now run against the source version on Python 3 as well as 2. Remove this caveat from the documentation.
Hi! I am trying to use the standard solver on a netcdf file that has a data variable (reflectance values), a lat variable, a lon variable, and a time dimension. When I put it into the standard solver, I get an error saying all input data is missing even though I have verified there is reflectance data present... I also tried converting to an xarray and using that xarray solver but got the same warning. Could it be because there are too many masked values? Sorry if I am missing something obvious, I've been picking away at it for quite a while now- thank you for any insights you may be able to provide! Code and example .nc file below
import numpy as np
from netCDF4 import Dataset
from eofs.standard import Eof

filename = '/home/williamcoast/Desktop/test_csv/test.nc'
ncin = Dataset(filename, 'r')
color = ncin.variables['data'][:]
lons = ncin.variables['longitude'][:]
lats = ncin.variables['latitude'][:]
ncin.close()

coslat = np.cos(np.deg2rad(lats))
wgts = np.sqrt(coslat)[..., np.newaxis]
solver = Eof(color, weights=wgts)
Hello, I am using version 1.4.0 of the package with Python 3.10.4 (numpy 1.22.3) and am having problems with eofs.standard.reconstructedField(). It gives the following error:
Traceback (most recent call last):
  File "/Users/alderj/code/Python/EOFs/Osman_EOF.py", line 33, in <module>
    reconstruction = solver.reconstructedField([1,2])
  File "/Users/alderj/miniconda3/envs/IDLPython/lib/python3.10/site-packages/eofs/standard.py", line 638, in reconstructedField
    if isinstance(neofs, collections.Iterable):
AttributeError: module 'collections' has no attribute 'Iterable'
A quick Google search indicates collections.Iterable is deprecated. Is there a workaround? I'd really like to be able to add PC1 and PC2 in the original data units.
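One hedged workaround sketch, until the package itself is patched: restore the alias that Python 3.10 removed before calling reconstructedField. This monkeypatch is an assumption on my part, not an official fix:

import collections
import collections.abc

# Python 3.10 removed the collections.Iterable alias; put it back so that
# eofs.standard's isinstance check keeps working.
if not hasattr(collections, 'Iterable'):
    collections.Iterable = collections.abc.Iterable

# reconstruction = solver.reconstructedField([1, 2]) should now run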
Hello,
I am using the eofs package with the xarray option and ran into a problem with the explained variance. I am comparing sea surface temperature from a satellite product (HadISST) and model output from a global circulation model. See the jupyter notebook here
https://github.com/sryan288/Share/blob/master/EOF_HadISST_vs_ORCA.ipynb
The spatial patterns and PCs for the first two modes look very similar; however, the explained variance for the observations (sst array) is about 60% while it is below 40% for the model (orca array). In general, the first few modes combined explain an unusually small percentage of the total variance for the model data.
As a test I saved both fields that I feed into the EOF functions, read them into Matlab and performed an EOF analysis there, where I get an explained variance of around 60% for the first modes in both data sets.
I am fairly new to Python and there could definitely be a mistake in my code but I couldn't find anything and don't understand what is going on.
I would greatly appreciate any kind of help!
Svenja
We could do with testing with and without extra dependencies (Iris / cdms2). Currently doing the full package tests on Python 2.7 means we are restricted in what numpy versions can be tested against due to cdat-lite's hard numpy dependency. A travis configuration variable could be used to test just the standard interface, and also the full interface if available.
I have been getting an SVD error when trying to call the xarray eof solver on a daskarray.
The error is due to the following line in eofs.standard:
nonMissingIndex = np.where(np.logical_not(np.isnan(self._data[0])))[0]
np.where always fails and gives nans for dask arrays (see e.g. https://stackoverflow.com/questions/59957541/what-is-the-dask-equivalent-of-numpy-where)
Possibly related to #115 (although I can no longer see those notebooks).
I solved this by calling .load() before calling the solver, but that loses the advantages of dask.
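A minimal sketch of that .load() workaround, with an illustrative dask-backed DataArray (all names below are made up for the example, not taken from the issue):

import numpy as np
import pandas as pd
import xarray as xr
from eofs.xarray import Eof

# Illustrative anomaly field backed by dask chunks.
anom = xr.DataArray(np.random.randn(24, 10, 20),
                    dims=('time', 'lat', 'lon'),
                    coords={'time': pd.date_range('2000-01-01', periods=24, freq='MS')})
anom = anom.chunk({'time': 6})

solver = Eof(anom.load())      # .load() pulls the data into memory first
eof_patterns = solver.eofs(neofs=2)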
Hi @ajdawson ,
I am a student and a beginner with Python. My teacher gave me a research topic, which is to reconstruct data covered by clouds. When I looked up the literature I found EOF analysis, and an article used this method to reconstruct data blocked by clouds, with very good results. However, I have encountered problems when using your library. I don't know if my understanding is correct, so I am asking you for advice.
Can the projectField(data) function perform the data reconstruction mentioned above? If not, is there another function that implements this?
Thank you. Your reply will be very helpful to me.
Tagging as the new version bump.
Here's a notebook I used in class with your package. It uses OpenDAP data for reproducibility. No weighting, a simple SST gridpoints analysis.
In the cells numbered 70-71 I checked orthogonality with a simple np.corrcoef() call: poor for the eofs, great for the pcs. Any thoughts on why?
np.corrcoef(eofs[0,:,:].ravel(), eofs[1,:,:].ravel())
array([[ 1. , -0.07549177],
[-0.07549177, 1. ]])
np.corrcoef(pcs[:,0], pcs[:,1])
array([[1.00000000e+00, 1.87961874e-08],
[1.87961874e-08, 1.00000000e+00]])
I discovered that this code (line 639 of eofs/xarray.py):
pcs.coords.update({coord.name: (coord.dims, coord)
for coord in self._time_ndcoords})
should instead use the new data's time coordinates, time_ndcoords:
pcs.coords.update({coord.name: (coord.dims, coord)
for coord in time_ndcoords})
With this modification it is possible to find the EOFs of X1 and then project the fields from X2 (which contain a different set of time coordinates) back onto the EOFs of X1 using projectField. The current implementation references the X1 time coordinates (by calling self._time_ndcoords).
(First time creating an issue on Github so sorry if it's not formatted etc. the correct way)
Hi @ajdawson ,
first of all, thanks for a tremendously useful package.
here are a few questions I cannot find in the doc or examples:
since the sign of an EOF/PC is arbitrary (only the product counts), I often have to multiply both by -1 to get something sensible (e.g. global warming shows up as an upward trending PC and warm colors, not the reverse). How do I do this with your package? (I should note that I use xarray, and as much as possible would like to use its built-in plot capabilities).
I was able to retrieve the variance fraction - nice feature. Is there an easy way to do a scree plot, including the error bars on the eigenvalues? A related feature might be northTest, but I am not quite sure what to do with the numbers it returns.
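For what it is worth, here is a minimal sketch of both points, assuming solver is an existing eofs.xarray Eof instance; the number of modes and the variable names are illustrative only:

import matplotlib.pyplot as plt
import numpy as np

# 1. The sign of an EOF/PC pair is arbitrary, so flipping both leaves their
#    product (and any reconstruction) unchanged. xarray arithmetic preserves
#    coordinates, so the built-in .plot() methods still work afterwards.
eof1 = -1 * solver.eofsAsCovariance(neofs=1)
pc1 = -1 * solver.pcs(npcs=1, pcscaling=1)

# 2. A simple scree plot with North et al. (1982) error bars: northTest with
#    vfscaled=True returns typical errors in variance-fraction units, so they
#    can be drawn on the same axis as varianceFraction.
neigs = 10
frac = solver.varianceFraction(neigs=neigs)
errs = solver.northTest(neigs=neigs, vfscaled=True)
plt.errorbar(np.arange(1, neigs + 1), frac, yerr=errs, fmt='o')
plt.xlabel('Mode')
plt.ylabel('Fraction of variance explained')
plt.show()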
Thanks in advance!
Julien
Hello,
Thank you so much for creating this code. I have a question: how do you normalize the input data before applying Multivariate EOFs? In addition, can you disable this option (so we could use our own normalization criterion)? I am looking at the code but cannot find where the normalization process occurs. Many thanks!
Hi everyone,
I'm having trouble calculating the NAO EOF pattern. Using the eofs package example, the resulting covariance map has the opposite sign (it was supposed to have negative covariance at the poles and positive in the mid-latitudes).
Does anyone know what's the problem?
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
from eofs.xarray import Eof
from eofs.examples import example_data_path
filename = example_data_path('hgt_djf.nc')
z_djf = xr.open_dataset(filename)['z']
z_djf = z_djf - z_djf.mean(dim='time')
coslat = np.cos(np.deg2rad(z_djf.coords['latitude'].values)).clip(0., 1.)
wgts = np.sqrt(coslat)[..., np.newaxis]
solver = Eof(z_djf, weights=wgts)
eof1 = solver.eofsAsCovariance(neofs=1)
clevs = np.linspace(-75, 75, 11)
proj = ccrs.Orthographic(central_longitude=-20, central_latitude=60)
ax = plt.axes(projection=proj)
ax.coastlines()
ax.set_global()
eof1[0, 0].plot.contourf(ax=ax, levels=clevs, cmap=plt.cm.RdBu_r,
transform=ccrs.PlateCarree(), add_colorbar=False)
ax.set_title('EOF1 expressed as covariance', fontsize=16)
plt.show()
=============================================================
Result was supposed to be like this:
https://www.cpc.ncep.noaa.gov/products/precip/CWlink/pna/nao_loading.html
Thanks for your time, hope this is not a silly question.
I am confused by this error as this code was working perfectly fine until last week.
biweekly_data is an xarray Dataset, and when I type the following:
coslat = np.cos(np.deg2rad(biweekly_data.coords['rlat'].values))
wgts = np.sqrt(coslat)[..., np.newaxis]
solver = Eof(biweekly_data, weights=wgts)
eof1 = solver.eofs()
I get this error:
TypeError: the input must be an xarray DataArray
So I tried:
coslat = np.cos(np.deg2rad(biweekly_data.coords['rlat'].values))
wgts = np.sqrt(coslat)[..., np.newaxis]
solver = Eof(biweekly_data.snowmelt, weights=wgts)
eof1 = solver.eofs()
And then get this:
TypeError: Using a DataArray object to construct a variable is ambiguous, please extract the data using the .data property
So one more time I tried:
coslat = np.cos(np.deg2rad(biweekly_data.coords['rlat'].values))
wgts = np.sqrt(coslat)[..., np.newaxis]
solver = Eof(biweekly_data.snowmelt.data, weights=wgts)
eof1 = solver.eofs()
And get:
TypeError: the input must be an xarray DataArray
It just baffles me because I've used the code shown in the very top block before to create maps without any problems.
The solver works with eofs.standard when I use solver = Eof(biweekly_data.snowmelt.data, weights=wgts).
I have a dataset with station observations in a CSV file. The output is empty.
cf.head()
| DATE | Lat | Lon | Height | Stid | ci |
|---|---|---|---|---|---|
| 1951-01-01 | 32.56 | 117.22 | 206.0 | 58221 | 0.608196 |
| | 32.05 | 118.45 | 618.0 | 58238 | 0.569477 |
| | 30.20 | 120.14 | 77.0 | 58457 | 0.626431 |
| | 30.37 | 117.02 | 263.0 | 58424 | 0.666847 |
| | 28.40 | 121.30 | 95.0 | 58665 | 0.546766 |
from eofs.xarray import Eof
ds = cf.to_xarray()
solver = Eof(ds["ci"])
eof_ci = solver.eofsAsCorrelation(neofs=1)
here is eof_ci:
<xarray.DataArray 'eofs' (mode: 0, Lat: 154, Lon: 166)>
array([], shape=(0, 154, 166), dtype=float64)
Coordinates:
- mode (mode) float64
- Lat (Lat) float64 2.85 2.86 2.97 3.02 3.03 ... 34.27 34.29 34.5 34.51
- Lon (Lon) float64 11.75 11.81 11.91 11.98 ... 121.6 122.1 122.1 122.3
Attributes:
long_name: correlation_between_pcs_and_ci
Using this tool means no longer having to update the documentation manually for each release.
Hi,
I have used the eofs reconstructedField() function to reconstruct the matrix decomposed by Eof, but I found that even when neofs is set to the maximum, I still cannot recover the original dataset. I checked through the code and found that the centering value (the time mean) is not added back to the reconstructed dataset. Could you help me confirm this?
# This is the code that I used to do the simple test
import numpy as np
import matplotlib.pyplot as plt
from eofs.standard import Eof
a = np.random.randint(-10, 10, (10, 20))
solver = Eof(a)
reconstructed_data = solver.reconstructedField(solver.neofs)

# Reconstruction vs. original: the points do not fall on the 1:1 line
plt.scatter(a.reshape(-1), reconstructed_data.reshape(-1))
plt.grid()
plt.show()

# Adding the time mean back recovers the original data
plt.scatter(a.reshape(-1), (reconstructed_data + a.mean(axis=0)).reshape(-1))
plt.grid()
plt.show()
To reproduce the methodology CPC uses to calculate the NAO, e.g. http://www.cpc.ncep.noaa.gov/data/teledoc/telepatcalc.shtml
I see they first calculate the 10 leading EOFs for each month (3-month average), which this code can do as it stands. They then apply a 'Varimax rotation' (I need to read their paper) to obtain the 10 rotated EOFs for each month (3-month average).
From a user request: add support for extended EOF analysis.
Hi,
I admit that the following issue is a bit pathological, and you may decide it is not worth doing anything about, but I thought I would flag it anyway.
I found myself in the position of wanting to apply PCA to the product space of the principal components of two separate fields:
import xarray as xr
from eofs.xarray import Eof
Z500_pcs=xr.open_dataarray('DJF_Z500_PCs.nc')
MSLP_pcs=xr.open_dataarray('DJF_MSLP_PCs.nc')
combined_pcs=xr.concat([Z500_pcs,MSLP_pcs],'mode')
solver=Eof(combined_pcs)
print(solver.eofs().shape)
print(solver.eofs()[0].shape)
print(solver.eofs()[0][0][0][0][0].shape)
(13, 13)
(13, 13)
(13, 13)
solver.eofs(neofs=3)
ValueError Traceback (most recent call last)
/tmp/ipykernel_9504/46481902.py in <module>
----> 1 solver.eofs(neofs=3)
~/miniconda3/lib/python3.9/site-packages/eofs/xarray.py in eofs(self, eofscaling, neofs)
227 eofs = xr.DataArray(eofs, coords=coords, name='eofs',
228 attrs={'long_name': long_name})
--> 229 eofs.coords.update({coord.name: (coord.dims, coord)
230 for coord in self._space_ndcoords})
231 return eofs
~/miniconda3/lib/python3.9/site-packages/xarray/core/coordinates.py in update(self, other)
164 [self.variables, other_vars], priority_arg=1, indexes=self.xindexes
165 )
--> 166 self._update_coords(coords, indexes)
167
168 def _merge_raw(self, other, reflexive):
~/miniconda3/lib/python3.9/site-packages/xarray/core/coordinates.py in _update_coords(self, coords, indexes)
340 coords_plus_data = coords.copy()
341 coords_plus_data[_THIS_ARRAY] = self._data.variable
--> 342 dims = calculate_dimensions(coords_plus_data)
343 if not set(dims) <= set(self.dims):
344 raise ValueError(
~/miniconda3/lib/python3.9/site-packages/xarray/core/dataset.py in calculate_dimensions(variables)
203 last_used[dim] = k
204 elif dims[dim] != size:
--> 205 raise ValueError(
206 f"conflicting sizes for dimension {dim!r}: "
207 f"length {size} on {k!r} and length {dims[dim]} on {last_used!r}"
ValueError: conflicting sizes for dimension 'mode': length 3 on <this-array> and length 13 on {'mode': 'mode'}
All these errors vanish when I add the following line:
combined_pcs=combined_pcs.rename({'mode':'original_mode'})
Maybe a warning, or an automatic renaming when an array with a coordinate named 'mode' is passed to Eof, would be appropriate?
I have an Iris cube with a time dimension coordinate whose coord.name() is 't'. When trying to create a solver it crashes because the code assumes time dimensions must be called 'time'. This is true for univariate and multivariate solvers.
I can work around this by setting coord.standard_name = 'time' on all my cubes before creating EOFs, but the eofs package itself could allow for any name by doing something like time_name = cube.coord(axis='T').name().
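A minimal sketch of that workaround, assuming cube is an iris Cube whose time coordinate is currently named 't':

from eofs.iris import Eof

cube.coord('t').rename('time')   # or: cube.coord('t').standard_name = 'time'
solver = Eof(cube)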
Hi @ajdawson,
I have been using eofs extensively for my ongoing research over the past two years. Thanks for your efforts in developing and maintaining this awesome package. I am not sure whether this issue board is the proper place for asking questions; please excuse me if not, but I really do need your help with the following questions.
I understand how eofsAsCovariance and eofsAsCorrelation differ, but I couldn't fully understand what eofs is for. Could you please give me some detail about eofs and how it differs from the above two?
Some interfaces (e.g., eofsAsCovariance, pcs) have a pcscaling option while some other interfaces (e.g., eofs, projectField) have an eofscaling option. To me it seems that with pcscaling=1 the solver works with normalized PC time series that have unit variance (please correct me if I am wrong), but I don't fully understand what the eofscaling option is for. Could you give some further detail on how the eofscaling option works?
According to the manual, "We could also project another field onto the EOFs to produce a set of pseudo-PCs: pseudo_pcs = solver.projectField(other_field)".
I am using this call as below:
pseudo_pcs = solver.projectField(field_to_be_projected, neofs=1, eofscaling=0)
The eofscaling=0 default says the field is being projected onto "un-scaled EOFs". But to me it seems as if the EOF pattern of unit variance (a map whose spatial deviation is 1) is being projected onto the given field (that is why I asked question 2). I see that projectField uses flatE, which comes from E; I suspect this E should be identical to the map of the EOF pattern that has unit variance. Could you please confirm or correct this for me?
Thank you for your attention, and sorry for the gaps in my understanding. Your comments would be tremendously helpful. Thank you in advance.
Build conda packages and host them on binstar. Add documentation so users know about this option.
Hello,
Thanks for your effort on this package; I am using it to develop a climate model evaluation tool. I can get the fraction of variance using solver.varianceFraction(), but I am wondering whether I can get the fraction from a projected field.
I've used the line below to project an arbitrary field onto the solver and obtained the PCs of the projected field:
pcs_of_projected_field = solver.projectField(field_to_be_projected, neofs=eofn, eofscaling=1)
But I couldn't find a way to get its fraction of variance. Does eofs have a function to do this efficiently?
Thanks for your attention.
I obtained temperature data from 90 weather stations. Each station has data for 3,481 times. I currently have a two-dimensional (space and time) matrix of temperature data (3481 x 90). When I run the module, the solver returns a 1 x 90 EOF. How can I plot this EOF using contourf?
Should I create an empty 90 x 90 matrix and fill the diagonal with the values of the EOF? And should the lat and lon values also be 90 x 90 matrices?
fill = ax.contourf(lons, lats, np.fill_diagonal(np.zeros((90,90)), eof1.squeeze()), clevs, cmap=plt.cm.RdBu_r, latlon=True)
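As an alternative sketch (my suggestion, not something from the package): with only 90 scattered stations there is no regular 2-D grid for contourf to work on, so a scatter plot of the 1 x 90 EOF at the station locations may be simpler; lons, lats and eof1 are the arrays described above:

import matplotlib.pyplot as plt

plt.scatter(lons, lats, c=eof1.squeeze(), cmap=plt.cm.RdBu_r)
plt.colorbar(label='EOF 1')
plt.show()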
I am running the NAO example with xarray. Extracting the 1st EOF repeatedly gives different results (only when using weights). It looks like the weights are applied repeatedly.
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
from eofs.xarray import Eof
from eofs.examples import example_data_path
filename = example_data_path('hgt_djf.nc')
z_djf = xr.open_dataset(filename)['z']
z_djf = z_djf - z_djf.mean(dim='time')
coslat = np.cos(np.deg2rad(z_djf.coords['latitude'].values)).clip(0., 1.)
wgts = np.sqrt(coslat)[..., np.newaxis]
solver = Eof(z_djf, weights=wgts)
eof1 = solver.eofsAsCovariance(neofs=1)
clevs = np.linspace(-75, 75, 11)
proj = ccrs.Orthographic(central_longitude=-20, central_latitude=60)
ax = plt.axes(projection=proj)
ax.coastlines()
ax.set_global()
eof1[0, 0].plot.contourf(ax=ax, levels=clevs, cmap=plt.cm.RdBu_r,
transform=ccrs.PlateCarree(), add_colorbar=False)
ax.set_title('EOF1 expressed as covariance', fontsize=16)
plt.show()
# ========================================================================
# Extract 1st EOF again and redo the same plot
# ========================================================================
eof1b = solver.eofsAsCovariance(neofs=1)
clevs = np.linspace(-75, 75, 11)
proj = ccrs.Orthographic(central_longitude=-20, central_latitude=60)
ax = plt.axes(projection=proj)
ax.coastlines()
ax.set_global()
eof1b[0, 0].plot.contourf(ax=ax, levels=clevs, cmap=plt.cm.RdBu_r,
transform=ccrs.PlateCarree(), add_colorbar=False)
ax.set_title('EOF1 expressed as covariance', fontsize=16)
plt.show()
print(eof1-eof1b)
Result of difference between eof1 and eof1b:
<xarray.DataArray 'eofs' (mode: 1, pressure: 1, latitude: 29, longitude: 49)>
array([[[[ 1.09211228e-01, 9.13365940e-02, 7.00722281e-02, ...,
1.05908773e-01, 1.25983739e-01, 1.41891558e-01],
[ 2.34846658e-01, 2.13107734e-01, 1.85683208e-01, ...,
5.54756315e-02, 9.04768289e-02, 1.16930300e-01],
[ 4.52193967e-01, 4.28573805e-01, 3.97803594e-01, ...,
-9.20446292e-02, -4.41052427e-02, -7.67394121e-03],
...,
[-5.45899433e+01, -5.52156445e+01, -5.58994769e+01, ...,
-6.28602958e+01, -6.25882936e+01, -6.22965706e+01],
[-8.68933483e+01, -8.72562317e+01, -8.76033326e+01, ...,
-9.52351039e+01, -9.51051557e+01, -9.50488148e+01],
[ nan, nan, nan, ...,
nan, nan, nan]]]])
Coordinates:
* mode (mode) int64 0
* pressure (pressure) float32 500.0
* latitude (latitude) float32 20.0 22.5 25.0 27.5 ... 82.5 85.0 87.5 90.0
* longitude (longitude) float32 -80.0 -77.5 -75.0 -72.5 ... 35.0 37.5 40.0
Hi,
I have an xarray Dataset :
>>> inFile
<xarray.Dataset>
Dimensions: (time_counter: 6000, x: 182, y: 149)
Coordinates:
* time_counter (time_counter) float64 3.02e+07 ... 1.892e+11
Dimensions without coordinates: x, y
Data variables:
tos_yearmean (time_counter, y, x) float32 ...
I compute the EOFs on tos_yearmean and I get :
>>> solver.eofs()
<xarray.DataArray 'eofs' (mode: 6000, y: 149, x: 182)>
array([[[nan, nan, ..., nan, nan],
[nan, nan, ..., nan, nan]]], dtype=float32)
Coordinates:
* mode (mode) int64 0 1 2 3 4 5 6 7 ... 5993 5994 5995 5996 5997 5998 5999
* y (y) int64 0 1 2 3 4 5 6 7 8 ... 140 141 142 143 144 145 146 147 148
* x (x) int64 0 1 2 3 4 5 6 7 8 ... 173 174 175 176 177 178 179 180 181
Attributes:
long_name: empirical_orthogonal_functions
Dimensions x and y have been transformed into coordinates, with the index starting at 0. This is spurious information. When I use the input file in Ferret, Ferret assumes that x and y start at 1. When I write solver.eofs to a file and use it in Ferret, the indexing starts at 0, and there is a mismatch between the two files that makes Ferret fail on some operations.
Can I prevent eofs from transforming dimensions into coordinates?
Thanks,
Olivier
When trying to run your example code in the "standard" directory, I got a runtime error and no exception was caught. I debugged it with PyCharm and found that it happened on this line:
A, Lh, E = np.linalg.svd(dataNoMissing, full_matrices=False)
I wonder whether there is something wrong with the input data dataNoMissing, or whether it is caused by the Windows version of numpy?
OS: Windows 8.1 with Update1
Numpy version: 1.8.0
I would like to compute an EOF on an ensemble (an xarray object with a member dimension) without getting a different set of EOFs for every member. I would like, for example, to stack the dimensions time and member and then compute the EOF on that combined dimension. Is this possible, or would it be better to stick to the plain sklearn functions?
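An untested sketch of that stacking idea using the standard (numpy) interface, which only requires samples along the first axis; the array and its dimension names below are illustrative, not from the question:

import numpy as np
import xarray as xr
from eofs.standard import Eof

# Illustrative ensemble anomaly field: 12 members x 100 times on a 20 x 30 grid.
da = xr.DataArray(np.random.randn(12, 100, 20, 30),
                  dims=('member', 'time', 'lat', 'lon'))

# Stack member and time into a single sample axis; the standard solver does not
# care about dimension names, only that samples come first.
stacked = da.stack(sample=('member', 'time')).transpose('sample', 'lat', 'lon')
solver = Eof(stacked.values)
eof_patterns = solver.eofs(neofs=3)   # shape (3, 20, 30)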
Thanks for providing this amazing package. It is absolutely one of the best and most useful python packages I know of!
Currently eofs only works with numpy arrays. However, its core computational algorithm, the SVD, is also implemented for dask arrays: http://dask.pydata.org/en/latest/array-api.html
This means that it would theoretically be possible for eofs to leverage dask to do out-of-core EOFs with minimal refactoring.
Is this on your roadmap? Would be keen to help if you’re interested.
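For context, a minimal sketch of the dask SVD mentioned above (plain dask, not an eofs feature), assuming a 2-D (time, space) array with no missing values and chunks along a single dimension as dask's svd expects:

import dask
import dask.array as da

field = da.random.random((500, 20000), chunks=(500, 5000))
u, s, v = da.linalg.svd(field)     # lazily builds the SVD task graph
u, s, v = dask.compute(u, s, v)    # the out-of-core computation happens here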
The Travis CI deploy mechanism can be used to automatically build and upload distributions built from tags.
Hi Andrew,
My issue is that I wasn't able to install eofs via conda, due to some "conflicts":
$ conda install -c https://conda.anaconda.org/ajdawson eofs
Using Anaconda Cloud api site https://api.anaconda.org
Fetching package metadata: ......
Solving package specifications: ....
The following specifications were found to be in conflict:
- bottleneck (target=bottleneck-1.0.0-np110py27_0.tar.bz2) -> numpy 1.10*|1.11*|1.9*
- bottleneck (target=bottleneck-1.0.0-np110py27_0.tar.bz2) -> python 3.4*|3.5*
- eofs
- fontconfig (target=fontconfig-2.11.1-5.tar.bz2) -> freetype 2.5*
- fontconfig (target=fontconfig-2.11.1-5.tar.bz2) -> libpng 1.6.17
Use "conda info <package>" to see the dependencies for each package.
Could you please give me a hint? For instance, what happens if I uninstall bottleneck; might that help?
Please find below my "conda info" and "conda list" outputs.
Thanks in advance
Carlos
$ conda info
Using Anaconda Cloud api site https://api.anaconda.org
Current conda install:
platform : linux-32
conda version : 4.0.5
conda-build version : 1.20.0
python version : 2.7.11.final.0
requests version : 2.9.1
root environment : /home/carlos/anaconda2 (writable)
default environment : /home/carlos/anaconda2
envs directories : /home/carlos/anaconda2/envs
package cache : /home/carlos/anaconda2/pkgs
channel URLs : https://repo.continuum.io/pkgs/free/linux-32/
https://repo.continuum.io/pkgs/free/noarch/
https://repo.continuum.io/pkgs/pro/linux-32/
https://repo.continuum.io/pkgs/pro/noarch/
config file : None
is foreign system : False
$ conda list
# packages in environment at /home/carlos/anaconda2:
#
abstract-rendering 0.5.1 np110py27_0 defaults
alabaster 0.7.7 py27_0 defaults
anaconda-client 1.4.0 py27_0 defaults
argcomplete 1.0.0 py27_1 defaults
astropy 1.1.2 np110py27_0 defaults
babel 2.3.3 py27_0 defaults
backports 1.0 py27_0 defaults
backports-abc 0.4 <pip>
backports.ssl-match-hostname 3.4.0.2 <pip>
backports_abc 0.4 py27_0 defaults
basemap 1.0.7 np110py27_0 anaconda
beautifulsoup4 4.4.1 py27_0 defaults
bitarray 0.8.1 py27_0 defaults
blaze 0.9.0 <pip>
blaze-core 0.9.0 py27_0 defaults
bokeh 0.11.1 py27_0 defaults
boto 2.39.0 py27_0 defaults
bottleneck 1.0.0 np110py27_0 defaults
cairo 1.12.18 6 defaults
cdecimal 2.3 py27_0 defaults
cffi 1.5.2 py27_1 defaults
clyent 1.2.2 py27_0 defaults
colorama 0.3.7 py27_0 defaults
conda 4.0.5 py27_0 defaults
conda-build 1.20.0 py27_0 defaults
conda-env 2.4.5 py27_0 defaults
configobj 5.0.6 py27_0 defaults
configparser 3.5.0b2 py27_1 defaults
cryptography 1.3.1 py27_0 defaults
curl 7.45.0 0 defaults
cycler 0.10.0 py27_0 defaults
cython 0.24 py27_0 defaults
cytoolz 0.7.5 py27_0 defaults
datashape 0.5.1 py27_0 defaults
decorator 4.0.9 py27_0 defaults
docutils 0.12 py27_0 defaults
entrypoints 0.2 py27_1 defaults
enum34 1.1.3 py27_0 defaults
et-xmlfile 1.0.1 <pip>
et_xmlfile 1.0.1 py27_0 defaults
fastcache 1.0.2 py27_0 defaults
flask 0.10.1 py27_1 defaults
fontconfig 2.11.1 5 defaults
freetype 2.5.5 0 defaults
funcsigs 1.0.0 py27_0 defaults
futures 3.0.5 py27_0 defaults
geos 3.3.3 0 anaconda
gevent 1.1.0 py27_0 defaults
gevent-websocket 0.9.5 py27_1 defaults
greenlet 0.4.9 py27_0 defaults
grin 1.2.1 py27_1 defaults
h5py 2.6.0 np110py27_1 defaults
hdf5 1.8.16 0 defaults
idna 2.1 py27_0 defaults
imagesize 0.7.0 py27_0 defaults
ipaddress 1.0.14 py27_0 defaults
ipykernel 4.3.1 py27_0 defaults
ipython 4.1.2 py27_1 defaults
ipython-genutils 0.1.0 <pip>
ipython-notebook 4.0.4 py27_0 defaults
ipython-qtconsole 4.0.1 py27_0 defaults
ipython_genutils 0.1.0 py27_0 defaults
ipywidgets 4.1.1 py27_0 defaults
itsdangerous 0.24 py27_0 defaults
jasper 1.900.1 3 IOOS
jbig 2.1 0 defaults
jdcal 1.2 py27_0 defaults
jedi 0.9.0 py27_0 defaults
jinja2 2.8 py27_0 defaults
jpeg 8d 0 defaults
jsonschema 2.4.0 py27_0 defaults
jupyter 1.0.0 py27_2 defaults
jupyter-client 4.2.2 <pip>
jupyter-console 4.1.1 <pip>
jupyter-core 4.1.0 <pip>
jupyter_client 4.2.2 py27_0 defaults
jupyter_console 4.1.1 py27_0 defaults
jupyter_core 4.1.0 py27_0 defaults
libffi 3.2.1 0 defaults
libgfortran 3.0 0 defaults
libpng 1.6.17 0 defaults
libsodium 1.0.3 0 defaults
libtiff 4.0.6 1 defaults
libxml2 2.9.2 0 defaults
libxslt 1.1.28 0 defaults
llvmlite 0.10.0 py27_0 defaults
lxml 3.6.0 py27_0 defaults
markupsafe 0.23 py27_0 defaults
matplotlib 1.5.1 np110py27_0 defaults
mistune 0.7.2 py27_0 defaults
mkl 11.3.1 0 defaults
mpmath 0.19 py27_0 defaults
multipledispatch 0.4.8 py27_0 defaults
nbconvert 4.2.0 py27_0 defaults
nbformat 4.0.1 py27_0 defaults
networkx 1.11 py27_0 defaults
nltk 3.2.1 py27_0 defaults
nose 1.3.7 py27_0 defaults
notebook 4.2.0 py27_0 defaults
numba 0.25.0 np110py27_0 defaults
numexpr 2.5.2 np110py27_0 defaults
numpy 1.10.4 py27_1 defaults
odo 0.4.2 py27_0 defaults
openblas 0.2.14 4 defaults
openpyxl 2.3.2 py27_0 defaults
openssl 1.0.2g 0 defaults
pandas 0.18.0 np110py27_0 defaults
patchelf 0.8 0 defaults
path.py 8.2 py27_0 defaults
patsy 0.4.1 py27_0 defaults
pep8 1.7.0 py27_0 defaults
pexpect 4.0.1 py27_0 defaults
pickleshare 0.5 py27_0 defaults
pillow 3.2.0 py27_0 defaults
pip 8.1.1 py27_1 defaults
pixman 0.32.6 0 defaults
ply 3.8 py27_0 defaults
psutil 4.1.0 py27_0 defaults
ptyprocess 0.5 py27_0 defaults
py 1.4.31 py27_0 defaults
py2cairo 1.10.0 py27_2 defaults
pyasn1 0.1.9 py27_0 defaults
pycairo 1.10.0 py27_0 defaults
pycosat 0.6.1 py27_0 defaults
pycparser 2.14 py27_0 defaults
pycrypto 2.6.1 py27_0 defaults
pycurl 7.19.5.3 py27_0 defaults
pyflakes 1.1.0 py27_0 defaults
pygments 2.1.3 py27_0 defaults
pygrib 2.0.0 <pip>
pyopenssl 0.15.1 py27_2 defaults
pyparsing 2.1.1 py27_0 defaults
pyqt 4.11.4 py27_1 defaults
pytables 3.2.2 np110py27_3 defaults
pytest 2.9.1 py27_0 defaults
python 2.7.11 0 defaults
python-dateutil 2.5.2 py27_0 defaults
pytz 2016.3 py27_0 defaults
pyyaml 3.11 py27_1 defaults
pyzmq 15.2.0 py27_0 defaults
qt 4.8.7 0 defaults
qtconsole 4.2.1 py27_0 defaults
readline 6.2 2 defaults
requests 2.9.1 py27_0 defaults
rope 0.9.4 py27_1 defaults
scikit-image 0.12.3 np110py27_0 defaults
scikit-learn 0.17.1 np110py27_0 defaults
scipy 0.17.0 np110py27_2 defaults
setuptools 20.7.0 py27_0 defaults
simplegeneric 0.8.1 py27_0 defaults
singledispatch 3.4.0.3 py27_0 defaults
sip 4.16.9 py27_0 defaults
six 1.10.0 py27_0 defaults
snowballstemmer 1.2.1 py27_0 defaults
sockjs-tornado 1.0.1 py27_0 defaults
sphinx 1.4.1 py27_0 defaults
sphinx-rtd-theme 0.1.9 <pip>
sphinx_rtd_theme 0.1.9 py27_0 defaults
spyder 2.3.8 py27_1 defaults
spyder-app 2.3.8 py27_0 defaults
sqlalchemy 1.0.12 py27_0 defaults
sqlite 3.9.2 0 defaults
ssl_match_hostname 3.4.0.2 py27_1 defaults
statsmodels 0.6.1 np110py27_0 defaults
sympy 1.0 py27_0 defaults
tables 3.2.2 <pip>
terminado 0.5 py27_1 defaults
theano 0.7.0 np110py27_0 defaults
tk 8.5.18 0 defaults
toolz 0.7.4 py27_0 defaults
tornado 4.3 py27_0 defaults
traitlets 4.2.1 py27_0 defaults
ujson 1.35 py27_0 defaults
unicodecsv 0.14.1 py27_0 defaults
util-linux 2.21 0 defaults
werkzeug 0.11.8 py27_0 defaults
wheel 0.29.0 py27_0 defaults
xlrd 0.9.4 py27_0 defaults
xlsxwriter 0.8.4 py27_0 defaults
xlwt 1.0.0 py27_0 defaults
xz 5.0.5 1 defaults
yaml 0.1.6 0 defaults
zeromq 4.1.3 0 defaults
zlib 1.2.8 0 defaults
Will eofs be included in the next release of UV-CDAT?
Hi there,
I've noticed that sometimes passing the weights='coslat' or weights='area' option to Eof reverses the sign (positive to negative or vice versa), which looks like a bug. (It is strange that this happens not always but only for some specific cases; I have no idea why.) According to the interface description the weighting is a square root, so it should not have any effect on the sign, but turning the weighting off brings back the original sign. Do you have any idea? I have some figures I can show if you want.
I am testing with the UV-CDAT-bundled version, and have had no chance to test the conda version yet.
Hi @ajdawson ,
I am reporting a potential issue with recent versions of MKL.
My code uses eofs, and it used to work well in my environment, but one day it stopped working when I upgraded the environment.
The error was coming from standard.py, returning a runtime overflow error or a DLASCL parameter error (like this) from self._L = Lh * Lh / normfactor.
I found that my issue could be solved by downgrading my mkl version from 2018 to 2017 as below:
>> conda install mkl=2017.0.3
Fetching package metadata ...........
Solving package specifications: .
Package plan for installation in environment /export/lee1043/anaconda2/envs/pmp_nightly:
The following packages will be DOWNGRADED:
mkl: 2018.0.0-hb491cac_4 --> 2017.0.3-0
numpy: 1.13.1-py27hd1b6e02_2 --> 1.13.1-py27_0
I am not sure whether this would be an issue for others as well, but it would be great if this could be checked. Sharing this for your interest.
Thanks.
A few days ago xarray released version 0.19.0, which comes with some deprecations (pydata/xarray#5630) that seem to affect data with non-dimension coordinates only.
Sample code
import xarray as xr
from eofs.xarray import Eof
# Load example data from xarray
data = xr.tutorial.open_dataset("air_temperature").air
# Compute anomaly
anom = data.groupby("time.month") - data.groupby("time.month").mean()
# Create the Eof solver with a subset of the data
solver = Eof(anom.sel(time=slice("2013-01", "2013-12")))
# Project all the data
solver.projectField(anom, neofs=2)
This is the error raised
Traceback (most recent call last):
File "/data/users/service/index/test.py", line 15, in <module>
solver.projectField(anom, neofs=2)
File "/home/service/miniconda3/envs/pangeo/lib/python3.9/site-packages/eofs/xarray.py", line 639, in projectField
pcs.coords.update({coord.name: (coord.dims, coord)
File "/home/service/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/coordinates.py", line 163, in update
coords, indexes = merge_coords(
File "/home/service/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/merge.py", line 472, in merge_coords
collected = collect_variables_and_indexes(aligned)
File "/home/service/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/merge.py", line 294, in collect_variables_and_indexes
variable = as_variable(variable, name=name)
File "/home/service/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/variable.py", line 121, in as_variable
raise TypeError(
TypeError: Using a DataArray object to construct a variable is ambiguous, please extract the data using the .data property.
Changing the last line to
solver.projectField(anom.drop("month"), neofs=2)
fixes the issue; however, a non-dimension coordinate is lost, in this case the 'month' coordinate that comes from the xarray groupby operation.
Testing the same sample code with the previous xarray version (0.18.2) yields the expected result:
<xarray.DataArray 'pseudo_pcs' (time: 2920, mode: 2)>
array([[ 50.44886 , -78.26509 ],
[ 21.369547, -98.04355 ],
[ 8.925724, -110.18372 ],
...,
[ -47.0296 , -151.02394 ],
[ -45.16002 , -128.5353 ],
[ -27.55614 , -93.23076 ]], dtype=float32)
Coordinates:
* time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00
* mode (mode) int64 0 1
month (time) int64 1 1 1 1 1 1 1 1 1 1 ... 12 12 12 12 12 12 12 12 12 12
Attributes:
long_name: air_pseudo_pcs
Following the suggestion in the error message, changing these lines (lines 638 to 640 in 603ed8e) to:
# Add non-dimension coordinates.
pcs.coords.update({coord.name: (coord.dims, coord.data)
for coord in time_ndcoords})
solves the issue with no apparent breaking change. I can send a simple PR if that seems OK.
Hi, we've had a user feature request to our project cf-python for the ability to conduct EOF and rotated EOF analysis with our intrinsic data object, the cf-python field, or more precisely the 'field construct' of the CF data model (see also the cfdm library).
Bryan Lawrence recommended your library as a potential solution for this, notably since it appears to be open to interfacing in a way that allows management of the underlying metadata, as demonstrated by the Iris interface module. cf-python uses numpy arrays under the hood, but our philosophy and one of our core USPs is to enable data analysis that preserves CF-compliant metadata, so the standard numpy solver interface is not appropriate.
Therefore we were wondering if you'd be happy to include a module for a cf-python solver interface? If so, we'd write it (I've volunteered, so I would write most, if not all, of it) and then put it up as a pull request for your review. If that sounds agreeable, please let us know any requirements or advice you might have to help us develop it so it fits in as you would like; otherwise we can use the iris module as a guide. Thanks.
Hey, first off thanks for developing the eofs package; it has helped me out a lot with performing univariate EOFs.
This may very well be my lack of theoretical understanding, so please forgive me if it is. When performing MEOFs I wanted to see how the explained variance was distributed among the correspondingly reconstructed fields, and it seems off to me.
I tried comparing with both anomaly fields and with data standardised before plugging it into the solver.
I compared the outputs of the two final commands below, done specifically on standardised data so that the SVD's variance fraction and the reconstructed data's variance should be equal.
import numpy
from eofs.multivariate.standard import MultivariateEof

# mean = 0, std = 1 for each data array
m_solver = MultivariateEof(list_data_arrays, weights=list_wgts)

# These numbers don't agree for standardised data
var_fraction = numpy.sum(m_solver.varianceFraction(neigs=n))

reconstructed_var = 0
for i in range(0, N_vars):
    reconstructed_var += numpy.nanvar(m_solver.reconstructedField(n)[i])
reconstructed_var /= N_vars

# Should be around 0
var_fraction - reconstructed_var
I have three datasets with which I have done this, comparing them in every combination of the three.
If this is the wrong output, could it be the way the MEOF is computed? The few papers I have found on MEOFs stack the datasets vertically, along the time axis, which is the opposite of what the MultivariateEof object does as far as I can tell.
Regards,
Boooke
e.g. Mode 1 represents 45%, mode 2 represents 13% and so on.
Thanks!
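A minimal sketch of printing the per-mode fractions, assuming solver is an existing Eof instance from any of the interfaces:

fracs = solver.varianceFraction(neigs=5)
for mode, frac in enumerate(fracs, start=1):
    print(f"Mode {mode} explains {float(frac) * 100:.1f}% of the total variance")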
This could be improved further. The main issue is the definition of "complete", which depends on what is available (xarray everywhere, iris on 2.7 and 3.4, cdms2 on 2.7 only).
Re-organise the deployed documentation (gh-pages branch) so that documentation for older versions can be preserved.
Getting some errors and failures with numpy 1.10 on OS X. Most appear to relate to getting MaskedArrays when expecting plain arrays and vice versa.
Hi, I've just found the following error after trying your iris example using my own cube. I'm using numpy version '1.10.1'
AttributeError Traceback (most recent call last)
/home/scott/Copy/WORK/WIP/ASL_SOM_vs_EOF/EOF/asl_eof.py in ()
18 # PC time series and the input SST anomalies at each grid point, and the
19 # leading PC time series itself.
---> 20 eof1 = solver.eofsAsCorrelation(neofs=1)
21 pc1 = solver.pcs(npcs=1, pcscaling=1)
22
/home/scott/PYTHON/eofs/iris.pyc in eofsAsCorrelation(self, neofs)
288
289 """
--> 290 eofs = self._solver.eofsAsCorrelation(neofs)
291 eofdim = DimCoord(range(eofs.shape[0]),
292 var_name='eof',
/home/scott/PYTHON/eofs/standard.pyc in eofsAsCorrelation(self, neofs)
364 # numpy array filled with numpy.nan.
365 if not self._filled:
--> 366 c = c.filled(fill_value=np.nan)
367 return c
368
AttributeError: 'numpy.ndarray' object has no attribute 'filled'
Hi there,
I'm trying to use the eofs package. It seems to work OK when I use numpy arrays or when I only use xarray, but I can't get it to work with xarray + dask.
I've reduced my dataset to something very small.
Here are 3 example notebooks...
@ScottWales, am I doing something wrong here? I also tried chunking with .chunk({'time': 1}) but I still had the same issue...
I would like to use your package for looking at wind fields by doing EOF analysis on complex-valued arrays. In a simple test using a 3x3x3 real-valued array and the same array cast to complex, the first 2 EOFs are the same but the third is different. I am guessing the third EOF for this simple problem is just noise anyway, but I was curious whether you had done any analysis with complex values.
Thanks
Hi there, I've just started using the eofs package for analysis of some forecast model data. I really appreciate the functionality of the package; it has saved me a lot of time already.
The model I'm working with is an ensemble-based system, so ideally for the purposes of the analysis I'd like to treat each individual ensemble member as an extra set of samples along the time dimension (e.g. if I have 12 ensemble members over 100 time points, on an x by y lat-lon grid, I end up computing EOFs of 1200 time points on my x, y grid).
My data is stored in iris cubes.
Are there any plans to add support for this? If not I'm happy to try adding it to the source myself if I can find the time. Any suggestions on the best approach would be appreciated.
Cheers,
Tom
Is there a reason MultivariateEof has not been implemented yet for xarray objects? If not, I'd be happy to contribute the feature if you can give some pointers on how best that would be done.