Hello, I'm trying to use your package. I pass it a pandas Series of events

What's the meaning of "ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions" about pyextremes HOT 10 CLOSED

georgebv commented on July 18, 2024

What's the meaning of "ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions"

from pyextremes.

Comments (10)

georgebv commented on July 18, 2024

@coastalmodeler thank you for your input. In order to help you I'll need to be able to reproduce your error. Please provide the following:

OS
Python version
pyextremes version
numpy version
pandas version

In addition to that I'll need a complete code snippet which can be run as is. For example:

import pandas as pd
import pyextremes

data = pd.read_csv("data.csv")
model = pyextremes.EVA(data)
model.get_extremes()

Please also provide a link to your data.csv. You can also make a GitHub gist with a Jupyter notebook if that's what you prefer.
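
The environment details requested above can be gathered in one go. This is only a sketch: the helper name `collect_versions` is hypothetical and not part of pyextremes, and it relies on the standard-library `importlib.metadata` (Python 3.8+).

```python
import importlib.metadata
import platform
import sys


def collect_versions(packages=("pyextremes", "numpy", "pandas")):
    """Return OS, Python, and package versions as a dict of strings."""
    info = {
        "OS": platform.platform(),
        "Python": sys.version.split()[0],
    }
    for name in packages:
        try:
            info[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            # Report missing packages instead of failing the whole report.
            info[name] = "not installed"
    return info


for key, value in collect_versions().items():
    print(f"{key}: {value}")
```

Pasting the printed lines into the issue covers the first five items of the list.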

wiz21b commented on July 18, 2024

More information. I understand the error comes out of pandas, not your code directly. Just for information, my data looks like:

model = EVA(pd.Series(durations, np.sort(dates)))
print(dates, dates.dtype)
print(durations, durations.dtype)

['2002-01-01T16:00:00.000000000' '2002-01-01T20:00:00.000000000'
 '2002-01-02T03:00:00.000000000' ... '2004-10-02T23:00:00.000000000'
 '2004-10-03T07:00:00.000000000' '2004-10-03T13:00:00.000000000'] datetime64[ns]
[20. 18.  7. ...  4.  5.  7.] float64

Full stack trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [115], in <module>
----> 1 model.get_extremes(method="BM", block_size="10D")
      2 model.plot_extremes()

File ~/.local/lib/python3.9/site-packages/pyextremes/eva.py:452, in EVA.get_extremes(self, method, extremes_type, **kwargs)
    450 message = f"for method='{method}' and extremes_type='{extremes_type}'"
    451 logger.debug("extracting extreme values %s", message)
--> 452 self.__extremes = get_extremes(
    453     method=method,
    454     ts=self.data,
    455     extremes_type=extremes_type,
    456     **kwargs,
    457 )
    458 self.__extremes_method = method
    459 self.__extremes_type = extremes_type

File ~/.local/lib/python3.9/site-packages/pyextremes/extremes/extremes.py:59, in get_extremes(ts, method, extremes_type, **kwargs)
     13 """
     14 Get extreme events from time series.
     15 
   (...)
     56 
     57 """
     58 if method == "BM":
---> 59     return get_extremes_block_maxima(
     60         ts=ts,
     61         extremes_type=extremes_type,
     62         **kwargs,
     63     )
     64 if method == "POT":
     65     return get_extremes_peaks_over_threshold(
     66         ts=ts,
     67         extremes_type=extremes_type,
     68         **kwargs,
     69     )

File ~/.local/lib/python3.9/site-packages/pyextremes/extremes/block_maxima.py:148, in get_extremes_block_maxima(ts, extremes_type, block_size, errors, min_last_block)
    137     warnings.warn(
    138         message=f"{empty_intervals} blocks contained no data",
    139         category=NoDataBlockWarning,
    140     )
    142 logger.debug(
    143     "successfully collected %d extreme events, found %s no-data blocks",
    144     len(extreme_values),
    145     empty_intervals,
    146 )
--> 148 return pd.Series(
    149     data=extreme_values,
    150     index=pd.Index(data=extreme_indices, name=ts.index.name or "date-time"),
    151     dtype=np.float64,
    152     name=ts.name or "extreme values",
    153 ).fillna(np.nanmean(extreme_values))

File ~/.local/lib/python3.9/site-packages/pandas/core/series.py:439, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
    437         data = data.copy()
    438 else:
--> 439     data = sanitize_array(data, index, dtype, copy)
    441     manager = get_option("mode.data_manager")
    442     if manager == "block":

File ~/.local/lib/python3.9/site-packages/pandas/core/construction.py:570, in sanitize_array(data, index, dtype, copy, raise_cast_failure, allow_2d)
    567     data = list(data)
    569 if dtype is not None or len(data) == 0:
--> 570     subarr = _try_cast(data, dtype, copy, raise_cast_failure)
    571 else:
    572     subarr = maybe_convert_platform(data)

File ~/.local/lib/python3.9/site-packages/pandas/core/construction.py:760, in _try_cast(arr, dtype, copy, raise_cast_failure)
    755         subarr = maybe_cast_to_integer_array(arr, dtype)
    756     else:
    757         # 4 tests fail if we move this to a try/except/else; see
    758         #  test_constructor_compound_dtypes, test_constructor_cast_failure
    759         #  test_constructor_dict_cast2, test_loc_setitem_dtype
--> 760         subarr = np.array(arr, dtype=dtype, copy=copy)
    762 except (ValueError, TypeError):
    763     if raise_cast_failure:

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (731,) + inhomogeneous part.
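
For reference, NumPy raises this error whenever it is asked to build a numeric array from a list whose elements do not all have the same shape, for example a mix of scalars and sequences. The sketch below reproduces the message with made-up values; it does not use pyextremes and does not by itself show which element of `extreme_values` was a sequence in the trace above.

```python
import numpy as np

# Two scalars and one two-element list: the entries do not share a shape.
ragged = [20.0, 18.0, [7.0, 7.0]]

message = None
try:
    np.array(ragged, dtype=np.float64)
except ValueError as exc:
    message = str(exc)
    print(message)
```

The exact wording after the first sentence of the message depends on the NumPy version.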

wiz21b commented on July 18, 2024

After cleaning my data (making sure every date is represented and setting "non-existing" values to zero), the program runs. But this makes me nervous: filling the gaps with zeros treats zero as an actual observation when it is not.

georgebv commented on July 18, 2024

@wiz21b can you share your data so that I can reproduce your error? This is not meant to happen because EVA pre-processes the data during initialization - this may be a scenario I didn't account for.

Also, the data doesn't have to be at regular intervals.

coastalmodeler commented on July 18, 2024

What was the solution? I'm having the same issue.

georgebv commented on July 18, 2024

@coastalmodeler I have never heard back from @wiz21b so I don't know if the issue is resolved. I can reopen this issue for you if you post details about your error.

coastalmodeler commented on July 18, 2024

Cool thanks. I'm getting the same error as wiz21b when I run the get_extremes command. I also get the error when I try to execute some of the plot functions and POT functions. I haven't been able to figure out why.

I'm following the tutorial case exactly, but using data from a different NOAA station, downloaded into a DataFrame with the noaa_coops package. See the code below:

tide_gauge = noaa_coops.Station(8775237)

# https://api.tidesandcurrents.noaa.gov/api/prod/#products
df_water_levels = tide_gauge.get_data(
    begin_date="20040406",
    end_date="20220925",
    product="water_level",
    datum="NAVD",
    units="english",
    time_zone="LST",
)

I then normalize the dataset by adjusting for RSLR:
measured_rslr=5.54*0.00328084 #ft/yr

df_water_levels_corrected=df_water_levels['water_level'].copy().sort_index(ascending=True).astype(float).dropna()

df_water_levels_corrected=df_water_levels_corrected-(df_water_levels_corrected.index.array-pd.to_datetime("1992"))/pd.to_timedelta("365.2425D")*measured_rslr

am=pyextremes.EVA(df_water_levels_corrected)

Everything works up until this point and here is the command that results in the error:

am.get_extremes(method="BM", block_size="365.2425D",errors="ignore")

am.get_extremes(method="BM", block_size="365.2425D",errors="ignore")
C:\Users: NoDataBlockWarning: 1 blocks contained no data
warnings.warn(
Traceback (most recent call last):

Input In [94] in <cell line: 1>
am.get_extremes(method="BM", block_size="365.2425D",errors="ignore")

File ~.conda\envs\work\lib\site-packages\pyextremes\eva.py:452 in get_extremes
self.__extremes = get_extremes(

File ~.conda\envs\work\lib\site-packages\pyextremes\extremes\extremes.py:59 in get_extremes
return get_extremes_block_maxima(

File ~.conda\envs\work\lib\site-packages\pyextremes\extremes\block_maxima.py:148 in get_extremes_block_maxima
return pd.Series(

File ~.conda\envs\work\lib\site-packages\pandas\core\series.py:451 in __init__
data = sanitize_array(data, index, dtype, copy)

File ~.conda\envs\work\lib\site-packages\pandas\core\construction.py:594 in sanitize_array
subarr = _try_cast(data, dtype, copy, raise_cast_failure)

File ~.conda\envs\work\lib\site-packages\pandas\core\construction.py:784 in _try_cast
subarr = np.array(arr, dtype=dtype, copy=copy)

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (18,) + inhomogeneous part.

coastalmodeler commented on July 18, 2024

Thank you for your quick response. See info and code below.

OS: Microsoft Windows 10
Python Version: 3.8.13
pyextremes version: 2.2.4
numpy version: 1.21.5
pandas version: 1.4.3
noaa_coops version: 0.1.9

Here's the code. Note that there is no CSV file: I'm using noaa-coops to download the data directly into Python from the API. The noaa-coops wrapper can be found here: https://pypi.org/project/noaa-coops/

import noaa_coops as nc
import pyextremes
import numpy as np
import pandas as pd

tide_gauge=nc.Station(8775237)

#https://api.tidesandcurrents.noaa.gov/api/prod/#products
df_water_levels=tide_gauge.get_data(
    begin_date="20040406",
    end_date="20220925",
    product="water_level",
    datum="NAVD",
    units="english",
    time_zone="LST")

measured_rslr=5.54*0.00328084 

df_water_levels_corrected=df_water_levels['water_level'].copy().sort_index(ascending=True).astype(float).dropna()

df_water_levels_corrected=df_water_levels_corrected-(df_water_levels_corrected.index.array-pd.to_datetime("1992"))/pd.to_timedelta("365.2425D")*measured_rslr

am=pyextremes.EVA(df_water_levels_corrected)
am.get_extremes(method="BM",errors="ignore")

coastalmodeler commented on July 18, 2024

I think I found the issue. There were duplicate time steps in the NOAA dataset. Once I removed those the code works as intended. I'd guess that was the same problem @wiz21b was having. Thank you for your time.
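
A minimal sketch of how duplicate time steps can be found and removed before the series is passed to `EVA`. The timestamps and values are made up for illustration. The first part shows a general pandas behaviour: looking up a duplicated label with `.loc` returns a Series rather than a scalar. That this is the exact path that produces the ragged list inside pyextremes is an assumption.

```python
import pandas as pd

# Illustrative data: the 00:06 time step appears twice.
index = pd.to_datetime(
    ["2004-04-06 00:00", "2004-04-06 00:06", "2004-04-06 00:06", "2004-04-06 00:12"]
)
levels = pd.Series([1.2, 1.5, 1.5, 1.1], index=index, name="water_level")

# With a duplicated label, .loc returns a Series instead of a scalar.
lookup = levels.loc[pd.Timestamp("2004-04-06 00:06")]
print(type(lookup).__name__, len(lookup))

# Count the duplicated timestamps, then keep the first value for each one.
n_duplicates = int(levels.index.duplicated().sum())
deduplicated = levels[~levels.index.duplicated(keep="first")]
print(n_duplicates, deduplicated.index.is_unique)
```

Keeping the first value is one choice among several; averaging with `levels.groupby(level=0).mean()` is an alternative when the duplicated readings differ.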

georgebv commented on July 18, 2024

@coastalmodeler thank you for posting your solution here. It was an issue with the EVA class not removing duplicates; I have included a fix in the latest release.
