intelpython / mkl_fft


NumPy-based Python interface to Intel (R) MKL FFT functionality

License: BSD 3-Clause "New" or "Revised" License

Batchfile 0.06% Shell 0.12% Python 93.82% C 6.01%

fft mkl numpy


mkl_fft's Issues

top-level documentation for inverse real needs clarification

The top-level documentation does not make it clear that the inverse real transforms are real to complex (not complex to real as some users will expect).

This functionality is also much less useful than a complex-to-real transform, which is worth considering for future work. I know that complex-to-real is available through the numpy interface, but that interface lacks the overwrite flag.
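
For reference, the convention that users of the numpy interface tend to expect can be sketched as follows. This uses only stock `numpy.fft`, not the top-level mkl_fft functions whose documentation this issue is about: `rfft` maps real input to complex output, and `irfft` maps complex input back to real.

```python
import numpy as np

x = np.arange(8, dtype=np.float64)      # real input of length n = 8

spectrum = np.fft.rfft(x)               # real -> complex, length n // 2 + 1
restored = np.fft.irfft(spectrum, n=8)  # complex -> real, length n

print(spectrum.dtype, spectrum.shape)
print(restored.dtype, restored.shape)
```

Passing `n` explicitly to `irfft` matters for odd lengths, where the output length cannot be inferred from the spectrum alone.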

Limitation of: Data size of one of transform dimensions exceeds 2^31 - 1 bytes

Based on the error message, it seems like this is a known limitation, but I'm resampling data using ffts and get an error related to the size of the data. The data shape is: (17863680, 128) and the fft is being taken along axis=0.

I'm guessing this might be related to an index being int32, and if it is possible to get rid of this limitation, that would be helpful.

  File "/home/jlivezey/process_nwb/process_nwb/fft.py", line 22, in rfft
    return mklrfft(*args, **kwargs)
  File "/clusterfs/bebb/users/jlivezey/anaconda3/envs/nsds_nwb/lib/python3.7/site-packages/mkl_fft/_numpy_fft.py", line 414, in rfft
    (x,), {'n': n, 'axis': axis})
  File "/clusterfs/bebb/users/jlivezey/anaconda3/envs/nsds_nwb/lib/python3.7/site-packages/mkl_fft/_numpy_fft.py", line 103, in trycall
    raise ve
  File "/clusterfs/bebb/users/jlivezey/anaconda3/envs/nsds_nwb/lib/python3.7/site-packages/mkl_fft/_numpy_fft.py", line 98, in trycall
    res = func(*args, **kwrds)
  File "mkl_fft/_pydfti.pyx", line 792, in mkl_fft._pydfti.rfft_numpy
  File "mkl_fft/_pydfti.pyx", line 701, in mkl_fft._pydfti._rc_fft1d_impl
ValueError: Internal error occurred: b'Intel MKL DFTI ERROR: Data size of one of transform dimensions exceeds 2^31 - 1 bytes'
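
The arithmetic behind the message can be sketched as below. This is only an illustration under stated assumptions: the data is taken to be float64 and C-contiguous, and the limit is assumed to apply to the byte span covered along the transformed axis. Neither assumption is confirmed in the report, and `byte_span` is a hypothetical helper.

```python
INT32_MAX = 2**31 - 1

def byte_span(shape, axis, itemsize=8):
    """Bytes covered along `axis` of a C-contiguous array (itemsize=8 assumes float64)."""
    stride = itemsize
    for n in shape[axis + 1:]:
        stride *= n
    return shape[axis] * stride

shape = (17863680, 128)
span = byte_span(shape, axis=0)
one_column = shape[0] * 8   # the same axis copied out as one contiguous column

print(span, span > INT32_MAX)
print(one_column, one_column > INT32_MAX)
```

Under these assumptions the strided span is far above the limit, while a single contiguous column stays well below it.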

mkl_fft._pydfti.rfftn_numpy error when using axes specification

I originally posted this as a numpy issue, but after some research it appears to be something wrong with mkl_fft, although it only happens under Python 3 (and only in later releases, where "later" means past whatever the default was in Anaconda 5.0.1).

You can see some background on the issue here - the context is that I use numpy's rfftn to make a PSD of simulated ocean waves.

Long story short, I create two conda environments, one like this:

> conda create --name test1 anaconda=2019.03 python=2

and one like this:

> conda create --name test2 anaconda=2019.03 python=3

Given a numpy array called 'waves' with dimensions [64,512,512], I then run the following:

import numpy.fft as nfft
realgood2 = nfft.rfftn(waves, axes=[2,1,0])

In test environment 1 it works perfectly. In test environment 2, I get this:

Traceback (most recent call last):
  File "gist.py", line 57, in <module>
    realgood2 = nfft.rfftn(waves, axes=[2,1,0])
  File "C:\Users\asmith\AppData\Local\Continuum\anaconda3\envs\3_2019.03\lib\site-packages\mkl_fft\_numpy_fft.py", line 1043, in rfftn
    output = mkl_fft.rfftn_numpy(a, s, axes)
  File "mkl_fft\_pydfti.pyx", line 951, in mkl_fft._pydfti.rfftn_numpy
ValueError: could not broadcast input array from shape (512,64) into shape (512,512)
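
For comparison, here is a small sketch of what stock `numpy.fft.rfftn` returns for a reversed axes list, using a small stand-in array rather than the original data. It assumes an unpatched NumPy, so a patched build may behave differently, which is the point of the report.

```python
import numpy as np

rng = np.random.default_rng(0)
waves = rng.standard_normal((4, 8, 8))   # small stand-in for the [64, 512, 512] array

axes = [2, 1, 0]
result = np.fft.rfftn(waves, axes=axes)

# rfftn applies the real transform along the last listed axis (axis 0 here),
# so that axis shrinks to n // 2 + 1; the other axes keep their length.
half = waves.shape[0] // 2 + 1
reference = np.fft.fftn(waves, axes=axes)[:half]

print(result.shape, np.allclose(result, reference))
```

A check like this against `fftn` can serve as a reference when testing another implementation with the same arguments.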

Unexpected result of 1D FFT on Fortran arrays of array rank > 2

An FFT transformation on different transpositions of the same array should give the same result, within a reasonable threshold.

This is currently (v1.0.2) not the case as verified by:

import numpy as np
import mkl_fft

d_ccont = np.random.randn(2, 3, 2)
assert d_ccont.flags['C'] and not d_ccont.flags['F']
d_fcont = np.asfortranarray(d_ccont)
assert d_fcont.flags['F'] and not d_fcont.flags['C']

assert np.allclose(d_ccont, d_fcont) # the data are the same

f1 = mkl_fft.fft(d_ccont)
f2 = mkl_fft.fft(d_fcont)

assert np.allclose(f1, f2) # this check fails
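
The same check can be wrapped in a reusable helper that takes the FFT function as a parameter. This is a sketch: `layouts_agree` is a hypothetical name, and it is exercised here with `numpy.fft.fft` only. Passing `mkl_fft.fft` instead would presumably reproduce the reported failure on affected versions.

```python
import numpy as np

def layouts_agree(fft_func, shape=(2, 3, 2), seed=0):
    """Return True if fft_func gives the same result for C- and F-ordered copies."""
    rng = np.random.default_rng(seed)
    d_c = rng.standard_normal(shape)   # C-contiguous
    d_f = np.asfortranarray(d_c)       # same values, Fortran-contiguous
    return bool(np.allclose(fft_func(d_c), fft_func(d_f)))

print(layouts_agree(np.fft.fft))
```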

Error when rfftn is called with axes and norm='ortho'

I have been getting the following error when rfftn is called with axes and norm='ortho':

~/anaconda3/lib/python3.6/site-packages/mkl_fft/_numpy_fft.py in <listcomp>(.0)
   1040         a = asarray(a)
   1041         s, axes = _cook_nd_args(a, s, axes)
-> 1042         n_tot = numpy.prod([ s[ai] for ai in axes])
   1043 
   1044     output = mkl_fft.rfftn_numpy(a, s, axes)

IndexError: list index out of range

Here is a code snippet to reproduce this:

import numpy as np
x = np.zeros([2, 3])
y = np.fft.rfftn(x, axes=[1], norm='ortho')

numpy without mkl_fft does not produce errors.
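
The traceback suggests that `s`, which holds one size per requested axis, is being indexed by axis number. A minimal pure-Python sketch of that failure mode follows. `cook_nd_args` is a hypothetical stand-in written for this illustration, assumed to mirror what `_cook_nd_args` returns, and the last line is only one candidate fix.

```python
from math import prod

def cook_nd_args(shape, s=None, axes=None):
    """Hypothetical stand-in: returns one size in `s` per entry in `axes`."""
    if axes is None:
        axes = list(range(len(shape)))
    if s is None:
        s = [shape[ax] for ax in axes]
    return list(s), list(axes)

shape = (2, 3)
s, axes = cook_nd_args(shape, axes=[1])   # s has one entry, axes is [1]

try:
    n_tot = prod([s[ai] for ai in axes])  # indexes s by axis number, so s[1] fails
except IndexError as exc:
    print("IndexError:", exc)

n_tot = prod(s)                           # candidate fix: multiply the sizes directly
print(n_tot)
```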

_scipy_fft_backend has global side-effects

Here is a fairly minimal reproducer:

In [1]: import mkl, numpy as np
   ...: import mkl_fft
   ...: from mkl_fft._scipy_fft_backend import fft as scipy_fft
   ...:
   ...: x = np.random.rand(100, 100, 100).astype(np.cdouble)
   ...: mkl.set_num_threads(1)
   ...: %timeit scipy_fft(x, workers=8)
   ...: 
   ...: mkl.set_num_threads(8)
   ...: %timeit mkl_fft.fft(x)
320 µs ± 420 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
3.36 ms ± 1.38 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

I would expect the mkl_fft.fft call to use 8 threads and so be as fast as scipy_fft with workers=8. What actually happens is that scipy_fft sets the FFT domain thread count to 1, and the domain setting has higher precedence than the global thread setting.

mkl_fft/mkl_fft/_scipy_fft_backend.py

Lines 162 to 163 in 4d8cc2a

    
    n_threads = _hardware_counts.get_max_threads_count()
    mkl.domain_set_num_threads(n_threads, domain='fft')
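
One way to avoid this kind of side effect is to restore whatever value was in place before the call, rather than resetting to a computed default. The sketch below shows the pattern with a fake thread store so that it runs without MKL. `temporary_setting` and the getter/setter names are hypothetical. With mkl-service, the setter would presumably wrap `mkl.domain_set_num_threads(n, domain='fft')` as in the quoted source, and the getter would need a matching query function, which is an assumption here.

```python
from contextlib import contextmanager

@contextmanager
def temporary_setting(getter, setter, value):
    """Set a value for the duration of a block, then restore the previous one."""
    previous = getter()
    setter(value)
    try:
        yield
    finally:
        setter(previous)

# Fake per-domain thread store standing in for MKL's FFT-domain setting.
_threads = {"fft": 8}

def get_fft_threads():
    return _threads["fft"]

def set_fft_threads(n):
    _threads["fft"] = n

with temporary_setting(get_fft_threads, set_fft_threads, 1):
    inside = get_fft_threads()
after = get_fft_threads()
print(inside, after)
```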

Do you have any plans to add other types of transform

Hi,

I've been using mkl_fft as a backend for scipy for a while now, and it works great!
Unless I'm mistaken, the available 1-D transforms are fft and rfft.
Do you have any plans to add dct and dst?
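
Until such transforms exist, a DCT can be built on top of a complex FFT. The sketch below computes an unnormalized 1-D DCT-II from the FFT of a mirrored sequence and checks it against the direct definition. It uses `numpy.fft.fft` as a stand-in for any FFT routine, and the function names are hypothetical. It is an illustration of the technique, not a proposal for how mkl_fft would implement it.

```python
import numpy as np

def dct2_via_fft(x):
    """Unnormalized DCT-II: y[k] = 2 * sum_n x[n] * cos(pi * k * (2n + 1) / (2N))."""
    x = np.asarray(x, dtype=np.float64)
    n = x.shape[0]
    mirrored = np.concatenate([x, x[::-1]])   # even extension, length 2N
    spectrum = np.fft.fft(mirrored)[:n]
    phase = np.exp(-1j * np.pi * np.arange(n) / (2 * n))
    return (spectrum * phase).real

def dct2_direct(x):
    """Direct O(N^2) evaluation of the same definition, for checking."""
    x = np.asarray(x, dtype=np.float64)
    n = x.shape[0]
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    return 2.0 * (np.cos(np.pi * k * (2 * m + 1) / (2 * n)) @ x)

x = np.array([1.0, 2.0, 0.5, -1.0, 3.0])
print(np.allclose(dct2_via_fft(x), dct2_direct(x)))
```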

Wheels for newer python versions

Please provide wheels for Python 3.7 and 3.8 on PyPI so that we can install this module.

numpy.distutils raises a DeprecationWarning

mkl_fft/setup.py:34: DeprecationWarning:

numpy.distutils is deprecated since NumPy 1.23.0, as a result
of the deprecation of distutils itself. It will be removed for
Python >= 3.12. For older Python versions it will remain present.
It is recommended to use setuptools < 60.0 for those Python versions.
For more details, see:
https://numpy.org/devdocs/reference/distutils_status_migration.html

Provide an interface for scipy.fft

The upcoming SciPy 1.4.0 release will include a new subpackage scipy.fft that supersedes scipy.fftpack. This new interface matches the behaviour of numpy.fft closely, including real-to-complex rffts, but differs in a few ways (e.g. dtype casting).

This interface also includes a new backend mechanism (scipy/scipy#10383) that would allow mkl_fft to implement the scipy.fft interface without any monkey patching. e.g. from pyFFTW/pyFFTW#269:

import scipy.fft
scipy.fft.fft([1])  # Calls scipy's own implementation

from pyfftw.interfaces import scipy_fft
scipy.fft.set_global_backend(scipy_fft)
scipy.fft.fft([1])  # Calls into pyfftw

See scipy_fft.py for what needs to be done on the mkl_fft side to support this.

It's also unfortunate that the name mkl_fft._scipy_fft is already taken. Any ideas on how to move forward with this?

Git push heroku master error : Could not find a version that satisfies the requirement mkl-random==1.2.1

While running git push heroku master, I encounter an error.

A snippet of the error:

remote:        ERROR: Could not find a version that satisfies the requirement mkl-service==2.3.0 (from -r /tmp/build_3a708448/requirements.txt (line 37)) (from versions: 2.4.0)
remote:        ERROR: No matching distribution found for mkl-service==2.3.0 (from -r /tmp/build_3a708448/requirements.txt (line 37))
remote:  !     Push rejected, failed to compile Python app.
remote:
remote:  !     Push failed

The full details from the command prompt:

(debasis_venv) C:\Users\LENOVO\12042021\Sec20_Build_A_Website>heroku login
    heroku: Press any key to open up the browser to login or q to exit:
    Opening browser to https://cli-auth.heroku.com/auth/cli/browser/d611f093-8415-4dbd-a30b-93e7861079d3?requestor=SFMyNTY.g2gDbQAAAA0xMjIuMTYzLjU0Ljc3bgYAYcP3vHkBYgABUYA.qwoIt7JBUdCHNCOZADmhuocvQPTnXlxq7UvuoM9jaDY
    Logging in... done
    Logged in as [email protected]
    
    (debasis_venv) C:\Users\LENOVO\12042021\Sec20_Build_A_Website>git init
    Reinitialized existing Git repository in C:/Users/LENOVO/12042021/Sec20_Build_A_Website/.git/
    
    (debasis_venv) C:\Users\LENOVO\12042021\Sec20_Build_A_Website>git add .
    
    (debasis_venv) C:\Users\LENOVO\12042021\Sec20_Build_A_Website>git config --global user.email "[email protected]"
    
    (debasis_venv) C:\Users\LENOVO\12042021\Sec20_Build_A_Website>git config --global user.name "debasissilpython"
    
    (debasis_venv) C:\Users\LENOVO\12042021\Sec20_Build_A_Website>git commit -m "first commit"
    [master ac903ae] first commit
     1 file changed, 1 deletion(-)
    
    (debasis_venv) C:\Users\LENOVO\12042021\Sec20_Build_A_Website>heroku git:remote --app debasissil
    set git remote heroku to https://git.heroku.com/debasissil.git
    
    (debasis_venv) C:\Users\LENOVO\12042021\Sec20_Build_A_Website>git push heroku master
    Enumerating objects: 31, done.
    Counting objects: 100% (31/31), done.
    Delta compression using up to 2 threads
    Compressing objects: 100% (27/27), done.
    Writing objects: 100% (31/31), 103.51 KiB | 2.72 MiB/s, done.
    Total 31 (delta 9), reused 0 (delta 0), pack-reused 0
    remote: Compressing source files... done.
    remote: Building source:
    remote:
    remote: -----> Building on the Heroku-20 stack
    remote: -----> Determining which buildpack to use for this app
    remote: -----> Python app detected
    remote: -----> Using Python version specified in runtime.txt
    remote:  !     Python has released a security update! Please consider upgrading to python-3.8.10
    remote:        Learn More: https://devcenter.heroku.com/articles/python-runtimes
    remote: -----> Installing python-3.8.8
    remote: -----> Installing pip 20.2.4, setuptools 47.1.1 and wheel 0.36.2
    remote: -----> Installing SQLite3
    remote: -----> Installing requirements with pip
    remote:        Collecting addict==2.4.0
    remote:          Downloading addict-2.4.0-py3-none-any.whl (3.8 kB)
    remote:        Collecting aiofiles==0.6.0
    remote:          Downloading aiofiles-0.6.0-py3-none-any.whl (11 kB)
    remote:        Collecting backcall==0.2.0
    remote:          Downloading backcall-0.2.0-py2.py3-none-any.whl (11 kB)
    remote:        Collecting branca==0.4.2
    remote:          Downloading branca-0.4.2-py3-none-any.whl (24 kB)
    remote:        Collecting certifi==2020.12.5
    remote:          Downloading certifi-2020.12.5-py2.py3-none-any.whl (147 kB)
    remote:        Collecting chardet==4.0.0
    remote:          Downloading chardet-4.0.0-py2.py3-none-any.whl (178 kB)
    remote:        Collecting click==7.1.2
    remote:          Downloading click-7.1.2-py2.py3-none-any.whl (82 kB)
    remote:        Collecting colorama==0.4.4
    remote:          Downloading colorama-0.4.4-py2.py3-none-any.whl (16 kB)
    remote:        Collecting csv-to-geojson==0.0.1
    remote:          Downloading csv_to_geojson-0.0.1-py3-none-any.whl (4.2 kB)
    remote:        Collecting cycler==0.10.0
    remote:          Downloading cycler-0.10.0-py2.py3-none-any.whl (6.5 kB)
    remote:        Collecting decorator==5.0.6
    remote:          Downloading decorator-5.0.6-py3-none-any.whl (8.8 kB)
    remote:        Collecting demjson==2.2.4
    remote:          Downloading demjson-2.2.4.tar.gz (131 kB)
    remote:        Collecting et-xmlfile==1.1.0
    remote:          Downloading et_xmlfile-1.1.0-py3-none-any.whl (4.7 kB)
    remote:        Collecting Flask==2.0.1
    remote:          Downloading Flask-2.0.1-py3-none-any.whl (94 kB)
    remote:        Collecting folium==0.12.1
    remote:          Downloading folium-0.12.1-py2.py3-none-any.whl (94 kB)
    remote:        Collecting geographiclib==1.50
    remote:          Downloading geographiclib-1.50-py3-none-any.whl (38 kB)
    remote:        Collecting geojson==2.5.0
    remote:          Downloading geojson-2.5.0-py2.py3-none-any.whl (14 kB)
    remote:        Collecting geopy==2.1.0
    remote:          Downloading geopy-2.1.0-py3-none-any.whl (112 kB)
    remote:        Collecting gunicorn==20.1.0
    remote:          Downloading gunicorn-20.1.0-py3-none-any.whl (79 kB)
    remote:        Collecting h11==0.12.0
    remote:          Downloading h11-0.12.0-py3-none-any.whl (54 kB)
    remote:        Collecting httpcore==0.13.3
    remote:          Downloading httpcore-0.13.3-py3-none-any.whl (57 kB)
    remote:        Collecting httpx==0.18.1
    remote:          Downloading httpx-0.18.1-py3-none-any.whl (75 kB)
    remote:        Collecting idna==2.10
    remote:          Downloading idna-2.10-py2.py3-none-any.whl (58 kB)
    remote:        Collecting ipykernel==5.3.4
    remote:          Downloading ipykernel-5.3.4-py3-none-any.whl (120 kB)
    remote:        Collecting ipython==7.22.0
    remote:          Downloading ipython-7.22.0-py3-none-any.whl (785 kB)
    remote:        Collecting ipython-genutils==0.2.0
    remote:          Downloading ipython_genutils-0.2.0-py2.py3-none-any.whl (26 kB)
    remote:        Collecting itsdangerous==2.0.0
    remote:          Downloading itsdangerous-2.0.0-py3-none-any.whl (18 kB)
    remote:        Collecting jedi==0.17.0
    remote:          Downloading jedi-0.17.0-py2.py3-none-any.whl (1.1 MB)
    remote:        Collecting Jinja2==3.0.1
    remote:          Downloading Jinja2-3.0.1-py3-none-any.whl (133 kB)
    remote:        Collecting jupyter-client==6.1.12
    remote:          Downloading jupyter_client-6.1.12-py3-none-any.whl (112 kB)
    remote:        Collecting jupyter-core==4.7.1
    remote:          Downloading jupyter_core-4.7.1-py3-none-any.whl (82 kB)
    remote:        Collecting justpy==0.1.5
    remote:          Downloading justpy-0.1.5-py3-none-any.whl (4.4 MB)
    remote:        Collecting kiwisolver==1.3.1
    remote:          Downloading kiwisolver-1.3.1-cp38-cp38-manylinux1_x86_64.whl (1.2 MB)
    remote:        Collecting MarkupSafe==2.0.1
    remote:          Downloading MarkupSafe-2.0.1-cp38-cp38-manylinux2010_x86_64.whl (30 kB)
    remote:        Collecting matplotlib==3.3.4
    remote:          Downloading matplotlib-3.3.4-cp38-cp38-manylinux1_x86_64.whl (11.6 MB)
    remote:        Collecting mkl-fft==1.3.0
    remote:          Downloading mkl_fft-1.3.0-1-cp38-cp38-manylinux2014_x86_64.whl (250 kB)
    remote:        ERROR: Could not find a version that satisfies the requirement mkl-service==2.3.0 (from -r /tmp/build_3a708448/requirements.txt (line 37)) (from versions: 2.4.0)
    remote:        ERROR: No matching distribution found for mkl-service==2.3.0 (from -r /tmp/build_3a708448/requirements.txt (line 37))
    remote:  !     Push rejected, failed to compile Python app.
    remote:
    remote:  !     Push failed
    remote: Verifying deploy...
    remote:
    remote: !       Push rejected to debasissil.
    remote:
    To https://git.heroku.com/debasissil.git
     ! [remote rejected] master -> master (pre-receive hook declined)
    error: failed to push some refs to 'https://git.heroku.com/debasissil.git'
    
    (debasis_venv) C:\Users\LENOVO\12042021\Sec20_Build_A_Website>

I tried reinstalling it, but pip says "Requirement already satisfied". The same issue occurs with mkl-random==1.2.1.
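
The pip message itself says which versions the build environment could see. As a small sketch, the pinned version and the available versions can be pulled out of the error line quoted above and compared. The regular expression is written for this one message format and is an assumption, not a general pip log parser.

```python
import re

error_line = (
    "ERROR: Could not find a version that satisfies the requirement "
    "mkl-service==2.3.0 (from -r /tmp/build_3a708448/requirements.txt (line 37)) "
    "(from versions: 2.4.0)"
)

# Capture the package name, the pinned version, and the list pip reports as available.
match = re.search(r"requirement (\S+)==(\S+) .*\(from versions: ([^)]*)\)", error_line)
package = match.group(1)
pinned = match.group(2)
available = match.group(3).split(", ")

print(package, pinned, available)
print(pinned in available)
```

Here the pin in requirements.txt asks for a version that is not in the list pip found, so the install cannot succeed regardless of what is installed in the local virtual environment.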

I am doing this from a virtual environment.

Please advise.

Thanks and Regards

Fix performance when using 3d arrays

I installed Intel Python on Windows 10; mkl_fft is version 1.3.0, package build py37h5a85a7c_0.

If one does a 2D FFT on a ndim=3 array, the resulting computation takes much longer than doing the same 2D FFT on an ndim=2 array. For example:

In [1]: import numpy as np, mkl_fft
In [2]: a = np.ones((4096,4096), dtype=np.complex128)
In [3]: %timeit b = mkl_fft.fftn(a, axes=(0, 1))
183 ms ± 3.26 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [4]: b = np.ones((1, 4096,4096), dtype=np.complex128)
In [5]: %timeit c = mkl_fft.fftn(b, axes=(1, 2))
354 ms ± 13.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I believe that what is happening is that any time one feeds in a 3d array, all of the multi-threading turns off and you get the performance of a single-threaded computation.

This cannot be the intended behavior. It's 2021 and the MKL has one of the finest FFT algorithms on the planet. It has got to be better than this in Intel's Python distribution.

The above behavior also occurs in Anaconda's Python distribution, which is where I originally discovered the problem; I switched to Intel's distribution to verify.
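
A possible workaround, given that the ndim=2 call is the fast one in the timings above, is to hand the transform one 2D slice at a time. The sketch below shows the pattern and checks that it matches the ndim=3 call. It uses `numpy.fft` as a stand-in, `fft2_over_last_axes` is a hypothetical helper, and whether it is actually faster with mkl_fft has not been measured here.

```python
import numpy as np

def fft2_over_last_axes(stack, fft2_func=np.fft.fft2):
    """Apply a 2D FFT to each 2D slice of a (k, n, m) array, one slice at a time."""
    out = np.empty(stack.shape, dtype=np.complex128)
    for i, plane in enumerate(stack):
        out[i] = fft2_func(plane)   # each call sees a plain ndim=2 array
    return out

rng = np.random.default_rng(0)
b = rng.standard_normal((1, 16, 16)) + 0j
looped = fft2_over_last_axes(b)
direct = np.fft.fftn(b, axes=(1, 2))
print(np.allclose(looped, direct))
```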

Getting ValueError: Internal error in both OS X and CentOS, with anaconda, python 3.6, mkl 1.0.6, numpy 1.15.1.

Doing scipy.signal.fftconvolve:

import numpy as np
from scipy.signal import fftconvolve
d1 = np.random.rand(18768768)
d2 = np.random.rand(15243648)
xcraw = fftconvolve(d1, d2, mode='full')

leads to:

/Users/<user>/anaconda3/envs/py36/lib/python3.6/site-packages/mkl_fft/_numpy_fft.py:1044: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  output = mkl_fft.rfftn_numpy(a, s, axes)
Traceback (most recent call last):
  File "/Users/<user>/anaconda3/envs/py36/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-15-98737a11a4f0>", line 1, in <module>
    xcraw = fftconvolve(d1, d2, mode='full')
  File "/Users/<user>/anaconda3/envs/py36/lib/python3.6/site-packages/scipy/signal/signaltools.py", line 391, in fftconvolve
    sp1 = np.fft.rfftn(in1, fshape)
  File "/Users/<user>/anaconda3/envs/py36/lib/python3.6/site-packages/mkl_fft/_numpy_fft.py", line 1044, in rfftn
    output = mkl_fft.rfftn_numpy(a, s, axes)
  File "mkl_fft/_pydfti.pyx", line 834, in mkl_fft._pydfti.rfftn_numpy
  File "mkl_fft/_pydfti.pyx", line 588, in mkl_fft._pydfti.rfft_numpy
  File "mkl_fft/_pydfti.pyx", line 504, in mkl_fft._pydfti._rc_fft1d_impl
ValueError: Internal error occurred, with status=-1

Numpy config is:

(py36) [kime9@l-1-01 AudVidSync]$ python -c "import numpy; print(numpy.show_config())"
mkl_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/home/kime9/anaconda3/envs/py36/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/kime9/anaconda3/envs/py36/include']
blas_mkl_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/home/kime9/anaconda3/envs/py36/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/kime9/anaconda3/envs/py36/include']
blas_opt_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/home/kime9/anaconda3/envs/py36/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/kime9/anaconda3/envs/py36/include']
lapack_mkl_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/home/kime9/anaconda3/envs/py36/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/kime9/anaconda3/envs/py36/include']
lapack_opt_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/home/kime9/anaconda3/envs/py36/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/home/kime9/anaconda3/envs/py36/include']
None

I get the same error for:
noverlap = fftconvolve(np.ones(d1), np.ones(d2), mode='full')

Taking @oleksandr-pavlyk's advice and doing
np.fft.restore_all()
works around the error.

Is this worth opening a ticket, or am I doing something wrong? Am I describing this in a way that is useful?

Originally posted by @kusanagee in #11 (comment)

setup.py uses numpy.distutils, which has been removed as of numpy 1.25

setup.py uses numpy.distutils, which has been removed as of numpy 1.25.
This prevents python 3.12 builds.

Consider removing cython from install_requires for setup.py used in conda recipe

See AnacondaRecipes/aggregate#53

pip install mkl-fft==1.0.15, version not found

How can I download and install version 1.0.15 of mkl-fft? Did this version ever exist?

non-native endianness ndarrays cause a crash, instead of silent conversion to native endianness

This is essentially the same issue as cupy/cupy#4288.

A basic example:

from astropy.io import fits
from mkl_fft import _scipy_fft_backend as mklfft

a = fits.getdata(any_fits_file)
mklfft.fft2(a)

Produces the exception:

~/miniconda3/envs/prysm/lib/python3.8/site-packages/mkl_fft/_scipy_fft_backend.py in fft2(a, s, axes, norm, overwrite_x, workers)
    187     x = _float_utils.__upcast_float16_array(a)
    188     with Workers(workers):
--> 189         output = _pydfti.fftn(x, shape=s, axes=axes, overwrite_x=overwrite_x)
    190     if _unitary(norm):
    191         factor = 1

mkl_fft/_pydfti.pyx in mkl_fft._pydfti.fftn()

mkl_fft/_pydfti.pyx in mkl_fft._pydfti._fftnd_impl()

mkl_fft/_pydfti.pyx in mkl_fft._pydfti.iter_complementary()

<__array_function__ internals> in copyto(*args, **kwargs)

TypeError: Cannot cast array data from dtype('complex128') to dtype('>f8') according to the rule 'same_kind'

mkl_fft version 1.3.0
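
Until the conversion happens inside the library, the caller can convert to native byte order before the transform. The sketch below shows one way to do that with NumPy alone. `to_native` is a hypothetical helper, and it has not been tested against mkl_fft itself.

```python
import numpy as np

def to_native(a):
    """Return `a` in native byte order, copying only when a conversion is needed."""
    a = np.asarray(a)
    if a.dtype.isnative:
        return a
    return a.astype(a.dtype.newbyteorder("="))

big_endian = np.arange(12, dtype=">f8").reshape(3, 4)   # FITS data is stored big-endian
native = to_native(big_endian)
print(native.dtype.isnative, np.array_equal(native, big_endian))
```

The values are unchanged; only the in-memory byte order differs, so the result can be passed to any FFT routine that expects native dtypes.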

Add planning stage in mkl_fft

Are there any plans to add a planning stage? It would be very useful for reducing runtime when mkl_fft is used inside a loop.

Provide source code on PyPI

I'm wondering whether it's possible to include the source code on PyPI. As far as I can tell, this project supports python>=3.5 (which may cause problems with the latest numpy versions), and it would be nice to have the source code available on PyPI for cases other than python==3.7 (for example, the specific case where I need mkl-fft==1.2.0 on python==3.6).

Links

FFT2 produces incorrect results for Fortran ordered 3D data

Example

>>> import numpy as np
>>> rng = np.random.RandomState(42)
>>> X_c = rng.rand(1024, 1024, 1).astype(np.complex128)
>>> X_f = X_c.astype(X_c.dtype, order='F')
>>> import mkl_fft
>>> np.abs(mkl_fft.fft2(X_c, axes=(0, 1))).max()
524549.9849164942
>>> np.abs(mkl_fft.fft2(X_f, axes=(0, 1))).max()
/home/rth/.miniconda3/envs/insight-gui/lib/python3.6/site-packages/numpy/core/_methods.py:28: RuntimeWarning: invalid value encountered in reduce
  return umr_maximum(a, axis, None, out, keepdims, initial)
nan
>>> np.abs(mkl_fft.fft2(X_f[:, :, 0], axes=(0, 1))).max()
524549.9849164942

Versions

Python 3.6, Linux, numpy 1.15.4, mkl_fft 1.0.6 (from the default conda channel).

`mkl_fft` available via `conda`?

Hey, thanks a lot for making mkl_fft available!

One quick question: is the package already available on Intel's conda channel? I tried

conda install -c intel mkl_fft

but this failed...

Thanks in advance for your help!

Single Precision FFT for Single Precision Input

Could you check mkl_fft integration into numpy? When using anaconda/numpy with mkl_fft 1.0.1, the FFT returns single precision result for single precision input. But pip/numpy without mkl_fft returns double precision FFT for single precision input. It would be nice if numpy's FFT behaves the same with or without mkl_fft being installed (see ContinuumIO/anaconda-issues#9550 and numpy/numpy#11241 for toy example)
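
Code that has to behave the same in both setups can pin the output precision explicitly instead of relying on the installed FFT backend. A minimal sketch, with `fft_with_dtype` as a hypothetical helper: the dtype that `numpy.fft.fft` returns for float32 input depends on the NumPy build and version, so the cast is what makes the result predictable.

```python
import numpy as np

def fft_with_dtype(x, out_dtype=np.complex64):
    """Run numpy.fft.fft and cast the result to a fixed complex dtype."""
    return np.fft.fft(x).astype(out_dtype, copy=False)

x = np.ones(8, dtype=np.float32)
print(np.fft.fft(x).dtype)       # depends on the NumPy build and version
print(fft_with_dtype(x).dtype)
```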

fftn on multiple arrays faster using python loop

Running mkl_fft.fftn on an array with shape (70, 196, 150, 24) only on the last 3 axes is more than 2 times slower than running the transform on each of the 70 sub-arrays individually using a python loop.

This is unexpected as one would assume the loop should actually run faster inside of mkl.

Simple example:

import numpy as np
import mkl_fft
import time

shape = (70, 196, 150, 24)
array = np.random.random(shape) + 1j * np.random.random(shape)

def transform(array):
    result = np.empty(array.shape, dtype=complex)
    for ii, arr in enumerate(array):
        result[ii] = mkl_fft.fftn(arr)
    return result

t0 = time.time()
a = mkl_fft.fftn(array, axes=(1, 2, 3))
t1 = time.time()
b = transform(array)
t2 = time.time()

print('fftn on full array: {:.0f} ms'.format(1000*(t1 - t0)))
print('loop of fftn on subarray: {:.0f} ms'.format(1000*(t2 - t1)))
print(np.allclose(a, b))

On my machine this returns:

fftn on full array: 1359 ms
loop of fftn on subarray: 619 ms
True

I also verified the timings with timeit instead of time.time.

ValueError: Internal error occurred: b'Intel MKL DFTI ERROR: Inconsistent configuration parameters' for certain shapes in _numpy_fft.rfft

For certain large 1d rffts, I get the following error. Tested this in anaconda python 3.7 and 3.8 on linux. If the long/fft axis is 1 instead of 0, I don't get this error.

ValueError: Internal error occurred: b'Intel MKL DFTI ERROR: Inconsistent configuration parameters'

This example generates the error shown below:

import numpy as np
from mkl_fft._numpy_fft import rfft


X = np.random.randn(18134053, 1)
rfft(X, axis=0) # ok
print(18134052, 1)

X = np.random.randn(18134053, 16)
rfft(X, axis=0) # generates error
print(18134052, 16)

18134052 1
Traceback (most recent call last):
  File "mkl_test.py", line 10, in <module>
    rfft(X, axis=0)
  File "/home/jesse/anaconda3/envs/mkl_test/lib/python3.7/site-packages/mkl_fft/_numpy_fft.py", line 414, in rfft
    (x,), {'n': n, 'axis': axis})
  File "/home/jesse/anaconda3/envs/mkl_test/lib/python3.7/site-packages/mkl_fft/_numpy_fft.py", line 103, in trycall
    raise ve
  File "/home/jesse/anaconda3/envs/mkl_test/lib/python3.7/site-packages/mkl_fft/_numpy_fft.py", line 98, in trycall
    res = func(*args, **kwrds)
  File "mkl_fft/_pydfti.pyx", line 792, in mkl_fft._pydfti.rfft_numpy
  File "mkl_fft/_pydfti.pyx", line 701, in mkl_fft._pydfti._rc_fft1d_impl
ValueError: Internal error occurred: b'Intel MKL DFTI ERROR: Inconsistent configuration parameters'

This is in a basic conda environment with only numpy added.

$ conda list
# packages in environment at /home/jesse/anaconda3/envs/mkl_test:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
blas                      1.0                         mkl  
ca-certificates           2021.1.19            h06a4308_0  
certifi                   2020.12.5        py37h06a4308_0  
intel-openmp              2020.2                      254  
ld_impl_linux-64          2.33.1               h53a641e_7  
libedit                   3.1.20191231         h14c3975_1  
libffi                    3.3                  he6710b0_2  
libgcc-ng                 9.1.0                hdf63c60_0  
libstdcxx-ng              9.1.0                hdf63c60_0  
mkl                       2020.2                      256  
mkl-service               2.3.0            py37he8ac12f_0  
mkl_fft                   1.3.0            py37h54f3939_0  
mkl_random                1.1.1            py37h0573a6f_0  
ncurses                   6.2                  he6710b0_1  
numpy                     1.19.2           py37h54aff64_0  
numpy-base                1.19.2           py37hfa32c7d_0  
openssl                   1.1.1j               h27cfd23_0  
pip                       21.0.1           py37h06a4308_0  
python                    3.7.10               hdb3f193_0  
readline                  8.1                  h27cfd23_0  
setuptools                52.0.0           py37h06a4308_0  
six                       1.15.0           py37h06a4308_0  
sqlite                    3.33.0               h62c20be_0  
tk                        8.6.10               hbc83047_0  
wheel                     0.36.2             pyhd3eb1b0_0  
xz                        5.2.5                h7b6447c_0  
zlib                      1.2.11               h7b6447c_3

and

mkl.get_version()
# {'MajorVersion': 2020, 'MinorVersion': 0, 'UpdateVersion': 2, 'ProductStatus': b'Product',
# 'Build': b'20200624', 'Processor': b'Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors',
# 'Platform': b'Intel(R) 64 architecture'}

Possibly similar to #24.

cc @hlillemark
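
As a rough observation, the failing (18134053, 16) float64 array occupies 18134053 × 16 × 8 = 2,321,158,784 bytes, which is above 2^31 - 1, while a single column is about 145 MB. If the error is related to that size (an assumption, not something confirmed here), one workaround to try is transforming one contiguous column at a time. The sketch uses `numpy.fft.rfft` as a stand-in, and `rfft_by_columns` is a hypothetical helper.

```python
import numpy as np

def rfft_by_columns(X, rfft_func=np.fft.rfft):
    """rfft along axis 0 of a 2D array, one contiguous column at a time."""
    n_rows, n_cols = X.shape
    out = np.empty((n_rows // 2 + 1, n_cols), dtype=np.complex128)
    for j in range(n_cols):
        column = np.ascontiguousarray(X[:, j])   # unit-stride 1D copy
        out[:, j] = rfft_func(column)
    return out

X = np.random.default_rng(0).standard_normal((1001, 16))
print(np.allclose(rfft_by_columns(X), np.fft.rfft(X, axis=0)))
```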

numpy.fft.fft2 raises AssertionError

Steps to reproduce:
conda create -n mkl_fft
conda activate mkl_fft
conda install python=3.8 mkl_fft -c intel
python

>>> import numpy
>>> a = numpy.array([[5, 7, 6, 5], [4, 6, 4, 8], [9, 3, 7, 5]])
>>> b = numpy.fft.fft2(a, None, None, "ortho")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nfs/site/home/denissmi/.conda/envs/mkl_fft/lib/python3.8/site-packages/mkl_fft/_numpy_fft.py", line 1081, in fft2
    return fftn(x, s=s, axes=axes, norm=norm)
  File "/nfs/site/home/denissmi/.conda/envs/mkl_fft/lib/python3.8/site-packages/mkl_fft/_numpy_fft.py", line 859, in fftn
    output = trycall(
  File "/nfs/site/home/denissmi/.conda/envs/mkl_fft/lib/python3.8/site-packages/mkl_fft/_numpy_fft.py", line 98, in trycall
    res = func(*args, **kwrds)
  File "mkl_fft/_pydfti.pyx", line 1105, in mkl_fft._pydfti.fftn
  File "mkl_fft/_pydfti.pyx", line 1083, in mkl_fft._pydfti._fftnd_impl
  File "mkl_fft/_pydfti.pyx", line 930, in mkl_fft._pydfti.iter_complementary
  File "mkl_fft/_pydfti.pyx", line 985, in mkl_fft._pydfti._direct_fftnd
AssertionError
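
For reference, this is what the call is expected to compute with stock NumPy, written with keyword arguments and explicit axes. It is a sketch of the expected result only. Whether the keyword form avoids the AssertionError in the affected mkl_fft build has not been checked.

```python
import numpy as np

a = np.array([[5, 7, 6, 5], [4, 6, 4, 8], [9, 3, 7, 5]], dtype=np.float64)

# For a 2D array this is the same transform as the positional call, with the axes spelled out.
b = np.fft.fft2(a, axes=(-2, -1), norm="ortho")

# With norm="ortho" the transform is unitary, so total energy is preserved.
energy_in = np.sum(np.abs(a) ** 2)
energy_out = np.sum(np.abs(b) ** 2)
print(np.isclose(energy_in, energy_out))
```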

Python 3.6 ImportWarning

I run my CI with PYTHONWARNINGS=error and I am getting the following warning in this package.

  File "C:\ProgramData\Miniconda3\envs\sas-apps-jenkins-3.6-1.11\lib\site-packages\mkl_fft\__init__.py", line 27, in <module>
    from ._pydfti import (fft, ifft, fft2, ifft2, fftn, ifftn, rfft, irfft,
  File "mkl_fft\_pydfti.pyx", line 27, in init mkl_fft._pydfti
ImportWarning: can't resolve package from spec or package, falling back on name and path

I am not sure how to fix it. Please let me know if you need more information.

I have mkl-fft 1.0.0 installed via conda into a Python 3.6.5 environment.

Warning: Mapping new ns to old ns and emulator stopping abruptly

After upgrading to Arctic Fox, I am getting the following warnings. The emulator runs, but it sometimes stops abruptly. What do these warnings mean, and how can I get rid of them?

I am using the following, as copied from cmd:

C:\Users\Debasis>flutter doctor
Doctor summary (to see all details, run flutter doctor -v):
[√] Flutter (Channel stable, 2.2.3, on Microsoft Windows [Version 10.0.19042.1165], locale en-IN)
[√] Android toolchain - develop for Android devices (Android SDK version 31.0.0)
[√] Chrome - develop for the web
[√] Android Studio
[√] Connected device (2 available)

• No issues found!

The error:

Launching lib\main.dart on sdk gphone x86 in debug mode...
Running Gradle task 'assembleDebug'...
Warning: Mapping new ns http://schemas.android.com/repository/android/common/02 to old ns http://schemas.android.com/repository/android/common/01
Warning: Mapping new ns http://schemas.android.com/repository/android/generic/02 to old ns http://schemas.android.com/repository/android/generic/01
Warning: Mapping new ns http://schemas.android.com/sdk/android/repo/addon2/02 to old ns http://schemas.android.com/sdk/android/repo/addon2/01
Warning: Mapping new ns http://schemas.android.com/sdk/android/repo/repository2/02 to old ns http://schemas.android.com/sdk/android/repo/repository2/01
Warning: Mapping new ns http://schemas.android.com/sdk/android/repo/sys-img2/02 to old ns http://schemas.android.com/sdk/android/repo/sys-img2/01
√ Built build\app\outputs\flutter-apk\app-debug.apk.
Installing build\app\outputs\flutter-apk\app.apk...
Debug service listening on ws://127.0.0.1:57467/XzCZTOeqyQs=/ws
Syncing files to device sdk gphone x86...

`irfftn` corrupts input when dimensions >=3

Originally posted as numpy/numpy#10895

Seems to be introduced recently. It doesn't affect fftn, ifftn or rfft, irfft.

Here is the reproducer.

import numpy
from numpy.testing import assert_allclose

def showbug(nd):
    size = [4] * nd
    x = numpy.random.uniform(size=size)
    y = numpy.fft.rfftn(x)
    y1 = y.copy()

    numpy.fft.irfftn(y1)

    assert_allclose(y1, y)

showbug(3)

Traceback (most recent call last):
  File "fftbug.py", line 14, in <module>
    showbug(3)
  File "fftbug.py", line 12, in showbug
    assert_allclose(y1, y)
  File "/home/yfeng1/anaconda3/install/lib/python3.6/site-packages/numpy/testing/utils.py", line 1395, in assert_allclose
    verbose=verbose, header=header, equal_nan=equal_nan)
  File "/home/yfeng1/anaconda3/install/lib/python3.6/site-packages/numpy/testing/utils.py", line 778, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Not equal to tolerance rtol=1e-07, atol=0

(mismatch 100.0%)
 x: array([[[ 1.612765+0.j      , -0.069468+0.278309j, -1.196650+0.j      ],
        [ 1.794408+0.j      ,  0.084526+0.187052j,  0.642339+0.j      ],
        [ 1.536380+0.j      , -0.811544-0.006539j,  0.736002+0.j      ],...
 y: array([[[  3.227387e+01+0.j      ,  -6.852522e-01-0.432786j,
          -2.638166e+00+0.j      ],
        [ -9.017300e-01+2.401103j,   1.732144e+00+0.885346j,...
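
Until the in-place corruption is fixed, one defensive workaround is to hand irfftn a private copy of the spectrum. The following is a minimal sketch that assumes only NumPy's public numpy.fft API; the helper name irfftn_preserving is hypothetical and is not part of NumPy or mkl_fft.

```python
import numpy as np


def irfftn_preserving(y, *args, **kwargs):
    """Call numpy.fft.irfftn on a private copy of y (hypothetical helper)."""
    # np.array(..., copy=True) always allocates a fresh buffer, so even a
    # backend that overwrites its input cannot touch the caller's array.
    return np.fft.irfftn(np.array(y, copy=True), *args, **kwargs)


x = np.random.uniform(size=(4, 4, 4))
y = np.fft.rfftn(x)
y_before = y.copy()

x_back = irfftn_preserving(y)

# The caller's spectrum is untouched, whatever the backend does internally.
np.testing.assert_allclose(y, y_before)
# On a correctly working build the round trip also recovers x.
np.testing.assert_allclose(x_back, x, atol=1e-12)
```

The copy costs one extra allocation per call, so it is a stopgap rather than a fix.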

rfft fails for large data

I can only use rfft successfully up to a size of 2**26. I get the same behaviour for
the in-place and out-of-place variants. I am using Ubuntu 19.10 and Python 3.7, installed
via Anaconda.
Minimal example:

import numpy as np
import mkl_fft


# This works
N = 2**26
data = np.random.randn(N)
data = data_fft = mkl_fft.rfft(data, n=None, axis=-1, overwrite_x=False)
print('Data of size 2**26 was transformed to Fourier space.')

# This fails
N = 2**27
data = np.random.randn(N)
mkl_fft.rfft(data, n=None, axis=-1, overwrite_x=False)
print('Data of size 2**27 was transformed to Fourier space.')

Traceback (most recent call last):
  File "try_intel_mkl_fft.py", line 11, in <module>
    mkl_fft.rfft(data, n=None, axis=-1, overwrite_x=False)
  File "mkl_fft/_pydfti.pyx", line 411, in mkl_fft._pydfti.rfft
  File "mkl_fft/_pydfti.pyx", line 503, in mkl_fft._pydfti._rrfft1d_impl
ValueError: Internal error occurred: b'Intel MKL DFTI ERROR: Inconsistent configuration parameters'

Handling arrays with array.shape[i] == 0

When (accidentally) calling fft on an array that has a dimension of length 0, mkl_fft raises a somewhat cryptic error, whereas numpy.fft.fft (without MKL) returns a sensibly shaped empty array.

with numpy from conda (uses mkl_fft)

import numpy as np
from numpy.fft import fft

fft(np.zeros((10, 0, 5)))
# ValueError: Internal error occurred: b'Intel MKL DFTI ERROR: Inconsistent configuration parameters'

with numpy from pip

import numpy as np
from numpy.fft import fft

fft(np.zeros((10, 0, 5)))
# array([], shape=(10, 0, 5), dtype=complex128)

Conda list

# packages in environment at /home/jesse/anaconda3/envs/fft_test:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
blas                      1.0                         mkl  
ca-certificates           2019.11.27                    0  
certifi                   2019.11.28               py38_0  
intel-openmp              2019.4                      243  
libedit                   3.1.20181209         hc058e9b_0  
libffi                    3.2.1                hd88cf55_4  
libgcc-ng                 9.1.0                hdf63c60_0  
libgfortran-ng            7.3.0                hdf63c60_0  
libstdcxx-ng              9.1.0                hdf63c60_0  
mkl                       2019.4                      243  
mkl-service               2.3.0            py38he904b0f_0  
mkl_fft                   1.0.15           py38ha843d7b_0  
mkl_random                1.1.0            py38h962f231_0  
ncurses                   6.1                  he6710b0_1  
numpy                     1.17.4           py38hc1035e2_0  
numpy-base                1.17.4           py38hde5b4d6_0  
openssl                   1.1.1d               h7b6447c_3  
pip                       19.3.1                   py38_0  
python                    3.8.0                h0371630_2  
readline                  7.0                  h7b6447c_5  
setuptools                44.0.0                   py38_0  
six                       1.13.0                   py38_0  
sqlite                    3.30.1               h7b6447c_0  
tk                        8.6.8                hbc83047_0  
wheel                     0.33.6                   py38_0  
xz                        5.2.4                h14c3975_4  
zlib                      1.2.11               h7b6447c_3

Similar things happen for rfft, fft2, etc.
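
Until the library handles this case itself, a caller can guard against empty inputs before the transform is invoked. The sketch below is an assumption-laden illustration: fft_allow_empty is a hypothetical helper, numpy.fft.fft stands in for whichever FFT function is actually in use, and the complex128 result dtype mirrors the pip output shown above.

```python
import numpy as np


def fft_allow_empty(x, axis=-1, fft_func=np.fft.fft):
    """Hypothetical guard: return an empty result instead of calling the
    backend when the input has no elements."""
    x = np.asarray(x)
    if x.shape[axis] == 0:
        # A transform of length 0 is not defined; fail with a clear message.
        raise ValueError("cannot transform along an axis of length 0")
    if x.size == 0:
        # Some other axis is empty: there is nothing to compute, so return
        # an empty complex array with the same shape as the input.
        return np.empty(x.shape, dtype=np.complex128)
    return fft_func(x, axis=axis)


result = fft_allow_empty(np.zeros((10, 0, 5)))
print(result.shape, result.dtype)  # (10, 0, 5) complex128
```

The same pattern could wrap rfft or fft2, although the output shape along the transformed axes would then need to be computed rather than copied from the input.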

ImportError: DLL load failed while importing _pydfti

Hi,

I installed the mkl_fft library via pip. When I run my code, I get the following error:

ImportError: DLL load failed while importing _pydfti: The specified module could not be found

Does anyone know what can cause this error?
Thanks in advance!

Will this code move into the Anaconda main release?

Hi, are there plans for mkl_fft to move into the default Anaconda MKL install? At the moment, there are quite a few differences between the conda-forge and Intel conda channels. It would be nice if it were available with the default install.

mkl_fft pypi package crashes on import

Running in gdb, I got the following stack trace:

(gdb) run -c 'import mkl_fft'
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/clausen/.virtualenvs/libertem/bin/python -c 'import mkl_fft'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0x7ffff6039010) at malloc.c:3093
3093	malloc.c: No such file or directory.
(gdb) bt
#0  __GI___libc_free (mem=0x7ffff6039010) at malloc.c:3093
#1  0x00007ffff7fdc430 in open_verify (name=name@entry=0x7fffffffa7b0 "/home/clausen/.virtualenvs/libertem/lib/python3.6/site-packages/mkl_fft/../../../libsvml.so", fbp=fbp@entry=0x7fffffffaa00, 
    loader=loader@entry=0xac9750, whatcode=whatcode@entry=4, mode=mode@entry=-2147483648, found_other_class=found_other_class@entry=0x7fffffffa9ef, free_name=false, fd=3) at dl-load.c:1977
#2  0x00007ffff7fdc768 in open_path (name=name@entry=0x7ffff71efe42 "libsvml.so", namelen=namelen@entry=11, mode=mode@entry=-2147483648, sps=sps@entry=0xac9a70, 
    realname=realname@entry=0x7fffffffa9f0, fbp=fbp@entry=0x7fffffffaa00, loader=0xac9750, whatcode=4, found_other_class=0x7fffffffa9ef) at dl-load.c:2058
#3  0x00007ffff7fddfbb in _dl_map_object (loader=0xac9750, name=0x7ffff71efe42 "libsvml.so", type=2, trace_mode=0, mode=<optimized out>, nsid=<optimized out>) at dl-load.c:2274
#4  0x00007ffff7fe22d2 in openaux (a=a@entry=0x7fffffffb020) at dl-deps.c:64
#5  0x00007ffff7b0dfcf in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:196
#6  0x00007ffff7fe2646 in _dl_map_object_deps (map=map@entry=0xac9750, preloads=preloads@entry=0x0, npreloads=npreloads@entry=0, trace_mode=trace_mode@entry=0, open_mode=open_mode@entry=-2147483648)
    at dl-deps.c:248
#7  0x00007ffff7fe80a0 in dl_open_worker (a=a@entry=0x7fffffffb660) at dl-open.c:271
#8  0x00007ffff7b0dfcf in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:196
#9  0x00007ffff7fe7c0a in _dl_open (file=0x7ffff7622cb0 "/home/clausen/.virtualenvs/libertem/lib/python3.6/site-packages/mkl_fft/_pydfti.cpython-36m-x86_64-linux-gnu.so", mode=-2147483646, 
    caller_dlopen=0x5f8f4d <_PyImport_FindSharedFuncptr+109>, nsid=<optimized out>, argc=3, argv=0x7fffffffdf08, env=0x7fffffffdf28) at dl-open.c:599
#10 0x00007ffff7f80256 in dlopen_doit (a=a@entry=0x7fffffffb880) at dlopen.c:66
#11 0x00007ffff7b0dfcf in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffb820, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:196
#12 0x00007ffff7b0e05f in __GI__dl_catch_error (objname=0xaebec0, errstring=0xaebec8, mallocedp=0xaebeb8, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:215
#13 0x00007ffff7f80975 in _dlerror_run (operate=operate@entry=0x7ffff7f80200 <dlopen_doit>, args=args@entry=0x7fffffffb880) at dlerror.c:163
#14 0x00007ffff7f802e6 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#15 0x00000000005f8f4d in _PyImport_FindSharedFuncptr ()
#16 0x00000000005fbbe8 in _PyImport_LoadDynamicModuleWithSpec ()
#17 0x00000000005fbea8 in ?? ()
#18 0x000000000056edce in PyCFunction_Call ()
#19 0x000000000051504c in _PyEval_EvalFrameDefault ()
#20 0x000000000050dee7 in ?? ()
#21 0x000000000050b850 in ?? ()
#22 0x000000000050c24d in ?? ()
#23 0x000000000050fad9 in _PyEval_EvalFrameDefault ()
#24 0x000000000050b519 in ?? ()
#25 0x000000000050c24d in ?? ()
#26 0x000000000050fad9 in _PyEval_EvalFrameDefault ()
#27 0x000000000050b519 in ?? ()
#28 0x000000000050c24d in ?? ()
#29 0x000000000050fad9 in _PyEval_EvalFrameDefault ()
#30 0x000000000050b519 in ?? ()
#31 0x000000000050c24d in ?? ()
#32 0x000000000050fad9 in _PyEval_EvalFrameDefault ()
#33 0x000000000050b519 in ?? ()
#34 0x000000000050c24d in ?? ()
#35 0x000000000050fad9 in _PyEval_EvalFrameDefault ()
#36 0x000000000050ac55 in _PyFunction_FastCallDict ()
#37 0x00000000005ab891 in _PyObject_FastCallDict ()
#38 0x000000000059ecae in _PyObject_CallMethodIdObjArgs ()
#39 0x00000000004f815d in PyImport_ImportModuleLevelObject ()
#40 0x000000000051195f in _PyEval_EvalFrameDefault ()
#41 0x000000000050dee7 in ?? ()
#42 0x000000000051b198 in ?? ()
#43 0x000000000056edce in PyCFunction_Call ()
#44 0x000000000051504c in _PyEval_EvalFrameDefault ()
#45 0x000000000050dee7 in ?? ()
#46 0x000000000050b850 in ?? ()
#47 0x000000000050c24d in ?? ()
#48 0x000000000050fad9 in _PyEval_EvalFrameDefault ()
#49 0x000000000050b519 in ?? ()
#50 0x000000000050c24d in ?? ()
#51 0x000000000050fad9 in _PyEval_EvalFrameDefault ()
#52 0x000000000050b519 in ?? ()
#53 0x000000000050c24d in ?? ()
#54 0x000000000050fad9 in _PyEval_EvalFrameDefault ()
#55 0x000000000050b519 in ?? ()
#56 0x000000000050c24d in ?? ()
#57 0x000000000050fad9 in _PyEval_EvalFrameDefault ()
#58 0x000000000050ac55 in _PyFunction_FastCallDict ()
#59 0x00000000005ab891 in _PyObject_FastCallDict ()
#60 0x000000000059ecae in _PyObject_CallMethodIdObjArgs ()
#61 0x00000000004f815d in PyImport_ImportModuleLevelObject ()
#62 0x000000000051195f in _PyEval_EvalFrameDefault ()
#63 0x000000000050dee7 in ?? ()
#64 0x0000000000635c7f in PyRun_StringFlags ()
#65 0x00000000006390dd in PyRun_SimpleStringFlags ()
#66 0x0000000000639bac in Py_Main ()
#67 0x00000000004b2760 in main ()

Installed versions:

$ pip freeze | grep mkl
mkl==2019.0
mkl-fft==1.0.6
mkl-random==1.0.1.1
$ python --version
Python 3.6.8rc1

To reproduce:

$ python3.6 -m venv venv1
$ . venv1/bin/activate
$ pip install mkl_fft
$ python -c 'import mkl_fft'

I tried to reproduce this with Python 3.7, but there doesn't seem to be an mkl_fft package for that. (By the way, Debian unstable is about to remove Python 3.6 in favor of 3.7, so it might be a good idea to publish a 3.7-compatible package soon, but that is orthogonal to this issue.)

Provide Python 3.7 wheels on PyPI

It would be handy if binary wheels were provided on PyPI:

https://pypi.org/project/mkl-fft/#history

I understand that mkl-fft is available through conda, but that can become a bit of a dependency-matching mess to deal with. Installation through PyPI is a lot cleaner and easier to manage, in my experience.

Bad performance

Hi all, I have compared torch.fft.fftn and mkl_fft.fftn. The timings below were measured in Python:

input size: 1,3,2160, 3840
axes = (-2, -1)
OMP_NUM_THREADS=10
mkl_fft.fftn cost: 0:00:05.664933
torch.fft.fftn cost: 0:00:00.404621

import torch
import numpy as np
from mkl_fft import fftn, fft2

import datetime

def numpy_fft(x):
    for i in range(10):
        y = fftn(x, axes=(-2,-1))
    return y

def torch_fft(x):
    for i in range(10):
        y = torch.fft.fftn(x, dim=(-2,-1))
    
    return y

data = np.random.uniform(0, 10, (1,3,2160, 3840))

torch_data = torch.from_numpy(data)

s = datetime.datetime.now()
y1 = numpy_fft(data)
e = datetime.datetime.now()


y2 = torch_fft(torch_data)
k = datetime.datetime.now()

# compare magnitudes: the difference is complex, so take abs() before max()
print(np.max(np.abs(y2.numpy() - y1)))
print(e-s, k-e)

Could anyone explain why mkl_fft is so much slower than torch.fft (about 14x in this run)?
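
The measurement above times ten calls per library in a single pass, so any one-time setup cost in the first call is included in the total. A more conservative way to compare is sketched below, assuming only the standard library and NumPy. The benchmark helper is hypothetical, numpy.fft.fftn stands in for the function under test, and the array is deliberately much smaller than the one in the report.

```python
import time

import numpy as np


def benchmark(func, x, repeats=5, warmup=1):
    """Return the best wall-clock time in seconds of `repeats` calls to
    func(x), after `warmup` untimed calls (hypothetical helper)."""
    for _ in range(warmup):
        func(x)  # untimed: absorbs any one-time setup cost
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(x)
        timings.append(time.perf_counter() - start)
    return min(timings)


# Small stand-in input; the report used shape (1, 3, 2160, 3840).
x = np.random.uniform(0, 10, (1, 3, 64, 128))
best = benchmark(lambda a: np.fft.fftn(a, axes=(-2, -1)), x)
print(f"best of 5: {best:.6f} s")
```

Passing mkl_fft.fftn and torch.fft.fftn through the same helper, with the same dtype and thread settings, would make the two numbers directly comparable.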

Array size and scipy.sosfiltfilt()

Hello, today I experienced an issue while using mkl_fft on one-dimensional numpy arrays. The following code results in an error:

import numpy as np
from scipy.signal import sosfilt, sosfiltfilt, butter
from mkl_fft import fft

a = np.random.rand(100_000_000)
fft(a) # ok
print("ok 1")

a = sosfilt(butter(4, 0.1, output="sos"), a)
fft(a) # ok
print("ok 2")

a = sosfiltfilt(butter(4, 0.1, output="sos"), a)
fft(a.copy())  # ok
print("ok 3")

fft(a) # ERROR
print("ok 4")

While the first three FFTs execute normally, I get an error in the last step:

File "mkl_fft_pydfti.pyx", line 155, in mkl_fft._pydfti.fft
File "mkl_fft_pydfti.pyx", line 401, in mkl_fft._pydfti._fft1d_impl
ValueError: Internal error occurred: b'Intel MKL DFTI ERROR: Inconsistent configuration parameters'

I know from other issues that mkl has problems with array lengths above 2^24 (ca. 16.7M). However, I'm surprised that the first examples execute normally while in all my tests applying fft() to the output of scipy.signal.sosfiltfilt() fails if the array is longer than 2^24 while sosfilt() works fine. I can solve the problem by using a.copy() but I smell something fishy here. Even if this might not be easily solvable I'd still be glad if someone could explain to me why this happens in the first place. Thanks!

I'm on Win11, Anaconda python 3.10, mkl 1.3.1, scipy 1.10.0
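
Since a.copy() makes the problem disappear, one thing worth checking is whether the failing array is a view with unusual strides rather than a plain contiguous buffer. The sketch below only illustrates that diagnostic and is not a confirmed explanation. It assumes NumPy alone and uses a reversed view as a stand-in; whether the sosfiltfilt output has such a layout is an assumption to verify by printing a.flags and a.strides on the real array.

```python
import numpy as np

a = np.random.rand(1_000)

# Stand-in for a filter output that is returned as a reversed view.
reversed_view = a[::-1]

# A reversed 1-D view is not C-contiguous and has a negative stride.
print(reversed_view.flags.c_contiguous, reversed_view.strides)

# np.ascontiguousarray copies only when needed and returns a buffer with
# ordinary positive strides, which has the same effect as a.copy() here.
plain = np.ascontiguousarray(reversed_view)
print(plain.flags.c_contiguous, plain.strides)
```

If the real array turns out to be non-contiguous, that would explain why the copy helps, and it would narrow the bug down to how strided input is passed to MKL.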

Pip version older than conda version

See conda: https://anaconda.org/conda-forge/mkl_fft
Pypi: https://pypi.org/project/mkl-fft/

Additional features in MKL_FFT's API

Thanks again for making mkl_fft available!

Looking at the API, I was wondering if there were plans to expand the API, in order to try to optimize performance, especially:

Are there plans to set the number of threads via the API, or is this set anyway via the environment variable MKL_NUM_THREADS?
Would it be possible to add an argument out, so that the user can pass a pre-allocated numpy array to store the results? (I would expect this to lead to better performance in some cases, since there would be no allocation of memory when calling fft in this case.)
Are there plans to add an alternative lower-level Python API (much like the C API), where users can create and commit a descriptor, and then call the DFTI routines (potentially many times), before finally destroying the descriptor? (Again, I would expect this to lead to better performance; is that correct?)
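
For the first question, the environment-variable route can be sketched as follows. This assumes that MKL picks up MKL_NUM_THREADS when the library is loaded, so the variable is set before NumPy is imported; the thread count of 4 is an arbitrary example, and numpy.fft.fft stands in for the mkl_fft call.

```python
import os

# Set the variable before the first import that may load MKL.
# setdefault keeps any value already chosen by the user or the job scheduler.
os.environ.setdefault("MKL_NUM_THREADS", "4")

import numpy as np

x = np.random.rand(1024)
spectrum = np.fft.fft(x)
print(os.environ["MKL_NUM_THREADS"], spectrum.shape)
```

The separate mkl-service package also exposes mkl.set_num_threads for changing the setting at run time; whether mkl_fft should accept a thread count directly is the open question here.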

Segmentation fault with numpy.fft.rfft and MKL and threads

When using numpy.fft.rfft with mkl_fft in a ThreadPoolExecutor, you can run into a segmentation fault.

Why is this a cross-post?
Since this issue is related to Anaconda, Numpy and mkl_fft, this issue is posted on all three locations. Currently, it is not quite clear which party should address the issue.

mkl_fft: No response yet.
numpy: Sees issue at mkl_fft but uses a seemingly buggy library (namely mkl_fft).
anaconda: No response yet.

We get the same error when testing this on different infrastructure and operating systems:

Scientific Linux 7.2 (Nitrogen)
Ubuntu 16.04.4 LTS
OSX

import numpy as np
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor


def fun(_):
    print(_)
    frames = np.random.randint(1000, 2000)
    signal = np.ones((2, frames), dtype=np.float64)

    _ = np.fft.rfft(signal)

    return 1,


if __name__ == '__main__':
    # print('ProcessPoolExecutor')
    # with ProcessPoolExecutor(4) as ex:
    #     list(ex.map(fun, range(50)))

    print('ThreadPoolExecutor')
    with ThreadPoolExecutor(4) as ex:
        list(ex.map(fun, range(500)))

Running the example code repeatedly sometimes yields a traceback instead of a segfault.

Sometimes occurring traceback

Traceback (most recent call last):
  File "test_script.py", line 35, in <module>
    list(ex.map(fun, range(50)))
  File "/net/home/pyadmin/conda/lib/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/net/home/pyadmin/conda/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/net/home/pyadmin/conda/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/net/home/pyadmin/conda/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "test_script.py", line 11, in fun
    _ = np.fft.rfft(signal)
  File "/net/home/pyadmin/conda/lib/python3.6/site-packages/mkl_fft/_numpy_fft.py", line 331, in rfft
    output = mkl_fft.rfft_numpy(a, n=n, axis=axis)
  File "mkl_fft/_pydfti.pyx", line 569, in mkl_fft._pydfti.rfft_numpy
  File "mkl_fft/_pydfti.pyx", line 487, in mkl_fft._pydfti._rc_fft1d_impl
ValueError: Internal error occurred

Non-working environment:

Conda-installed numpy version: 1.14.3

Numpy config

python -c "import numpy; print(numpy.show_config())"
mkl_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/Users/lukas/anaconda3/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/Users/lukas/anaconda3/include']
blas_mkl_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/Users/lukas/anaconda3/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/Users/lukas/anaconda3/include']
blas_opt_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/Users/lukas/anaconda3/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/Users/lukas/anaconda3/include']
lapack_mkl_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/Users/lukas/anaconda3/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/Users/lukas/anaconda3/include']
lapack_opt_info:
    libraries = ['mkl_rt', 'pthread']
    library_dirs = ['/Users/lukas/anaconda3/lib']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    include_dirs = ['/Users/lukas/anaconda3/include']
None

Working Environment:

Pip-installed numpy version: 1.14.5

Numpy config

python -c "import numpy; print(numpy.show_config())"
blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
  NOT AVAILABLE
atlas_3_10_blas_threads_info:
  NOT AVAILABLE
atlas_3_10_blas_info:
  NOT AVAILABLE
atlas_blas_threads_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
blas_opt_info:
    extra_compile_args = ['-msse3', '-I/System/Library/Frameworks/vecLib.framework/Headers']
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    define_macros = [('NO_ATLAS_INFO', 3), ('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
  NOT AVAILABLE
openblas_clapack_info:
  NOT AVAILABLE
atlas_3_10_threads_info:
  NOT AVAILABLE
atlas_3_10_info:
  NOT AVAILABLE
atlas_threads_info:
  NOT AVAILABLE
atlas_info:
  NOT AVAILABLE
lapack_opt_info:
    extra_compile_args = ['-msse3']
    extra_link_args = ['-Wl,-framework', '-Wl,Accelerate']
    define_macros = [('NO_ATLAS_INFO', 3), ('HAVE_CBLAS', None)]
None
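
While the root cause of the thread-pool crash is being tracked down, one stopgap is to serialize the FFT calls with a lock. This is a minimal sketch that assumes the crash comes from concurrent calls into the FFT backend, which this report does not confirm. The names _fft_lock and locked_rfft are hypothetical, and the lock removes any parallelism inside the FFT step.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

import numpy as np

# One process-wide lock shared by every worker thread.
_fft_lock = threading.Lock()


def locked_rfft(signal):
    """Run numpy.fft.rfft while holding the lock (hypothetical helper)."""
    with _fft_lock:
        return np.fft.rfft(signal)


def fun(i):
    frames = 1000 + i  # deterministic stand-in for the random length
    signal = np.ones((2, frames), dtype=np.float64)
    return locked_rfft(signal).shape


with ThreadPoolExecutor(4) as ex:
    shapes = list(ex.map(fun, range(50)))

print(len(shapes), shapes[0])
```

Work done outside locked_rfft still runs concurrently, so the cost depends on how much of each task is spent in the transform itself.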

Installation Error: Could not find a version that satisfies the requirement

System: macOS High Sierra 10.13.4
Python: 3.7
Pip: 18.0

sudo -H pip3 install mkl-ffl

Error message:

Collecting mkl-ffl
Could not find a version that satisfies the requirement mkl-ffl (from versions: )
No matching distribution found for mkl-ffl

The same error message appears when I try to install mkl-random.

Weird behavior with FFT on array views

This issue was originally posted here: numpy/numpy#11762

After some investigation it seems the problem is with the mkl accelerated libraries. I hope this is the right place to raise the issue.

Reproducing code example:

import numpy as np

np.set_printoptions(precision=2)

# make random data
a = np.random.rand(3, 3) + 1j * np.random.rand(3, 3)

a_view = a[np.newaxis]
# make a new array with one empty axis
a_new = a.reshape(1, 3, 3)

# take ffts  along the last axis of these arrays
b = np.fft.fftn(a, axes=(-1,)).squeeze()
b_view = np.fft.fftn(a_view, axes=(-1,)).squeeze()
b_new = np.fft.fftn(a_new, axes=(-1,)).squeeze()

print("b looks like:")
print(b)
print("b_view looks like:")
print(b_view)
print("b_new looks like:")
print(b_new)

# check that they all are equal
assert np.array_equal(a_new, a_view), "Arrays not equal ..."
assert np.array_equal(b, b_new), "New array failed ..."
assert np.array_equal(b, b_view), "View failed ..."

Error messages:

b looks like:
[[ 2.08+1.42j -0.19+0.05j -0.01+0.6j ]
 [ 2.34+1.96j -0.27-0.16j  0.4 -0.09j]
 [ 2.33+1.79j  0.44-0.62j  0.09+0.38j]]
b_view looks like:
[[ 2.08+1.42j -0.19+0.05j -0.01+0.6j ]
 [ 2.08+1.42j -0.19+0.05j -0.01+0.6j ]
 [ 2.08+1.42j -0.19+0.05j -0.01+0.6j ]]
b_new looks like:
[[ 2.08+1.42j -0.19+0.05j -0.01+0.6j ]
 [ 2.34+1.96j -0.27-0.16j  0.4 -0.09j]
 [ 2.33+1.79j  0.44-0.62j  0.09+0.38j]]
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-1-6af2fa2a5e77> in <module>()
     25 assert np.array_equal(a_new, a_view), "Arrays not equal ..."
     26 assert np.array_equal(b, b_new), "New array failed ..."
---> 27 assert np.array_equal(b, b_view), "View failed ..."

AssertionError: View failed ...
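
Because a_view and a_new hold identical values, one plausible difference between them is their memory layout. The sketch below prints the strides of both arrays and applies a copy as a workaround. It assumes only NumPy, and the idea that strides are what trips up the accelerated path is a hypothesis, not something this report establishes.

```python
import numpy as np

a = np.random.rand(3, 3) + 1j * np.random.rand(3, 3)

a_view = a[np.newaxis]      # inserts a length-1 axis without copying
a_new = a.reshape(1, 3, 3)  # same values, possibly different strides

# Compare how the two arrays describe the same memory.
print("view strides:   ", a_view.strides)
print("reshape strides:", a_new.strides)

# Workaround sketch: .copy() allocates a fresh array with standard C-order
# strides, so the backend no longer sees the layout of the original view.
expected = np.fft.fft(a, axis=-1)
b_view = np.fft.fftn(a_view.copy(), axes=(-1,)).squeeze()
np.testing.assert_allclose(b_view, expected)
```

If the printed strides differ only in the leading length-1 axis, that would point at how the library iterates over batch dimensions.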

Conda environment

# packages in environment at /Users/david/anaconda3:
#
# Name                    Version                   Build  Channel
_license                  1.1                      py36_1  
alabaster                 0.7.11                     py_3    conda-forge
anaconda                  custom           py36ha4fed55_0  
anaconda-client           1.7.1                      py_0    conda-forge
anaconda-navigator        1.8.7                    py36_0  
anaconda-project          0.8.2                      py_1    conda-forge
appdirs                   1.4.3                      py_1    conda-forge
appnope                   0.1.0            py36hf537a9a_0  
appscript                 1.0.1            py36h1de35cc_1  
asn1crypto                0.24.0                     py_1    conda-forge
astroid                   2.0.4                    py36_0  
astropy                   3.0.4            py36h1de35cc_0  
atomicwrites              1.1.5                    py36_0  
attrs                     18.1.0                     py_1    conda-forge
automat                   0.7.0                    py36_0  
babel                     2.6.0                      py_1    conda-forge
backcall                  0.1.0                    py36_0  
backports                 1.0                      py36_1  
backports.functools_lru_cache 1.5                        py_1    conda-forge
backports.shutil_get_terminal_size 1.0.0                      py_3    conda-forge
beautifulsoup4            4.6.3                    py36_0  
bitarray                  0.8.3            py36h1de35cc_0  
bkcharts                  0.2              py36h073222e_0  
blas                      1.0                         mkl  
blaze                     0.11.3                   py36_0  
bleach                    2.1.3                    py36_0  
blinker                   1.4                        py_1    conda-forge
blosc                     1.14.4               hfc679d8_0    conda-forge
bokeh                     0.13.0                   py36_0    conda-forge
boto                      2.49.0                   py36_0  
boto3                     1.7.70                     py_0    conda-forge
botocore                  1.10.71                    py_0    conda-forge
bottleneck                1.2.1            py36h1d22016_1  
bz2file                   0.98                     py36_0  
bzip2                     1.0.6                h1de35cc_5  
ca-certificates           2018.4.16                     0    conda-forge
cairo                     1.14.12              hc4e6be7_4  
certifi                   2018.8.13                py36_0  
cffi                      1.11.5           py36h5e8e0c9_1    conda-forge
chardet                   3.0.4                    py36_3    conda-forge
clangdev                  6.0.0                 default_0    conda-forge
click                     6.7                        py_1    conda-forge
cloudpickle               0.5.3                    py36_0  
clyent                    1.2.2                      py_1    conda-forge
colorama                  0.3.9                      py_1    conda-forge
colorspacious             1.1.2                     <pip>
conda                     4.5.10                   py36_0  
conda-env                 2.6.0                         1  
constantly                15.1.0           py36h28b3542_0  
contextlib2               0.5.5                      py_2    conda-forge
cryptography              2.3              py36hdbc3d79_0  
cryptography-vectors      2.3                      py36_1    conda-forge
curl                      7.61.0               ha441bb4_0  
cycler                    0.10.0                     py_1    conda-forge
cython                    0.28.5           py36h0a44026_0  
cytoolz                   0.9.0.1          py36h1de35cc_1  
dask                      0.18.2                   py36_0  
dask-core                 0.18.2                   py36_0  
datashape                 0.5.4                    py36_1  
dbus                      1.13.2               h760590f_1  
decorator                 4.3.0                    py36_0  
distributed               1.22.1                   py36_0  
docutils                  0.14             py36hbfde631_0  
entrypoints               0.2.3                    py36_2  
et_xmlfile                1.0.1            py36h1315bdc_0  
expat                     2.2.5                hfc679d8_1    conda-forge
fastcache                 1.0.2            py36h1de35cc_2  
ffmpeg                    4.0                  h01ea3c9_0  
filelock                  3.0.4                      py_1    conda-forge
flask                     1.0.2                    py36_1  
flask-cors                3.0.6                      py_0    conda-forge
fontconfig                2.13.0               h5d5b041_1  
freetype                  2.9.1                hb4e5f40_0  
fribidi                   1.0.4                h1de35cc_0  
gensim                    3.4.0            py36h1de35cc_0  
get_terminal_size         1.0.0                h7520d66_0  
gettext                   0.19.8.1             h15daf44_3  
gevent                    1.3.5            py36h1de35cc_0  
giflib                    5.1.4                h1de35cc_1  
glib                      2.56.1               h35bc53a_0  
glob2                     0.6                      py36_0  
gmp                       6.1.2                hb37e062_1  
gmpy2                     2.0.8            py36h6ef4df4_2  
gnutls                    3.5.19               h2a4e5f8_1    conda-forge
graphite2                 1.3.11               h2098e52_2  
graphviz                  2.40.1               hefbbd9a_2  
greenlet                  0.4.14           py36h1de35cc_0  
h5py                      2.8.0            py36hb794570_1    conda-forge
harfbuzz                  1.8.4                hb8d4a28_0  
hdbscan                   0.8.15           py36h7eb728f_0    conda-forge
hdf5                      1.10.2               hc401514_1    conda-forge
heapdict                  1.0.0                    py36_2  
html5lib                  1.0.1                    py36_0  
hyperlink                 18.0.0                   py36_0  
icu                       58.2                 h4b95b61_1  
idna                      2.7                      py36_2    conda-forge
imageio                   2.3.0                      py_1    conda-forge
imagesize                 1.0.0                      py_1    conda-forge
incremental               17.5.0                   py36_0  
intel-openmp              2018.0.3                      0  
ipykernel                 4.8.2                    py36_0  
ipython                   6.5.0                    py36_0  
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                7.4.0                    py36_0  
isort                     4.3.4                    py36_0  
itsdangerous              0.24                       py_2    conda-forge
jasper                    2.0.14               h636a363_1  
jbig                      2.1                  h4d881f8_0  
jdcal                     1.4                        py_1    conda-forge
jedi                      0.12.1                   py36_0    conda-forge
jinja2                    2.10                       py_1    conda-forge
jmespath                  0.9.3                      py_1    conda-forge
jpeg                      9c                   h470a237_0    conda-forge
jsonschema                2.6.0                    py36_1    conda-forge
jupyter                   1.0.0                    py36_4  
jupyter_client            5.2.3                      py_1    conda-forge
jupyter_console           5.2.0                    py36_1  
jupyter_core              4.4.0                    py36_0  
jupyterlab                0.33.11                  py36_0  
jupyterlab_launcher       0.11.2                   py36_0  
keyring                   13.2.1                   py36_0  
kiwisolver                1.0.1                    py36_1    conda-forge
krb5                      1.16.1               h24a3359_6  
lazy-object-proxy         1.3.1            py36h1de35cc_2  
libcurl                   7.61.0               hf30b1f0_0  
libcxx                    6.0.0                         0    conda-forge
libcxxabi                 4.0.1                hebd6815_0  
libedit                   3.1.20170329         hb402a30_2  
libffi                    3.2.1                h475c297_4  
libgfortran               3.0.1                h93005f0_2  
libiconv                  1.15                 hdd342a3_7  
libidn11                  1.33                 hf837533_0    conda-forge
libopenblas               0.2.20               hdc02c5d_7  
libopus                   1.2.1                h169cedb_0  
libpng                    1.6.34               ha92aebf_1    conda-forge
libprotobuf               3.6.0                hd9629dc_0  
libsodium                 1.0.16               h3efe00b_0  
libssh2                   1.8.0                h322a93b_4  
libtiff                   4.0.9                he6b73bb_1    conda-forge
libvpx                    1.7.0                h378b8a2_0  
libwebp                   0.5.2                         7    conda-forge
libxml2                   2.9.8                h422b904_2    conda-forge
libxslt                   1.1.32               h88dbc4e_1    conda-forge
line_profiler             2.1.2            py36h470a237_1    conda-forge
llvm-meta                 6.0.0                         0    conda-forge
llvmdev                   6.0.0                h137f3e6_4  
llvmlite                  0.24.0           py36hc454e04_0  
locket                    0.2.0                      py_2    conda-forge
lxml                      4.2.4            py36hef8c89e_0  
lzo                       2.10                 h362108e_2  
markupsafe                1.0              py36h1de35cc_1  
matplotlib                2.2.3            py36h54f8f79_0  
mccabe                    0.6.1                      py_1    conda-forge
mistune                   0.8.3            py36h470a237_2    conda-forge
mkl                       2018.0.3                      1  
mkl-service               1.1.2            py36h6b9c3cc_4  
mkl_fft                   1.0.5                    py36_0    conda-forge
mkl_random                1.0.1            py36h5d10147_1  
more-itertools            4.3.0                    py36_0  
mpc                       1.1.0                h6ef4df4_1  
mpfr                      4.0.1                h3018a27_3  
mpmath                    1.0.0                    py36_2  
mrcfile                   1.0.4                     <pip>
msgpack-python            0.5.6            py36h2d50403_2    conda-forge
multipledispatch          0.6.0                    py36_0  
navigator-updater         0.2.1                    py36_0  
nbconvert                 5.3.1                      py_1    conda-forge
nbformat                  4.4.0                      py_1    conda-forge
ncurses                   6.1                  hfc679d8_1    conda-forge
nettle                    3.3                           0    conda-forge
networkx                  2.1                        py_1    conda-forge
nltk                      3.3.0                    py36_0  
nodejs                    9.11.1                        0    conda-forge
nose                      1.3.7                    py36_2  
notebook                  5.6.0                    py36_0  
numba                     0.39.0           py36h6440ff4_0  
numexpr                   2.6.7            py36h4f467ca_0  
numpy                     1.15.0           py36h648b28d_0  
numpy-base                1.15.0           py36h8a80b8c_0  
numpydoc                  0.8.0                      py_1    conda-forge
oauthlib                  2.1.0                      py_0    conda-forge
odo                       0.5.1                      py_1    conda-forge
olefile                   0.45.1                     py_1    conda-forge
openblas                  0.2.20                        8    conda-forge
opencv                    3.4.1            py36h6fd60c2_1  
openh264                  1.7.0                         0    conda-forge
openpyxl                  2.5.5                    py36_0  
openssl                   1.0.2p               h1de35cc_0  
packaging                 17.1                     py36_0  
pandas                    0.23.4           py36h6440ff4_0  
pandoc                    2.2.2                hde52d81_1    conda-forge
pandocfilters             1.4.2                      py_1    conda-forge
pango                     1.42.2               h060686c_0  
parso                     0.3.1                      py_0    conda-forge
partd                     0.3.8                      py_1    conda-forge
path.py                   11.0.1                     py_0    conda-forge
pathlib2                  2.3.2                    py36_0  
patsy                     0.5.0                      py_1    conda-forge
pcre                      8.42                 h378b8a2_0  
pep8                      1.7.1                    py36_0  
pexpect                   4.6.0                    py36_0  
pickleshare               0.7.4            py36hf512f8e_0  
pillow                    5.2.0            py36hb68e598_0  
pip                       18.0                     py36_1    conda-forge
pixman                    0.34.0               hca0a616_3  
pkginfo                   1.4.2                      py_1    conda-forge
plotly                    3.1.0                    py36_0  
pluggy                    0.7.1                    py36_0  
ply                       3.11                       py_1    conda-forge
prometheus_client         0.3.1                    py36_0  
prompt_toolkit            1.0.15           py36haeda067_0  
psutil                    5.4.6            py36h1de35cc_0  
ptyprocess                0.6.0                    py36_0  
py                        1.5.4                    py36_0  
pyasn1                    0.4.4                    py36_0  
pyasn1-modules            0.2.2                    py36_0  
pycodestyle               2.4.0                      py_1    conda-forge
pycosat                   0.6.3            py36h470a237_1    conda-forge
pycparser                 2.18                       py_1    conda-forge
pycpd                     0.4                       <pip>
pycrypto                  2.6.1            py36h1de35cc_9  
pycurl                    7.43.0.2         py36hdbc3d79_0  
pyflakes                  2.0.0                    py36_0  
pygments                  2.2.0                      py_1    conda-forge
pyhamcrest                1.9.0                      py_2    conda-forge
pyjwt                     1.6.4                      py_0    conda-forge
pylint                    2.1.1                    py36_0  
pyodbc                    4.0.23           py36hfc679d8_1    conda-forge
pyopenssl                 18.0.0                   py36_0  
pyparsing                 2.2.0                      py_1    conda-forge
pyqt                      5.9.2            py36h655552a_0  
pysocks                   1.6.8                    py36_1    conda-forge
pytables                  3.4.4            py36h247b57e_1    conda-forge
pytest                    3.7.1                    py36_0  
pytest-arraydiff          0.2              py36h39e3cac_0  
pytest-astropy            0.4.0                    py36_0  
pytest-doctestplus        0.1.3                      py_0    conda-forge
pytest-openfiles          0.3.0                      py_0    conda-forge
pytest-remotedata         0.3.0                    py36_0  
python                    3.6.6                h5001a0f_0    conda-forge
python-crfsuite           0.9.6            py36h470a237_0    conda-forge
python-dateutil           2.7.3                      py_0    conda-forge
python-graphviz           0.8.4                    py36_2    conda-forge
python.app                2                        py36_8  
pytz                      2018.5                   py36_0  
pywavelets                0.5.2            py36h7eb728f_2    conda-forge
pyyaml                    3.13             py36h1de35cc_0  
pyzmq                     17.1.2           py36h1de35cc_0  
qt                        5.9.6                h74ce4d9_0  
qtawesome                 0.4.4              pyh8a2030e_1    conda-forge
qtconsole                 4.3.1            py36hd96c0ff_0  
qtpy                      1.4.2              pyh8a2030e_1    conda-forge
readline                  7.0                  hc1231fa_4  
requests                  2.19.1                   py36_1    conda-forge
requests-oauthlib         1.0.0                      py_1    conda-forge
retrying                  1.3.3                    py36_2  
rope                      0.11.0                   py36_0  
ruamel_yaml               0.15.46          py36h1de35cc_0  
s3transfer                0.1.13                   py36_0  
scikit-image              0.14.0           py36h0a44026_1  
scikit-learn              0.19.1           py36hf9f1f73_0  
scipy                     1.1.0            py36hf1f7d93_0  
seaborn                   0.9.0                    py36_0  
send2trash                1.5.0                    py36_0  
service_identity          17.0.0           py36h28b3542_0  
setuptools                40.0.0                   py36_1    conda-forge
simplegeneric             0.8.1                    py36_2  
simpleitk                 1.1.0            py36h0a44026_0    simpleitk
singledispatch            3.4.0.3          py36hf20db9d_0  
sip                       4.19.8           py36h0a44026_0  
six                       1.11.0                   py36_1  
smart_open                1.6.0                      py_1    conda-forge
snowballstemmer           1.2.1                      py_1    conda-forge
sortedcollections         1.0.1                      py_1    conda-forge
sortedcontainers          2.0.4                      py_1    conda-forge
sphinx                    1.7.6                    py36_0  
sphinxcontrib             1.0                      py36_1  
sphinxcontrib-websupport  1.1.0                    py36_1  
spyder                    3.3.1                    py36_1  
spyder-kernels            0.2.6                    py36_0  
sqlalchemy                1.2.10           py36h470a237_1    conda-forge
sqlite                    3.24.0               h2f33b56_0    conda-forge
statsmodels               0.9.0            py36h1d22016_0  
sympy                     1.2                      py36_0  
tblib                     1.3.2                      py_1    conda-forge
terminado                 0.8.1                    py36_1  
testpath                  0.3.1            py36h625a49b_0  
tk                        8.6.8                         0    conda-forge
toolz                     0.9.0                    py36_0  
tornado                   5.1              py36h470a237_1    conda-forge
tqdm                      4.24.0                   py36_0  
traitlets                 4.3.2            py36h65bd3ce_0  
twisted                   17.5.0                   py36_0  
twython                   3.7.0                    py36_0  
typed-ast                 1.1.0            py36h1de35cc_0  
typing                    3.6.4                    py36_0  
unicodecsv                0.14.1                     py_1    conda-forge
unixodbc                  2.3.6                h3efe00b_0  
urllib3                   1.23                     py36_1    conda-forge
viscm                     0.7                       <pip>
wcwidth                   0.1.7                      py_1    conda-forge
webencodings              0.5.1                    py36_1  
werkzeug                  0.14.1                   py36_0  
wheel                     0.31.1                   py36_1    conda-forge
widgetsnbextension        3.4.0                    py36_0  
wrapt                     1.10.11          py36h1de35cc_2  
x264                      1!152.20180717       h470a237_0    conda-forge
xlrd                      1.1.0                      py_2    conda-forge
xlsxwriter                1.0.5                    py36_0  
xlwings                   0.11.8                   py36_0  
xlwt                      1.3.0                      py_1    conda-forge
xz                        5.2.4                h1de35cc_4  
yaml                      0.1.7                hc338f04_2  
zeromq                    4.2.5                hfc679d8_5    conda-forge
zict                      0.1.3                    py36_0  
zlib                      1.2.11               h470a237_3    conda-forge
zope.interface            4.5.0            py36h470a237_0    conda-forge

repeated indices in axes keyword for N-dimensional FFT are ignored

Repeated indices in the axes keyword are ignored, whereas the transform over a repeated axis should be performed once for each time the axis appears.
Result from stock NumPy

#  Name  Version   Build              Channel
# ──────────────────────────────────────────────
#  numpy  1.26.4   py310hb13e2d6_0  conda-forge

>>> import numpy
>>> in_arr = [[5, 4, 6, 3, 7], [-1, -3, -4, -7, 0]]
>>> dtype = numpy.complex64
>>> a_np = numpy.array(in_arr, dtype=dtype)
>>> numpy.fft.fft2(a_np, axes=(0,1))   
# array([[10.        +0.j        ,  8.09016994+2.17962758j,
        -3.09016994+9.23305061j, -3.09016994-9.23305061j,
         8.09016994-2.17962758j],
       [40.        +0.j        , -5.85410197+0.j        ,
         0.85410197+0.j        ,  0.85410197+0.j        ,
        -5.85410197+0.j        ]])   
		
>>> numpy.fft.fft2(a_np, axes=(0,1,1))
# array([[ 20.+0.j,  35.+0.j, -20.+0.j,  10.+0.j,   5.+0.j],
       [ 30.+0.j,  35.+0.j,  50.+0.j,  50.+0.j,  35.+0.j]])

Result from mkl_fft package (NumPy from intel channel)

#  Name        Version   Build              Channel
# ────────────────────────────────────────────────────
#  numpy       1.26.4   py310h689b997_1    intel  
#  numpy-base  1.26.4   py310h8eeea18_1    intel 
#  mkl_fft     1.3.8    py310h6b114c4_70   intel

>>> import numpy
>>> in_arr = [[5, 4, 6, 3, 7], [-1, -3, -4, -7, 0]]
>>> dtype = numpy.complex64
>>> a_np = numpy.array(in_arr, dtype=dtype)
>>> numpy.fft.fft2(a_np, axes=(0,1))   
# array([[10.        +0.j      ,  8.09017   +2.179628j,
        -3.09017   +9.233051j, -3.09017   -9.233051j,
         8.09017   -2.179628j],
       [40.        +0.j      , -5.854102  +0.j      ,
         0.85410213+0.j      ,  0.85410213+0.j      ,
        -5.854102  +0.j      ]], dtype=complex64)

>>> numpy.fft.fft2(a_np, axes=(0,1,1))	
# array([[10.        +0.j      ,  8.09017   +2.179628j,
        -3.09017   +9.233051j, -3.09017   -9.233051j,
         8.09017   -2.179628j],
       [40.        +0.j      , -5.854102  +0.j      ,
         0.85410213+0.j      ,  0.85410213+0.j      ,
        -5.854102  +0.j      ]], dtype=complex64)
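
To make the expected semantics concrete, here is a minimal sketch that applies a 1-D transform once per entry in axes, repeats included, which is consistent with the stock NumPy output above. The helper name fft_over_axes is hypothetical; it is not part of NumPy or mkl_fft, and it only illustrates the behaviour, not how either library implements it.

```python
import numpy as np


def fft_over_axes(a, axes):
    """Apply a 1-D FFT once per entry in `axes`, repeats included.

    Hypothetical helper, for illustration only.
    """
    out = np.asarray(a, dtype=np.complex128)
    for ax in axes:
        out = np.fft.fft(out, axis=ax)
    return out


in_arr = [[5, 4, 6, 3, 7], [-1, -3, -4, -7, 0]]
once = fft_over_axes(in_arr, axes=(0, 1))
twice = fft_over_axes(in_arr, axes=(0, 1, 1))

# Transforming axis 1 a second time changes the result,
# so the two outputs should not be equal.
print(np.allclose(once, twice))
print(twice.real)
```

Applying the forward transform twice along an axis of length N yields N times the index-reversed input along that axis, which is why the values for axes=(0,1,1) in the stock NumPy output are all real.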

numpy version

When I try to install mkl-fft 1.3.1 and mkl-random 1.2.2 on Python 3.9.18, I run into this issue: mkl-fft 1.3.1 depends on numpy<1.23.0 and >=1.22.3, while mkl-random 1.2.2 depends on numpy<1.25.0 and >=1.24.3. The installation therefore fails because the two NumPy version requirements conflict.
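
The conflict can be checked with a small sketch that treats each requirement as a half-open version range and looks for an overlap. The bounds are the ones quoted in the report; the helper names parse and overlap are made up for this illustration and do not reflect how pip or conda resolve dependencies.

```python
def parse(version):
    """Turn '1.22.3' into (1, 22, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# (inclusive lower bound, exclusive upper bound), as quoted in the report
mkl_fft_numpy = (parse("1.22.3"), parse("1.23.0"))
mkl_random_numpy = (parse("1.24.3"), parse("1.25.0"))


def overlap(a, b):
    """Return the shared half-open range of two ranges, or None."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low < high else None


# No NumPy version satisfies both requirements, so this prints None.
print(overlap(mkl_fft_numpy, mkl_random_numpy))
```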

FutureWarning message for non-tuple multidimensional indexing

I started getting a FutureWarning when performing certain rfft calls with mkl_fft via numpy.

Here is a simple snippet that produces the warning:

>>> import numpy as np
>>> from mkl_fft import rfft_numpy
>>> rfft_numpy(np.zeros((2, 1)), n=2)
__main__:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
array([[0.+0.j, 0.+0.j],
       [0.+0.j, 0.+0.j]])

Here's my setup:

Python 3.6.6
numpy 1.15.1
mkl_fft 1.0.4

(via anaconda on macOS)
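
The warning text names the remedy itself: convert the index sequence to a tuple before indexing. The following NumPy-only sketch shows that pattern in general terms; it is an illustration and is not taken from the mkl_fft source.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Build one slice per dimension, then narrow the last axis.
index = [slice(None)] * a.ndim
index[-1] = slice(0, 2)

# Indexing with the list itself is the deprecated form;
# converting it to a tuple is unambiguous.
sub = a[tuple(index)]
print(sub.shape)
```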

versions greater than 1.3.1 downgrade libblas, libcblas and liblapack

mkl_fft.fft and mkl_fft.fftn leaks memory on real inputs

Reported via Intel Developer's Forum post.

Reproducer:

import gc

import numpy

# Repeatedly transform a real-valued array. The result is released on
# every iteration, so memory use should stay flat.
while True:
    test = numpy.fft.fft2(numpy.ones((200, 200)))
    del test
    gc.collect()

Process memory grows without bound over time.
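
A bounded variant of the reproducer can help quantify the growth instead of watching the process by hand. The sketch below assumes a Unix system, since it relies on the standard-library resource module, and the function name peak_rss_growth is hypothetical. With a leak-free FFT backend the reported growth should level off as the iteration count rises.

```python
import gc
import resource  # Unix-only standard-library module

import numpy as np


def peak_rss_growth(iterations):
    """Return the increase in peak resident set size over the loop.

    Units are platform-dependent (kilobytes on Linux, bytes on macOS).
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(iterations):
        result = np.fft.fft2(np.ones((200, 200)))
        del result
        gc.collect()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before


# Compare a short and a longer run. A steady leak should make the
# second number noticeably larger than the first.
print(peak_rss_growth(20))
print(peak_rss_growth(200))
```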

Provide Dask interface to mkl_fft

It would be nice if a simple Dask interface to mkl_fft were provided (much like the current NumPy one). In fact, it could be built right on top of the _numpy_fft module by wrapping its functions with fft_wrap, just as Dask does, except using the ones in _numpy_fft instead.

multiprocessing does not work with mkl_fft

When fft is imported from mkl_fft, the code below runs on a single CPU (two threads, image 1). However, when fft is imported from scipy, the code runs simultaneously on multiple CPUs (N = 15 in my case, image 2).

#from scipy.fft import fft, ifft
from mkl_fft import fft, ifft
import multiprocessing as mp
import numpy as np

def test_fft(ii):
    for jj in range(100000):
        x = ii*np.random.random(2**12)
        y = fft(x)
        z = ifft(y)

if __name__ == '__main__':
    N = mp.cpu_count() - 1
    pool = mp.Pool(N)
    pool.map(test_fft, range(20))
    pool.close()
    pool.join()

Reduce size of dependencies

Thanks for making this package available!

The readme says

mkl_fft [..] is now being released as a stand-alone package

That is true, but it still depends on the mkl package (>200 MB compressed), which is downloaded when installing with either conda or pip.

The large size of the dependencies can be a significant problem. For comparison, pyFFTW wheels take only 2.3 MB. I imagine this package needs only a small part of MKL. Is there a chance self-contained wheels could be made (e.g. by statically linking the relevant parts of MKL)?

Confusing results from FFT

This is a cross-post from numpy/numpy, as advised by the maintainers there (numpy/numpy#20113).

Essentially this: I am computing an FFT of a 1 kHz sine wave recorded at 44100 samples per second, with a sample length of 1024. I expect to see two peaks in the spectrum, at about bins 23 and 1002. Instead I see a strange, 'oscillating' spectrum:

Code to replicate is below, and the source data file (including the results I get) is attached (I had to put the .npz file inside a .zip file because GitHub won't let me upload .npz).

FFT test data.zip

import numpy as np
import matplotlib.pyplot as plt

fs = 44100

myData = np.load("FFT test data.npz")
print(myData['theSignal'])

theSignal = myData['theSignal']          # 1 kHz sine wave recorded at 44100 samples / sec
theFFT = myData['theFFT']                # FFT calculated on my machine using numpy
theAbs = myData['theAbs']                # Absolute value of FFT as calculated on my machine

newFFT = np.fft.fft(theSignal)           # Fresh computation of FFT for comparison
newAbs = np.abs(newFFT)

# Calculate the axis for plotting the signal
xAxis = [(x / fs) for x in range(0, theSignal.size)]         # Create a real-time x axis 
lastX = xAxis[len(xAxis) - 1]
maxY = np.amax(theSignal)
minY = np.amin(theSignal)
xfreq = np.fft.fftfreq(newFFT.size, 1/fs)  # calculates the frequencies in the center of each bin in the output of fft(). 


# Now plot the results
fig,ax = plt.subplots(2)

# Plot the oscilloscope
ax[0].set_xlim([0,lastX])
ax[0].set_ylim([minY, maxY])
ax[0].set_xlabel('Time (s)')
ax[0].set_ylabel('Amplitude')
ax[0].set_title('Oscilloscope')

ax[0].plot(xAxis, theSignal)

# Plot the frequency spectrum
ax[1].set_xlim([np.amin(xfreq), np.amax(xfreq)])
ax[1].set_ylim([np.amin(newAbs), np.amax(newAbs)])
ax[1].set_xlabel('Frequency (Hz)')
ax[1].set_ylabel('Energy')
ax[1].set_title('Spectrum')
ax[1].plot(xfreq, newAbs)

plt.show()
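
As a reference point for what a clean spectrum should look like, the sketch below swaps the attached recording for a synthetic 1 kHz tone with the same sampling rate and length, then locates the peaks. The synthetic signal is an assumption made for illustration. Since 1000 * 1024 / 44100 is roughly 23.2, the strongest positive-frequency bin should be 23, with its conjugate-symmetric partner at 1024 - 23 = 1001.

```python
import numpy as np

fs = 44100      # sampling rate in Hz, as in the report
n = 1024        # number of samples, as in the report
f0 = 1000.0     # tone frequency in Hz

t = np.arange(n) / fs
signal = np.sin(2 * np.pi * f0 * t)   # synthetic stand-in for the recording

spectrum = np.abs(np.fft.fft(signal))
freqs = np.fft.fftfreq(n, 1 / fs)

peak = int(np.argmax(spectrum[: n // 2]))   # strongest positive-frequency bin
mirror = n - peak                           # its conjugate-symmetric partner
print(peak, freqs[peak], mirror, freqs[mirror])
```

If the recorded data produce a spectrum that differs sharply from this shape, comparing the two signals sample by sample may help separate a problem in the data from a problem in the FFT backend.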

	n_threads = _hardware_counts.get_max_threads_count()
	mkl.domain_set_num_threads(n_threads, domain='fft')