hacarus / spm-image
Sparse modeling and Compressive sensing in Python
Fix Travis CI Error
Although the name spm-image sounds like a library for image operations, it can be used for various use cases other than image data, such as trend filtering. To avoid limiting the use cases and to keep "sparse" in the name, change the library name to rarus, which means "sparse" in Latin.
As spm-image 0.0.9 does not support sklearn >= 0.24.0 (cf. #92), we need to release a hotfix that requires scikit-learn>=0.19.0,<0.24.0 for now.
hotfix
branch and update requirements.txt
to require scikit-learn>=0.19.0,<0.24.0
__init__.py
hotfix
to master
hotfix
to develop
Implement Generalized LASSO using ADMM with sklearn interface.
Generalized Lasso provides
You can implement the fit method and reuse the other methods provided by LinearModel, as the LassoLars implementation does:
https://github.com/scikit-learn/scikit-learn/blob/a24c8b464d094d2c468a16ea9f8bf8d42d949f84/sklearn/linear_model/least_angle.py#L499-L847
We should follow the same package structure as sklearn, e.g. spmimage.linear_model.admm.py. Within the module, implement ADMM as the LassoAdmm class.
GeneralizedLassoADMM has a variable named tridiagonal, while FusedLassoADMM and TrendFilteringADMM have one named diagonal. These variables mean the same thing, so the same name should be used.
Implement a matching pursuit mode in the sparse_encode function.
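For reference, plain matching pursuit can be sketched as below. This is only an illustration under stated assumptions: the function name `matching_pursuit` and the parameter `n_iter` are hypothetical and are not the signature of `sparse_encode`, and the dictionary rows are assumed to be unit-norm atoms.

```python
import numpy as np


def matching_pursuit(dictionary, signal, n_iter):
    """Greedy matching pursuit for a single signal (illustrative sketch).

    dictionary: (n_components, n_features) array with unit-norm rows (atoms).
    signal: (n_features,) array.
    Returns a (n_components,) coefficient vector.
    """
    residual = np.asarray(signal, dtype=float).copy()
    coef = np.zeros(dictionary.shape[0])
    for _ in range(n_iter):
        # Correlation of every atom with the current residual.
        correlations = dictionary @ residual
        best = int(np.argmax(np.abs(correlations)))
        if correlations[best] == 0.0:
            break  # the residual is orthogonal to every atom
        # Unlike OMP, MP updates only the coefficient of the selected atom.
        coef[best] += correlations[best]
        residual -= correlations[best] * dictionary[best]
    return coef


D = np.eye(3)                      # trivially normalized atoms
x = np.array([2.0, 0.0, -1.0])
code = matching_pursuit(D, x, n_iter=2)
```

Because the same atom may be selected more than once, `n_iter` bounds the number of greedy steps rather than the number of non-zero coefficients.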
Implement KSVD with sklearn interface. You can implement the fit method and reuse the other methods provided by SparseCodingMixin, like the following class.
We should follow the same package structure as sklearn, e.g. spmimage.decomposition.ksvd.py. Also implement test code to run KSVD.
With sklearn >= 0.24, from spmimage.decomposition import KSVD
yields ModuleNotFoundError
at
Implement a KSVD inpainting example.
Missing values should be ignored when we calculate the reconstruction error in masked KSVD.
The following lines should be fixed.
https://github.com/hacarus/spm-image/blob/development/spmimage/decomposition/ksvd.py#L64
https://github.com/hacarus/spm-image/blob/development/spmimage/decomposition/ksvd.py#L91
In test code, missing values are already ignored in calculation of the reconstruction error.
https://github.com/hacarus/spm-image/blob/development/tests/test_decomposition_ksvd.py#L146
Missing values should be ignored when we calculate Y[x, :] - np.dot(W[x, :], H).
To achieve this, after the sparse code is calculated, the masked positions of Y have to be updated so that Y[masked, :] = np.dot(W[masked, :], H).
This operation should be added at line 71.
https://github.com/hacarus/spm-image/blob/development/spmimage/decomposition/ksvd.py#L69-L71
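The idea can be illustrated with a small NumPy sketch. It assumes an element-wise boolean mask that is True where Y is observed; the helper name `masked_reconstruction_error` and this mask convention are hypothetical and may differ from what ksvd.py actually uses.

```python
import numpy as np


def masked_reconstruction_error(Y, W, H, mask):
    """Frobenius norm of Y - W @ H over observed entries only.

    mask: boolean array, True where Y is observed (assumed convention).
    """
    residual = Y - W @ H
    residual[~mask] = 0.0  # missing entries contribute nothing to the error
    return np.linalg.norm(residual)


Y = np.array([[1.0, 2.0], [3.0, 99.0]])   # 99.0 stands in for a missing value
W = np.array([[1.0], [3.0]])
H = np.array([[1.0, 2.0]])
mask = np.array([[True, True], [True, False]])

err_masked = masked_reconstruction_error(Y, W, H, mask)
err_naive = np.linalg.norm(Y - W @ H)

# Equivalent view: overwrite the missing entries of Y with the reconstruction,
# so that their residual is exactly zero.
Y_filled = np.where(mask, Y, W @ H)
err_filled = np.linalg.norm(Y_filled - W @ H)
```

Zeroing the residual at missing positions and overwriting Y with the reconstruction at those positions give the same error, which is why the update described above is sufficient.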
In spmimage.decomposition.dict_learning, some algorithms such as OMP and MP require dictionaries to have normalized rows.
We implement a warning that is raised if they are not normalized.
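A minimal sketch of such a check is shown below. The helper names `check_rows_normalized` and `normalize_rows`, the tolerance, and the warning text are assumptions made for illustration, not the library's API.

```python
import warnings

import numpy as np


def check_rows_normalized(dictionary, atol=1e-8):
    """Return True if every row has unit L2 norm; warn otherwise."""
    norms = np.linalg.norm(dictionary, axis=1)
    ok = bool(np.allclose(norms, 1.0, atol=atol))
    if not ok:
        warnings.warn(
            "Dictionary rows are not normalized; OMP and MP assume unit-norm atoms.",
            UserWarning,
        )
    return ok


def normalize_rows(dictionary):
    """Scale every non-zero row to unit L2 norm."""
    norms = np.linalg.norm(dictionary, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows untouched
    return dictionary / norms


D = normalize_rows(np.array([[3.0, 4.0], [0.0, 2.0]]))
```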
I found a wrong message in data.WhiteningScaler._fit().
raise ValueError("""
Eigenvalues of X' are degenerated: X'=X-np.mean(X,axis=0), \
try normalize=True or drop_minute=True.
""")
The above message is wrong because WhiteningScaler doesn't have normalize and drop_minute arguments. They were absorbed into the thresholding argument, so we should fix the code as below.
raise ValueError("""
Eigenvalues of X' are degenerated: X'=X-np.mean(X,axis=0), \
try thresholding='normalize' or thresholding='drop_minute'.
""")
Implement a preconditioned primal-dual algorithm for lasso.
The objective function is the following:
, where
Implement admm_path, like lasso_path or lars_path in sklearn, and add example code that draws a path diagram to lasso_admm.ipynb
spmimage/linear_model/admm.py
tests/test_linear_model_admm.py
examples/lasso_admm.ipynb
example
You can create the branch issue/33/admm_path from the development branch.
Speed up GeneralizedLassoADMM when tridiagonal=True
code_init is useless; the code should be initialized with a zero matrix.
dict_init is initialized with components_ to achieve a warm start.
In general, a DCT dictionary is useful as the initial dictionary for dictionary learning.
Hence, we prepare a function to generate a DCT dictionary.
The function should be as follows:
def generate_dct_dictionary(patch_size: int, sqrt_dict_size: int) -> np.ndarray
The returned array is a matrix of shape (sqrt_dict_size^2, patch_size*patch_size).
Each row is one atom of the dictionary, i.e. a flattened (patch_size, patch_size) patch.
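One common way to build such a dictionary is the separable overcomplete DCT construction, sketched below with the signature given above. This is an assumption about how the function could be written, not the implementation in spm-image; in particular, the mean removal and normalization steps are choices made for this sketch.

```python
import numpy as np


def generate_dct_dictionary(patch_size: int, sqrt_dict_size: int) -> np.ndarray:
    # 1D overcomplete DCT basis: one column per frequency.
    positions = np.arange(patch_size)[:, np.newaxis]
    frequencies = np.arange(sqrt_dict_size)[np.newaxis, :]
    basis = np.cos(np.pi * positions * frequencies / sqrt_dict_size)
    # Remove the mean from every non-constant atom, then normalize the columns.
    basis[:, 1:] -= basis[:, 1:].mean(axis=0)
    basis /= np.linalg.norm(basis, axis=0)
    # The 2D dictionary is the Kronecker product of the 1D basis with itself;
    # transpose so that each row is one flattened (patch_size, patch_size) atom.
    return np.kron(basis, basis).T


D = generate_dct_dictionary(patch_size=8, sqrt_dict_size=11)
first_atom = D[0].reshape(8, 8)   # the constant (DC) atom
```

Because each 1D column has unit norm, every row of the result is unit-norm as well, which matches the normalization that OMP expects.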
The orthogonal matching pursuit (OMP) algorithm requires dictionaries to have normalized rows.
However, some dictionaries in the unit tests are not normalized.
We should fix them.
Fix the following error.
Collecting spm-image
Downloading spm-image-0.0.3.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/tv/56_91scj52dbv6prcvl_yhsc0000gn/T/pip-build-hsk8dwhb/spm-image/setup.py", line 12, in <module>
with Path('requirements.txt').open() as f:
File "/Users/takashi/.pyenv/versions/3.6.3/lib/python3.6/pathlib.py", line 1161, in open
opener=self._opener)
File "/Users/takashi/.pyenv/versions/3.6.3/lib/python3.6/pathlib.py", line 1015, in _opener
return self._accessor.open(self, flags, mode)
File "/Users/takashi/.pyenv/versions/3.6.3/lib/python3.6/pathlib.py", line 387, in wrapped
return strfunc(str(pathobj), *args)
FileNotFoundError: [Errno 2] No such file or directory: 'requirements.txt'
As of #74, masked KSVD is fixed.
We compare the DCT dictionary and the KSVD dictionary in this example.
In general, we cannot say that KSVD is always superior to DCT.
For example, if the missing ratio is very high (e.g. 75%), it is hard to learn a dictionary, and DCT performs well.
We should prepare this kind of experiment.
Implement ZCA whitening with sklearn interface. We want to implement fit and the other methods using SVD, like the PCA implementation as follows.
https://github.com/scikit-learn/scikit-learn/blob/a24c8b46/sklearn/decomposition/pca.py#L107-L564
We should follow the same package structure as sklearn, e.g. spmimage.decomposition.whitening.py. Within the module, implement the WhiteningScaler class.
WhiteningScaler should have an 'apply_zca' argument. If apply_zca is False, WhiteningScaler should perform only the whitening transformation, not ZCA whitening, and act like the PCA implementation of sklearn.
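The difference between the two modes can be sketched with a plain NumPy function. The name `whiten` and the `eps` regularizer are assumptions for illustration; this is not the WhiteningScaler class, and it does not handle the degenerate-eigenvalue case.

```python
import numpy as np


def whiten(X, apply_zca=True, eps=1e-8):
    """Whiten X (n_samples, n_features) via SVD of the centered data."""
    X_centered = X - X.mean(axis=0)
    # X_centered = U @ diag(s) @ Vt
    _, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    n_samples = X.shape[0]
    scale = np.sqrt(n_samples - 1) / (s + eps)
    # PCA whitening: rotate onto the principal axes and rescale to unit variance.
    X_white = (X_centered @ Vt.T) * scale
    if apply_zca:
        # ZCA whitening: rotate back into the original feature space.
        X_white = X_white @ Vt
    return X_white


rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.1]])
X = rng.normal(size=(200, 3)) @ mixing
Z_zca = whiten(X, apply_zca=True)
Z_pca = whiten(X, apply_zca=False)
```

Both outputs have an (approximately) identity covariance; they differ only by the final rotation, which keeps the ZCA result aligned with the original features.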
Add example code to showcase how to use this library.
extract_simple_patches_2d and reconstruct_from_simple_patches_2d do not support overlapping patches. In some cases, overlapping patches are useful.
Hence, we introduce an integer parameter extraction_step and modify these two functions.
extract_simple_patches
reconstruct_from_simple_patches_2d
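How an extraction_step parameter could work is sketched below for single-channel images. The function names are hypothetical and deliberately differ from the library's functions; averaging overlapping pixels during reconstruction is an assumption of this sketch.

```python
import numpy as np


def extract_patches_with_step(image, patch_size, extraction_step):
    """Extract square patches on a regular grid with the given stride."""
    height, width = image.shape
    patches = [
        image[i:i + patch_size, j:j + patch_size]
        for i in range(0, height - patch_size + 1, extraction_step)
        for j in range(0, width - patch_size + 1, extraction_step)
    ]
    return np.stack(patches)


def reconstruct_from_patches_with_step(patches, image_shape, extraction_step):
    """Rebuild the image by averaging the patches that cover each pixel."""
    patch_size = patches.shape[1]
    height, width = image_shape
    total = np.zeros(image_shape)
    counts = np.zeros(image_shape)
    index = 0
    for i in range(0, height - patch_size + 1, extraction_step):
        for j in range(0, width - patch_size + 1, extraction_step):
            total[i:i + patch_size, j:j + patch_size] += patches[index]
            counts[i:i + patch_size, j:j + patch_size] += 1
            index += 1
    # Assumes the grid covers every pixel; otherwise counts contains zeros.
    return total / counts


image = np.arange(36, dtype=float).reshape(6, 6)
patches = extract_patches_with_step(image, patch_size=4, extraction_step=2)
restored = reconstruct_from_patches_with_step(patches, image.shape, extraction_step=2)
```

With extraction_step equal to patch_size the patches do not overlap, which corresponds to the current non-overlapping behaviour.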
Implement a class for Trend Filtering inherited from GeneralizedLasso.
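Trend filtering fits the generalized lasso form when the penalty matrix is a discrete difference operator. The sketch below builds such an operator; the helper name `difference_matrix` is hypothetical, and which difference order the class should use is not stated here.

```python
import numpy as np


def difference_matrix(n, order=1):
    """Return the order-th discrete difference operator for signals of length n."""
    D = np.eye(n)
    for _ in range(order):
        # Each pass replaces the rows with differences of consecutive rows.
        D = np.diff(D, axis=0)
    return D


D1 = difference_matrix(5, order=1)   # rows look like [-1, 1, 0, 0, 0]
D2 = difference_matrix(5, order=2)   # rows look like [1, -2, 1, 0, 0]
```

A constant signal is annihilated by D1 and a linear signal by D2, so an L1 penalty on D @ w favors piecewise-constant or piecewise-linear fits respectively.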
license
requirements.txt
and set it to install_requires
Here are some resources and examples.
Since sklearn.externals.joblib will be removed in sklearn 0.23, as the following message shows, import joblib directly instead of from sklearn.externals.
/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/externals/joblib/__init__.py:15: DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
There is a bug in feature_extraction.reconstruct_from_simple_patches_2d. It cannot reconstruct an image when the shape argument is of type Tuple[int, int, int], even though extract_simple_patches_2d seems to work correctly when given the same type.
Since Python 3.7 is already available, we will add 3.7 to the test targets in Travis CI.
.travis.yml
Implement LassoADMMCV, like LassoCV or LassoLarsCV in sklearn, and add example code that draws a diagram to lasso_admm.ipynb
lasso_admm.ipynb
example similar to the one in this example
Looking at the n_targets for-loop, it appears that each k can be solved independently of the other k's. Given this, there seems to be a good opportunity to run each k in parallel. Combined with the use of Cython (#35) and releasing the GIL, this should be embarrassingly parallel.
https://github.com/hacarus/spm-image/blob/development/spmimage/linear_model/admm.py#L215
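The independence of targets can be sketched with the standard library's thread pool. Ordinary least squares stands in for the per-target ADMM solve, and the names `solve_one_target` and `solve_all_targets` are hypothetical; threads only help to the extent that the per-target work releases the GIL.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def solve_one_target(X, y_k):
    # Placeholder per-target solver (least squares), standing in for ADMM.
    coef, *_ = np.linalg.lstsq(X, y_k, rcond=None)
    return coef


def solve_all_targets(X, Y, max_workers=4):
    """Solve every column of Y independently and stack the coefficients."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        columns = pool.map(lambda k: solve_one_target(X, Y[:, k]), range(Y.shape[1]))
        return np.column_stack(list(columns))


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
Y = rng.normal(size=(20, 4))
coefs = solve_all_targets(X, Y)   # shape (n_features, n_targets)
```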
In the sparse coding step of the KSVD algorithm, the current dictionary must be normalized.
However, this is not guaranteed for the initial dictionary.
I think we need to call a normalize function after the dictionary is initialized in _ksvd() in decomposition/ksvd.py.
There is an unused if-else branch in the function _admm, so it should be removed.
The DCT dictionary which is generated by the code of spm-image looks slightly different from examples found on the web.
https://seinzumtode.hatenadiary.jp/entry/20171123/1511468787
Examples found on the web look more symmetric than ours.
We should review the generation code of DCT dictionary or we should find a library which can generate DCT dictionary.
Implement LASSO using ADMM with sklearn interface. You can implement the fit method and reuse the other methods provided by LinearModel, as the LassoLars implementation does.
We should follow the same package structure as sklearn, e.g. spmimage.linear_model.admm.py. Within the module, implement ADMM as the LassoAdmm class.
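The core ADMM iteration for lasso can be sketched as a standalone function. It assumes the objective (1 / (2 * n_samples)) * ||y - Xw||^2 + alpha * ||w||_1 and a fixed penalty parameter rho; the function name `lasso_admm` and its parameters are illustrative, not the LassoAdmm class itself.

```python
import numpy as np


def soft_threshold(v, threshold):
    """Element-wise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - threshold, 0.0)


def lasso_admm(X, y, alpha=1.0, rho=1.0, max_iter=1000, tol=1e-8):
    n_samples, n_features = X.shape
    # The w-update solves the same linear system every iteration, so invert once.
    inv = np.linalg.inv(X.T @ X / n_samples + rho * np.eye(n_features))
    Xty = X.T @ y / n_samples
    z = np.zeros(n_features)   # sparse copy of the coefficients
    u = np.zeros(n_features)   # scaled dual variable
    for _ in range(max_iter):
        w = inv @ (Xty + rho * (z - u))
        z_new = soft_threshold(w + u, alpha / rho)
        u = u + w - z_new
        converged = (np.linalg.norm(z_new - z) < tol
                     and np.linalg.norm(w - z_new) < tol)
        z = z_new
        if converged:
            break
    return z


# With X.T @ X / n_samples equal to the identity, the lasso solution is
# soft_threshold(X.T @ y / n_samples, alpha), which gives a simple check.
X = 2.0 * np.eye(4)
y = np.array([3.0, -1.0, 0.2, 0.0])
coef = lasso_admm(X, y, alpha=0.3)
```

Caching the inverse (or a Cholesky factorization) is the usual reason ADMM iterations stay cheap once the setup cost has been paid.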
We're still in the middle of refactoring the tridiagonal implementation as below.
For now, we'd like to release the implementation of TrendFiltering and QuadraticTrendFiltering, and thus we'll do
development
branch
Issue template and PR template should be added. I prefer the .github directory style, since we are going to see many files at the top of the project sooner or later.
According to the lasso_admm.ipynb example, the execution time of ADMM is almost 80x longer than that of coordinate descent. To reduce this gap, try Cython for faster execution of ADMM.
missing_value is not considered in the current transform implementation. We'll support missing_value in the method so that users can easily reconstruct images with missing values.
sparse_encode_with_mask in spmimage.decomposition.dict_learning
transform method and call sparse_encode_with_mask when missing_value is set
If the fit_intercept flag is True, the error below is raised in the _set_intercept function.
This function is inherited from sklearn.linear_model.base.
The problem is that coef is a matrix, whereas linear_model assumes a vector.
This bug also occurs in FusedLassoADMM.
import numpy as np
from spmimage.linear_model import LassoADMM
X = np.eye(4)
y = np.array([[1, 1],
              [1, 0],
              [0, 1],
              [0, 0]])
clf = LassoADMM(fit_intercept=True).fit(X, y)
Traceback (most recent call last):
File "/Users/yamamoris/hacarus/spm-image/test.py", line 16, in <module>
clf = LassoADMM().fit(X, y)
File "/Users/yamamoris/hacarus/spm-image/spmimage/linear_model/admm.py", line 126, in fit
self._set_intercept(X_offset, y_offset, X_scale)
File "/Users/yamamoris/.pyenv/versions/hacarus/lib/python3.6/site-packages/sklearn/linear_model/base.py", line 264, in _set_intercept
self.coef_ = self.coef_ / X_scale
ValueError: operands could not be broadcast together with shapes (4,2) (4,)
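The broadcasting failure in the last traceback line can be reproduced with NumPy alone. The shapes below are taken from the error message; transposing the coefficients is only one possible fix and is an assumption here, based on scikit-learn storing multi-target coefficients as (n_targets, n_features).

```python
import numpy as np

n_features, n_targets = 4, 2
coef = np.ones((n_features, n_targets))   # shape reported in the traceback
X_scale = np.ones(n_features)

try:
    coef / X_scale                         # (4, 2) / (4,) cannot broadcast
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# With the (n_targets, n_features) layout, X_scale broadcasts along the
# last axis and the division succeeds.
coef_sklearn_layout = coef.T / X_scale
```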
Implement ADMM algorithm to solve the problem below.
Implement HMLasso
When parameter matrices have sparse properties, we should speed up ADMM.
Here are some resources and examples.
Right now, the KSVD algorithm cannot receive an initial dictionary; only warm start is available.
However, starting from a DCT dictionary can sometimes reduce the number of iterations needed to minimize the cost function. It is also effective for inpainting images with a high missing ratio (> 70% missing).
Hence, we add an option to give an initial dictionary when we create a KSVD instance.
Implement Pliable Lasso
Implement L1-quadratic trend filtering in `linear_model/admm.py`.