
skscope: Sparse-Constrained OPtimization via itErative-solvers

Home Page: https://skscope.readthedocs.io

License: MIT License

CMake 0.56% C++ 36.05% C 0.21% Python 63.19%
python sparsity-optimization scikit-learn nonlinear-optimization auto-differentiation jax non-convex-optimization

skscope's Introduction

skscope: Fast Sparse-Constraint Optimization

Badges: pypi, Conda version, build, codecov, docs, pyversions, License: MIT, Code style: black

What is skscope?

skscope aims to make sparsity-constrained optimization (SCO) accessible to everyone because SCO holds immense potential across various domains, including machine learning, statistics, and signal processing. By providing a user-friendly interface, skscope empowers individuals from diverse backgrounds to harness the power of SCO and unlock its broad range of applications (see examples exhibited below).

Installation

The recommended option for most users:

pip install skscope

For Linux or Mac users, an alternative is

conda install skscope

If you want to work with the latest development version, follow the installation instructions to install from source.

Quick examples

Here's a quick example showing how to perform feature selection with skscope in three simple steps:

from skscope import ScopeSolver
from sklearn.datasets import make_regression
import jax.numpy as jnp

## generate data
x, y, coef = make_regression(n_features=10, n_informative=3, coef=True)

## 1. define loss function
def ols_loss(para):
    return jnp.sum(jnp.square(y - x @ para))

## 2. initialize the solver: 10 parameters in total, 3 of which are nonzero (sparse)
solver = ScopeSolver(10, 3) 

## 3. use the solver to optimize the objective
params = solver.solve(ols_loss) 
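
To inspect the result, one can compare the recovered support with the truly informative features. A minimal check using only NumPy, assuming (as in the example above) that the returned params is a dense vector with exact zeros outside the selected support:

import numpy as np

## the nonzero entries of the solution form the selected support
print("estimated support:", np.nonzero(params)[0])
print("true support:     ", np.nonzero(coef)[0])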

Here is another example, illustrating that you can modify the objective function to address a completely different problem.

import numpy as np
import jax.numpy as jnp
import matplotlib.pyplot as plt
from skscope import ScopeSolver

## generate data
np.random.seed(2023)
x = np.cumsum(np.random.randn(500)) # random walk with normal increment

## 1. define loss function
def tf_objective(params):
    return jnp.sum(jnp.square(x - jnp.cumsum(params)))  

## 2. initialize the solver: len(x) parameters in total, 10 of which are nonzero (sparse)
solver = ScopeSolver(len(x), 10)

## 3. use the solver to optimize the objective
params = solver.solve(tf_objective)

tf_x = jnp.cumsum(params)
plt.plot(x, label='observation', linewidth=0.8)
plt.plot(tf_x, label='filtering trend')
plt.legend(); plt.show()

The resulting figure shows that the ScopeSolver solution captures the main trend of the observed random walk. Again, four lines of code suffice to attain the solution.

Example gallery

Since skscope can easily be applied to diverse objective functions, we can leverage it to develop various machine learning methods that are driven by SCO. In our example gallery, we supply 25 comprehensive statistical/machine learning examples to illustrate the versatility of skscope.

Why is skscope versatile?

The high versatility of skscope in effectively addressing SCO problems derives from two key factors: theoretical concepts and computational implementation. In terms of theoretical concepts, there have been remarkable advancements in SCO in recent years, offering a range of efficient iterative methods for solving it. Some of these algorithms are elegant in that each iteration relies only on the current parameters and gradients. On the computational side, significant progress has been made in automatic differentiation, a fundamental component of deep learning algorithms that plays a vital role in computing gradients. By ingeniously combining these two important advancements, skscope emerges as the pioneering tool capable of handling diverse sparse optimization tasks.
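
To make the second ingredient concrete, here is a minimal illustration (not skscope's internal code) of how JAX's automatic differentiation turns a user-written objective into a gradient function:

import jax
import jax.numpy as jnp

def objective(params):
    return jnp.sum(jnp.square(params - 1.0))

grad_fn = jax.grad(objective)      # gradient obtained by automatic differentiation
print(grad_fn(jnp.zeros(3)))       # [-2. -2. -2.], i.e., 2 * (params - 1) at zero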

With skscope, the creation of new machine learning methods becomes effortless, leading to the advancement of the "sparsity idea" in machine learning. This, in turn, facilitates the availability of a broader spectrum of machine learning algorithms for tackling real-world problems.

Software features

  • Support for multiple state-of-the-art SCO solvers. Currently, skscope supports the following algorithms: SCOPE, HTP, GraSP, IHT, OMP, and FoBa.

  • User-friendly API

    • zero knowledge of SCO solvers required: the state-of-the-art solvers in skscope have intuitive and highly unified APIs.

    • extensive documentation: skscope is fully documented and accompanied by an example gallery and reproduction scripts.

  • Solving SCO and its generalization:

    • SCO: $\arg\min\limits_{\theta \in R^p} f(\theta) \text{ s.t. } ||\theta||_0 \leq s$;

    • SCO for group-structure parameters: $\arg\min\limits_{\theta \in R^p} f(\theta) \text{ s.t. } \sum_{i=1}^q I(||\theta_{G_i}||_2 \neq 0) \leq s$ where $\{G_i\}_{i=1}^q$ is a non-overlapping partition of $\{1, \ldots, p\}$;

    • SCO when pre-selecting parameters in set $\mathcal{P}$: $\arg\min\limits_{\theta \in R^p} f(\theta) \text{ s.t. } ||\theta_{\mathcal{P}^c}||_0 \leq s$ (see the sketch after this feature list).

  • Data science toolkit

    • Information criterion and cross-validation for selecting $s$

    • Portable interface for developing new machine-learning methods

  • Just-in-time-compilation compatibility
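
As an illustration of the pre-selection variant above, the sketch below keeps two chosen parameters in the model via the always_select argument (the same argument that appears in the sparse-precision-matrix issue later on this page). How pre-selected indices interact with the sparsity level should be checked against the documentation; this is only a minimal sketch:

from skscope import ScopeSolver
from sklearn.datasets import make_regression
import jax.numpy as jnp

x, y, coef = make_regression(n_features=10, n_informative=3, coef=True)

def ols_loss(para):
    return jnp.sum(jnp.square(y - x @ para))

## parameters 0 and 1 are always kept in the model (the set P above)
solver = ScopeSolver(10, 3, always_select=[0, 1])
params = solver.solve(ols_loss)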

Benchmark

  • Support recovery accuracy
| Methods | Linear regression | Logistic regression | Trend filtering | Multi-task learning | Ising model | Nonlinear feature selection |
|---|---|---|---|---|---|---|
| OMPSolver | 1.00(0.01) | 0.91(0.05) | 0.70(0.18) | 1.00(0.00) | 0.98(0.03) | 0.77(0.09) |
| IHTSolver | 0.79(0.04) | 0.97(0.03) | 0.08(0.10) | 0.97(0.02) | 0.96(0.05) | 0.78(0.09) |
| HTPSolver | 1.00(0.00) | 0.84(0.05) | 0.41(0.22) | 1.00(0.00) | 0.97(0.03) | 0.78(0.09) |
| GraspSolver | 1.00(0.00) | 0.90(0.08) | 0.58(0.23) | 1.00(0.00) | 0.99(0.01) | 0.78(0.08) |
| FoBaSolver | 1.00(0.00) | 0.92(0.06) | 0.87(0.13) | 1.00(0.00) | 1.00(0.01) | 0.77(0.09) |
| ScopeSolver | 1.00(0.00) | 0.94(0.04) | 0.79(0.19) | 1.00(0.00) | 1.00(0.01) | 0.77(0.09) |
| cvxpy | 0.83(0.17) | 0.83(0.05) | 0.19(0.22) | 1.00(0.00) | 0.94(0.04) | 0.74(0.09) |

All solvers (except IHTSolver) in skscope consistently outperformed cvxpy in terms of accuracy for the selection of the support set.

  • Runtime (measured in seconds):
| Methods | Linear regression | Logistic regression | Trend filtering | Multi-task learning | Ising model | Nonlinear feature selection |
|---|---|---|---|---|---|---|
| OMPSolver | 0.62(0.11) | 0.80(0.11) | 0.03(0.00) | 2.70(0.26) | 1.39(0.13) | 13.24(3.91) |
| IHTSolver | 0.23(0.05) | 0.18(0.12) | 0.30(0.06) | 0.80(0.11) | 0.98(0.08) | 1.67(0.50) |
| HTPSolver | 0.50(0.14) | 0.94(0.44) | 0.03(0.01) | 14.18(5.13) | 3.41(1.22) | 12.97(6.23) |
| GraspSolver | 0.18(0.06) | 2.55(0.86) | 0.08(0.03) | 0.54(0.28) | 0.53(0.22) | 3.06(0.75) |
| FoBaSolver | 3.71(0.50) | 3.28(0.39) | 0.13(0.02) | 6.22(0.61) | 11.10(1.04) | 57.42(12.95) |
| ScopeSolver | 0.30(0.08) | 1.20(2.14) | 0.09(0.01) | 1.14(0.89) | 1.17(0.25) | 7.78(2.23) |
| cvxpy | 14.59(5.60) | 69.45(53.47) | 0.47(0.16) | 39.36(155.70) | 32.26(17.88) | 534.49(337.72) |

skscope demonstrated significant computational advantages over cvxpy, with speedups ranging from approximately 3 to 500 times.

Software architecture

Citation

If you use skscope or reference our tutorials in a presentation or publication, we would appreciate citations of our library.

The corresponding BibTeX entry:

@misc{wang2024skscopefastsparsityconstrainedoptimization,
      title={skscope: Fast Sparsity-Constrained Optimization in Python}, 
      author={Zezhi Wang and Jin Zhu and Peng Chen and Huiyang Peng and Xiaoke Zhang and Anran Wang and Yu Zheng and Junxian Zhu and Xueqin Wang},
      year={2024},
      eprint={2403.18540},
      archivePrefix={arXiv},
      primaryClass={stat.ML},
      url={https://arxiv.org/abs/2403.18540}, 
}

Contributions

👏 Thanks to all of our stargazers and forkers for their support 👏

Any kind of contribution to skscope would be highly appreciated! Please check the contributor's guide.

skscope's People

Contributors

anrwang, bbayukari, belzheng, chenpnn, everglow00, github-actions[bot], imgbot[bot], imgbotapp, luminite9, mamba413


skscope's Issues

[Feature] isotonic regression API for practitioners

It would be interesting to give users an easy-to-use API for isotonic regression. An skscope-based implementation is already provided in the example gallery: https://skscope.readthedocs.io/en/latest/gallery/LinearModelAndVariants/isotonic-regression.html. So this can easily be solved by wrapping the necessary code.

Reference:

[Feature] Primal dual active set method

It seems that the primal-dual active set (PDAS) algorithm is also a (potentially) powerful algorithm for sparsity optimization. Implementing it would helpfully enrich our set of supported solvers.

Reference:

  • Wen, C., Zhang, A., Quan, S., & Wang, X. (2020). BeSS: An R Package for Best Subset Selection in Linear, Logistic and Cox Proportional Hazards Models. Journal of Statistical Software, 94(4), 1–24. https://doi.org/10.18637/jss.v094.i04

[Feature] Support for sparse matrix and more optimizers

I have two suggestions for possible improvements as follows:

  1. Sparse matrices: In some practical problems (e.g., document classification) where the covariate matrix is extremely sparse, can skscope exploit the sparse-matrix structure to improve the efficiency of storage and computation? Some references may be helpful, such as scipy.sparse or its JAX counterpart jax.scipy.sparse.
  2. Optimizers: I find that the base numeric solver of skscope is currently based mainly on nlopt, and I wonder whether it is possible to support more optimizers. For example, the differentiable optimizer library JAXopt may be better suited to general (unconstrained, constrained, and composite) optimization problems.

Thanks!

[Document] Add badges in README.md

Add some badges to the README.md file (see below). They help convince users that our library is well organized.

[![Python Build](https://github.com/abess-team/abess/actions/workflows/python_test.yml/badge.svg)](https://github.com/abess-team/abess/actions/workflows/python_test.yml)
[![R Build](https://github.com/abess-team/abess/actions/workflows/r_test.yml/badge.svg)](https://github.com/abess-team/abess/actions/workflows/r_test.yml)
[![codecov](https://codecov.io/gh/abess-team/abess/branch/master/graph/badge.svg?token=LK56LHXV00)](https://codecov.io/gh/abess-team/abess)
[![docs](https://readthedocs.org/projects/abess/badge/?version=latest)](https://abess.readthedocs.io/en/latest/?badge=latest)
[![R docs](https://github.com/abess-team/abess/actions/workflows/r_website.yml/badge.svg)](https://abess-team.github.io/abess/)
[![cran](https://img.shields.io/cran/v/abess?logo=R)](https://cran.r-project.org/package=abess)
[![pypi](https://img.shields.io/pypi/v/abess?logo=Pypi)](https://pypi.org/project/abess)
[![Conda version](https://img.shields.io/conda/vn/conda-forge/abess.svg?logo=condaforge)](https://anaconda.org/conda-forge/abess)
[![pyversions](https://img.shields.io/pypi/pyversions/abess)](https://img.shields.io/pypi/pyversions/abess)
[![License](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](http://www.gnu.org/licenses/gpl-3.0)
[![Codacy Badge](https://app.codacy.com/project/badge/Grade/3f6e60a3a3e44699a033159633981b76)](https://www.codacy.com/gh/abess-team/abess/dashboard?utm_source=github.com&utm_medium=referral&utm_content=abess-team/abess&utm_campaign=Badge_Grade)
[![CodeFactor](https://www.codefactor.io/repository/github/abess-team/abess/badge)](https://www.codefactor.io/repository/github/abess-team/abess)
[![Platform](https://anaconda.org/conda-forge/abess/badges/platforms.svg)](https://anaconda.org/conda-forge/abess)
[![Downloads](https://pepy.tech/badge/abess)](https://pepy.tech/project/abess)

Sphinx support for LaTeX environments

The notebook uses a LaTeX environment that is currently not directly supported by Sphinx.

So, shall we avoid using this environment? Or is there a convenient configuration, plugin, or library that supports it? If we choose the latter, we have to investigate whether it is supported by Read the Docs.

Example code `sparse-precision-matrix.ipynb` fails

When I run the example sparse-precision-matrix.ipynb, the following code block

solver = GraspSolver(
    dimensionality=int(p * (p + 1) / 2),
    sparsity=np.count_nonzero(prec[np.triu_indices(p)]),
    always_select=np.where(np.triu_indices(p)[0] == np.triu_indices(p)[1])[0],
    convex_solver=convex_solver_cvxpy,
)

raises an error as follows

TypeError: __init__() got an unexpected keyword argument 'convex_solver'

It seems that the new version of scope is not compatible with the original parameter convex_solver.

[Feature] Robust variable selection via exponential loss

Robust variable selection is practically helpful for handling real data because it is insensitive to the outliers that frequently appear in the real world. It is strongly recommended to implement it.

Reference:

  • Wang, X., Jiang, Y., Huang, M., & Zhang, H. (2013). Robust variable selection with exponential squared loss. Journal of the American Statistical Association, 108(502), 632-643.
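
A hedged sketch of what this could look like with skscope, using an exponential squared loss of the form 1 − exp(−t²/γ) as in the reference above; the data, sparsity level, and value of γ below are placeholders rather than a tested implementation:

import jax.numpy as jnp
from skscope import ScopeSolver
from sklearn.datasets import make_regression

X, y, coef = make_regression(n_features=10, n_informative=3, coef=True)
gamma = 1.0  # placeholder tuning parameter; needs careful choice in practice

def exp_squared_loss(para):
    resid = y - X @ para
    # bounded robust loss: 1 - exp(-t^2 / gamma), so large outliers have limited influence
    return jnp.sum(1.0 - jnp.exp(-jnp.square(resid) / gamma))

solver = ScopeSolver(10, 3)
params = solver.solve(exp_squared_loss)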

[Question] support python 3.8

Currently, skscope cannot be installed on Python 3.8 via pip. Can we support Python 3.8 on at least some platforms, such as macOS or Linux?

[Feature] "square-root lasso-type" method

It would be interesting to implement the objective function for the square-root-lasso-type method.

Reference: A. Belloni and others, Square-root lasso: pivotal recovery of sparse signals via conic programming, Biometrika, Volume 98, Issue 4, December 2011, Pages 791–806, https://doi.org/10.1093/biomet/asr043
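
A hedged sketch of such an objective with skscope: keep the square-root loss of the reference and let the sparsity constraint play the role of the l1 penalty (placeholder data and sparsity level; not a tested implementation):

import jax.numpy as jnp
from skscope import ScopeSolver
from sklearn.datasets import make_regression

X, y, coef = make_regression(n_features=10, n_informative=3, coef=True)
n = X.shape[0]

def sqrt_loss(para):
    # square-root loss ||y - X @ para||_2 / sqrt(n); the l1 penalty of the original
    # square-root lasso is replaced here by the sparsity constraint of the solver
    return jnp.linalg.norm(y - X @ para) / jnp.sqrt(n)

solver = ScopeSolver(10, 3)
params = solver.solve(sqrt_loss)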

About `skmodel.PortfolioSelection`

  1. Replace seed with random_state? The latter is the conventional parameter name in scikit-learn.
  2. Replace hist with cov_matrix? See: https://pyportfolioopt.readthedocs.io/en/latest/_modules/pypfopt/efficient_frontier/efficient_frontier.html#EfficientFrontier.__init__
  3. Replacing k with s might be better, because we use s elsewhere when illustrating sparsity-constrained optimization.
  4. Replace lambda_ with lambda?
  5. Have you tested whether it is really compatible with scikit-learn? For example, you could test whether sklearn.model_selection.GridSearchCV can be used to select lambda_ (see the sketch below).
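
Regarding point 5, here is a sketch of the kind of compatibility check meant there. Everything skscope-specific in it is an assumption: the import path skscope.skmodel, the current parameter names k and lambda_, and that PortfolioSelection follows the scikit-learn fit/score protocol (which is exactly what the test would verify):

import numpy as np
from sklearn.model_selection import GridSearchCV
from skscope.skmodel import PortfolioSelection  # assumed import path

X = np.random.randn(200, 30)  # placeholder: 200 periods of returns for 30 assets

grid = GridSearchCV(
    PortfolioSelection(k=10),                 # current parameter names, as discussed above
    param_grid={"lambda_": [0.01, 0.1, 1.0]},
    cv=3,
)
grid.fit(X)  # only succeeds if the estimator is scikit-learn compatible
print(grid.best_params_)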

[Features] The "Layer" concept

In our examples, we often use the re-parametrization trick to absorb an equality constraint into the objective function. I think it would be more convenient if we could wrap common re-parametrizations into functions/classes (see the sketch after the list below). The reason is three-fold:

  • we can avoid re-programming these tricks again and again;

  • users can easily apply re-parametrizations without programming them themselves;

  • we can improve the maintenance of these re-parametrizations.
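
A hedged sketch of the idea; the name SimplexLayer and its interface are hypothetical, purely to illustrate packaging a re-parametrization once and reusing it:

import jax.numpy as jnp

class SimplexLayer:
    """Hypothetical layer: map free parameters onto the probability simplex."""

    def __init__(self, eps=1e-12):
        self.eps = eps  # small constant guarding against division by zero

    def __call__(self, params):
        w = jnp.abs(params)
        return w / (w.sum() + self.eps)

def reparametrized_objective(objective, layer):
    """Wrap an objective so that the solver optimizes in the free parametrization."""
    def wrapped(params):
        return objective(layer(params))
    return wrapped

A solver would then be given reparametrized_objective(original_loss, SimplexLayer()) instead of the original loss.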

incorrect configuration causes a 404 page

I click "edit on github"

51901689512099_ pic

but it results in a 404 page:

51911689512149_ pic

It seems that main/docs/userguide/index.rst is an incorrect path. It should be main/docs/source/userguide/index.rst.

[Feature] sparsity-based clustering method

It would be interesting to use skscope to address the clustering problem.

Reference:

  • Lashkari, D., & Golland, P. (2007). Convex clustering with exemplar-based models. Advances in neural information processing systems, 20.

  • Mahdi Soltanolkotabi, Ehsan Elhamifar, Emmanuel J. Candès "Robust subspace clustering," The Annals of Statistics, Ann. Statist. 42(2), 669-699, (April 2014)

`make html` raises KeyError: 'scope.ScopeSolver'

[AutoAPI] Reading files... [100%] /Users/zhujin/abess-team/scope/scope/numeric_solver.py                                                                   
WARNING: Cannot resolve cyclic import: scope.solver, scope, scope.solver
WARNING: Cannot resolve cyclic import: scope.solver, scope, scope.solver
WARNING: Cannot resolve cyclic import: scope.solver, scope, scope.solver
WARNING: Cannot resolve cyclic import: scope.solver, scope, scope.solver
WARNING: Cannot resolve cyclic import: scope.solver, scope, scope.solver
WARNING: Cannot resolve cyclic import: scope.solver, scope, scope.solver
WARNING: Cannot resolve cyclic import: scope.solver, scope, scope.solver
WARNING: Cannot resolve import of scope._scope in scope.solver
WARNING: Cannot resolve import of scope._scope in scope.model
[AutoAPI] Mapping Data... [100%] /Users/zhujin/abess-team/scope/scope/numeric_solver.py                                                                    
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 57 source files that are out of date
updating environment: [new config] 57 added, 0 changed, 0 removed
docstring of scope.base_solver.BaseSolver:16: WARNING: Unexpected indentation.                                                                             
docstring of scope.base_solver.BaseSolver:8: WARNING: Inline emphasis start-string without end-string.
docstring of scope.base_solver.BaseSolver:17: WARNING: Block quote ends without a blank line; unexpected unindent.
sphinx-sitemap: No pages generated for sitemap.xml

Exception occurred:
  File "/Users/zhujin/miniforge3/envs/convex-solver/lib/python3.9/site-packages/autoapi/directives.py", line 22, in get_items
    obj = all_objects[name]
KeyError: 'scope.ScopeSolver'
The full traceback has been saved in /var/folders/tw/stn1nxqn4y77yhy76z8nkpp00000gn/T/sphinx-err-owx0j58n.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!
make: *** [html] Error 2

Extend ``skscope`` with JAXopt

JAXopt is a library of differentiable optimizers that can solve general optimization problems, such as set-constrained or non-smooth optimization. I wonder whether it is possible to extend skscope into such a general sparse solver using the JAXopt library.

[Testing] improve test coverage

@chenpnn I find that the file [utilities.py](https://app.codecov.io/gh/abess-team/skscope/commit/e7031f68f26eed1fd5301b2c511ca1936dbda380/blob/skscope/utilities.py) is not well covered by the Python tests.
@bbayukari Shall our tests also cover this utility file?

The docs of `convex_solver_nlopt` appear in the wrong places

I notice that the docs of the function convex_solver_nlopt incorrectly appear in docs/build/html/autoapi/base_solver.html and docs/build/html/autoapi/solver.html#classes.

Additionally, the docs of BaseSolver appear in docs/build/html/autoapi/solver.html.

No warning is raised on division by zero

I try to minimize the least-squares error under a simplex constraint by directly optimizing the normalized parameters w / jnp.sum(w); the loss function is shown below:

def simplex_solver(X, y, sparsity=None):
    n, p = X.shape

    def custom_objective(w):
        w = jnp.abs(w) / jnp.abs(w).sum()
        loss = jnp.mean((X @ w - y) ** 2)
        return loss

    solver = ScopeSolver(p, sparsity=sparsity)
    w = solver.solve(custom_objective)
    return w / w.sum()

This fails because all nonzero elements of the output are NaN, but no warning is raised. When I replace the line solver = ScopeSolver(p, sparsity=sparsity) with solver = ScopeSolver(p, sparsity=sparsity, init_params=np.ones(p)/p), it works. This phenomenon may be due to a division-by-zero error in w = jnp.abs(w) / jnp.abs(w).sum(), since the default initial parameter vector is all zeros.

Therefore, could ScopeSolver provide more intermediate information when such a warning or error occurs?
Thanks!
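
Until such diagnostics exist, one possible workaround (besides passing a non-zero init_params as above) is to guard the normalization and check the output explicitly; a sketch adapted from the code above:

import numpy as np
import jax.numpy as jnp
from skscope import ScopeSolver

def simplex_solver_safe(X, y, sparsity=None, eps=1e-12):
    n, p = X.shape

    def custom_objective(w):
        w = jnp.abs(w) / (jnp.abs(w).sum() + eps)   # eps avoids 0/0 at the all-zero start
        return jnp.mean((X @ w - y) ** 2)

    solver = ScopeSolver(p, sparsity=sparsity)
    w = solver.solve(custom_objective)
    if not np.all(np.isfinite(w)):                  # surface the silent NaN case
        raise FloatingPointError("ScopeSolver returned non-finite parameters")
    return w / w.sum()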

`make html` fails when generating the documentation locally

Running the command:

make html

but the following error occurs:

/bin/sh: sphinx-build: command not found
make: *** [html] Error 127

However, Sphinx is actually installed.

Details about Environment:

  • OS: macOS 13.2.1 (22D68)
  • Chip: M1
  • Python: 3.9.12

Remove the TOC on the first page

In my opinion, it is unnecessary to present the very long table of contents (TOC) on the first page. Nobody will be interested in it.


[Error] an incompatible architecture (have 'x86_64', need 'arm64')

The following error is raised when I import this library.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/zhujin/abess-team/scope/scope/__init__.py", line 16, in <module>
    from .universal import (ConvexSparseSolver)
  File "/Users/zhujin/abess-team/scope/scope/universal.py", line 4, in <module>
    from .pybind_cabess import pywrap_Universal, UniversalModel, init_spdlog
ImportError: dlopen(/Users/zhujin/abess-team/scope/scope/pybind_cabess.cpython-39-darwin.so, 0x0002): tried: '/Users/zhujin/abess-team/scope/scope/pybind_cabess.cpython-39-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/zhujin/abess-team/scope/scope/pybind_cabess.cpython-39-darwin.so' (no such file), '/Users/zhujin/abess-team/scope/scope/pybind_cabess.cpython-39-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))
