sparsePKL - Sparse Pairwise Kernel Learning Software

sparsePKL is a pairwise kernel learning algorithm based on nonsmooth DC (difference of two convex functions) optimization. It learns sparse models for prediction on pairwise data (e.g. drug-target interactions) by using double regularization with both the L1-norm and the L0-pseudonorm. The nonsmooth DC optimization problem is solved using the limited memory bundle DC algorithm (LMB-DCA). In addition, sparsePKL uses pairwise Kronecker product kernels, computed via the generalized vec trick, to model interactions between drug and target features. The included loss functions for the pairwise kernel problem are:

squared loss,
squared epsilon-insensitive loss,
epsilon-insensitive squared loss,
epsilon-insensitive absolute loss,
absolute loss.
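
The following Python sketch illustrates how losses with these names are commonly defined. It is not taken from objfun.f95: the function names, the default eps value, and in particular the reading of "squared epsilon-insensitive" as max(0, |r| - eps)^2 and of "epsilon-insensitive squared" as max(0, r^2 - eps) are assumptions, and the Fortran source remains authoritative.

```python
import numpy as np

# Textbook-style definitions; y is the true label, p the prediction.
# All names and the default eps are hypothetical, chosen for illustration.

def squared_loss(y, p):
    return (p - y) ** 2

def squared_eps_insensitive_loss(y, p, eps=0.1):
    # Assumed reading: square of the epsilon-insensitive absolute loss.
    return np.maximum(0.0, np.abs(p - y) - eps) ** 2

def eps_insensitive_squared_loss(y, p, eps=0.1):
    # Assumed reading: squared residual with an epsilon dead zone.
    return np.maximum(0.0, (p - y) ** 2 - eps)

def eps_insensitive_absolute_loss(y, p, eps=0.1):
    return np.maximum(0.0, np.abs(p - y) - eps)

def absolute_loss(y, p):
    return np.abs(p - y)
```

All five functions accept scalars or NumPy arrays. The epsilon-insensitive variants return zero whenever the residual stays inside the dead zone, so small prediction errors are not penalized.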

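The idea behind the Kronecker product kernel and the vec trick mentioned in the introduction can be sketched in NumPy. The sketch assumes a drug kernel matrix K, a target kernel matrix G, and training pairs given as index arrays rows and cols; these names and both functions are hypothetical. It shows only the underlying identity, not the optimized algorithm used by RLScore or sparsePKL.

```python
import numpy as np

def pairwise_kernel_matvec(K, G, rows, cols, a):
    """Multiply the pairwise kernel matrix by a without building it.

    Entry (i, j) of the pairwise kernel is K[rows[i], rows[j]] * G[cols[i], cols[j]].
    """
    V = np.zeros((K.shape[0], G.shape[0]))
    np.add.at(V, (rows, cols), a)  # scatter coefficients; duplicate pairs accumulate
    W = K @ V @ G.T                # two small products replace one huge one
    return W[rows, cols]           # gather the entries of the sampled pairs

def pairwise_kernel_matvec_naive(K, G, rows, cols, a):
    """Reference version that forms the full Kronecker product explicitly."""
    full = np.kron(K, G)
    idx = rows * G.shape[0] + cols  # position of pair (row, col) in the Kronecker ordering
    return full[np.ix_(idx, idx)] @ a
```

The first function never stores more than a matrix with one row per drug and one column per target, whereas the naive version needs memory quadratic in the number of possible pairs. Both return the same vector.
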
Files included

sparsepkl.py
- Main python file. Includes RLScore calls.
pkl_utility.py
- Python utility programs.
sparsepkl.f95
- Main Fortran file for sparsePKL software.
lmbdca.f95
- LMB-DCA - the limited memory bundle DC algorithm.
solvedca.f95
- Limited memory bundle method for solving convex DCA-type of problems.
objfun.f95
- Computation of the function and subgradient values with different loss functions. The loss function is selected in sparsepkl.py.
initpkl.f95
- Initialization of parameters and variables in sparsePKL and LMB-DCA. Includes modules:
  - initpkl - Initialization of parameters for pairwise learning.
  - initlmbdca - Initialization of LMB-DCA.
parameters.f95
- Parameters for Fortran. Includes modules:
  - r_precision - Precision for reals,
  - param - Parameters,
  - exe_time - Execution time.
subpro.f95
- Subprograms for LMB-DCA and LMBM.
data.py
- Contains functions to load the example data sets. Data files are assumed to be in a folder "data" that is not part of the current folder.
- Contains functions to create train-test-validation splits. Splits are created for every experimental setting S1-S4 (see the reference below).
Makefile
- Makefile: builds a shared library that allows sparsepkl (Fortran95 code) to be called from Python. Uses f2py and Python 3.7, and requires a Fortran compiler (gfortran) to be installed.

Installation and usage

The source uses f2py and Python 3.7, and requires a Fortran compiler (gfortran by default) and RLScore to be installed.

To use the code:

Select the data, the loss function, and the desired sparsity level in the sparsepkl.py file.
Run the Makefile (by typing "make") to build a shared library that allows sparsepkl (Fortran95 code) to be called from Python.
Finally, type "python3.7 sparsepkl.py".

The algorithm writes a CSV file with performance measures (C-index and MSE) computed on the test set under the experimental settings S1-S4. The best results are selected on a separate validation set with respect to the C-index. In addition, separate CSV files with the predictions under each of the settings S1-S4 are written.
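
As a rough illustration of the two performance measures, the following sketch computes MSE and a pairwise C-index (concordance index) with NumPy. The function names are hypothetical and this is not the code used by sparsepkl.py. The tie handling, where a tied prediction counts as half concordant, is a common convention assumed here.

```python
import numpy as np

def mse(y, p):
    """Mean squared error between true labels y and predictions p."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((y - p) ** 2))

def c_index(y, p):
    """Fraction of label-ordered pairs whose predictions are ordered the same way."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    dy = y[:, None] - y[None, :]
    dp = p[:, None] - p[None, :]
    comparable = dy > 0  # each pair with different labels is counted once
    n_pairs = comparable.sum()
    if n_pairs == 0:
        raise ValueError("C-index is undefined when all labels are equal")
    concordant = ((dp > 0) & comparable).sum()
    tied = ((dp == 0) & comparable).sum()
    return float((concordant + 0.5 * tied) / n_pairs)
```

A C-index of 1.0 means every comparable pair is ranked correctly, and 0.5 corresponds to random or constant predictions. The pairwise difference matrices need memory quadratic in the number of samples, so this sketch suits small test sets only.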

References:

sparsePKL and LMB-DCA:
- N. Karmitsa, K. Joki, A. Airola, T. Pahikkala, "Limited memory bundle DC algorithm for sparse pairwise kernel learning", 2023.
RLScore:
- T. Pahikkala, A. Airola, "Rlscore: Regularized least-squares learners", Journal of Machine Learning Research, Vol. 17, No. 221, pp. 1-5, 2016.
LMBM:
- N. Haarala, K. Miettinen, M.M. Mäkelä, "Globally Convergent Limited Memory Bundle Method for Large-Scale Nonsmooth Optimization", Mathematical Programming, Vol. 109, No. 1, pp. 181-205, 2007.
- M. Haarala, K. Miettinen, M.M. Mäkelä, "New Limited Memory Bundle Method for Large-Scale Nonsmooth Optimization", Optimization Methods and Software, Vol. 19, No. 6, pp. 673-692, 2004.
Generalized vec trick and experimental settings:
- A. Airola, T. Pahikkala, "Fast Kronecker product kernel methods via generalized vec trick", IEEE Transactions on Neural Networks and Learning Systems, Vol. 29, pp. 3374–3387, 2018.
- M. Viljanen, A. Airola, T. Pahikkala, "Generalized vec trick for fast learning of pairwise kernel models", Machine Learning, Vol. 111, pp. 543–573, 2022.
Nonsmooth optimization:
- A. Bagirov, N. Karmitsa, M.M. Mäkelä, "Introduction to nonsmooth optimization: theory, practice and software", Springer, 2014.

Acknowledgements

The work was financially supported by the Research Council of Finland (Project Nos. 345804 and 345805), in projects led by Antti Airola and Tapio Pahikkala.
