
gsvd's People

Contributors

derekbeaton

gsvd's Issues

better illustrative data

I need better illustrative data for MCA & PLSCA, as well as just generally making these data examples better

fi & fj names

these are not coherent with the notation or the premise established

they should be renamed to lcs and rcs, for "left component scores" and "right component scores"

also see Issue #20 -- as these two are now related (but distinct)

Make SVD slightly faster

Check whether N < P or N > P, and transpose the matrix before the SVD when needed

svd() is faster when there are more rows than columns
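
A minimal sketch of the transpose idea, assuming base R's svd(). The helper name svd_tall is hypothetical and not part of the package. It decomposes whichever orientation has more rows than columns and then swaps the singular vectors back. Whether this is actually faster depends on the matrix shape and the LAPACK build, so it is worth benchmarking first.

```r
# Hypothetical helper (not part of GSVD): decompose the orientation with
# more rows than columns, then swap the singular vectors back.
svd_tall <- function(x) {
  x <- as.matrix(x)
  if (nrow(x) >= ncol(x)) {
    return(svd(x))
  }
  res <- svd(t(x))          # t(x) = U D V'  implies  x = V D U'
  list(d = res$d, u = res$v, v = res$u)
}

set.seed(42)
X <- matrix(rnorm(3 * 20), 3, 20)   # wide: 3 rows, 20 columns
res <- svd_tall(X)
X_rebuilt <- res$u %*% diag(res$d) %*% t(res$v)
max(abs(X - X_rebuilt))             # close to zero
```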

error in tolerance_svd when nu = 0 or nv = 0

Hey Derek,

When calling tolerance_svd() with nu = 0 or nv = 0, there are errors since svd_res$u and svd_res$v don't exist.

I'm not immediately sure of the cleanest way to check for this and adjust the function. I'm happy to help if you want me to take a stab at it, but maybe you have a good idea of how to fix it.

Thanks,
Luke

small example

library(GSVD)
set.seed(42)
X <- matrix(sample.int(20, 20*3, replace = TRUE), 20, 3)
tolerance_svd(X, nu = 0, nv = 0)
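
One possible guard, sketched with base R only. Base svd() leaves out u when nu = 0 and v when nv = 0, so any later subsetting has to check that they exist. The helper keep_components is hypothetical and is not the package's code; the internals of tolerance_svd() may differ.

```r
# Base R's svd() omits $u when nu = 0 and $v when nv = 0.
set.seed(42)
X <- matrix(sample.int(20, 20 * 3, replace = TRUE), 20, 3)
svd_res <- svd(X, nu = 0, nv = 0)
names(svd_res)    # only "d"

# Hypothetical guard (not the package's code): subset vectors only if present.
keep_components <- function(svd_res, keep) {
  svd_res$d <- svd_res$d[keep]
  if (!is.null(svd_res$u)) svd_res$u <- svd_res$u[, keep, drop = FALSE]
  if (!is.null(svd_res$v)) svd_res$v <- svd_res$v[, keep, drop = FALSE]
  svd_res
}

trimmed <- keep_components(svd_res, 1:2)
```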

"tidy" up the place

I think that much of the internals of gsvd(), geigen(), and gplssvd() should conform to a more tidyverse-like way of checking conditions, testing inputs, and handling failures/exits

matrix.* functions

I only use the exponent (by way of %^%) in this package... While the others are nice, I'm thinking I should remove them for now and bring them back later as "useful utility features".

fi/fj & wfi/wfj

this is very much an idea still to be considered, but it would greatly benefit CCA and correlation PCA by way of geigen when using a covariance matrix...

I should consider introducing the idea of fi/fj and a "weighted" one. The "weighted" one should perhaps be an "unweighted" one, so that fi/fj are in the correct metric.

So either fi/fj = W[P/Q]D (as it is) or just [U/V]D
But perhaps an unweighted or "standard scores" approach should be [U/V]D and then W[P/Q]D remains as the "correct metric" scores

if I introduce them as weighted then fi/fj become [U/V]D where wfi/wfj become W[P/Q]D

but if I flip that, then fi/fj remain W[P/Q]D and then maybe "ufi/ufj" or perhaps something as simple as "ud/vd" for [U/V]D
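
To make the two candidates concrete, here is a small sketch with diagonal weights. It assumes the usual GSVD relationship P = W^(-1/2) U, which is not stated in this issue, and all variable names are illustrative only. Under that assumption W P D equals W^(1/2) U D, so the two score sets differ only by the square root of the weights.

```r
set.seed(42)
X <- matrix(rnorm(20 * 3), 20, 3)
w_rows <- runif(20, 0.5, 1.5)            # assumed diagonal row weights
w_cols <- runif(3, 0.5, 1.5)             # assumed diagonal column weights

# Plain SVD of the weighted matrix W_rows^(1/2) X W_cols^(1/2)
res <- svd(diag(sqrt(w_rows)) %*% X %*% diag(sqrt(w_cols)))
U <- res$u
D <- diag(res$d)

P <- diag(1 / sqrt(w_rows)) %*% U        # generalized singular vectors (assumed definition)
fi <- diag(w_rows) %*% P %*% D           # W P D, the current definition of fi
ud <- U %*% D                            # the unweighted alternative, U D
```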

Optimize decompositions

Consider adding in the switches to alternate decompositions, or introducing an alternating least squares/power method for when data are very large and we only need a few components.
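
As a rough illustration of the power-method option, the sketch below finds only the leading singular triplet without a full decomposition. The function leading_triplet is hypothetical, not part of the package, and a real implementation would need deflation or block iteration to get more than one component.

```r
# Hypothetical sketch (not part of GSVD): power iteration for the leading
# singular triplet of x, avoiding a full decomposition.
leading_triplet <- function(x, max_iter = 1000, tol = 1e-12) {
  v <- rep(1 / sqrt(ncol(x)), ncol(x))      # unit-length starting vector
  for (i in seq_len(max_iter)) {
    v_new <- crossprod(x, x %*% v)          # t(x) %*% x %*% v
    v_new <- v_new / sqrt(sum(v_new^2))     # renormalize to unit length
    converged <- sum((v_new - v)^2) < tol
    v <- v_new
    if (converged) break
  }
  xv <- x %*% v
  d <- sqrt(sum(xv^2))                      # leading singular value
  list(d = d, u = xv / d, v = v)
}

set.seed(42)
X <- matrix(sample.int(20, 20 * 3, replace = TRUE), 20, 3)
lt <- leading_triplet(X)
abs(lt$d - svd(X)$d[1])                     # close to zero
```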

functionalize checks

The same checks are performed in geigen(), gsvd(), and gplssvd() (sometimes also multiple times in each). These should be turned into functions and put into utils.R

Formal tests

I probably want to include formal tests for various conditions by the time we hit a major release.

*sqrt_psd_matrix() should have tol test

these two functions should require a real tol value and shouldn't be allowed to fall through to the non-numeric pass-throughs

effectively, these methods must ensure that the eigenvalues retained are tested to make sure they are positive
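
A minimal sketch of what a strict tol requirement could look like. sqrt_psd_sketch is a hypothetical stand-in, not the package's sqrt_psd_matrix(), and the default tolerance is an assumption.

```r
# Hypothetical sketch (not the package's sqrt_psd_matrix()): tol must be a
# single finite, non-negative number; eigenvalues at or below tol are dropped.
sqrt_psd_sketch <- function(x, tol = sqrt(.Machine$double.eps)) {
  stopifnot(is.numeric(tol), length(tol) == 1, is.finite(tol), tol >= 0)
  eig <- eigen(x, symmetric = TRUE)
  keep <- which(eig$values > tol)
  if (length(keep) == 0) stop("no eigenvalues above tol")
  vecs <- eig$vectors[, keep, drop = FALSE]
  vecs %*% diag(sqrt(eig$values[keep]), nrow = length(keep)) %*% t(vecs)
}

set.seed(42)
R <- cor(matrix(rnorm(20 * 3), 20, 3))
S <- sqrt_psd_sketch(R)
max(abs(S %*% S - R))       # close to zero
```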

geigen: need symmetric test on weights!

as of right now I'm assuming the weights are symmetric but they don't have to be...

which also means that the particular matrix line of

X <- sqrt_W %*% X %*% sqrt_W

should be

X <- sqrt_W %*% X %*% sqrt_psd_matrix(t(W))
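
The symmetry test itself can be sketched with base R's isSymmetric(). The helper name is hypothetical, and what geigen() should do when the test fails (stop, warn, or switch formulas) is left open here.

```r
# Hypothetical helper (not part of GSVD): TRUE when W is square and symmetric
# up to a numerical tolerance. Dimnames are dropped so they cannot cause a
# spurious mismatch.
is_symmetric_weights <- function(W, tol = 100 * .Machine$double.eps) {
  W <- unname(as.matrix(W))
  nrow(W) == ncol(W) && isSymmetric(W, tol = tol)
}

W_sym <- matrix(c(2, 1, 1, 3), 2, 2)
W_asym <- matrix(c(2, 1, 0, 3), 2, 2)
is_symmetric_weights(W_sym)
is_symmetric_weights(W_asym)
```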

Vignettes

The package needs vignettes before I push it to CRAN.

print, summary, plot, other classes?

definitely implement print, plot, and summary

should I use or make other classes?

I do not want predict.*() at this time. That should be for actual analyses, not the decompositions.

geigen rewrite: small consideration

it might be worthwhile rewriting geigen (and tolerance_eigen) to compute eigen-things via the SVD. it's a bit safer/faster and I can guarantee no negative eigenvalues
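
A small sketch of the idea using base R only. The function eigen_via_svd is hypothetical. It is only valid for symmetric positive semi-definite input: for an indefinite matrix, svd() returns the absolute values of the eigenvalues, which is exactly the strictness in question.

```r
# Hypothetical sketch: eigenvalues and eigenvectors of a symmetric PSD matrix
# taken from svd(). The singular values are non-negative by construction.
eigen_via_svd <- function(x) {
  stopifnot(isSymmetric(unname(x)))
  res <- svd(x)
  list(values = res$d, vectors = res$u)
}

set.seed(42)
R <- cor(matrix(rnorm(20 * 3), 20, 3))
via_svd <- eigen_via_svd(R)
via_eigen <- eigen(R, symmetric = TRUE)
max(abs(via_svd$values - via_eigen$values))   # close to zero
```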

but then that must be a big design decision: am I willing to enforce that level of strictness for analyses? at this time no, but, I'm considering it down the line

check k?

Should I check k for silly values when it's passed in, and then either stop or silently carry on if it is silly?
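
One way the check could look, as a sketch only. validate_k is a hypothetical helper, and the choice to warn and fall back to all components, rather than stop, is an assumption and not a decision the package has made.

```r
# Hypothetical helper (not part of GSVD): coerce k to a usable component count.
# Silly values fall back to max_rank with a warning instead of stopping.
validate_k <- function(k, max_rank) {
  if (!is.numeric(k) || length(k) != 1 || is.na(k) || k < 1) {
    warning("k is not a positive number; returning all components")
    return(max_rank)
  }
  min(floor(k), max_rank)   # truncate fractions, cap at the available rank
}

validate_k(2, max_rank = 3)
validate_k(10, max_rank = 3)
validate_k(2.7, max_rank = 3)
```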

compact returns

the g*() functions should have a "verbose" and "compact" return

the "verbose" is what I have now, the "compact" should focus strictly on what is needed for decomposition/rebuilding the matrices
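
A toy illustration with base svd() standing in for the g*() functions. The "verbose" list and its tau element are made up for this example; the point is only that d, u, and v are enough to rebuild the matrix.

```r
set.seed(42)
X <- matrix(rnorm(20 * 3), 20, 3)
res <- svd(X)

# Hypothetical "verbose" return: the decomposition plus a derived extra
verbose <- list(d = res$d, u = res$u, v = res$v,
                tau = res$d^2 / sum(res$d^2))   # explained variance proportions

# Hypothetical "compact" return: only what is needed to rebuild X
compact_return <- function(res) res[c("d", "u", "v")]

compact <- compact_return(verbose)
X_rebuilt <- compact$u %*% diag(compact$d) %*% t(compact$v)
max(abs(X - X_rebuilt))                         # close to zero
```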

expand class checks

there should be some assurance that it is in fact a GSVD object of a particular type by checking the names of the list, at least.
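
A names-based check could be as small as the sketch below. has_components is a hypothetical helper, and the required names are passed in by the caller because the exact element names of each GSVD object type are not listed in this issue.

```r
# Hypothetical helper (not part of GSVD): TRUE when obj is a list that
# contains every required element name.
has_components <- function(obj, required) {
  is.list(obj) && all(required %in% names(obj))
}

res <- svd(matrix(1:6, 3, 2))
has_components(res, c("d", "u", "v"))
has_components(res, c("d", "u", "v", "p", "q"))
has_components(1:3, "d")
```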

tolerance_eigen error with rank 1 matrices

Hey Derek,

I got an error when using tolerance_eigen on a rank-1 matrix. The error comes from the call to colSums and says

Error in colSums(eigen_res$vectors) : 
  'x' must be an array of at least two dimensions

In line 64 below, eigen_res$vectors gets converted to a vector when there is only one eigenvalue to keep.

GSVD/R/tolerance_eigen.R

Lines 64 to 68 in 0f41cf2

eigen_res$vectors <- eigen_res$vectors[,evs.to.keep]
rownames(eigen_res$vectors) <- colnames(x)
## new way inspired by FactoMineR but with some changes
vector_signs <- ifelse(colSums(eigen_res$vectors) < 0, -1, 1)

I converted back to matrix and that seems to have fixed the issue for me.
https://github.com/LukeMoraglia/GSVD/blob/a47ee067abe3a468137b34a4d5fd0984abca40ad/R/tolerance_eigen.R#L64
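
The underlying behavior can be reproduced in base R without the package. Subsetting a single column drops the matrix to a plain vector unless drop = FALSE is given, which is an alternative to converting back with as.matrix(). The variable names below are stand-ins, not the package's.

```r
vectors <- diag(3)                                 # stand-in for eigen_res$vectors
evs_to_keep <- 1                                   # a single eigenvalue retained

dropped <- vectors[, evs_to_keep]                  # plain numeric vector, no dim
kept <- vectors[, evs_to_keep, drop = FALSE]       # still a 3 x 1 matrix

is.null(dim(dropped))
dim(kept)
vector_signs <- ifelse(colSums(kept) < 0, -1, 1)   # colSums() works on the matrix
```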

Code to reproduce error

library(GSVD)
set.seed(42)
X <- matrix(sample.int(20, 20*3, replace = TRUE), 20, 3)
R <- cor(X)

# Create a rank-1 matrix from R
eig_R <- tolerance_eigen(R)
R_rank1 <- as.matrix(eig_R$vectors[,1]) %*% eig_R$values[1] %*% t(as.matrix(eig_R$vectors[,1]))

tolerance_eigen(R_rank1)
