derekbeaton / ours
Outliers and Robust Structures
License: GNU General Public License v3.0
Hi,
We found an error in the cat.mcd.find.sample.R file, in line 51.
Original function:
final.configs <- unique(unique.min.configs[1:min(nrow(unique.min.configs), perc.cut),])
When only one row is selected, the subsetting drops the matrix dimensions and this returns a vector, but the code immediately following it (final.dets) seems to be expecting a matrix.
We were able to get around this problem by adding drop = FALSE to the subsetting:
final.configs <- unique(unique.min.configs[1:min(nrow(unique.min.configs), perc.cut), , drop = FALSE])
Should I push this up?
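A minimal sketch of the behaviour behind this report, using a small made-up matrix rather than the package's unique.min.configs: selecting a single row without drop = FALSE yields a plain vector, while keeping the dimensions yields a one-row matrix.

```r
# Toy stand-in for unique.min.configs (hypothetical data, not from the package).
m <- matrix(1:6, nrow = 3, ncol = 2)

# Selecting one row drops the dimensions by default, so unique() sees a vector.
one_row_default <- unique(m[1:1, ])

# With drop = FALSE the single row stays a 1 x 2 matrix.
one_row_kept <- unique(m[1:1, , drop = FALSE])

is.matrix(one_row_default)  # FALSE
is.matrix(one_row_kept)     # TRUE
dim(one_row_kept)           # 1 2
```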
See, e.g., when the CatMCD example is run with num.subsets = 5.
This is because final.configs is a vector and not a matrix. I think at this point if we have a final.configs that is only a vector, we should just return it as is without the final concentration (c-) step.
A function to create a wide table from outlier information and the contributions information.
gen. should actually become the "core", with cat., ord., and mixed. as prefixes for mcd(). Then it all gets passed into that gen.mcd core.
The ghost of zero variance has re-appeared:
Error in svd(x, nu = nu, nv = nv) : infinite or missing values in 'x'
This happens when a column or row has identical values. This needs to be caught (try/catch) and handled somehow... it might reflect that there is a robust group or not. Not sure yet. The easiest way to handle it is to skip over it, but to limit the number of times it is skipped over in succession.
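One way the skip-with-a-limit idea could look, as a rough sketch only. The helper names safe_svd and run_with_skip_limit and the max_skips parameter are made up for illustration and are not part of OuRS. The example also assumes the non-finite values come from scaling a zero-variance column, which is one plausible route to this error.

```r
# Hypothetical helper: return the SVD, or NULL if svd() throws an error.
safe_svd <- function(x) {
  tryCatch(svd(x), error = function(e) NULL)
}

# Hypothetical loop: skip failing candidates, but stop once more than
# `max_skips` failures occur in succession.
run_with_skip_limit <- function(candidates, max_skips = 3) {
  results <- list()
  consecutive_skips <- 0
  for (x in candidates) {
    res <- safe_svd(x)
    if (is.null(res)) {
      consecutive_skips <- consecutive_skips + 1
      if (consecutive_skips > max_skips) {
        stop("too many consecutive SVD failures")
      }
      next
    }
    consecutive_skips <- 0  # a success resets the counter
    results[[length(results) + 1]] <- res
  }
  results
}

good <- matrix(c(1, 2, 3, 4, 5, 7), nrow = 3)
# Scaling a constant column divides 0 by 0 and produces NaN.
bad <- scale(cbind(c(1, 1, 1), c(1, 2, 3)))

out <- run_with_skip_limit(list(good, bad, good), max_skips = 1)
length(out)  # 2: the bad candidate was skipped
```

Whether a skipped subsample should be silently dropped or recorded somewhere is left open here, as it is in the issue.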
As per our email. @derekbeaton
See recent example of header data.
One of the function names in the OuRS package is misspelled and causes an error in the Outliers app: the function that should be named continuous_corrmax is written as "continous_corrmax" (the 'u' is missing) in the OuRS package. Should I just change the name in the Outliers app?
In files: corrmaxs.R, continous_corrmax.Rd, NAMESPACE
I'm not so sure those are correct anymore... because of arbitrary flips.
They seem to be unnecessary and were put there for convenience (laziness) for the variable maps. I can make it so that if they do not exist, then the variable maps will use the indices instead of the names.
The gsvd() function needs an update to correctly allow for the ignoring of RW and LW.
As of now I think it's the slightly older code. I need to pull the latest code from the GSVD package. Eventually OuRS will depend on the GSVD package.
For the scale to work correctly, it should start at 0, with the minimum set to 0.
A function to create a long table from outlier information and the contributions information.
There are some missing pieces, so there needs to be a wrapper around the two.fold.* functions so that all the same kinds of inputs as for *.mcd() go in and similar results come out (except OD, obviously).
Drop it from the core package (for now) and then bring it back as a utility. I think we may want to include an "inference_utils.R" for these and other approaches.
I think this function should change so that p is based on the available rank, i.e., length($d) instead of ncol(DATA). That's because the columns of DATA are inherently collinear for the generalized case. So it should be based on the span of the subspace, not the total number of columns.
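To illustrate the difference between the two choices of p, here is a small sketch with made-up indicator data. It uses base svd() and an assumed numerical tolerance to discard near-zero singular values. The issue refers to length($d) from the package's own decomposition, which may already return only the non-null components.

```r
# Two indicator columns for one two-level variable (hypothetical data).
# They sum to 1 in every row, so after centering they span one dimension.
DATA <- cbind(a = c(1, 0, 1, 0, 1), b = c(0, 1, 0, 1, 0))
centered <- scale(DATA, center = TRUE, scale = FALSE)

res <- svd(centered)

# Assumed tolerance for treating a singular value as zero.
tol <- max(dim(centered)) * max(res$d) * .Machine$double.eps
d_kept <- res$d[res$d > tol]

p_cols <- ncol(DATA)      # counts every column, collinear or not
p_rank <- length(d_kept)  # counts only the dimensions actually spanned
```

Here p_cols is 2 while p_rank is 1, which is the gap the issue describes.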
not only in the code between e.g., find_sample and c_step, but also here in the issues & projects!
eliminate those when possible
The better/more flexible one exists in GPLS. It should be ported over ASAP.
I found this error when I tried running the disjunctive_coding function on a data set and all the elements of one of the columns were NA.
Error in `[<-`(`*tmp*`, which(DATA[, i] == unique_no_na[j]), j, value = 1) :
subscript out of bounds
The error happens on this line because unique_no_na is logical(0):
mini.mat[which(DATA[, i] == unique_no_na[j]), j] <- 1
Maybe this is fine to leave in if we don't intend to allow NA columns, but I didn't see any mention of it in the docs so I thought I would bring it up.
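A sketch of one possible guard, under assumptions: code_column is a hypothetical stand-in written for this example, not the package's disjunctive_coding(). It assumes the failure arises because a loop index still reaches 1 when there are no non-NA levels, and it avoids that by iterating with seq_along(), so an all-NA column yields a matrix with zero indicator columns instead of an error.

```r
# Hypothetical helper: code one column as 0/1 indicators,
# one indicator column per observed (non-NA) level.
code_column <- function(x) {
  levels_no_na <- unique(x[!is.na(x)])
  out <- matrix(0, nrow = length(x), ncol = length(levels_no_na))
  # seq_along() gives an empty loop when there are no levels,
  # whereas 1:length(levels_no_na) would give c(1, 0).
  for (j in seq_along(levels_no_na)) {
    out[which(x == levels_no_na[j]), j] <- 1
  }
  out
}

all_na <- code_column(c(NA, NA, NA))
dim(all_na)  # 3 0

mixed <- code_column(c("a", "b", NA, "a"))
dim(mixed)   # 4 2
```

If NA-only columns are not meant to be supported, an explicit stop() with a clear message at the same point would be the alternative design.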
There are two possible designs for split-half: resampling within a given row factor or resampling constrained to.
The first is to take splits from within. The second is to split the whole sample, but take entire portions of rows together (e.g., fMRI).
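The two designs could be sketched roughly as follows. This is one reading of the description above, with made-up data and hypothetical function names (split_within, split_blocks). The grouping vector stands in for the row factor.

```r
set.seed(1)
# Hypothetical row factor: 4 groups (e.g., subjects) with 4 rows each.
group <- rep(c("s1", "s2", "s3", "s4"), each = 4)

# Design 1: split within each level, so every group contributes to both halves.
split_within <- function(group) {
  half <- integer(length(group))
  for (g in unique(group)) {
    idx <- which(group == g)
    first <- idx[sample.int(length(idx), floor(length(idx) / 2))]
    half[idx] <- 2L
    half[first] <- 1L
  }
  half
}

# Design 2: split the whole sample, keeping all rows of a group together.
split_blocks <- function(group) {
  lv <- unique(group)
  first_lv <- lv[sample.int(length(lv), floor(length(lv) / 2))]
  ifelse(group %in% first_lv, 1L, 2L)
}

w <- split_within(group)
b <- split_blocks(group)
table(group, w)  # every group appears in both halves
table(group, b)  # every group appears in exactly one half
```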