
Comments (9)

michellekee avatar michellekee commented on August 28, 2024 1

It works perfectly now. Thank you so much! You are fast and efficient as always. Truly appreciate this! :)

from lares.

laresbernardo avatar laresbernardo commented on August 28, 2024


michellekee avatar michellekee commented on August 28, 2024

Hi Lares,

I tried corr_cross, but was hoping to get an overview-like plot as in corr_var. I'll try corr_cross again. :)

I'm sorry to hijack this thread, but I'm now getting an error message with corr_var. Before, I could plot without problems using:

Demo_Parenting %>% corr_var(GA, ignore = c("a","b","c","d"), method = "spearman", plot = T, pvalue =T, max_pvalue = 0.05)

but it is giving me an error msg now.

[1] variables corr pvalue
<0 rows> (or 0-length row.names)
Warning message:
In corr_var(., GA, ignore = c("a", "b", "c", :
  There are not enough observations to plot. Check your 'max_pvalue' input

However, when plot = F, it shows me the columns with p-values that are < 0.05.

I've saved my previous images but I'm getting the new errors now as I'm knitting the .rmd. Please kindly advise. Thank you!


laresbernardo avatar laresbernardo commented on August 28, 2024

I've just updated the corr_* functions to be a bit more efficient, but I'm not sure whether that fixed this issue. Could you re-check after installing the latest dev version and, if it still fails, share the dataset with me so I can dig deeper?


michellekee avatar michellekee commented on August 28, 2024

I've installed the latest dev version and tried with the command below:

df %>% corr_var(V4, ignore=c("V5","V6","V7","V8"), method = "spearman", plot = T, pvalue =T, max_pvalue = 0.05)

It still gives an error.

[1] variables corr pvalue
<0 rows> (or 0-length row.names)
Warning message:
In corr_var(., V4, ignore = c("V5", "V6", "V7", "V8"), method = "spearman", :
  There are not enough observations to plot. Check your 'max_pvalue' input

But when plot = F, it shows me the columns with p-values < .05.

Attached is the dataset. :) Thank you so much!

df.csv

P.S. corr_cross doesn't seem to have this problem, but for some variables the cross-correlations get repeated in the plot (in the reverse order).


laresbernardo avatar laresbernardo commented on August 28, 2024

Thanks for sharing your dataset. It makes a huge difference when debugging.

As you might have discovered already, the function transforms categorical variables with one-hot encoding (ohse()) so it can calculate correlations between numerical values. What happened here is that V4 is actually categorical, so you must select one of its transformed variables: if you run colnames(ohse(df)) you'll notice you now have V4_Female and V4_Male. Additionally, you can pass the redundant = TRUE parameter to also get V4_NAs as a third option if you need it explicitly.
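For readers unfamiliar with one-hot encoding, here is a minimal base-R sketch of the idea. It does not call ohse() and is not how lares implements it; the toy data frame and the way the column names are built are assumptions chosen only to imitate the V4_Female / V4_Male / V4_NAs names mentioned above.

```r
# Toy data standing in for the shared dataset (hypothetical values).
df_toy <- data.frame(
  V4 = c("Female", "Male", NA, "Female", "Male"),
  V1 = c(1.2, 3.4, 2.2, 0.7, 5.1),
  stringsAsFactors = FALSE
)

# Turn NA into an explicit level so it gets its own indicator column.
v4 <- addNA(factor(df_toy$V4))
levels(v4)[is.na(levels(v4))] <- "NAs"

# One 0/1 column per level; "- 1" drops the intercept so no level is omitted.
dummies <- model.matrix(~ v4 - 1)
colnames(dummies) <- paste0("V4_", levels(v4))

# Numeric-only data frame: the original V4 column no longer exists.
encoded <- cbind(as.data.frame(dummies), V1 = df_toy$V1)
print(colnames(encoded))
```

After encoding there is no column named V4 any more, which is why a correlation against V4 has nothing to match and one of the indicator columns has to be selected instead.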

Knowing that, this works:

corr_var(df, V4_Female,
         ignore = c("V5","V6","V7","V8"),
         method = "spearman",
         plot = TRUE, pvalue = TRUE,
         max_pvalue = 1)

Now, if you reduce max_pvalue = 0.05, you'll only get the correlation between each of the categorical variables:

        variables      corr        pvalue
V4_Male   V4_Male -0.689684 1.014694e-210
V4_NAs     V4_NAs -0.377415  1.313741e-51

The actual bug here is that, for computing reasons, I ignored the pvalue = TRUE case when printing the plot because I do not show that information on the plot. BUT, if you use the max_pvalue filter, you obviously need those values. Fixed in the latest dev version. Could you please retry and let me know if that worked out for you?
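The dependency described above, where a max_pvalue filter can only work if p-values were computed, can be illustrated with a small base-R sketch. This is not the lares code: the simulated data, the result column names (copied from the output shown earlier), and the strict "<" comparison are all assumptions.

```r
set.seed(42)
n <- 200
target <- rnorm(n)
others <- data.frame(
  related   = target + rnorm(n, sd = 0.5),  # built to correlate with target
  unrelated = rnorm(n)                      # independent noise
)

# Spearman correlation and p-value of every column against the target.
res <- do.call(rbind, lapply(names(others), function(v) {
  ct <- cor.test(target, others[[v]], method = "spearman")
  data.frame(variables = v, corr = unname(ct$estimate), pvalue = ct$p.value)
}))

# The filter needs the pvalue column; without it there is nothing to compare.
max_pvalue <- 0.05
filtered <- res[res$pvalue < max_pvalue, ]
print(filtered)
```

If the p-values are skipped on the plotting path, a filter like this ends up with zero rows, which matches the "not enough observations to plot" warning reported earlier in the thread.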


michellekee avatar michellekee commented on August 28, 2024

Hi,

I used install_github("laresbernardo/lares") to install the new dev version and restarted R. (Hope this is correct!)

It works now for V4. However, it fails when I try again with the following call, which worked previously:

corr_var(df, V7, ignore = c("V5","V8", "V9", "V10", "V11", "V12", "V13"), method = "spearman", plot = F, pvalue = T, max_pvalue = 0.05)

I had initially gotten the correlations with p < .05. Now I do not get the values, nor the plot.

Could you please kindly advise again? Thank you!


laresbernardo avatar laresbernardo commented on August 28, 2024

Hi @michellekee
If you are looking variable by variable, I strongly suggest running corr_cross() instead, which does that for you.
On the other hand, after ignoring all those variables in the latest example, only 9% of the rows have no missing data. This might be part of the problem. Let me look into other possible issues on my side to check what's happening and why it isn't giving you anything. I'll get back to you. Thanks for reporting this problem!
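A quick way to check how many rows are fully observed is base R's complete.cases(). The sketch below uses a small made-up data frame, not the shared df.csv, so the numbers are illustrative only.

```r
# Hypothetical toy data with scattered missing values.
df_toy <- data.frame(
  V1 = c(1, NA, 3, 4, NA, 6, 7, 8, 9, 10),
  V2 = c(NA, 2, 3, NA, 5, 6, NA, 8, 9, 10),
  V3 = c(1, 2, NA, 4, 5, 6, 7, NA, 9, 10)
)

# Share of rows with no NA in any column.
complete_share <- mean(complete.cases(df_toy))
print(complete_share)

# Share of missing values per column, to see which variables cost the most rows.
print(colMeans(is.na(df_toy)))
```

Running the same two lines on the real data, after dropping the ignored columns, shows how much of the sample is left for the correlation.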


laresbernardo avatar laresbernardo commented on August 28, 2024

OK, I've fixed this issue. For some reason that, I must admit, I don't quite understand, setting cor's exact parameter default to FALSE did not return the values as it should. I've changed the default to TRUE and it should now work, as seen in this screenshot. You can always set it back to FALSE manually, but I've kept TRUE as the default for now.
[Screenshot: Screen Shot 2021-10-13 at 9 02 34 AM]
Let me know if it works out for you and if you encounter any other issues.
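For context, base R's stats::cor.test() has an exact argument that controls how the Spearman p-value is computed. The sketch below compares both settings on simulated, tie-free data. Whether corr_var forwards its exact value to cor.test() in this way is an assumption, not something confirmed in this thread.

```r
set.seed(1)
x <- rnorm(30)
y <- x + rnorm(30)  # continuous values, so no ties are expected

# exact = TRUE asks for an exact p-value; exact = FALSE uses an approximation.
exact_res  <- cor.test(x, y, method = "spearman", exact = TRUE)
approx_res <- cor.test(x, y, method = "spearman", exact = FALSE)

# The rho estimate is the same either way; only the p-value computation differs.
print(c(rho = unname(exact_res$estimate)))
print(c(exact = exact_res$p.value, approx = approx_res$p.value))
```

With tied values, cor.test() warns that it cannot compute an exact p-value and uses the approximation instead, so results on real data with many repeated values may differ from this tie-free example.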
