yhoriuchi / projoint

A package for a more general, more straightforward, and more creative conjoint analysis

Home Page: https://yhoriuchi.github.io/projoint/

License: Other

R 89.02% JavaScript 5.42% HTML 1.24% PHP 4.28% Rez 0.04%

amce conjoint conjoint-analysis conjoint-experiments conjoint-survey-experiments marginal-means measurement-error r

projoint's Introduction

projoint

The One-Stop Conjoint Shop

projoint is a general-purpose R package for conjoint analysis. It produces more reliable estimates of all relevant quantities of interest, after correcting measurement error bias and other problems discussed in the literature. Furthermore, it also implements a more general framework than other approaches, so that researchers can answer a much wider range of substantively important questions.

Installation

You can install the development version of projoint from GitHub with:

# install.packages("devtools")
devtools::install_github("yhoriuchi/projoint")

Example

# Load the projoint package
library(projoint)

# Reshape data for conjoint analysis
# This example includes the repeated task.
data <- reshape_projoint(exampleData1, 
                         c(paste0("choice", 1:8), "choice1_repeated_flipped"))

# Run conjoint analysis
output <- projoint(data)

# Make a figure
plot(output)

# Show the estimated quantities of interest
summary(output)

Relevant Article

Our framework and methods can be found in this paper:

Katherine Clayton, Yusaku Horiuchi, Aaron R. Kaufman, Gary King, and Mayya Komisarchik. “Correcting Measurement Error Bias in Conjoint Survey Experiments”. [Working Paper]
- Abstract: Conjoint survey designs are spreading across the social sciences due to their unusual capacity to estimate many causal effects from a single randomized experiment. Unfortunately, by their ability to mirror complicated real-world choices, these designs often generate substantial measurement errors and thus bias. We replicate both the data collection and analysis from eight prominent conjoint studies, all of which closely reproduce published results, and show that a large proportion of observed variation in answers to conjoint questions is effectively random noise. We then discover a common empirical pattern in how measurement error appears in conjoint studies and, with it, introduce an easy-to-use statistical method to correct the bias.

Join Our Projoint Community

We encourage you to join our Projoint Community by making a GitHub account, subscribing to our Announcements, and actively joining Discussions. We encourage you to post questions, suggest improvements, and share your research findings using our software. If you find a problem, please report it in Issues.

Notes

The current version assumes that the outcome variable is a binary forced choice.
This package is still under construction. Forthcoming features include the following:
- Allow researchers to use weights for features and for respondents.
- Allow researchers to use other outcome variables, such as rating.

projoint's Issues

write "Consider choice-level analysis"

Write Consider choice-level analysis

how to remove ties in php and javascript

We should add more explanations about how to remove ties or add other cross-profile constraints using a PHP script or JavaScript. This is a key advancement from the traditional approach. So, this instruction is essential.

out1_arranged

The following page should be revised: https://yhoriuchi.github.io/projoint/reference/out1_arranged.html. We need to say that this data frame is an "arranged" version of exampleData1 after using save_labels() and read_labels(). The explanation to make this object is https://yhoriuchi.github.io/projoint/articles/02-wrangle.html#arrange-the-order-and-labels-of-attributes-and-levels. This link should also be added to the documentation of https://yhoriuchi.github.io/projoint/reference/out1_arranged.html.

Citation

The citation page should be improved. We should add a BibTeX entry for our main paper. See the following as an example.

https://declaredesign.org/r/declaredesign/authors.html#citation

I assume that the "authors" of projoint will be Aaron and me.

Revise 03. Predict the intra-respondent reliability (IRR)

We should use summary() and plot(). Do we need print() for predict_tau()?

Choice-level figure

I (Yusaku) will make another function to make a plot for the choice-level analysis. I will then add another section in Article #5.

projoint() output when .qoi is not NULL

@aaronrkaufman , can you make the following change for improvement?

When we use set_qoi(), and specify a projoint_qoi object with projoint(), then .structure and .estimand should be automatically inherited from the projoint_qoi object.
Then, if .structure = "choice_level", .ignore_position should be set to TRUE.
In the Roxygen2 documentation for projoint(), for .ignore_position, we say, "TRUE (default) if you ignore the location of profile (left or right). Relevant only if analyzed at the choice level." But this is not quite an accurate explanation. The default is NULL (because the default .structure is "profile_level"). The default for .ignore_position should be TRUE only if .structure is "choice_level". This condition (or conditions) should be set with or without using a projoint_qoi object.
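
The defaulting rule requested above could be sketched as follows. This is only an illustration in base R: `resolve_ignore_position()` is a hypothetical helper, not a function in projoint, and it assumes the only valid `.structure` values are "profile_level" and "choice_level".

```r
# Hypothetical helper (not part of projoint): resolve the effective value of
# .ignore_position from .structure, following the rule proposed in this issue.
resolve_ignore_position <- function(.structure = "profile_level",
                                    .ignore_position = NULL) {
  # Reject anything other than the two assumed structures.
  .structure <- match.arg(.structure, c("profile_level", "choice_level"))

  if (.structure == "profile_level") {
    # Profile position is not relevant at the profile level, so keep NULL.
    return(NULL)
  }

  # Choice-level analysis: default to TRUE unless the user supplied a value.
  if (is.null(.ignore_position)) {
    return(TRUE)
  }
  stopifnot(is.logical(.ignore_position),
            length(.ignore_position) == 1,
            !is.na(.ignore_position))
  .ignore_position
}
```

The same helper could be called whether `.structure` comes from the user or is inherited from a projoint_qoi object, so the condition lives in one place.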

pj_estimate() fails when irr < 0.5

We need to add a handler here. Possibly throw an error, or figure out a workaround.
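
One possible handler is sketched below. It assumes that tau is derived from the IRR through IRR = (1 - tau)^2 + tau^2, so that tau = (1 - sqrt(2 * IRR - 1)) / 2 and the square root is undefined when IRR < 0.5. That relationship and the name `tau_from_irr()` are assumptions for illustration; the actual internals of pj_estimate() may differ.

```r
# Hypothetical helper (not the package's pj_estimate()): convert IRR to tau.
# Assumes IRR = (1 - tau)^2 + tau^2, i.e. tau = (1 - sqrt(2 * IRR - 1)) / 2.
tau_from_irr <- function(irr) {
  stopifnot(is.numeric(irr), length(irr) == 1, !is.na(irr))
  if (irr < 0.5 || irr > 1) {
    # Fail early with a readable message instead of returning NaN.
    stop("`irr` must be between 0.5 and 1; got ", irr,
         ". Under the assumed formula, tau is not defined for IRR < 0.5.",
         call. = FALSE)
  }
  (1 - sqrt(2 * irr - 1)) / 2
}
```

Throwing an error is the conservative choice; a workaround (for example, truncating the IRR at 0.5) would change the estimates silently and would need to be justified separately.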

ToDo list

@aaronrkaufman , can you check the following file?

https://github.com/yhoriuchi/projoint/blob/master/TODO.txt

If there are some unresolved issues, please submit new issues. Then, please delete the file (TODO.txt).

CRAN

Eventually, we should submit our software to CRAN. I want to show "CRAN" (version), "downloads", "R-CMD-check", "codecov" like the one you see on the following package (https://declaredesign.org/r/estimatr/). We should perhaps follow the instructions on the following page: https://kbroman.org/pkg_primer/pages/cran.html

Feature request: color significant MMs or AMCEs a different color

See title.

More than one "tau"

The default setting is to use "one" tau to fix the estimates. But users may want to specify different taus for the subgroups of respondents or the subsets of profile-pair combinations. We should allow users to specify more than one "tau" -- e.g., "tau" for male respondents and "tau" for female respondents. Estimate AMCEs/MMs for the two different subgroups of respondents, fix them using our method, and then pool data to calculate the overall fixed AMCEs/MMs.
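
A minimal sketch of this idea for marginal means is given below. The numbers are made up, the correction is assumed to take the form (mm - tau) / (1 - 2 * tau), and pooling by subgroup sample size is only one of several possible pooling rules.

```r
# Hypothetical subgroup estimates: uncorrected marginal means (mm), a
# subgroup-specific tau, and the number of observations (n). Made-up numbers.
subgroups <- data.frame(
  group = c("male", "female"),
  mm    = c(0.58, 0.62),
  tau   = c(0.10, 0.20),
  n     = c(400, 600)
)

# Assumed correction, applied with each subgroup's own tau.
subgroups$mm_corrected <- with(subgroups, (mm - tau) / (1 - 2 * tau))

# Pool the corrected estimates, weighting each subgroup by its sample size.
pooled_mm <- with(subgroups, weighted.mean(mm_corrected, w = n))
```

Standard errors are not handled here; a full implementation would also need to propagate the uncertainty in each subgroup's tau.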

a new function to make a javascript

I actually wrote a function to generate JavaScript for the second wave of a conjoint experiment. This function can be improved and added to this package. An important merit of this function is that it can be fully flexible. A user generates ALL possible profile combinations with appropriate (1) cross-attribute constraints (for each profile), (2) cross-profile constraints, and (3) weights. This function should be super helpful for users who want to estimate many QoIs with constraints and weights. I need someone's help!
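
The enumeration step described above can be sketched in base R. The attributes, levels, and constraints below are hypothetical and only illustrate the three ingredients; generating the JavaScript from the resulting table is not shown.

```r
# Hypothetical attributes and levels (illustrative only).
attributes <- list(
  age        = c("35", "50", "65"),
  experience = c("none", "10 years", "30 years")
)

# All possible single profiles.
profiles <- expand.grid(attributes, stringsAsFactors = FALSE)

# (1) Cross-attribute constraint: drop an implausible combination.
implausible <- profiles$age == "35" & profiles$experience == "30 years"
profiles <- profiles[!implausible, ]

# All ordered pairs of profiles (cross join); suffixes mark profile 1 and 2.
pairs <- merge(profiles, profiles, by = NULL, suffixes = c("_1", "_2"))

# (2) Cross-profile constraint: remove ties on age.
pairs <- pairs[pairs$age_1 != pairs$age_2, ]

# (3) Weights: sampling probabilities for each pair (uniform here).
pairs$weight <- 1 / nrow(pairs)
set.seed(1)
tasks <- pairs[sample(nrow(pairs), size = 5, prob = pairs$weight), ]
```

Because every admissible pair is listed explicitly, any constraint or weighting scheme reduces to filtering rows or editing the `weight` column.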

Replication package

We should use our software package as much as possible in our replication package.

"Guides" and "Articles"

We now have all vignettes in "Articles." But I want to make "Articles" and "Guides" separately. See the following as an example of the website with "Guides" (https://patchwork.data-imaginist.com/). Does anyone know how to do this? I want to add vignettes explaining how to use functions in our package in "Guides." But other more general articles should be "Articles." At least two articles should be added:

- Recommendation for choice-level (not profile-level) analysis
- Published and working papers examining choice-level quantities of interest

estimation without reshaping

Hello! I have already reshaped my data. How can I run an error-corrected mm or amce estimation without using your reshaping function?

The `agree` variable is currently NA for non-repeated tasks.

@thegaryking and I had a discussion about whether we should specify the agree variable (= 1 if the choice in the repeated task is the same as in the first task, = 0 otherwise) for non-repeated tasks. Currently, agree is NA for all non-repeated tasks. But this could cause an issue when users want to estimate specific choice-level QOIs, when they do not have many observations, and when they want to estimate the QOIs for some specific subgroups. For example, users may want to estimate the choice-level marginal means of choosing a Korean candidate when the race of two candidates is {Korean, White} among Korean respondents. (This is a real issue I encountered in my working paper.) The number of "relevant" repeated tasks, which include {Korean, White} pairs for Korean respondents, is quite small. So, my suggestion is to "impute" the agree variable not only for the (single) repeated task but also for all the other non-repeated tasks. This "assumption" is based on our empirical finding that whether a respondent chooses the same profile in the repeated task is independent of the information contained in the conjoint table.
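
One simple reading of this proposal is to copy each respondent's agree value to all of that respondent's tasks. The sketch below uses a toy data frame with hypothetical column names, not the structure returned by reshape_projoint(), and assumes the repeated task shows the same two profiles with their positions flipped.

```r
# Toy long-format data: one row per respondent-task. `selected` is the
# position (1 = left, 2 = right) of the chosen profile. Hypothetical columns.
choices <- data.frame(
  id       = c(1, 1, 1, 2, 2, 2),
  task     = c("1", "2", "1_repeated", "1", "2", "1_repeated"),
  selected = c(1, 2, 2, 1, 1, 1)
)

# Positions are flipped in the repeated task, so the respondent agrees when
# the selected *position* differs from the one selected in task 1.
first    <- choices[choices$task == "1", c("id", "selected")]
repeated <- choices[choices$task == "1_repeated", c("id", "selected")]
agree_by_id <- merge(first, repeated, by = "id",
                     suffixes = c("_first", "_rep"))
agree_by_id$agree <- as.integer(
  agree_by_id$selected_first != agree_by_id$selected_rep
)

# "Impute" agree to every task of the same respondent.
choices <- merge(choices, agree_by_id[, c("id", "agree")], by = "id")

# Intra-respondent reliability: the share of respondents who agree.
irr <- mean(agree_by_id$agree)
```

With agree available on every row, a subgroup or profile-pair filter no longer discards the reliability information, which is the small-sample problem described above.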

Add a blog page so that we can circulate our article via R-Blogger.

Follow the instructions given by Chat GPT4.

plot.projoint_results()

@aaronrkaufman . The projoint_results class now includes "estimand" as a slot. So in plot.projoint_results(), it is better not to allow researchers to specify .estimand. What do you think? If you agree, can you make this change?

More refined trade-off analysis

Eventually, for choice-level analysis, we should allow users to specify any combination of attribute levels using AND and OR. For example, the arguments should be something like the following:

.profile1 = "(att1:level1 OR att1:level2) AND att2:level1"
.profile2 = "(att1:level3) AND (att2:level2 OR att3:level3)"
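
A rough sketch of how such a specification might be evaluated against a table of profiles is shown below. `match_profile()` is a hypothetical helper, not an existing projoint function, and it assumes attribute and level names contain only letters, digits, and underscores and are never the words AND or OR.

```r
# Hypothetical helper (not part of projoint): turn a specification such as
# "(att1:level1 OR att1:level2) AND att2:level1" into a logical vector with
# one element per row of `data`.
match_profile <- function(spec, data) {
  # "att:level" becomes: att == "level"
  expr <- gsub("([A-Za-z0-9_]+):([A-Za-z0-9_]+)", "\\1 == \"\\2\"", spec)
  # AND / OR become R's vectorised logical operators.
  expr <- gsub("\\bAND\\b", "&", expr, perl = TRUE)
  expr <- gsub("\\bOR\\b", "|", expr, perl = TRUE)
  # Evaluate the expression with the columns of `data` as variables.
  eval(parse(text = expr), envir = data, enclos = baseenv())
}

# Usage with a toy table of profiles.
profiles <- data.frame(
  att1 = c("level1", "level2", "level3", "level1"),
  att2 = c("level1", "level1", "level1", "level2"),
  stringsAsFactors = FALSE
)
keep <- match_profile("(att1:level1 OR att1:level2) AND att2:level1", profiles)
```

Since the helper relies on eval(parse()), a real implementation would want to validate the specification (for example, checking that every attribute and level exists) before evaluating it.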

Subgroup comparison

Another thing we should do is to estimate the difference in marginal means between the two subgroups of respondents and fix the estimated difference. With the "cregg" package, we can estimate it easily. So, it should not be so difficult to fix the estimated difference using our method.

Code of conduct

We should add a Code of conduct like:

https://patchwork.data-imaginist.com/CODE_OF_CONDUCT.html

Reference

I want to organize functions in Reference like the following:
https://wilkelab.org/cowplot/reference/index.html

Test, test, test....

We should follow all the instructions and test whether the software works properly. We need to do the following:

- Use our 8 replication datasets and estimate choice-level and profile-level MMs and AMCEs.
- Try three different ways to estimate SEs.
- Try both removing and not removing ties.
- Try clustered and non-clustered standard errors.
- Importantly, we should try "wrong" arguments -- There should be numerous ways to set arguments incorrectly. Please consider as many of these wrong settings as possible. It is also important to check whether an easy-to-understand and "correct" error message is presented for each "wrong" setting.
- Try various options for plot().

predict_tau()

Hello, I am currently trying to get the IRR from a conjoint without any repetition, i.e. using the extrapolation method. This seems to work really well, thank you!
I wonder, is it possible to see how many data points each point in the plot from predict_tau() is based on? I am asking because some of the confidence intervals are quite large. From the code, it seems this information is available but not stored, which is why I thought I could suggest it for a future update.
Again, thanks so much, cool work!

Originally posted by @brueckmann in #28 (comment)

Roxygen documentation

We need to check all the Roxygen documentation. At least the following part in projoint_results.R is incomplete:

#' @param slots Takes 16 slots: [to be written]

Set multiple levels for the attribute of interest

The current version of our package should estimate the choice-level MMs or AMCEs for multiple levels -- e.g., MM of choosing White when another candidate is Black or Hispanic. I will add a section to Article 7 (Explore further).

Report IRR in print()

I need to add the estimated IRR when we use print(). This is a simple oversight and should be fixed.

.se_type_1, .se_type_2 hard coded?

Thanks to the team for all the work on this project and package. In projoint_level, I think that the values of .se_type_1 and .se_type_2 aren't being passed to pj_estimate, which seems to hard code both values as "classical".

As a result, the two calls below yield the same standard error estimates.

data("exampleData1")

outcomes <- paste0("choice", seq(from = 1, to = 8, by = 1))
outcomes <- c(outcomes, "choice1_repeated_flipped")

reshaped_data <- reshape_projoint(
  .dataframe = exampleData1, 
  .outcomes = outcomes)

summary(projoint(reshaped_data, .se_type_2 = "classical"))
summary(projoint(reshaped_data, .se_type_2 = "CR2"))