openintrostat / openintro

📦 R package for data and supplemental functions for OpenIntro resources

Home Page: http://openintrostat.github.io/openintro/

License: GNU General Public License v3.0

R 100.00%
data openintro rstats rstats-package

openintro's People

Contributors

ameliamn, andrewpbray, beanumber, daviddiez, hardin47, jtr13, mine-cetinkaya-rundel, ngoguened, npaterno, openintroorg, rudeboybert, sjvrensburg, suriyaa

openintro's Issues

[Bug]: fastfood data has incorrect salad variable

Contact Details

[email protected]

Bug

The fastfood data set has a salad variable with all 515 values "Other". Looking at the item descriptions, it does appear that there are actual salads in the data set.

Reproducible Example

library(openintro)
#> Loading required package: airports
#> Loading required package: cherryblossom
#> Loading required package: usdata
table(fastfood$salad)
#>
#> Other
#> 515

Expected Behavior

I expected to see some foods classified as salads and others not.
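One possible way to rebuild such a variable is to flag items whose description contains the word "salad". The sketch below is only an illustration: the `menu` data frame and its `item` column are hypothetical stand-ins for the real data, and the "Salad"/"Other" labels are an assumption about the intended coding.

```r
# Hypothetical item names standing in for the item descriptions in fastfood.
menu <- data.frame(
  item = c("Southwest Salad", "Big Mac", "Caesar Side Salad", "Fries"),
  stringsAsFactors = FALSE
)

# Flag any item whose name contains "salad", ignoring case.
menu$salad <- ifelse(grepl("salad", menu$item, ignore.case = TRUE),
                     "Salad", "Other")

table(menu$salad)
```

A plain pattern match like this would also catch items such as "salad dressing", so the result would still need a manual check.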

Session Info

No response

Additional context

No response

rosling_responses mentioned in text but not present in package

On page 191, the Fourth Edition of the textbook mentions the rosling_responses data set:

"We will use the rosling_responses data set to evaluate the hypothesis test ..."

Use of the texttt font for "rosling_responses" suggests that such a data set exists in the package, but it doesn't.
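A quick way to check whether a named data set ships with an installed package is to look it up in the index returned by `data(package = ...)`. The helper `has_dataset()` below is a hypothetical name, and the built-in datasets package is used as a stand-in so the sketch runs without openintro installed.

```r
# Hypothetical helper: TRUE if `name` is listed among the data sets of `pkg`.
has_dataset <- function(name, pkg) {
  items <- data(package = pkg)$results[, "Item"]
  # Entries can look like "beaver1 (beavers)"; keep the object name only.
  name %in% sub(" .*$", "", items)
}

has_dataset("cars", "datasets")
has_dataset("rosling_responses", "datasets")
```

With openintro installed, the same check would be `has_dataset("rosling_responses", "openintro")`.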

yrbss isn't in the OpenIntro packages

Hi,
the yrbss data is used in the OpenIntro text.
The yrbss data is available to download on the Github site.
So far as I can tell, the yrbss data hasn't been added to the OpenIntro packages.
Should it be?

Remove message that appears when package loads

Referring to the text that says "Please visit openintro.org for free statistics". It shows up in compiled markdown documents (as shown below). It's possible to mute it with the message = FALSE chunk option, but I think we want to be careful about teaching that to students who are new to R.

screen shot 2015-09-16 at 17 05 26

Leaving the issue here to be considered before the next version of the package...
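For reference, base R can silence this kind of message at load time, provided the package emits it with `packageStartupMessage()` rather than `message()`. The sketch below assumes that is the case; `load_quietly_demo()` is a hypothetical stand-in for attaching the package so the example runs without openintro installed.

```r
# Hypothetical stand-in for a package attach hook that prints a greeting.
load_quietly_demo <- function() {
  packageStartupMessage("Please visit openintro.org for free statistics materials")
  invisible("loaded")
}

# Only startup messages are muffled; warnings and errors still surface.
result <- suppressPackageStartupMessages(load_quietly_demo())
```

In a real session the call would be `suppressPackageStartupMessages(library(openintro))`, which has the same teaching cost as the chunk option mentioned above.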

As per the Korean font error

Hi,

When I render an image containing Korean characters, the Korean characters are broken. See myPDF in variable.R.

However, when I test with CairoPDF, for instance, the Korean characters are rendered correctly, but there are width and height issues.

image

I think other Asian characters will have similar issues when using the openintro package.

Thank you.

dotPlot() collides with mosaic::dotPlot()

From looking at your examples, I'm not exactly sure what the purpose of your dotPlot() is supposed to be, but it is unfortunate that you have chosen a name that conflicts with the version in the mosaic package, which makes the kind of dot plot often seen in introductory statistics courses.

mosaic::dotPlot( ~ rnorm(500), width = 0.1)

image

code by chapter

Is there a place where I can find the R code by chapter for the openintro book?

yrbss documentation

Do we know which year's survey is included in this dataset? Also, do we know if the variable called gender is what's identified in the 2017 data documentation as sex?

I'm happy to do a PR to clarify those things if we can track them down.

Why mask data sets in datasets?

This seems unnecessary and confusing:

library(openintro)
## Please visit openintro.org for free statistics materials
## 
## Attaching package: ‘openintro’
## 
## The following objects are masked from ‘package:datasets’:
## 
##     cars, chickwts, trees
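When an object is masked like this, the original can still be reached with the `::` operator. A minimal sketch, using a made-up `cars` data frame as a stand-in for the masking object so it runs without openintro installed:

```r
# Hypothetical stand-in for a package data set that reuses the name `cars`.
cars <- data.frame(price = c(15.9, 33.9), type = c("small", "midsize"))

names(cars)            # the masking object found first on the search path
names(datasets::cars)  # the original from the datasets package
```

This workaround still requires users to know which version they are getting, which is the confusion the issue describes.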

Add a page with csv download for all data

This would be helpful for non-R users of the datasets.

@DavidDiez I know you host these on openintro.org but keeping synced seems a challenge. I could automate it here and post on the package websites and openintro.org could point to them. Or I suppose you could build the page on your end based on the automatically generated files in this repo as well. We should discuss which approach is preferable, but at least automatically generating files as we update the package seems like a good idea.
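As a rough sketch of what the automated export could look like, the function below writes every data frame in a package to its own CSV file. `export_package_csvs()` and the output directory are hypothetical names, and the built-in datasets package is used as a stand-in so the code runs without openintro installed.

```r
# Hypothetical helper: write each data frame in `pkg` to <out_dir>/<name>.csv.
export_package_csvs <- function(pkg, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  items <- data(package = pkg)$results[, "Item"]
  # Entries can look like "beaver1 (beavers)"; keep the object name only.
  object_names <- unique(sub(" .*$", "", items))
  written <- character(0)
  for (nm in object_names) {
    obj <- tryCatch(getExportedValue(pkg, nm), error = function(e) NULL)
    if (!is.data.frame(obj)) next  # skip vectors, time series, matrices, ...
    path <- file.path(out_dir, paste0(nm, ".csv"))
    # Row names are dropped; skip any object that cannot be written.
    ok <- tryCatch({
      write.csv(obj, path, row.names = FALSE)
      TRUE
    }, error = function(e) FALSE)
    if (ok) written <- c(written, path)
  }
  invisible(written)
}

csv_dir <- file.path(tempdir(), "openintro-csv")
files <- export_package_csvs("datasets", csv_dir)
```

Running something like this in a build step each time the package is updated would keep the CSV files in sync with the package data.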

Add additional citation to BAC

#' @source J. Malkevitch and L.M. Lesser. For All Practical Purposes:

From Jack Miller:

The blood alcohol data set has been around since 1992 and appeared in the Electronic Encyclopedia of Statistical Examples and Exercises. I worked on EESEE and used the data sets at OSU, so I am very familiar with that particular citation. :-) Here is a URL for that particular "story" in EESEE: http://bcs.whfreeman.com/WebPub/Statistics/shared_resources/EESEE/BloodAlcoholContent/index.html.

This change will need to propagate to IMS and other books that reference this dataset as well.

Email data corrections

  • In both email and email50 there are variables in the docs that don't exist in the data: period_mess and signoff -- should be removed from docs
  • email50 example code yields FALSE (random sampling change might be the cause?)
  • In both datasets indicator variables should be factors
  • cc is numeric, not indicator
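The factor conversion could be done along these lines. This is a sketch on a made-up miniature data frame, not the real email data: `email_demo`, its values, and the helper `is_indicator()` are all hypothetical.

```r
# Hypothetical miniature of the email data; real columns may differ.
email_demo <- data.frame(
  spam = c(0, 1, 0),
  to_multiple = c(1, 0, 0),
  cc = c(0, 2, 5)
)

# Treat a numeric column as an indicator if it only holds 0, 1, or NA.
is_indicator <- function(x) is.numeric(x) && all(x %in% c(0, 1, NA))

indicator_cols <- names(email_demo)[vapply(email_demo, is_indicator, logical(1))]
email_demo[indicator_cols] <- lapply(email_demo[indicator_cols],
                                     factor, levels = c(0, 1))

str(email_demo)
```

A count column that happens to contain only 0 and 1 would be misclassified by this rule, so an explicit list of indicator columns may be the safer choice.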

qqnormsim() ideas

  1. Use scales = "free" or, better, add a scales argument that defaults to "free". [Otherwise a sample with an outlier will cause the other plots to look quite different from how they would look if they were generated in isolation.]

  2. Don't hard-code the number of simulations. Let 8 be the default if you like.

  3. Rename the first argument? It's a bit of an odd name. But I'm guessing it will typically be used without naming, so this is not such a big deal.

  4. Consider a version that doesn't label the original data but makes it one of the samples (randomly selecting which location). Not sure of the best way to do the "reveal".

  5. Perhaps add a seed argument that sets the seed used. That would solve the reveal issue in one way, since the plot could be generated again with the original data set distinguished.

  6. Complete the documentation and include examples.
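Ideas 2, 4, and 5 could be combined roughly as follows. This is a base-R sketch of the data-generation step only, not the package's implementation: `simulate_qq_panels()` and its arguments `n_sim`, `seed`, and `hide_original` are hypothetical names.

```r
# Hypothetical helper: build the real sample plus n_sim normal simulations
# with matching mean and sd, ready to be drawn as one QQ plot per panel.
simulate_qq_panels <- function(x, n_sim = 8, seed = NULL, hide_original = FALSE) {
  if (!is.null(seed)) set.seed(seed)  # note: this changes the global RNG state
  x <- x[!is.na(x)]
  sims <- replicate(n_sim, rnorm(length(x), mean = mean(x), sd = sd(x)),
                    simplify = FALSE)
  # Position of the real data among the n_sim + 1 panels.
  pos <- if (hide_original) sample.int(n_sim + 1, 1) else 1
  panels <- append(sims, list(x), after = pos - 1)
  labels <- paste("panel", seq_along(panels))
  if (!hide_original) labels[pos] <- "data"
  out <- data.frame(panel = rep(labels, each = length(x)),
                    value = unlist(panels))
  attr(out, "original_panel") <- pos  # kept for the "reveal"
  out
}

hidden <- simulate_qq_panels(c(1, 2, 3, 4, 10), n_sim = 3, seed = 42,
                             hide_original = TRUE)
attr(hidden, "original_panel")
```

Calling the function again with the same seed and `hide_original = TRUE` reproduces the same panels, so the attribute can serve as the reveal. Drawing each panel with its own `qqnorm()` call would give every plot its own axes, which is the free-scales behavior suggested in idea 1.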
