scpoecon / scpoeconometrics

Undergraduate textbook for Econometrics with R

Home Page: https://ScPoEcon.github.io/ScPoEconometrics/

License: Other

Shell 0.08% TeX 0.71% CSS 0.17% HTML 98.80% R 0.24%
econometrics political-science r sociology teachers teaching textbook tutorial

scpoeconometrics's People

Contributors

floswald, jeroen, jjallaire, philomonk, pvilledieu, schloerke, vviers, yihui

scpoeconometrics's Issues

standard error for non-normal model

this is similar to our existing app. the following list could be split into various apps.

  • Now simulate a simple linear model with uniform errors and repeat the preceding experiment with different sample sizes (2, 5, 10, 100). The distribution of estimates should become normal very quickly.
  • Increase N again. You should find that the sd tends to 0 and estimates from different draws become identical. Define consistency as this phenomenon.
  • Plot the variance of the estimates as a function of N. Add 1/N to the plot. (Or std errors and 1/\sqrt{N}).
  • To understand this phenomenon, calculate the variance of the mean of an iid sample. Recall that the sample mean is the OLS estimator of the most simple linear model. Statistical theory shows that this intuition applies to more complex models.
  • Show then how std errors vary with the error variance and the variance of the regressor. Give the theoretical formula.
  • Plot R2 as a function of N. One should now understand that in small samples, statistical error adds to the population error.
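
A minimal sketch of the first few steps, assuming a true model y = 1 + 2x + u with u uniform on [-1, 1] and x standard normal (all of these values are illustrative choices, not taken from the existing app). For the sample mean of an iid sample the variance is sigma^2/N, and for the simple regression slope the standard error is roughly sigma_u / (sd(x) * sqrt(N)), which is the dashed reference line below. The smallest sample size here is 5 rather than 2 to keep the simulated sd finite.

```r
# Sampling distribution of the OLS slope with uniform (non-normal) errors
set.seed(1)
a <- 1; b <- 2                        # assumed true intercept and slope

# draw one sample of size n and return the estimated slope
draw_slope <- function(n) {
  x <- rnorm(n)
  u <- runif(n, min = -1, max = 1)    # mean-zero uniform error, variance 1/3
  y <- a + b * x + u
  unname(coef(lm(y ~ x))[2])
}

sizes  <- c(5, 10, 100, 1000)
reps   <- 500
sd_hat <- sapply(sizes, function(n) sd(replicate(reps, draw_slope(n))))

# simulated sd of the slope against the sigma_u / sqrt(N) reference rate
plot(sizes, sd_hat, log = "xy", type = "b",
     xlab = "N", ylab = "sd of slope estimates")
lines(sizes, sqrt(1 / 3) / sqrt(sizes), lty = 2)
```

Replacing `sd_hat` by `sd_hat^2` and the reference line by `(1 / 3) / sizes` gives the variance version of the same plot.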

regression + SSR app

related to #6 : same app but also with a 3D plot of SSR(a,b).

try with code in inst/chapter4.

  • change pch: use points?
  • maybe change colorbar?
  • simulated data for y = a + bx
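
One possible starting point for the 3D plot, using base `persp` on simulated data. The true values a = 1, b = 0.5 and the grid ranges are assumptions made for illustration; the code in inst/chapter4 may do this differently.

```r
# SSR(a, b) surface for simulated data from y = a + b x + noise
set.seed(2)
n <- 50
x <- runif(n, 0, 10)
y <- 1 + 0.5 * x + rnorm(n)

# sum of squared residuals for a candidate intercept a and slope b
ssr <- function(a, b) sum((y - a - b * x)^2)

a_grid <- seq(-2, 4, length.out = 40)
b_grid <- seq(-0.5, 1.5, length.out = 40)
z <- outer(a_grid, b_grid, Vectorize(ssr))   # 40 x 40 matrix of SSR values

persp(a_grid, b_grid, z, theta = 30, phi = 25,
      xlab = "a", ylab = "b", zlab = "SSR(a, b)")

# the OLS estimates sit at the bottom of this surface
ols <- coef(lm(y ~ x))
```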

anscombe's quartet

this is related to 4.1.4 in https://scpoecon.github.io/ScPoEconometrics/linreg.html

the example illustrates that linear statistics are only informative about linear relationships. all 4 plots have the same statistics (mean, regression line, correlation, etc.), but obviously this is not very informative if the data are very nonlinear. it would be great to have an app that uses this example. One could do

  1. start with the 4 plotted datasets only
  2. one by one uncover the next statistic, making them realize that they are all the same
  3. takeaway should be that there needs to be visual inspection of the data, next to just computing the correlation coefficient.

this is the code from the R help for anscombe which makes the 4 plots.

##-- now some "magic" to do the 4 regressions in a loop:
ff <- y ~ x
mods <- setNames(as.list(1:4), paste0("lm", 1:4))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  ## or   ff[[2]] <- as.name(paste0("y", i))
  ##      ff[[3]] <- as.name(paste0("x", i))
  mods[[i]] <- lmi <- lm(ff, data = anscombe)
}

op <- par(mfrow = c(2, 2), mar = 0.1+c(4,4,1,1), oma =  c(0, 0, 2, 0))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  plot(ff, data = anscombe, col = "red", pch = 21, bg = "orange", cex = 1.2,
       xlim = c(3, 19), ylim = c(3, 13),main=paste("dataset",i))
  abline(mods[[i]], col = "blue")
}
par(op)
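
For step 2 (uncovering the statistics one by one), the numbers themselves could be collected as in the sketch below. It uses only the built-in `anscombe` data; the name `stats` and the choice of which statistics to show are illustrative.

```r
# summary statistics for each of the four Anscombe datasets
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)
  c(mean_x    = mean(x),
    mean_y    = mean(y),
    cor       = cor(x, y),
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]))
})
colnames(stats) <- paste0("dataset", 1:4)

# one row per statistic, one column per dataset
round(stats, 2)
```

Each row of the table is nearly constant across the four columns, which is the point the app should make before the plots are compared.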

multicollinearity app

  • illustrate with a 3d plot similar to the bottom of 06-MultipleReg.Rmd
  • fake data, N=100 or so
  • slider that controls correlation between x1 and x2
  • show that as the correlation increases, the resulting plane rests on an ever narrower support
  • need to link that to standard error of estimates somehow
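
A rough way to make the link to standard errors, sketched without any app code: build x2 with a chosen correlation rho to x1 and watch the standard error of the x1 coefficient. The model y = 1 + x1 + x2 + u and the rho values are assumptions for illustration. In theory the standard error scales with 1/sqrt(1 - r^2), where r is the sample correlation between the regressors.

```r
# standard error of the x1 coefficient as corr(x1, x2) increases
set.seed(3)
n  <- 100
x1 <- rnorm(n)
z  <- rnorm(n)    # independent component used to build x2
u  <- rnorm(n)    # regression error, held fixed across rho

se_x1 <- function(rho) {
  x2 <- rho * x1 + sqrt(1 - rho^2) * z   # population corr(x1, x2) = rho
  y  <- 1 + x1 + x2 + u
  summary(lm(y ~ x1 + x2))$coefficients["x1", "Std. Error"]
}

rhos <- c(0, 0.5, 0.9, 0.99)
se   <- sapply(rhos, se_x1)

plot(rhos, se, type = "b",
     xlab = "correlation between x1 and x2", ylab = "std. error of b1")
```

A slider on `rho` that redraws both the 3D plane and this standard error would tie the two ideas together.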

create a manual for each app/tutorial

  • we need a doc associated to each app that tells exactly what this is showing
  • tutorials should be more self-explanatory, but if there is anything noteworthy, it should go into that document.
  • almost like a step-wise guide through the app
  • also a recipe of how to use it with students, eg "first explain this, then that, then tell them to click here, then there"
  • should be put in the same folder as the app
  • format? probably .Rmd so we can include latex easily (and maybe even distribute to students)

chapter 3 correlation

  • make an app that plays on correlation.
library(mvtnorm)
set.seed(10)
cor = 0.9
sig = matrix(c(1,cor,cor,1),c(2,2))
ndat = data.frame(rmvnorm(n=300,sigma = sig))
x = ndat$X1
y = ndat$X2
par(pty="s")
plot(y ~ x, xlab="x",ylab="y")

and move cor

change `draw` button to jump k draws ahead

  • it's cumbersome to get the histograms in the standard_errors_simple app
  • one has to click at least 100 times to get an idea of the distribution of estimates
  • can we have a second button "draw 10 samples" or "draw 20 samples" or so?

random number generation app

  • we need to introduce the idea of a data generating process
  • we are working towards the concept of model
  • for that we need to understand how to draw observations from a process
  • the idea that every time you draw, you get a different sample of points
  • ultimately this will inform the discussion about standard errors in regression models
  • we've already done "as you increase N, the sample approximates better the pdf", so need something else

app ideas

  • i want them to repeatedly draw from a process and always see different realizations.
  • let's go with the regression y = a + bx + u
  • assume u ~ N(0,sigma), fix a and b
  • have a button "redraw"
  • scatter plot of y ~ x
  • N < 20
  • could do something with a joint normal distribution instead of regression
  • i like the idea of just observing the changing scatter plot rather than looking at a histogram.
  • other ideas welcome - let's discuss here!
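
To get the discussion going, here is a sketch of what the "redraw" button could call. The values a = 1, b = 0.5, N = 15, the uniform x, and reading sigma as a standard deviation are all assumptions; `redraw` is a hypothetical helper, not an existing function in the package.

```r
# one call to redraw() is one new sample from the data generating process
a <- 1; b <- 0.5      # fixed intercept and slope
sigma <- 1            # sd of the error u
n <- 15               # N < 20 as suggested

redraw <- function() {
  x <- runif(n, 0, 10)
  u <- rnorm(n, mean = 0, sd = sigma)
  data.frame(x = x, y = a + b * x + u)
}

# three draws side by side: same process, different scatter each time
op <- par(mfrow = c(1, 3))
for (i in 1:3) {
  d <- redraw()
  plot(y ~ x, data = d, xlim = c(0, 10), ylim = c(-3, 10),
       main = paste("draw", i))
  abline(a, b, lty = 2)   # the true line stays where it is
}
par(op)
```

Keeping the axis limits fixed makes it easier to see that only the points move while the underlying line does not.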

aboutApp returns wrong error message

this:

> aboutApp("regression")
Error: Please run `launchApp()` with a valid app as an argument.
Valid apps are: 'anscombe', 'confidence_intervals', 'corr_continuous', 'datasaurus', 'demeaned_reg', 'multicollinearity', 'reg_constrained', 'reg_dummy', 'reg_dummy_example', 'reg_full', 'reg_multivariate', 'reg_simple', 'reg_standardized', 'rescale', 'sampling', 'SSR_cone', 'standard_errors_changeN', 'standard_errors_simple'

needs to be changed to

# locate all the shiny app examples that exist
valid <- character(0)
v <- list.files(system.file("shinys", package = "ScPoEconometrics"), full.names = TRUE)
for (i in v) {
  # keep app i only if it has an about.Rmd
  if (file.exists(file.path(i, "about.Rmd"))) {
    valid <- c(valid, basename(i))
  }
}

chapter 4: regression specifics

  • one app that illustrates that if you choose slope b=0, you estimate the mean of y
  • same app, what happens if you set intercept a=0? what if you demean the data?
  • Make them run the regression on standardized regressor (divided by sd). Compare to correlation between x and y. standardizing vars gives you correlation from b

additional version to regapp

  • Make them run the regression on standardized regressor (divided by sd).
  • Compare to correlation.
  • standardizing vars gives you correlation from β

set code prompt as global var

  • file with prompt
  • don't want to have to change all Rmd each time we change the prompt
  • it's tricky because R CMD CHECK removes unnecessary files.

change regapp squared errors

currently you do

    rect(xleft = x, ybottom = y,
         xright = x + abs(errors), ytop = y + errors, density = -1,
         col = rgb(red = 0, green = 1, blue = 0, alpha = 0.1), border = NA)

I find it nicer like this:

    rect(xleft = x, ybottom = y,
         xright = x + abs(errors), ytop = y - errors, density = -1,
         col = rgb(red = 0, green = 1, blue = 0, alpha = 0.1), border = NA)

(just swap the sign in front of errors for y.)

this will have rectangles point downwards for points above the line, and vice versa.

anova

some of this is relevant for both book and apps.

  • https://en.wikipedia.org/wiki/Analysis_of_variance
  • For a given sample, show the histogram of y_i and the histogram of \hat{y}_i. Make them reflect on what it means for the histograms that the regression line is inside the scatterplot.
  • Calculate the variances of y_i and of \hat{y}_i. What do you remark about the difference of these variances and the mean square error (SSR/N)?
  • Define ANOVA
  • Apply ANOVA to wages and education. Estimate between and within education group inequality.
  • define R2
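
The variance comparison in the third bullet can be sketched as follows, assuming a regression with an intercept and variances computed with divisor N (so they line up with SSR/N). The simulated model and the helper `pop_var` are illustrative.

```r
set.seed(5)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

fit   <- lm(y ~ x)
y_hat <- fitted(fit)

# variance with divisor N rather than N - 1
pop_var <- function(v) mean((v - mean(v))^2)

mse <- sum(resid(fit)^2) / n          # SSR / N

# decomposition: var(y) = var(y_hat) + SSR / N
c(var_y = pop_var(y), var_fit_plus_mse = pop_var(y_hat) + mse)

# R2 is the share of var(y) captured by the fitted values
r2 <- pop_var(y_hat) / pop_var(y)
```

Plotting `hist(y)` next to `hist(y_hat)` on the same x-axis shows the narrower spread of the fitted values.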

tutorial that plays with rescaling of x,y or both

  • it's always confusing for students to think about what happens when one transforms one of the vars in a regression

  • i am not thinking about log transforms for now

  • just say that we multiply both y and x with a constant c. what happens to a and b?

  • what if only x?

  • what if only y?

  • if you have simple way to show the log-log model, make a proposal, otherwise let's disregard this for now.
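
For reference, the answers to the three rescaling questions can be verified numerically. A sketch with an arbitrary constant `c0 = 10` and simulated data (both are assumptions):

```r
set.seed(6)
x  <- runif(100, 0, 10)
y  <- 2 + 3 * x + rnorm(100)
c0 <- 10                       # rescaling constant

x_s <- c0 * x
y_s <- c0 * y

base   <- unname(coef(lm(y ~ x)))        # (a, b)
both   <- unname(coef(lm(y_s ~ x_s)))    # (c0 * a, b)
only_x <- unname(coef(lm(y ~ x_s)))      # (a, b / c0)
only_y <- unname(coef(lm(y_s ~ x)))      # (c0 * a, c0 * b)

rbind(base, both, only_x, only_y)
```

Scaling both variables leaves the slope alone and scales the intercept, scaling only x divides the slope, and scaling only y multiplies both coefficients.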

bug in tutorials?

On R version 3.4.4 (2018-03-15)
I've loaded the libraries ScPoEconometrics 0.0.1 and learnr 0.9.2.1.

When running:

run_tutorial("chapter3",package="ScPoEconometrics")

I obtain:

Listening on http://127.0.0.1:7177
Warning: Error in value[[3L]]: Couldn't normalize path in addResourcePath, with arguments: prefix = 'font-awesome-4.5.0'; directoryPath = '' [No stack trace available]

Related to the following issue:
https://stackoverflow.com/questions/51213726/when-running-a-tutorial-with-learnr-r-gives-an-error

Chapter 1 - Getting Started with R

TODO:

  • add more on control flow, and for-loops in particular
  • Vectorization: leave it mostly as it is but remove the crazy scary word "vectorization". This can be explained much more simply e.g. "most operations made on vectors are made on each elements in the vector individually"
  • is dplyr really needed in an intro to R session? especially the pipe would be very confusing I think
  • in the same vein, we talk about dataframes but make most of our examples with tibbles... Granted, they're the same, but maybe it's a little early to introduce them here?
  • 1.7.7.1 Task 5 asks students to use the table function which is not introduced anywhere else in the book... Maybe let's keep it to teach them the virtue of figuring things out by yourself?
  • Maybe let's find another way to illustrate default function arguments than with biased or unbiased variance (which they are not supposed to know and so will not necessarily "speak" to them...)

Multivariate 2: dummy variables

  • continuing from the example of #18 , now we introduce dummy vars
  • a dummy is an X \in {0,1}, as in #18
  • the difference is that if added to another (continuous for a start) regressor, it shifts the intercept.
  • do wage regression w = a + b * gender + c * height, gender \in {0,1}
  • I guess a tickbox with set gender = 1 and a shifting regression line is what we want.
  • any other ideas you have here, just put forward below.
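
A sketch of the shifting regression line with simulated wage data. The coefficients 5, 2 and 0.1 and the height distribution are invented for illustration; a real dataset could replace them.

```r
# w = a + b * gender + c * height, gender in {0, 1}
set.seed(7)
n      <- 200
gender <- rbinom(n, 1, 0.5)
height <- rnorm(n, mean = 170, sd = 10)
wage   <- 5 + 2 * gender + 0.1 * height + rnorm(n)

fit <- lm(wage ~ gender + height)
cf  <- unname(coef(fit))       # (a, b, c)

plot(height, wage, col = ifelse(gender == 1, "red", "blue"))
abline(a = cf[1],         b = cf[3], col = "blue")  # line for gender = 0
abline(a = cf[1] + cf[2], b = cf[3], col = "red")   # same slope, intercept shifted by b
```

A tickbox for gender = 1 would simply toggle between the two `abline` calls.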

realworld example with `lm`

  • let's have a good real world example with lm to finish off the simple reg_app section
  • let's get a dataset from package Ecdat
  • not too many explanatory variables?
  • needs to come chronologically after #24

another standardize app

  • we need another standardization.
  • need x - mean(x), same for y.
  • to show that you get the same slope coefficient b as in the original case, but a zero intercept in the demeaned case.
  • app should be called demeaned-reg
  • it's mostly copy and paste from reg_standardize
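
The result the app should display, checked on simulated data (parameter values are illustrative):

```r
set.seed(8)
x <- rnorm(100, mean = 4, sd = 2)
y <- 1 + 0.8 * x + rnorm(100)

# original regression
fit <- lm(y ~ x)

# demeaned regression: subtract the mean from both variables
xd <- x - mean(x)
yd <- y - mean(y)
fit_d <- lm(yd ~ xd)

# same slope, intercept numerically zero in the demeaned case
rbind(original = unname(coef(fit)), demeaned = unname(coef(fit_d)))
```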

reorganize into apps

  • i set this up as a proper package now
  • please file a PR that reorganizes all standalone apps into single file apps
  • an example is shown in /inst/shinys/reg1 for the first regapp
  • create /inst/shinys/reg2 for the second and so on
  • if you are unsure about whether a certain app should be part of a tutorial eventually or be a standalone app, leave as it is. (or ask!)
    thanks!

shorten/simplify chapter 1

  • there is too much stuff in there at the moment
  • split into basic and advanced usage (separate chapters)

standard errors app

  • similar to sampler app
  • plot regression lines
  • next to implied estimates
  • show that they have a distribution.

y = a + bx + u , x \in {0,1}

  • special case with one regressor only that takes 2 discrete values
  • don't mention dummy variable yet, just show what this estimates
  • the intercept estimates the conditional mean E[y|x=0]; the slope estimates the difference E[y|x=1] - E[y|x=0]
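
A quick check of what this regression estimates, using simulated data with assumed values a = 1 and b = 2:

```r
set.seed(9)
x <- rbinom(100, 1, 0.4)          # regressor taking only the values 0 and 1
y <- 1 + 2 * x + rnorm(100)

cf <- unname(coef(lm(y ~ x)))

mean0 <- mean(y[x == 0])          # sample analogue of E[y | x = 0]
mean1 <- mean(y[x == 1])          # sample analogue of E[y | x = 1]

# intercept equals the group-0 mean, slope equals the difference in means
rbind(ols = cf, group_means = c(mean0, mean1 - mean0))
```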
