scpoecon / scpoeconometrics

Undergraduate textbook for Econometrics with R

Home Page: https://ScPoEcon.github.io/ScPoEconometrics/

License: Other

Shell 0.08% TeX 0.71% CSS 0.17% HTML 98.80% R 0.24%

econometrics political-science r sociology teachers teaching textbook tutorial

scpoeconometrics's People

gitter-badger vviers jaimono jngod2011 jsay shshnkg marta-facchini alekproshin snowdj hal2001 rsbivand supermayo tyleransom opmc2 schloerke wacholder000 lnsongxf jonduan vedavyas003 ewillyliew nofacetou pvilledieu fhoces krishnapsrinivasan xavierragot noleoine jonasheipertz lystahi roman-cmyk michelefioretti bjornerstedt srravula1 mmkuang victoriarina huashe418 erwannsbai econometrics nicholaskarlson anonymnous2023-lab tomraster jarodriguez849 anoukbor jaeyungkim sebastian-olascoaga daffeh10 zaynabzaher mcgutes javis25 hoien jannaert nathalienf sirberha wlkpd hsyngmtrk sreconometrics jameb992g3 fbw1 gragusa jau104 yifeiding-ucr

scpoeconometrics's Issues

standard error for non-normal model

this is similar to our existing app. the following list could be split into various apps.

Now simulate a simple linear model with uniform errors and repeat the preceding experiment with different sample sizes (2, 5, 10, 100). The distribution of estimates should become normal very quickly.
Increase N again. You should find that the sd tends to 0 and estimates from different draws become nearly identical. Define consistency as this phenomenon.
Plot the variance of the estimates as a function of N. Add 1/N to the plot. (Or std errors and 1/\sqrt{N}.)
To understand this phenomenon, calculate the variance of the mean of an iid sample. Recall that the sample mean is the OLS estimator of the simplest linear model. Statistical theory shows that this intuition applies to more complex models.
Show then how std errors vary with the error variance and the variance of the regressor. Give the theoretical formula.
Plot R2 as a function of N. One should now understand that in small samples, statistical error adds to the population error.
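
A minimal sketch of the simulation steps above, assuming a true model y = 1 + 2x + u with u uniform on [-1, 1]. The function name `simulate_slopes`, the parameter values, the sample sizes and the number of replications are illustrative choices, not taken from the existing app.

```r
# Simulate the OLS slope under uniform (non-normal) errors.
# All names and parameter values are illustrative assumptions.
simulate_slopes <- function(n, reps = 2000, a = 1, b = 2) {
  replicate(reps, {
    x <- rnorm(n)
    u <- runif(n, min = -1, max = 1)   # mean-zero, non-normal errors
    y <- a + b * x + u
    cov(x, y) / var(x)                 # OLS slope in the simple regression
  })
}

set.seed(1)
sizes <- c(5, 10, 100, 1000)
sds <- sapply(sizes, function(n) sd(simulate_slopes(n)))

# sd of the slope estimates against a 1/sqrt(N) benchmark (dashed)
plot(sizes, sds, log = "xy", type = "b",
     xlab = "N", ylab = "sd of slope estimates")
lines(sizes, sds[1] * sqrt(sizes[1] / sizes), lty = 2)
```

Calling `hist(simulate_slopes(100))` shows the distribution of the estimates for one sample size. The dashed line is only a rate benchmark anchored at the smallest N, so the match is approximate in small samples.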

tutorials prerendering doesn't work

i have issues with run_tutorial("chapter3","ScPoEconometrics")

second lm example

i have a book on my desk with a good lm example

regression + SSR app

related to #6 : same app but also with a 3D plot of SSR(a,b).

try with code in inst/chapter4.

change pch: use points?
maybe change colorbar?
simulated data for y = a + bx

anscombe's quartet

this is related to 4.1.4 in https://scpoecon.github.io/ScPoEconometrics/linreg.html

the example illustrates that linear statistics are only informative about linear relationships. all 4 plots have the same statistics: mean, reg line, corr etc, but obviously this is not very informative if the data are very nonlinear. it would be great to have an app that uses this example. One could do

start with the 4 plotted datasets only
one by one uncover the next statistic, making them realize that they are all the same
takeaway should be that there needs to be visual inspection of the data, next to just computing the correlation coefficient.

this is the code from the R help for anscombe which makes the 4 plots.

##-- now some "magic" to do the 4 regressions in a loop:
ff <- y ~ x
mods <- setNames(as.list(1:4), paste0("lm", 1:4))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  ## or   ff[[2]] <- as.name(paste0("y", i))
  ##      ff[[3]] <- as.name(paste0("x", i))
  mods[[i]] <- lmi <- lm(ff, data = anscombe)
}

op <- par(mfrow = c(2, 2), mar = 0.1+c(4,4,1,1), oma =  c(0, 0, 2, 0))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  plot(ff, data = anscombe, col = "red", pch = 21, bg = "orange", cex = 1.2,
       xlim = c(3, 19), ylim = c(3, 13),main=paste("dataset",i))
  abline(mods[[i]], col = "blue")
}
par(op)
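
To go with the "uncover the next statistic" step, here is a hedged sketch that tabulates the statistics for the four datasets using the built-in `anscombe` data. The object name `stats` and the choice of statistics are illustrative; the app could show others.

```r
# Summary statistics for the four Anscombe datasets (built-in `anscombe`).
stats <- t(sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)
  c(mean_x    = mean(x),
    mean_y    = mean(y),
    cor       = cor(x, y),
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]))
}))
rownames(stats) <- paste0("dataset", 1:4)

# One row per dataset; the rows agree to about two decimals
round(stats, 2)
```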

fill in team in syllabus page

multicollinearity app

illustrate with a 3d plot similar to the bottom of 06-MultipleReg.Rmd
fake data, N=100 or so
slider that controls correlation between x1 and x2
show that as the correlation increases, the resulting plane rests on an ever narrower support
need to link that to standard error of estimates somehow
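
One possible way to make the link to standard errors, sketched with fake data and base R only. The helper `se_b1`, the coefficients and the grid of correlations are assumptions for illustration, not code from the package.

```r
# Standard error of the x1 coefficient as cor(x1, x2) grows.
# se_b1 and all parameter values are illustrative assumptions.
se_b1 <- function(rho, n = 100) {
  x1 <- rnorm(n)
  x2 <- rho * x1 + sqrt(1 - rho^2) * rnorm(n)  # population cor(x1, x2) = rho
  y  <- 1 + 2 * x1 + 3 * x2 + rnorm(n)
  summary(lm(y ~ x1 + x2))$coefficients["x1", "Std. Error"]
}

set.seed(42)
rhos <- c(0, 0.5, 0.9, 0.99)
ses  <- sapply(rhos, se_b1)
data.frame(rho = rhos, se_b1 = ses)
```

In the app, the slider value would play the role of `rho`, and the reported standard error could be printed next to the 3d plot.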

create a manual for each app/tutorial

we need a doc associated to each app that tells exactly what this is showing
tutorials should be more self-explanatory, but if there is anything noteworthy, it should go into that document.
almost like a step-wise guide through the app
also a recipe of how to use it with students, eg "first explain this, then that, then tell them to click here, then there"
should be put in the same folder as the app
format? probably .Rmd so we can include latex easily (and maybe even distribute to students)

require R 3.5

http://robertgrantstats.co.uk/drawmydata.html

SSR cone does not show "your guess"

i cannot see the values in the "your guess" line on the left.

chapter 3 correlation

make an app that plays on correlation.

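library(mvtnorm)
set.seed(10)
cor = 0.9  # correlation between x and y; this is the value to move
sig = matrix(c(1,cor,cor,1), nrow = 2)  # 2x2 covariance matrix
ndat = data.frame(rmvnorm(n=300,sigma = sig))
x = ndat$X1
y = ndat$X2
par(pty="s")  # square plotting region
plot(y ~ x, xlab="x",ylab="y")  # y on the vertical axis, matching the labels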

and move cor

publish slides together with book?

could make _build.sh copy slides into _book directory
could prepare a last chapter in book with links to those slides

change `draw` button to jump k draws ahead

it's cumbersome to get the histograms in the standard_errors_simple app
one has to click at least 100 times to get an idea of the distribution of estimates
can we have a second button "draw 10 samples" or "draw 20 samples" or so?

random number generation app

we need to introduce the idea of a data generating process
we are working towards the concept of model
for that we need to understand how to draw observations from a process
the idea that every time you draw, you get a different sample of points
ultimately this will inform the discussion about standard errors in regression models
we've already done "as you increase N, the sample approximates better the pdf", so need something else

app ideas

i want them to repeatedly draw from a process and always see different realizations.
let's go with the regression y = a + bx + u
assume u ~ N(0,sigma), fix a and b
have a button "redraw"
scatter plot of y ~ x
N < 20
could do something with a joint normal distribution instead of regression
i like the idea of just observing the changing scatter plot rather than looking at a histogram.
other ideas welcome - let's discuss here!
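
A rough sketch of what the "redraw" button could do, assuming sigma is the standard deviation of u and x is drawn uniformly on [0, 10]. The function name `draw_sample` and all default values are illustrative, not part of the package.

```r
# One draw from the data generating process y = a + b*x + u, u ~ N(0, sigma).
# draw_sample and its defaults are illustrative assumptions.
draw_sample <- function(n = 15, a = 1, b = 2, sigma = 1) {
  x <- runif(n, min = 0, max = 10)
  u <- rnorm(n, mean = 0, sd = sigma)
  data.frame(x = x, y = a + b * x + u)
}

# Four "redraws" side by side: same process, different points each time
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  d <- draw_sample()
  plot(y ~ x, data = d, xlim = c(0, 10), ylim = c(-3, 25),
       main = paste("draw", i))
  abline(a = 1, b = 2, col = "blue")  # the fixed population line
}
par(op)
```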

aboutApp returns wrong error message

this:

> aboutApp("regression")
Error: Please run `launchApp()` with a valid app as an argument.
Valid apps are: 'anscombe', 'confidence_intervals', 'corr_continuous', 'datasaurus', 'demeaned_reg', 'multicollinearity', 'reg_constrained', 'reg_dummy', 'reg_dummy_example', 'reg_full', 'reg_multivariate', 'reg_simple', 'reg_standardized', 'rescale', 'sampling', 'SSR_cone', 'standard_errors_changeN', 'standard_errors_simple'

needs to be changed to

# locate all the shiny apps that ship an about.Rmd
valid <- character(0)
v <- list.files(system.file("shinys", package = "ScPoEconometrics"), full.names = TRUE)
for (i in v) {
  # keep app i only if it has an about.Rmd
  if (file.exists(file.path(i, "about.Rmd"))) {
    valid <- c(valid, basename(i))
  }
}

other interesting apps

https://www.rstudio.com/products/shiny/shiny-user-showcase/

chapter 4: regression specifics

one app that illustrates that if you choose slope b=0, you estimate the mean of y
same app, what happens if you set intercept a=0? what if you demean the data?
Make them run the regression on standardized regressor (divided by sd). Compare to correlation between x and y. standardizing vars gives you correlation from b
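
A small check of two of these claims on simulated data (all values are made up for illustration). Note that the slope equals the correlation when both x and y are standardized; standardizing only the regressor gives the correlation times sd(y).

```r
set.seed(3)
x <- rnorm(50, mean = 5, sd = 2)
y <- 1 + 0.8 * x + rnorm(50)

# Slope forced to 0: the intercept-only fit returns the mean of y
a_only <- unname(coef(lm(y ~ 1)))
c(intercept_only = a_only, mean_y = mean(y))

# Both variables standardized: the slope equals cor(x, y)
zx <- (x - mean(x)) / sd(x)
zy <- (y - mean(y)) / sd(y)
b_std <- unname(coef(lm(zy ~ zx))["zx"])
c(standardized_slope = b_std, correlation = cor(x, y))

# Only the regressor standardized: slope is cor(x, y) * sd(y)
b_x_only <- unname(coef(lm(y ~ zx))["zx"])
```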

additional version to regapp

Make them run the regression on standardized regressor (divided by sd).
Compare to correlation.
standardizing vars gives you correlation from β

IV

rename SSR_cone axis

in SSR_cone app the x-y axis should be "intercept" and "slope" not a_ and b_

set code prompt as global var

file with prompt
don't want to have to change all Rmd each time we change the prompt
it's tricky because R CMD CHECK removes unnecessary files.

add SSR print to simple_reg

add SSR, sum of errors (not squared), sum of absolute errors to simple_reg. just print below.
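
For reference, the three numbers could be computed like this. The helper name `fit_stats` and the toy data are hypothetical; `a` and `b` stand for the user's current intercept and slope guess.

```r
# SSR, sum of errors and sum of absolute errors for a guessed line a + b*x.
# fit_stats is a hypothetical helper, not part of the package.
fit_stats <- function(x, y, a, b) {
  errors <- y - (a + b * x)
  c(SSR            = sum(errors^2),
    sum_errors     = sum(errors),
    sum_abs_errors = sum(abs(errors)))
}

x <- c(1, 2, 3, 4)
y <- c(2, 4, 5, 8)
fit_stats(x, y, a = 0, b = 2)   # SSR = 1, sum_errors = -1, sum_abs_errors = 1
```

At the OLS solution with an intercept, the sum of errors is zero, which is a useful contrast to show next to the SSR.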

change regapp squared errors

currently you do

    rect(xleft = x, ybottom = y,
         xright = x + abs(errors), ytop = y + errors, density = -1,
         col = rgb(red = 0, green = 1, blue = 0, alpha = 0.1), border = NA)

I find it nicer like this:

    rect(xleft = x, ybottom = y,
         xright = x + abs(errors), ytop = y - errors, density = -1,
         col = rgb(red = 0, green = 1, blue = 0, alpha = 0.1), border = NA)

(just swap the sign in front of errors for y.)

this will have rectangles point downwards for points above the line, and vice versa.

SSR_cone app doesn't display values of a and b

see screenshot on current readme of the repo.

anova

some of this is relevant for both book and apps.

https://en.wikipedia.org/wiki/Analysis_of_variance
For a given sample, show the histogram of y_i and the histogram of \hat{y}_i. Make them reflect on what it means for the histograms that the regression line is inside the scatterplot.
Calculate the variances of y_i and of \hat{y}_i. What do you notice about the difference of these variances and the mean square error (SSR/N)?
Define ANOVA
Apply ANOVA to wages and education. Estimate between and within education group inequality.
define R2
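
A sketch of the variance calculation on simulated data (values are illustrative, not the wage data). It uses variances that divide by N, so that the difference matches SSR/N exactly; the helper `pop_var` is a name introduced here.

```r
set.seed(7)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

fit   <- lm(y ~ x)
y_hat <- fitted(fit)

# Variance with divisor N (var() divides by N - 1)
pop_var <- function(v) mean((v - mean(v))^2)

ssr_over_n <- sum(resid(fit)^2) / n
c(var_y = pop_var(y), var_y_hat = pop_var(y_hat),
  difference = pop_var(y) - pop_var(y_hat), ssr_over_n = ssr_over_n)

# R2 as the share of the variance of y explained by the fitted values
r2 <- pop_var(y_hat) / pop_var(y)
```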

redo chapter 6 with plotly

redo the 3d graph with plotly in book

Add windows testing

No book build but run tests

dist app for chapter 3

inspiration from https://gallery.shinyapps.io/dist_calc/
dropdown with
1. normal
2. lognormal
3. pareto (DistExtra?)
4. beta
5. poisson
6. uniform
7. logistic
8. maybe binomial?
maybe just copy to start with and then try and integrate in the tutorials Rmd.
maybe don't use kernel but theoretical pdf?

tutorial that plays with rescaling of x,y or both

it's always confusing for students to think about what happens when one transforms one of the vars in a regression
i am not thinking about log transforms for now
just say that we multiply both y and x with a constant c. what happens to a and b?
what if only x?
what if only y?
if you have simple way to show the log-log model, make a proposal, otherwise let's disregard this for now.
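
The three rescaling cases can be checked numerically, for instance as below. The data are simulated, and `k` and `fit_coefs` are names introduced here for illustration (`k` plays the role of the constant c, renamed because `c()` is a base R function).

```r
set.seed(11)
x <- runif(40, min = 0, max = 10)
y <- 3 + 1.5 * x + rnorm(40)
k <- 10   # the rescaling constant

# Returns c(a, b) from the simple regression of y on x
fit_coefs <- function(y, x) unname(coef(lm(y ~ x)))

base   <- fit_coefs(y, x)
both   <- fit_coefs(k * y, k * x)   # a is multiplied by k, b is unchanged
only_x <- fit_coefs(y, k * x)       # a is unchanged, b is divided by k
only_y <- fit_coefs(k * y, x)       # a and b are both multiplied by k

rbind(base, both, only_x, only_y)
```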

standard errors in theory

is incomplete! chapter 4.

regression to the mean

potential topic

bug in tutorials?

On R version 3.4.4 (2018-03-15)
I've loaded the libraries ScPoEconometrics 0.0.1 and learnr 0.9.2.1.

When running:

run_tutorial("chapter3",package="ScPoEconometrics")

I obtain:

Listening on http://127.0.0.1:7177
Warning: Error in value[[3L]]: Couldn't normalize path in addResourcePath, with arguments: prefix = 'font-awesome-4.5.0'; directoryPath = '' [No stack trace available]

multiple regression: y = a + b1 x1 + b2 x2 + u

let's replicate a few apps from the univariate case to this case
let's have the "find the best surface" app
this should be done with plotly

Chapter 1 - Getting Started with R

TODO:

add more on control flow, and for-loops in particular
Vectorization: leave it mostly as it is but remove the crazy scary word "vectorization". This can be explained much more simply e.g. "most operations made on vectors are made on each elements in the vector individually"
is dplyr really needed in an intro to R session? especially the pipe would be very confusing I think
in the same vein, we talk about dataframes but make most of our examples with tibbles... Granted, they're the same, but maybe it's a little early to introduce them here?
1.7.7.1 Task 5 asks students to use the table function which is not introduced anywhere else in the book... Maybe let's keep it to teach them the virtue of figuring things out by yourself?
Maybe let's find another way to illustrate default function arguments than with biased or unbiased variance (which they are not supposed to know and so will not necessarily "speak" to them...)

RDD

do a chapter on RDD based on this: http://microeconomicinsights.org/opportunity-access-legal-work-status-affects-immigrant-crime-rates/

Multivariate 2: dummy variables

continuing from the example of #18 , now we introduce dummy vars
a dummy is an X \in {0,1}, as in #18
the difference is that if added to another (continuous for a start) regressor, it shifts the intercept.
do wage regression w = a + b * gender + c * height, gender \in {0,1}
I guess a tickbox with set gender = 1 and a shifting regression line is what we want.
any other ideas you have here, just put forward below.
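
As a starting point, a sketch of the shifting regression line on simulated data. The coefficients, sample size and variable scales are invented for illustration and are not estimates from any real wage dataset.

```r
# Simulated wage data: w = a + b * gender + c * height + u (all values made up)
set.seed(5)
n      <- 200
gender <- rbinom(n, size = 1, prob = 0.5)
height <- rnorm(n, mean = 170, sd = 10)
wage   <- 5 + 2 * gender + 0.1 * height + rnorm(n)

fit <- lm(wage ~ gender + height)
co  <- unname(coef(fit))   # c(intercept, gender, height)

# Two parallel lines: the dummy shifts the intercept, the slope is shared
plot(height, wage, col = ifelse(gender == 1, "red", "blue"))
abline(a = co[1],         b = co[3], col = "blue")  # gender = 0
abline(a = co[1] + co[2], b = co[3], col = "red")   # gender = 1
```

A tickbox for gender = 1 would then simply switch between the two `abline` calls.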

realworld example with `lm`

let's have a good real world example with lm to finish off the simple reg_app section
let's get a dataset from package Ecdat
not too many explanatory variables?
needs to come chronologically after #24

Extend test suite to run tutorials

Recent issue #54 about bug in tutorials needs to be covered by tests.

chapter 3: clarify covariance for discrete case as well

we need to test the apps

we need to make sure that apps always run
we need this: https://rstudio.github.io/shinytest/articles/shinytest.html
I would like to keep everything within this one R package
so i think we will have to move away from the apps inside the learnr tutorials and deploy each app as a standalone app.
we can probably keep some tutorials
but the majority of apps will be called with this strategy: https://deanattali.com/2015/04/21/r-package-shiny-app/

standalone shiny apps bunched towards upper bound of browser window

i have noticed that the newly organized apps are all tucked towards the upper bound of the browser window, unlike when we had them in the .Rmd files. must be some css setting. can you investigate how to have a border around the app?

another standardize app

we need another standardization app.
need x - mean(x), same for y.
to show that you get the same slope coefficient b as in the original case, but a zero intercept in the demeaned case.
app should be called demeaned-reg
it's mostly copy and paste from reg_standardize
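
The property the app should show can be verified in a few lines. The data are simulated and the object names are illustrative.

```r
set.seed(9)
x <- rnorm(30, mean = 4, sd = 2)
y <- 2 + 0.7 * x + rnorm(30)

orig <- unname(coef(lm(y ~ x)))   # c(a, b) on the raw data

# Demean both variables
xd <- x - mean(x)
yd <- y - mean(y)
dem <- unname(coef(lm(yd ~ xd)))  # c(a, b) on the demeaned data

# Same slope, intercept numerically zero after demeaning
rbind(original = orig, demeaned = dem)
```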

Rescaling Tutorial does not work

when i click on "run code" in the first code box i get object x not found

reorganize into apps

i set this up as a proper package now
please file a PR that reorganizes all standalone apps into single file apps
an example is shown in /inst/shinys/reg1 for the first regapp
create /inst/shinys/reg2 for the second and so on
if you are unsure about whether a certain app should be part of a tutorial eventually or be a standalone app, leave as it is. (or ask!)
thanks!