datacamp / authoring
Home Page: https://authoring.datacamp.com/
License: Other
There doesn't appear to be a mention of avoiding back-to-back multiple choice exercises either. Given the discussion Greg led on good multiple choice exercises, I think we should revisit both of these pseudo-guidelines anyway. I'm going to proceed with instructors on the assumption that these are not strict guidelines, since they aren't included here.
In http://authoring.datacamp.com/courses/design/brainstorming-jargon.html it's maybe worth mentioning that "longitudinal data" is also called "repeated measures data" and "panel data".
From Dave Robinson's Stochastic Processes in R: the sample, cumsum, replicate, and accumulate functions. Link to student profiles.
If these don't match exactly, feel free to modify as needed in discussion:
This could be similar to the last exercise of the course or a last exercise in a chapter.
Last exercise in Chapter 2: Write a function that simulates 100 steps from a Markov chain of words, given a transition_matrix with row and column names.
Skills:
- sample with prob to find the next state in a transition matrix
- accumulate to save up many steps in a chain
Solution:
library(purrr)
# Take one step: sample the next state using the current state's row of
# transition probabilities, then return its name. The dots absorb the step
# index that accumulate() passes as a second argument.
simulate_step <- function(state, ...) {
  state <- sample(nrow(transition_matrix), 1, prob = transition_matrix[state, ])
  colnames(transition_matrix)[state]
}
accumulate(1:100, simulate_step, .init = "the")
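This solution assumes a transition_matrix already exists in the workspace. A toy matrix to test it against might look like this (a hypothetical example of mine, not taken from the course spec):
# Hypothetical 3-word transition matrix (each row sums to 1); illustration only
words <- c("the", "cat", "sat")
transition_matrix <- matrix(
  c(0.1, 0.6, 0.3,
    0.2, 0.2, 0.6,
    0.5, 0.3, 0.2),
  nrow = 3, byrow = TRUE,
  dimnames = list(words, words)
)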
Last exercise in the course: Write a function that, given a number of points, the width of a region, and the height of a region, generates that many points in a Poisson point process.
Use it to plot 50 points in the space of 10 x 10.
Skills required:
- rpois
- runif, twice
- plot to compare x and y (we'd give scaffolding).
Solution:
simulate_points <- function(density, x_width, y_width) {
  # The number of points is Poisson, with mean = density * area
  number <- rpois(1, density * x_width * y_width)
  # Given the count, point locations are uniform over the rectangle
  x <- runif(number, 0, x_width)
  y <- runif(number, 0, y_width)
  plot(x, y)
}
simulate_points(1, 10, 10)
Middle of chapter 2: Generate three steps in a Markov chain given a transition matrix and starting state.
Solution:
state2 <- sample(nrow(transition_matrix), 1, prob = transition_matrix[state, ])
state3 <- sample(nrow(transition_matrix), 1, prob = transition_matrix[state2, ])
state4 <- sample(nrow(transition_matrix), 1, prob = transition_matrix[state3, ])
Middle of chapter 4: Randomly generate 100 events in a one-dimensional Poisson process with a rate of 3 per second by simulating exponential waiting times. Find the distribution of how many events happen in the first 2 seconds.
Solution:
# Event times are the cumulative sum of exponential waiting times
cumsum(rexp(100, 3))
# Distribution of the number of events in the first 2 seconds
replicate(1000, sum(cumsum(rexp(100, 3)) <= 2))
Lesson 1.1 - Random walks: Imagine you were gambling with a friend, and betting on a coin. Each time you either lose one dollar or gain one dollar. This is a random walk: at any moment, it could go up or down one step. Simulate the steps with sample(), find the cumulative position with cumsum(), and visualize it with plot(), which looks a bit like a stock price graph (see the sketch after this outline).
Lesson 1.2 - Biased random walk: So far the random walk has been symmetrical, with an equal probability of gaining or losing.
Lesson 1.3 - Properties of a random walk: Where will a random walk end up after 10 steps? 100 steps? Use replicate() for simulating it.
Lesson 2.1 - Transition matrices
Lesson 2.2 - One step in a Markov chain: sample(2, 1, prob = transition_matrix[state, ]) lets you randomly step.
Lesson 2.3 - Accumulating steps in a chain: Use the purrr accumulate function to add up states into a chain.
Lesson 2.4 - Example: Markov chain of words
Later lessons cover using rexp to simulate exponential waiting times, taking exp_greater_5 <- exp_sample[exp_sample >= 5] and then examining the properties and distribution of exp_greater_5, plus sum(rexp(3, 5)), cumsum, and runif.
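To make the Lesson 1.1 skills concrete, here is a minimal sketch (my own illustration, not an exercise solution from the spec):
# Simulate a simple random walk: each step is -1 or +1 with equal probability
steps <- sample(c(-1, 1), 100, replace = TRUE)
position <- cumsum(steps)   # cumulative winnings over time
plot(position, type = "l")  # looks a bit like a stock price graph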
Course Description
Whether it's prices in the stock market, the number of visitors to a website, or the population of rabbits in a forest, many phenomena that we'd like to model with statistics involve numbers tracked over time. In this course, you'll be introduced to the field of stochastic processes, an area of probability studying systems that change over time. You'll learn about common statistical models such as random walks, Poisson processes, and Markov chains, and be introduced to the exponential and gamma distributions. These provide the fundamentals for many statistical methods common in finance, biology, and many other fields.
Learning Objectives
Prerequisites
For courses in the middle of a track, some instructors find it difficult to know where to begin their course.
I propose updating step three of the course spec so that it has 3 parts.
3.1: Write the first exercise, so you know where the course begins
3.2: Write the last exercise, so you know where the course ends (as before)
3.3: Write some exercises that detail how you get from the start to the end (as before)
Even for courses which aren't in the middle of the track, I think that it can be useful to know the starting point. (It should help clarify prerequisites, and if instructors are throwing in material that is too tricky, we'll discover the problem early.)
In the directory courses/design, this requires:
- template.md to describe this sub-step
- exercises-first.md with examples, etc.
- updating exercises-capstone.md and exercises-examples.md to renumber them
@richierocks commented on Wed Sep 20 2017
I wonder if we need different specs for different course types. We don't have an official course taxonomy, but I think there are three different types:
Code courses: These start from a set of packages and teach the syntax. e.g., Data Manipulation with dplyr, Data Visualization with ggplot2
Technique courses: These start from a technique and teach how to use it, e.g., Introduction to Time Series Analysis or Intro to Statistics with R: Multiple Regression
Problem courses: These start with a business problem (or scientific problem) and teach what the problem is and how to solve it. The forthcoming business analytics and health analytics courses fall into this category. Arguably, so do the case study courses like Exploratory Data Analysis in R: Case Study.
It might be useful to have a different course spec for each type of course, since the questions the instructor asks will be slightly different. (At least, the order they ask the questions may change.)
How should the specs change by course type?
@ncarchedi commented on Wed Sep 27 2017
It's a good point. Related to my comment here. Though I'd be more inclined to have a single generic/flexible format for course specs that can be adapted to each use case vs. having three different formats.
@gvwilson commented on Mon Oct 02 2017
Address after we have feedback on the first round.
There's some stuff on Shutterstock in the content wiki:
https://github.com/datacamp/content-wiki/blob/gitbook/docs/archive/using-images-in-courses.md
Should also mention things like
Need to distinguish between images in videos and images in coding exercises.
I'm not sure if this was resolved in another PR. There is an extra # in the title ("Interactive Exercise Title") at
http://authoring.datacamp.com/courses/exercises/normal-exercise.html
Verify that slides authoring is up to date.
State that we have access to Thinknum's datasets in the "Where can I find datasets?" section of
http://authoring.datacamp.com/courses/design/brainstorming-datasets.html
Maybe link to their list of datasets:
Not quite sure what the workflow is. Probably just get the CL to email Justin Zhen, but we can document this once we have a plan.
From @ismayc on Slack:
Looking over outline feedback (in the old course specs system), I'm wondering if it makes sense to have instructors clearly differentiate the data sources they plan to use in the slides/videos and those they plan to use in the exercises in the new course specs. It seems that when instructors have a good sense for this the course development goes a lot faster/easier.
Would be helpful for instructors to see that they have a maximum of 600 words per slide deck (with preference for closer to 500).
When the gitbook is running, the links to tab-vs-bullet-exercises.md on the tab exercise and bullet exercise pages don't work. The extension needs to be .html for gitbook to find the files (and there are typos in the file name on both pages).
Also on the same pages the links to R examples of tab/bullet exercises actually go to the SQL examples.
Happy to send a PR if you give me push access to the repo.
@machow commented on Wed Dec 13 2017
When two hash marks are used to indicate a header, teach uses it as an exercise title. When ---- is used to indicate a level 2 header, teach thinks it is demarcating a new exercise.
What is the relationship between the teach exercise grammar and markdown supposed to be?
Replace two hash marks with --- underneath.
@machow commented on Wed Dec 13 2017
Wait, I think I understand. It looks like ## and ---* define the opening and closing of an exercise block now. Might be useful to clarify in the authoring book, since it says that ## is a header, but does not mention how to separate exercises.
http://authoring.datacamp.com/courses/exercises/
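For concreteness, the separation described in the reply below looked roughly like this in the old syntax (reconstructed from memory, so treat the exact header fields as approximate):
--- type:NormalExercise lang:r xp:100 key:abc12345
## First exercise title
...exercise body...

--- type:NormalExercise lang:r xp:100 key:def67890
## Second exercise title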
@gor181 commented on Thu Dec 14 2017
Exactly: each exercise starts with --- type: (old syntax) or --- followed by a title (new syntax).
Moving this task there, thanks.
Feel free to close if this is not an issue :). A reader who copies the R TabExercise example into Teach will see the following. They would need to install dplyr in their requirements.sh to get the exercise to run. It might be helpful to either
option (2) is probably part of a bigger discussion with @rv2e (maybe as a content-tech-request?)
I got a good quote from instructor Melinda Higgins.
When designing introductory courses, if you are bored to tears with the material, you've probably pitched it at the right level. If you find it interesting, it's too hard.
This feels like it belongs in the documentation somewhere. (Course design step 3?)
@rasmusab commented on Thu Sep 28 2017
What I need to know as a CL is who the Instructor aims to make the course for, so that I, the instructor, and then the CD are all aligned on this. I feel that the learner profiles do not help me with this.
If I learn that
This course will give Jasmine a basic understanding of the Unix shell so that she can help her students solve the problems they encounter using the university's systems in their statistics courses.
I still know nothing about what the course instructor assumes about the Student, and even if I cross-reference Jasmine's bio, it doesn't tell me much about what tools/techniques the course instructor assumes she has experience with. Also, all the extra info, like that she did an "MBA at Georgia State" or that she "is partially deaf", feels distracting, as I don't know how to use it.
What I would like to see is a list, written in the course instructor's own words, of what background the course instructor assumes for the course. Parts of it could look like this:
This is useful because now I can point to this list and say "1 and 2 seem reasonable, but I don't think we should assume they know about POSIX standards."
@rasmusab commented on Mon Oct 02 2017
What I would like to see is something more akin to what I wrote in the course spec for my "foundations of Bayesian statistics in R" course.
@richierocks commented on Mon Oct 02 2017
Related: I think that 3 learner profiles might be too much for many courses. It's just difficult to write for multiple audiences.
Thinking about some upcoming courses:
I suppose that top-of-the-funnel intro courses will have more target audiences, but I think mostly we should be restricting the instructors to thinking about 1 or 2 student archetypes.
@rasmusab commented on Mon Oct 02 2017
But as they are currently written, no single learner profile (or even a tuple of them) matches my course. The "space" of our Students is so high dimensional (python/R, mac/win, probability/no-clue, regression/no-clue, etc.) that I don't see how a couple of student instances could possibly cover that space.
Question: Are the current profiles based on data?
@rasmusab commented on Mon Oct 02 2017
Business analytics courses all need two target students: someone with business experience who has been mostly using Excel and wants to know how to do things in R, and someone with some R experience who wants to understand what business problems are being solved.
So this seems like a pretty specific student profile.
Question: If none of the currently available student profiles is suitable for a course, can we make up new ones that match the type of audience we have in mind for that course?
David Mertz (Anaconda) reported the following:
The documentation at http://authoring.datacamp.com/courses/exercises/ does not match the Teach Editor well. I don't have much insight into what the underlying infrastructure actually does, or which is "less wrong".
Throughout the two, the Teach Editor inserts headers similar to *** =pre_exercise_code while the document describes headers like @pre_exercise_code. Nick has stated that these two are equivalent and that the headers have the same spelling in actual words (just not the surrounding DSL markup).
Within the Teach Editor, I see the following exercise types. Highlighted in bold are those that exist in one place but not the other.
The documentation describes:
There are a variety of places where using the template and/or the documentation does not produce an exercise that behaves as intended, but these are contained in GH issues in the courses-anaconda-ecosystem-1 repo, so I will not duplicate them here.
@vincentvankrunkelsven commented on Fri Jun 16 2017
LaTeX formulas in markdown should be escaped. E.g., _ must be escaped:
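A minimal illustration of the kind of escaping meant here (my own example; exact behavior depends on the Markdown renderer):
Unescaped, Markdown may treat the underscores as emphasis markers: $y_i = \beta_0 + \beta_1 x_i$
Escaped, the LaTeX survives the Markdown pass: $y\_i = \beta\_0 + \beta\_1 x\_i$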
@LoreDirick commented on Thu Sep 28 2017
What seems a bit weird to me in the listed outline is that it is unclear what is covered in one video versus another. What I like about the current outline structure is exactly that: I tell my instructors that "main" bullets are a video, and "sub" bullets add extra details to the video and to the exercises that follow, if the instructor already has those details. Also, as a CL, I can't tell from this outline how many videos will be in each chapter, or whether the instructor is trying to put too much in one video.
@yashasroy commented on Fri Sep 29 2017
I agree. I think this stems from the fact that these are example course specs for a course that does not contain videos. I think the idea going forward is that it will be down to the CD and the instructor to figure out how many videos will be in each chapter (and what to cover in them), with the formative assessments being used for guidance.
This is related to https://github.com/datacamp/example-course-specs/issues/9 and https://github.com/datacamp/example-course-specs/issues/12.
@gvwilson commented on Mon Oct 02 2017
Please close if the revised wording makes this clearer.
Update the mini-manuals, once we have formalized our policy.
@LoreDirick commented on Thu Sep 28 2017
For some courses it might make sense to create some cornerstone exercises. But for other courses (e.g., case study courses, and also my credit risk modeling course) the entire workflow is more sequential, and exercises will change while you work through the course and data set. In my credit risk course, this step would have been very hard for me at this stage. I could have created exercises here, but I would have had to redo them, which is a time drain.
@gvwilson commented on Mon Oct 02 2017
Good point - revisit after feedback from early adopters?
@gvwilson commented on Tue Oct 31 2017
Would it make sense to combine steps 3 and 4 (summative and formative assessments) and just have instructors write a representative subset of exercises that we then put in order? cc @richierocks
Currently the documentation is only relevant for Python MCQs:
test_mc(n, [msg1, msg2])
Add examples for R and Shell:
R:
test_mc(n, c(msg1, msg2))
Shell:
Ex() >> test_mc(n, [msg1, msg2])
First reported here
This seems pretty inconsistent with how we have instructors actually build exercises using lots of ___ throughout.
In this exercise, you will inspect the contents of a text file (data.txt) contained in the data/ directory and call the appropriate base R function to read it into a data frame.
type: NormalExercise
xp: 100
key: asdfa83435
@sample_code
@@data/data.txt
x y z
1 5 9
2 6 10
3 7 11
4 8 12
@@script.R
@solution
@@data/data.txt
x y z
1 5 9
2 6 10
3 7 11
4 8 12
@@script.R
read.table("data/data.txt", header = TRUE)
Add a link to https://help.github.com/articles/creating-an-issue (and maybe others on tagging, commenting, and closing issues?) to http://authoring.datacamp.com/courses/design/technical-help-resources.html
The screencast which existed on https://www.datacamp.com/teach/documentation#tab_creating_slides was very helpful.
Every project should include:
in its root directory (with appropriate edits to author names and project URLs).
The <title> in the pages generated by GitBook says "Welcome GitBook"; this should say something about DataCamp instead.
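If this is the legacy GitBook toolchain, the title can usually be set in book.json (a sketch; field support depends on the GitBook version in use):
{
  "title": "DataCamp Course Authoring"
}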
I've had multiple instructors ask how to include code in their slides. It would also be helpful to show how {{1}} can be added at the end of code blocks.
Are there any plans on adding the production of community tutorials and/or blogs to this webpage? Or does this fall outside the scope?
It sucks to keep up to date, and is already hugely outdated. Literally from the first version we deployed EOY 2015.
Move from content wiki to here, so instructors can see it
Currently in the internal content wiki, but we should make this instructor-facing.
https://github.com/datacamp/content-wiki/blob/gitbook/docs/archive/course-naming-rules.md
Marketing have advice on this too, so they ought to review anything we write here.
Try to limit to 1 page of documentation per step.
Possible format:
@rasmusab commented on Thu Sep 28 2017
While I really dig that the course instructor put some example exercises in the course outline, what happened to me when I started writing the formative assessments was that I started fully working on the course.
Like, 2-3 exercises per chapter, that's a lot of exercises to create, and to be able to create 3 exercises for each chapter you already need to have a pretty good idea of what the rest of the course is going to be, and to know what exercises and videos go before and after. Also, in many cases a single exercise (background info, description, etc.) can be much more extensive than the example exercises given here.
I'm not saying it's bad that the course instructors create exercises, I'm just saying that we're asking them to create a substantial part of their course already as part of the course outline. One way of mitigating this would be to ask them to write down 2-3 exercise stubs per chapter, where a stub is more of a high level description of the task.
@yashasroy commented on Fri Sep 29 2017
It's worth considering whether asking instructors to write 2-3 formative assessments per chapter is more work than the current system, where they have to create an exercise-by-exercise index (including videos) before they can actually start writing exercises. Often the course structure changes after the index is created anyway, making the index useless after it is approved. I think these course specs would be useful even after course launch, as a way for someone trying to maintain the course 6 months post-launch to quickly understand the original vision (in a way the current system's index doesn't quite capture, though you could argue the current outline does).
I also don't think it is a substantial part of the course. There is still a lot of work remaining in identifying how videos will fit into the course narrative and how to order the exercises, not to mention writing the remaining ~6 exercises per chapter (assuming 12-exercise chapters) and the slides/scripts.
Also, in many cases a single exercise (background info, description, etc.) can be much more extensive than the example exercises given here.
Maybe assignment text / background info should not be part of these formative assessments. Instead, the focus should be on the code and the instructions.
@rasmusab commented on Mon Oct 02 2017
For example, here is one of the exercises I created as part of the course specs. This clearly went overboard, but it could go overboard for other course instructors as well. The point is that making 2-3 exercises like this per chapter in practice means you've left the speccing stage and actually started working on the course:
Use rbeta to fit a beta-binomial model and interpret the result.
A Bayesian model which is quick and easy to fit in R is the binomial distribution. Recall that the assumptions of the binomial distribution are that the data is a count of successes (x) out of a number of trials (n), and that there is an underlying proportion of successes (p). To turn the binomial distribution into a fully Bayesian model, all we need to do is specify a prior distribution over p, and there are many different ways you can do this. However, if we limit ourselves to using a prior that is a Beta distribution, then it turns out there is a simple recipe that allows us to produce samples from the posterior distribution of p:
If the prior over p is Beta distributed with shape parameters prior_shape1 and prior_shape2, and the data we have are x successes and n - x failures, then the posterior distribution of p is also a Beta distribution, with shape parameters posterior_shape1 = prior_shape1 + x and posterior_shape2 = prior_shape2 + n - x. To produce samples from the resulting posterior you can then use the rbeta function, which takes the sample size as the first argument and the shape parameters as the second and third arguments.
# The prior
prior_shape1 <- 1
prior_shape2 <- 1
prior_p <- rbeta(100000, prior_shape1, prior_shape2)
The code to the right defines a Beta(1, 1) distribution and samples from this distribution (prior_p). Start by visualizing this prior with a histogram, plotting prior_p using the hist function.
hist(prior_p)
Right! A Beta(1, 1) is the same as a uniform distribution between 0 and 1, a reasonable prior when you have little information regarding the underlying proportion of success.
Now we have some data we want to update this prior with. Say we run a website and we just put up a banner advertising our latest product. Out of the first 100 webpage visitors, 32 click on the banner. Assuming the Beta(1, 1) prior, what is the likely underlying proportion of visitors clicking on the banner? Produce a sample from this posterior distribution and visualize it using the hist function.
clicks <- 32
visitors <- 100
posterior_p <- rbeta(10000, prior_shape1 + clicks, prior_shape2 + visitors - clicks)
hist(posterior_p)
If you take a quick look at the distribution you just plotted, what is the likely underlying proportion of visitors clicking on the banner?
Using the sample from the posterior distribution, calculate the probability that the proportion of visitors clicking is more than 25%.
sum(posterior_p > 0.25) / length(posterior_p)
@gvwilson commented on Mon Oct 02 2017
Agreed that this is getting the instructor to work on the course - as discussed Friday, it isn't extra work, and it gives us early feedback on feasibility.
@rasmusab commented on Mon Oct 02 2017
Right, what I mean is: we don't want other course instructors to make the same mistake I did, which was to start writing overly extensive exercises before the course spec is finalised. My fear is that the course instructor will put a lot of work into exercise specifics for many exercises that might then need to be changed, because the course instructor was still in the planning stage of the course.
That's why I think it could be good to have some wording that directs course instructors more towards writing shorter "exercise stubs" like you have in your example.
Lynne Williams: this was hard to answer early, would probably have been easier later on.
Course design process description and template in #6 should refer to profiles created by Marketing - switch to them once they're published.
@richierocks commented on Fri Nov 10 2017
In order for the CLs to write the requirements.sh/requirements.R file before handing off to the CD, it would be useful if the final step in the README was to produce a definitive list of the packages or other software required for the course.
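As a sketch of what that final step could yield for an R course (hypothetical package names, purely for illustration), the definitive list might translate directly into requirements.R:
# Hypothetical requirements.R distilled from the course spec's package list
install.packages(c("dplyr", "ggplot2", "purrr"))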
We should add a simplified human-readable explanation of the IP portion of our contracts to http://authoring.datacamp.com, modeled on https://creativecommons.org/licenses/by-sa/4.0/.
@ismayc commented on Mon Sep 25 2017
@ncarchedi commented on Wed Sep 27 2017
I don't expect this doc to take much longer on average than the overview/outline/index combo takes instructors now. Seems like we should be getting people through the course spec'ing phase within 4 weeks of signing the contract. Some will do it much faster, especially repeat instructors.
@gvwilson commented on Mon Oct 02 2017
Revisit after we have feedback from early adopters.
We should have information for instructors, especially open-course instructors, on writing requirements files.
For context, here is a note from one open-course instructor:
"Hi! I'm trying to make a course with Nipype and I'm curious what is the best way to make user defined environment variables that python will recognize. Nipype is essentially a wrapper for neuroimaging tools that are installed/accessible via the commandline. With a Dockerfile I can use the ENV
keyword, but since I'm restricted to having my changes in requirements.sh
, I'm wondering what's the best method to get the docker container to recognize environment variables."
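One possible workaround, sketched under the assumption that requirements.sh runs at image-build time and that the session shell sources /etc/profile.d/ at start-up (neither is confirmed in this thread):
# Inside requirements.sh (hypothetical sketch): an exported variable would not
# survive past the build step, so persist it where later shells can source it.
echo 'export NIPYPE_DATA_DIR=/home/repl/data' >> /etc/profile.d/course_env.sh
If the container doesn't source that file, setting the variable from Python via os.environ in the pre-exercise code may be a more reliable fallback.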
@rasmusab commented on Thu Sep 28 2017
Hey, these are my thoughts, so when I write should/must/has to, it means that in my humble opinion it should/must/has to.
The course specs could serve (at least) two purposes:
These are two different purposes, and while I think we should help with (2) in the sense that we give guidelines and suggestions, I don't think it should be the main purpose of the course specs. We are working with experienced teachers and instructors, and we should let them do their thing when it comes to how they plan/sketch/experiment/think about their course.
The course specs should be focussed on (1). Just because somebody passed an audition doesn't mean we automatically allow them to do any course. The course specs are for us to see if the course is something we actually want, that it's at the right level for our students, that it doesn't overlap too much/too little with other courses, etc.
For example, if an instructor likes to do concept maps, then great! But whether we should require all instructors to put concept maps in their course specs should depend on whether it helps CLs and CDs see what kind of course the instructor is making.
Another example: If an instructor likes to use learner profiles, then great! But the reason to include them in the course spec would be if it makes it easier for me as a CL to gauge whether the course is at a reasonable skill/knowledge level for our Students.
@yashasroy commented on Fri Sep 29 2017
We are working with experienced teachers and instructors, and we should let them do their thing when it comes to how they plan/sketch/experiment/think about their course.
In my experience, super experienced teachers and instructors can still struggle with creating interactive online content. I agree with you in principle that we shouldn't be too rigid, but having a process in place (that works! :) ) ensures course development is streamlined.
But I think your point also speaks to @ncarchedi's comment in https://github.com/datacamp/example-course-specs/issues/9 about how creating slides/scripts before exercises may be a better way for instructors to approach their courses.
@gvwilson commented on Mon Oct 02 2017
Please close if these points were addressed in the latest rewrite.
Show clearly that the order in which things are developed is not the order in which they are presented.
http://authoring.datacamp.com/courses/ still shows screenshots of the old Teach editor, which has since been entirely revamped.