uoftcoders / rcourse

Reproducible Quantitative Methods for EEB

Home Page: https://uoftcoders.github.io/rcourse/

License: Other

HTML 2.09% R 47.94% TeX 49.97%

rstats teaching-materials coding ecology eeb evolutionary-biology

rcourse's Issues

Where will final assignments be created?

Assuming they should be under version control and on GitHub... Do we create a repo on UofTCoders for each project and add the team members to their respective repos? Or do we have one of the members create a repo and share it with the others? If so, how do we decide which student gets the original repo (and thus full control, and the potential perception that it is the lead repo)?

My thought:

We create each repo under UofTCoders.
Add each member as a contributor.
Set the master branch as protected (so no one can push to it).
Each member forks it.
Each member submits PR to the main one to keep it updated.
Anyone can merge PRs.

This way, we know who's done what, we can help out more closely, we can intervene if necessary (e.g., resolve any file conflicts), and easily provide direct feedback on the project.

Others' opinions?

Upload draft of assignments for plotting lectures

See UofTCoders/council#167

Upload draft of assignments for getting started with the data set

From @joelostblom on July 13, 2017 23:32

Since exactly which data sets will be used is not finalized, focus on what you know you will include in terms of getting set up with prodigenr, tidying data, and what assignments will be associated with this. It might be a bit tricky to measure since the students choose different data sets, so let me know if you think it is unclear and we can discuss other alternatives.

Copied from original issue: UofTCoders/council#168

What's the workflow for grading assignments?

We discussed this briefly before. Should students submit assignments on Blackboard, via email, or as pull requests on GitHub? GitHub would be nice just because it is easy to comment on specific lines of a pull request, but we would need many repositories so that students cannot see each other's work... I guess it works with inline comments in email as well, but it would not be as easy to render it nicely...

Make sure all URL links are listed `"please see/click [this link](https...)"`

I'll have to test this out, but since the assignments will be uploaded to Blackboard as PDFs, the links should be explicit, so students can click them.

(based on a comment from @QuLogic)

Sort lecture files and assignments into folders

I would suggest a couple of folders named 'lectures' and 'assignments'. @lwjohnst86 Did you already have something in mind for this?

Decide on who teaches what

From @joelostblom on March 24, 2017 0:40

Copied from original issue: UofTCoders/council#117

Create end of year survey

For feedback on what was good, etc.

Upload draft of assignments for plotting lectures

From @joelostblom on July 13, 2017 23:28

I will include some introductory ggplot in the dplyr lecture. Maybe how to use scatter plots, histograms, and grouping by a factor (maybe hive/dotplots also). The idea is that this plotting lesson should be more advanced and could include topics like boxplots, violinplots, heatmaps, clustering, 2d histograms/kdes, faceting, theming, geoms, interactivity, confidence intervals, smoothing, etc.

The second plotting lesson will not have any assignments, but will instead focus on the students plotting their own data, ideally in specific ways as defined by these lectures. Feel free to use some of the time in the second plotting lesson to cover more material if you feel like you need it.

Copied from original issue: UofTCoders/council#167

Finalize date for the final assignment

It is currently listed as Dec 9-20 in the syllabus, let's pick a day.

Finalize RQM syllabus

From @joelostblom on May 4, 2017 1:45

Copied from original issue: UofTCoders/council#151

Should assignments only cover what has been taught in class?

Could a few of the questions be an expansion of the material taught during the lesson? Or do we want every concept to be explained in class and then the students work through it with their own data?

Do you need to be a graduate student to be a TA?

From @joelostblom on March 24, 2017 0:41

Are there restrictions in general for who we can recruit as a helper for the RQM tutorials?

Copied from original issue: UofTCoders/council#118

Touch base with Christie

See UofTCoders/council#174

Survey asking about skill level/etc of students, to complete after the first lesson

I think we discussed this previously. It would be useful to have a survey of the students' skill level, familiarity with concepts, etc., for our own knowledge and use.

Depending on how we deal with forming groups, this survey could help with that.

Define criteria for how the final project will be graded

From @joelostblom on July 13, 2017 23:39

As per Devin's comments, students might be keen to know this right from the start, and since it is a significant part of their grade, this should be clear upfront.

Copied from original issue: UofTCoders/council#170

Create a DOI of the course content at end of class for future reference.

(This is for future reference and doesn't need to be dealt with now.)

This way, we can reference/cite the tagged final version used in this term for later years, change up the course as we see fit for other purposes (later years, for other universities, for a standalone workshop/course, etc), while still having the previous, citable course.

Thoughts? (again, for later reference. It's not important now)

https://github.com/UofTCoders/council/issues/169

See UofTCoders/council#169

Should we use data.table or dplyr (or something else) for the RQM course?

From @joelostblom on March 25, 2017 14:18

This is a long term item, I listed it for the May meeting (if we have a May meeting). But I'm putting down the details since this popped up in my head now, and we can discuss it here if you have the time.

I know data.table is generally faster than dplyr, but for the purposes of this class, I think we should almost exclusively care about what syntax is more intuitive to understand and what expertise already exists in the council.

I have limited experience in R, but for me data.table is more intuitive. I think this is largely because it has similarities to Python Pandas, and I don't know if someone with a different background would share my opinion.

@lwjohnst86 and @lcoome since you have the most R experience, I have assigned you to this issue. What do you use and what are your thoughts on this topic?

A discussion of the differences between data.table and dplyr can be found in this SO thread. There are several syntax examples from the authors of the respective packages. I'll list a couple here for those who don't want to read the thread:

# dplyr (the diamonds data set ships with ggplot2)
library(dplyr)
library(ggplot2)
diamonds %>%
  filter(cut != "Fair") %>%
  group_by(cut) %>%
  summarize(
    AvgPrice = mean(price),
    MedianPrice = as.numeric(median(price)),
    Count = n()
  ) %>%
  arrange(desc(Count))

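# data.table (diamonds must first be converted to a data.table)
library(data.table)
diamonds <- as.data.table(ggplot2::diamonds)
diamonds[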
  cut != "Fair", 
  .(AvgPrice = mean(price),
    MedianPrice = as.numeric(median(price)),
    Count = .N
  ), 
  by = cut
][ 
  order(-Count) 
]

DT[, sum(y), by=z]                       ## data.table
DF %>% group_by(z) %>% summarise(sum(y)) ## dplyr

DT[, if(any(x > 5L)) y[1L]-y[2L] else y[2L], by=z]                        ## data.table
DF %>% group_by(z) %>% summarise(if (any(x > 5L)) y[1L]-y[2L] else y[2L]) ## dplyr

Copied from original issue: UofTCoders/council#132

Publish experience of course after it finishes

See: UofTCoders/council#173

Draft suggestion of RQM course schedule

From @joelostblom on March 25, 2017 12:42

Copied from original issue: UofTCoders/council#130

Touch base with Christie, send the syllabus and inform that the course is happening

From @joelostblom on July 15, 2017 16:27

Copied from original issue: UofTCoders/council#174

Encourage students to use GitHub and Gitter as their primary ways of communication?

One of our grading criteria is continuous progress. We will already have access to their commit history and some level of discussion in issues, depending on how the students use it. Students will likely also use some sort of chat service to coordinate meetings etc.

It would be nice if students used GitHub and Gitter almost exclusively instead of email and standard chat apps. This will make them familiar with this workflow faster, and it will facilitate fair grading, since we can follow their entire discussion and see who leads the group in terms of setting up meetings, making sure deadlines are met, etc.

Potential drawbacks:

- Students feel anxious that what they discuss within the group can be viewed by their TAs (and everyone on the internet if the repo is public).
  - This is how they are likely to work in the future, and it is important to get exposed to it and to experience having their professional opinions associated with their name online.
  - They can always send emails to each other for matters they regard as confidential.
- More work for us in grading the chat history.
  - We don't have to go through it in minute detail. It is more to see whether someone really pulls all the weight in a team or whether some teams don't communicate at all.

Add question about own data to Google survey

Finalize topic for what is now "Linear regression population dynamics models"

From @joelostblom on July 13, 2017 23:36

This is currently about linear regression since Martin mentioned this briefly and it ties in nicely with the Tuesday lecture also being on modelling. But I don't think @mbonsma needs this lecture to tie into her modelling content (correct me if I'm wrong), so feel free to change this to anything you like, for example if you feel there is a really important R/analytics/statistics concept that we should include. You can also shuffle your lectures and have the data set introduction here if you want students to think about it over the weekend and start working with their data set the following Tuesday.

Copied from original issue: UofTCoders/council#169

Follow up with Francois from CS

From @joelostblom on March 30, 2017 22:36

This is not of immediate concern.
If computer science is not interested, and EEB still wants it to be cross-appointed if it turns into a full course, I think we should involve Stats or SciNet, which EEB already has connections with. I have heard good things about SciNet, at least from the one student I know who has tried their course offerings =)

Copied from original issue: UofTCoders/council#135

Define criteria for how the final project will be graded

See UofTCoders/council#170

Ask whether we can get MiKTeX and the necessary R LaTeX packages on the computers

I think it would be great if these could be preinstalled so that students can easily render PDF reports and see how easy it is once everything is set up. A strong point for me to use markdown in the first place is that I can easily render PDFs and hand them in for any class assignment. For this purpose, it is not as useful to teach how to generate markdown documents, since teachers will probably not accept those for assignments.

Setup guide

Decide on which R style guide to follow

From @joelostblom on July 16, 2017 17:46

From UofTCoders/council#164

@lwjohnst86 :
For the file naming, it might be best to encourage them to follow a standard that has been developed, rather than create their own. Like Google's R style guide https://google.github.io/styleguide/Rguide.xml

@joelostblom :
I agree that the students should follow an already developed style guideline, and I think it is important that we are consistent across the lectures as well.

I think it is trickier to agree on which style to follow, since R does not have official guidelines. I believe Google's recommendations are sound by and large, with the exception of using periods in object/variable and column names. This is confusing for me, coming from an object-oriented language, and I would prefer snake case (a.k.a. underscores: variable_name) in variable names and Pascal / upper camel case in column names (ColumnName). But, and I keep saying this, I'm happy to conform to other standards based on advice from the more experienced R users among us and discussion with the rest of the group.

However, I do believe underscores would be helpful for students if they are ever to read or write code in another language, where periods have special meaning. This coding style is not without precedent in the R community, e.g. Hadley Wickham's stat 405 course recommends underscores for variable names, as does this R style guide. The most compelling reason to use underscores in our case might be to stay consistent with the recommendations in the tidyverse guidelines (naturally similar to those from stat 405), since we will largely be using packages from the tidyverse and it would be confusing to mix these conventions with those from another style guide. The tidyverse favors underscores for objects, functions, and parameters, e.g. tbl_df, group_by, geom_point, facet_wrap, etc. The tidyverse style guide is the most exhaustive of the ones I have seen, and often includes justifications of the chosen style, which helps understanding and memorization. There is also a package, lintr, for automatically checking adherence to the tidyverse guidelines, and I believe this is the syntax checker that is built into RStudio.

Some additional info: Bioconductor recommends lower camel case. Some Swedish research shows that all three styles are common in existing CRAN packages. This is a helpful SO question, with several good answers.
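To make the naming options above concrete, here is a minimal base R sketch that flags which names follow snake case. `is_snake_case` is a hypothetical helper written only for this illustration; it is not part of lintr or of any style guide, and the regular expression is just one reasonable reading of "lowercase words joined by underscores".

```r
# Hypothetical helper: TRUE for names written in snake_case
# (lowercase letters and digits, words separated by single underscores).
is_snake_case <- function(x) {
  grepl("^[a-z][a-z0-9]*(_[a-z0-9]+)*$", x)
}

# The same name written in the styles discussed above.
candidate_names <- c("variable_name", "variable.name", "variableName", "ColumnName")

# Only the first name passes the snake_case check.
is_snake_case(candidate_names)
```

A check like this could be run over the column names of a data set before an assignment is handed out, but a proper linter is probably the better tool for actual code.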

Copied from original issue: UofTCoders/council#175

Decide on which one or two datasets to use

As per our discussion with Martin, it would be simpler for us if we limited the choice of datasets they can use, so that troubleshooting on our end is easier. It would also help with grading, inter-team help/problem solving, etc.

Include working with timeseries data in dplyr?

I don't really have experience with timeseries, and I was initially thinking of not including it. However, it seems to be a prevalent data type to work with in ecology, and Martin mentioned that one of the databases is largely species abundance over time, so I think we should include this. I will add it to one of the assignments, probably the last one.

Discuss how many days students should have to complete assignments

See UofTCoders/council#165

Upload draft of assignments for getting started with the data set

See UofTCoders/council#168

Lecture hall items to follow up on ~1 week before RQM classes start

From @joelostblom on May 11, 2017 2:50

~~Check that the temperature in Ramsay Wright 109 has been fixed~~ Yes
~~New Rstudio installed in Carr Hall? Needs to be > 1.0 for notebooks~~
~~Did they fix NetSupport mass login support?~~
~~Is NetSupport now available in Ramsay Wright 109?~~ Yes
~~Did they install the packages or do we have to install every time?~~
~~We should visit and test things out maybe a week or so before the semester starts. Especially if we want to do in class surveys.~~ Done, nothing installed =(

Copied from original issue: UofTCoders/council#156

Style guide for R lectures and courses

See here for initial issue: UofTCoders/council#175

Initial comments:

Originally from pull request UofTCoders/council#164

@joelostblom :
I agree that the students should follow an already developed style guideline, and I think it is important that we are consistent across the lectures as well.

I think it is trickier to agree on which style to follow, since R does not have official guidelines. I believe Google's recommendations are sound by and large, with the exception of using periods in object/variable and column names. This is confusing for me, coming from an object-oriented language, and I would prefer snake case (a.k.a. underscores: variable_name) in variable names and Pascal / upper camel case in column names (ColumnName). But, and I keep saying this, I'm happy to conform to other standards based on advice from the more experienced R users among us and discussion with the rest of the group.

However, I do believe underscores would be helpful for students if they are ever to read or write code in another language, where periods have special meaning. This coding style is not without precedent in the R community, e.g. Hadley Wickham's stat 405 course recommends underscores for variable names, as does this R style guide. The most compelling reason to use underscores in our case might be to stay consistent with the recommendations in the tidyverse guidelines (naturally similar to those from stat 405), since we will largely be using packages from the tidyverse and it would be confusing to mix these conventions with those from another style guide. The tidyverse favors underscores for objects, functions, and parameters, e.g. tbl_df, group_by, geom_point, facet_wrap, etc. The tidyverse style guide is the most exhaustive of the ones I have seen, and often includes justifications of the chosen style, which helps understanding and memorization. There is also a package, lintr, for automatically checking adherence to the tidyverse guidelines, and I believe this is the syntax checker that is built into RStudio.

Ask [email protected] about visiting the lecture halls and see if they have computers, R, etc.

From @joelostblom on April 29, 2017 13:41

EEB430H1F T 2-4, RW109; TH 2-4, Carr Hall (St. Mike's)

From Martin:

The RW 109 lab should have R installed and you should be able to go and log in with your UTORID and see the set up. I do not know the Carr Hall lab so you’ll have to look into that. Best wishes,

Copied from original issue: UofTCoders/council#143

Where should we upload the answer keys to the assignments?

So that students can't get hold of the answers, should we make the repository private? Or keep them in an encrypted zip? Or make a separate repo?

Publish our experience with the RQM/EEB430 course

From @joelostblom on July 15, 2017 16:25

Either in https://github.com/openjournals/jose, as a self-published piece on the UofTCoders website, or elsewhere.

Copied from original issue: UofTCoders/council#173

Include participation marks?

Like 5% or something. For random, non-assignment related tasks (e.g. for completing an important exercise in class or something).

Finalize RQM syllabus before sending to Martin

See UofTCoders/council#166

Survey students about how challenging the course is?

Martin mentioned that it is key to adjust the level of teaching to the students' ability, so that students don't feel either bored because it is too easy or lost because it is too hard. How do we want to keep an eye on students' opinions of the teaching level?

We will have the results from the assessments, and students can always contact us via email. Do we want to add a survey (maybe after each block) to actively encourage students to give their opinion about how challenging the course is?

Make list of required R-packages to be installed before the first class

From @joelostblom on May 5, 2017 17:53

We could also install these at the first class, or let the students install them as a learning experience. However, in order to get things running smoothly, I think we should install most, if not all of them, prior to the first class and test that things are working as expected.

Packages:

dplyr
prodigenr
ggplot2
rmarkdown
plotly?

@lwjohnst86 @lcoome what else is not part of a base RStudio install? Does it come with all these?
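For testing the lab machines, something along these lines could report which packages from the list are still missing. This is only a sketch: `find_missing_packages` is a hypothetical helper, and whether we are allowed to install packages on the lab computers at all is an open question.

```r
# Packages from the list above; plotly is still marked as undecided there.
required_packages <- c("dplyr", "prodigenr", "ggplot2", "rmarkdown", "plotly")

# Hypothetical helper: which of `required` are absent from `installed`?
# By default `installed` is whatever the current R library contains.
find_missing_packages <- function(required,
                                  installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

missing_packages <- find_missing_packages(required_packages)

# Only install when run interactively, so sourcing this file has no side effects.
if (interactive() && length(missing_packages) > 0) {
  install.packages(missing_packages)
}
```

Running the check on one machine in each lecture hall would tell us whether the packages persist between logins or have to be reinstalled every time.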

Copied from original issue: UofTCoders/council#155

Open Access week for RQM course, Oct 23-29

From @joelostblom on March 25, 2017 11:49

Keep an eye on http://www.openaccessweek.org/ and see if they will post anything that we could mention in the course.

Copied from original issue: UofTCoders/council#129

Discuss how many days students should have to complete assignments

From @joelostblom on July 13, 2017 23:8

Joel:
The dates are chosen so that each assignment is due on Monday the following week. Is this enough time if we hand out assignments during the Thursday lecture? I chose Monday since it would give us some time to briefly go over most of the assignments before the next class and briefly repeat general concepts if many students struggled with the same problem.

Thinking about it a little more, I believe a due date on Tuesday might be better. This gives students a chance to ask questions before and after class regarding the assignment, before they hand it in. It would fit particularly well if we make Tuesdays our office hours. It would also give us an additional day (Wednesday) to go over the assignments and bring up any key concepts in the following Thursday lecture. I will change to this in the next iteration unless there are opposing views.

lwjohnst86 a day ago Owner
I like this idea a lot

mbonsma a day ago Owner
So they would typically have five days to do the assignments? I think it should be a week and five days, unless they're quick.

joelostblom a day ago Member
I like the idea of a quick turn around (5 days) better, but let's discuss in person after we have a clearer idea of how involved the assignments will be.

joelostblom a minute ago Member
We talked briefly about this today. We agree that the due date should be on the same day that we have office hours so that students can come and ask last-minute questions.

We largely agreed that it would be better if students had 8 days rather than 3 days (excluding weekends) to hand in the assignment, especially since around 10 days seems to be standard and we don't want to scare students away by making them work on the weekends right away... The drawback of this is that it would be harder to follow up promptly if there are particular concepts that many students don't understand. We could remedy this with a socrates online quiz at the beginning of the following lecture (maybe during the break?), and repeat any concepts that are unclear for many students. We just don't want to add too much overhead...
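To double-check the day counts above, here is a minimal base R sketch. It assumes assignments are handed out on a Thursday and are due on a Tuesday; `count_weekdays` is a hypothetical helper and the hand-out date is picked only for illustration.

```r
# Count weekdays (Mon-Fri) after `handout`, up to and including `due`.
count_weekdays <- function(handout, due) {
  days <- seq(handout + 1, due, by = "day")
  wday <- as.POSIXlt(days)$wday  # 0 = Sunday, 6 = Saturday
  sum(!(wday %in% c(0, 6)))
}

handout <- as.Date("2017-09-07")  # a Thursday, chosen only for illustration

count_weekdays(handout, handout + 5)   # due the following Tuesday: 3
count_weekdays(handout, handout + 12)  # due the Tuesday after that: 8
```

Under these assumptions, the "3 days" option corresponds to a due date on the Tuesday right after the Thursday lecture, and the "8 days" option to the Tuesday one week later.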

Copied from original issue: UofTCoders/council#165

Plan for the first class

This is for the first lecture, exactly one week from now!

This is what I imagine for the first class.

Martin talks for ~15 min and then the stage is ours.
I spend 10 min introducing the general concept of the class, why we are excited to teach it, and how we will work throughout the semester. I also talk about the logistics of the class a bit: that we will use GitHub and Blackboard, how assignments will be distributed, etc.
I am in favor of walking through the syllabus by having each person take ~4 min to introduce themselves, what they will teach, why they are excited about reproducible/quantitative/programming skills and an open science workflow, and how learning these skills has helped them in their research.
(We should be around 40 min into the class now. Have a break, or continue since it is a lightweight first class anyway? If we have a break, we can say a few words to Martin and then those who want to leave can do so.)
I talk about intro to programming (~30 min)
I talk about intro to RStudio and R Markdown (~30 min)
Should I bring up anything about the first assignment?

I imagine we don't have to bring up every single item on the syllabus, like academic integrity etc, just mentioning that they need to read the syllabus should be sufficient?

Create templates for Rmd lecture slides

How to form groups?

So the discussion is formalized here. What does everyone think about how the groups will be formed?

I recall we talked about having groups with varying skill sets involved. I like that approach, but how will we actually go about doing that? Randomly split them based on skills? Or some other criteria?

Thoughts?
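One possible way to do the "random split based on skills", sketched in base R. It assumes the survey gives us a numeric skill score per student; the `students` data frame, its columns, and `assign_groups` are all made up for this illustration, and this is just one option among many.

```r
# Assign students to groups so that each group mixes skill levels.
# `students` is a hypothetical data frame with columns `name` and `skill`.
assign_groups <- function(students, n_groups, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Shuffle first so that ties in skill are broken at random, then sort by skill.
  shuffled <- students[sample(nrow(students)), , drop = FALSE]
  ranked <- shuffled[order(shuffled$skill), , drop = FALSE]
  # Deal students out to the groups in turn, like dealing cards.
  ranked$group <- rep(seq_len(n_groups), length.out = nrow(ranked))
  ranked
}

students <- data.frame(
  name = paste0("student_", 1:12),
  skill = rep(1:3, each = 4)  # made-up scores: 1 = beginner, 3 = experienced
)

groups <- assign_groups(students, n_groups = 4, seed = 1)

# Each of the 4 groups ends up with one student per skill level.
table(groups$group, groups$skill)
```

Dealing out students in skill order keeps the groups balanced, while the initial shuffle keeps the assignment random among students with the same score.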

For modeling, use data to guide what we teach?

This is really for @mbonsma and me. Should we first decide on what one or two datasets we give them before completely fleshing out the lecture material? Makes it easier for us and easier for the students. So we can focus on teaching material that we think they will actually use in the project.

Create templates for Rmd assignment files

Finalize RQM syllabus before sending to Martin

From @joelostblom on July 13, 2017 23:16

#19 How many days to hand in assignments
#14 Define final assignment grading criteria
Deadline for the final project hand in? Final exam period: Dec 9-20. Last day to petition exam: Jan 10.
Finalize marks distribution
Change course title as per @lwjohnst86 's suggestion?

Copied from original issue: UofTCoders/council#166
