ecology-workshop's Introduction

ecology-workshop

Overview of the ecology workshop

Data Carpentry's aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. This workshop uses a tabular ecology dataset and teaches data cleaning, management, analysis, and visualization. There are no prerequisites, and the materials assume no prior knowledge of the tools.

The workshop uses a single tabular data set that contains observations about adorable small mammals over a long period of time in Arizona. See data.md for more information about this data set, including the download location.

The workshop can be taught using R or Python as the base language.

Overview of the lessons:

  1. Data organization in spreadsheets and data cleaning with OpenRefine
  2. Introduction to R or Python
  3. Data analysis and visualization in R or Python
  4. SQL for data management

An example of the ecology materials in the wild is the Data Carpentry workshop held at Caltech in 2015.

Detailed structure

Day 1 morning: Data organization & cleaning

There are two lessons in this section. The first is a spreadsheet lesson that teaches good data organization, and some data cleaning and quality control checking in a spreadsheet program.

The second lesson uses a spreadsheet-like program called OpenRefine to teach data cleaning and filtering, and to introduce scripting, regular expressions and APIs (application programming interfaces).

Day 1 afternoon and Day 2 morning: Data analysis & visualization

These lessons include a basic introduction to R or Python syntax, importing CSV data, and subsetting and merging data. They finish with calculating summary statistics and creating simple plots.
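As a rough illustration of the steps named above (importing CSV data, subsetting, merging, and calculating a summary statistic), here is a minimal Python sketch that uses only the standard library. The column names (`record_id`, `species_id`, `weight`, `genus`) and the sample rows are hypothetical, made up for this example and not taken from the workshop data set; the lessons themselves use their own files and tools.

```python
import csv
import io
from statistics import mean

# Hypothetical sample data standing in for CSV files (not the workshop data).
SURVEYS_CSV = """record_id,species_id,weight
1,DM,40
2,DM,44
3,PF,8
4,PF,
"""
SPECIES_CSV = """species_id,genus
DM,Dipodomys
PF,Perognathus
"""


def read_rows(text):
    """Import CSV text as a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


surveys = read_rows(SURVEYS_CSV)
species = read_rows(SPECIES_CSV)

# Subset: keep only rows that have a recorded weight.
weighed = [row for row in surveys if row["weight"] != ""]

# Merge: attach the genus to each survey row via the shared species_id key.
genus_by_id = {row["species_id"]: row["genus"] for row in species}
merged = [{**row, "genus": genus_by_id[row["species_id"]]} for row in weighed]

# Summary statistic: mean weight per genus.
weights_by_genus = {}
for row in merged:
    weights_by_genus.setdefault(row["genus"], []).append(float(row["weight"]))
mean_weight = {genus: mean(values) for genus, values in weights_by_genus.items()}

print(mean_weight)
```

Reading from in-memory strings keeps the sketch self-contained; with real files, the same `csv.DictReader` call would wrap an open file handle instead.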

Day 2 afternoon: Data management with SQL

This lesson introduces the concept of a database using SQLite, how to structure data for easy database import, and how to import tabular data into SQLite. Then, it teaches basic queries, combining results and doing queries across multiple tables.
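The flow described above (load tabular data into SQLite, run a basic query, then query across multiple tables) could look roughly like the following sketch, which uses Python's built-in `sqlite3` module. The table names, column names, and rows are hypothetical and chosen only for illustration; the lesson works with its own database file and tooling.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE species (species_id TEXT PRIMARY KEY, genus TEXT)")
conn.execute(
    "CREATE TABLE surveys (record_id INTEGER PRIMARY KEY, species_id TEXT, weight REAL)"
)

# Import tabular data (rows as tuples, as they might come from a CSV reader).
conn.executemany(
    "INSERT INTO species VALUES (?, ?)",
    [("DM", "Dipodomys"), ("PF", "Perognathus")],
)
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    [(1, "DM", 40.0), (2, "DM", 44.0), (3, "PF", 8.0)],
)

# Basic query: filter and sort rows from one table.
heavy = conn.execute(
    "SELECT record_id, weight FROM surveys WHERE weight > 10 ORDER BY weight DESC"
).fetchall()

# Query across multiple tables: join surveys to species and aggregate.
per_genus = conn.execute(
    """
    SELECT species.genus, COUNT(*) AS n, AVG(surveys.weight) AS mean_weight
    FROM surveys
    JOIN species ON surveys.species_id = species.species_id
    GROUP BY species.genus
    ORDER BY species.genus
    """
).fetchall()

print(heavy)
print(per_genus)
conn.close()
```

Keeping species information in its own table and joining on `species_id` is one way to structure data for easy import; it avoids repeating the genus on every survey row.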

Other lessons

There are a number of other ecology lessons that are not part of the base workshop. Some of these are no longer taught, and some are only taught at extended workshops.

ecology-workshop's People

Contributors

adamobeng, bencomp, erinbecker, ethanwhite, fmichonneau, iramosp, jhollist, kariljordan, katrinleinweber, kcranston, maneesha, quist00, tobyhodges, tracykteal, villanueval, zkamvar

ecology-workshop's Issues

Transition to standardized GitHub labels

The lesson infrastructure committee unanimously approved the proposal of using the same set of labels across all our repositories during its last meeting on May 23rd, 2018.

This repository has now been converted to use the standard set of labels.

If this repository used the set of labels previously recommended by Software Carpentry, they have been converted to the new set using the following rules:

SWC legacy labels      New 'The Carpentries' labels
bug                    type:bug
discussion             type:discussion
enhancement            type:enhancement
help-wanted            help wanted
newcomer-friendly      good first issue
template-and-tools     type:template and tools
work-in-progress       status:in progress

The label "instructor-training" was removed as it is not used in the workflow of certifying new instructors anymore. The label "question" was left as is when it was in use and removed otherwise. If your repository used custom labels (and issues were flagged with these labels), they were left as is.
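For anyone who wants to apply the same rules to their own label lists, the conversion can be sketched as a simple lookup. This is only an illustration of the rules stated above, not the script the committee actually ran; the function name `convert_labels` and the example input are hypothetical.

```python
# Legacy Software Carpentry label -> standardized label, per the table above.
LABEL_MAP = {
    "bug": "type:bug",
    "discussion": "type:discussion",
    "enhancement": "type:enhancement",
    "help-wanted": "help wanted",
    "newcomer-friendly": "good first issue",
    "template-and-tools": "type:template and tools",
    "work-in-progress": "status:in progress",
}

# Labels that were removed rather than renamed.
REMOVED_LABELS = {"instructor-training"}


def convert_labels(labels):
    """Return the labels of one issue after applying the conversion rules.

    Hypothetical helper: custom labels and "question" pass through unchanged.
    """
    converted = []
    for label in labels:
        if label in REMOVED_LABELS:
            continue  # dropped entirely
        converted.append(LABEL_MAP.get(label, label))
    return converted


print(convert_labels(["bug", "instructor-training", "my-custom-label"]))
```

The sketch does not model the repository-level rule for "question" (kept only where it was in use), since that depends on the state of the whole repository rather than on a single issue.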

The lesson infrastructure committee hopes the standard set of labels will make it easier for you to manage the issues you receive on the repositories you manage.

The lesson infrastructure committee will evaluate how the labels are being used in the next few months and we will solicit your feedback at this stage. In the meantime, if you have any questions or concerns, please leave a comment on this issue.

-- The Lesson Infrastructure subcommittee

Inconsistent formatting between lessons

When looking through the different lessons listed in the overview, there is a consistent formatting style that is followed by all of the lessons except the Data Analysis and Visualization in R for Ecologists section. Having consistency between all of the lessons may improve reading/navigation speed when switching between lessons.

add information about installing tidyverse and RSQLite for mac and windows

I am teaching a remote workshop starting this week and a learner scheduled time with me because they were stuck on the instructions: https://datacarpentry.org/ecology-workshop/setup-r-workshop.html

They were stuck because the Windows section had no instructions for installing the tidyverse and RSQLite packages, even though the beginning of the section says:

R and RStudio are separate downloads and installations. R is the underlying statistical computing environment, but using R alone is no fun. RStudio is a graphical integrated development environment (IDE) that makes using R much easier and more interactive. You need to install R before you install RStudio. After installing both programs, you will need to install some specific R packages within RStudio. Follow the instructions below for your operating system, and then follow the instructions to install tidyverse and RSQLite.

I recommend adding the Linux instructions (copy-pasted below) to all sections, or adding another subheading to clarify this point.

After installing R and RStudio, you need to install the tidyverse and RSQLite packages. Start RStudio by double-clicking the icon and then type: install.packages(c("tidyverse", "RSQLite")). You can also do this by going to Tools -> Install Packages and typing the names of the packages you want to install, separated by a comma.

Typos

Here are two small typos.

In Day 1 morning:

The second lesson uses a program called OpenRefine to teach data cleaning and filtering, and to introduce the idea scripting(application programming interfaces)

In Day 1 afternoon and Day 2 morning:

These lessons includes a basic information to R or Python syntax, importing CSV data, subsetting and merging, data, and finishes with how to do plotting.

June 2019 Lesson Release checklist

If your Maintainer team has decided not to participate in the June 2019 lesson release, please close this issue.

To have this lesson included in the 18 June 2019 release, please confirm that the following items are true:

  • Example code chunks run as expected
  • Challenges / exercises run as expected
  • Challenge / exercise solutions are correct
  • Call out boxes (exercises, discussions, tips, etc) render correctly
  • A schedule appears on the lesson homepage (e.g. not “00:00”)
  • Each episode includes learning objectives
  • Each episode includes questions
  • Each episode includes key points
  • Setup instructions are up-to-date, correct, clear, and complete
  • File structure is clean (e.g. delete deprecated files, ensure filenames are consistent)
  • Some Instructor notes are provided
  • Lesson links work as expected

When all checkboxes above are completed, this lesson will be added to the 18 June lesson release. Please leave a comment on carpentries/lesson-infrastructure#26 or contact Erin Becker with questions ([email protected]).

add cheat sheet for the workshop

It might be nice to have a two-page (one sheet, double-sided) cheat sheet for the topics in this workshop. There could be a 'Tidy Data' section, an 'OpenRefine' section, an R or Python section, and an SQL section.

This is a nice start on an SQL one
http://www.sql-tutorial.net/SQL-Cheat-Sheet.pdf

It might make more sense, though, to have one cheat sheet per topic, so that things are more modular and each cheat sheet stays with its topic.

update data page to describe all data files

The new setup page (PR) includes a single download link to get all of the data files from FigShare, but this downloaded data is used inconsistently across the lessons. The data page is out of date and doesn't describe all of the data files that learners download and use in this lesson (in the "Files we use in this dataset" section).

  • update the "files we use" section of data page to describe all files used
  • update the first exercise in organization lesson to use the data file they've already downloaded, rather than downloading in situ.
  • add the modified (reshaped and filtered) dataset that is created in the R lesson to Figshare as a backup for learners who weren't able to create it and for instructors who had to skip reshape section
  • check whether Python lesson also uses this modified dataset and update if needed

Transition to The Carpentries Workbench

(Related to #38)

Hi @datacarpentry/ecology-workshop-maintainers

The Curriculum Team is preparing to transition all of the workshop/curriculum overview sites to the Workbench. @zkamvar has set a tentative migration date of Monday 18th September. This change will bring all of the overview sites, including this one, back in line with all of the official lessons that were transitioned to the new infrastructure in May.

What do you need to do?

To help us prepare for a smooth transition, we invite you to:

  1. explore the preview of the transitioned version of the overview site: https://fishtree-attempt.github.io/ecology-workshop/
  2. process any pull requests made to the repository, ensuring that they are all merged or closed before the transition takes place. Any open pull requests on this repository will be invalidated by the transition. There are currently zero (0) open pull requests on the repository. 🎉
  3. post the questions and concerns you have about the transition here, tagging @tobyhodges.

What happens next?

After the transition, Maintainers will temporarily lose their write access to the repository as a safety measure, ensuring that the old project history cannot be accidentally pushed back into the repository. The steps to restore access are relatively straightforward (see an example from the completed transition of another lesson) and you can expect detailed instructions and further support from the Curriculum Team when the transition takes place.

Beyond that, you can expect to continue maintaining the repository as you were before, while taking advantage of the simplified layout and syntax that the new infrastructure offers. main will become the default branch of the repository (currently gh-pages), and files currently located in _extras/ will be divided between two new folders: learners/ and instructors/. You can read more about the changes to repository structure/organisation and source file syntax in the Workbench Transition Guide.

Unifying data files among different modules

I taught a workshop recently and found that we have to ask learners to download the data files at the beginning of (or during) each session. Perhaps we should standardize the data files so that learners only need to download them once for the whole workshop.

I've found that the zip file for the SQL lesson is the most complete and reliable one, so we should use and refer to it throughout the workshop. The Python version of the data files (and, from what I've heard, the R version too) has a serious bug: the SQLite file that comes with it produces many errors when used in the SQL module.
