datacarpentry / deprecated-cloud-genomics-orig

Deprecated - this repository is an outdated fork from datacarpentry/cloud-genomics

License: Other

HTML 33.85% CSS 11.88% Ruby 3.02% Makefile 2.69% JavaScript 0.87% R 2.67% Python 44.85% Shell 0.17%

deprecated-cloud-genomics-orig's Introduction

Data Carpentry Core Curriculum

This repository is the historical repository of Data Carpentry lessons. Each lesson is now its own repository, so please see those individual repositories for the most recent versions of the lessons or to do any lesson development.

DO NOT USE OR DEVELOP IN THIS REPOSITORY

Individual lessons

The repositories for individual lessons can be found here:

deprecated-cloud-genomics-orig's People

Contributors

abbycabs, adina, ammatsun, billmills, blasseigne, carynjohansen, cjfields, devbioinfoguy, evanwill, fmichonneau, gdevenyi, gvwilson, jasonjwilliamsny, jduckles, jpallen, julianpistorius, kariljordan, krmaas, maxim-belkin, neon-ninja, pbanaszkiewicz, pipitone, rgaiacs, stephenturner, synesthesiam, taylorreiter, tmorrell, tracykteal, twitwi, wrightaprilm

deprecated-cloud-genomics-orig's Issues

transferring data to the instances

We often get a question of how to get data onto the instances. I don't think we have a consistent place where we cover this, and we might want to put it sooner in the lesson.
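
One common way to get a file onto a remote instance is `scp`. The sketch below only assembles the command so it can be inspected before running; the file name, user name, host name, and key file are hypothetical placeholders, not values taken from the lessons.

```python
import shlex


def build_scp_command(local_path, user, host, remote_path, key_file=None):
    """Return the argument list for copying one local file to a remote instance."""
    cmd = ["scp"]
    if key_file is not None:
        # -i selects the SSH private key (identity file) used to authenticate.
        cmd += ["-i", key_file]
    cmd += [local_path, f"{user}@{host}:{remote_path}"]
    return cmd


# Hypothetical values: substitute the real key, user, and instance address.
cmd = build_scp_command(
    "reads.fastq",
    "dcuser",
    "ec2-host.example.com",
    "~/data/",
    key_file="workshop-key.pem",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the placeholders are replaced.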

Lesson 3 doesn't match Lesson 1

Lesson 1 uses ami-6516b30e. Lesson 3 uses ami-3c1c3454 which has a different organization and contents (and also appears not to be available in the AWS Community Marketplace). I think the solution is for Lesson 3 to be updated according to the organization and contents of ami-6516b30e. Are there plans to do this? (This may be the same issue that's described in #5.)
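
A mismatch like this could be caught automatically by scanning the lesson text for AMI IDs. The following is a minimal sketch, assuming the 8-hex-digit ID form used in this thread; the lesson names and sentences are invented stand-ins, and only the two AMI IDs come from the issue.

```python
import re

# Matches the 8-hex-digit AMI IDs quoted in this issue (an assumption;
# other ID lengths would need a wider pattern).
AMI_PATTERN = re.compile(r"\bami-[0-9a-f]{8}\b")


def amis_by_lesson(lessons):
    """Map each lesson name to the sorted, de-duplicated AMI IDs in its text."""
    return {
        name: sorted(set(AMI_PATTERN.findall(text)))
        for name, text in lessons.items()
    }


def is_consistent(lessons):
    """True when all lessons that mention AMIs mention the same set of IDs."""
    found = {tuple(ids) for ids in amis_by_lesson(lessons).values() if ids}
    return len(found) <= 1


# Stand-in lesson text for illustration only.
lessons = {
    "lesson-1": "Launch ami-6516b30e from the console.",
    "lesson-3": "Search for ami-3c1c3454 in the community AMIs.",
}
print(amis_by_lesson(lessons))
print(is_consistent(lessons))
```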

Confusion over which AMI to use

G'day, I am planning to teach the DataCarpentry Genomics Workshop to a group of research students and staff in a few weeks' time. In the course of preparing for the workshop, I noted a few inconsistencies within the material that I would like to point out and receive clarification about.

First, there are two AMIs referenced in the material. One of them, ami-3c1c3454, doesn't seem to exist within the Community AMIs. I tried following Jason Williams's updated instructions (found here https://jasonjwilliamsny.github.io/cloud-genomics/logging-onto-cloud.html), and encountered a similar problem locating ami-07b4456a.

The image I could locate and run as an instance (ami-6516b30e) doesn't seem to contain all of the .fastq data described in the rest of the lesson.

So, I guess my questions are two:

  1. Is there a single AMI for these lessons, if so, how can I access it?

  2. Is the difference between the AMIs referred to just the data present? If so, could I use the SRA Toolkit prefetch command to add the necessary data to ami-6516b30e and run the workshop with that?

Attached are some screenshots of the portions of the lessons which reference the AMIs I couldn't find. Thanks in advance for your time and help with my enquiry.

Kind Regards,

Collin
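
The `prefetch` idea in question 2 could be scripted roughly as follows. This sketch only builds one `prefetch <accession>` command per run; the accession strings are hypothetical placeholders, since the thread does not list the runs the lessons use. Note that `prefetch` downloads data in SRA format, so a separate conversion step would still be needed to obtain .fastq files.

```python
def build_prefetch_commands(accessions):
    """Return one `prefetch <accession>` argument list per unique accession.

    Order is preserved; blank entries and repeats are skipped.
    """
    seen = set()
    commands = []
    for acc in accessions:
        acc = acc.strip()
        if acc and acc not in seen:
            seen.add(acc)
            commands.append(["prefetch", acc])
    return commands


# Hypothetical accession placeholders: replace with the runs the lesson needs.
commands = build_prefetch_commands(["SRR0000001", "SRR0000002", "SRR0000001"])
for cmd in commands:
    print(" ".join(cmd))
```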

[Attached screenshots: j williams non existent ami; lesson_1_ami_without_data; prefetch to get data; ref_to_nonexistent_ami]

documentation on virtual machine

Is there documentation anywhere on how the virtual machine was prepared:

  • what software has been installed?
  • what software is running (RStudio server etc.)?
  • what data is included and where is it located?

Why is the intro to CC before the command line?

Is there any particular reason to do the introduction to Cloud computing for genomics (module 3) before the introduction to the command line (module 4)? Personally, and after discussing this with others, I think the command line lesson should come first.
Reading through the contents of the Cloud computing (CC) module 3, I found that
cloud-genomics/lessons/4.parallel-analysis.md
says: "As we learned in the Unix shell lesson,"
However, that lesson is not taught until the next module.
Should the order be changed?

Add instructor notes document for this lesson

I'm working on helping direct instructor attention towards fixing up/contributing to instructor notes. I currently don't have a link to provide for instructor notes for this lesson. Please add one - even a blank document would be somewhere to point towards.

things people learned

This is just an issue to collect information on what people are saying they learned and liked about this lesson.

Students learned that:

"Amazon has cloud computing resources!"

Multiple cloud resources for lessons

There are multiple possibilities for cloud resources for these workshops. Places hosting multiple workshops that have local resources or organizations that have resources might want to use their own. We therefore likely need parallel documentation on those resources for components of these lessons.

Likely we should keep them all in this repo and name them 01-aws-logging-in-to-cloud.md and 01-iplant-logging-in-to-cloud.md but maybe we need to think about how to manage this and what cloud resources we want to develop lessons for.
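
The proposed naming scheme (number, platform, topic) is regular enough to generate and parse mechanically. A minimal sketch, with hypothetical helper names, assuming the platform is always a single hyphen-free token such as `aws` or `iplant`:

```python
def lesson_filename(number, platform, slug):
    """Build a name such as 01-aws-logging-in-to-cloud.md."""
    return f"{number:02d}-{platform}-{slug}.md"


def parse_lesson_filename(filename):
    """Split a lesson file name into (number, platform, slug)."""
    stem = filename[:-3] if filename.endswith(".md") else filename
    # Split on the first two hyphens only; the slug keeps its own hyphens.
    number, platform, slug = stem.split("-", 2)
    return int(number), platform, slug


names = [
    lesson_filename(1, platform, "logging-in-to-cloud")
    for platform in ("aws", "iplant")
]
print(names)
```

Grouping files by the parsed platform would then give one parallel lesson set per cloud resource.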

tmux vs screen

Is there a reason to teach tmux over screen?
I think screen is installed on most systems by default; tmux not necessarily.
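
Whether either tool is present on a given instance can be checked directly rather than assumed. A small sketch using the standard library; it reports only what is on the current machine's PATH:

```python
import shutil


def available_multiplexers(candidates=("screen", "tmux")):
    """Map each candidate tool to its executable path, or None if not found."""
    return {name: shutil.which(name) for name in candidates}


found = available_multiplexers()
for name, path in found.items():
    print(f"{name}: {path or 'not installed'}")
```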

Proposal for re-organization of cloud genomics lesson

At DIBSI SC Instructor Training, we discussed the need to re-organize the files and order of the lesson to target a novice audience.

https://github.com/datacarpentry/cloud-genomics

@taylorreiter @krmaas @blasseigne @carynJohansen

  1. Why Cloud Computing? to establish motivation: 01-Introduction2.md + intro to different types of platforms from 02-logging-onto-cloud.md

  2. Introduction to commandline basics:

     a) Finder/file system hierarchy (GUI) vs. commandline navigation with cd, ls, pwd
     b) New lesson needed for Windows users to learn how to force Windows OS to show file hierarchy in explorer
     c) Use existing shell lesson on files hierarchy

  3. Connecting to remote instance: 02-why-cloud-computing.md

  4. Data to and from instance: 03-moving-data.md and 06-data-roundtripping.md, + filezilla info from hbc lesson: https://github.com/hbc/dc_2016_04/blob/master/lessons/07_read_qc.md

  5. Keep background processes running with screen (need explanation) and tmux, with info from this lesson: http://www.datacarpentry.org/cloud-genomics/

  6. Single analysis: 04-single-analysis.md

  7. Parallel analysis: 05-parallel-analysis.md

Add LICENSE.md file

This material is generating quite a lot of interest. It would be good to have an explicitly stated license.

Image in cyverse results in Deploy_error

The TSW Workshop Williams 1.2 image in Cyverse results in a Deploy_error. Unfortunately the Atmosphere interface doesn't provide more details. You can log into the instance, but itools doesn't seem to be installed.

how to upload files

We often get a question of how to get data onto the instances. I don't think we have a consistent place where we cover this, and we might want to put it sooner in the lesson.
