bids-collaborative / edam

Ecostations Data Access Monitor

Home Page: http://bids-collaborative.github.io/EDAM/

License: BSD 2-Clause "Simplified" License

Python 75.70% R 0.63% HTML 12.42% CSS 2.52% JavaScript 8.12% Makefile 0.61%

edam's People

Contributors

cleb11, davclark, gitter-badger, jinzhaohong, jonathanwang017, justinmi, koalaboy808, lavanyaharinarayan, linbrian, soham14, triyangle, vedants

edam's Issues

suggest to add IUCN International Union for Conservation of Nature

From a discussion with Robert Guralnick (co-author of the Meyer et al. 2015 paper on completeness), it appears that the source of expert knowledge was the International Union for Conservation of Nature (IUCN, http://iucn.org). I suggest adding IUCN to the existing comparison as a separate row with the metrics for the different islands, including (if possible) links to the data source / API call URLs.

Feedback from CEGA team

Real-world challenge

What new understanding will be obtained? What public good would be served? What business opportunities are available? Who would benefit, who are consumers/users of your data or technology? Why are you motivated to work on this project?

Ecostations Data Access Monitor (EDAM) identifies current gaps in several existing databases of species biodiversity information. Digital accessible information (DAI) is disjoint, inconsistent, and biased. The team identifies the major problems as the distance between researchers, locally available research funding and the ability/willingness to participate in data sharing networks.

The team hopes that comparing available ecological data across ecostations will help stimulate collaboration between scientists, technologists, educators, local governments and research foundations to help better understand and sustain ecosystems around them. As a result of the project, EDAM aims to help make ecological data available for specific spatio-taxonomic spaces.

It would be great if the team could offer some evidence to support its diagnosis of the main problem.
Are there any data or literature that indicate the magnitude of the problem, associated factors, and possible alternatives? Could the team come up with a logical framework underlying its approach to the real-world challenge? We think it might be a strong assumption to say that “Once we provide methods to combine and process biodiversity data at global scales, institutions can start to re-examine existing data to coordinate data collection efforts, evolve data sharing strategies and discover methods to efficiently sustain ecosystems”. Specifying a results chain with input-activity-output-outcome links would help the team set its goals more systematically.

Data / Materials

What data sources or data collection methods have you used? Are they automatic/numerically collected, or manually collected (as with a survey)? Where / how is data stored? What will you do / have you done to get data? Is there important metadata (e.g., time of collection, source, etc.)? Why is the data you have or expect to have sufficient or insufficient to meet your challenge?

EDAM’s data are derived from openly available biodiversity data repositories (e.g. GBIF, iDigBio, GloBI). The team does not store the data but queries the repositories automatically each time the data are needed. Initially, only species lists and associated food webs are compiled for participating ecostations, using automated data processing algorithms. For each ecostation, the completeness of the lists and webs is estimated. The similarity of the lists and webs is also calculated across the spatially separated island ecosystems to highlight ecological likeness.
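As a rough illustration of the completeness and similarity steps described above, here is a minimal Python sketch. It rests on assumptions that are not taken from the EDAM code: completeness is treated as the fraction of a reference checklist found in an ecostation's list, similarity is measured with the Jaccard index, and the species lists and function names are hypothetical.

```python
def completeness(observed, reference):
    """Fraction of a reference checklist that appears in the observed list.

    Assumption: completeness is measured against a reference checklist;
    EDAM's actual estimator may differ.
    """
    reference = set(reference)
    if not reference:
        raise ValueError("reference checklist is empty")
    return len(set(observed) & reference) / len(reference)


def jaccard(list_a, list_b):
    """Jaccard similarity: shared species divided by all species in either list."""
    a, b = set(list_a), set(list_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical species lists; real lists would be compiled from
# repositories such as GBIF, iDigBio, or GloBI.
station_a = ["Acropora hyacinthus", "Chelonia mydas", "Birgus latro"]
station_b = ["Chelonia mydas", "Birgus latro", "Sula sula", "Gygis alba"]
checklist_a = ["Acropora hyacinthus", "Chelonia mydas", "Birgus latro", "Sula sula"]

print(completeness(station_a, checklist_a))  # 3 of 4 checklist species found
print(jaccard(station_a, station_b))         # 2 shared species out of 5 total
```

Treating the lists as sets keeps both measures independent of how many records each repository holds for a species, so the result reflects which species are present, not how heavily they were sampled.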

We would suggest starting with databases that provide the same kind of data (file format, common categories), and digging deeper once a systematic procedure is in place. EDAM might also need more scientists to contribute their own data to the project to share with the world.

Approach

What are the risks? What technical approaches have been explored, and which remain to be explored? Where does the prototype "run"? What methods (e.g., statistics, signal processing, transformation) and tooling (e.g., python libraries, hardware platforms) are being used/evaluated/considered? When does analysis come in? What skills are needed, and what skills remain to be learned by your team? How would someone learn these skills?

The EDAM project aims to build a platform that summarizes data and enables data sharing, allowing a holistic, side-by-side comparison of ecosystems. These aims would be achieved by automating data processing algorithms to compile species lists and associated food webs for participating ecostations, estimating the completeness of the lists and webs, calculating the similarity of the lists and webs, and creating a web-accessible visualization tool that allows comparison. EDAM now has a webpage running on Jekyll on GitHub at http://bids-collaborative.github.io/EDAM/ . On GitHub, there are 21 closed issues and 4 open issues.

So far, EDAM’s analysis has been done manually as a prototype. Fully or partially automating the manual work would expedite the process. Given that the final purpose is to build a large database that can support visualization selection, we’d also recommend Oracle, in which data can be stored categorically and systematically.

Project Management

What is your first milestone? What is your current milestone? What is your "final" milestone for completing this project (even if that milestone is currently unreasonable ;). How do you evaluate progress and work remaining? How are next tasks determined and divided between team members? Who did what / what role did people play, and how could you find out (e.g., from the issue tracker, meeting notes, git commits, etc.)?

Eventually, EDAM plans to model links between climate, biodiversity, human activities and other changes over time. It chooses islands as its test models because they are tractable and contained. The first milestone toward integrating the data would be to provide a side-by-side comparison of existing data associated with active island ecostation communities to stimulate knowledge sharing and collaboration. For the current milestone, EDAM needs to gather more specific requirements from its target users, the scientists who will use and share data on its website. By the end of this initial stage, the team should achieve its goal of building a tool that enables scientists to compare ecological data.

It seems that mentor and advisor Jorrit Poelen is actively engaged with the project, and there are 3 undergraduates and 3 graduate students on the team. Their skill sets include computer science, statistics, comparative biochemistry, information and visualization, data visualization, biosensors and software engineering. We would suggest that the team put their heads together and come up with a plan for carrying the project forward after this class.

Others

Please provide any feedback you find useful! In particular, please help your partner team to identify potential problems that might cause their project to "fail." Also recommend any resources that you think may be useful. Ideally, also "execute" or "test" the current prototype. This could range from assembling the bill of materials (i.e., shopping cart) to build the hardware to actually running some code on your own computer. If this is not possible, please describe what would be necessary to do so, and why you are unable to.

For the next stage, EDAM can talk to research funders and host hackathons (or other kinds of gatherings) among scientists to promote data sharing on, and usage of, its website. Hackathons tend to surface bright ideas and better ways to solve a problem. We’d also suggest that EDAM set up a Kaggle competition on predictive model building; there are always some fantastic data scientists ready for fun and challenging tasks.

toggle features

select which features to use in model and PCA

  • subset features
  • PCA loadings
  • feature correlations
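A minimal sketch of what the feature toggle could compute, assuming NumPy is available and that features arrive as columns of a numeric matrix. The function name, feature names, and random data below are hypothetical placeholders, not taken from the EDAM code.

```python
import numpy as np


def pca_on_subset(X, feature_names, selected):
    """Run PCA on the selected feature columns only (select at least two).

    Returns the feature correlation matrix, the component weights
    (one row per feature, one column per component), and the fraction
    of variance explained by each component.
    """
    idx = [feature_names.index(name) for name in selected]
    sub = X[:, idx]
    # Feature correlations for the chosen subset.
    corr = np.corrcoef(sub, rowvar=False)
    # Standardize so that features on different scales are comparable.
    z = (sub - sub.mean(axis=0)) / sub.std(axis=0, ddof=1)
    # PCA via singular value decomposition of the standardized data.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # Rows of vt are principal axes. Some texts scale these by the square
    # root of the eigenvalues before calling them "loadings".
    loadings = vt.T
    return corr, loadings, explained


# Hypothetical feature matrix: 20 ecostations by 4 features.
names = ["area_km2", "elevation_m", "species_count", "record_count"]
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))

chosen = ["area_km2", "elevation_m", "species_count"]
corr, loadings, explained = pca_on_subset(X, names, chosen)
print(corr.shape, loadings.shape, explained)
```

Recomputing all three outputs from the same selected columns keeps the subset, the loadings, and the correlations in step whenever a feature is toggled on or off.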

analysis page

finish explanation and visualizations on analysis

create wiki with agenda for meeting 4

suggested topics for agenda meeting 4 (aside from default items):

  1. review side-by-side comparison
  2. discuss outcome/ feedback of Oct 15 presentation (see #13)
  3. share what worked well in the last few weeks, and what we can improve on (communication / distributing work)

Directions

From conversation with Charlotte Cabasse, recalling conversation with Mattias:

  • Provide a tool for, e.g., CRIOBE that increases the value of their data, and reduces the burden of access, even if it's not open. A kind of index that supports a variety of access restrictions.
  • It would be interesting to integrate with open archaeology, folks like Eric Kansa and John Deck.
  • It would be productive, perhaps, to start with an island where we have "total access" like with Tetiaroa.

What is the narrative that excites, e.g., the IDEA leads about such a tool?

Read Meyer Paper

Goals for the Week: 9/15 - 9/21

  1. Identify open data sources (GBIF)
  2. Identify similar initiatives around the world (Map of Life (MOL))
  3. Write reflections on the issues being talked about in the paper available at http://dx.doi.org/10.1038/ncomms9221

Read About Island Biodiversity

Read background information about island biodiversity, and please share the source links and key information below.
