bids-collaborative / edam
Ecostations Data Access Monitor
Home Page: http://bids-collaborative.github.io/EDAM/
License: BSD 2-Clause "Simplified" License
eBird, iNaturalist
From a discussion with Robert Guralnick (co-author of the Meyer et al. 2015 paper about completeness), it appears that the source of expert knowledge was the International Union for Conservation of Nature (IUCN, http://iucn.org). I suggest adding IUCN to the existing comparison as a separate row with metrics for the different islands, including (if possible) links to the data source / API call URLs.
Please add your nano-bio to our wiki
What new understanding will be obtained? What public good would be served? What business opportunities are available? Who would benefit, who are consumers/users of your data or technology? Why are you motivated to work on this project?
Ecostations Data Access Monitor (EDAM) identifies current gaps in several existing databases of species biodiversity information. Digital accessible information (DAI) is disjoint, inconsistent, and biased. The team identifies the major problems as the distance between researchers, locally available research funding and the ability/willingness to participate in data sharing networks.
The team hopes that comparing available ecological data across ecostations will help stimulate collaboration between scientists, technologists, educators, local governments and research foundations to help better understand and sustain ecosystems around them. As a result of the project, EDAM aims to help make ecological data available for specific spatio-taxonomic spaces.
It would be great if the team can offer some evidence to support their diagnosis of the main problem.
Are there any data or literature that indicate the magnitude of the problem, associated factors, and possible alternatives? Could the team come up with a logical framework underlying its approach to the real-world challenge? We think it might be a strong assumption to say that “Once we provide methods to combine and process biodiversity data at global scales, institutions can start to re-examine existing data to coordinate data collection efforts, evolve data sharing strategies and discover methods to efficiently sustain ecosystems”. Specifying a results chain with input-activity-output-outcome links would help the team set its goals more systematically.
What data sources or data collection methods have you used? Are they automatic/numerically collected, or manually collected (as with a survey)? Where / how is data stored? What will you do / have you done to get data? Is there important metadata (e.g., time of collection, source, etc.)? Why is the data you have or expect to have sufficient or insufficient to meet your challenge?
EDAM’s data are derived from openly available biodiversity data repositories (e.g. GBIF, iDigBio, GloBI). The team does not store the data, but queries the repositories automatically each time the data are needed. Initially, only species lists and associated food webs are compiled for participating ecostations using automated data processing algorithms. For each ecostation, the completeness of the lists and webs is estimated. The similarity of the lists and webs is also calculated across the spatially separated island ecosystems to highlight ecological likeness.
We would suggest starting with databases that provide the same kind of data (file format, common categories), and digging deeper once a systematic procedure is in place. EDAM might also need more scientists to contribute their own data to the project to share with the world.
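The completeness and similarity calculations described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Jaccard similarity and a simple observed/expected ratio, which may not be the metrics EDAM actually uses, and the function names, island names and species lists are hypothetical placeholders rather than real data.

```python
def jaccard_similarity(species_a, species_b):
    """Share of species common to both lists (assumed metric, not confirmed by EDAM)."""
    a, b = set(species_a), set(species_b)
    if not a and not b:
        return 1.0  # two empty lists are treated as identical
    return len(a & b) / len(a | b)


def completeness(observed, expected):
    """Fraction of an expected (e.g. expert-derived) species list that was observed."""
    expected = set(expected)
    if not expected:
        raise ValueError("expected species list is empty")
    return len(set(observed) & expected) / len(expected)


# Hypothetical placeholder lists; real lists would come from repository queries.
island_a = ["Species A", "Species B", "Species C"]
island_b = ["Species B", "Species C", "Species D"]
expected_a = ["Species A", "Species B", "Species C", "Species E"]

print(jaccard_similarity(island_a, island_b))  # 2 shared / 4 distinct = 0.5
print(completeness(island_a, expected_a))      # 3 of 4 expected = 0.75
```

Working on sets keeps both measures insensitive to duplicate records, which matters when the same species is reported by several repositories.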
What are the risks? What technical approaches have been explored, and which remain to be explored? Where does the prototype "run?" What methods (e.g., statistics, signal processing, transformation) and tooling (e.g., python libraries, hardware platforms) are being used/evaluated/considered? When does analysis come in? What skills are needed, and what skills remain to be learned by your team? How would someone learn these skills?
The EDAM project aims to build a platform that summarizes data and supports data sharing, enabling a comprehensive, side-by-side comparison of ecosystems. These aims would be achieved by automating data processing algorithms to compile species lists and associated food webs for participating ecostations, estimating the completeness of the lists and webs, calculating the similarity of the lists and webs, and creating a web-accessible visualization tool that allows comparison. EDAM now has a webpage built with Jekyll and hosted on GitHub at http://bids-collaborative.github.io/EDAM/ . On GitHub, there are 21 closed issues and 4 open issues.
So far, EDAM’s analysis has been done manually as a prototype. Automating the manual work, fully or partially, would expedite the process. Given that the final purpose is building a large database that can support visualization selection, we’d also recommend a database such as Oracle, in which data can be stored categorically and systematically.
What is your first milestone? What is your current milestone? What is your "final" milestone for completing this project (even if that milestone is currently unreasonable ;). How do you evaluate progress and work remaining? How are next tasks determined and divided between team members? Who did what / what role did people play, and how could you find out (e.g., from the issue tracker, meeting notes, git commits, etc.)?
Eventually, EDAM plans to model links between climate, biodiversity, human activities and other changes over time. It chose islands as its test models because they are tractable and contained. The first milestone toward integrating the data would be to provide a side-by-side comparison of existing data associated with active island ecostation communities to stimulate knowledge sharing and collaboration. For the current milestone, EDAM needs to gather more specific requirements from its target users, the scientists who will use and share data on the website. By the end of this initial stage, the team should achieve its goal of building a tool that enables scientists to compare ecological data.
It seems that mentor and advisor Jorrit Poelen is actively engaged with the project, and there are 3 undergraduates and 3 graduate students on the team. Their skill sets include computer science, statistics, comparative biochemistry, information and visualization, data visualization, biosensors and software engineering. We would suggest that the team put their heads together and come up with a plan for how to carry the project forward after this class.
Please provide any feedback you find useful! In particular, please help your partner team to identify potential problems that might cause their project to "fail." Also recommend any resources that you think may be useful. Ideally, also "execute" or "test" the current prototype. This could range from assembling the bill of materials (i.e., shopping cart) to build the hardware to actually running some code on your own computer. If this is not possible, please describe what would be necessary to do so, and why you are unable to.
For the next stage, EDAM can talk to research funders and host hackathons (or other forms of gathering) among scientists to promote data sharing on, and use of, the website. Hackathons tend to surface bright ideas for better ways to solve a problem. We’d also suggest that EDAM set up a Kaggle competition on predictive model building; there are always some fantastic data scientists ready for fun and challenging tasks.
Throw in ideas that our fellow teammates could think about and collect information before our next meeting to pitch the project.
Also, add your bio to the table on wiki home page
@Calebs97
I'm Joseph Fang. github: @sherlockjjj.
it'll save us and our users a little bandwidth
@sherlockjjj @Minsu-Daniel-Kim @jhpoelen Here is the link to the Doodle poll: http://doodle.com/poll/wm9arg685tmaxkr4
Files that still need to be fixed + added to the Makefile:
select which features to use in model and PCA
@numfah23 @JinZhaoHong @calvind777 @leyldy @vedants
Join gitter https://gitter.im/BIDS-collaborative/EDAM
Research data sources (pertinent information/request/format)
Add information to wiki
Update progress/weekly summary on wiki (under weekly report on sidebar)
Right now we use Models, but we should be using Django Forms
finish explanation and visualizations on analysis
add identifier for files so files are not re-uploaded every time a new model is tested
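One possible way to approach this, sketched under assumptions: derive the identifier from the file contents with a SHA-256 hash and skip the upload when that identifier has already been seen. The names `file_identifier`, `needs_upload` and `uploaded_ids` are hypothetical and do not come from the EDAM code base.

```python
import hashlib


def file_identifier(path, chunk_size=65536):
    """Return a hex SHA-256 digest of the file contents, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_upload(path, uploaded_ids):
    """True if no file with identical contents has been uploaded yet."""
    return file_identifier(path) not in uploaded_ids
```

A content-based identifier stays the same when a file is renamed and changes when the data change, so a file is re-uploaded only when its contents actually differ.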
suggested topics for agenda meeting 4 (aside from default items):
fix styling and remove unnecessary components
For example, there is no need to have the same code for a navbar on every template
From conversation with Charlotte Cabasse, recalling conversation with Mattias:
What is the narrative that excites, e.g., the IDEA leads about such a tool?
Goals for the Week: 9/15 - 9/21
how we might present data on maps
@tongzhang1995 @koalaboy808
d3 functions to take data from backend
Read background information on island biodiversity, and please share the source link and the information below.