phd-paper1's Introduction

surveypaper

R Markdown paper on Bioinformatic Software Engineering

This is an example of Reproducible Research as explained in the Coursera course on Data Science. It is a paper authored by Brendan Lawlor and Paul Walsh of CIT on the deficit of software engineering values in bioinformatic software development. Also included are data collected as part of two surveys of life science and software engineering practitioners. The R Markdown file contains both the textual content of the paper and the R language code that reads, cleans, analyses and plots the data. The data files are also present under the /data directory.

phd-paper1's Issues

Drawback of DSLs

Page 7, section 4.3.1: here or elsewhere, when discussing DSLs, it would be useful to mention some of their drawbacks, for instance that they may limit cross-pollination of knowledge by creating 'encrypted' environments that only an ultra-specialized group of people can decode.

Unpredictability

  • page 7, par 3: the statement "Thus tamed, a biological problem domain is no more unpredictable than a business one.
    Most of the subsequent complexity of any project is down to the unpredictable nature of the people and processes working on it, ..."

This sounds like an oversimplification of any scientific domain, but it indirectly supports the thesis that the new generation of software engineers should be brought directly into labs! Again, the difference in ontologies is not mentioned: unpredictability in a science project can be of a different nature from that in a business environment.

In Silico Experiments Not Mentioned

  • Page 2, par. 1: "Because of the fundamentally important role of such tests ....in turn insufficiently tested themselves. Compare this to the use of defective or uncalibrated lab equipment in order to fully appreciate the nature of the problem."

This point, which connects to the more general topic of 'in silico experiments', is fundamental to the whole discussion but largely ignored in the paper. At the very least, the paper should mention that software engineering is key to accelerating the conception and design of 'in silico experiments'.

Diversity of Programming Paradigms

  • Page 9, first paragraph: "In other words, it is important to allow bioinformaticians to create code that is exploratory in nature but fragile from an engineering point of view. But it is equally important to ensure that such code does not form the basis of published findings..."

Here, fragility from a software engineering perspective is used as a synonym for irreproducibility of results. I find this a bit overstated and counterproductive, as it ignores, among other aspects, the role and diversity of programming paradigms (not mentioned at all).

In functional programming, for example, which is at the heart of scientific computing, the correctness of the results of an ODE solver generally depends not on the engineering of the software solution (behind or ahead) but on the correctness of the mathematical implementation, which is easy to share and test anywhere. In this sense, software engineering layers should not add overhead where it is not useful.
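
To make the point about testing the math concrete, here is a minimal sketch in Python (not the R used in the paper). It assumes a hand-written fixed-step RK4 integrator and a simple exponential-decay equation with a known closed-form solution. The names `rk4_solve` and `decay`, the rate constant 0.5, and the tolerances are all hypothetical choices made for illustration; none of them come from the paper.

```python
import math
from typing import Callable


def rk4_solve(f: Callable[[float, float], float],
              y0: float, t0: float, t1: float, steps: int) -> float:
    """Integrate dy/dt = f(t, y) from t0 to t1 with classical fixed-step RK4.

    Pure function: the result depends only on the arguments.
    """
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y


def decay(t: float, y: float) -> float:
    """Right-hand side of dy/dt = -0.5 * y (illustrative rate constant)."""
    return -0.5 * y


def exact_decay(t: float, y0: float = 1.0) -> float:
    """Closed-form solution y(t) = y0 * exp(-0.5 * t) used as the reference."""
    return y0 * math.exp(-0.5 * t)


if __name__ == "__main__":
    approx = rk4_solve(decay, 1.0, 0.0, 2.0, 100)
    print("numerical:", approx, "exact:", exact_decay(2.0))
```

Because the solver is a pure function, the check is a comparison against the analytical solution and needs no surrounding infrastructure. Whether this carries over to larger exploratory pipelines is exactly the question the paper raises.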

Piecharts?

Questionnaire:

  • A pie chart could possibly help to visualize the statistics and conclusions.

Discovery as main target

General remarks:

  • The paper does not acknowledge discovery as the main target in the life sciences (and bioinformatics), so the specific impact of software engineering on that quest is left implicit and not highlighted (also in fig. 7, which could be more expressive). References to notable case studies or areas (computational synthetic biology, for instance) would be valuable for making the concepts concrete.

Education link with research

Section 2.1: The fact that academic research is inherently bound to education is not stressed enough, so the authors fail to comment on the fact that most 'ad hoc' computer programs are (and will be) written by students (master's, PhD).

Different Ontologies

  • Page 5, paragraph 2: "The goals and aspirations of the life scientists with regard to software architecture are no different to those of commercial software engineers". This conclusion is a double-edged sword: its strength masks the fact that the ontologies of the two main groups under analysis
    (software engineers and life scientists) are totally different. Therefore, the software engineers who know which commercial development practices are crucial for scientific areas will likely be the more successful ones.

In silico success stories

The whole paper would gain insight by referencing a few success cases where software engineering has enabled or boosted 'in silico experiments'.

Conclusion 4.1

  • Conclusion 4.1 is important; it could be developed further.

Nature of scientific development

The nature of scientific software development as 'fluid, ongoing prototyping', and its impact on the discussion, is largely neglected. Agile development methodologies are mentioned but not stressed; references and further insight could be included.
