nhsdigital / rap-community-of-practice


The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP) at NHS Digital.

Home Page: https://nhsdigital.github.io/rap-community-of-practice/

License: MIT License

rap healthcare open-source reproducible-analytical-pipeline git python r sql baseline-rap gold-rap

rap-community-of-practice's Introduction

This material is maintained by the NHS RAP Community of Practice.

See our other work here: NHS Digital Analytical Services.


Welcome to the landing page for the RAP Community of Practice repository.

Visit our website for more information about RAP or look in the docs folder!

Please see our contributing instructions if you'd like to suggest a change, or develop our resources.

Licence

Unless stated otherwise, the codebase is released under the MIT Licence. This covers both the codebase and any sample code in the documentation.

HTML and Markdown documentation is © Crown copyright and available under the terms of the Open Government 3.0 licence.

rap-community-of-practice's People

Contributors

abbieprescott, chaeyoonkimnhse, connor1q, github-actions[bot], harrietrs, helrich, jenniferstruthers1-nhs, josephwilson8-nhs, samhollings, warren-davies4, xiyaozhuang

rap-community-of-practice's Issues

Environment and dependency management - needs to be clearer

In the "levels of RAP", people become confused by environment and dependency management. We need to link to a page that clearly describes what these are, what the point of them is, and how people can tell whether they are meeting this requirement.

PySpark guidance

I'm not a fan of referring to it as a "flavour of python" (about PySpark page)

I think PySpark should be contained underneath Python.

I also think the page should make it clear that processing is only distributed if Spark is set up for it: Spark on a normal laptop will not be any more powerful than, say, pandas. A big cluster in Databricks is a different story.

I think this page might also need a reference to other Python data structures, and to the idea that there is a right tool for each job.

Contributions section

We're keen to encourage external improvements to these resources but we don't yet have a contributions section that explains how we will review and moderate.

Gold RAP - Dependency Management - Not super clear

One of the gold RAP points is listed as

Does your repo include environment management?

It's not super clear what this means. I think I understand that this means docker containers, but the wording is pretty general. If you use a conda environment does that count?

If it is meant to refer to docker images or whatever, it's probably worth saying explicitly.
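Whichever tool is meant, one concrete thing a reader could check is whether the repo states exact versions of its dependencies. The sketch below is an illustration only, not the Community of Practice's definition of the requirement: the function name `unpinned_requirements` and the sample file contents are hypothetical, and it only handles simple `requirements.txt` lines.

```python
def unpinned_requirements(requirements_text):
    """Return the requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        # Drop inline comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        if line.startswith("-"):
            continue  # pip option such as "-r other.txt"; out of scope for this sketch
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Hypothetical file contents, for illustration only.
sample = """
pandas==1.5.3
numpy>=1.21   # a range, not an exact pin
matplotlib
"""

print(unpinned_requirements(sample))  # ['numpy>=1.21', 'matplotlib']
```

A conda `environment.yml` with exact versions would serve the same purpose; the point of the check is that someone else can rebuild the same set of packages.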

Test issue

Please note - you need to be logged into github to raise an issue

03_quality-assuring-analytical-ouputs page not clearly linked with levels of RAP

The AQUA page (https://github.com/NHSDigital/rap-community-of-practice/blob/main/implementing_RAP/general_guidance/quality-assuring-analytical-ouputs.md) is not clearly associated with the levels of RAP and so people can find it a bit confusing when and how they should be following it.

We need to link it more clearly into people's workflow when planning out RAP (some of it goes beyond RAP and is more general guidance on managing analytical work). We could also reduce duplication by removing the parts already covered by the "levels of RAP" and making that overlap clear.

RAP Publishing Checks - Clarify what are credentials and secrets

We've had some feedback that the part of the publishing checks that says "no credentials or secrets" is not clear, as analysts have not seen these terms before.

The following text might make things easier to understand:

Credentials or secrets are essentially passwords that computers use for encrypted communication or access to services. For example, with many APIs (like the Google Maps API) you must supply a credential code to access the service. Often these codes look like long, strange combinations of letters and numbers (l79sDgH9s...). We must not share our passwords publicly, so you should not commit credentials and secrets.
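A short example might also help analysts see what to do instead of committing a credential. The sketch below reads the value from an environment variable at run time; the variable name `EXAMPLE_API_KEY` and the function `get_api_key` are hypothetical and chosen only for illustration.

```python
import os


def get_api_key(variable_name="EXAMPLE_API_KEY"):
    """Read a credential from an environment variable instead of the source code."""
    api_key = os.environ.get(variable_name)
    if not api_key:
        # Fail loudly rather than falling back to a hard-coded value.
        raise RuntimeError(
            f"Set the {variable_name} environment variable before running."
        )
    return api_key


if __name__ == "__main__":
    # Stands in for setting the variable in your terminal; never commit a real value.
    os.environ.setdefault("EXAMPLE_API_KEY", "not-a-real-key")
    key = get_api_key()
    print(f"Loaded a credential of length {len(key)} without storing it in the repo.")
```

Because the value lives outside the repository, the code can be published without exposing it.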

Code review page ideas

We have recently been doing some code reviewing. Here are a few things that we think might make the page more helpful.

Code review before merge request

Code should be reviewed with someone before submitting a merge request. The reviewer should consider whether the code needs to be refactored or redesigned.

I'm not sure that I always agree with this. Merge requests make it really easy to leave comments on different parts of the code, and in some ways make life easier for both the reviewer and the merge request submitter. Maybe rephrase as:

You don't have to save reviewing your code until the end. You can do small reviews and pair programming while developing the ticket. Seeking feedback sooner could save you time, because you will not have to change as much when the final review happens later.

Different types of code review

There are different types of code review that you can get. It may be worth highlighting them.

  1. Merge request code review

    A standard review process that checks whether changes to the codebase are acceptable. You focus only on the code that has changed. It should be relatively quick, and very regular (one every time you implement a new feature). Normally done by a member of the team.

  2. Full code review

    A code review where someone looks at all your code together, and gives you overall feedback. This review allows someone to look at the bigger picture, rather than one individual feature. These reviews take longer, and are less regular. Normally done by members outside your team, so that it is a fresh pair of eyes.

  3. Fitness to publish checks

    A code review to check the code is okay to publish. Note that, in the code review, you will normally limit yourself to making suggestions that you want completed before the code is published. This may mean you avoid suggesting big changes to the code, and instead focus in on checks like ensuring documentation is well written, or removing passwords from the code.

Maybe split code review checklist into beginner and advanced items?

One of the items on the code review checklist is

Documentation is hosted for easy access. GitHub Pages and Read the Docs provide a free service for hosting documentation publicly.

Even with advanced teams in data services I do not see them doing this. It might be worth prioritizing, so that the checklist is less overwhelming.

Maybe organise the checklist items by the RAP level the team is aiming for.

Clean code guidance

Some teams want to use clean code. We need guidance on the best way to approach this for analytical code, why you would want to do it, and what to watch out for.

Coding best practice refresh

I think it would be useful to update the 'coding best practice' page to include more guidance from the NHS Digital RAP team.

Split out Terminal guidance from "git" guidance.

The terminal guidance is contained within the git guidance, but the terminal is a separate tool that can be used for many purposes. It would probably be better to have it as its own section at the same level as Python, git etc., and then have the other technology pages reference it.

Fix video links

Some of our videos are still pointing at internally hosted resources. We need to move these to make them available externally.

dependency management "not possible in DAE"

In Levels of RAP it says:
Does your repo include dependency management? (i.e. requirements.txt or conda environment for RDS users. Not possible in DAE)

It's not strictly true that this cannot be described for DAE - though it is more limited. One can describe the cluster used (runtime, libraries etc).
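As a rough sketch of what such a description could look like, assuming Python can be run on the cluster, the code below records the Python version, platform, and installed library versions using only the standard library. The function name `describe_environment` is hypothetical, and nothing here is specific to DAE or Databricks.

```python
import json
import platform
from importlib import metadata


def describe_environment():
    """Collect the Python version, platform, and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing metadata
            packages[name] = dist.version
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Print as JSON so the snapshot can be saved alongside the code.
    print(json.dumps(describe_environment(), indent=2))
```

Saving this output in the repo would give readers a record of the environment even where a `requirements.txt` cannot be used to rebuild it.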

R training resources

It might be helpful to link to the following books which are R focused at the moment (and where we have Python it refers back to your resources a lot of the time as they are great 😃)

Technical guidance for installing and setting up R https://tools.nhsrcommunity.com/technical-r.html
Links to R training https://resources.nhsrcommunity.com/training.html#specific-software
Details about NHS-R Community, what it is and how to get involved https://nhsrway.nhsrcommunity.com/

I'm happy to do a PR if that's helpful.

Signpost resources to ensure accessibility requirements are met

This is most relevant for any outputs produced. See guidance.

As a starting point, the python visualisation guide should include tips on how to make visualisations more accessible:

  • The Home Office has some posters on accessible design
  • There are also countless online resources on accessibility relating to colour-blindness, visual impairments etc.
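One tip that could be included is a colour contrast check. The sketch below computes the contrast ratio between two colours using the relative luminance formula from WCAG 2, which asks for a ratio of at least 4.5:1 for normal-size text at level AA. The function names are hypothetical, and this is an illustration rather than a full accessibility check.

```python
def _linearise(channel):
    """Convert one 0-255 sRGB channel to linear light, per the WCAG 2 definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour):
    """Relative luminance of a colour given as '#RRGGBB'."""
    hex_colour = hex_colour.lstrip("#")
    r, g, b = (int(hex_colour[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearise(r) + 0.7152 * _linearise(g) + 0.0722 * _linearise(b)


def contrast_ratio(colour_a, colour_b):
    """WCAG contrast ratio between two colours, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(colour_a), relative_luminance(colour_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


print(round(contrast_ratio("#000000", "#FFFFFF"), 1))  # 21.0
```

The same function could be run over a chart's text and background colours before publishing.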

We should also consider including a note on accessibility in the design of RAP. A pipeline would be difficult to reproduce if a user could not access any part of the pipeline. This includes README files, as well as output types.
