The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP) at NHS Digital.

Overview

Warning - this repository is a snapshot of a repository internal to NHS Digital. This means that links to videos and some URLs may not work.

Repository owner: NHS Digital Analytical Services

Email: [email protected]

To contact us, raise an issue on GitHub or send us an email and we will respond promptly.

RAP community of practice

Welcome to the landing page for the RAP community of practice repo.

You can learn all about reproducible analytical pipelines (RAP) on our what is RAP page. In a nutshell, RAP is becoming the standard for publishing analytical outputs in government. RAP combines a number of ways of working that help to improve the reliability, transparency, and speed of statistics publications. Reproducible analytical pipelines follow the principles of the AQUA Book guidelines, which revolve around analysis being reproducible, auditable, transparent, and quality assured.

The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP). This repo is a central repository for resources and guidance to help teams adopting RAP practices. There is an associated [MS Teams page] where you can introduce yourself, ask for help, or discuss different approaches. Over time we hope to build up a community of people who can self-support and further develop these ways of working.

The community of practice aims to support teams in adopting RAP practices through:

  1. Offering in-person support as teams establish new working practices
  2. Producing learning materials that offer reusable templates adapted for the NHSD analytical environment

This work is prompted by the observation that teams can struggle to adopt RAP practices without direct support. While no one element of RAP is particularly difficult, learning several new skills at the same time as delivering business as usual (BAU) is challenging. Teams can struggle to find the defended time to embed these practices. See the Statistics Authority report on the barriers to RAP adoption for more information. Luckily, in NHSD we have strong senior support for RAP and many teams have already begun to adopt many of the practices included in RAP. Consequently, we already have a large pool of skilled, enthusiastic analysts who are willing to help others. These resources also aim to support the goals laid out in the Goldacre report Bringing NHS data analysis into the 21st century and to align with Tim Berners-Lee's Five star data principles.

Support and training

If your team is embarking upon a RAP journey, you should look at our what is RAP page and try to complete the self-assessment. From there, we recommend reaching out for some in-person support. The RAP Champion Function (within the Data Science Skilled Team) can offer support in many forms:

  • Reviewing your RAP work and assessing your progress against the levels of RAP
  • Peer review of code
  • Workshops for a specific RAP capability
  • Consultancy style engagement where we plan a migration strategy
  • Pair coding
  • Shadowing another team

If you want to talk about any of this then please reach out on the [RAP community of practice MS Teams] page (internal to NHSD).

We maintain a list of people who are willing to dedicate some time to support others. Please add your name to the mix if you are willing to support someone else. You don't need to be an expert - just willing to share what you know.

Tutorials and resources

As we work alongside teams, we try to produce reusable learning materials pitched specifically at supporting NHSD teams. We try (with partial success) to avoid reproducing guidance that is easily available online; instead, we link to lots of external resources where you can self-serve. Our focus is on bespoke guidance that lays out how you would accomplish these practices in the NHSD setting.

Here are some of the initial resources:

These resources are demand-driven so if you want something then please ask on the [MS Teams page]. We would also ask you to contribute if you can improve on any of the resources or can fill in any other gaps.

The resources are not intended to be prescriptive. There are many ways to accomplish a task and teams have valid reasons for choosing other approaches. Instead the intention of the resources provided here is to offer a way in for teams who want to adopt good practices that they have heard about but don't know where to start.

Misc

We have taken inspiration from the NHSD software engineering COP. It has tons of great material so I encourage you to read and reflect on these working practices.

Licence

RAP Community of Practice codebase is released under the MIT License.

The documentation is © Crown copyright and available under the terms of the Open Government Licence v3.0.

Comments
  • Dead link

    opened by abbieprescott 4
  • dependency management "not possible in DAE"

    In Levels of RAP it says: Does your repo include dependency management? (i.e. requirements.txt or conda environment for RDS users. Not possible in DAE)

    It's not strictly true that this cannot be described for DAE - though it is more limited. One can describe the cluster used (runtime, libraries etc).

    opened by SamHollings 2
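As a rough illustration of the point raised in this issue, the sketch below records the Python version, platform, and installed package versions of whatever environment it runs in, and formats the packages as pinned requirements.txt lines. It uses only the Python standard library. The function names `describe_environment` and `to_requirements` are hypothetical, and nothing here is specific to DAE or RDS; it is one possible way to describe an environment, not an agreed NHSD approach.

```python
import platform
from importlib import metadata


def describe_environment():
    """Return the Python version, platform, and installed packages as a dict."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            packages[name] = version
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        # Sort case-insensitively so the output is stable between runs.
        "packages": dict(sorted(packages.items(), key=lambda kv: kv[0].lower())),
    }


def to_requirements(packages):
    """Format a {name: version} mapping as pinned requirements.txt lines."""
    return [f"{name}=={version}" for name, version in packages.items()]


if __name__ == "__main__":
    env = describe_environment()
    print(f"# Python {env['python_version']} on {env['platform']}")
    print("\n".join(to_requirements(env["packages"])))
```

Saving this output alongside the code gives a reader a record of the libraries and runtime used, even where a full environment file cannot be installed from.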
  • RAP Publishing Checks - Clarify what are credentials and secrets

    We've had some feedback that the part of the publishing checks that says "no credentials or secrets" is not clear, as analysts have not seen these terms before.

    The following text might make things easier to understand:

    Credentials or secrets are essentially passwords that computers use for encrypted communication or access to services. For example, with many APIs (like the Google Maps API) you must supply a credential code to access the service. Often these codes look like long, strange combinations of letters and numbers (l79sDgH9s...). We must not share our passwords publicly, so you should not commit credentials and secrets.

    opened by goodyguts 2
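To make the idea in this issue concrete, here is a minimal sketch of one common way to keep a credential out of committed code: read it from an environment variable at run time. The variable name `MAPS_API_KEY` and the helper names `get_api_key` and `mask` are assumptions made up for this example; they do not come from the publishing checks or from any real API.

```python
import os


def get_api_key(name="MAPS_API_KEY"):
    """Read a credential from an environment variable instead of the source code."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "set it locally rather than writing the value into the code."
        )
    return key


def mask(secret, visible=4):
    """Keep only the first few characters so the value can be logged safely."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)
```

With this pattern the repository only ever contains the name of the variable, so publishing the code does not publish the secret itself.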
  • Environment and dependency management - needs to be clearer

    In the "levels of RAP", people become confused by environment and dependency management. We need to link to a page that clearly describes what these are, what the point of them is, and how teams can know whether they are meeting this requirement.

    opened by SamHollings 2
  • Pyspark guidance

    I'm not a fan of referring to it as a "flavour of python" (on the about PySpark page).

    I think PySpark should be contained underneath Python.

    I also think it should make it clear that processing is only distributed if it is set up correctly: Spark on a normal laptop will not be any more powerful than, say, pandas. On a big cluster in Databricks it is a different story.

    I think this page might also need a reference to other python datastructures - and how there is a right tool for the right job.

    duplicate 
    opened by SamHollings 1
  • Split out Terminal guidance from "git" guidance.

    The terminal guidance is contained within the git guidance - but the terminal is a separate tool which can be used for many purposes - probably better to have it as its own level alongside Python, git etc, and then for these pages to be referenced by the other technologies.

    opened by SamHollings 1
  • code in the open - topics and add to data-analytics-services

    On the "how to publish your code in the open page" - we should tell people they should add their publication to the page: https://github.com/NHSDigital/data-analytics-services and also that they should set appropriate topics for their publication, i.e. nhs-digital-publication

    opened by SamHollings 1
  • Signpost resources to ensure accessibility requirements are met

    This is most relevant for any outputs produced. See guidance.

    As a starting point, the python visualisation guide should include tips on how to make visualisations more accessible:

    • The Home Office has some posters on accessible design
    • There are also countless online resources on accessibility relating to colour-blindness, visual impairments etc.

    We should also consider including a note on accessibility in the design of RAP. A pipeline would be difficult to reproduce if a user could not access any part of the pipeline. This includes README files, as well as output types.

    opened by harrietrs 1
  • Environment management external links

    We should do more to explain how environment management plays into reproducibility.

    This page is quite useful and would save us duplicating: https://realpython.com/python-virtual-environments-a-primer/

    opened by connor1q 1
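As a small companion to this issue, the sketch below uses Python's standard `venv` module to create an isolated environment and to check whether the current interpreter is running inside one. The helper names `in_virtual_environment` and `create_environment` and the `demo-env` directory are hypothetical. Note that the check only recognises venv-style environments; a conda environment would report `False`.

```python
import sys
import tempfile
import venv
from pathlib import Path


def in_virtual_environment():
    """True when the interpreter is running inside a venv-style environment."""
    return sys.prefix != sys.base_prefix


def create_environment(target):
    """Create a bare virtual environment (no pip) and return its config file path."""
    venv.EnvBuilder(with_pip=False).create(target)
    return Path(target) / "pyvenv.cfg"


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtual_environment())
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_environment(Path(tmp) / "demo-env")
        # pyvenv.cfg records which base interpreter the environment was built from.
        print(cfg.read_text())
```

Because each project gets its own environment, the package versions it depends on can be pinned without affecting other work on the same machine, which is what makes a later re-run reproducible.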
  • Broken link

    https://github.com/NHSDigital/rap-community-of-practice/blob/main/python/project-structure-and-packaging.md#generic-package-template

    There is a broken link to the generic package template in the section above

    opened by connor1q 1
  • Contributions section

    We're keen to encourage external improvements to these resources but we don't yet have a contributions section that explains how we will review and moderate.

    opened by connor1q 1
  • Code review page ideas

    We have recently been doing some code reviewing. Here are a few things that we think might make the page more helpful.

    Code review before merge request

    Code should be reviewed with someone before submitting a merge request. The reviewer should consider whether the code needs to be refactored or redesigned.

    I'm not sure that I always agree with this. Merge requests make it really easy to leave comments on different parts of the code, and in some ways make the life of the reviewer and the merge request submitter easier. Maybe rephrase as

    You don't have to save reviewing your code until the end. You can do small reviewing and also pair programming while developing the ticket. Seeking feedback sooner could mean you save time because you do not have to change as much when the final review happens later.

    Different types of code review

    There are different types of code review that you can get. It may be worth highlighting them.

    1. Merge request code review

      A standard review process that checks whether changes to the codebase are acceptable. You focus only on the code that has changed. It should be relatively quick, and very regular (one every time you implement a new feature). Normally done by a member of the team.

    2. Full code review

      A code review where someone looks at all your code together, and gives you overall feedback. This review allows someone to look at the bigger picture, rather than one individual feature. These reviews take longer, and are less regular. Normally done by members outside your team, so that it is a fresh pair of eyes.

    3. Fitness to publish checks

      A code review to check the code is okay to publish. Note that, in the code review, you will normally limit yourself to making suggestions that you want completed before the code is published. This may mean you avoid suggesting big changes to the code, and instead focus in on checks like ensuring documentation is well written, or removing passwords from the code.

    Maybe split code review checklist into beginner and advanced items?

    One of the items on the code review checklist is

    Documentation is hosted for easy access. GitHub Pages and Read the Docs provide a free service for hosting documentation publicly.

    Even with advanced teams in data services I do not see them doing this. It might be worth prioritizing, so that the checklist is less overwhelming.

    Maybe organise the checklist items by the RAP level the team is aiming for.

    on jira workplan 
    opened by goodyguts 2
  • 03_quality-assuring-analytical-ouputs page not clearly linked with levels of RAP

    The AQUA page (https://github.com/NHSDigital/rap-community-of-practice/blob/main/implementing_RAP/general_guidance/quality-assuring-analytical-ouputs.md) is not clearly associated with the levels of RAP and so people can find it a bit confusing when and how they should be following it.

    We need to link it more clearly into people's workflow when planning out RAP (some of it is beyond RAP and is more general guidance on managing analytical work), and perhaps reduce duplication by removing the parts already covered by the "levels of RAP" and making this clear.

    on jira workplan 
    opened by SamHollings 1
  • Clean code guidance

    Some teams want to use clean code. We need guidance on the best way to approach this for analytical code, why you would want to do it, and what to watch out for.

    on jira workplan 
    opened by SamHollings 2
Releases(v1.1.0)
  • v1.1.0(Dec 21, 2022)

    What's Changed

    Automatic Release Notes

    • Release v1.1.0 by @xiyaozhuang in https://github.com/NHSDigital/rap-community-of-practice/pull/35

    New Contributors

    • @xiyaozhuang made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/35

    Full Changelog: https://github.com/NHSDigital/rap-community-of-practice/compare/v1.0.0...v1.1.0

  • v1.0.0(Dec 6, 2022)

    What Changed

    Automatic release notes

    • Hr 1188 r git by @helrich in https://github.com/NHSDigital/rap-community-of-practice/pull/2
    • Add Intro to R link by @helrich in https://github.com/NHSDigital/rap-community-of-practice/pull/3
    • Improving layout and expanding rollout section by @connor1q in https://github.com/NHSDigital/rap-community-of-practice/pull/4
    • Cq updates by @connor1q in https://github.com/NHSDigital/rap-community-of-practice/pull/5
    • Hr changes by @helrich in https://github.com/NHSDigital/rap-community-of-practice/pull/9
    • Hr updates to git by @helrich in https://github.com/NHSDigital/rap-community-of-practice/pull/10
    • Update publishing code in the open by @harrietrs in https://github.com/NHSDigital/rap-community-of-practice/pull/20
    • Sh new front page by @SamHollings in https://github.com/NHSDigital/rap-community-of-practice/pull/22
    • Restructure and edit files by @abbieprescott in https://github.com/NHSDigital/rap-community-of-practice/pull/23
    • Create gh-pages version by @harrietrs in https://github.com/NHSDigital/rap-community-of-practice/pull/31
    • add two new guides and pr prep by @helrich in https://github.com/NHSDigital/rap-community-of-practice/pull/32
    • Publishes when to stop coding guide by @josephwilson8-nhs in https://github.com/NHSDigital/rap-community-of-practice/pull/33
    • Added new improved guides on virtual environments by @xiyaozhuang in https://github.com/NHSDigital/rap-community-of-practice/pull/34

    New Contributors

    • @helrich made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/2
    • @connor1q made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/4
    • @harrietrs made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/20
    • @SamHollings made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/22
    • @abbieprescott made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/23
    • @josephwilson8-nhs made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/33
    • @xiyaozhuang made their first contribution in https://github.com/NHSDigital/rap-community-of-practice/pull/34

    Full Changelog: https://github.com/NHSDigital/rap-community-of-practice/commits/v1.0.0

Owner
NHS Digital
NHS Digital Public Repository