co-cddo / open-standards
Collaboration space for discussing and exploring technical and data standards
I'm working (ever so painfully) to set up an auth chain using Azure Active Directory as the SSO service, and believe me when I tell you that it's a huge hassle.
Do we have any established patterns around SSO? Over here in UKTI Microsoft stuff appears to be the norm, and we've already got a few products using Azure AD, so that's why I have to integrate it, but I think we'd do well not to make a habit of this.
Question from @philandstuff: “do you know of any appropriate standards for use-cases around bulk download & streaming? We're kind of imagining how you might have "git clone" and "git pull" for a register: copy a register, then at a later time, download everything that has since changed.”
Any named individual who writes a government document, standard or code should provide an ORCID iD, which should be published alongside their byline, and in associated metadata.
ORCID iDs are unique identifiers. The use of an ORCID disambiguates two authors with the same (or similar) names; and identifies the work of one person under a variety of names (for example because of differing use of initials, misspellings, name changes, or differing transliterations).
Individuals register and own their ORCID record; it goes with them when they change jobs, or write for other publishers. The record can include details of education, employment, funding and works authored, each of which can be made public or kept private.
Publishers, employers, funders and other bodies can incorporate ORCID into their back-end systems. APIs are available publicly and to paid members of ORCID or a local consortium. Some mandate that authors or people receiving funding must provide an ORCID iD. Others make it optional.
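For illustration, here's a minimal sketch of fetching a public ORCID record over the public API (assuming the current v3.0 endpoint; no membership credentials are needed for public data):

```python
import requests

# Minimal sketch: fetch a public ORCID record as JSON.
# 0000-0002-1825-0097 is the example iD that ORCID itself publishes.
resp = requests.get(
    "https://pub.orcid.org/v3.0/0000-0002-1825-0097/record",
    headers={"Accept": "application/json"},
    timeout=10,
)
resp.raise_for_status()
record = resp.json()
print(record["orcid-identifier"]["path"])  # the bare iD
```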
ORCID is a non-profit organisation, and public ORCID data is available under an open licence.
As an organisation, the government should encourage, and where appropriate mandate, the use of ORCID iDs, and should include fields for them in relevant forms and databases.
For more info, see http://orcid.org/ or https://en.wikipedia.org/wiki/ORCID
[My ORCID is in my Github profile.]
Would a repository for type definitions in the same spirit as schema.org be a worthy ambition?
Would you make use of it?
Would you contribute to it?
What style of presentation and what notations should it have?
Who would oversee it, and what oversight model would work best: OSS-style governance, delegating sub-schemas to departments, etc.?
Who would pay?
Some services being developed are using SAML. How open is SAML and what is our position on it?
Persistent identifiers for public government documents
Data services and content lead at the British Library.
Manage the DataCite UK service, which makes Digital Object Identifiers (DOIs) available to UK organisations.
Assigning resolvable, globally unique and persistent identifiers, such as Digital Object Identifiers (DOIs) to public government documents (reports, data, other papers) allows them to be cited in a stable and trusted way, particularly within an academic context. This supports trust in the research itself.
Standard web addresses (URLs) used in academic citations are prone to link rot (see: https://doi.org/10.1371/journal.pone.0115253), which reduces the ability to verify claims in the research. This also applies to government documents cited with URLs in the literature, and so undermines the use of government documents in academia.
More widely, all users find it hard to keep track of the location of any given document as it moves around the government's web estate over the long-term.
Resolvable, persistent identifiers like DOIs applied to government documents will ensure stable links. Researchers will be confident that citations to government reports and data found in the literature will work, and content they reference will be available for the long-term. DOIs are well-recognised within the research community, as they have been used to cite online material for more than 15 years.
The additional layers of governance that come with use of identifiers such as DOIs ensure that all kinds of user will be able to find and access government documents via the same URL no matter where on the government's web estate the item is hosted over time.
Globally unique persistent identifiers will also enable government to see how each document is used. The use of each report can be more easily distinguished from its versions. Identifiers such as DOIs allow the tracking of usage metrics such as Altmetrics (http://altmetrics.org/manifesto/) and with services such as DataCite's Event Data (https://eventdata.datacite.org/).
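As a sketch of how this works in practice, DOI resolvers support content negotiation, so the same identifier that gives a landing page in a browser can return citation metadata to software:

```python
import requests

# Sketch: ask the DOI resolver for citation metadata (CSL JSON)
# instead of the human-readable landing page.
resp = requests.get(
    "https://doi.org/10.1371/journal.pone.0115253",
    headers={"Accept": "application/vnd.citationstyles.csl+json"},
    timeout=10,
)
resp.raise_for_status()
meta = resp.json()
print(meta["title"], meta["DOI"])
```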
Users need to be provided a persistent, trustworthy and globally unique way of referring to public government documents.
Government needs to be able to track the academic use of those documents.
Publishing animal accidents data as open data!
Animal accidents involving livestock have been reducing, but there have been some interesting ideas from local councillors to reduce them further, such as stopping the free roaming of commoners' livestock, which would have serious implications for tourism. I think that opening up the animal accident data and publishing reusable information could lead both to innovative ideas for reducing animal accidents involving vehicles and to improved community engagement with the verderers and NFPA. Apologies, I haven't followed the standard issue template, but hopefully the link below will provide a bit of context!
https://callumrtanner.com/2017/01/16/open-up-the-data-of-animal-accidents-in-the-new-forest/
Authentication and authorisation are complex to understand and get right. GDS should be looking to create a framework on top of OAuth 2.0 and OpenID Connect from which other departments can benefit.
An IT Software Consultant currently working in an agency of the UK government.
The agency has a number of individual applications. Each application has its own implementation of authentication and authorisation, and its own user database. Some applications only allow access to internal users of the organisation, while others allow both internal and external users. Generally the approach is inconsistent. The inconsistency means a larger attack surface. It also means that developers who don't have experience with the complexities often bake their own implementation of OAuth 2.0 authentication into their application. Use of a framework such as the open source IdentityServer4, or a GDS equivalent, would bring conformity.
The agency would benefit through reduced development time for new applications, better security, potentially a single database of users, and services that are easier to grow in the future. Developers working for the agency or government organisation wouldn't need to understand the nuances of application security, and wouldn't make simple mistakes with huge implications; implementation would be quicker and more secure. End users of agency applications would have better protection of their information through a consistent approach.
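To make the "don't bake your own" point concrete, here is a hedged sketch (not an established GDS pattern) of verifying an OAuth 2.0 / OpenID Connect JWT access token against the issuer's published keys using the PyJWT library; the issuer, JWKS URL and audience values are placeholders:

```python
import jwt  # PyJWT
from jwt import PyJWKClient

# Placeholder values - not a real identity provider.
ISSUER = "https://idp.example.gov.uk"
JWKS_URL = ISSUER + "/.well-known/jwks.json"

def validate_token(token: str) -> dict:
    """Return the token's claims; raises a jwt exception on any failure."""
    signing_key = PyJWKClient(JWKS_URL).get_signing_key_from_jwt(token)
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],  # pin the algorithm; never accept "none"
        audience="my-service",
        issuer=ISSUER,
    )
```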
See User Need
Choosing technologies to adopt for authentication is fine, but the following need to be addressed:
I will stop blurting out now...
We're aware that the Challenge Owners' Guide page contains a graphic which doesn't meet our accessibility requirements.
This will need to be updated before go-live.
Recently the topic of spend data has come up, particularly how everyone seems to publish their data in slightly different structures, sometimes the same organisation using a different structure each month.
I was aware of the Local Government Association's schema ( https://github.com/esd-org-uk/schemas/blob/master/Spend/Spend.json ) and an HMRC schema was mentioned. I haven't found the HMRC schema yet, although I am presuming it is core-department specific, so if anyone has any pointers, please share them.
There's likely to be a problem persuading people to use a specific schema, but we should at least have a schema to suggest, and to that end I'm hoping to gather opinions on the best approach for this. Should we be asking people like https://www.spendnetwork.com/ for guidance on what they would expect? If so, could @torgo arrange this, as I believe they are at the ODI?
This might be of interest to @davidread as he's had some experience with the https://openspending.org/ codebase.
Discussion on versioning of APIs: best practices, current approaches, relevant standards...
The Standards Hub currently recommends Version 1.0 of the Open Contracting Data Standard.
Version 1.1 has recently been released
What would be the process to consider updating the recommendation to suggest use of version 1.1?
Disaster response has always been a challenge during and after major disasters, due to the impact of the disaster itself, the number of organizations and individuals participating in the response [1], and the lack of rapid social networking to support immediate community response. A disaster, regardless of etiology, exceeds the ability of the local community to cope with the event and requires specialized resources from outside the impacted area [2-4]. In a large-scale destructive event, one of the greatest challenges for public health workers and rescue teams is to have stable and accessible emergency communication systems [5,6]. However, little research currently exists regarding the use of communication platforms and internet social networks for emergency response.
Emergency response during disasters is often complicated because communication becomes unavailable. The Chi-Chi earthquake in Taiwan and Hurricane Katrina in the US have proven that current telephone, radio and television-based emergency response systems are not capable of meeting all of the community-wide information sharing and communication needs of residents and responders during major disasters [7,8]. After 9/11, Preece and Shneiderman et al. proposed the concept of community response grids [9], which would allow authorities, residents, and responders to share information, communicate and coordinate activities via the internet and mobile communication devices in response to a major disaster. Information technologies have the potential to provide higher-capacity and more effective communication mechanisms that can reach citizens and government officials simultaneously.
Summary
Briefly describe yourself.
A short summary of the user need and expected benefits of this challenge. This summary will be used to help people to spot which challenges are of interest to them.
The user need that this challenge seeks to address, including a description of the types of users it involves.
In the case of the typhoon disaster in Taiwan, internet social networking and mobile technology were found to be helpful for community residents, professional emergency rescuers, and government agencies in gathering and disseminating real-time information regarding volunteer recruitment and relief supplies allocation. We noted that if internet tools are to be integrated into the development of emergency response systems, their accessibility, accuracy, validity, feasibility, privacy and scalability should be carefully considered, especially when applying them in resource-poor settings.
The functional needs that the proposal must address.
I have some questions around the recommendations specified at:
https://www.gov.uk/government/publications/open-standards-for-government/exchange-of-location-point
The document says that:
What does this actually mean in practice in terms of specifying conformance criteria and designing data formats? For example, within the scope of ETRS89, should a data file include points in an ETRS89 CRS and then, optionally, also include the points in other CRSs? Or is there a choice?
The section on Functional Needs in the guidance doesn't really elaborate. In fact it makes a case for using WGS84.
As a concrete example, the newly published Brownfield Land Register standard says that local authorities should use ETRS89, but the standard allows points to be specified in other CRSs.
I was just at a workshop on this standard where there was some debate about the utility of ETRS89. E.g. local authority systems may not store this natively, and consumers of the open data are perhaps more likely to want WGS84.
There also seems to be some inconsistency in section 5, which says:
"applications that consume data sets containing points must promote and prefer WGS 84".
Promoting and preferring WGS 84 seems at odds with requiring the use of ETRS89?
I understand that ETRS89 is the standard CRS used in the EU, so can see why it has been referenced. But I think it might be useful to clarify some of the intended outcomes here.
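For reference, converting between the two is straightforward with standard tooling; here is a sketch using pyproj (the EPSG codes are the ones discussed above, and for points in the UK the two systems currently differ by well under a metre, which is part of why they are often treated as interchangeable in practice):

```python
from pyproj import Transformer

# Sketch: transform a point from ETRS89 (EPSG:4258) to WGS 84 (EPSG:4326).
transformer = Transformer.from_crs("EPSG:4258", "EPSG:4326", always_xy=True)
lon, lat = transformer.transform(-1.8904, 52.4862)  # roughly Birmingham
print(lon, lat)
```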
Raised by @benlaurie on Twitter
I'd love to read all about it but all your docs are in Word format.
https://twitter.com/BenLaurie/status/870867721952645121
The .docx files in question are from https://www.riscs.org.uk/
RISCS is funded by NCSC, but is run by UCL.
An open question to our community - should we encourage our partners to use open standards? Should we require it? How would we enforce this?
Category: Data
David Read, tech arch at MoJ's Analytical Platform. Background with GDS on: data.gov.uk, Better Use of Data team.
Tabular data (e.g. CSV) is the most common data format but it is loosely defined and users would benefit from standardizing on the details.
This challenge is not about Excel workbooks or similar. It is about data that is primarily consumed by machines/software, rather than humans.
This challenge is not about metadata (e.g. schema / column types, licence) or validation. That's covered in challenge #40, and the options there, including CSV on the Web and Tabular Data Package, are both about putting metadata in a separate file, so it is a separate conversation.
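To illustrate the kind of "details" in scope, here is a sketch of writing a CSV with every dialect choice made explicit (the particular choices shown - UTF-8, comma delimiter, CRLF line endings, minimal quoting - are illustrative, not a proposal):

```python
import csv
import io

rows = [["date", "amount"], ["2016-07-07", "1250.00"]]

buf = io.StringIO()
writer = csv.writer(buf, delimiter=",", lineterminator="\r\n",
                    quoting=csv.QUOTE_MINIMAL)
writer.writerows(rows)
payload = buf.getvalue().encode("utf-8")  # encoding is part of the dialect too
```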
Off the top of my head:
We want to encourage government users and citizens to use government data more, for greater understanding and decision-making. There are plenty of barriers to this, including skills, tools, access, licensing etc, but one small but significant one is the proliferation of differing CSV conventions. These often require users to do extra work:
The functional needs that the proposal must address.
A short title which describes the challenge. Avoid acronyms or jargon.
Terence Eden and Lawrence Greenwood are the challenge owners. The challenge was originally posted by Chris Little.
Humans are adept at accommodating and understanding a variety of time and date notations, such as 7th July 2016, 7/7/16, July 7 2016, 2016-07-07 and 10:12am, 12 past 10 in the morning, 09:12GMT, 09:12UTC, 10:12BST, 12:12EEST.
There is a well-established, global, international notation for dates and times: ISO 8601. This standard, and subsets of it, are widely used in the ICT domain, and have the advantage of being automatically 'sortable' on most computer systems. By contrast, dates written like 7 Jul, 7 Aug and 7 Sept usually sort into the alphabetic order Aug, Jul, Sept rather than the expected temporal order Jul, Aug, Sept.
By adopting ISO 8601 notation for dates and times in online documents, spreadsheets, databases and for filenames and online references such as URLs, greater interoperability will be achieved at less cost and with less confusion.
2016-07-07T09:23:00Z
Inconsistent recording of dates and times causes confusion, especially in the international exchange of computer documents, where cultural practices differ. For example, confusion between the UK and USA readings of dates such as 9/11/2001 will be reduced.
Users will be able to order documents of interest into strict date-time order more easily, and there will be greater interoperability when transferring date-time information between disparate computer systems. There will be better validation of date-time information.
Sorting algorithms can be simplified. Listings can be more readily understood. The international exchange of information will be improved.
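To illustrate the sortability point: ISO 8601 timestamps with the same precision and timezone designator sort lexicographically into temporal order, so a plain string sort is enough.

```python
# A plain string sort of ISO 8601 timestamps is also a chronological sort.
stamps = ["2016-09-07T10:00:00Z", "2016-07-07T09:23:00Z", "2016-08-07T00:00:00Z"]
assert sorted(stamps) == [
    "2016-07-07T09:23:00Z",
    "2016-08-07T00:00:00Z",
    "2016-09-07T10:00:00Z",
]
```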
This challenge was presented to the Open Standards Board by A. Seles from The National Archives
Government information is produced on many different platforms and can include things like records, emails and data. Users of government information, both citizens and government officials, need to be able to understand it and use it, independent of any platform. Furthermore, users should be able to examine and query information without having access to its full contents. To accomplish this, government systems need to create a standardised set of information (i.e. metadata) about the resources they manage.
Users in this context include citizens, civil society and government officials.
Furthermore, having standardised metadata will also allow public officials to meet legislative requirements, as information can be easily retrieved to answer access requests under Freedom of Information or to transfer records to The National Archives, as per the Public Records Act.
The solution to the challenge should be able to meet the following functional needs:
West Midlands Fire Service
We're a small in-house team within West Midlands Fire Service that has over 12 years' experience in developing dynamically-rendered input forms (i.e. a technique that separates the definition of what a form should collect from the code that controls how the user interface is delivered).
This separation of concerns has worked really well for us over the years and continues to bring unexpected benefits. We've now accumulated a library of around 100 form definitions which cover all aspects of Fire Service activity.
As we begin work on our next-generation platform we'd like to align to a standard that helps widen the uptake of this approach.
This is a challenge to produce an open standard to express user-facing input forms.
The standard should cover several facets:
Facet | Description |
---|---|
Layout | The order and configuration of UI widgets. |
Binding | How UI widgets relate to an underlying data model. |
Appearance | Prompts, labels, grid arrangement, iconography, styling-overrides etc. |
Structure | Arranging UI widgets into sections, groups etc. |
Enumeration | For populating drop-down lists and similar. |
Context | The specification should be expressive enough to indicate how a form should behave/appear in different contexts (in-the-field, in-the-office etc.) |
Validation | Min/max ranges, required/optional attributes, [regular] expressions, function-binding etc. |
Dynamic content | Use of REST APIs to get content/values and perform server-side validation, "typeahead" support, UUID generation etc. |
Behaviour | Conditional appearance of items/groups/sections, enumerated values, read-only states etc. |
Nesting | Allow for repeating groups of UI widgets (with min/max allowed number of entries and similar). |
Advanced | Internationalisation, scripting support, tours, offline-fallback configuration... |
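To make the facets above concrete, here is a purely hypothetical sketch of a declarative form definition, shown as a Python dict; none of the field names come from an existing standard.

```python
# Hypothetical form definition touching several of the facets above.
form_definition = {
    "id": "incident-report",
    "sections": [  # Structure
        {
            "title": "Incident details",
            "widgets": [
                {
                    "type": "text",                       # Layout
                    "bind": "incident.location",          # Binding
                    "label": "Where did it happen?",      # Appearance
                    "required": True,                     # Validation
                    "visibleWhen": "incident.type == 'fire'",  # Behaviour
                },
                {
                    "type": "dropdown",                   # Enumeration
                    "bind": "incident.type",
                    "options": {"source": "/api/incident-types"},  # Dynamic content
                },
            ],
        },
    ],
}
```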
Along with our prototype DSL here are a couple of established reference-points to start:
Note!
As this is a more general challenge, it doesn't relate to one particular group of users.
That said, what is important from a user perspective is that the specification be expressive enough for dynamic form renderers (whether running inside web pages or mobile apps) to deliver rich and efficient user experiences.
Assuming this standard could be used throughout government to define user-facing forms...
Beneficiary | Benefits of a successful challenge |
---|---|
Users | All forms would be consistent (at least within a single platform/product which delivers a set of form definitions), meaning one interface to master and fewer systems to learn. Users can expect a greatly improved data-collection experience, as any effort to improve a generic form renderer would then directly benefit all forms (and by extension, all users). If a form is being delivered from a generic platform, then it is safe to assume authentication is happening across all forms, reducing system-switching burdens. |
Operational | Assuming a tool ecosystem built around this standard: organisations will be much better positioned to set and refine their own data collection requirements, making for a more responsive and agile government. Support for deep validation, tours, keyboard accelerators and other generic functionality will help drive up general data quality and efficiency. If form content is delivered via a generic platform, then authentication and authorisation management would be centralised. Government interoperability and transparency will be greatly improved. |
Social | This specification (combined with other challenges such as data models and workflow) begins to pave the way to a much more collaborative and open approach to assembling software, akin to the benefits attributed to low-code platforms. |
Environmental | When combined with quality tools, the ease of replacing paper forms with electronic equivalents may drive down paper consumption. |
When quantifying the impact of introducing our initial dynamic-form platform, we internally estimated that it was saving 15,000 person-hours per year (as compared with the overheads incurred with traditional "discrete system" approaches and training). If such estimates still hold true today, then the cumulative impact of supporting a switch to dynamic form rendering across government would be significant.
We consider the proposal should be:
Need | Description |
---|---|
Agnostic | The specification should be independent of any technology or vendor. |
Lean | Not full of cruft: use intelligent defaults, prefer JSON over XML, etc. |
Intuitive | Needs to be logical: easily read and understood. |
Extensible | Over our 10 years, we've accumulated a palette of some 30 different UI components (covering the obvious text-boxes through to gazetteer-selectors and maps). However, the ability to express unforeseen specialist widgets would be required. |
Toolable | To deliver wider benefit, the specification will need to play nicely with IDEs, WYSIWYG editors and similar. |
Support inference | Meta information, such as prompts, descriptions, data types etc. can be inferred from an associated data model definition (soon to be the focus of another challenge!) As such, the specification should define explicit/predictable behaviour when inferring values. |
Bring in the data from:
The challenge was originally posted on standards.data.gov.uk by Shan Rahulan
There has been a proliferation in the number of usernames and passwords required by government users to access government systems. With the advent of cloud and the need to build digital services, there is an opportunity to set some standards for authentication which will, over time, reduce this issue.
As a government user, I want to have one set of credentials to access all the services I need to do my job, instead of having lots of usernames and passwords to remember
Government organisations
Government end users
Just picking up on the questions posted by myself and @mattlewis_dvla in #tech-standards on Slack.
To kick things off I'm reposting @mattlewis_dvla's helpful points from Slack:
Very keen to see where this goes, as we are beginning our API journey at DfE too and would like to get things right!
Hi @alphagov/tech-standards - I am testing out the ability to notify a team on GitHub and using that as an opportunity to remind you to register (if you haven't done so already) for the workshop we're holding in the afternoon of the 18th of February - see https://ti.to/torgo/ukgov-standards-camp for registration info.
Forked from #9:
Documentation – We are favouring Swagger 2.0. There is a split between designing first or using annotations in the code. What are others doing?
From @rossjones:
Are you aware that the Swagger 2 spec is now forming the basis for the OpenAPI Initiative? The specification is currently available at https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md; it might cover more than a few of these items.
The adopted standard for representing a Point Location is to use ETRS89 ( https://www.gov.uk/government/publications/open-standards-for-government/exchange-of-location-point )
However, a number of councils have come back to ask ‘which flavour of ETRS89 to use’, as GIS systems which support an export to ETRS89 typically offer many options.
There are at least 25 different versions of ETRS 1989
• ETRS 1989 DKTM1 to M5;
• Lots which relate to different countries (there are five different ETRS 1989’s for Poland alone!);
• ETRS 1989 UWPP 1992;
• ETRS 1989 UWPP 2000 PAS 5 to PAS 8
Advice from Ordnance Survey is to use EPSG::4258 (or http://www.opengis.net/def/crs/EPSG/0/4258), because that is what is required by the European Commission.
Can the HMG guidance be reviewed and improved with this in mind?
I'm Ben Henley, a tech writer for GaaP
OpenAPI/Swagger is a standard way to describe APIs. It provides a machine-readable description of an API which can be generated by the developers who work on the API. Swagger can be used to generate parts of the documentation, and to create tools like interactive API explorers. There are many tools available which understand Swagger. This makes it easier to maintain accurate documentation and update it quickly. At least two GDS projects already have Swagger descriptions of their APIs.
Developers who use government APIs need accurate documentation. If all projects that produce APIs were required to maintain a Swagger description, it would make it easier to introduce common documentation tools and make documentation more accurate.
Developers would benefit from more accurate API documentation and improved ways to learn about the APIs, like interactive tools. API documentation would be standardised and consistent between projects, reducing the time developers spend finding the information they need. This will increase trust in documentation, reduce time spent on support, and increase the pace of integration. Tech writers will need to spend less time maintaining documentation. A Swagger description may also be required for other API-related tools, like monitoring services, management tools and API gateways.
Teams must be able to produce and maintain Swagger descriptions of their APIs.
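For readers unfamiliar with the format, here is a minimal Swagger 2.0 description, written as a Python dict for brevity; in practice it would live in a swagger.json or swagger.yaml file, and the path and names here are invented for illustration.

```python
# Minimal, illustrative Swagger 2.0 document.
swagger_spec = {
    "swagger": "2.0",
    "info": {"title": "Example API", "version": "1.0.0"},
    "paths": {
        "/things/{id}": {
            "get": {
                "summary": "Fetch one thing by its identifier",
                "parameters": [
                    {"name": "id", "in": "path",
                     "required": True, "type": "string"},
                ],
                "responses": {"200": {"description": "The requested thing"}},
            },
        },
    },
}
```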
I would appreciate any input on this. If a stakeholder requests XML as a supported format for an API, is that reasonable? How many APIs are being developed that include XML? Or is it considered legacy, such that clients should simply update to be able to process JSON?
@philandstuff asks: on a practical level, why might one prefer RFC 3339 or ISO 8601 as a datetime standard? I'm considering adopting RFC 3339 for #registers because it's freely available and simpler.
OAuth 2.0 https://tools.ietf.org/html/rfc6750 is being used across different projects. What guidance do we have about its use?
The Open Standards Board has taken up a question of data standards for job posting. It's been proposed to use the Schema.org schema developed for this purpose. To quote the issue:
It has been adopted as a voluntary standard in the US to aid the building of a 'Veterans Job Bank', and has seen some adoption by vacancy publishers and aggregators.
Without getting in the way of the open standards process, I'm just wondering what additional implementer experience there is on using this data schema standard, or indeed other similar schema standards.
Unicode 11 will be released on 2018-06-11 - http://unicode.org/versions/Unicode11.0.0/
This issue is for board members, and other interested parties, to leave comments on whether the Open Standards Team should update the Cross platform character encoding profile to support this newer revision of the standard.
As per the terms of reference, the board operates by consensus. If the majority of the board agrees with this change, or there are no significant objections from the community, the standard will be updated.
The current standard is for Unicode 6.2 and UTF-8.
Unicode 6.2 was released in 2012. There have been several important changes since then which will be particularly useful for Government.
There is no perceived negative impact to updating this standard. Most software will automatically update to support the newer version of Unicode.
Older software will still be able to read documents which use more recent versions of the standard, but newer characters may render as � (the Unicode Replacement Character).
It is the Open Standards Team's recommendation that this update be adopted.
Status of API to manage Ltd. companies
I'm an activist and open source developer building e-learning to teach underprivileged people JavaScript, with the goal of connecting them to remote JavaScript gigs to kickstart their open source and self-employment careers.
I would like to help people into self-employment, and the UK Ltd. seems to be a perfect vehicle, because it's easy to use and affordable. If there were an API to open, manage and close a Ltd. company, it would be a lot easier to build "business software" that directly connects to and automates bureaucracy, which is the major hurdle for underprivileged people who simply do not have the money, the time, and often the background to cope with the traditional process of managing a Ltd. company.
The fear of trying self-employment, and especially of the bureaucracy that comes with it, is one of the major show-stoppers for many people who might otherwise try their hand as small entrepreneurs. Having an API, and all the necessary support around it, would allow open source solutions to build tools that help a variety of users actually try their luck.
Many refugees who have come to Europe in recent months, and in general unemployed people or people whose backgrounds make them a hard fit for traditional employment, could go into self-employment if there were an open source ecosystem of users and contributors to help pave the way.
It would mitigate all kinds of rational and irrational fears connected with trying one's hand as an entrepreneur.
The major cost currently is filing annual accounts and hiring accountants, especially in the beginning, when income is lacking or self-employment is a little side project. Every pound or euro matters, and an accountant accounts for more than 80% of all the costs that come with opening and managing the bureaucracy connected to a Ltd. company.
A nice, easy-to-use API to open, manage and close a Ltd. company, including documentation on how to use the API. HTTP would be great, but some kind of subscription mechanism like webhooks or WebSockets for real-time updates wouldn't be bad either.
The LGA is leading work to produce a schema for CSV open data on election candidates and results. The main discussion is in the Knowledge Hub thread Election results schema second round consultation Aug-Oct 2016
The consultation overview document (PDF) is http://e-sd.org/dmKwu
The latest version of the schema description is http://e-sd.org/vgTJ3
Comments can go in the above forum or by email to [email protected]. Alternatively, I'll pass on anything posted here.
As a proposer, I want to see which standards have been proposed which have not been adopted. This will help me improve the quality of my submissions.
I am a software engineer with over 25 years' experience.
PGP keys are a recognised open, federated, non-centralised standard for security on the internet. The 'web of trust' ideology relies on people signing each other's keys in order to increase the trust that public keys are associated with the individuals who claim to own them. Key-signing parties are held to facilitate this. However, since the Post Office provides an identity check system for passports and other official documents, it could likewise provide a key-signing service.
There are many needs for authenticated documents (legal contracts and so on), and for secure document exchange. When purchasing my last property there was all sorts of pointless and insecure messing around with a so-called secure document service between myself and the mortgage provider. PGP provides an open, effective and non-centralised way of solving this problem.
People who have PGP keys can have them signed by a trusted public service.
A person should be able to go into the Post Office with a passport, driving licence, or whatever is required for the existing identity check service, and have the Post Office sign their PGP key with a trusted government key. The fee would be the same as for the existing identity check service.
Originally Submitted by pwalsh on Mon, 13/03/2017 on standards.data.gov.uk
Much data published by governments is in common tabular data formats: CSV, Excel, and ODS. This is true for the UK government and governments around the world. To provide assurances around reusability of tabular data, consumers (users) need information on the "primitive types" for each column of data (example: is it a number? is it a date?). This also allows for quality checks to ensure consistency and integrity of the data.
Publishing Table Schema with tabular data sources provides this information. Table Schema has previously been used in work by Open Knowledge International (OKI) with the Cabinet Office to check the validity of 25,000 fiscal data files against publication guidelines. Table Schema is also used widely by other organisations working with public data, such as the Open Data Institute (ODI).
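For readers unfamiliar with the format, a minimal Table Schema for a two-column CSV looks like this (the field names are illustrative):

```python
# Minimal Table Schema describing a two-column CSV.
table_schema = {
    "fields": [
        {"name": "date", "type": "date", "constraints": {"required": True}},
        {"name": "amount", "type": "number"},
    ],
    "primaryKey": "date",
}
```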
I've written several user stories below. Each user story applies equally to a range of users. The user personas are as follows:
User stories
As a user, I want all public data published by government to conform to a known schema, so I can use this information to validate the data.
As a user, I want public data published by government to have a schema, so I can read the schema and understand at a glance the type of information in the data, and the possibilities for reuse.
The functional needs that the proposal must address.
Report a Food Problem Open Standard
Adam Locker, Data Architect at the Food Standards Agency.
The FSA have a service where the public can report food problems, e.g. "I found a gnome in my soup", that sort of thing, providing details of the business, the issue and some limited details about themselves. The FSA then uses this information to work out the appropriate local authority to investigate any problems, and currently hands these off to the LA by email. We're currently in the process of rebuilding this service and we'd like to improve this by using a suitable open standard if possible.
FSA would benefit from more automated transfer of data to local authorities, preferably with the ability to receive or request updates. An open standard would also help LAs share food problem data better between them.
Reusing an open standard is always preferable to creating a new one. Also, we could create a standard, but it isn't really one without wide adoption. Why reinvent the wheel? Can you find us a wheel that works?
Not too restrictive on the fields we can pass to LAs.
There is a common need to stream multiple JSON documents in such a way that you process each document one at a time rather than loading the whole stream into memory. JSON itself is unsuitable for this type of processing.
There is a useful Wikipedia page on JSON Streaming.
There are a number of competing independently-reinvented "standards" in this space:
The Wikipedia page has a summary of existing use of each (of which I'm most familiar with logstash and jq). I don't know if anyone uses RFC 7464, even though it has theoretical advantages: namely, that the RS byte cannot appear anywhere in a JSON byte stream.
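As a sketch of why this matters, newline-delimited JSON (the logstash/jq convention) lets a consumer handle one document at a time without holding the whole stream in memory (the filename below is illustrative):

```python
import json

def read_json_lines(path):
    # Yield one parsed document per non-empty line.
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

for doc in read_json_lines("events.jsonl"):
    ...  # process each document as it arrives
```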
Home Office Not Adopting ODF Internally
The Home Office is not adopting ODF internally, despite published commitments to do so. GDS's open standards process has stopped chasing departments to ensure it happens.
The Home Office, like many other central government departments, published plans to phase in adoption of the mandatory ODF open standard.
Here are the plans: https://www.gov.uk/government/publications/home-office-open-document-format-adoption-plan/home-office-open-document-format-adoption-plan
Adoption of ODF was not just about giving citizens using government digital services a choice over document formats and software (i.e. not imposing and favouring Microsoft) but also about dismantling the internal lock-in. That's why the blogs and published plans were forced to include internal adoption.
The Home Office has made zero attempts to adopt ODF.
There is no senior understanding, nor prioritisation, of this published commitment.
There is no longer any push from GDS / Cabinet Office to ensure departments stick to their plans to phase in adoption of ODF internally.
What are the results of the discovery project initiated here:
Bring in the text from standards.data.gov.uk so that we can have all documentation in one place.
Just a quick update on the adoption of ODF (and other open formats) for publishing on GOV.UK. Attached is a CSV showing how many open-format and closed-format documents have been uploaded over the last two years.
Important notes:
This shows the trends in publishing over the last two years. Each data point is how many attachments were published that month.
This graph shows the top 50 Departments by number of attachments. There is a long-tail which is not included in this image, but which is included in the data. (Click the image for a larger version.)
It is encouraging to see open formats gain in popularity - although there is still some way to go. We are making changes to the publishing process to make it clear that open formats must be published.
User feedback has generally been good - although some users with specific software needs still struggle with ODS.
Here is a CSV of the data:
Open Vs Closed format attachments by organisation and month.csv.zip
When dealing with a range of currently internal JSON interchange formats (@ONSdigital), we're pondering how, or if, to namespace and identify them as we create new data formats. Our current internal examples are survey schemas (how do you define a survey at a structural level, enabling a representation of that survey, be it a web experience, a voice IVR or a mobile app) and responses from the collection of data.
We're considering Java style (uk.gov.ons.<SYSTEM_NAME>.<FILE_TYPE>) or some sort of mime-type style that would allow us to reuse the identifier in APIs to set Accept headers and response headers.
In trying to find prior art, we haven't found much, so we'd be interested in any examples of adding such metadata to JSON files (specifically as some of this data will exist on large filesystems just as much as being served via APIs).
For further context see this documentation PR here: ONSdigital/ons-schema-definitions#1
And a comment on that PR: https://github.com/ONSdigital/edc-documentation/pull/1/files#r55227838
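As a purely hypothetical sketch of the mime-type style (every name below is invented for illustration), the identifier can travel inside the document itself, which helps for files at rest, and double as the media type in Accept and Content-Type headers when served over an API:

```python
import json

MEDIA_TYPE = "application/vnd.ons.survey.v1+json"  # hypothetical media type

# Embed the identifier in the document for files sitting on a filesystem.
doc = {"$type": MEDIA_TYPE, "survey_id": "023", "questions": []}
with open("survey-023.json", "w", encoding="utf-8") as f:
    json.dump(doc, f)
```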
Are there any standards around bulk download and streaming of datasets? I'm thinking about `git clone` and `git pull` style operations for a dataset, so I can download a dataset and then, at a later time, fetch everything that's changed since.
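Purely as a hypothetical sketch (neither the URL shape nor the parameter name comes from any registers specification), the "git pull" half could be a request for all entries appended after the last entry number the client has seen:

```python
import requests

def fetch_changes(base_url: str, since_entry: int) -> list:
    # Ask the server for every entry appended after `since_entry`.
    resp = requests.get(f"{base_url}/entries",
                        params={"since": since_entry}, timeout=10)
    resp.raise_for_status()
    return resp.json()
```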
Schema for data on election candidates and results
Local Government Association - Tim Adams <Tim.Adams.local.gov.uk>
A schema defining the structure of data describing an election, the candidates and the results. Initially the structure should define a CSV spreadsheet format so that data can be easily published and consumed by non-programmers as well as data experts. The data format should be suitable for first-past-the-post elections in which one or more candidates can be elected for an area.
The schema is needed particularly for local government elections for which candidates are declared and results published by a few hundred local authorities in different formats.
Although there is no statutory requirement to do so, local authorities generally publish local and national election results on their web sites once those results have been provided to them by the relevant returning officer. There is no guidance or common practice to publish such data in any particular style, format or web location other than the statutory requirement placed on the returning officer to give public notice of the name(s) of the elected candidate(s) (and the fact that they were duly elected), the total number of votes given to each candidate in a contested election and details of the rejected ballot papers as shown in the statement of rejected ballot papers.
Whilst this approach allows scrutiny and review by individuals who discover the locally published web pages, the work to locate such information automatically on a larger scale, and then to collate data from every local authority to create a national overview, is difficult, labour-intensive, time-consuming and often error-prone. Substantial savings and ease of data discovery and reuse are possible if electoral administration departments can be encouraged to publish their data in a simple, consistent form which can be read by humans and machines. In May 2016, the Government published its revised National Action Plan for Open Government 2016-2018, and Commitment No. 7 proposes a move towards consistent publishing of elections data to facilitate improved citizen engagement, take-up and innovative re-use by analysts and app developers.
A schema that defines in human and machine readable format the structure of data describing an election including:
It is necessary to be able to validate that elections data published:
The CSV schema needs to be documented in a way that unambiguously allows data experts to extract spreadsheet data into a structured database format.
The LGA has consulted stakeholders and developed a draft standard according to the iStandUK process for standards development. These are the reference documents:
Copied from #9:
Naming Standards – adopting camelCase rather than snake_case or hyphens. It would be useful to have a common messaging model so that we call the same elements the same things.
Refs:
Also I notice that Registers are using hyphens. @psd was there some specific thinking behind that?
The IANA definition for TSV is great but somewhat limited. In particular it says:
Note that fields that contain tabs are not allowable in this encoding.
One of the benefits of TSV is that it is sufficiently simple that it can be processed by naive command-line utilities: splitting a TSV line on a tab character is more robust than splitting a CSV line on a comma, because CSV has all sorts of quoting rules.
However, we may sometimes need to represent data that actually contains tabs, newlines, and other such things. Is there a good way of doing this?
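One common approach, used for example by PostgreSQL's COPY text format, is to backslash-escape the problem characters inside fields, so lines can still be split on raw tab characters before unescaping each field; a sketch:

```python
def escape_field(value: str) -> str:
    # Escape backslash first, then tab and newline.
    return (value.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n"))

def unescape_field(value: str) -> str:
    # Scan left to right so escaped backslashes are handled correctly.
    out, i = [], 0
    while i < len(value):
        if value[i] == "\\" and i + 1 < len(value):
            out.append({"t": "\t", "n": "\n", "\\": "\\"}.get(value[i + 1],
                                                              value[i + 1]))
            i += 2
        else:
            out.append(value[i])
            i += 1
    return "".join(out)

row = "\t".join(escape_field(f) for f in ["a\tb", "line1\nline2"])
assert [unescape_field(f) for f in row.split("\t")] == ["a\tb", "line1\nline2"]
```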