The all_sorns from gsa-tts

Govt banner expanded on secondary pages

This should be closed by default.

Create new page layout from Igor's design

Update Agency Names

Users don't recognize the agency names as we display them; they expect the more common form. We should use the more common version of each name.
Ex: Department of Defense not Defense Department, Office of Personnel Management not Personnel Management Office.

View more details button does the search again.

Start translating Igor's Designs into Code

What is our total number of SORNs?

Right now we are at 2000. That feels very low.
The Federal Register has way more. This screenshot shows search results that should be similar to our API calls. These results are from before we filter out matching, rulemaking, or implementation SORNs.

Based on latest user research, I think we may want to include the above, but not fully parse them. Just keep the basic info from the api and show links to them if the user wants to research more.

The low numbers could also come from errors in our code; we don't have good error handling set up yet. I've also noticed some SORNs where the title doesn't mention 'matching' or the other unwanted types, but the action section does.

Talk to partners on what kinds of SORNs to include.
Filter out the unwanted sorns based on action as well as the title. Right now we only filter out on title.
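
A minimal sketch of filtering on both fields. The method name and the keyword list are assumptions for illustration (the keywords are the three words we already filter titles on), not the app's current code:

```ruby
# Returns true when either the title or the action mentions an unwanted type.
# The default keywords mirror the words we currently filter out of titles.
def unwanted_sorn?(title:, action:, keywords: %w[matching rulemaking implementation])
  text = [title, action].compact.join(" ").downcase
  keywords.any? { |keyword| text.include?(keyword) }
end

# The title looks fine here, but the action gives it away.
unwanted_sorn?(
  title: "Privacy Act of 1974; System of Records",
  action: "Notice of a new matching program."
)
```

Checking a single combined string keeps the rule in one place if partners later ask us to add or drop a keyword.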

Browse Mode

An alternative navigation model that allows privacy officers to look through SORNs without a keyword search

Visual Design
Implementation

Consider using different routes for the landing page, results, and browse mode. Right now, in order to get to the filters / sections, you must have a search term in the URL.

Use the Federal Register API via http instead of the Ruby Gem

What

Replace the Federal Register Ruby Gem with plain http requests to the API.

Why

The Federal Register Ruby Gem isn't feature complete. The two main missing features are:

Filtering on TYPE: 'NOTICE' doesn't work. Our prototypes are full of non-notices.
The date filter only takes exact days, not years like the API does. Filtering by year is useful for exploring the data.

How

We are already using httparty to make other web requests, so use it to make these API calls as well. They have an example of how to turn it into its own class. Do that, or just have big long ugly URLs; that works too.
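
A rough sketch of the "big long ugly URL" option, building the query string with the standard library so both the type filter and the year filter are sent. The method name is hypothetical, and the httparty call is left as a comment because it needs the gem and a network connection:

```ruby
require "uri"

# Builds a Federal Register document search URL.
# `year` is optional; when given it is sent as a publication-year condition.
def federal_register_search_url(term:, type: "NOTICE", year: nil,
                                base: "https://www.federalregister.gov/api/v1/documents.json")
  params = [
    ["conditions[term]", term],
    ["conditions[type][]", type]
  ]
  params << ["conditions[publication_date][year]", year.to_s] if year
  "#{base}?#{URI.encode_www_form(params)}"
end

url = federal_register_search_url(term: "Privacy Act of 1974; System of Records", year: 2019)

# With httparty loaded, the request itself would be something like:
#   response = HTTParty.get(url)
#   results  = response.parsed_response["results"]
```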

System Numbers - Parsing and Appearance

Look for system numbers in the summary section if they are not found in the system name, to reduce the number of unknowns.

Should we change the way we present unavailable SORNs?

Prep for onboarding a new engineer

UI for multiple agency search?

@igorkorenfeld I see this in Figma.

It looks like a combo-box meets a multi-select meets a checkbox list.

USWDS doesn't have anything like that.

I do see a discussion where the USWDS team didn't get to any answers on it. uswds/uswds#22

I can hack something together, but I can't say it will be elegant or meet accessibility requirements.

To get some ideas of what is possible, ignoring accessibility needs, see https://harvesthq.github.io/chosen/

Find SORNs job improvements

What

Make the find_sorns_job into a production ready service.

As an engineer, I want a job that I can easily start and stop, that will grab all SORNs that we haven't already saved in our database. I want to be able to easily change the params (newest, oldest, etc) and be able to resume a run from where it last stopped.

I also want this job to keep teaching us, so it should report on the data it is finding or not finding. Old SORNs not having xml_urls for example.

Why

Until now we've been using it to learn more about what data the Federal Register has available and to start looking at SORNs. I will often change the params and then run it to grab just a few handfuls of SORNs, then stop it from running.

To Do

SORN parsing errors

https://www.federalregister.gov/documents/full_text/xml/2017/05/03/2017-08950.xml
uses another HD for its system name title!

<HD SOURCE="HD2">SYSTEM NAME AND NUMBER:</HD>
<HD SOURCE="HD1">Department of Education Federal Docket Management System (EDFDMS) (18-09-05).</HD>
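
One possible way to handle this case is to take whatever element follows the "SYSTEM NAME" heading, regardless of whether it is a paragraph or another HD. This sketch uses REXML rather than whatever parser sorn_xml_parser.rb actually uses, and the method name and the SAMPLE wrapper element are made up for the example:

```ruby
require "rexml/document"

# Returns the text of the element that follows the "SYSTEM NAME" heading,
# whether that element is a paragraph or another HD heading.
def system_name_from(xml)
  doc = REXML::Document.new(xml)
  heading = REXML::XPath.match(doc, "//HD").find do |hd|
    hd.text.to_s.match?(/SYSTEM NAME/i)
  end
  return nil unless heading

  following = heading.next_element
  following&.text&.strip
end

# SAMPLE is only a wrapper to make the snippet well-formed XML.
sample = <<~XML
  <SAMPLE>
    <HD SOURCE="HD2">SYSTEM NAME AND NUMBER:</HD>
    <HD SOURCE="HD1">Department of Education Federal Docket Management System (EDFDMS) (18-09-05).</HD>
  </SAMPLE>
XML

system_name_from(sample)
```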

a11y: Bypass block (skip to main) is needed

Description:

There is no way for keyboard users to skip repetitive content on the page when navigating/loading new pages. Often, a mechanism like a "skip link" is added to allow keyboard-only users to jump easily to the main content in the page.

Screenshots:

Home page tab stops:

Repeated tab stops on search results page:

WCAG SC:

Success Criterion 2.4.1 Bypass Blocks (Level A): A mechanism is available to bypass blocks of content that are repeated on multiple Web pages.

Recommended fix:

Implement a "skip to main content" link per accessibility guidance offered in the USWDS Header component.

Additional resources:

Business Model Prototypes

We were explaining the business models to our partners and they wanted to see what we meant.

How to present the choices
How to present the prototypes

What

Build two versions of the service:

Cloud.gov database backed - Search across lots of data quickly.
Federalist backed - constrained by in browser search.

How

Add a bunch of data to one of the existing cloud.gov views. Be sure to show lots of different agencies in the results.
Federalist:
- Create a really big csv with the same data?
- Split into csv by agency?

Due

Send it to partners by Thursday, so they can review and talk about it in our call on Friday.

Sidebar

Include all existing search checkboxes, make them match design
agency select
Set up space for Search by Publication date

a11y: Search input is missing an accessible name/label

Description:

Form controls must have accessible names.

Screenshot:

WCAG SC:

Success Criterion 1.3.1 Info and Relationships (Level A): Information, structure, and relationships conveyed through presentation can be programmatically determined or are available in text.
Success Criterion 3.3.2 Labels or Instructions (Level A): Labels or instructions are provided when content requires user input.

Recommended fix:

The second-level heading (h2) above the search form would make for a good label for the search input.

Current code:

<h2>Search for SORNs by entering a keyword (will return exact matches)</h2>

Proposed code:

<label for="general-search">Search for SORNs by entering a keyword (will return exact matches)</label>

Alternate fix:

If the team would prefer to keep the h2 as-is, then the search input could be assigned an aria-label attribute with a concise value to serve as the input's label, like:

<input class="usa-input" id="general-search" type="search" name="search" value="<%= params[:search] %>" aria-label="search">

Non-essential related code found:

There is a fieldset element wrapping the search input and button which is unnecessary. Fieldsets are typically used to represent "a set of form controls optionally grouped under a common name" (W3C HTML 5 Spec). If this is being used to attach CSS classes for presentational purposes, the div element may be a better choice here.

a11y: HTML needs `lang` attribute and value

Description:

The language of each page must be set so that text is presented correctly for assistive technologies and conventional browsers/user agents.

WCAG SC

Success Criterion 3.1.1 Language of Page (Level A): The default human language of each Web page can be programmatically determined.

Location of code:

/app/views/layouts/application.html.erb, line 2

Recommended fix:

Apply lang="en" to the html element in the application layout view as well as anywhere the html element may be rendered from.

P tags render as strings in erb templates

How many SORNs will we have total?

Figure out how many SORNs we can get total from the Federal Register API

We search on the phrase Privacy Act of 1974; System of Records.

We filter out SORNs that have these words in the title, because they don't seem relevant. 'matching', 'rulemaking', 'implementation'. We may reconsider 'Computer matching agreement' SORNs later.

Remove default routes and actions to crud sorns.

Look at all these routes we've still got available.

Helper          Verb    Path                        Controller#Action
sorns_path      GET     /sorns(.:format)            sorns#index
                POST    /sorns(.:format)            sorns#create
new_sorn_path   GET     /sorns/new(.:format)        sorns#new
edit_sorn_path  GET     /sorns/:id/edit(.:format)   sorns#edit
sorn_path       GET     /sorns/:id(.:format)        sorns#show
                PATCH   /sorns/:id(.:format)        sorns#update
                PUT     /sorns/:id(.:format)        sorns#update
                DELETE  /sorns/:id(.:format)        sorns#destroy

We can get rid of these by dropping the resources :sorns line in the routes.rb file.

Cards

Show summary header matching Igor's design
Show results of selected columns, as it is in current demo

Filter validation?

Do we want to allow people to do searches that will have no results? Like no fields selected? Should we show a red validation warning or something?

User Research Plan for testing application

draft

a11y: Download icon missing alt attribute

Description:

The download icon on the search results page is missing the alt attribute.

Screenshot:

Location of code:

/app/views/sorns/search.html.erb, Line 40

WCAG SC

Success Criterion 1.1.1 Non-text Content (Level A): All non-text content that is presented to the user has a text alternative that serves the equivalent purpose, except for the situations listed below.
...

Decoration, Formatting, Invisible

If non-text content is pure decoration, is used only for visual formatting, or is not presented to users, then it is implemented in a way that it can be ignored by assistive technology.

Recommended fix:

Since this icon is decorative and part of the link that reads "Download results as a CSV file", the alt attribute, when added, can be blank (or null) so that assistive technology will ignore the image. The code for this should look something like this:

<%= image_tag("Download_Icon.svg", alt: "")%>

a11y: Repeated IDs // repeated agency names in search filter

Description:

There are duplicated id attributes for elements on the search results page. This appears to be caused by agency names that are doubled-up in the agencies listing of checkboxes to filter results by. Searching by "FedRAMP" will show this issue within results page.

Screenshot:

WCAG SC

Success Criterion 4.1.1 Parsing (Level A): In content implemented using markup languages, elements have complete start and end tags, elements are nested according to their specifications, elements do not contain duplicate attributes, and any IDs are unique, except where the specifications allow these features.

Recommended fix:

Ensure there are no duplicate IDs on the page. This will likely resolve itself if the agencies list does not contain duplicate agency names, as the id for each appear to be derived from the agency name.
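
A small sketch of the idea: de-duplicate the agency names before deriving ids from them. The id format and both method names are assumptions, not necessarily how the view builds its ids today:

```ruby
# Turns an agency name into an id-safe slug (an assumed convention).
def agency_dom_id(name)
  "agency-" + name.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
end

# Drops repeated names first so each name produces exactly one id.
def agency_checkbox_ids(agency_names)
  agency_names.map(&:strip).uniq.map { |name| agency_dom_id(name) }
end

names = [
  "General Services Administration",
  "Defense Department",
  "General Services Administration"
]
agency_checkbox_ids(names)
```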

SORN section snippets are untested

https://github.com/18F/all_sorns/blob/dc35cea8457957bfc4db7889227e5b9b78ee2e58/app/models/sorn.rb#L105

a11y: List contains elements that should be outside of the list

Description:

There is a div with a nav element (for pagination) as a direct child of the parent unordered list in the search results page. This could be confusing and/or problematic for screen reader users. Per Accessibility Insights, "<ul> and <ol> must only directly contain <li>, <script> or <template> elements. See more info here."

Screenshot:

WCAG SC:

Success Criterion 1.3.1 Info and Relationships (Level A): Information, structure, and relationships conveyed through presentation can be programmatically determined or are available in text.

Location of code:
/app/views/sorns/search.html.erb, Line 168

Recommended fix:

Move the following code to just after the closing </ul> element:

<div class="grid-offset-6 grid-col-6 margin-bottom-3">
  <%= paginate @sorns %>
</div>

System names

Currently, our system names look like

["Department of Homeland Security (DHS)/United States Secret Service (USSS)-001 Criminal Investigation Information System of Records."]

We are missing system names for 1997 out of 6467 SORNs.
We know different agencies publish SORNs differently

Todo

Separate system name and system number
Don't include agency name
Increase the number of system names we have
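
As a starting point for the first two to-dos, here is a sketch that splits the example above into a system number and a system name. It assumes the "Agency (ACRONYM)/Component (ACRONYM)-NNN Name." layout shown in that one example; other agencies publish differently and would need their own patterns. The method name is hypothetical:

```ruby
# Splits a combined system name into its number and name.
# Only handles the "Agency (ACRONYM)/Component (ACRONYM)-NNN Name." layout;
# returns nil for anything else.
def split_system_name(raw)
  match = raw.match(/\A(?<agency>.+?)-(?<digits>\d+)\s+(?<name>.+?)\.?\z/)
  return nil unless match

  # Collect the acronyms in parentheses, e.g. ["DHS", "USSS"].
  acronyms = match[:agency].scan(/\(([A-Z]+)\)/).flatten
  return nil if acronyms.empty?

  {
    number: "#{acronyms.join('/')}-#{match[:digits]}",
    name: match[:name]
  }
end

raw = "Department of Homeland Security (DHS)/United States Secret Service (USSS)-001 Criminal Investigation Information System of Records."
split_system_name(raw)
# => {number: "DHS/USSS-001", name: "Criminal Investigation Information System of Records"}
```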

[Bug] Resolve "Found In" Blanks

The "Found In" section is coming up blank in search results

To replicate:

Open the tool and type in "Financial Audit" in the search box

AC

Any result displayed has a Found In section with at least one match highlighted
Added a test to detect this issue in the future

System name clean up bug

When we don't have a system_name, it throws an error.

from GoodJob(default) in 1058.5ms: NoMethodError (undefined method `join' for nil:NilClass):
   2020-11-17T09:49:59.25-0800 [APP/TASK/6d3e4936/0] OUT /home/vcap/app/app/models/sorn_xml_parser.rb:75:in `get_system_name'
   2020-11-17T09:49:59.25-0800 [APP/TASK/6d3e4936/0] OUT /home/vcap/app/app/models/sorn_xml_parser.rb:15:in `parse_xml'
   2020-11-17T09:49:59.25-0800 [APP/TASK/6d3e4936/0] OUT /home/vcap/app/app/models/sorn.rb:79:in `parse_xml'
   2020-11-17T09:49:59.25-0800 [APP/TASK/6d3e4936/0] OUT /home/vcap/app/app/jobs/parse_sorn_xml_job.rb:11:in `perform'

PR in #54
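
For reference, a nil-safe version of the join could look something like this. It is a sketch and not necessarily what the PR does; the method name and the ", " separator are assumptions:

```ruby
# Joins system names, returning nil instead of raising when none were parsed.
def joined_system_name(system_names)
  # Array(nil) is [], so a missing value no longer blows up on `join`.
  names = Array(system_names).map(&:to_s).map(&:strip).reject(&:empty?)
  names.empty? ? nil : names.join(", ")
end

joined_system_name(nil)               # => nil
joined_system_name(["First", "Second"]) # => "First, Second"
```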

Search by agency won't show components?

Idempotent SORNs

What

Have our SORN data collection create single records of SORNs it finds. It needs to update the existing data as our parsing gets better and more refined.

Is there a unique identifier for the SORNs that we can use?

Why

My prototype versions just create duplicate SORNs on each run. I would wipe the database often. It is time for us to start keeping the data.

To Do

Find a unique identifier for the SORNs. Probably the Fed Reg number?
When our parsing gets better, update existing data.
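
A minimal sketch of the idea, assuming the Federal Register document number turns out to be a usable unique key. A plain Hash stands in for the database table, and the method name and attribute names are illustrative only:

```ruby
# Saves or updates a SORN keyed by its Federal Register document number.
# `store` is a plain Hash standing in for the database table.
def upsert_sorn(store, attributes)
  key = attributes.fetch(:document_number)
  store[key] = (store[key] || {}).merge(attributes)
end

store = {}
# First run: only the basic API data is known.
upsert_sorn(store, document_number: "2017-08950", title: "Example title")
# Later run with better parsing: same record, more fields.
upsert_sorn(store, document_number: "2017-08950", system_name: "EDFDMS")

store.size # => 1
```

In the Rails app the same shape would probably be a find_or_initialize_by on the identifying column followed by an update, plus a unique index on that column so duplicates can't sneak in.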

Do something with the agency hierarchy.

DoD is the parent agency of the Air Force.
We should be able to return the SORNs of just the DoD and also the DoD and all its component agencies.

(This one has multiple agencies.)

We can use the relationships from the FedReg API.

{
  "raw_name": "DEPARTMENT OF DEFENSE",
  "name": "Defense Department",
  "id": 103,
  "url": "https://www.federalregister.gov/agencies/defense-department",
  "json_url": "https://www.federalregister.gov/api/v1/agencies/103",
  "parent_id": null,
  "slug": "defense-department"
},
{
  "raw_name": "Department of the Air Force",
  "name": "Air Force Department",
  "id": 13,
  "url": "https://www.federalregister.gov/agencies/air-force-department",
  "json_url": "https://www.federalregister.gov/api/v1/agencies/13",
  "parent_id": 103,
  "slug": "air-force-department"
}
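
A sketch of how the parent_id relationship could be used to answer "the DoD and all its components". The method name is hypothetical, and the agencies are plain hashes trimmed down from the API response above:

```ruby
# Returns the given agency id followed by the ids of all of its descendants.
def agency_ids_with_components(agencies, parent_id)
  children = agencies.select { |agency| agency["parent_id"] == parent_id }
  [parent_id] + children.flat_map do |child|
    agency_ids_with_components(agencies, child["id"])
  end
end

agencies = [
  { "id" => 103, "name" => "Defense Department", "parent_id" => nil },
  { "id" => 13, "name" => "Air Force Department", "parent_id" => 103 }
]

agency_ids_with_components(agencies, 103) # => [103, 13]
agency_ids_with_components(agencies, 13)  # => [13]
```

Searching "just the DoD" would use the single id, while "DoD and components" would search on the whole returned list.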

Close uswds banner on subsequent pages

I think turbolinks is interfering with uswds.min.js, so we need to revise how we pull in the USWDS JS.

https://medium.com/@ekaterinalait_15121/a-guide-to-custom-and-third-party-javascript-with-rails-6-webpacker-and-turbolinks-9b36942b8789

CD github action not working

https://github.com/18F/all_sorns/actions/runs/368271274

[Next Feature] Linking SORNs via history citations for a-108 SORNs

analysis - what percentage will we be able to link?
How should we display it?
Is using FR citations the best way to do it?
How many rescindments / mods don't reference an original 'new' SORN?
Add FR code to automatically turn citations into links

It runs the CFR parser regex, then goes back to check whether we have the cited SORN or not. Currently this happens on page load; turn it into a one-off command, or run it on ingest.
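
To make the two steps concrete, here is a simplified sketch: pull "volume FR page" citations out of a history section, then keep only the ones we already have. The regex is a stand-in written for this example and not the actual parser, the method names are hypothetical, and the citations in the sample text are made up:

```ruby
# Finds Federal Register citations such as "81 FR 12345" in free text and
# returns them as [volume, page] integer pairs.
def fr_citations(text)
  text.scan(/\b(\d{1,3})\s+FR\s+(\d{1,6})\b/).map do |volume, page|
    [volume.to_i, page.to_i]
  end
end

# Keeps only the citations that match a SORN we already have.
def linkable_citations(text, known_citations)
  fr_citations(text).select { |citation| known_citations.include?(citation) }
end

history = "73 FR 22947, April 28, 2008; 81 FR 12345."
known = [[81, 12345]]

fr_citations(history)              # => [[73, 22947], [81, 12345]]
linkable_citations(history, known) # => [[81, 12345]]
```

Comparing the two results over the whole data set would also give a rough answer to "what percentage will we be able to link?".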

WIP in #43

Where to put pagination menu? attn: igor

[Bug] Filters Page Jump

The page jumps down when clicking on a filter that's below the scroll view.

To replicate:

https://all-sorns.app.cloud.gov/
Scroll down the list of section filters and click on "History"

AC

Clicking on any filter does not shift the scroll position of the page.

Add explanation on the search page

Add explanation on the search interface that search by number is just for a-108 compliant SORNs - and link back to about page for explanation.

String results clean-up

Design for Multi-agency search

Add a system spec for any JS used.

Add short_names from FedReg API

Federal Register API has agency acronyms in this endpoint:

https://www.federalregister.gov/developers/documentation/api/v1#/Agencies/get_agencies

We should add them to our model - it will allow users to find agencies by acronym and resolve some of the confusion around unexpected agency name variants.

We need to get this data once per instance, it should be added to the deployment scripts.
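
A sketch of turning the agencies response into an acronym lookup. It assumes the endpoint returns a JSON array whose entries carry a short_name field; the method name is hypothetical and the sample body is hand-written for the example rather than copied from the API:

```ruby
require "json"

# Builds an acronym => agency name lookup from an agencies response body.
# Agencies without a short_name are skipped.
def agencies_by_short_name(json_body)
  JSON.parse(json_body).each_with_object({}) do |agency, lookup|
    short_name = agency["short_name"].to_s.strip
    lookup[short_name.upcase] = agency["name"] unless short_name.empty?
  end
end

sample_body = <<~BODY
  [
    { "id": 103, "name": "Defense Department", "short_name": "DOD" },
    { "id": 13, "name": "Air Force Department", "short_name": "USAF" },
    { "id": 999, "name": "Agency Without Acronym", "short_name": null }
  ]
BODY

agencies_by_short_name(sample_body)
```

A deployment script could run something like this once per instance and save the short names onto our agency records.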

GitHub Action Improvements

Separate the build_and_test and the deploy jobs in Github Actions.

The deploy job gets run all the time, but then bails if it's not a push to main. This is scary.

Instead, have two different workflow files in our workflows folder, and set the on trigger for each like:

build_and_test:

on: [ push, pull_request ]

deploy:

on:
  push:
    branches: [ main ]

https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-syntax-for-github-actions

App not updating nightly

Our app isn't updating nightly like it should be.

We have a command that runs in the middle of the night to look for new SORNs. Yet our worker isn't awake then. We need to define a worker process that is waiting to do work.

https://docs.cloudfoundry.org/devguide/deploy-apps/manifest-attributes.html#-processes

We had been starting GoodJob (our worker) by hand. It gets stopped whenever there is a deploy though (I assume).

On Heroku, this would double the monthly bill. I don't know how Cloud.gov charges.

Get Action from SORN content or API?

There are a few SORN fields that we can get from both the content of the SORN or from the API.

There are 920 SORNs that don't have an XML link for us to parse. They all have the text link though. The Federal Register API has done some text parsing for us and made those fields available by API.

Action

I compared the Action we are parsing vs the Action available by the API. They are all the same. There are an additional 920 actions available by the API. Let's use the api_action instead of our parsed actions.

Dates

Same as above with Dates.
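
If we want to keep the parsed values around as a fallback while switching over, the rule could be as small as this sketch. The method name and the argument values are illustrative only:

```ruby
# Prefers the value supplied by the API and falls back to the parsed one.
# Blank strings count as missing; returns nil when neither is present.
def preferred_value(api_value, parsed_value)
  [api_value, parsed_value].map { |value| value.to_s.strip }.find { |value| !value.empty? }
end

preferred_value("Notice of a new system of records.", nil)
preferred_value(nil, "Notice of a modified system of records.")
```

The same helper would work for the dates.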

Validating Data Quality of govinfo PAIs

Set up conversation with GPO.

The GPO publishes Privacy Act Issuances. They are packaged as one big XML file containing all agency SORNs to date, and in a structured format. This would be useful, but is the data reliable?

Questions to answer:

Are the initial publication dates accurate (in the <previouslyPublished> section)?

Appears so. They are correct for all of GSA's SORNs published since 2015.

Do the SORNs change at all from bundle to bundle to reflect modifications?

Appears so. For example GSA childcare-1. Modified in 2008 to add breach routine use (h.). In 2007 compilation, childcare-1 has a - g, in 2019 compilation, it has a - h.
Validate with other modified SORNs

Does the most current bundled version match the most current version in the federal register?

Do the bundled versions contain SORNs that have been rescinded?

No. For example GSA's GOVT-5 was rescinded in 2016 but still appears in 2019 compilation.
