gsa-tts / all_sorns

Repo for SORN DASH

Home Page: https://all-sorns.app.cloud.gov

License: Other

Ruby 67.53% JavaScript 5.22% SCSS 4.45% HTML 21.62% Shell 1.18%
privacy government government-data 18f rails

all_sorns's People

Contributors

dependabot[bot], dtaylor-edc, folksgl, hartsick, igorkorenfeld, mogul, natashajibrahim, ondrae, peterrowland, rahearn, snyk-bot, zachmargolis

all_sorns's Issues

Update Agency Names

  • Users do not recognize the agency names as displayed and expect them to be named differently. We should use the more common version of each name.

  • Ex: Department of Defense not Defense Department, Office of Personnel Management not Personnel Management Office
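
A minimal sketch of the renaming, assuming a hand-maintained lookup table. `COMMON_AGENCY_NAMES` and `display_agency_name` are hypothetical names, and only the two pairs from the example above are filled in:

```ruby
# Hypothetical lookup from the name the data source uses to the name users expect.
# Only the two examples from this issue are listed; a real table would be longer.
COMMON_AGENCY_NAMES = {
  "Defense Department" => "Department of Defense",
  "Personnel Management Office" => "Office of Personnel Management"
}.freeze

# Returns the common name when we have one, otherwise the original name unchanged.
def display_agency_name(name)
  COMMON_AGENCY_NAMES.fetch(name, name)
end
```

Falling back to the original name means agencies that are not yet in the table still render, just without the friendlier wording.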

What is our total number of SORNs?

Right now we are at 2000. That feels very low.
The Federal Register has far more. This screenshot shows search results that should be similar to our API calls, before we filter out matching, rulemaking, or implementation SORNs.
Screen Shot 2020-10-13 at 10 45 01 AM

Based on the latest user research, I think we may want to include the above but not fully parse them. Just keep the basic info from the API and show links to them in case the user wants to research more.

The low numbers we have could also come from errors in our code; we don't have good error handling set up yet. I've also noticed some SORNs where the title doesn't mention 'matching' or the other unwanted types, but the type is detailed in the action section.

  • Talk to partners on what kinds of SORNs to include.
  • Filter out the unwanted SORNs based on action as well as title. Right now we only filter on title.
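
A rough sketch of filtering on both fields. The method name `unwanted_sorn?` is hypothetical, and the word list is the one mentioned in this issue; the real action text may need a more careful match:

```ruby
# Words that mark a notice as one of the unwanted types (taken from this issue).
UNWANTED_WORDS = %w[matching rulemaking implementation].freeze

# True when any unwanted word appears in the title or the action text.
# `title` and `action` are assumed to be strings or nil.
def unwanted_sorn?(title:, action:)
  text = [title, action].compact.join(" ").downcase
  UNWANTED_WORDS.any? { |word| text.include?(word) }
end
```

Checking a combined, downcased string keeps the rule the same for both fields, at the cost of also matching these words inside longer words.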

Browse Mode

An alternative navigation model that allows privacy officers to look through SORNs without a keyword search

  • Visual Design
  • Implementation

Consider using different routes for the landing page, results, and browse mode. Right now, to get to filters / section, the URL must contain a search term.

Use the Federal Register API via http instead of the Ruby Gem

What

Replace the Federal Register Ruby Gem with plain http requests to the API.

Why

The Federal Register Ruby Gem isn't feature complete. The two main missing features are:

  • Filtering on TYPE: 'NOTICE' doesn't work. Our prototypes are full of non-notices.
  • The date filter only takes exact days, not years like the API does. Filtering by year is useful for exploring the data.

How

We already use httparty to make other web requests, so use it to make these API calls as well. They have an example of how to turn it into its own class. Do that, or just use big, long, ugly URLs; that works too.
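
A sketch of what the plain HTTP call might look like. It uses Ruby's standard `net/http` so it runs without extra gems; the same URL could be passed to `HTTParty.get` instead. The method names are hypothetical, and the query parameter names (`conditions[term]`, `conditions[type][]`, `conditions[publication_date][year]`) reflect our reading of the Federal Register API and should be checked against a live call:

```ruby
require "json"
require "net/http"
require "uri"

FR_DOCUMENTS_URL = "https://www.federalregister.gov/api/v1/documents.json"

# Builds the search URL. The condition names are assumptions to verify
# against the API documentation before relying on them.
def federal_register_url(term:, year: nil, page: 1)
  params = [
    ["conditions[term]", term],
    ["conditions[type][]", "NOTICE"],
    ["page", page]
  ]
  params << ["conditions[publication_date][year]", year] if year
  "#{FR_DOCUMENTS_URL}?#{URI.encode_www_form(params)}"
end

# Performs the request and returns the parsed JSON body as a Hash.
def fetch_federal_register(**options)
  response = Net::HTTP.get_response(URI(federal_register_url(**options)))
  raise "Federal Register API returned #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body)
end
```

Keeping URL construction separate from the request makes the "big long ugly URL" easy to inspect and test without touching the network.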

System Numbers - Parsing and Appearance

Look for system numbers in the summary section if they are not found in the system name, to reduce the number of unknowns.

Should we change the way we present unavailable SORNs?

UI for multiple agency search?

@igorkorenfeld I see this in Figma.
Screen Shot 2020-11-25 at 4 09 35 PM

It looks like a combo-box meets a multi-select meets a checkbox list.

USWDS doesn't have anything like that.

I do see a discussion where the USWDS team didn't get to any answers on it. uswds/uswds#22

I can hack something together, but I can't say it will be elegant or meet accessibility requirements.

To get some ideas of what is possible, ignoring accessibility needs, see https://harvesthq.github.io/chosen/

Find SORNs job improvements

What

Make the find_sorns_job into a production ready service.

As an engineer, I want a job that I can easily start and stop, that will grab all SORNs that we haven't already saved in our database. I want to be able to easily change the params (newest, oldest, etc) and be able to resume a run from where it last stopped.

I also want this job to keep teaching us, so it should report on the data it is finding or not finding, for example old SORNs not having xml_urls.

Why

Until now we've been using it to learn more about what data the Federal Register has available and to start looking at SORNs. I often change the params and then run it to grab just a few handfuls of SORNs, then stop it from running.

To Do

  • Tests
  • Report errors
  • Save metadata
  • Change params easily
  • Be able to restart the job from where it last stopped
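
One possible shape for the resumable part, written without Rails or GoodJob so it can be read on its own. `ResumableSornFinder`, `fetch_page`, and `state` are all hypothetical names; in the app the state would need to be persisted somewhere durable:

```ruby
# A minimal, framework-free sketch of a resumable run.
# `fetch_page` is any callable that returns an array of hashes for a page number
# (an empty array means there are no more pages). `state` stands in for wherever
# progress would be persisted (a database row, for example).
class ResumableSornFinder
  attr_reader :report

  def initialize(fetch_page:, state: {})
    @fetch_page = fetch_page
    @state = state
    @report = { saved: 0, missing_xml_url: 0 }
  end

  # Processes up to `max_pages` pages, starting after the last completed page.
  def run(max_pages: 1)
    max_pages.times do
      page = @state.fetch(:last_page, 0) + 1
      results = @fetch_page.call(page)
      break if results.empty?

      results.each { |sorn| record(sorn) }
      @state[:last_page] = page # remember progress only after the page succeeds
    end
    @report
  end

  private

  # Counts what was found and what was missing, so each run keeps teaching us.
  def record(sorn)
    @report[:saved] += 1
    @report[:missing_xml_url] += 1 if sorn[:xml_url].nil?
  end
end
```

Because progress is written only after a page finishes, stopping the job mid-page means that page is simply fetched again on the next run.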

a11y: Bypass block (skip to main) is needed

Description:

There is no way for keyboard users to skip repetitive content on the page when navigating/loading new pages. Often, a mechanism like a "skip link" is added to allow keyboard-only users to jump easily to the main content in the page.

Screenshots:

Home page tab stops:
screen shot of tab stops on home page

Repeated tab stops on search results page:
screen shot of repetitive tab stops on search results page

WCAG SC:

Success Criterion 2.4.1 Bypass Blocks (Level A): A mechanism is available to bypass blocks of content that are repeated on multiple Web pages.

Recommended fix:

Implement a "skip to main content" link per accessibility guidance offered in the USWDS Header component.

Business Model Prototypes

We were explaining the business models to our partners and they wanted to see what we meant.

  • How to present the choices
  • How to present the prototypes

What

Build two versions of the service:

  1. Cloud.gov database backed - Search across lots of data quickly.
  2. Federalist backed - constrained by in browser search.

How

  • Add a bunch of data to one of the existing cloud.gov views. Be sure to show lots of different agencies in the results.
  • federalist
    • Create a really big csv with the same data?
    • Split into csv by agency?

Due

Send it to partners by Thursday, so they can review and talk about it in our call on Friday.

Sidebar

  • Include all existing search checkboxes, make them match design
  • agency select
  • Set up space for Search by Publication date

a11y: Search input is missing an accessible name/label

Description:

Form controls must have accessible names.

Screenshot:

screenshot of unlabeled search input

WCAG SC:

Recommended fix:

The second-level heading (h2) above the search form would make for a good label for the search input.

Current code:

<h2>Search for SORNs by entering a keyword (will return exact matches)</h2>

Proposed code:

<label for="general-search">Search for SORNs by entering a keyword (will return exact matches)</label>

Alternate fix:

If the team would prefer to keep the h2 as-is, then the search input could be assigned an aria-label attribute with a concise value to serve as the input's label, like:

<input class="usa-input" id="general-search" type="search" name="search" value="<%= params[:search] %>" aria-label="search">

Non-essential related code found:

There is a fieldset element wrapping the search input and button which is unnecessary. Fieldsets are typically used to represent "a set of form controls optionally grouped under a common name" (W3C HTML 5 Spec). If this is being used to attach CSS classes for presentational purposes, the div element may be a better choice here.

a11y: HTML needs `lang` attribute and value

Description:

The language of each page must be set so that text is presented correctly for assistive technologies and conventional browsers/user agents.

WCAG SC

Success Criterion 3.1.1 Language of Page (Level A): The default human language of each Web page can be programmatically determined.

Location of code:

/app/views/layouts/application.html.erb, line 2

Recommended fix:

Apply lang="en" to the html element in the application layout view as well as anywhere the html element may be rendered from.

How many SORNs will we have total?

Figure out how many SORNs we can get total from the Federal Register API

We search on the phrase Privacy Act of 1974; System of Records.

We filter out SORNs that have any of these words in the title, because they don't seem relevant: 'matching', 'rulemaking', 'implementation'. We may reconsider 'Computer matching agreement' SORNs later.

Remove default routes and actions to crud sorns.

Look at all these routes we've still got available.

Helper          Verb    URI Pattern                  Controller#Action
sorns_path      GET     /sorns(.:format)             sorns#index
                POST    /sorns(.:format)             sorns#create
new_sorn_path   GET     /sorns/new(.:format)         sorns#new
edit_sorn_path  GET     /sorns/:id/edit(.:format)    sorns#edit
sorn_path       GET     /sorns/:id(.:format)         sorns#show
                PATCH   /sorns/:id(.:format)         sorns#update
                PUT     /sorns/:id(.:format)         sorns#update
                DELETE  /sorns/:id(.:format)         sorns#destroy

We can get rid of these by dropping the resources :sorns line in the routes.rb file.

Cards

  • Show summary header matching Igor's design
  • Show results of selected columns, as it is in current demo

Filter validation?

Do we want to allow people to do searches that will have no results? Like no fields selected? Should we show a red validation warning or something?

a11y: Download icon missing alt attribute

Description:

The download icon on the search results page is missing the alt attribute.

Screenshot:

screenshot of download icon with missing alt

Location of code:

/app/views/sorns/search.html.erb, Line 40

WCAG SC

Success Criterion 1.1.1 Non-text Content (Level A): All non-text content that is presented to the user has a text alternative that serves the equivalent purpose, except for the situations listed below.
...

Decoration, Formatting, Invisible

If non-text content is pure decoration, is used only for visual formatting, or is not presented to users, then it is implemented in a way that it can be ignored by assistive technology.

Recommended fix:

Since this icon is decorative and part of the link that reads "Download results as a CSV file", the alt attribute, when added, can be blank (or null) so that assistive technology will ignore the image. The code for this should look something like this:

<%= image_tag("Download_Icon.svg", alt: "")%>

a11y: Repeated IDs // repeated agency names in search filter

Description:

There are duplicated id attributes for elements on the search results page. This appears to be caused by agency names that are doubled up in the list of agency checkboxes used to filter results. Searching for "FedRAMP" will show this issue on the results page.

Screenshot:

screenshot of duplicate filter agency names with IDs also duplicated

WCAG SC

Success Criterion 4.1.1 Parsing (Level A): In content implemented using markup languages, elements have complete start and end tags, elements are nested according to their specifications, elements do not contain duplicate attributes, and any IDs are unique, except where the specifications allow these features.

Recommended fix:

Ensure there are no duplicate IDs on the page. This will likely resolve itself if the agencies list does not contain duplicate agency names, as the id for each appears to be derived from the agency name.
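
A small sketch of the de-duplication idea in plain Ruby. Both method names are hypothetical, and the id format is an assumption; the app may derive ids differently (for example with Rails' `parameterize`):

```ruby
# Turns an agency name into an HTML-safe id (plain-Ruby stand-in for
# whatever helper the views actually use).
def agency_dom_id(name)
  "agency-" + name.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
end

# Removes repeated names before rendering, so each checkbox id appears once.
def agency_checkbox_ids(agency_names)
  agency_names.uniq.map { |name| [name, agency_dom_id(name)] }.to_h
end
```

De-duplicating the names before building the list fixes both the repeated checkboxes and the repeated ids in one step.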

a11y: List contains elements that should be outside of the list

Description:

There is a div with a nav element (for pagination) as a direct child of the parent unordered list in the search results page. This could be confusing and/or problematic for screen reader users. Per Accessibility Insights, "<ul> and <ol> must only directly contain <li>, <script> or <template> elements. See more info here."

Screenshot:

screenshot of list with problematic children

WCAG SC:

Success Criterion 1.3.1 Info and Relationships (Level A): Information, structure, and relationships conveyed through presentation can be programmatically determined or are available in text.

Location of code:
/app/views/sorns/search.html.erb, Line 168

Recommended fix:

Move the following code to just after the closing </ul> element:

<div class="grid-offset-6 grid-col-6 margin-bottom-3">
  <%= paginate @sorns %>
</div>

System names

Currently, our system names look like

["Department of Homeland Security (DHS)/United States Secret Service (USSS)-001 Criminal Investigation Information System of Records."]

  • We are missing system names for 1997 out of 6467 SORNs.
  • We know different agencies publish SORNs differently

Todo

  • Separate system name and system number
  • Don't include agency name
  • Increase the number of system names we have
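
A sketch of splitting the example above into a number and a name. The pattern is an assumption drawn from that single example, and `split_system_name` is a hypothetical helper; other agencies format their system names differently and would need their own rules:

```ruby
# Pattern for names shaped like "<agencies>-<digits> <system name>."
# This is an assumption based on one example, not a general rule.
SYSTEM_NAME_PATTERN = /\A(?<agencies>.*?)-(?<digits>\d+)\s+(?<name>.+?)\.?\z/

def split_system_name(raw)
  match = SYSTEM_NAME_PATTERN.match(raw.strip)
  return { number: nil, name: raw.strip } unless match

  # Collect the abbreviations in parentheses, e.g. "(DHS)" and "(USSS)".
  abbreviations = match[:agencies].scan(/\(([A-Z]+)\)/).flatten
  prefix = abbreviations.empty? ? match[:agencies].strip : abbreviations.join("/")
  { number: "#{prefix}-#{match[:digits]}", name: match[:name] }
end
```

For the example above this yields the number `DHS/USSS-001` and the name `Criminal Investigation Information System of Records`, with the agency's full name dropped.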

[Bug] Resolve "Found In" Blanks

The "Found In" section is coming up blank in search results

image

To replicate:

  • Open the tool and type in "Financial Audit" in the search box

AC

  • Any result displayed has a Found In section with at least one match highlighted
  • Added a test to detect this issue in the future

System name clean up bug

When we don't have a system_name, it throws an error.

from GoodJob(default) in 1058.5ms: NoMethodError (undefined method `join' for nil:NilClass):
   2020-11-17T09:49:59.25-0800 [APP/TASK/6d3e4936/0] OUT /home/vcap/app/app/models/sorn_xml_parser.rb:75:in `get_system_name'
   2020-11-17T09:49:59.25-0800 [APP/TASK/6d3e4936/0] OUT /home/vcap/app/app/models/sorn_xml_parser.rb:15:in `parse_xml'
   2020-11-17T09:49:59.25-0800 [APP/TASK/6d3e4936/0] OUT /home/vcap/app/app/models/sorn.rb:79:in `parse_xml'
   2020-11-17T09:49:59.25-0800 [APP/TASK/6d3e4936/0] OUT /home/vcap/app/app/jobs/parse_sorn_xml_job.rb:11:in `perform'

PR in #54
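
The error comes from calling `join` on `nil`. A guard along these lines would avoid it; `join_system_names` is a hypothetical stand-in, not the actual body of `get_system_name`, and the separator is an assumption:

```ruby
# `names` may be an array of strings, or nil when the XML has no system name.
# Returns nil instead of raising, so callers can treat the name as unknown.
def join_system_names(names)
  return nil if names.nil? || names.empty?

  names.join(", ")
end
```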

Idempotent SORNs

What

Have our SORN data collection create single records of SORNs it finds. It needs to update the existing data as our parsing gets better and more refined.

Is there a unique identifier for the SORNs that we can use?

Why

My prototype versions just create duplicate SORNs on each run. I would wipe the database often. It is time for us to start keeping the data.

To Do

  • Find a unique identifier for the SORNs. Probably the Fed Reg number?
  • When our parsing gets better, update existing data.
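
The intended behavior can be sketched with an in-memory store. `SornStore` is hypothetical, and using the Federal Register document number as the key is the guess floated in this issue, not a confirmed choice:

```ruby
# In-memory stand-in for the database table, keyed by a unique identifier.
class SornStore
  def initialize
    @records = {}
  end

  # Creates the record on first sight and updates it on later runs.
  def upsert(document_number, attributes)
    record = (@records[document_number] ||= {})
    record.merge!(attributes)
    record
  end

  def count
    @records.size
  end

  def find(document_number)
    @records[document_number]
  end
end
```

Running the collection twice leaves one record per identifier, with the later parse overwriting the earlier fields. In Rails, `find_or_initialize_by` followed by an update is one way to get the same behavior.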

Do something with the agency hierarchy

  • DoD is the parent agency of the Air Force.
  • We should be able to return the SORNs of just the DoD and also the DoD and all its component agencies.

This one has multiple agencies.

We can use the relationships from the FedReg API.

{
  "raw_name": "DEPARTMENT OF DEFENSE",
  "name": "Defense Department",
  "id": 103,
  "url": "https://www.federalregister.gov/agencies/defense-department",
  "json_url": "https://www.federalregister.gov/api/v1/agencies/103",
  "parent_id": null,
  "slug": "defense-department"
},
{
  "raw_name": "Department of the Air Force",
  "name": "Air Force Department",
  "id": 13,
  "url": "https://www.federalregister.gov/agencies/air-force-department",
  "json_url": "https://www.federalregister.gov/api/v1/agencies/13",
  "parent_id": 103,
  "slug": "air-force-department"
}
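
A sketch of how `parent_id` could drive the "agency plus its components" lookup. The ids and parent relationship come from the sample above; `agency_with_descendant_ids` is a hypothetical helper, and it assumes the hierarchy has no cycles:

```ruby
# Agency records reduced to the fields used here.
AGENCIES = [
  { id: 103, name: "Defense Department", parent_id: nil },
  { id: 13, name: "Air Force Department", parent_id: 103 }
].freeze

# Returns the id of the agency plus the ids of all agencies beneath it.
def agency_with_descendant_ids(agency_id, agencies = AGENCIES)
  children = agencies.select { |agency| agency[:parent_id] == agency_id }
  [agency_id] + children.flat_map { |child| agency_with_descendant_ids(child[:id], agencies) }
end
```

Searching on `[103]` alone would return just the parent agency's SORNs, while searching on the full list of ids would include every component agency as well.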

[Next Feature] Linking SORNs via history citations for a-108 SORNs

  • analysis - what percentage will we be able to link?
  • How should we display it?
  • Is using FR citations the best way to do it?
  • How many rescindments / mods don't reference an original 'new' SORN
  • add FR code to automatically turn citations into links

This runs the CFR parser regex, then goes back to check whether we have the cited SORN. Currently it runs on page load; turn it into a one-off command, or run it on ingest.
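
A sketch of the citation-to-link step. The regex here is a simple stand-in for citations shaped like "85 FR 12345", not the project's actual parser, and the URL pattern is an assumption about how federalregister.gov resolves citations:

```ruby
# Matches citations shaped like "85 FR 12345" (volume, "FR", page).
FR_CITATION = /\b(\d{1,3})\s+FR\s+(\d{1,6})\b/

# Wraps each citation in a link. The input is assumed to be HTML-escaped already.
def link_fr_citations(text)
  text.gsub(FR_CITATION) do
    volume = Regexp.last_match(1)
    page = Regexp.last_match(2)
    %(<a href="https://www.federalregister.gov/citation/#{volume}-FR-#{page}">#{Regexp.last_match(0)}</a>)
  end
end
```

Doing this once on ingest, rather than on every page load, would also give a natural place to record whether the cited SORN exists in our database.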

WIP in #43

[Bug] Filters Page Jump

The page jumps down when clicking on a filter that's below the scroll view.

To replicate:

  1. https://all-sorns.app.cloud.gov/
  2. Scroll down the list of section filters and click on "History"

AC

  • Clicking on any filter does not shift the scroll position of the page.

Add explanation on the search page

Add an explanation on the search interface that search by number works only for a-108 compliant SORNs, and link back to the about page for details.

App not updating nightly

Our app isn't updating nightly like it should be.

We have a command that runs in the middle of the night to look for new SORNs. However, our worker isn't awake then. We need to define a worker process that waits to do work.

https://docs.cloudfoundry.org/devguide/deploy-apps/manifest-attributes.html#-processes

We had been starting GoodJob (our worker) by hand. It gets stopped whenever there is a deploy though (I assume).

On Heroku, this would double the monthly bill. I don't know how Cloud.gov charges.

Get Action from SORN content or API?

There are a few SORN fields that we can get from both the content of the SORN or from the API.

There are 920 SORNs that don't have an XML link for us to parse. They all have the text link though. The Federal Register API has done some text parsing for us and made those fields available through the API.

Action

I compared the Action we are parsing with the Action available from the API. They are all the same. There are an additional 920 actions available from the API. Let's use the api_action instead of our parsed actions.

Dates

Same as above with Dates.

Validating Data Quality of govinfo PAIs

  • Set up conversation with GPO.

The GPO publishes Privacy Act Issuances. They are packaged as one big XML file containing all agency SORNs to date, and in a structured format. This would be useful, but is the data reliable?

Questions to answer:

Are the initial publication dates accurate (in the <previouslyPublished> section)?

  • Appears so. They are correct for all of GSA's SORNs published since 2015.

Do the SORNs change at all from bundle to bundle to reflect modifications?

  • Appears so. For example, GSA childcare-1 was modified in 2008 to add the breach routine use (h.). In the 2007 compilation, childcare-1 has a - g; in the 2019 compilation, it has a - h.
  • Validate with other modified SORNs

Does the most current bundled version match the most current version in the federal register?

Do the bundled versions contain SORNs that have been rescinded?

  • Yes, it appears so. For example, GSA's GOVT-5 was rescinded in 2016 but still appears in the 2019 compilation.
