openlawnz / openlawnz-research

Research centre for academics and legal professionals to export data and process bespoke data facets.

JavaScript 72.96% CSS 13.90% HTML 13.15%
research legaltech lawtech javascript html expressjs

openlawnz-research's People

Contributors

dependabot[bot], mhcub3, strum3nt, williamparry

openlawnz-research's Issues

Broken state when all facets completed and not selected

If you complete all the facets and reload the page without a facet hash in the URL, the "Save facet as" control is shown in a broken state.

The title is broken too.

When no facet is selected, the "Save facet as" control and the facet title above the PDF viewer should be hidden.

Function: "click, drag, add" relevant text to facet

When considering an outcome variable, for example, can we add a function that lets the researcher select the text that justifies their conclusion for the facet? E.g., in the "Candy" decision, highlight the text in the last sentence of the final paragraph ("claim for weekly compensation must fail") and have it added to the data for that facet.

We may also need some way of recording whether the claimant is the appellant or the respondent, or is this already recorded?

Handle variable page heights of PDFs

At the moment the height is calculated as if all PDF pages were the same height. In PDFs where they aren't, the overlay positioning is broken.
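One possible approach, sketched here under assumptions, is to compute a cumulative top offset for each page from that page's own height, instead of multiplying the page index by a single shared height. The function names and the `pageHeights` input are hypothetical; in practice the heights would come from each rendered page.

```javascript
// Hypothetical helper: pageHeights is an array of rendered page heights in px,
// gap is the vertical spacing between pages (an assumption about the layout).
function computePageOffsets(pageHeights, gap = 0) {
  const offsets = [];
  let top = 0;
  for (const height of pageHeights) {
    offsets.push(top); // top edge of this page within the whole document
    top += height + gap;
  }
  // The last page has no trailing gap.
  return { offsets, totalHeight: Math.max(0, top - gap) };
}

// Convert a position within a page to a position within the whole document,
// which is what an overlay or minimap marker needs.
function toDocumentY(offsets, pageIndex, yWithinPage) {
  return offsets[pageIndex] + yWithinPage;
}
```

With per-page offsets, an overlay on page 3 is positioned from the real heights of pages 1 and 2, so a single landscape or oversized page no longer shifts everything after it.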

Optimise SQL to prevent database timeout

If you select all of the Fixed Columns on the Export page and choose to export all records to CSV, it compiles a query that causes the DB server (RDS Postgres t3.medium) to time out. This DB type should be able to handle a reasonable load, so it's likely the query needs to be optimised.

The code that compiles the query is here: https://github.com/openlawnz/openlawnz-research/blob/master/index.js#L394

Task:

Amend the queries and the code so that no combination of Fixed Columns, UGC Columns, and "Keywords exist" causes the server to crash.

Add mocks for front-end only development

Add the ability to pass in a mock flag so that fixed JSON responses and PDFs are returned to the client. This should help front-end developers get started without having to set up the back end.
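A minimal sketch of how such a flag could work, assuming a `--mock` command-line flag and made-up API paths (neither is taken from the project): each real handler is bypassed whenever mock mode is on and a fixed response exists for the path.

```javascript
// Hypothetical fixed responses; a real setup would more likely load JSON files.
const MOCK_RESPONSES = {
  "/api/case-sets": [{ id: "mock-set", title: "Mock case set" }],
  "/api/cases/1": { id: 1, name: "Mock v Example", pdf: "/mocks/sample.pdf" },
};

// True when the process was started with --mock (the flag name is an assumption).
function isMockMode(argv = process.argv) {
  return argv.includes("--mock");
}

// Return the fixed response in mock mode, otherwise defer to the real handler.
function respond(path, realHandler, mock = isMockMode()) {
  const hasMock = Object.prototype.hasOwnProperty.call(MOCK_RESPONSES, path);
  if (mock && hasMock) return MOCK_RESPONSES[path];
  return realHandler(path);
}
```

Paths without a fixed response fall through to the real handler, so mocks can be added one endpoint at a time.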

Potential OCR issues, but potentially a keyword issue

In the "Manawatu Knitting Mills" case (decision 2 of 1999), the word "declined" in the final paragraph isn't detected by the parser for the "was the appeal successful" facet. It's not clear whether this is due to incomplete keywords or to the text not being recognised.

Dark mode

Use the @media (prefers-color-scheme: dark) media query.

Convert PDF viewer into a control, and polish

Make the PDF viewer standalone:

  • Refactor the code so that the control is its own class that can reset itself (when changing cases). Resetting should also reset the "Find in PDF" control
  • "Find in PDF" should handle clicking the "x" to clear the input, and pressing "Esc" should close it
  • Add loading state (ideally with PDF download progress bar)
  • Fix dragging minimap behaviour so that if you move off the minimap while dragging it won't select elements on the page

pdfjs requires new matching algorithm

pdfjs splits the text into segments at arbitrary points, and we do not yet account for matches that span segments. That means some results will not surface.
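One way to catch matches that span segments, sketched here under assumptions, is to join the segments into one string, search that, and map each hit back to the segments it overlaps. The input shape (a plain array of strings) and the function name are hypothetical, and the sketch assumes segments join without a separator.

```javascript
// segments: array of strings as produced by text extraction (assumed shape).
// Returns one entry per match; each entry lists the segments the match touches
// and the character range [start, end) inside each of them.
function findAcrossSegments(segments, term) {
  const needle = term.toLowerCase();
  if (needle.length === 0) return [];

  // Offsets refer to the lowercased text; the rare characters whose lowercase
  // form has a different length would need remapping.
  const lowered = segments.map((s) => s.toLowerCase());
  const starts = [];
  let joined = "";
  for (const segment of lowered) {
    starts.push(joined.length); // where this segment begins in the joined text
    joined += segment;
  }

  const matches = [];
  let from = joined.indexOf(needle);
  while (from !== -1) {
    const to = from + needle.length; // exclusive end
    const parts = [];
    for (let i = 0; i < lowered.length; i++) {
      const segStart = starts[i];
      const segEnd = segStart + lowered[i].length;
      // Skip empty segments and segments that do not overlap [from, to).
      if (segEnd === segStart || segEnd <= from || segStart >= to) continue;
      parts.push({
        segment: i,
        start: Math.max(from, segStart) - segStart,
        end: Math.min(to, segEnd) - segStart,
      });
    }
    matches.push(parts);
    from = joined.indexOf(needle, from + 1);
  }
  return matches;
}
```

Each part can then be turned into its own bounding box, so a keyword split as "dec" + "lined" produces two adjacent highlights.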

Tidy CSS

Across the board.

  • Naming consistency
  • Investigate using CSS variables

Add Project to case sets

Add another layer above so that case sets are tied to projects. This will involve adding another table and foreign key to the case sets.

Projects can have:

  • Title (e.g. My Research Project)
  • Lead (e.g. William)
  • Categories (optional, but can have several)
  • Case sets (which can be regenerated)

This way we can have multiple projects each with their own case sets.

Webworker race conditions resulting in stale search bounding boxes

Description

When a user loads a PDF and selects the facet they wish to search by, the web worker processes the case and returns bounding boxes for the highlighted words associated with that facet. However, if the user changes cases before this has completed for the previous PDF, the bounding boxes can be stale: the web worker processing the new case can return its bounding boxes before the previous web worker has finished, and the late result from the old PDF then overwrites the new PDF's bounding boxes.

Proposed Solution

When a user changes their case selection, terminate the currently running web worker, or interrupt it so that it restarts processing from the beginning with the new PDF (cancelling work on the old one).
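A minimal sketch of the terminate-and-restart approach, with a request id as a second line of defence. The `createCaseSearcher` name, the message shape, and the requirement that the worker echoes `requestId` back are all assumptions, not the project's current code.

```javascript
// createWorker is injected so the sketch does not assume a script path; in a
// browser it could be () => new Worker("searchWorker.js") (hypothetical name).
// onBoxes receives bounding boxes for the most recently requested case only.
function createCaseSearcher(createWorker, onBoxes) {
  let worker = null;
  let latest = 0;

  return function search(caseData, facet) {
    if (worker) worker.terminate(); // cancel work on the previous case
    latest += 1;
    const requestId = latest;

    worker = createWorker();
    worker.onmessage = (event) => {
      // Ignore anything that does not belong to the latest request.
      if (event.data.requestId !== requestId || requestId !== latest) return;
      onBoxes(event.data.boxes);
    };
    // Assumes the worker includes requestId in the message it posts back.
    worker.postMessage({ requestId, caseData, facet });
  };
}
```

Terminating stops wasted work on the old PDF, while the id check covers a message that was already queued before termination took effect.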

Add 404 handling for cases

If a case set is regenerated, the old URL breaks. It should instead show a 404 page that looks up the current case set that the file belongs to and provides a link to it.

Add money facet type

Current facet types (for human refinement) are limited to:

  • boolean
  • date

Task

Add a third facet type to store any dollar figures awarded in the judgment

Minimap highlighting should search for any '$' in the case text
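As a rough sketch of the highlighting side, the following finds dollar figures in the case text and parses them into numbers. The function name and the returned shape are hypothetical, and the pattern only covers plain figures such as "$250" or "$1,500.00"; wording like "$3.5 million" would need extra handling.

```javascript
// Find dollar figures in text and return their position and numeric value.
function findDollarAmounts(text) {
  // "$", optional space, digits with optional thousands separators,
  // then an optional decimal part of one or two digits.
  const pattern = /\$\s?(\d{1,3}(?:,\d{3})+|\d+)(\.\d{1,2})?/g;
  const results = [];
  for (const match of text.matchAll(pattern)) {
    const digits = match[1].replace(/,/g, "");
    const value = Number(digits + (match[2] || ""));
    results.push({ text: match[0], index: match.index, value });
  }
  return results;
}
```

The `index` of each result can feed the minimap highlighting, while `value` is a candidate for the researcher to confirm or correct when saving the facet.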

Date of ACC decision under appeal facet

Occasionally, the court addresses multiple decisions within the same case and judgment. An example of this is Aalderink v Accident Compensation Corporation.

When this occurs, it may be useful to be able to enter multiple dates for the same facet.

Interested to discuss whether this will cause downstream problems for the data.
