frabcus / judgmental
UK case law
Home Page: http://judgmental.org.uk/
From fjmd1 (Francis Davey):
@Judgmentals It would be great to add a contact email so I can ask you to remove the annoying white space at the LHS
What's the name of the technique for adding conditionals into your CSS so that the format can change depending on the page width? We can use that to remove the left nav column (which is a bit empty right now anyway) on narrower displays.
These indexes don't exist right now:
NICC
UKFTT-HESC
UKFTT-TC
UKIAT
UKSSCSC
Similar to issue 1, we could identify all mentions of Acts of Parliament and link to them on legislation.gov.uk.
Tricky enhancement: if "clause X" is mentioned in the same sentence or nearby, link to the correct clause, not just the whole Act.
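A minimal sketch of the basic linking step, assuming Python 3 and assuming the text has already been HTML-escaped. The ACT_PATHS table and the link_acts function are hypothetical names, not part of the existing code; a real table mapping Act titles to legislation.gov.uk paths would have to be built separately. The pattern is deliberately crude: it only sees runs of capitalised words ending in "Act <year>", so titles containing lowercase words such as "and" would need a better pattern.

```python
import re

# Hypothetical lookup from Act title to its path on legislation.gov.uk.
# Only one well-known entry is shown; the rest would need to be collected.
ACT_PATHS = {
    "Human Rights Act 1998": "ukpga/1998/42",
}

# One or more capitalised words followed by "Act" and a four-digit year.
ACT_PATTERN = re.compile(r"\b(?:[A-Z][A-Za-z]*\s+)+Act\s+\d{4}\b")


def link_acts(text, act_paths=ACT_PATHS):
    """Wrap known Act mentions in links; leave unknown mentions untouched."""
    def replace(match):
        # Normalise internal whitespace before looking the title up.
        title = " ".join(match.group(0).split())
        path = act_paths.get(title)
        if path is None:
            return match.group(0)
        url = "https://www.legislation.gov.uk/" + path
        return '<a href="%s">%s</a>' % (url, match.group(0))
    return ACT_PATTERN.sub(replace, text)
```

Linking to an individual section would build on the same match, by looking for a section number near the Act mention and extending the URL, which is not attempted here.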
Bailii's files contain lots of links. We should read them and process them:
feedparser is needed for legislation.py. It's kludged in at the moment but should be properly installed.
Many (approx 1300) judgments aren't being processed since best_filename complains that there is 'no good citation'.
Many judgments have random ASCII control codes in them. This causes problems for lxml (more precisely for the lxml version on the server) and results in judgments without any text (for example, http://www.judgmental.org.uk/judgments/_uk_cases_CAT_2005_11.html).
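One possible workaround, sketched under the assumption that the file has already been decoded to a Python 3 string before it reaches lxml: strip the control characters first. The function name is hypothetical, and whether to delete the characters or replace them with spaces is a judgment call.

```python
import re

# C0 control characters other than tab, newline and carriage return, plus DEL.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_chars(text):
    """Remove control characters that are not legal in XML 1.0 documents."""
    return CONTROL_CHARS.sub("", text)
```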
A few judgments raise errors during the analysis stage complaining that the extracted citation is not unique.
This may be because of duplicated judgments, or because of bugs still remaining in the citation extraction code.
It would be extremely useful if judgments were tagged with keywords and categories, as has been done with PCC complaints data here: http://complaints.pccwatch.co.uk/
So a user could, with ease, view all 'defamation' or 'copyright' cases and filter by judge name, date, court, party name etc.
The extraction of neutral citations from a judgment is a little buggy and also needs a better strategy for extracting citations from titles of judgments where necessary.
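As an illustration only, and not the project's current extraction code, a single regular expression can pick out the common shape of a neutral citation (year in square brackets, court abbreviation, number, optional division in parentheses). The names below are hypothetical and the pattern will not cover every court's format.

```python
import re

NEUTRAL_CITATION = re.compile(
    r"\[(?P<year>\d{4})\]\s+"
    # Court abbreviation, optionally followed by Civ or Crim (as in EWCA Civ).
    r"(?P<court>[A-Z][A-Za-z]+(?:\s+(?:Civ|Crim))?)\s+"
    r"(?P<number>\d+)"
    # Optional division such as (TCC); a trailing date like (12 May 2010)
    # starts with a digit and so is not mistaken for a division.
    r"(?:\s+\((?P<division>[A-Za-z]+)\))?"
)


def find_neutral_citations(text):
    """Return a list of dicts, one per citation found in the text."""
    return [m.groupdict() for m in NEUTRAL_CITATION.finditer(text)]
```

Running the same function over a judgment's title is one way to get a fallback when nothing usable is found in the body.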
Lots of cases mention corporations in their title, and perhaps in their body.
Sainsbury's Supermarkets Ltd, R (on the application of) v Wolverhampton City Council & Anor [2010] UKSC 20 (12 May 2010)
Gold Group Properties Ltd v BDW Trading Ltd [2010] EWHC 1632 (TCC) (01 July 2010)
We should get together with OpenCorporates to name match those, so you can easily find a list of cases about one company, and so you can hyperlink to more info about the company the other way.
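A first step before any name matching would be pulling candidate company names out of case titles. The sketch below is a rough heuristic with hypothetical names: it looks for runs of capitalised words ending in a common company suffix. It does not talk to OpenCorporates at all; how the names are then matched is left open.

```python
import re

# Runs of capitalised words ending in a company suffix.
COMPANY_PATTERN = re.compile(
    r"\b((?:[A-Z][\w'&]*\s+)+(?:Ltd|Limited|plc|PLC|LLP))\b"
)


def company_names(title):
    """Return candidate company names found in a case title, in order."""
    return COMPANY_PATTERN.findall(title)
```

Applied to the two titles above, this picks out "Sainsbury's Supermarkets Ltd" from the first and "Gold Group Properties Ltd" and "BDW Trading Ltd" from the second.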
We don't have a license file at the moment, so the code isn't licensed. GitHub doesn't apply a license by default; it even allows hosting of non-open-source but "source visible" code.
Although we've gone to some length to deal with character encoding problems, some files simply declare their character encoding incorrectly and/or are not encoded with an encoding known to humans.
We could just be lazy, and throw annotations on every judgment using Disqus.
Very quick and easy to do, some people will already have accounts, user experience is good. Will see if we get valuable comments.
Raised by frabcus on-list: https://groups.google.com/d/msg/judgmental/esZDc1VEGv8/rUTVLMnfJLYJ
The obvious <h1> of case pages is currently the name of the court, not the case, which isn't very intuitive.
From @okfn (okfn)
Mentioned to Nick Bull of @Judgmentals - http://annotateit.org/ (Still in beta). Could be interesting tool for [Judgmental].
At present we need to rebuild the entire site every time we want to make a small modification to pages but not judgments, e.g. the menu, title, footer, Google Analytics etc. This is obviously rubbish, and we should probably instead generate a marked-up HTML fragment for each judgment and use some server-side inclusion.
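To illustrate the split between stored fragments and shared page furniture, here is a small Python 3 sketch. It is a stand-in for the idea, not server-side includes as such: the page shell, its placeholder names and render_page are all hypothetical.

```python
from string import Template

# Hypothetical page shell. Changing the menu or footer means editing this
# one template, not regenerating every judgment fragment.
PAGE_SHELL = Template("""<!DOCTYPE html>
<html>
<head><title>$title</title></head>
<body>
$menu
$judgment
$footer
</body>
</html>
""")


def render_page(title, judgment_fragment, menu, footer):
    """Combine a pre-generated judgment fragment with the shared shell."""
    return PAGE_SHELL.substitute(
        title=title, judgment=judgment_fragment, menu=menu, footer=footer
    )
```

With real server-side includes the web server would do the same assembly at request time, so only the fragment files would need regenerating when a judgment changes.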
Worth doing perhaps to encourage it to index everything.
Would seem a shame not to be sarky there.
http://www.judgemental.org.uk/judgments/ScotHC/
Latest judgments listed as 2012. Definitely wrong, don't know why!
Francis D really wants good RSS feeds of judgments.
So really ideally by court, by search term, and that kind of thing.
Full content RSS vital - so can actually read it all in news reader.
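A rough sketch of a full-content feed using only the Python 3 standard library. The function name and the shape of the judgment dictionaries (title, url, html) are assumptions made for illustration; the full text goes into the content:encoded element from the RSS content module, which most news readers display.

```python
import xml.etree.ElementTree as ET

CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"
ET.register_namespace("content", CONTENT_NS)


def build_feed(title, link, description, judgments):
    """Build an RSS 2.0 feed string with one full-content item per judgment."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for judgment in judgments:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = judgment["title"]
        ET.SubElement(item, "link").text = judgment["url"]
        ET.SubElement(item, "guid").text = judgment["url"]
        # The whole judgment body, escaped by ElementTree on output.
        ET.SubElement(item, "{%s}encoded" % CONTENT_NS).text = judgment["html"]
    return ET.tostring(rss, encoding="unicode")
```

A per-court feed would call this with only that court's judgments; a per-search-term feed would need the search to exist first.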
From @jpstacey (J-P Stacey)
@judgmentals worth a "How can I help?" section on your homepage?
Changes to the code/original judgments could result in some judgments being moved to different locations. We should add links or redirects in this case since someone may have linked to the old location.
This is probably unlikely to occur in practice, but it should be easy to take care of.
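Since the site is generated as static files, one simple option is to leave a small stub page at the old location. The sketch below assumes Python 3; redirect_stub is a hypothetical helper and nothing here tracks which judgments have moved. A proper HTTP redirect configured on the server would be the alternative.

```python
from html import escape


def redirect_stub(new_url):
    """Return a tiny HTML page that forwards visitors to new_url."""
    safe = escape(new_url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        '<meta http-equiv="refresh" content="0; url=%s">\n'
        '<link rel="canonical" href="%s">\n'
        "<title>Moved</title>\n"
        "</head>\n"
        '<body><p>This judgment has moved to <a href="%s">%s</a>.</p></body>\n'
        "</html>\n"
    ) % (safe, safe, safe, safe)
```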
By now, the interface to the formatter provided by run.py has a set of command-line options that is becoming bewilderingly numerous and also annoying to use. But on the other hand, these options actually form only a small part of all the options one could want.
Chris and I feel that an interactive launcher would be more helpful. But what technology should it use? Here are the possibilities I can think of:
Any suggestions?
HUDOC is the database of judgments from the European Court of Human Rights.
http://www.echr.coe.int/ECHR/EN/Header/Case-Law/HUDOC/HUDOC+database/
There is no robots.txt, and no conditions of use that I can find.
Update: there is one: "The information and texts available on the Court’s site may be reproduced provided the source is acknowledged. Users should nevertheless be aware that certain information and texts may be protected under intellectual property law, in particular by copyright."
And up the crawl rate to the maximum. This will get us indexed well sooner.
From @A_Ecclestone (Andrew Ecclestone)
@Judgmentals Why don't judgments in Judgmental have paragraph numbers? Isn't that needed for citation and intra-judgment references?
This is because the paragraph numbers are inside LI tags, which get stripped as part of our cleanup operation.
You'd go to a judgment on BAILII and press a button to add it to judgmental.
That wouldn't need screen scraping, but we need to consider the legal issues a bit more before doing it.
From @gwire (Lee Maguire):
@judgmentals any plans to add relevant hyperlinks to [legislation.gov.uk or] the CPR references? http://bit.ly/iKQcno
From @adreagui (Guillaume Adreani)
@Judgmentals Are you compatible with Zotero ? http://www.zotero.org
Usability feature: Zotero allows users to capture the citation of the page in one click. Not sure what's needed on our side to support this.
The Scottish Sheriff Court has issued some judgments in Comic Sans, eg:
http://judgmental.org.uk/judgments/ScotSC/2008/[2008]%20ScotSC%2034.html
This choice of font perhaps doesn't reflect the full might and majesty of the judicial process. Or at any rate looks a bit messy on our site.
It's probably easier to use !important in the CSS to override it than to find and remove instances of
.
May as well, for people who want to use it locally.
Add a search, use a Google Custom site: search.
That's easier and quicker to implement, and also makes sure we do all we can to get every judgment in Google.
Currently, the HTML we generate does not validate; validation is clearly a desirable aim.
If we have to abort processing any file, for example in many cases if exceptions of various sorts are raised, a message is written to a log.
It is well worth increasing the quality of those messages to provide more information about what's going wrong.
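One way the messages could carry more detail, sketched with the standard logging module. The logger name, process_all and the process_one callback are hypothetical; the point is that logger.exception records the file name, the exception type and the full traceback in one call.

```python
import logging

logger = logging.getLogger("judgmental.convert")  # hypothetical logger name


def process_all(filenames, process_one):
    """Run process_one on each file, logging and collecting any failures."""
    failures = []
    for filename in filenames:
        try:
            process_one(filename)
        except Exception as exc:
            # Logs at ERROR level and appends the current traceback.
            logger.exception(
                "Aborted %s: %s: %s", filename, type(exc).__name__, exc
            )
            failures.append((filename, exc))
    return failures
```

Returning the failures as well as logging them makes it easy to print a summary at the end of a run.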
Obviously, we're limited by what runs on the server. But it would be nice to take advantage of more modern Python features. Benefits include the following:
In convert.py, we (I) rather dumbly pass the court name to best_filename and then use Levenshtein to find the abbreviated name. This is done for every judgment.
We should put the abbreviated names into the courts table instead.
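A sketch of what the lookup could become once the abbreviations are stored, assuming the courts table lives in SQLite. The column names, helper names and the full court names used here are illustrative guesses, not the existing schema.

```python
import sqlite3


def make_courts_table(conn):
    """Create a courts table that stores each abbreviated name alongside the name."""
    conn.execute(
        "CREATE TABLE courts ("
        "name TEXT PRIMARY KEY, abbreviated_name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO courts (name, abbreviated_name) VALUES (?, ?)",
        [
            ("United Kingdom Supreme Court", "UKSC"),  # illustrative rows
            ("Scottish Sheriff Court", "ScotSC"),
        ],
    )


def abbreviated_name(conn, court_name):
    """Exact-match lookup; returns None if the court is not in the table."""
    row = conn.execute(
        "SELECT abbreviated_name FROM courts WHERE name = ?", (court_name,)
    ).fetchone()
    return row[0] if row else None
```

The fuzzy match would then only be needed once per court, when the table is populated, and not once per judgment.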
Raised by frabcus on-list: https://groups.google.com/d/msg/judgmental/esZDc1VEGv8/rUTVLMnfJLYJ
Not sure exactly why this happens, but it might be the result of the crawler's logic, given that we have the court name, not the case name, as the page's main <h1>.
Will have to see if the situation improves upon solving gh#39 (the title thing) as well as gh#21 (sitemap.xml).
We need to better utilise the metadata we have collected.
There are several things we have which we are not using at all:
And there are probably prettier and better ways to display the things we are using.
Once we have nice indexing pages (an index for each court, etc), we can use the metadata to generate URLs to the appropriate page.
What we need, really, is an improved HTML template for the page.