justicehub-in / justice-hub-docs

Documentation platform for the Justice Hub

Home Page: https://docs.justicehub.in/

CSS 5.23% HTML 73.57% JavaScript 11.73% TeX 0.22% Jupyter Notebook 1.61% Shell 0.08% Python 0.20% R 3.61% SCSS 3.75%

docs hugo-academic justice legal open-data platform

justice-hub-docs's Issues

Create a documentation portal for stop 66A

Create a format to curate datasets within an organisation

How to curate datasets within an organisation

Organisations that work across sectors and teams, and that don't necessarily maintain a data catalog, will have to collect information on which datasets have the potential to be made available on the Justice Hub. A few important fields for identifying such datasets are:

Organisation
Dataset Title
Dataset description
Sector
How was the data sourced (RTI's, Web Scraping, etc.)
Availability of raw data (different from processed data)
Availability of data dictionary
Importance (How frequently is the dataset used for research use-cases)
Is the data still maintained
Maintainer email
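
The fields above could be captured as one row per dataset in a shared CSV file. The following is a minimal sketch of that idea, not an agreed format: the class name `DatasetEntry`, the snake_case field names and the example values are all assumptions made for illustration.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class DatasetEntry:
    """One row of a hypothetical internal dataset inventory."""
    organisation: str
    dataset_title: str
    dataset_description: str
    sector: str
    data_source: str                 # e.g. "RTI", "Web Scraping"
    raw_data_available: bool
    data_dictionary_available: bool
    importance: str                  # how often the dataset is used for research
    still_maintained: bool
    maintainer_email: str


def to_csv(entries):
    """Serialise entries to CSV text with one column per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(DatasetEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


# Illustrative values only; none of these come from a real dataset.
example = DatasetEntry(
    organisation="Example Organisation",
    dataset_title="Example dataset",
    dataset_description="Short description of the dataset",
    sector="Justice",
    data_source="RTI",
    raw_data_available=True,
    data_dictionary_available=False,
    importance="Used monthly",
    still_maintained=True,
    maintainer_email="maintainer@example.org",
)
print(to_csv([example]))
```

Keeping the inventory as a flat CSV means it can be filled in from a spreadsheet by teams that do not write code.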

Add documentation on how to search for datasets on the Justice Hub

justicehub-in/ckanext-justicehub_theme#71

Create a page for Open Data Pledge

Cleanup Docs

To track the status of things to do to clean-up the docs website before the alpha launch:

Homepage

Change language - Text should match the narrative at JusticeHub
Change notification content
Redirect Get in touch to JH Contact Page
Way to include substack

Nav Bar

Resources

Update content

Contract Enforcement Litigation Data from District Courts | NIPFP

Data Accessibility Report

Data standardisation

Can the columns of type HTML string be stored as separate tables?
Mark columns in the data dictionary as either directly sourced from the source or user-generated, e.g. court_code, complexcode, day_pending, etc.
Only 76 out of the total 86 columns are present in the data dictionary
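
A gap like the one above (columns in the data that are missing from the data dictionary) can be found mechanically. This is a rough sketch that assumes both the data and the dictionary are CSV files and that the dictionary has a column holding variable names. The function name `missing_from_dictionary` and the `variable` header are assumptions, and the sample rows are made up.

```python
import csv
import io


def missing_from_dictionary(data_csv, dictionary_csv, name_column="variable"):
    """Return data columns that have no entry in the data dictionary."""
    data_columns = next(csv.reader(io.StringIO(data_csv)))
    documented = {
        row[name_column].strip()
        for row in csv.DictReader(io.StringIO(dictionary_csv))
    }
    return [column for column in data_columns if column.strip() not in documented]


# Made-up sample: three data columns, only one of them documented.
data_csv = "court_code,complexcode,day_pending\n1,2,3\n"
dictionary_csv = "variable,definition\ncourt_code,Code of the court\n"

missing = missing_from_dictionary(data_csv, dictionary_csv)
print(missing)
```

The returned list can be sent to the contributor as the set of columns that still need a definition.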

PII variables present in file

Data License

Mention license[s] under which the data is to be released on the Justice Hub

Other files required (if available)

Raw Data
Data processing document

Questions

How does the dataset deal with empty values?

Curate a list of justice and legal data platforms

Update partner curation page

Change URL from partner-curation to partners
Add status of onboarding
Add status of open data pledge
Share links to open data pledge
Add quotes from partners
Add social media links

Add advisory board members to the Team Page

Data points to curate

Parliament session wise questions for Law and Justice
High Court data released by Daksh

Curate a FAQ section for the website

Data report | Judiciary Expenditure | CBGA

Data Accessibility Report

Files available

Data dictionary (Human readable dictionary of data contents)
Data License (How to use and share the data)
Raw Dataset (The original/first data provided)
Processed Dataset (Final data used in analysis)
Dataset README (A Human readable description of the data)
Citation (How you want your data to be cited)

Data Cleaning & Standardisation Report

Presence of PII's (Personally Identifiable Information)
Data to be uploaded is in a machine-readable format (CSV, JSON)

Other details

Data maintainer details

Comments

Data report | Judicial Vacancies in India | Vidhi

Data Accessibility Report

Links
Sample Dataset
Data Documentation
Data Dictionary

Files available

Files	Status
Data dictionary (Human readable dictionary of data contents)	❌
Data License (How to use and share the data)	❌
Raw Dataset (The original/first data provided)	❌
Processed Dataset (Final data used in analysis)	✅
Dataset README (A Human readable description of the data)	✅
Citation (How you want your data to be cited)	❌

Data Cleaning & Standardisation Report

Issue	Status
Data does not have any PII's (Personally Identifiable Information)	✅
Data to be uploaded is in a machine-readable format (CSV, JSON)	✅

Other details

Data maintainer details

Comments/Next Steps:

Please mention whether the raw data reports (the PDFs scraped from the DoJ) are available.
Mention the date of data collection/publication. When was this dataset released on the JALDI portal?
Share the process of updating the datasets for all levels of the Judiciary. Mention details like frequency, methodology, etc.
- The district court dataset is available for 2017 and 2019, while the datasets for the Supreme Court and High Court are only present for 2019. Would you like to share any specific reasons for this?
- Would you like to share the datasets for every year and maintain them as individual files, or will there be one master file for each court which will be updated periodically?
Are there any variables that are not directly available from the reports but are calculated by the team (derived variables)?
Some files are shared as XLS(X) files while others are shared as CSV. Please share all files as CSV files to make this dataset more accessible.

❗ Important:

Please share a link to the data dictionary (This is a CSV file which contains information about the columns present in all files under a dataset). Learn more
Mention the license under which this dataset is to be released on the JusticeHub. Please refer to this link for learning more about open data licenses

📈 Improving data accessibility:

If possible, share all files listed under the Files available section above.
Share the data as CSV files.
Include a README file, which is a short description of the dataset. Refer here to know more

Data report | Supreme Court workload | Nick Robinson

Data Accessibility Report

Links
Sample Dataset
Data Documentation
Data Dictionary

Files available

Files	Status
Data dictionary (Human readable dictionary of data contents)	❌
Data License (How to use and share the data)	❌
Raw Dataset (The original/first data provided)	✅
Processed Dataset (Final data used in analysis)	❌
Dataset README (A Human readable description of the data)	❌
Citation (How you want your data to be cited)	❌

Data Cleaning & Standardisation Report

Issue	Status
Data does not have any PII's (Personally Identifiable Information)	✅
Data to be uploaded is in a machine-readable format (CSV, JSON)	✅

Other details

Data maintainer details

Comments/Next Steps:

❗ Important:

Please share a link to the data dictionary (This is a CSV file which contains information about the columns present in all files under a dataset). Learn more
Mention the license under which this dataset is to be released on the JusticeHub. Please refer to this link for learning more about open data licenses

📈 Improving data accessibility:

If possible, share all files listed under the Files available section above.
Share the data as CSV files.
Include a README file, which is a short description of the dataset. Refer here to know more

Data report | Contract Enforcement Litigation Data from District Courts | NIPFP

Data Accessibility Report

Links
Sample Dataset
Data Documentation
Data Dictionary

Files available

Files	Status
Data dictionary (Human readable dictionary of data contents)	✅
Data License (How to use and share the data)	❌
Raw Dataset (The original/first data provided)	✅
Processed Dataset (Final data used in analysis)	❌
Dataset README (A Human readable description of the data)	❌
Citation (How you want your data to be cited)	❌

Data Cleaning & Standardisation Report

Issue	Status
Data does not have any PII's (Personally Identifiable Information)	❌
Data to be uploaded is in a machine-readable format (CSV, JSON)	✅

Other details

Data maintainer details

Comments/Next Steps:

Can the columns of type HTML string be stored as separate tables?
Mark columns as either directly sourced from the source (raw/original) or derived/user-generated in the data dictionary. E.g. columns court_code, complexcode, day_pending, etc. can be marked as derived
Only 76 out of the total 86 columns are present in the data dictionary
How does the dataset deal with empty values? Is it different for individual columns? This information can be included for each column in the data dictionary as well
Variables with personally identifiable information (PII's) (As per our data sharing policy, we are not uploading any datasets with sensitive information either about communities (CII's) or individuals):

File	Variable
sample_dataframe	petNameAdd
sample_dataframe	pet_adv
sample_dataframe	pet_name
sample_dataframe	petnameadArr
sample_dataframe	petparty_name
sample_dataframe	resNameAdd
sample_dataframe	res_adv
sample_dataframe	res_name
sample_dataframe	resparty_name

❗ Important:

Anonymise sensitive information. To do this, columns with PII's listed above can be removed from the original dataset
Mention the license under which this dataset is to be released on the JusticeHub. Please refer to this link for learning more about open data licenses
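
Removing the PII columns listed above can be done with a short script before the file is shared. This is a minimal sketch using only the Python standard library. It assumes the sample is a CSV file, and the helper name `drop_columns` and the sample row are made up for illustration. The column names in `PII_COLUMNS` are the ones from the table above.

```python
import csv
import io

# Column names taken from the PII table in this report.
PII_COLUMNS = {
    "petNameAdd", "pet_adv", "pet_name", "petnameadArr", "petparty_name",
    "resNameAdd", "res_adv", "res_name", "resparty_name",
}


def drop_columns(csv_text, columns_to_drop):
    """Return the CSV text with the given columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in reader.fieldnames if name not in columns_to_drop]
    out = io.StringIO()
    # extrasaction="ignore" lets us pass full rows and silently skip dropped keys.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Made-up sample row, not real data.
sample = "court_code,pet_name,res_name,day_pending\n12,Person A,Person B,340\n"
print(drop_columns(sample, PII_COLUMNS))
```

Dropping direct identifiers is only the first step; free-text columns may still need a manual check for names or addresses.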

📈 Improving data accessibility:

If possible, share all files listed under the Files available section above.
Include a README file, which is a short description of the dataset. Refer here to know more

Data report | Company Registration Data | Veratech

Data Accessibility Report

Links
Sample Dataset
Data Documentation
Data Dictionary

Files available

Files	Status
Data dictionary (Human readable dictionary of data contents)	✅
Data License (How to use and share the data)	❌
Raw Dataset (The original/first data provided)	❌
Processed Dataset (Final data used in analysis)	✅
Dataset README (A Human readable description of the data)	✅
Citation (How you want your data to be cited)	❌

Data Cleaning & Standardisation Status

Issue	Status
Data does not have any PII's (Personally Identifiable Information)	❌
Data to be uploaded is in a machine-readable format (CSV, JSON)	❌

Other details

Data maintainer details

Comments/Next Steps:

Data is shared as SQL reports. We can upload this directly to the hub or convert it to CSV files to make it more accessible to our users
A schema/architecture map of the database will help users to navigate the dataset
Variables with personally identifiable information (PII's) (As per our data sharing policy, we are not uploading any datasets with sensitive information either about communities (CII's) or individuals):

File	Variable
charge_dtls	charge_holder_name
charge_dtls	address
company_dtls	company_name
company_dtls	reg_add
company_dtls	email
signatory_dtls	name

❗ Important:

Anonymise sensitive information. To do this, columns with PII's listed above can be removed from the original dataset
Please share a link to the data dictionary (This is a CSV file which contains information about the columns present in all files under a dataset). Learn more
Mention the license under which this dataset is to be released on the JusticeHub. Please refer to this link for learning more about open data licenses

📈 Improving data accessibility:

If possible, share all files listed under the Files available section above.
Share the data as CSV files.
Include a README file, which is a short description of the dataset. Refer here to know more

Data report | India Justice Report | Tata Trusts

Data Accessibility Report

Links
Sample Dataset
Data Documentation
Data Dictionary

Files available

Files	Status
Data dictionary (Human readable dictionary of data contents)	✅
Data License (How to use and share the data)	✅
Raw Dataset (The original/first data provided)	✅
Processed Dataset (Final data used in analysis)	✅
Dataset README (A Human readable description of the data)	✅
Citation (How you want your data to be cited)	✅

Data Cleaning & Standardisation Report

Issue	Status
Data does not have any PII's (Personally Identifiable Information)	✅
Data to be uploaded is in a machine-readable format (CSV, JSON)	✅

Other details

Data maintainer details

Comments/Next Steps:

❗ Important:

Mention the license under which this dataset is to be released on the JusticeHub. Please refer to this link for learning more about open data licenses

📈 Improving data accessibility:

If possible, share all files listed under the Files available section above.

Data report | Death Penalty - Annual Statistics 2019 | Project 39A

Data Accessibility Report

Links
Sample Dataset
Data Documentation
Data Dictionary

Files available

Files	Status
Data dictionary (Human readable dictionary of data contents)	✅
Data License (How to use and share the data)	✅
Raw Dataset (The original/first data provided)	❌
Processed Dataset (Final data used in analysis)	✅
Dataset README (A Human readable description of the data)	✅
Citation (How you want your data to be cited)	❌

Data Cleaning & Standardisation Report

Issue	Status
Data does not have any PII's (Personally Identifiable Information)	✅
Data to be uploaded is in a machine-readable format (CSV, JSON)	✅

Other details

Data maintainer details

Comments

~~Remove cell formatting (colors, etc)~~
~~Remove column summaries from the end (Row 105 and beyond, in worksheet Persons sentenced to death)~~
~~Worksheet titled Other is not in a standard format (No columns found)~~
Variables with personally identifiable information (PII's) (As per our data sharing policy, we are not uploading any datasets with sensitive information either about communities (CII's) or individuals):

Worksheet/File	Variable	Status
Persons sentenced to death	Name of person	Deleted. Added an ID column
Movements in HC and SC	Name of person	Deleted. Added an ID column
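
One way to do what the Status column describes (delete the name, add an ID column) is sketched below. This is an assumption about the approach, not a record of how it was actually done for this dataset. The helper name `replace_with_id`, the `person_id` column and the sample rows are hypothetical.

```python
import csv
import io


def replace_with_id(csv_text, name_column, id_column="person_id"):
    """Drop name_column and add a numeric ID that is stable per name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = [id_column] + [f for f in reader.fieldnames if f != name_column]
    ids = {}  # name -> ID; must be kept private, since it can reverse the IDs
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        name = row.pop(name_column)
        # The same name always maps to the same ID within this run.
        row[id_column] = ids.setdefault(name, len(ids) + 1)
        writer.writerow(row)
    return out.getvalue()


# Made-up rows; "Name of person" is the column named in the table above.
sample = (
    "Name of person,Stage,Year\n"
    "Person A,X,2019\n"
    "Person B,Y,2019\n"
    "Person A,X,2018\n"
)
print(replace_with_id(sample, "Name of person"))
```

If the same person appears in both worksheets, the same mapping would have to be reused across files so that the IDs still link the records.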

❗ Important:

Anonymise sensitive information. To do this, columns with PII's listed above can be removed from the original dataset - Done
Please share a link to the data dictionary (This is a CSV file which contains information about the columns present in all files under a dataset). Learn more - Done
Mention the license under which this dataset is to be released on the JusticeHub. Please refer to this link for learning more about open data licenses - Done

📈 Improving data accessibility:

If possible, share all files listed under the Files available section above.
Share the data as CSV files. We can have three individual files i.e. Persons sentenced to death, Movements in HC and SC & Statistics under the Annual Death Penalty Report dataset
Include a README file, which is a short description of the dataset. Refer here to know more

Create a data dictionary format

A standard format to be shared with all data contributors. This is required because most data contributors maintain their own formats, and some don't maintain any form of data dictionary. While creating reports from these datasets, data moderators would want to know specific details about variables, such as:

Name
Data type (Numeric, Character, etc)
Definition (How a variable is defined)
Variable Type (Categorical or Continuous)
Variable codes (possible values, if data is categorical)
Missing values code (How are missing values treated in a dataset)
File present in (If a dataset has multiple files)
Variable source (Origin, Created by the user)
- Calculated fields
Mathematical formulas used to calculate a field

Check this - https://karthik.github.io/ddd/#minimal
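
To make the format easier to adopt, an empty dictionary could be generated from a data file's header and handed to the contributor to fill in. The sketch below assumes CSV input and uses snake_case field names derived from the list above. The names in `DICTIONARY_FIELDS` and the function `dictionary_skeleton` are assumptions, not a finalised format.

```python
import csv
import io

# Assumed field names, one per item in the list above.
DICTIONARY_FIELDS = [
    "name", "data_type", "definition", "variable_type", "variable_codes",
    "missing_values_code", "file_present_in", "variable_source", "formula",
]


def dictionary_skeleton(file_name, csv_text):
    """Return a data dictionary CSV with one blank row per data column."""
    header = next(csv.reader(io.StringIO(csv_text)))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=DICTIONARY_FIELDS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    for column in header:
        # Only the fields known from the file itself are pre-filled.
        writer.writerow({"name": column, "file_present_in": file_name})
    return out.getvalue()


# Made-up data file with two columns.
skeleton = dictionary_skeleton("cases.csv", "court_code,day_pending\n12,340\n")
print(skeleton)
```

The contributor then only has to fill in definitions, types and codes rather than build the file from scratch.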

Add a License Page

Add a page listing all the licences that are available for data contributors to make the data available on the Justice Hub.

Links

Configure an email account for Justice Hub

Email - [email protected]

Data report | Correctional Facilities in Assam | Studio Nilima

Data Accessibility Report

Links
Sample Dataset
Data Documentation
Data Dictionary

Files available

Files	Status
Data dictionary (Human readable dictionary of data contents)	❌
Data License (How to use and share the data)	❌
Raw Dataset (The original/first data provided)	❌
Processed Dataset (Dataset used for analysis)	✅
Dataset README (A Human readable description of the data)	❌
Citation (How you want your data to be cited)	❌

Data Cleaning & Standardisation Report

Issue	Status
Data does not have any PII's (Personally Identifiable Information)	✅
Data to be uploaded is in a machine-readable format (CSV, JSON, XLS*)	✅

Other details

Data maintainer details

Comments/Next Steps:

Worksheet: Sheet 1

RTI responses under each indicator can be shared as a separate worksheet. For example, Women and Child Health, MEDICAL FACILITIES, MEDICAL STAFF, EDUCATION AND HEALTH, and DETENTION MANUAL + FORTNIGHTLY PRISON REPORT can each be a separate sheet, as they all have information under different heads (columns)
All date columns, such as RTI dated on, Reply received on, etc., should contain only valid date values in a single consistent date format, e.g. YYYY-MM-DD
Remove cell formats (Colors, Bold, Italics, etc)
Indicators with RTI IGP official no. should be a separate worksheet/file. E.g.:

Group	RTI Details
MEDICAL FACILITIES	RTI IGP official no. - 34
Women and Child Health	RTI IGP official no. - 35
EDUCATION AND HEALTH	RTI IGP official no. - 44

All column names should be standardised (lower case, mostly an identifier instead of a description)
Every column name should just be a label, and its description should be available in the data dictionary. For example, a column name can be gynaecologists_available and its description can be How many gynaecologists are appointed or available for visits in the correctional homes of Assam? Please provide the number of such doctors and frequency of visit (of last 3 years) along with institution/hospital where they are appointed or available. (which is the actual column name in the file shared).
A few RTI responses can be converted to quantitative data as well. For example, responses mentioning nil can be converted to 0. (This depends on the use-case; sometimes it is not feasible to assign numbers to text, but it should be done where possible)
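
Some of the clean-up steps above (lower-case column names, a single date format, nil converted to 0) can be sketched as small helpers. This is an illustration under assumptions: the function names are made up, and the input date formats in `DATE_FORMATS` are guesses, since the formats used in the shared file are not listed here.

```python
import re
from datetime import datetime


def standardise_column_name(name):
    """Lower-case a column name and join its words with underscores."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.strip().lower())
    return slug.strip("_")


# Assumed input formats (day first); adjust to whatever the file really uses.
DATE_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%d.%m.%Y", "%Y-%m-%d")


def to_iso_date(value):
    """Return the date as YYYY-MM-DD, or None if it is not a valid date."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def nil_to_zero(value):
    """Convert a response of 'nil' to '0'; leave every other value unchanged."""
    return "0" if value.strip().lower() == "nil" else value


print(standardise_column_name("RTI dated on"))  # rti_dated_on
print(to_iso_date("05/03/2019"))                # 2019-03-05
print(nil_to_zero("Nil"))                       # 0
```

Slugging a long question still produces a long name, so short labels such as gynaecologists_available have to be chosen by hand and paired with the full question in the data dictionary.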

Worksheet: Nature of illness - details

Share this as a CSV file
Remove Nature of Illness from Cell 1 as this is the title of the file/worksheet
Include geographic details as a separate column
Each column should contain values of a single type. For example, the column titled Monthly Average (approx) should only have numbers and not dates, e.g. 2019

❗ Important:

Please share a link to the data dictionary (This is a CSV file which contains information about the columns present in all files under a dataset). Learn more
Mention the license under which this dataset is to be released on the JusticeHub. Please refer to this link for learning more about open data licenses

📈 Improving data accessibility:

If possible, share all files listed under the Files available section above.
Share the data as CSV files.
Include a README file, which is a short description of the dataset. Refer here to know more

Document the JusticeHub - Terms of Service

Justice Hub - Terms of Service

The purpose of the Justice Hub (A collaborative legal data platform) is to enable the sharing of data across the legal and justice sector. We collaborate with our partners, including researchers, practitioners, and government agencies to curate and understand data relevant to the operations of the justice ecosystem.
Only approved organisations are able to share data through the platform.
JusticeHub will work with its partners to assess the quality of the datasets before the data can be shared on the platform. Any dataset that does not fit the data quality framework of the JusticeHub will not be uploaded on the JusticeHub
The data moderators at JusticeHub will work with the data contributors to identify gaps in the datasets and document the processes required before the data can be shared on the platform
JusticeHub does not allow data that includes personally identifiable information (PII) to be shared publicly through the site. All data shared publicly must be sufficiently aggregated or anonymized so as to prevent identification of people or other harm to affected people and the community.
JusticeHub endeavors not to allow publicly shared data that includes community identifiable information (CII) or demographically identifiable information (DII) that may put affected people at risk. However, this type of data is more challenging to identify within datasets during our quality assurance process without deeper analysis. We invite users of the JusticeHub to notify us should they become aware of this type of data being shared through the site.
Organisations sharing data through the platform should ensure, to the extent possible, that all data was collected in a legal, ethical and responsible manner
Datasets on the platform can be shared under a user-selected Creative Commons license or as public domain.
Should a user become aware of data shared through the JusticeHub platform that could cause harm by being shared openly, the user should contact [email protected] immediately to request that the data be removed.
Data shared through the JusticeHub platform will be held indefinitely or until such a time that the data contributor deletes it, or there is a request from a user for it to be deleted. In the latter case, the user would have to provide a convincing reason (e.g. privacy) and possibly also supporting evidence for the claim.
If a data source becomes aware of data that has been shared through the JusticeHub platform by a third party and disagrees with it being shared, the data source should contact [email protected] to request that the data be removed.
If CivicDataLab as system administrator of the JusticeHub platform becomes aware of any data that is in violation of these Terms of Service, CivicDataLab will contact the individual or organization to notify them.

About these Terms

The JusticeHub core team (CivicDataLab & Agami) may modify these terms or any additional terms that apply to the JusticeHub platform to reflect changes to our services. Users should look at the terms regularly. We will post notice of modifications to these terms. If you do not agree to these terms, you should discontinue use of the JusticeHub platform.

Curate a list of resources for data contributors

Companies Registration Dataset | Veratech

Data Accessibility Report

Data points (or datasets) from the IndianKanoon portal

ID	Dataset	Category	Columns	Use-Case
1	Aggregated list of cases (judgements) under all IPC acts/sections by bench and time-period (month/year)	Case-Law	`bench`,`month`,`year`,`act`,`section`,`case-count`	Trends of disposed cases under various IPC acts and sections over time
2	Year/Month and Bench wise aggregation of cases	Case-Law	`bench`,`month`,`year`,`total-cases`	Temporal view of total disposed cases by bench (Last n year)
3	Author/Bench/Year wise count of cases	Case-Law	`author`,`bench`,`month`,`year`,`total-cases`	Judge (Author) wise analysis of cases over time
4	Number of citations per act/section. This can further be aggregated by court bench	Citations	`act`,`section`,`total-citations`,`bench`	Total act specific judgement citations categorised by Bench
5	Number of citations per author (judge), all courts	Citations	`author`,`bench`,`total-citations`	Total citations categorised by Judge and Bench
6	Top n (100) most cited judgements of each court/bench	Citations	`bench`,`judgement`,`total-citations`,`rank`	Top most cited judgements of each court
7	A few interesting data points from IndianKanoon Website Analytics (Search requests over time)	Analytics		To analyse search trends over time and other user-specific website usage behavior on IndianKanoon
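
As an illustration of how a dataset like ID 2 (year/month and bench wise aggregation) could be derived, the sketch below counts judgement records per bench and month. The input structure is an assumption: the record layout, the bench names and the function name `cases_by_bench_and_month` are hypothetical and do not reflect the IndianKanoon data model.

```python
from collections import Counter

# Hypothetical judgement records, one per disposed case.
judgements = [
    {"bench": "Bench A", "year": 2019, "month": 1},
    {"bench": "Bench A", "year": 2019, "month": 1},
    {"bench": "Bench B", "year": 2019, "month": 2},
]


def cases_by_bench_and_month(records):
    """Count records per (bench, year, month), sorted by that key."""
    counts = Counter((r["bench"], r["year"], r["month"]) for r in records)
    return [
        {"bench": bench, "year": year, "month": month, "total-cases": total}
        for (bench, year, month), total in sorted(counts.items())
    ]


summary = cases_by_bench_and_month(judgements)
for row in summary:
    print(row)
```

The other aggregated datasets in the table follow the same pattern with a different grouping key, for example adding `act` and `section` for ID 1 or `author` for ID 3.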

Update the email in the Contact Us Page

Update the formspree account email to [email protected]
