kescardoso / datasetbucket

A dataset bucket with a machine learning bias auditor. Built with Python-Flask, MaterializeCSS and the Kaggle API.

Home Page: https://datasetbucket.herokuapp.com

License: MIT License

Python 53.37% CSS 1.92% JavaScript 0.87% HTML 43.84%
python python3 flask flask-application mongodb mongodb-database machine-learning unbiased

datasetbucket's Issues

Fix PDF title

Sometimes the PDF title is incorrect. We need to check that the correct string is being passed to it. This may be resolved once we fix the issue of old datasets not being downloaded.

Improve PDF analytics

  • Break down number of samples for each value
  • Plot histograms/diagrams representing distributions for each feature
  • Other analytics improvements
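
One possible starting point for the first two bullets, as a minimal sketch: it assumes the dataset is already loaded as a list of dicts (for example from `csv.DictReader`). The function names `value_breakdown` and `text_histogram` are hypothetical and not part of the existing code base.

```python
from collections import Counter


def value_breakdown(rows, feature):
    """Return {value: (count, share)} for one feature, most common first."""
    counts = Counter(row.get(feature) for row in rows)
    total = sum(counts.values())
    return {value: (n, n / total) for value, n in counts.most_common()}


def text_histogram(breakdown, width=20):
    """Render a breakdown as plain-text bars, one line per value."""
    lines = []
    for value, (n, share) in breakdown.items():
        bar = "#" * round(share * width)
        lines.append(f"{value!s:>10} | {bar} {n}")
    return lines


if __name__ == "__main__":
    sample = [{"gender": "F"}, {"gender": "M"}, {"gender": "F"}, {"gender": "F"}]
    for line in text_histogram(value_breakdown(sample, "gender")):
        print(line)
```

A plotting library could replace the text bars in the PDF itself; the counting step would stay the same.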

Add dictionary of country demographics

Add population demographics of each country to be used in the analysis of the data.
We want to compare the demographics in the data to the demographics in a given population, to check if the data is representative of the whole.
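
The comparison could look roughly like the sketch below. It assumes the demographics dictionary maps each country to group shares that sum to 1. The dictionary name, the country key and the numbers are placeholders for illustration only, not real population data.

```python
# Hypothetical structure: country -> {group: share of population}.
# The values below are placeholders, not real statistics.
COUNTRY_DEMOGRAPHICS = {
    "Exampleland": {"female": 0.5, "male": 0.5},
}


def representation_gaps(sample_counts, population_shares):
    """Return observed share minus expected share for each population group.

    A negative gap means the group is under-represented in the dataset.
    """
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("dataset has no samples")
    gaps = {}
    for group, expected in population_shares.items():
        observed = sample_counts.get(group, 0) / total
        gaps[group] = observed - expected
    return gaps


if __name__ == "__main__":
    counts = {"female": 30, "male": 70}
    print(representation_gaps(counts, COUNTRY_DEMOGRAPHICS["Exampleland"]))
```

What gap counts as "not representative" is a design decision the sketch leaves open.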

Setup File Uploading with Cloud Storage

On add_dataset, file upload only succeeds in local deployment.
After deploying the app to Heroku, a fix is needed: either store uploaded files in MongoDB as base64, or use a cloud storage service such as S3 to handle user uploads in the deployed version (ultimately necessary).
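
For the base64 route, the encoding step might be sketched as follows. The helper names and the document field names (`filename`, `size`, `content_b64`) are assumptions, and the actual database insert is left out so the example stays self-contained.

```python
import base64


def file_to_document(filename, data):
    """Wrap raw file bytes in a dict that could be stored as a MongoDB document."""
    return {
        "filename": filename,
        "size": len(data),  # size of the original bytes, before encoding
        "content_b64": base64.b64encode(data).decode("ascii"),
    }


def document_to_bytes(doc):
    """Recover the original file bytes from a stored document."""
    return base64.b64decode(doc["content_b64"])


if __name__ == "__main__":
    doc = file_to_document("sample.csv", b"age,name\n20,a\n")
    print(doc["filename"], doc["size"], len(doc["content_b64"]))
```

Base64 grows the payload by about a third, and MongoDB limits a single document to 16 MB, so larger datasets would likely need GridFS or the cloud storage option instead.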

Fix image analysis

Image analysis files need their paths updated. Images in a dataset are not currently getting analyzed or reported.

Delete .vscode file

@kescardoso I want to delete the .vscode folder and .DS_Store in main, but I don't want to break the deployment. Can we delete it safely?

Generate a PDF of the report

The results from the analysis will be available to download as a PDF, in addition to being displayed in the HTML.

Fix Category Bug

Categories:
After the location select functionality was added, bugs appeared in category selection:

  • previous multi selection not returning from db on edit view
  • string and list not rendering properly (again!)
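
If the rendering problem comes from categories sometimes being stored as a string and sometimes as a list, normalizing the value before it reaches the template may help. This is only a sketch: the helper name is hypothetical, and treating a string as comma-separated is an assumption about how the form submits values.

```python
def as_category_list(value):
    """Normalize a stored category value (None, str or iterable) to a list of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        # Assumption: multiple categories in one string are comma-separated.
        return [part.strip() for part in value.split(",") if part.strip()]
    return [str(item) for item in value]


if __name__ == "__main__":
    print(as_category_list("health, finance"))
    print(as_category_list(["health", "finance"]))
```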

Better UX/UI/Design rules

Add:

  • Footer
  • About/instructions page
  • Credits

Improve user-friendly and aesthetic features:

  • color key
  • buttons
  • typography

More info/functionalities for user profile

Develop user profile with more functionalities (not a priority, but cool if we have time)

  • hyperlink author on datasets.html
  • NavBar with active user session tag
  • User info (name, position, photo, links, etc.)
  • User tasks, activities, contributions, stats....

Some things left to figure out and learn 🤔

Add 2 more acceptable formats to JSON files

We want to support 3 different formats:
{'root': {dict}}
{'content': {}, 'annotation':[{'labels':{}},{'points':{}}], 'extras':{}} (currently the only accepted format)
{'keyword':{}, 'keyword':{}, 'keyword':{}}
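
A rough way to tell the three shapes apart is sketched below. The function name and the returned labels (`"root"`, `"annotated"`, `"keyword"`) are made up for illustration, and the checks reflect only the shapes listed above, not the project's actual validation.

```python
def detect_json_format(data):
    """Classify parsed JSON as 'root', 'annotated', 'keyword' or 'unsupported'."""
    if not isinstance(data, dict):
        return "unsupported"
    # {'root': {...}}: a single top-level key named 'root' holding a dict.
    if set(data) == {"root"} and isinstance(data["root"], dict):
        return "root"
    # {'content': ..., 'annotation': [...], 'extras': ...}
    if "content" in data and isinstance(data.get("annotation"), list):
        return "annotated"
    # {'keyword': {...}, ...}: arbitrary keys, every value a dict.
    if data and all(isinstance(value, dict) for value in data.values()):
        return "keyword"
    return "unsupported"


if __name__ == "__main__":
    print(detect_json_format({"root": {"a": 1}}))
    print(detect_json_format({"cats": {}, "dogs": {}}))
```

The order of the checks matters: a `{'root': {...}}` document would also satisfy the keyword test, so the more specific shapes are tested first.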

Link field with jinja tags (for pdf reports retrieval)

Feature to be implemented as an alternative:

Figure out a way to add a link input to Add New Dataset, so that PDF reports hosted in cloud storage can be included.
This is especially useful in case the JSON file handling doesn't integrate well with the project.

Categories Query

Create the categories Query and management sessions:

  • List categories with links
  • Create Add New and Edit category forms

Add about page

Add about page to explain bias to the user and explain how we will analyze their data.

Improve PDF analysis

Currently, the analysis and recommendations generated by the program are lacking. For CSV files, it only calculates basic metrics (like mean and variance) and generates a few histograms. For JSON files, the program doesn't give much analysis either.

We need people to improve on the PDF analysis/recommendations!

Analysis Page

Created a page with a text input form to analyze datasets from a Kaggle command or URL.

The page is not yet functional: it needs further development and setup to link the HTML form to the bias auditor app.

Link to the deployed version: https://datasetbucket.herokuapp.com/analyse_data
