
rhuanbarros / court_decisions_jurimetric_analysis


Analysis of Court Decisions using Machine Learning with Weak Supervision

Jupyter Notebook 95.42% Python 0.05% HTML 4.53% Dockerfile 0.01%
jurimetria jurimetrics legal legaltech machine-learning

court_decisions_jurimetric_analysis's Introduction

Hi there 👋

I'm a Master's graduate from the Federal University of Rio Grande do Sul, specializing in Data Science, Artificial Intelligence, and Natural Language Processing. My expertise lies in applying these technologies to legal documents and the justice system.

📖 Master's degree project

  • Title: "Case Law Analysis with Machine Learning in Brazilian Court"
  • This research involved the automatic extraction of judicial decisions from web pages and their processing through Natural Language Processing techniques.
  • Subsequently, this data was used in the development of Machine Learning models to classify documents based on the judge's ruling.
  • The necessary statistical tests were also developed to validate the findings.
  • Finally, graphs and dashboards were created to enable analysis by stakeholders.
  • 🚀 It has received over 20 academic citations and was presented (in English) at a conference in Canada. Link

Skills

  • Machine Learning: PyTorch, TensorFlow, NLP, LLMs, Transformers, vision models, EDA, ETL, scikit-learn
  • Full-stack development: C# Blazor
  • Code: Python, C#, HTML, CSS, JavaScript, PowerShell, ...

📖 Publications

  • Case Law Analysis with Machine Learning in Brazilian Court Link
  • Programming the Nationality Identity in the Federal Constitution of Brazil Link

Blog posts

Projects portfolio

  • 🧠 Machine Learning projects

    • Analysis of Court Decisions using Machine Learning with Weak Supervision
      • Description:
        • Automatic extraction of documents from the internet and pre-processing using NLP techniques.
        • Development of Machine Learning models to classify documents based on the judge's ruling.
        • 📊 Statistical tests to validate the findings.
        • Graphs, plots, and dashboards.
      • Technologies: Python, scikit-learn, Scrapy, Google BigQuery, Snorkel
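As a rough illustration of weak supervision, the sketch below labels decisions with a few keyword rules ("labeling functions") and combines their votes. The rule names, keywords, and label values are hypothetical and are not taken from the project. Snorkel normally combines votes with a learned label model; a plain majority vote is used here as a simpler stand-in.

```python
from collections import Counter

# Label values; ABSTAIN means "this rule has no opinion".
ABSTAIN, DENIED, GRANTED = -1, 0, 1

# Hypothetical keyword rules. Real rules would be written against
# the actual wording of the court decisions.
def lf_granted(text):
    return GRANTED if "appeal granted" in text.lower() else ABSTAIN

def lf_denied(text):
    return DENIED if "appeal denied" in text.lower() else ABSTAIN

def lf_dismissed(text):
    return DENIED if "dismissed" in text.lower() else ABSTAIN

LABELING_FUNCTIONS = [lf_granted, lf_denied, lf_dismissed]

def weak_label(text):
    """Combine the votes of all labeling functions by majority vote."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    # A tie between the two most common labels is treated as "no label".
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]

decisions = [
    "The appeal is dismissed. Appeal denied.",
    "Appeal granted in full.",
    "The hearing was postponed.",
]
labels = [weak_label(d) for d in decisions]
```

The resulting noisy labels can then serve as training targets for a conventional classifier, with abstained documents left out of the training set.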
  • 💻 Fullstack projects

    • Materiale
      • A solution for managing construction-material budgets for stores that serve many clients. Deployed on Supabase serverless services. The system runs entirely on the front end, using C# Blazor WebAssembly and the Supabase Realtime Postgres database.
      • Technologies: C#, Blazor, Supabase, HTML, CSS
  • 🤖 LLM AI Agents projects

    • LLM RAG Agent Knowledgebase

      • Full-stack AI project to talk with personal documents. 🚀
      • AI agent for chatting about ingested document files.
      • 📁 Handles file ingestion, vector stores, user chat, and advanced search queries.
      • 🛠️ Key technologies:
        • LangGraph: Agent orchestration.
        • FastAPI: Backend framework.
        • Unstructured Package: File ingestion and OCR.
        • Weaviate: Vector store.
        • 🐳 Docker Compose: Weaviate containerization.
        • C# Blazor: Frontend framework.
        • Backend (FastAPI):
          • Processes diverse files and performs OCR in Portuguese.
          • Improves semantic-similarity retrieval by splitting documents into chunks.
          • Engages in conversation, supports tool use, and saves history via SQLite.
          • Implements keyword, semantic, and hybrid search.
        • Frontend (C# Blazor):
          • 🚀 Runs in the browser with WebAssembly.
          • Fetches data from the backend API.
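The chunk splitting mentioned for the backend can be sketched as follows. This is a minimal word-based splitter with overlapping windows; the function name, the default sizes, and the choice of words as the unit are assumptions for illustration, not the project's actual settings.

```python
def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

# Tiny example: ten words, windows of four words sharing two words each.
sample = " ".join(f"w{i}" for i in range(10))
chunks = split_into_chunks(sample, chunk_size=4, overlap=2)
```

Overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, which tends to help when each chunk is embedded and searched separately.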
    • 🤖 Machine Learning Interview Preparation Quiz Trainer - Anvil framework version

      • Problem Addressed: Lack of machine-learning-specific quizzes with built-in progress tracking and memorization techniques.
      • 💡 Solution:
        • 🐍 Python: Completely coded in Python using the Anvil framework (frontend and backend).
        • 📝 Customized Prompts: Create better questions by prompting ChatGPT with subject texts.
        • Integrated Explanations: Use LLMs to explain topics within the app.
        • 📊 Result Tracking: Track study progress and quiz results using a Supabase backend.
        • Cloud Accessibility: Hosted in the Anvil cloud.
      • 🔑 Key Points:
        • Enhanced Learning: Improved question creation with customized prompts.
        • Seamless Knowledge Access: Direct topic explanations from the Gemini model.
        • Progress Tracking: Monitor study progress and quiz outcomes.
        • Anywhere Access: Use the app on mobile devices via cloud hosting.
    • 📚 English sentence creator

      • Prototype Purpose: Assists students in learning English with tech industry sentences.
      • 📝 Transcribe Audio Files: Use the Whisper model from OpenAI to transcribe audio to text.
      • Sentence Separation: Separate transcribed sentences for better language model understanding.
      • Generate Tech Vocabulary Sentences: Adapt existing course content with tech vocabulary.
      • 🎙️ Convert Text to Speech: Use Microsoft's speecht5_tts model for speech synthesis.
      • 🎵 Process Audio Files: Convert generated audio to MP3 format.
      • 🔑 Key Points:
        • Transcription Accuracy: Ensure high accuracy using Whisper model.
        • Sentence Separation: Develop methods to cleanly segment sentences.
        • Tech Vocabulary Adaptation: Adapt sentences to include tech terms.
        • 🔊 Speech Conversion: Ensure natural and clear text-to-speech using speecht5_tts.
        • 🎶 Audio Processing: Convert and optimize audio files to MP3.
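The sentence-separation step could look roughly like the sketch below, which splits a transcript wherever ., ! or ? is followed by whitespace. The function name and the regular expression are assumptions for illustration; the README does not say how the project segments sentences.

```python
import re

# Whitespace that directly follows sentence-ending punctuation.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

def split_sentences(transcript):
    """Split a transcript into sentences at ., ! or ? followed by whitespace."""
    parts = _SENTENCE_END.split(transcript.strip())
    return [p.strip() for p in parts if p.strip()]

sentences = split_sentences("We deploy on Fridays. Is the build green?  Ship it!")
```

A rule this simple also splits after abbreviations such as "e.g.", so a real pipeline would likely need extra handling for those cases.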
    • 📧🤖 LLM Agent Gmail Parser Better RAG

      • Problem Addressed: Challenges in indexing emails for RAG applications or knowledge extraction.
        • Issues include noise and garbage in text and dealing with email threads.
      • 💡 Solution:
        • Utilized various prompt techniques to extract crucial information from emails.
        • Found that report-style summaries are more effective than generic summaries.
      • 🌟 Key Points:
        • Noise Reduction: Implemented techniques to filter out irrelevant information.
        • Thread Handling: Developed methods to accurately parse and summarize email threads.
        • 📝 Report-Style Summaries: Retain more essential information than generic summaries.
        • 🔧 Prompt Engineering: Experimented with different prompt structures to improve extraction of valuable insights.
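The project relies on prompt techniques for extraction. As a complementary illustration of noise reduction and thread handling, the sketch below is a rule-based pre-filter that drops quoted replies and the signature before any text reaches a model. The function name and the patterns are assumptions, not the project's code.

```python
import re

# Line that typically introduces a quoted earlier message,
# for example "On Mon, Jan 1, 2024, Ana wrote:".
_REPLY_HEADER = re.compile(r"^On .+ wrote:$")

def clean_email_body(body):
    """Drop quoted replies and the signature from a plain-text email body."""
    kept = []
    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith(">"):
            continue  # quoted text from an earlier message in the thread
        if _REPLY_HEADER.match(stripped):
            break  # everything after this header belongs to the quoted thread
        if stripped == "--":
            break  # conventional signature delimiter
        kept.append(stripped)
    return "\n".join(kept).strip()

email = (
    "Hi Ana,\n"
    "The report is attached.\n"
    "\n"
    "On Mon, Jan 1, 2024, Ana wrote:\n"
    "> Please send the report.\n"
)
cleaned = clean_email_body(email)
```

Only the newest message in the thread survives, which keeps repeated quoted text out of the index.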
    • 🤖 Machine Learning Interview Preparation Trainer - legacy Streamlit version

      • Problem Addressed: Lack of specific machine learning quizzes and progress tracking.
      • 💡 Solution:
        • 📝 Customized Prompts: Create better questions by prompting ChatGPT with subject texts.
        • Integrated Explanations: Use the Gemini model to explain topics within the app.
        • 📊 Result Tracking: Track study progress and quiz results using a Supabase backend.
        • Cloud Accessibility: Streamlit UI hosted in the cloud for mobile access.
      • 🔑 Key Points:
        • Enhanced Learning: Improved question creation with customized prompts.
        • Seamless Knowledge Access: Direct topic explanations from the Gemini model.
        • Progress Tracking: Monitor study progress and quiz outcomes.
        • Anywhere Access: Use the app on mobile devices via cloud hosting.

How to reach me 📫
