fbms's Introduction

💥 What's up?

Currently working as an independent consultant. I use my expertise in recommendation systems to help fast-growing startups build out their RAG applications. I am also the creator of Instructor and Flight, and an ML and data science educator.

Support

If you want to support me, you can sponsor me on GitHub and subscribe to my newsletter.


  • 567 Advisors - 2023 - Present
  • Creator of Instructor - 2023 - Present
  • Sabbatical @ South Park Commons - 2023 - Present
  • Staff Machine Learning Engineer @ Stitchfix - 2016, 2018-2023
  • Previously: Meta, ActionIQ, NYU, Meltwater - 2013-2018
  • Computational Mathematics and Statistics @ University of Waterloo

RAG (Retrieval-Augmented Generation)

Career and Personal Development

Talks and Podcasts

fbms's People

Contributors

henryboldi, jxnl, ttaylorr

fbms's Issues

Systems.

This is a bit hand-wavy, but the people working on the datastore and on systems should talk to each other and think of smart ways to automate the systems involved and connect all the pieces.

Bulk Loading Scripts

Now that MongoDB is set up, we'll need a class that can fetch everything from it and transform it for our training algorithms.
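One possible shape for such a class is sketched below. It is only an illustration: `BulkLoader`, `FakeCollection`, `to_training_row`, and the `message` and `label` field names are hypothetical, and the loader assumes nothing about the collection beyond a `find()` method that yields dicts (which is how a pymongo collection behaves).

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List


class BulkLoader:
    """Pull every document from a collection and transform it for training.

    `collection` is anything with a `find()` method that returns an iterable
    of dicts. `transform` is a hypothetical per-document hook.
    """

    def __init__(self, collection: Any, transform: Callable[[Dict], Dict]):
        self.collection = collection
        self.transform = transform

    def iter_rows(self) -> Iterator[Dict]:
        # Stream documents one at a time so a large group does not have to
        # fit in memory before it is transformed.
        for doc in self.collection.find():
            yield self.transform(doc)

    def load(self) -> List[Dict]:
        return list(self.iter_rows())


class FakeCollection:
    """In-memory stand-in for a MongoDB collection, for local testing."""

    def __init__(self, docs: Iterable[Dict]):
        self._docs = list(docs)

    def find(self) -> Iterator[Dict]:
        return iter(self._docs)


def to_training_row(doc: Dict) -> Dict:
    # Assumed field names: "message" and "label" are illustrative only.
    return {"text": doc.get("message", "").lower(), "label": doc.get("label")}
```

Passing the collection in, rather than opening a connection inside the class, keeps the loader testable without a running database.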

Classification

We will need automated ways to train and test our algorithms and serialize them into our data store.

We will need to think of relevant features and how we can effectively extract them. More to come later.
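As a rough sketch of the train, test, and serialize loop, the toy example below uses bag-of-words counts as placeholder features and JSON as the serialized form, since a JSON string maps directly onto a document field. The classifier, its name, and the labels are all assumptions made for illustration; the real features and algorithm are still undecided.

```python
import json
from collections import Counter
from typing import Dict, List, Tuple


def extract_features(text: str) -> Dict[str, int]:
    # Bag-of-words counts: a placeholder until real features are chosen.
    return dict(Counter(text.lower().split()))


class WordVoteClassifier:
    """Toy classifier: a label's score is the overlap with words seen for it."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Dict[str, int]] = {}

    def fit(self, examples: List[Tuple[str, str]]) -> "WordVoteClassifier":
        for text, label in examples:
            counts = self.word_counts.setdefault(label, {})
            for word, n in extract_features(text).items():
                counts[word] = counts.get(word, 0) + n
        return self

    def predict(self, text: str) -> str:
        features = extract_features(text)
        scores = {
            label: sum(counts.get(w, 0) * n for w, n in features.items())
            for label, counts in self.word_counts.items()
        }
        return max(scores, key=scores.get)

    def to_json(self) -> str:
        # The resulting string can be stored as a field in the data store.
        return json.dumps(self.word_counts)

    @classmethod
    def from_json(cls, blob: str) -> "WordVoteClassifier":
        model = cls()
        model.word_counts = json.loads(blob)
        return model


def accuracy(model: WordVoteClassifier, examples: List[Tuple[str, str]]) -> float:
    hits = sum(model.predict(text) == label for text, label in examples)
    return hits / len(examples)
```

A scheduled job could call `fit`, check `accuracy` on held-out posts, and write `to_json()` to the data store only when the score is acceptable.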

Queuing System

A lot of discussion needs to be had about how we want to do all the processing.

Will we be using a message broker? What technologies will be required?

Some research will need to be done on:

  • celery
  • gevent
  • multiprocessing
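To make the trade-off concrete, here is a minimal sketch that uses only the standard library's `concurrent.futures` rather than any of the options listed above. `moderate_post` and its "free money" rule are hypothetical placeholders for the real per-post work.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def moderate_post(post: Dict) -> Dict:
    # Placeholder for the real work (classification, flagging, commenting).
    flagged = "free money" in post["message"].lower()
    return {"id": post["id"], "flagged": flagged}


def process_posts(
    posts: Iterable[Dict],
    executor: Executor,
    worker: Callable[[Dict], Dict] = moderate_post,
) -> List[Dict]:
    # Executor.map returns results in the same order as the input posts.
    return list(executor.map(worker, posts))


if __name__ == "__main__":
    sample = [
        {"id": 1, "message": "FREE MONEY here"},
        {"id": 2, "message": "Anyone have notes from today?"},
    ]
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(process_posts(sample, pool))
```

Because `ThreadPoolExecutor` and `ProcessPoolExecutor` share the `Executor` interface, the same `process_posts` function works with either. Moving to a broker-backed system such as celery would replace the executor but could keep the worker function unchanged.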

Access Tokens exposed!

Just an FYI: a bunch of your MongoDB and Facebook secrets are exposed. I wouldn't want you guys to lose all this stuff, so maybe hide them? Or just change the secrets!

Otherwise: Great work! It looks very promising :)
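One common way to keep secrets out of the repository is to read them from environment variables at startup. The sketch below assumes two hypothetical variable names, `MONGODB_URI` and `FACEBOOK_ACCESS_TOKEN`; the project may well name them differently.

```python
import os
from typing import Dict, Mapping

# Hypothetical variable names, chosen for illustration only.
REQUIRED = ("MONGODB_URI", "FACEBOOK_ACCESS_TOKEN")


def load_secrets(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    # Fail fast with a clear message instead of falling back to values
    # hard-coded in source control.
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

Secrets that were already committed stay in the repository history, so rotating them is still necessary.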

Data store

We should have a way to write to our datastore and retain group ID information.

We will need databases for groups, group content, and group users. Schemas are still up in the air; I would love to discuss some options.

  • write to datastore.
  • read from datastore for admin dashboard.
  • schedule reads to datastore.
  • properly bootstrap from datastore.
  • properly prepare data for ML api.

Subject to change.

What we want to do is persist all the group data so it is searchable. Not only will it be a data store but it will be used for spam detection and flagging users.

We can consider both SQL and NoSQL technologies. I personally think MongoDB may be the way to go due to its synergy with Python dictionaries.
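As a starting point for the schema discussion, the three kinds of records could be sketched as dataclasses that convert to plain dictionaries, which is the shape a document store accepts. Every field name here is an assumption; the only requirement taken from the text above is that each record carries the group ID.

```python
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class Group:
    group_id: str
    name: str


@dataclass
class GroupPost:
    group_id: str  # retained on every record so content stays searchable by group
    post_id: str
    user_id: str
    message: str
    flagged: bool = False


@dataclass
class GroupUser:
    group_id: str
    user_id: str
    flag_count: int = 0


def to_document(record) -> Dict:
    # A plain dict is what a document store such as MongoDB takes on insert.
    return asdict(record)
```

For example, `to_document(GroupPost("g1", "p1", "u1", "hello"))` yields a dictionary with all five fields, including `flagged` set to `False`.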

Admin dashboards

We will also need a way for admins to access and interact with the data we provide.

Facebook Layer

To automate the moderation of Facebook groups, the first thing we need is a consistent way of accessing Facebook.

Essentially, this is a class that wraps the Graph API and gives us well-defined access to groups, posts, and actions on those posts.

It will need to do the following:

  • get a list of group IDs associated with the group
  • access the post content of a group
  • access the comments of a post
  • access the user IDs of posts and comments
  • comment on a post
  • bulk read the content of a group when invited to the group
  • tag a user in a comment

More will be added as I come up with more requirements.
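A few of the requirements above could be sketched as follows. This is not a tested integration: the class name, the injected `transport` callable, and the edge paths (`{group-id}/feed`, `{post-id}/comments`) are assumptions that would need to be checked against the current Graph API reference and the permissions available to the app.

```python
from typing import Callable, Dict, List, Optional
from urllib.parse import urlencode

GRAPH_URL = "https://graph.facebook.com"

# A transport takes (method, url, form_data) and returns the decoded JSON
# body. Injecting it keeps the wrapper independent of any HTTP library.
Transport = Callable[[str, str, Optional[Dict[str, str]]], Dict]


class FacebookGroupClient:
    """Hypothetical thin wrapper around Graph API group edges."""

    def __init__(self, access_token: str, transport: Transport):
        self.access_token = access_token
        self.transport = transport

    def _url(self, path: str) -> str:
        query = urlencode({"access_token": self.access_token})
        return f"{GRAPH_URL}/{path}?{query}"

    def group_posts(self, group_id: str) -> List[Dict]:
        # Assumed edge: {group-id}/feed
        body = self.transport("GET", self._url(f"{group_id}/feed"), None)
        return body.get("data", [])

    def post_comments(self, post_id: str) -> List[Dict]:
        # Assumed edge: {post-id}/comments
        body = self.transport("GET", self._url(f"{post_id}/comments"), None)
        return body.get("data", [])

    def comment_on_post(self, post_id: str, message: str) -> Dict:
        return self.transport(
            "POST", self._url(f"{post_id}/comments"), {"message": message}
        )

    @staticmethod
    def author_ids(items: List[Dict]) -> List[str]:
        # Assumes each post or comment carries a "from" object with an "id".
        return [item["from"]["id"] for item in items if "from" in item]
```

Bulk reads and user tagging are left out because they depend on pagination and mention-formatting details that the source does not specify.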

Overall required tasks

I'm going to slowly try to refactor this thing. I would love some help.

Here are some core components of the Mark Sweep project. Each will have its own main issue, and I would love it if someone could champion a component.

  1. Facebook Access Layer.
  2. MongoDB Access Layer.
  3. Queuing Systems (async? multiprocessing? celery? gevent?).
  4. Server-side shenanigans.
  5. Management.
  6. Machine Learning.
  7. Admin console for group moderators.

The first milestone is to have a version 0.1.0 release that can automatically retrain daily and moderate all groups.
