
manga-newsletter's People

Contributors

kashyapdevesh

manga-newsletter's Issues

Optimization in Machine Learning Model

In my project, I'm currently using a text summarization model and a sentiment analysis model provided by the Hugging Face Hub.

I intend to replace them with custom models that have lower latency and faster processing time.

If you have any suggestions on how to proceed, given the nature of the work, please comment or open a pull request with a sample or prototype.

Also, if you need the dataset, either for training a model or just for observation, please comment.
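Before swapping models, it may help to record a latency baseline so that any replacement can be compared against it. The following is a minimal sketch, not the project's code: `measure_latency` and `dummy_summarizer` are hypothetical names, and the dummy function only stands in for whatever callable wraps the current Hugging Face model.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(model_fn: Callable[[str], object],
                    samples: List[str],
                    repeats: int = 3) -> Dict[str, float]:
    """Time model_fn on every sample and return summary statistics in seconds."""
    timings = []
    for text in samples:
        for _ in range(repeats):
            start = time.perf_counter()
            model_fn(text)
            timings.append(time.perf_counter() - start)
    return {
        "calls": float(len(timings)),
        "mean_s": statistics.mean(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }


def dummy_summarizer(text: str) -> str:
    """Hypothetical stand-in for the real summarization model call."""
    return text[:20]


if __name__ == "__main__":
    # Replace dummy_summarizer with the real model call to get a baseline.
    print(measure_latency(dummy_summarizer, ["a long chapter description"] * 5))
```

Running the same function against both the current model and a candidate replacement, on the same samples, gives a like-for-like comparison.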

Implement Python classes and objects

The Python code in the project is currently written as standalone functions, following a functional style. These functions can easily be converted into Python classes and made modular.

This issue is a good way to get a rough overview of the project, and it is beginner-friendly: the codebase is still growing, so the conversion is easy to do at this stage.
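As an illustration of the kind of conversion meant here, the sketch below shows a standalone function next to a class that wraps the same behaviour. Both `build_summary` and `SummaryGenerator` are hypothetical names invented for this example; they are not taken from the project's code.

```python
from dataclasses import dataclass


# Before: a hypothetical standalone function in the current style.
def build_summary(title: str, description: str, max_chars: int = 80) -> str:
    text = f"{title}: {description}"
    return text if len(text) <= max_chars else text[: max_chars - 3] + "..."


# After: the same behaviour wrapped in a class that owns its configuration.
@dataclass
class SummaryGenerator:
    max_chars: int = 80

    def summarize(self, title: str, description: str) -> str:
        """Join title and description, truncating to max_chars if needed."""
        text = f"{title}: {description}"
        if len(text) <= self.max_chars:
            return text
        return text[: self.max_chars - 3] + "..."


if __name__ == "__main__":
    generator = SummaryGenerator(max_chars=30)
    print(generator.summarize("Example Manga", "a new chapter was released"))
```

The class version keeps settings such as `max_chars` in one place, so callers no longer need to pass them to every function call.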

Parallelizing the Scrapers

Currently, the program executes sequentially, and it takes a lot of time to process even a single picture.

I intend to run this program on a real-time feed, and on such a system this processing time would be unacceptable. I'm currently stuck on another aspect of the project, so I can't spend time on this one.

Any help that reduces the overall processing time would be very beneficial and highly appreciated.
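One possible starting point, assuming the scrapers spend most of their time waiting on network I/O, is a thread pool from the standard library. The sketch below is illustrative only: `scrape_page` is a hypothetical stand-in that sleeps instead of making a request, and the URLs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def scrape_page(url: str) -> str:
    """Hypothetical stand-in for one scraper call; sleeps to mimic network I/O."""
    time.sleep(0.1)
    return f"content of {url}"


def scrape_all(urls: List[str], max_workers: int = 8) -> List[str]:
    """Scrape the URLs concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(scrape_page, urls))


if __name__ == "__main__":
    urls = [f"https://example.com/chapter/{i}" for i in range(8)]
    start = time.perf_counter()
    pages = scrape_all(urls)
    print(len(pages), "pages in", round(time.perf_counter() - start, 2), "s")
```

Threads help mainly with I/O-bound work such as HTTP requests. If the slow step turns out to be CPU-bound, `concurrent.futures.ProcessPoolExecutor` offers the same interface with separate processes and may be the better fit.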

Suggest an Idea, Raise an issue, Push a PR 🚀

Beyond the issues already listed, if you have suggestions about any specific aspect of the project, feel free to raise an issue describing the suggestion.

The overall structure of the project at its current stage is as follows:
[Image: photo_2022-10-13_16-57-58]

Help with documentation and code comments

The entire code was written over a very short period of time, and I didn't pay much attention to code commenting styles and documentation paradigms.

Now that the scope of the project is increasing, I'm having trouble keeping logs and keeping track of the project.

I need help organizing the comments in the code files. If you have any further ideas or suggestions, please share them.
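As one possible convention to start from, the sketch below pairs a structured docstring with the standard `logging` module. It assumes that "keeping logs" refers to runtime logging, which the issue does not confirm. The function `clean_title` and the logger name are hypothetical, and the docstring layout is just one common style, not something the project has adopted.

```python
import logging

logger = logging.getLogger("manga_newsletter")  # hypothetical logger name


def clean_title(raw_title: str) -> str:
    """Normalize a scraped title.

    Args:
        raw_title: Title text exactly as scraped, possibly with extra whitespace.

    Returns:
        The title with surrounding whitespace removed and inner runs of
        whitespace collapsed to single spaces.
    """
    cleaned = " ".join(raw_title.split())
    logger.debug("cleaned title %r -> %r", raw_title, cleaned)
    return cleaned


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    print(clean_title("  Example   Manga \n"))
```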

Multithreaded Queue Support

The entire structure of the project is as follows:

[Image: photo_2022-10-13_16-57-58]

I am facing a critical issue implementing a multithreaded queue data structure, which should connect the Manganelo Scraper, Page Scraper, Sentiment Analysis, Summary Generator, and final Newsletter Generation files.

The idea behind using a multithreaded queue is that the whole pipeline (from scraping to newsletter generation) can be treated as a transaction: if the process fails at any step for any reason, the failed item can be pushed back into a queue shared among the running workers.

This would give me a working solution that I could use as a placeholder while I move forward with the project. I know an optimal solution may be more complex, or may even require writing a custom multithreaded priority queue, but I need a working solution at the moment.

PS:
This is the first time I am working with concurrency control in Python, and I don't have much prior knowledge of it. If you have another solution, suggestion, or idea, please do comment.
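A placeholder version of the shared queue described above can be built from the standard library's thread-safe `queue.Queue`. The sketch below is one possible shape, not the project's design: `run_pipeline`, `MAX_ATTEMPTS`, and `process_chapter` are hypothetical names, the retry limit is an assumption, and `process` stands in for the full scrape-to-newsletter chain for one item.

```python
import queue
import threading
from typing import Callable, List, Tuple

MAX_ATTEMPTS = 3  # assumed retry limit, not taken from the project


def run_pipeline(items: List[str],
                 process: Callable[[str], str],
                 num_workers: int = 4) -> Tuple[List[str], List[str]]:
    """Run process(item) on worker threads; re-queue failures up to MAX_ATTEMPTS."""
    tasks: queue.Queue = queue.Queue()
    results: List[str] = []
    failed: List[str] = []
    lock = threading.Lock()  # protects results and failed

    for item in items:
        tasks.put((item, 1))

    def worker() -> None:
        while True:
            task = tasks.get()
            if task is None:  # sentinel: no more work
                tasks.task_done()
                return
            item, attempt = task
            try:
                output = process(item)
                with lock:
                    results.append(output)
            except Exception:
                if attempt < MAX_ATTEMPTS:
                    tasks.put((item, attempt + 1))  # push the failed item back
                else:
                    with lock:
                        failed.append(item)
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    tasks.join()  # wait until every item has succeeded or been given up on
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results, failed


if __name__ == "__main__":
    def process_chapter(name: str) -> str:
        """Hypothetical pipeline step; a real one would scrape and summarize."""
        return name.upper()

    print(run_pipeline(["chapter-1", "chapter-2"], process_chapter))
```

`queue.Queue` is shared between threads of a single process. If the pipeline stages run as separate processes, `multiprocessing.Queue` would be the closer equivalent, and `queue.PriorityQueue` is available if ordering by priority becomes necessary.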
