mahalrs / newsgen

Multi-Modal Image Generation for News Stories

License: Apache License 2.0

Languages: TypeScript 8.01%, Jupyter Notebook 24.97%, Python 67.03%
Topics: clip, dalle-mini, multi-modal, text-image, transformers, vqgan, vqgan-clip

newsgen's People

Contributors

arjunkrishna3367, mahalrs

newsgen's Issues

Fix VQGAN notebook training loop

Since the current VQGAN model is a plain PyTorch NN module and not a Lightning module, the training loop must make the calls that Lightning would otherwise handle: optimizer.zero_grad(), loss.backward(), optimizer.step(), etc.
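A minimal sketch of such a manual loop, assuming a single optimizer and PyTorch-style objects. The function name train_one_epoch and its parameters are hypothetical and not taken from the notebook. A full VQGAN setup typically also updates a discriminator with its own optimizer, which this sketch leaves out.

```python
def train_one_epoch(model, loader, optimizer, loss_fn) -> float:
    """Run one epoch of a manual training loop and return the mean loss.

    Assumed (hypothetical) interfaces, matching PyTorch conventions:
      model     -- a torch.nn.Module (callable, has .train())
      loader    -- an iterable of (inputs, targets) mini-batches
      optimizer -- a torch.optim.Optimizer
      loss_fn   -- callable returning a scalar tensor
    """
    model.train()  # enable training-mode behavior (dropout, batch norm)
    total_loss = 0.0
    num_batches = 0
    for inputs, targets in loader:
        optimizer.zero_grad()             # clear gradients from the previous step
        outputs = model(inputs)           # forward pass
        loss = loss_fn(outputs, targets)
        loss.backward()                   # backpropagate
        optimizer.step()                  # update parameters
        total_loss += float(loss.item())
        num_batches += 1
    return total_loss / max(num_batches, 1)
```

The function only relies on the methods named in this issue, so the same loop works with any model and optimizer that follow the PyTorch interface.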

Log images every 100 mini-batches

Currently, we log images every 1000 mini-batches during validation and testing. With a small dataset or a larger number of devices, each device processes too few mini-batches for enough images to be logged. We should either change the interval to 100 mini-batches or take this value as a command-line argument.
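One way the command-line option could look, as a sketch using argparse. The flag name --log-images-every and the helper should_log_images are assumptions for illustration, not names from the repository.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the image-logging interval (hypothetical flag)."""
    parser = argparse.ArgumentParser(description="Image logging options (sketch)")
    parser.add_argument(
        "--log-images-every",
        type=int,
        default=100,
        help="log images every N mini-batches during validation and testing",
    )
    return parser


def should_log_images(batch_idx: int, every_n: int) -> bool:
    """Return True when images should be logged for this mini-batch index."""
    # A non-positive interval disables image logging entirely.
    return every_n > 0 and batch_idx % every_n == 0
```

With this shape, the validation and test steps would call should_log_images(batch_idx, args.log_images_every) instead of comparing against a hard-coded 1000.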

Fix VQGAN to use PyTorch Lightning

The original VQGAN implementation used a PyTorch Lightning module, but we converted it to a plain PyTorch NN module. Distributed training is much easier with PyTorch Lightning, so let's convert it back to a Lightning module.

crawler: normalize urls

The crawler needs to normalize URLs. For example, https://www.example.com and https://www.example.com/ refer to the same page, and the crawler shouldn't treat them as separate URLs.
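A minimal sketch of one possible normalization using the standard library's urllib.parse. The function name normalize_url and the exact set of rules (lowercase scheme and host, drop default ports and fragments, treat an empty path as "/") are assumptions. The issue itself only names the trailing-slash case.

```python
from urllib.parse import urlsplit, urlunsplit

# Ports that are implied by the scheme and can be dropped.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` for duplicate detection.

    Limitations of this sketch: userinfo (user:pass@) is dropped and
    IPv6 host literals are not handled.
    """
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    netloc = host
    if parts.port is not None and DEFAULT_PORTS.get(scheme) != parts.port:
        netloc = f"{host}:{parts.port}"
    # An empty path and "/" refer to the same resource.
    path = parts.path or "/"
    # Fragments are client-side only, so they are removed.
    return urlunsplit((scheme, netloc, path, parts.query, ""))
```

Only the empty path is rewritten. Trailing slashes on deeper paths such as /a and /a/ are left alone because servers may treat them as different resources.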

Add script to encode dataset (image tokens)

Using the VQGAN encoder, convert all images in the given dataset to image tokens. These image tokens, along with the output of the BART encoder (the encoded captions or news headlines), are fed to the BART decoder to train it to generate image tokens given the encoded captions or headlines.

We could do this as part of the data transform step; however, doing it beforehand will speed up training.
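A sketch of what such an offline encoding script could look like. The function encode_dataset, the encode_batch callable (a stand-in for the VQGAN encoder plus quantizer), and the JSON Lines output format are all assumptions for illustration. The real script would load images and call the actual model.

```python
import json
from typing import Callable, Iterable, List, Sequence, Tuple


def encode_dataset(
    samples: Iterable[Tuple[str, object]],
    encode_batch: Callable[[list], Sequence[Sequence[int]]],
    out_path: str,
    batch_size: int = 32,
) -> int:
    """Encode (sample_id, image) pairs to image tokens and cache them on disk.

    `encode_batch` is a hypothetical callable that maps a list of images to
    one sequence of integer token ids per image (with tensors, convert via
    .tolist() first). Returns the number of samples written.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    written = 0
    ids: List[str] = []
    images: list = []

    def flush(out) -> None:
        nonlocal written
        # One JSON object per line: {"id": ..., "tokens": [...]}
        for sample_id, tokens in zip(ids, encode_batch(images)):
            out.write(json.dumps({"id": sample_id, "tokens": list(tokens)}) + "\n")
            written += 1
        ids.clear()
        images.clear()

    with open(out_path, "w", encoding="utf-8") as out:
        for sample_id, image in samples:
            ids.append(sample_id)
            images.append(image)
            if len(ids) == batch_size:
                flush(out)
        if ids:  # encode the final partial batch
            flush(out)
    return written
```

During training, the dataset would then read the cached tokens by id, so the VQGAN encoder never runs inside the training loop.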