Coder Social home page Coder Social logo

mambafacekiss's Introduction

license datasets
apache-2.0
huggan/anime-faces

Mamba face kiss

KISS

This repo contains two Keep It Simple Stupid anime face generators that generates 64x64 faces from 8x8 provided images.

Basic idea was to take 64x64 anime faces dataset(https://huggingface.co/datasets/huggan/anime-faces), resize it to 8x8, then teach the model to restore original images, intuition is that after that if new unseen images are provided, it will make some face.

Validation

Mamba is being fed a sequence [A][A]...[A][SEP][B][B][B]...[B] where there are 64 [A] that came from the 8x8 draft. there are 64x64 [B]s that are initially are upscaled draft(nearest neighbor) with addition of PAE. Model run several layers of mamba, and spits last 64x64 into RGB image. ([SEP] is not used for anything significant other than BERT has it to separate sentences, so I used it too as placeholder for command "Upscale from here")

Two models are used.

RNN goess brr (one way)

One(imgen3test.ipynb and imgen3.py) always feeds images from top-left pixel to bottom-right pixel row by row

Non-flip image

"Bi-directional"

Another take(imgen3test_flip.ipynb and imgen3_flip.py) feed from top-left pixel to bottom-right pixel in every even layer and every odd layer sees upscaled images in reverse order

Flip image

This flip version also uses way more parameters and different dtype. I didn't notice that much difference.

Command line tool

Simple script can be used to call the model on a single image

$ cli_imgen3_flip ./krita/face1.png face1.out.png

python cli_imgen3_flip.py ./krita/face1.png /tmp/face1.png
Weight path is data/image-flip-weights-1024x4-torch.bfloat16.bin
Loading the model
Loading 8x8 input image from ./krita/face1.png
Writing 64x64 image to /tmp/face1.png

It's not really good way to use, comparing to calling through jupyter it though: mamba2 is implemented using triton and it takes around 30 seconds to initialize the model each time (on Raider GE76).

Recreating

Training is done in imgen3(_flip)?.py. Testing is in notebook. Image_utils should provide path to anime faces dataset.

Naming and configuring

Name imgen3 comes from "image generation 3". Two other attemts are not that interesting to even backup them.

I'm too lazy to pass configuration around so parameters are hardcoded in the beginning of the file.

mambafacekiss's People

Contributors

maykeye avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.