behzadshomali / image-describe-pipe

For each input image, this app outputs the name, coordinates, and sentiment of each extracted face, along with a brief description of the scene's context.

Python 74.88% CSS 6.05% HTML 18.92% Dockerfile 0.15%
python deepface html bootstrap css flask

image-describe-pipe's Introduction

Hi there 👋

  • 🏷️ My name is Behzad Shomali
  • 🎓 Master's degree student of Computer Science
  • 📚 Highly interested in Deep Learning & Computer Vision
  • 👨🏻‍💻 Superfan of Python!
  • ☕️ Coffee lover!


image-describe-pipe's People

Contributors

behzadshomali, yasaminesmati

image-describe-pipe's Issues

Training (and dev) phase

Build the learning matrix based on the faces in the database.

Let's set the success bar at 85% accuracy.
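A minimal sketch of how the success bar could be checked during the dev phase. It assumes accuracy means the fraction of faces whose predicted name matches the labelled name; the function names `accuracy` and `meets_success_bar` are hypothetical and not taken from the repository.

```python
SUCCESS_BAR = 0.85  # the 85% target from this issue


def accuracy(predicted, actual):
    """Fraction of faces whose predicted name equals the labelled name."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def meets_success_bar(predicted, actual, bar=SUCCESS_BAR):
    """True when accuracy on the dev set reaches the agreed bar."""
    return accuracy(predicted, actual) >= bar
```

The check is meant to run on a held-out dev set, not on the faces used for training.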

Cover the literature

  • Collect a list of patents relevant to ImageDescribe using Google Patents. Consider only EU and US patents; ignore the many Chinese patents you will find.
  • Read and summarize those patents.
  • Position ImageDescribe against the most relevant patents by highlighting our project's strengths.

Do research on the database

  • What database architecture is required for the problem?
  • Which DBMS to use? Why?
  • What are the attributes and their data types?

Make ImageDescribe a web service

  • Inspired by Google Image Search, design a web service on top of the ImageDescribe engine so that the end user can simply type in a URL or upload a photo and receive descriptions immediately.
  • Given that the two main objectives of this project are accessibility and availability, an important (but easy) feature for the web service is text-to-speech. There should be a button next to the generated description to read it aloud.

Multi-face recognition

Let's make sure the solution we pick in #1 supports multiple faces. While the simpler version of the problem is to recognize one face per image, it is more natural to assume that one image may depict a group of people.
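A rough sketch of what the multi-face case could look like once faces are detected. It assumes each detected face has already been turned into an embedding vector by some recognition model, and that known people are stored as reference embeddings; the names `identify_faces` and `known_faces`, and the 0.8 threshold, are illustrative assumptions rather than the project's actual code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def identify_faces(face_embeddings, known_faces, threshold=0.8):
    """Return one name per detected face, or "unknown" below the threshold."""
    names = []
    for embedding in face_embeddings:
        best_name, best_score = "unknown", threshold
        for name, reference in known_faces.items():
            score = cosine_similarity(embedding, reference)
            if score >= best_score:
                best_name, best_score = name, score
        names.append(best_name)
    return names
```

The single-face version is just the special case where `face_embeddings` has one element, so picking a multi-face design from the start costs little.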

Mobile interface

After #17 is done, transform the ImageDescribe web service into an Android/iPhone app. The advantage of doing so is that the user can easily input images using her phone camera.

Database ER

After #3 is done.

  • Determine the concepts / tables.
  • Determine the attributes and their types.
  • Determine the relations / keys (primary and foreign).
  • Draw ER.
  • Determine necessary indexes.

AR extensions (future work)

  • Design Augmented Reality (AR) features for ImageDescribe.
  • After #19 is done, design the AR system so that the user can turn on her camera in video mode, point it at any photo or person in real time, and have ImageDescribe overlay the scene with name labels for faces and objects.

Can you run main.py without problem?

I ran the same code in the Colab notebook, but because our IPs are restricted by TensorFlow, I had some issues installing the required dependencies. Now I want to make sure that main.py can be executed successfully.

@behroozomidvar would you please test this? If everything works, please close this issue.
To test, you only need to install the dependencies listed in requirements.txt and run main.py.

Final deadline

Let's consider September 10, 2021 as the ultimate deadline for this project. @behzadshomali please define internal deadlines accordingly.

Do research on the implementation

  • What are the available implementations for the problem at hand?
  • Which implementation has the best trade-off between ease-of-use and efficiency?
  • Should we fork an existing solution or build from scratch? Why?
  • What is the pseudocode of the final solution?

Users' images location

@behroozomidvar do we store users' images on our own server, or do users have to upload their images to a cloud server and share the corresponding URLs with us?

Do research on the problem

  • What is the exact problem we are going to solve as "face recognition"?
  • What are the inputs to this problem? What are their corresponding data types?
  • What are the outputs of this problem?
  • What are different options for developing an algorithmic solution for the aforementioned problem?
  • What is the most efficient solution?

Face sentiments

The descriptions generated by ImageDescribe should not be limited to face names and objects; they should also report the sentiments in the faces.

For instance, in this famous photo of Ellen DeGeneres, ImageDescribe should be able to describe the following elements:

  • Names of people in the photo, including Bradley Cooper, Angelina Jolie, Brad Pitt, and Jennifer Lawrence.
  • The scene, which can be described for instance with words such as "ceremony" or "soirée".
  • The emotions and sentiments in the faces, e.g., "Bradley Cooper is smiling", "Jared Leto is surprised", and "Julia Roberts is laughing".
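The three elements above could be combined into one description roughly as follows. This is only a sketch: the `Face` record, its field layout, and the sentence templates are assumptions made for illustration, and the recognition, emotion, and scene results are taken as given inputs.

```python
from dataclasses import dataclass


@dataclass
class Face:
    name: str     # recognized name, or "unknown"
    box: tuple    # (x, y, width, height) in pixels; this layout is an assumption
    emotion: str  # e.g. "smiling", "surprised", "laughing"


def describe(faces, scene):
    """Build a plain-text description from faces and a scene label."""
    parts = []
    named = [f.name for f in faces if f.name != "unknown"]
    if named:
        parts.append("People in the photo: " + ", ".join(named) + ".")
    unknown = len(faces) - len(named)
    if unknown:
        parts.append(f"{unknown} unrecognized face(s).")
    parts.append(f"Scene: {scene}.")
    # One sentiment sentence per recognized face.
    for f in faces:
        if f.name != "unknown":
            parts.append(f"{f.name} is {f.emotion}.")
    return " ".join(parts)
```

Keeping the output as a single string makes it easy to hand to a text-to-speech component later.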

What is the next step?

Now that issue #2 is closed, what should the next step be? Shall we start searching for a suitable DBMS as discussed in #3, or implement the first part of our pipeline?

Remaining things before test phase

@yasminesmati would you please fix the problem with "Remove user logs", since you think foreign keys will solve it? Meanwhile, I will work on the function responsible for evaluating the user's input image (i.e., combining "detection.py" with "postgres.py").
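One way foreign keys might help here is a cascading delete, sketched below. This is an assumption about the intended fix, not the project's schema: the `users` and `logs` tables and their columns are hypothetical, and the example uses Python's built-in sqlite3 module so it runs without a server, whereas the project itself talks to PostgreSQL (which also supports `ON DELETE CASCADE`).

```python
import sqlite3


def make_db():
    """Create an in-memory database with a users table and dependent logs."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE users (
            id    INTEGER PRIMARY KEY,
            email TEXT NOT NULL UNIQUE
        );
        CREATE TABLE logs (
            id      INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
            message TEXT NOT NULL
        );
    """)
    return conn


def remove_user(conn, user_id):
    """Delete a user; the database removes that user's logs via the cascade."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
```

With this design the application never has to delete log rows itself, so no orphaned logs can remain after a user is removed.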

Doubt about format of logs

@behroozomidvar while developing the log formatting and the related code, I was unsure which format should be used as the basis of the logs:

  1. $(user_x) deleted an image
  2. $(user_x) deleted $(image_url)

Here $(.) denotes a placeholder that will be replaced with an actual value.
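Both candidate formats can be expressed as templates, which makes it cheap to switch later. The sketch below uses Python's standard `string.Template`, whose placeholder syntax is `$name` or `${name}` rather than the `$(name)` notation used above; the constant and function names are illustrative only.

```python
from string import Template

# Option 1: the log only says that an image was deleted.
LOG_GENERIC = Template("$user deleted an image")
# Option 2: the log also records which image was deleted.
LOG_WITH_URL = Template("$user deleted $image_url")


def format_log(template, **values):
    """Fill a log template; raises KeyError if a placeholder has no value."""
    return template.substitute(**values)
```

For example, `format_log(LOG_WITH_URL, user="user_x", image_url="https://example.com/a.jpg")` produces a line that keeps the image URL, which option 1 would lose.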

User sessions

  • Inspired by Google Photos, design user session functionality.
  • The user can sign up with an email address and then log in to her account to receive personalized info.
  • In her account, she can introduce and label faces, so that the ImageDescribe engine will be trained to recognize those faces in future uploaded photos.
