behzadshomali / image-describe-pipe

This app outputs the name, coordinates, and sentiment of each face extracted from an input image, along with a brief description of the scene's context.

Python 74.88% CSS 6.05% HTML 18.92% Dockerfile 0.15%

python deepface html bootstrap css flask

image-describe-pipe's Introduction

Hi there 👋

🏷️ My name is Behzad Shomali
🎓 Master's degree student of Computer Science
📚 Highly interested in Deep Learning & Computer Vision
👨🏻‍💻 Superfan of Python!
☕️ Coffee lover!

image-describe-pipe's Issues

Training (and dev) phase

Build the learning matrix based on the faces in the database.

Let's set the success bar at 85% accuracy.
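
As a minimal sketch of how the 85% bar could be checked, assuming accuracy means the fraction of faces whose predicted name matches the labeled name (the issue does not define the metric). The function names `accuracy` and `meets_success_bar` are hypothetical and not taken from the repository.

```python
SUCCESS_BAR = 0.85  # the 85% accuracy target stated in the issue


def accuracy(predicted, expected):
    """Return the fraction of predictions that equal the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


def meets_success_bar(predicted, expected, bar=SUCCESS_BAR):
    """Return True when the accuracy reaches the success bar."""
    return accuracy(predicted, expected) >= bar
```

For example, 18 correct names out of 20 labeled faces gives an accuracy of 0.9, which passes the bar, while 3 out of 4 does not.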

Cover the literature

Collect a list of patents relevant to ImageDescribe using Google Patents. Consider only EU and US patents; you will find many Chinese patents, which you can ignore.
Read and summarize those patents.
Position ImageDescribe against the most relevant patents by highlighting our project's strengths.

Do research on the database

What database architecture is required for the problem?
Which DBMS to use? Why?
What are the attributes and their data types?

Make ImageDescribe a web service

Inspired by Google Image Search, design a web service on top of the ImageDescribe engine so that the end-user can simply type in a URL or upload a photo and receive descriptions immediately.
Given that the two main objectives of this project are accessibility and availability, an important (but easy) feature for the web service is text-to-speech. There should be a button next to the generated description to read it aloud.

Let's make sure the solution we pick in #1 supports multiple faces. While the simpler version of the problem is to recognize one face per image, it is more natural to assume that one image may depict a group of people.

Mobile interface

After #17 is done, transform the ImageDescribe web service into an Android/iPhone app. The advantage of doing so is that the user can easily input images using her phone camera.

Database ER

After #3 is done.

Determine the concepts / tables.
Determine the attributes and their types.
Determine the relations / keys (primary and foreign).
Draw ER.
Determine necessary indexes.
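
The steps above could lead to something like the following sketch. Every table and column name here is hypothetical and only illustrates tables, typed attributes, primary and foreign keys, and indexes. It uses the standard-library `sqlite3` module so that it runs without a server; the issues elsewhere mention a `postgres.py` file, so the actual project may target PostgreSQL, where the types and syntax would differ slightly.

```python
import sqlite3

# Hypothetical schema: users own images, and each image holds zero or more faces.
SCHEMA = """
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE images (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    url     TEXT NOT NULL
);
CREATE TABLE faces (
    id        INTEGER PRIMARY KEY,
    image_id  INTEGER NOT NULL REFERENCES images(id) ON DELETE CASCADE,
    name      TEXT,              -- NULL when the face is not recognized
    x         INTEGER NOT NULL,  -- bounding box, in pixels
    y         INTEGER NOT NULL,
    width     INTEGER NOT NULL,
    height    INTEGER NOT NULL,
    sentiment TEXT
);
-- Indexes on the foreign keys speed up "all images of a user" style lookups.
CREATE INDEX idx_images_user ON images(user_id);
CREATE INDEX idx_faces_image ON faces(image_id);
"""


def create_db(path=":memory:"):
    """Create the sketch schema and return an open connection."""
    conn = sqlite3.connect(path)
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With `ON DELETE CASCADE`, deleting a user also removes that user's images and the faces found in them, which is one possible answer to how dependent rows should be cleaned up.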

Populate DB

Once #5 is done, insert example data into the DB.

AR extensions (future work)

Design Augmented Reality (AR) features for ImageDescribe.
After #19 is done, design the AR system so that the user can turn on her camera in video mode, point it at any photo or person in real time, and have ImageDescribe overlay the scene with name labels for faces and objects.

Can you run main.py without problem?

I ran the same code in the Colab notebook, but because our IPs are restricted by TensorFlow, I had some issues installing the required dependencies. Now I want to make sure that main.py can be executed successfully.

@behroozomidvar would you please test this? If everything works, please close this issue.
To test, you only need to install the dependencies listed in requirements.txt and run main.py.

Add indexes to the database (future work)

Final deadline

Let's consider September 10, 2021 as the ultimate deadline for this project. @behzadshomali please define internal deadlines accordingly.

Do research on the implementation

What are the available implementations for the problem at hand?
Which implementation has the best trade-off between ease-of-use and efficiency?
Should the solution be forked from an existing project or built from scratch? Why?
What is the pseudocode of the final solution?

Implement database

Implement the DB based on #4 specifications.
Implement the indexes.

Users' images location

@behroozomidvar do we store the users' images on our own server, or do users have to upload their images to a cloud server and share the corresponding URLs with us?

Do research on the problem

What is the exact problem we are going to solve as "face recognition"?
What are the inputs to this problem? What are their corresponding data types?
What are the outputs of this problem?
What are different options for developing an algorithmic solution for the aforementioned problem?
What is the most efficient solution?

Face sentiments

The descriptions generated by ImageDescribe should not be limited to face names and objects; they should also report the sentiments in the faces.

For instance, in this famous photo by Ellen DeGeneres, ImageDescribe should be able to describe the following elements:

Names of people in the photo, including Bradley Cooper, Angelina Jolie, Brad Pitt, and Jennifer Lawrence.
The scene, which can be described for instance with words such as "ceremony" or "soirée".
The emotions and sentiments in the faces, e.g., "Bradley Cooper is smiling", "Jared Leto is surprised", and "Julia Roberts is laughing".
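
The three elements above can be combined into one description. The following is a minimal sketch under stated assumptions: the `Face` class, its fields, and the `describe` function are hypothetical and do not reflect the repository's actual code. It also assumes that detection and recognition have already produced one record per face, so an image with a group of people simply yields a longer list.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Face:
    name: Optional[str]             # None when the face is not recognized
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels
    sentiment: str                  # e.g. "smiling", "surprised"


def describe(faces: List[Face], scene: str) -> str:
    """Build a plain-text description from the scene label and the faces."""
    parts = [f"Scene: {scene}."]
    for face in faces:
        who = face.name or "An unknown person"
        parts.append(f"{who} is {face.sentiment}.")
    return " ".join(parts)


faces = [
    Face("Bradley Cooper", (120, 80, 60, 60), "smiling"),
    Face(None, (300, 90, 55, 55), "surprised"),
]
print(describe(faces, "ceremony"))
```

Keeping the output as a single plain-text string also makes it easy to pass to a text-to-speech component later.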

What is the next step?

Now that issue #2 is closed, what should be done in the next step? Shall we start searching for a proper DBMS as discussed in #3, or implement the first part of our pipeline?

Remaining things before test phase

@yasminesmati would you please fix the problem with "Remove user logs", since you think foreign keys will solve it? Meanwhile, I will work on the function responsible for evaluating the user's input image (i.e., combining the "detection.py" file with "postgres.py").

Doubt about format of logs

@behroozomidvar while developing the log formatting and the related code, I was unsure which format should be used as the basis for the logs:

$(user_x) deleted an image
$(user_x) deleted $(image_url)

By $(.) I mean a placeholder that will be replaced with the actual value.
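
One way to keep both options open is a single helper that switches between the two formats. This is only a sketch; `format_delete_log` and its parameters are hypothetical names, not code from the repository.

```python
from typing import Optional


def format_delete_log(user: str, image_url: Optional[str] = None) -> str:
    """Return a log line for an image deletion.

    Without image_url the generic form is used ("<user> deleted an image");
    with it, the URL is included ("<user> deleted <image_url>").
    """
    target = image_url if image_url is not None else "an image"
    return f"{user} deleted {target}"


print(format_delete_log("user_x"))
print(format_delete_log("user_x", "https://example.com/photo.jpg"))
```

The second form is more useful for auditing, but it also means the log keeps a reference to an image that no longer exists, which may matter if logs are tied to images through foreign keys.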

User sessions

Inspired by Google Photos, design user session functionality.
The user can sign up with an email address and then log in to her account to receive personalized information.
In her account, she can introduce and label faces, so that the ImageDescribe engine will be trained to recognize those faces in future uploaded photos.

Test phase
