I've put together a little side project called ScreenGPT-Vision. Think of it as a casual chat with GPT-4, but from the comfort of your desktop and with a cool twist: the ability to throw in images! Yep, now asking questions about those pesky screenshots or any visual puzzles is just a click or a hotkey away.
Ever been knee-deep in work and stumbled upon an error message as cryptic as an ancient manuscript? Or maybe you've had a math equation staring at you from a photo, begging to be solved. Instead of tab-hopping and website navigating, why not just press a key, snag a screenshot, and get querying? That's the convenience ScreenGPT-Vision brings to your desktop!
- Chat with GPT-4: Just type in your question and voilΓ β wisdom from GPT-4.
- Screenshot Magic: Capture your screen's content and let GPT-4 do the heavy lifting.
- Lazy Loading: Queue up those screenshots and fire away all your queries at once.
- Shortcut Simplicity: Hit Ctrl+Shift + P and your screenshot is ready to be analyzed. Ctrl + W to quickly send your messages.
- Feather-Light: A minimalist app that's easy on your machine.
Cobbled together in just 3 days, this app is my humble foray into desktop development β no prior experience, just a lot of googling and some help from my code mentor, Code Mentor GPT (he even helped with the editing this README). The code might be rough around the edges, and the UI won't win beauty contests, but it gets the job done. If this little tool piques your interest, I'm all in for round two of development. Think object localization, automated actions... the sky's the limit!
Stumble upon this repo and find it neat? Star it, fork it, send pull requests, or just spread the good word β every bit of support counts. Your interest is the fuel for this project's growth. Let's make desktop AI chat a thing!
Doesn't support MacOs
Curious to see ScreenGPT-Vision in action? I've put together some screenshots to show you just how easy and fun it is to use:
Each image is a snapshot of the app in use, showcasing the various features in action.
- Python 3.8+
- OpenAI API key (Don't have one? No worries! Head over to OpenAI's API keys page to get your key.)
- Docker (optional, for Docker setup)
git clone https://github.com/AmT42/ScreenGPT-Vision.git
cd ScreenGPT-Vision
Inside the ScreenGPT-Vision directory, create a file named .env. Unix-based systems (Linux, macOS, etc.)
touch .env
Windows
type nul > .env
Open your new .env file with your favorite text editor and add the following line: OPEN_API_KEY=your-openai-key
- Run docker compose
docker-compose up --build
- Install PyQT5 to run the front
pip install -r requirements_GUI.txt
- In a separate terminal, execute the following command:
python GUI/app.py
- Install Requirements
pip install -r requirements.txt
- Start the Backend
uvicorn app.main:app --reload
In a separate terminal, execute the following command:
python GUI/app.py
[Include instructions on how to use the application, along with any screenshots or videos if available.]
If you've got ideas or code to improve this app, I'm all ears! Contributing is simple:
- Fork this repo.
- Create a branch for your awesome new feature (
git checkout -b amazing-feature
). - Commit your changes (
git commit -m 'Add some awesomeness'
). - Push to the branch (
git push origin amazing-feature
). - Create a new Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
If you end up using ScreenGPT-Vision in your work, a shoutout would be super cool:
@misc{ScreenGPT-Vision,
author = "AmT42",
title = "ScreenGPT-Vision for Desktop",
year = "2023",
url = "https://github.com/AmT42/ScreenGPT-Vision"
}
AmT42 - [email protected] / https://www.linkedin.com/in/ahmet-celebi-973b63197/
Project Link: https://github.com/AmT42/ScreenGPT-Vision
=======