rahb-realtors-association / chat2gpt
Chat²GPT is a ChatGPT (and DALL·E 2/3, and ElevenLabs) chat bot for Google Chat. 🤖💬
Home Page: https://chat2gpt.oncornerstone.app
License: MIT License
🐛 Bug Report
Description:
Image cards within Chat²GPT are not directly clickable for downloading. While users can right-click to save the image, this doesn't provide an intuitive user experience.
Steps to Reproduce:
Expected Behavior:
Users should be able to either click directly on the image to initiate a download or have a dedicated download button accompanying the image for easier access.
Actual Behavior:
Clicking directly on the image does nothing. Users are required to right-click and select 'Save Image As' to download.
Additional Information:
This issue might affect users who aren't familiar with the right-click method or expect a more direct approach to downloading images.
Possible Solution (Optional):
Consider implementing a direct click-to-download functionality for the image cards or adding a prominent 'Download' button adjacent to the images for better accessibility.
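One lightweight approach, sketched below under the assumption that the generated image is served from a publicly reachable URL, is to make the card's image widget clickable via `onClick.openLink` and to add an explicit button beside it (Google Chat's Cards v2 format supports both). The helper name `build_image_card` is hypothetical, not an existing Chat²GPT function:

```python
def build_image_card(image_url: str) -> dict:
    """Build a Google Chat cardsV2 payload whose image opens on click.

    Hypothetical helper for illustration; assumes `image_url` is
    publicly reachable so the browser can save the file directly.
    """
    open_image = {'openLink': {'url': image_url}}
    return {
        'cardsV2': [{
            'cardId': 'generated-image',
            'card': {
                'sections': [{
                    'widgets': [
                        # Clicking the image itself opens the raw file.
                        {'image': {'imageUrl': image_url, 'onClick': open_image}},
                        # An explicit button for users who miss the click affordance.
                        {'buttonList': {'buttons': [
                            {'text': 'Download', 'onClick': open_image},
                        ]}},
                    ]
                }]
            }
        }]
    }
```

Either affordance alone would address the report; offering both keeps the card usable for users who expect a button as well as those who click the image.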
This feature is going to need more work due to the way the audio data is returned by the API (streamed binary data). A mechanism to store and serve files is beyond the scope of this project, so unless a very simple method is found we may abandon this feature branch.
Originally posted by @justinh-rahb in #42 (comment)
🚀 Feature Request
Description:
Introduce an immediate feedback mechanism in the chatbot to notify users when their request is being processed, specifically for image and TTS generation. This will provide a more interactive and user-friendly experience.
Problem Statement:
Currently, there can be a noticeable delay between when the user sends a request (e.g., `/image` or `/tts`) and when the bot responds with the generated image or audio. During this waiting period, users are left without feedback, which might lead them to believe the bot is unresponsive or broken.
Proposed Solution:
Before starting the image or audio generation process, the bot should instantly send a response to inform the user that their request is being processed. This can be achieved by sending a "Processing..." message accompanied by a card that provides more details about the ongoing task. This will keep users informed and set the expectation that a more substantial response will follow shortly.
Benefits:
Additional Context:
Below is a code snippet that demonstrates a possible implementation:
```python
# ... existing code ...
def handle_message(user_id, user_message):
    try:
        # ... existing code ...
        # Check if the user input starts with /image
        elif user_message.strip().lower().startswith('/image'):
            # ... existing code ...
            # Before the actual image generation, immediately respond with a processing message
            return jsonify({
                'text': 'Processing your image request...',
                'cardsV2': [{
                    'cardId': generate_unique_card_id(),
                    'card': {
                        'header': {
                            'title': 'Processing...',
                        },
                        'sections': [
                            {
                                'widgets': [
                                    {
                                        'textParagraph': {
                                            'text': 'Generating the image based on your request. Please wait...'
                                        }
                                    }
                                ]
                            }
                        ]
                    }
                }],
            })
        # Check if the user input starts with /tts
        elif user_message.strip().lower().startswith('/tts'):
            # ... existing code ...
            # Before the actual text-to-speech conversion, immediately respond with a processing message
            return jsonify({
                'text': 'Processing your TTS request...',
                'cardsV2': [{
                    'cardId': generate_unique_card_id(),
                    'card': {
                        'header': {
                            'title': 'Processing...',
                        },
                        'sections': [
                            {
                                'widgets': [
                                    {
                                        'textParagraph': {
                                            'text': 'Generating the speech output based on your request. Please wait...'
                                        }
                                    }
                                ]
                            }
                        ]
                    }
                }],
            })
        # ... rest of the code ...
    except Exception as e:
        # ... existing code ...
```
Screenshots / Mockups (Optional):
No visual mockups provided.
🚀 Feature Request
Description:
Add a new `/help` command to the chatbot that provides users with guidance on how to interact with it. This command will display content from the `docs/usage.md` file, allowing users to understand the bot's functionalities without having to refer to external documentation.
Problem Statement:
Users often require guidance on how to use chatbots effectively, especially when they encounter the bot for the first time. Providing them with a `/help` command directly within the chat interface can improve their experience by offering immediate access to helpful information.
Proposed Solution:
Implement a new `/help` command in the bot's message handling logic. Upon invocation, the bot should read the `docs/usage.md` file, extract the content below the `---` header, and return it as its response. Here's a code snippet illustrating the proposed implementation:
```python
# Check if the user input starts with /help
elif user_message.strip().lower() == '/help':
    try:
        # Read the docs/usage.md file
        with open('docs/usage.md', 'r') as file:
            content = file.read()
        # Split the content at the "---" header line and get the second part
        help_content = content.split("---", 2)[-1].strip()
        # Return the extracted content as the bot's response
        return jsonify({'text': help_content})
    except Exception as e:
        print(f"Error reading help content: {str(e)}")
        return jsonify({'text': 'Sorry, I encountered an error retrieving the help content.'})
```
Benefits:
Additional Context:
The proposed solution assumes that the `docs/usage.md` file exists and is structured with a `---` header. If the file structure changes in the future, the implementation may require adjustments.
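To make the `---` assumption concrete, the snippet below shows how `split("---", 2)[-1]` behaves on a file with YAML-style front matter. The sample content is invented for illustration:

```python
# Invented sample of what docs/usage.md might contain.
sample = """---
title: Usage
---
# Usage

Type /image <prompt> to generate an image."""

# maxsplit=2 splits only at the first two "---" markers, so any "---"
# horizontal rules later in the body are left intact.
help_content = sample.split("---", 2)[-1].strip()
```

After this runs, `help_content` holds everything below the front-matter block, starting with the `# Usage` heading. Note that a file without any `---` markers would be returned unchanged, which is a reasonable fallback.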
Screenshots / Mockups (Optional):
N/A
🚀 Feature Request
Description:
Introduce an automated function to fetch and filter voice data, saving the results to `data/voices_data.json`.
Problem Statement:
Currently, the process of extracting and filtering voice data is manual. Automating this process will streamline data collection and ensure consistency in the data used across the project.
Proposed Solution:
Implement a function that:
- Extracts `voice_id`, `name`, and `labels` for each voice entry.
- Saves the filtered results to `data/voices_data.json`.

The filtering process can be based on the given code snippet:
```python
# Extract the list of voices and then filter it
voices_data = data["voices"]
filtered_voices = [
    {
        "voice_id": voice["voice_id"],
        "name": voice["name"],
        "labels": voice["labels"]
    }
    for voice in voices_data
]
```
Benefits:
Additional Context:
To automate the data extraction process, the function should interact with the ElevenLabs voices API. Here are the specifics of the API endpoint:
Endpoint: `/v1/voices`
Required Parameters: `xi-api-key`
Expected Response (200 Successful Response):
- `voices`: A list containing voice data. Each voice entry has several fields, including `voice_id`, `name`, `samples`, `category`, `fine_tuning`, `labels`, etc.
- Of these, only `voice_id`, `name`, and `labels` need to be retained.

Sample response structure:
```
{
  "voices": [
    {
      "voice_id": "string",
      "name": "string",
      ...
      "labels": {
        "additionalProp1": "string",
        "additionalProp2": "string",
        "additionalProp3": "string"
      },
      ...
    }
  ]
}
```
The need for this feature was realized during a manual extraction process. Having an automated solution will be crucial as the project scales and voice data updates become more frequent.
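Putting the pieces together, the automated function might look like the sketch below. It wires the endpoint and `xi-api-key` header described above to the existing filtering logic; the function names are assumptions, and only the pure `filter_voices` step can be exercised without network access:

```python
import json
import os
import urllib.request

API_URL = "https://api.elevenlabs.io/v1/voices"

def filter_voices(data: dict) -> list:
    # Keep only the fields the project needs from each voice entry.
    return [
        {"voice_id": v["voice_id"], "name": v["name"], "labels": v["labels"]}
        for v in data["voices"]
    ]

def fetch_and_save_voices(api_key: str, out_path: str = "data/voices_data.json") -> list:
    # Call the ElevenLabs voices endpoint with the required xi-api-key header.
    req = urllib.request.Request(API_URL, headers={"xi-api-key": api_key})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    filtered = filter_voices(data)
    # Ensure the data/ directory exists before writing the JSON file.
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    with open(out_path, "w") as f:
        json.dump(filtered, f, indent=2)
    return filtered
```

Separating `filter_voices` from the network call keeps the filtering step unit-testable and reusable if the fetch mechanism later changes.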
♻️ Refactor
Description:
Refactor the existing `main.py` to adopt a more modular structure by dividing its functionality into separate modules under different directories. This ensures clearer code organization and easier maintenance.
Problem Statement:
The current structure of `main.py` is dense and combines multiple functionalities in one file. This can hinder navigation, make debugging challenging, and complicate future extensions, leading to decreased developer productivity and a heightened potential for errors.
Proposed Solution:
Retain `process_events()` in `main.py`, but move its core logic into the following directory and file structure:

```
chat2gpt/
│
├── main.py (contains a stub for process_event())
│
├── handlers/
│   ├── process_event.py (contains the actual logic for process_event())
│   ├── chat_response.py
│   ├── image_response.py
│   ├── tts_response.py
│   └── slash_commands.py
│
├── settings/
│   ├── env_loader.py
│   └── sessions.py
│
└── utils/
    ├── moderation.py
    ├── voices.py
    ├── tokenizer.py
    ├── text_to_speech.py
    ├── google_cloud.py
    └── ...
```

Specifically:
- Move the event-processing logic into `handlers/process_event.py`.
- Move the chat response logic into `handlers/chat_response.py`.
- Move the image response logic into `handlers/image_response.py`.
- Move the TTS response logic into `handlers/tts_response.py`.
- Move the Google Cloud helpers into `utils/google_cloud.py`.
- Move the voice utilities into `utils/voices.py`.
- Move the text-to-speech logic into `utils/text_to_speech.py`.
- Move the slash command handling into `handlers/slash_commands.py`.
- Consolidate configuration in the `settings/` directory, adopting a streamlined method using a loop or function to load environment variables.
- Consolidate shared helpers in the `utils/` directory, including creating a function to centralize API call headers for the ElevenLabs API.

Benefits:
Additional Context:
This architectural change is pivotal for the sustainable evolution of the project. As we incorporate more features, having a well-defined and modular structure will be indispensable. Furthermore, breaking down functions like `handle_message` into more specific sub-functions will aid readability and error management.
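As one example of the settings consolidation, the environment-variable loading mentioned above could be reduced to a single loop. A minimal sketch follows; the variable names and defaults are placeholders, not the project's actual configuration:

```python
import os

# Placeholder names; substitute the project's real settings.
REQUIRED_VARS = ("OPENAI_API_KEY",)
OPTIONAL_VARS = {"MODEL_NAME": "gpt-3.5-turbo", "TTS_VOICE": "default"}

def load_settings(environ=os.environ) -> dict:
    # Fail fast if anything required is absent.
    missing = [name for name in REQUIRED_VARS if name not in environ]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {missing}")
    # Required values are copied as-is; optional ones fall back to defaults.
    settings = {name: environ[name] for name in REQUIRED_VARS}
    for name, default in OPTIONAL_VARS.items():
        settings[name] = environ.get(name, default)
    return settings
```

Taking `environ` as a parameter (defaulting to `os.environ`) keeps the loader testable with a plain dict, which is one practical payoff of moving this logic out of `main.py`.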
🚀 Feature Request: Efficient GCS Bucket Deletion
Description:
Enhance the efficiency of file deletions in our GitHub Action, specifically in cases where there are a significant number of files.
Problem Statement:
Currently, our GitHub Action deletes files one by one, which can be time-consuming when there are many files to delete. The goal is to ensure rapid deletions without compromising the integrity of the operation, especially when the file count varies between deployments.
Proposed Solution:
Integrate the `-m` flag into the `gsutil rm` command in our GitHub Action. This will allow for multi-threaded operations and speed up the deletion of multiple files.
```yaml
# Check if the bucket exists and delete it
- name: Delete Existing Bucket
  run: |
    if gsutil ls "gs://$GCS_BUCKET_NAME"; then
      gsutil -m rm -r "gs://$GCS_BUCKET_NAME"
    fi
```
Benefits:
- The `-m` flag is adaptive, meaning it won't have a negative impact if there are fewer files, or even none.

Additional Context:
In the current setup, when there are multiple files, the Action's logs suggest using the `-m` flag for efficiency.
🚀 Feature Request
Description:
Introduce a feature in our application that allows users to dynamically set the 'style' and 'quality' parameters for DALL-E 3 image generation at runtime through their prompts. This would be an enhancement over the current implementation, which uses global, static environment variables to set these parameters.
Problem Statement:
While our application currently supports setting DALL-E 3's 'style' and 'quality' parameters via environment variables, these settings are global and inflexible, applying uniformly to all images generated during a session. This setup limits the user's ability to tailor the style and quality of each image to their specific needs or preferences at the moment.
Proposed Solution:
Implement functionality where the application interprets specific hashtags within the user's text prompt (e.g., #natural, #vivid for style, and #standard, #hd for quality) and adjusts the 'style' and 'quality' parameters for that particular DALL-E 3 API call accordingly. The system should revert to the default values set by the environment variables if no relevant hashtags are detected in the prompt.
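A minimal sketch of the hashtag handling described above, assuming the tag sets shown; the function name, defaults, and tag-to-value mapping are illustrative rather than a settled design:

```python
import re

# Recognized tags mapped to DALL-E 3 parameter values (illustrative).
STYLE_TAGS = {"#natural": "natural", "#vivid": "vivid"}
QUALITY_TAGS = {"#standard": "standard", "#hd": "hd"}

def parse_image_options(prompt: str,
                        default_style: str = "vivid",
                        default_quality: str = "standard"):
    """Return (cleaned_prompt, style, quality) parsed from hashtags.

    The defaults stand in for the values normally read from the
    environment variables; no recognized hashtag means no override.
    """
    style, quality = default_style, default_quality
    tags = set(re.findall(r"#\w+", prompt.lower()))
    for tag, value in STYLE_TAGS.items():
        if tag in tags:
            style = value
    for tag, value in QUALITY_TAGS.items():
        if tag in tags:
            quality = value
    # Strip the hashtags so they are not sent to the model as prompt text.
    cleaned = re.sub(r"#\w+", "", prompt).strip()
    cleaned = re.sub(r"\s{2,}", " ", cleaned)
    return cleaned, style, quality
```

If a prompt carries conflicting tags (e.g. both `#natural` and `#vivid`), this sketch silently lets the later mapping entry win; a production version should instead surface a warning, per the note below about handling conflicting or invalid combinations.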
Benefits:
This feature would significantly enhance user experience by providing the flexibility to customize each image generation request. It would encourage creative experimentation with different styles and qualities, potentially leading to more engaging and diverse outputs. Furthermore, it adds an interactive element to the application, making it more responsive to user inputs.
Additional Context:
Given that users might not be familiar with the usage of these hashtags initially, it would be important to incorporate a user guide or help section explaining how to use these hashtags effectively. The feature should be designed to elegantly handle scenarios where users might input conflicting hashtags or invalid combinations.