rahb-realtors-association / chat2gpt
Chat²GPT is a ChatGPT (and DALL·E 2/3, and ElevenLabs) chat bot for Google Chat. 🤖💬
Home Page: https://chat2gpt.oncornerstone.app
License: MIT License
🐛 Bug Report
Description:
Image cards within Chat²GPT are not directly clickable for downloading. While users can right-click to save the image, this doesn't provide an intuitive user experience.
Steps to Reproduce:
Expected Behavior:
Users should be able to either click directly on the image to initiate a download or have a dedicated download button accompanying the image for easier access.
Actual Behavior:
Clicking directly on the image does nothing. Users are required to right-click and select 'Save Image As' to download.
Additional Information:
This issue might affect users who aren't familiar with the right-click method or expect a more direct approach to downloading images.
Possible Solution (Optional):
Consider implementing a direct click-to-download functionality for the image cards or adding a prominent 'Download' button adjacent to the images for better accessibility.
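One lightweight approach, sketched below under the assumption that the generated image is served from a publicly reachable URL, is to make the card's image widget clickable via `onClick.openLink` and to add an explicit button beside it (Google Chat's Cards v2 format supports both). The helper name `build_image_card` is hypothetical, not an existing Chat²GPT function:

```python
def build_image_card(image_url: str) -> dict:
    """Build a Google Chat cardsV2 payload whose image opens on click.

    Hypothetical helper for illustration; assumes `image_url` is
    publicly reachable so the browser can save the file directly.
    """
    open_image = {'openLink': {'url': image_url}}
    return {
        'cardsV2': [{
            'cardId': 'generated-image',
            'card': {
                'sections': [{
                    'widgets': [
                        # Clicking the image itself opens the raw file.
                        {'image': {'imageUrl': image_url, 'onClick': open_image}},
                        # An explicit button for users who miss the click affordance.
                        {'buttonList': {'buttons': [
                            {'text': 'Download', 'onClick': open_image},
                        ]}},
                    ]
                }]
            }
        }]
    }
```

Either affordance alone would address the report; offering both keeps the card usable for users who expect a button as well as those who click the image.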
This feature is going to need more work due to the way the audio data is returned by the API (streamed binary data). A mechanism to store and serve files is beyond the scope of this project, so unless a very simple method is found we may abandon this feature branch.
Originally posted by @justinh-rahb in #42 (comment)
🚀 Feature Request
Description:
Introduce an immediate feedback mechanism in the chatbot to notify users when their request is being processed, specifically for image and TTS generation. This will provide a more interactive and user-friendly experience.
Problem Statement:
Currently, there can be a noticeable delay between when the user sends a request (e.g., `/image` or `/tts`) and when the bot responds with the generated image or audio. During this waiting period, users are left without feedback, which might lead them to believe the bot is unresponsive or broken.
Proposed Solution:
Before starting the image or audio generation process, the bot should instantly send a response to inform the user that their request is being processed. This can be achieved by sending a "Processing..." message accompanied by a card that provides more details about the ongoing task. This will keep users informed and set the expectation that a more substantial response will follow shortly.
Benefits:
Additional Context:
Below is a code snippet that demonstrates a possible implementation:
```python
# ... existing code ...
def handle_message(user_id, user_message):
    try:
        # ... existing code ...
        # Check if the user input starts with /image
        elif user_message.strip().lower().startswith('/image'):
            # ... existing code ...
            # Before the actual image generation, immediately respond with a processing message
            return jsonify({
                'text': 'Processing your image request...',
                'cardsV2': [{
                    'cardId': generate_unique_card_id(),
                    'card': {
                        'header': {
                            'title': 'Processing...',
                        },
                        'sections': [
                            {
                                'widgets': [
                                    {
                                        'textParagraph': {
                                            'text': 'Generating the image based on your request. Please wait...'
                                        }
                                    }
                                ]
                            }
                        ]
                    }
                }],
            })
        # Check if the user input starts with /tts
        elif user_message.strip().lower().startswith('/tts'):
            # ... existing code ...
            # Before the actual text-to-speech conversion, immediately respond with a processing message
            return jsonify({
                'text': 'Processing your TTS request...',
                'cardsV2': [{
                    'cardId': generate_unique_card_id(),
                    'card': {
                        'header': {
                            'title': 'Processing...',
                        },
                        'sections': [
                            {
                                'widgets': [
                                    {
                                        'textParagraph': {
                                            'text': 'Generating the speech output based on your request. Please wait...'
                                        }
                                    }
                                ]
                            }
                        ]
                    }
                }],
            })
        # ... rest of the code ...
    except Exception as e:
        # ... existing code ...
```
Screenshots / Mockups (Optional):
No visual mockups provided.
🚀 Feature Request
Description:
Add a new `/help` command to the chatbot that provides users with guidance on how to interact with it. This command will display content from the `docs/usage.md` file, allowing users to understand the bot's functionalities without having to refer to external documentation.
Problem Statement:
Users often require guidance on how to use chatbots effectively, especially when they encounter the bot for the first time. Providing them with a `/help` command directly within the chat interface can improve their experience by offering immediate access to helpful information.
Proposed Solution:
Implement a new `/help` command in the bot's message handling logic. Upon invocation, the bot should read the `docs/usage.md` file, extract the content below the `---` header, and return it as its response. Here's a code snippet illustrating the proposed implementation:
```python
# Check if the user input starts with /help
elif user_message.strip().lower() == '/help':
    try:
        # Read the docs/usage.md file
        with open('docs/usage.md', 'r') as file:
            content = file.read()
        # Split the content at the "---" header line and get the second part
        help_content = content.split("---", 2)[-1].strip()
        # Return the extracted content as the bot's response
        return jsonify({'text': help_content})
    except Exception as e:
        print(f"Error reading help content: {str(e)}")
        return jsonify({'text': 'Sorry, I encountered an error retrieving the help content.'})
```
Benefits:
Additional Context:
The proposed solution assumes that the `docs/usage.md` file exists and is structured with a `---` header. If the file structure changes in the future, the implementation may require adjustments.
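To make the `---` assumption concrete, the snippet below shows how `split("---", 2)[-1]` behaves on a file with YAML-style front matter. The sample content is invented for illustration:

```python
# Invented sample of what docs/usage.md might contain.
sample = """---
title: Usage
---
# Usage

Type /image <prompt> to generate an image."""

# maxsplit=2 splits only at the first two "---" markers, so any "---"
# horizontal rules later in the body are left intact.
help_content = sample.split("---", 2)[-1].strip()
```

After this runs, `help_content` holds everything below the front-matter block, starting with the `# Usage` heading. Note that a file without any `---` markers would be returned unchanged, which is a reasonable fallback.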
Screenshots / Mockups (Optional):
N/A
🚀 Feature Request
Description:
Introduce an automated function to fetch and filter voice data, saving the results to `data/voices_data.json`.
Problem Statement:
Currently, the process of extracting and filtering voice data is manual. Automating this process will streamline data collection and ensure consistency in the data used across the project.
Proposed Solution:
Implement a function that:
- Extracts `voice_id`, `name`, and `labels` for each voice entry.
- Saves the filtered results to `data/voices_data.json`.

The filtering process can be based on the given code snippet:
```python
# Extract the list of voices and then filter it
voices_data = data["voices"]
filtered_voices = [
    {
        "voice_id": voice["voice_id"],
        "name": voice["name"],
        "labels": voice["labels"]
    }
    for voice in voices_data
]
```
Benefits:
Additional Context:
To automate the data extraction process, the function should interact with the ElevenLabs voices API. Here are the specifics of the API endpoint:
Endpoint: `/v1/voices`
Required Parameters: `xi-api-key`
Expected Response (200 Successful Response):
- `voices`: A list containing voice data. Each voice entry has several fields, including `voice_id`, `name`, `samples`, `category`, `fine_tuning`, `labels`, etc.
- Of these, only `voice_id`, `name`, and `labels` need to be retained.

Sample response structure:
```
{
  "voices": [
    {
      "voice_id": "string",
      "name": "string",
      ...
      "labels": {
        "additionalProp1": "string",
        "additionalProp2": "string",
        "additionalProp3": "string"
      },
      ...
    }
  ]
}
```
The need for this feature was realized during a manual extraction process. Having an automated solution will be crucial as the project scales and voice data updates become more frequent.
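Putting the pieces together, the automated function might look like the sketch below. It wires the endpoint and `xi-api-key` header described above to the existing filtering logic; the function names are assumptions, and only the pure `filter_voices` step can be exercised without network access:

```python
import json
import os
import urllib.request

API_URL = "https://api.elevenlabs.io/v1/voices"

def filter_voices(data: dict) -> list:
    # Keep only the fields the project needs from each voice entry.
    return [
        {"voice_id": v["voice_id"], "name": v["name"], "labels": v["labels"]}
        for v in data["voices"]
    ]

def fetch_and_save_voices(api_key: str, out_path: str = "data/voices_data.json") -> list:
    # Call the ElevenLabs voices endpoint with the required xi-api-key header.
    req = urllib.request.Request(API_URL, headers={"xi-api-key": api_key})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    filtered = filter_voices(data)
    # Ensure the data/ directory exists before writing the JSON file.
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    with open(out_path, "w") as f:
        json.dump(filtered, f, indent=2)
    return filtered
```

Separating `filter_voices` from the network call keeps the filtering step unit-testable and reusable if the fetch mechanism later changes.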
♻️ Refactor
Description:
Refactor the existing `main.py` to adopt a more modular structure by dividing its functionality into separate modules under different directories. This ensures clearer code organization and easier maintenance.
Problem Statement:
The current structure of `main.py` is dense and combines multiple functionalities in one file. This can hinder navigation, make debugging challenging, and complicate future extensions, leading to decreased developer productivity and a heightened potential for errors.
Proposed Solution:
Retain `process_events()` in `main.py`, but move its core logic into the following directory and file structure:

```
chat2gpt/
│
├── main.py (contains a stub for process_event())
│
├── handlers/
│   ├── process_event.py (contains the actual logic for process_event())
│   ├── chat_response.py
│   ├── image_response.py
│   ├── tts_response.py
│   └── slash_commands.py
│
├── settings/
│   ├── env_loader.py
│   └── sessions.py
│
└── utils/
    ├── moderation.py
    ├── voices.py
    ├── tokenizer.py
    ├── text_to_speech.py
    ├── google_cloud.py
    └── ...
```

Specifically:
- Move the event-processing logic into `handlers/process_event.py`.
- Move the chat response logic into `handlers/chat_response.py`.
- Move the image response logic into `handlers/image_response.py`.
- Move the TTS response logic into `handlers/tts_response.py`.
- Move the Google Cloud helpers into `utils/google_cloud.py`.
- Move the voice utilities into `utils/voices.py`.
- Move the text-to-speech logic into `utils/text_to_speech.py`.
- Move the slash command handling into `handlers/slash_commands.py`.
- Consolidate configuration in the `settings/` directory, adopting a streamlined method using a loop or function to load environment variables.
- Consolidate shared helpers in the `utils/` directory, including creating a function to centralize API call headers for the ElevenLabs API.

Benefits:
Additional Context:
This architectural change is pivotal for the sustainable evolution of the project. As we incorporate more features, having a well-defined and modular structure will be indispensable. Furthermore, breaking down functions like `handle_message` into more specific sub-functions will aid readability and error management.
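As one example of the settings consolidation, the environment-variable loading mentioned above could be reduced to a single loop. A minimal sketch follows; the variable names and defaults are placeholders, not the project's actual configuration:

```python
import os

# Placeholder names; substitute the project's real settings.
REQUIRED_VARS = ("OPENAI_API_KEY",)
OPTIONAL_VARS = {"MODEL_NAME": "gpt-3.5-turbo", "TTS_VOICE": "default"}

def load_settings(environ=os.environ) -> dict:
    # Fail fast if anything required is absent.
    missing = [name for name in REQUIRED_VARS if name not in environ]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {missing}")
    # Required values are copied as-is; optional ones fall back to defaults.
    settings = {name: environ[name] for name in REQUIRED_VARS}
    for name, default in OPTIONAL_VARS.items():
        settings[name] = environ.get(name, default)
    return settings
```

Taking `environ` as a parameter (defaulting to `os.environ`) keeps the loader testable with a plain dict, which is one practical payoff of moving this logic out of `main.py`.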
🚀 Feature Request: Efficient GCS Bucket Deletion
Description:
Enhance the efficiency of file deletions in our GitHub Action, specifically in cases where there are a significant number of files.
Problem Statement:
Currently, our GitHub Action deletes files one by one, which can be time-consuming when there are many files to delete. The goal is to ensure rapid deletions without compromising the integrity of the operation, especially when the file count varies between deployments.
Proposed Solution:
Integrate the `-m` flag into the `gsutil rm` command in our GitHub Action. This will allow for multi-threaded operations and speed up the deletion of multiple files.
```yaml
# Check if the bucket exists and delete it
- name: Delete Existing Bucket
  run: |
    if gsutil ls "gs://$GCS_BUCKET_NAME"; then
      gsutil -m rm -r "gs://$GCS_BUCKET_NAME"
    fi
```
Benefits:
- The `-m` flag is adaptive, meaning it won't have a negative impact if there are fewer files, or even none.

Additional Context:
In the current setup, when there are multiple files, the Action's logs suggest using the `-m` flag for efficiency.
🚀 Feature Request
Description:
Introduce a feature in our application that allows users to dynamically set the 'style' and 'quality' parameters for DALL-E 3 image generation at runtime through their prompts. This would be an enhancement over the current implementation, which uses global, static environment variables to set these parameters.
Problem Statement:
While our application currently supports setting DALL-E 3's 'style' and 'quality' parameters via environment variables, these settings are global and inflexible, applying uniformly to all images generated during a session. This setup limits the user's ability to tailor the style and quality of each image to their specific needs or preferences at the moment.
Proposed Solution:
Implement functionality where the application interprets specific hashtags within the user's text prompt (e.g., #natural, #vivid for style, and #standard, #hd for quality) and adjusts the 'style' and 'quality' parameters for that particular DALL-E 3 API call accordingly. The system should revert to the default values set by the environment variables if no relevant hashtags are detected in the prompt.
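A minimal sketch of the hashtag handling described above, assuming the tag sets shown; the function name, defaults, and tag-to-value mapping are illustrative rather than a settled design:

```python
import re

# Recognized tags mapped to DALL-E 3 parameter values (illustrative).
STYLE_TAGS = {"#natural": "natural", "#vivid": "vivid"}
QUALITY_TAGS = {"#standard": "standard", "#hd": "hd"}

def parse_image_options(prompt: str,
                        default_style: str = "vivid",
                        default_quality: str = "standard"):
    """Return (cleaned_prompt, style, quality) parsed from hashtags.

    The defaults stand in for the values normally read from the
    environment variables; no recognized hashtag means no override.
    """
    style, quality = default_style, default_quality
    tags = set(re.findall(r"#\w+", prompt.lower()))
    for tag, value in STYLE_TAGS.items():
        if tag in tags:
            style = value
    for tag, value in QUALITY_TAGS.items():
        if tag in tags:
            quality = value
    # Strip the hashtags so they are not sent to the model as prompt text.
    cleaned = re.sub(r"#\w+", "", prompt).strip()
    cleaned = re.sub(r"\s{2,}", " ", cleaned)
    return cleaned, style, quality
```

If a prompt carries conflicting tags (e.g. both `#natural` and `#vivid`), this sketch silently lets the later mapping entry win; a production version should instead surface a warning, per the note below about handling conflicting or invalid combinations.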
Benefits:
This feature would significantly enhance user experience by providing the flexibility to customize each image generation request. It would encourage creative experimentation with different styles and qualities, potentially leading to more engaging and diverse outputs. Furthermore, it adds an interactive element to the application, making it more responsive to user inputs.
Additional Context:
Given that users might not be familiar with the usage of these hashtags initially, it would be important to incorporate a user guide or help section explaining how to use these hashtags effectively. The feature should be designed to elegantly handle scenarios where users might input conflicting hashtags or invalid combinations.