stophobia / repototext

This project forked from jeremiahpetersen/repototext

Turn an entire GitHub Repo into a single organized .txt file to use with GPT-4 (formerly ChatGPT Advanced Data Analysis / Code Interpreter)

License: MIT License

JavaScript 28.26% Python 53.45% CSS 10.41% HTML 7.88%

repototext's Introduction

RepoToText

RepoToText is a web app that scrapes a GitHub repository and converts its files into a single organized .txt file. Enter the URL of a GitHub repository and, optionally, a documentation URL. The app retrieves the contents of the repository, including all files and directories, fetches the documentation from the provided URL, and writes everything to one text file with the documentation at the top. The .txt file is saved in the /data folder, and its name includes the user, repo, and a timestamp. You can then upload this file to Code Interpreter and use the chatbot to interact with the entire GitHub repo.
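As a rough illustration of the "user + repo + timestamp" naming, the sketch below derives an output path from a repository URL. The function name `output_path`, the underscore-separated layout, and the timestamp format are assumptions made for this example; the app's actual naming scheme may differ.

```python
from datetime import datetime
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_path(repo_url, now=None, data_dir="data"):
    """Build a hypothetical path like data/<user>_<repo>_<timestamp>.txt."""
    # The URL path of a GitHub repo looks like /<user>/<repo>
    parts = [p for p in urlparse(repo_url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a repository URL: {repo_url!r}")
    user, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    # Assumed timestamp format; the real app may use another one
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return PurePosixPath(data_dir) / f"{user}_{repo}_{stamp}.txt"
```

Passing `now` explicitly makes the result reproducible; leaving it out uses the current time.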

Environment Configuration

Add your GitHub API key to the .env file:

GITHUB_API_KEY='YOUR GITHUB API KEY HERE'
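A minimal sketch of how a backend could read this key, assuming the value from .env ends up in the process environment (for example via Docker Compose). The helper name `load_github_key` is hypothetical, and stripping the surrounding quotes is a defensive guess, since some tools pass the quotes through literally.

```python
import os


def load_github_key(env=None):
    """Return the GitHub API key, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    # Remove whitespace and any literal quotes left over from the .env line
    key = env.get("GITHUB_API_KEY", "").strip().strip("'\"")
    if not key or key == "YOUR GITHUB API KEY HERE":
        raise RuntimeError("Set GITHUB_API_KEY in the .env file")
    return key
```

Failing early with a readable message is usually easier to debug than an authentication error from the GitHub API later on.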

Prompt Example

This is a .txt file that represents an entire GitHub repository. The repository's individual files are separated by the sequence '''--- , followed by the file path, ending with ---. Each file's content begins immediately after its file path and extends until the next sequence of '''---

Add your idea here (Example): Please create a react front end that will work with the back end
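The separator format described in the prompt can also be read back programmatically. The sketch below splits such a text into a mapping from file path to content. The exact line layout (separator, path, and closing --- each on its own line) is an assumption, and `split_repo_text` is a hypothetical helper, not part of the project.

```python
import re

# Assumed layout of one entry (the exact whitespace is a guess):
#   '''---
#   path/to/file.py
#   ---
#   <file content>
HEADER = re.compile(
    r"^'''---[ \t]*\n(?P<path>[^\n]+)\n---[ \t]*\n", re.MULTILINE
)


def split_repo_text(text):
    """Map each file path to the content that follows its header."""
    files = {}
    matches = list(HEADER.finditer(text))
    for i, match in enumerate(matches):
        # Content runs until the next header, or to the end of the text
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        files[match.group("path").strip()] = text[match.end():end].strip("\n")
    return files
```

Any text before the first header, such as prepended documentation, is ignored by this sketch.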

FolderToText

FolderToText.py is a script that turns a local folder, or local files, into a .txt file in the same way RepoToText.py does. Choose your files with "Browse" (you can keep adding files by clicking "Browse" again). Once all of your files are selected, type the file extensions you want to copy, separated by commas. Example: .py , .js , .md , .ts. You can also turn this filter off, in which case every file you selected is added to the .txt file. Last, enter the output file name and the output path. The file will be written with your chosen file name and a timestamp.
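The extension filter can be sketched as follows. Both function names are hypothetical, and treating extensions as case-insensitive is an assumption; the script itself may behave differently.

```python
from pathlib import Path


def parse_extensions(raw):
    """Turn a string like '.py , .js , .md' into {'.py', '.js', '.md'}."""
    return {ext.strip().lower() for ext in raw.split(",") if ext.strip()}


def select_files(paths, raw_extensions=""):
    """Keep paths whose suffix is listed; keep everything if no filter is given."""
    wanted = parse_extensions(raw_extensions)
    if not wanted:
        # Filter turned off: every selected file goes into the .txt
        return list(paths)
    return [p for p in paths if Path(p).suffix.lower() in wanted]
```

For example, `select_files(["a.py", "b.css"], ".py , .js")` keeps only `a.py`, while an empty filter string keeps both files.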

Info

  • Creates a .txt with ('''---) separating each file from the repo.
  • Each file from the repo has a header after ('''---) with the file path as the title.
  • The .txt file is saved in the /data folder.
  • You can add a URL to a documentation page, and the documentation will be placed at the top of the .txt file (great to use for tech that came out after Sep 2021).
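The points above can be sketched as a small function that assembles the output text. This is an illustration under assumptions: `build_repo_text` is a hypothetical name, and the exact placement of newlines around the separators is a guess rather than the app's confirmed layout.

```python
def build_repo_text(files, docs=""):
    """Join (path, content) pairs into one string, with docs first if given."""
    sections = []
    if docs:
        # Optional documentation goes at the top of the output
        sections.append(docs.strip())
    for path, content in files:
        # Header: the '''--- separator, the file path as title, then ---
        sections.append(f"'''---\n{path}\n---\n{content.rstrip()}")
    return "\n\n".join(sections) + "\n"
```

Writing the returned string to a file in the /data folder would then produce a single .txt of the kind described above.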

Tech Used

  • Frontend: React.js
  • Backend: Python Flask
  • Containerization: Docker
  • GitHub API: PyGithub library
  • Additional Python libraries: beautifulsoup4, requests, flask_cors, retry

Running the Application with Docker

To run the application using Docker, follow these steps:

  1. Clone the repository.
  2. Set up the environment variable GITHUB_API_KEY in the .env file.
  3. Build the Docker images with docker compose build.
  4. Start the containers with docker compose up.
  5. Access the application (http://localhost:3000) in a web browser and enter the GitHub repository URL and documentation URL (if available).
  6. Choose All files or choose specific file types.
  7. Click the "Submit" button to initiate the scraping process. The converted text will be displayed in the output area, and it will also be saved in the /data folder.
  8. You can also click the "Copy Text" button to copy the generated text to the clipboard.

TODO

  • Add Docker to project
  • FIX: Broken file types: .ipynb
  • FIX: FolderToText - fix so a user can pick one folder (currently only working when user selects individual files)
  • Add in the ability to work with private repositories
  • Create a small desktop app via PyQT or an executable file
  • Add ability to store change history and update .txt to reflect working changes
  • Add checker function to make sure .txt is current repo version
  • Adjust UI for flow, including change textarea output width, adding file management and history UI
  • Explore prompt ideas including breaking the prompts into discrete steps that nudge the model along

repototext's People

Contributors

jeremiahpetersen
