
Comments (5)

paul-gauthier commented on August 15, 2024

Thanks for trying aider and reporting this issue and sharing the chat history. That's very helpful information.

It looks to me like this may be a problem with the way aider handles subdirectories on Windows. I will continue to investigate and let you know when I've implemented a solution.

In the meantime, you might want to avoid using subdirectories and see if that helps. When starting a new project, you could run aider with the file name as an argument: aider quorum_scraper.py. Aider will create a new empty file without suggesting a subdirectory. Then when you ask for code, it will go into that file.

For example, here is the transcript of me doing that and giving aider your initial prompt:

aider quorum_scraper.py

Model: gpt-4
Creating empty file quorum_scraper.py
Files not tracked in .git:
quorum_scraper.py
Add them? y
Added quorum_scraper.py to the git repo
Commit 07b56f3 Added new files to the git repo: quorum_scraper.py
Git repo: .git
Repo-map: universal-ctags using 1024 tokens
Added quorum_scraper.py to the chat.
Use /help to see in-chat commands.

I want to make a Quorum web scraper that scrapes text from pages and subpages from http://quorumlanguage.com. The script will create text files so that a large language model can understand them.

To create a web scraper for the Quorum language website, we will need to use a library like BeautifulSoup to parse the HTML of the website and extract the text. We will also need the requests library to send HTTP requests to the website.

Here are the steps we will follow:

  1. Send a GET request to the main page of the website.
  2. Parse the HTML of the main page to find links to subpages.
  3. For each subpage, send a GET request to retrieve the HTML.
  4. Parse the HTML of each subpage to extract the text.
  5. Write the text to a file.

Here's how we can modify your quorum_scraper.py file to implement this:

quorum_scraper.py
<<<<<<< ORIGINAL

=======
import requests
from bs4 import BeautifulSoup
import os

def get_page_text(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    return soup.get_text()

def get_subpage_urls(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    return [a['href'] for a in soup.find_all('a', href=True) if a.text]

def write_text_to_file(text, filename):
    with open(filename, 'w') as f:
        f.write(text)

def main():
    url = 'http://quorumlanguage.com'
    subpage_urls = get_subpage_urls(url)
    for i, subpage_url in enumerate(subpage_urls):
        text = get_page_text(subpage_url)
        write_text_to_file(text, f'subpage_{i}.txt')

if __name__ == '__main__':
    main()
>>>>>>> UPDATED

This script will create a text file for each subpage on the Quorum language website. The text files will be named subpage_0.txt, subpage_1.txt, etc.

Applied edit to quorum_scraper.py
Commit 3ea30de aider: Refactor quorum_scraper.py to include functions for getting page text, subpage URLs, and writing text to a file, and add a main function that retrieves the subpage URLs, gets the text for each subpage, and writes it to a file.
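
One thing to watch for in the generated script: get_subpage_urls returns the raw href values, and on many sites those are relative paths or links to other hosts, which requests.get cannot fetch or which you may not want to scrape. Below is a minimal sketch of one way to handle that with the standard library. The helper name resolve_links and the example href values are hypothetical, not something aider produced and not real page names from the site.

```python
from urllib.parse import urljoin, urlparse, urldefrag


def resolve_links(base_url, hrefs):
    """Turn raw href values into absolute, same-site URLs without duplicates."""
    base_host = urlparse(base_url).netloc
    resolved = []
    for href in hrefs:
        # Resolve relative links against the page they came from,
        # then drop any "#fragment" so the same page is not fetched twice.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar links
        if parsed.netloc != base_host:
            continue  # skip links that leave the site
        if absolute not in resolved:
            resolved.append(absolute)
    return resolved


if __name__ == "__main__":
    # Hypothetical href values, for illustration only.
    hrefs = ["/documentation.html", "download.html", "mailto:someone@example.com"]
    for url in resolve_links("http://quorumlanguage.com", hrefs):
        print(url)
```

In the script above, the list returned by get_subpage_urls(url) could be passed through a helper like this before the loop in main calls get_page_text.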

from aider.

twilwa commented on August 15, 2024

Correct me if I'm wrong: until this issue is fixed, aider won't be able to work with existing repos on Windows, only newly created ones, yes?


paul-gauthier commented on August 15, 2024

Thanks for your patience on this. I had to set up a cloud Windows server to debug this issue.

I just pushed a fix, so aider should stop asking over and over again to add files that are in subdirectories.

You'll need to install the latest from GitHub to pick up this fix. Let me know if it resolves your issue.
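
For readers curious how a subdirectory problem like this can arise on Windows: this thread does not describe the actual root cause or the change that fixed it, so the following is only an illustrative sketch under an assumption. Git lists tracked files with forward slashes, while Windows tools often produce backslashes, so a plain string comparison can report a tracked file in a subdirectory as untracked. The function names and the src directory below are hypothetical.

```python
from pathlib import PureWindowsPath


def to_git_path(path):
    """Convert a Windows-style relative path to the forward-slash form git uses."""
    # PureWindowsPath accepts both separators and works on any operating system.
    return PureWindowsPath(path).as_posix()


def is_tracked(path, tracked_files):
    """Check membership after normalizing separators (assumed comparison, for illustration)."""
    return to_git_path(path) in set(tracked_files)


if __name__ == "__main__":
    tracked = ["src/quorum_scraper.py"]  # hypothetical output of `git ls-files`
    windows_path = "src\\quorum_scraper.py"
    print(windows_path in tracked)            # naive string comparison
    print(is_tracked(windows_path, tracked))  # comparison after normalization
```

A file at the top level of the repo has no separator in its relative path, which would be consistent with the workaround of avoiding subdirectories.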


paul-gauthier commented on August 15, 2024

I also added a CI step that now runs all the tests on Windows on every push.

https://github.com/paul-gauthier/aider/actions/runs/5469581577


paul-gauthier commented on August 15, 2024

I'm going to close this issue for now. Please feel free to re-open or file a new issue if you continue to see problems.
