
Comments (5)

adamhooper commented on September 28, 2024

Thank you for reporting this! I'll look into it. It certainly seems like you're doing everything correctly.

from cjworkbench.

adamhooper commented on September 28, 2024

I duplicated the workflow and clicked "Scrape" three times. The second and third times, there was no data -- most statuses were "Can't connect: None". But the first time, half the statuses were "200 OK".

I strongly suspect the IOM servers are treating these requests as nefarious and dropping them.

We at Workbench now have an ethical dilemma: should we help our users circumvent IOM's servers' restrictions? We'll have to discuss this and follow up in this issue.

In the meantime: I infer that this is a low-traffic page (maybe 1 update per day). That being the case, this workaround may be a manageable way of getting this data into Workbench:

  1. In a tab, scrape the single URL "https://migration.iom.int/covid19reports" and have it alert you when new data appears. (My tests suggest scraping the single URL should work, if it's scraped rarely enough.)
  2. Whenever Workbench emails you that the page has changed, visit it and add the new rows to a Google Sheet.
  3. In a separate tab, import from the Google Sheet.
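Step 1 boils down to "fetch one URL rarely and notice when its content differs from last time." Here is a minimal sketch of that idea, not Workbench's actual implementation: the function names (`content_fingerprint`, `check_for_change`, `fetch_url`) are hypothetical, and comparing SHA-256 digests of the raw page body is an assumption about how one might detect a change.

```python
import hashlib
import urllib.request
from typing import Callable, Optional, Tuple


def content_fingerprint(body: bytes) -> str:
    """Return a SHA-256 hex digest of the page body (hypothetical helper)."""
    return hashlib.sha256(body).hexdigest()


def check_for_change(
    fetch: Callable[[], bytes], last_fingerprint: Optional[str]
) -> Tuple[bool, str]:
    """Fetch the page once and compare it to the previous fingerprint.

    Returns (changed, new_fingerprint). The very first check, when
    last_fingerprint is None, reports no change: there is nothing to
    compare against yet.
    """
    fingerprint = content_fingerprint(fetch())
    changed = last_fingerprint is not None and fingerprint != last_fingerprint
    return changed, fingerprint


def fetch_url(url: str) -> bytes:
    """Download a single URL. Call this rarely (e.g. once a day)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()
```

To use it, store the returned fingerprint between runs and pass something like `lambda: fetch_url("https://migration.iom.int/covid19reports")` as `fetch`. Any byte-level difference counts as a change, so pages with rotating markup would trigger false alerts.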

arky commented on September 28, 2024

Thank you @adamhooper for the detailed analysis and workaround. I am going to close this bug, as it is more an issue with filtering on third-party servers.

adamhooper commented on September 28, 2024

After discussing as a team, we've decided not to wrestle with uncooperative web servers for the time being.

Also, while we want to produce a more helpful error message, doing so would have a high cost and a high rate of misleading error messages. (This error in particular, "None", could be one of a thousand different problems. So if we were to report "you're breaking the site's terms of service," we'd be wrong 99.9% of the time.)

arky commented on September 28, 2024

Thanks @adamhooper
