Comments (14)
Sorry for the delay! I tried testing the links in README.md last night using awesome_bot locally. I found a few redirecting links, which you can see below. I've included the full log as an attachment.
List of redirecting links
01. [L0074] 301 https://github.com/sibyjackgrove/SolarPV-DER-simulation-utility → https://github.com/tdcosim/SolarPV-DER-simulation-utility
02. [L0150] 301 https://github.com/izabala123/BEMRosetta → https://github.com/BEMRosetta/BEMRosetta
03. [L0157] 301 https://github.com/charxie/energy2d → https://github.com/charxie/multiphysics
04. [L0399] 301 https://github.com/gschivley/PowerGenome → https://github.com/PowerGenome/PowerGenome
05. [L0436] 301 https://gitlab.com/diw-evu/dieter_public/dieter_py → https://gitlab.com/diw-evu/dieter_public/dieterpy
06. [L0459] 301 https://github.com/rl-institut/mvs_eland → https://github.com/rl-institut/multi-vector-simulator
07. [L0464] 302 https://bitbucket.org/harald_g_svendsen/powergama/ → https://bitbucket.org/harald_g_svendsen/powergama/wiki/Home
08. [L0519] 301 https://openei.org → https://openei.org/wiki/Main_Page
09. [L0533] 301 https://github.com/tmrowco/northapp-contrib → https://github.com/tmrowco/bloom-contrib
10. [L0540] 301 https://github.com/mlco2/code-carbon → https://github.com/mlco2/codecarbon
11. [L0603] 301 https://www.appropedia.org/ → https://www.appropedia.org/Welcome_to_Appropedia
12. [L0609] https://ecostress.jpl.nasa.gov/ SSL_connect returned=1 errno=0 state=error: certificate verify failed (unable to get local issuer certificate)
13. [L0633] 301 https://github.com/atreyasha/vegMonitor → https://github.com/atreyasha/vegetation-monitoring
14. [L0654] 301 https://github.com/pyronear/PyroNear → https://github.com/pyronear/pyro-vision
15. [L0704] 301 https://github.com/mankoff/freshwater → https://github.com/GEUS-PROMICE/freshwater
16. [L0713] 404 https://forge.ipsl.jussieu.fr/nemo/chrome/site/doc/NEMO/guide/html/NEMO_guide.html
17. [L0809] 301 https://github.com/apache/climate → https://github.com/apache/attic-climate
18. [L0848] 301 https://github.com/Vizzuality/climate-watch → https://github.com/ClimateWatch-Vizzuality/climate-watch
19. [L0863] 301 https://github.com/adventuroussrv/Climate-Change-Datasets → https://github.com/OpenFloodAI/Climate-Change-Datasets
20. [L0915] 301 https://gitlab.version.fz-juelich.de/toar/mlair → https://gitlab.version.fz-juelich.de/esde/machine-learning/mlair
21. [L0916] 301 https://github.com/amaurymartiny/shoot-i-smoke → https://github.com/shootismoke/mobile-app
22. [L0929] 301 https://github.com/williamorim/Rpollution → https://github.com/openvironment/Rpollution
Notes and observations:
- I added all non-project links to the whitelist, such as the links used by the badges. I did notice that the Twitter account of the author of An Animated Map of the Earth no longer exists. This could be temporary, but you could consider linking their website instead of Twitter.
- I set a request delay of 1 second to prevent failures due to too many requests. The downside is that the test takes about 16 minutes to complete (as there were 931 links). This could be an issue if you have a limited number of CI minutes. I tried setting a shorter delay and it didn't seem to make a noticeable difference, but I could be wrong.
- ECOSTRESS (no. 12) could be added to the whitelist to prevent the test from failing due to unverifiable certificates.
- NEMO (no. 16) gives a 404 but the project still exists; the link should be updated to https://forge.ipsl.jussieu.fr/nemo/wiki/Users.
- Overall, there were only 22 redirecting links, so it should be easy enough to find and replace manually. However, I noticed that some projects have a new name, e.g. energy2d (no. 03) is now multiphysics. So, the description of these projects may also be out of date.
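If replacing the links by hand becomes tedious, the redirect list above could also be applied with a short script. This is only a rough sketch: it assumes the lines look exactly like the numbered list in this comment (which may differ from awesome_bot's raw log), and `parse_redirects` and `apply_redirects` are hypothetical helper names. Redirects to a landing page (e.g. no. 08 and no. 11) may be better left alone, so the result should still be reviewed before committing.

```python
import re

# Matches lines shaped like the list above, for example:
#   01. [L0074] 301 https://old.example/a → https://new.example/b
# Lines without a 301/302 status (SSL errors, 404s) are ignored.
REDIRECT_LINE = re.compile(
    r"^\s*\d+\.\s+\[L\d+\]\s+30[12]\s+(?P<old>\S+)\s+→\s+(?P<new>\S+)\s*$"
)


def parse_redirects(log_text):
    """Return {old_url: new_url} for every redirect line in the text."""
    redirects = {}
    for line in log_text.splitlines():
        match = REDIRECT_LINE.match(line)
        if match:
            redirects[match.group("old")] = match.group("new")
    return redirects


def apply_redirects(markdown, redirects):
    """Replace each old URL only where it appears as a complete URL."""
    for old, new in redirects.items():
        # The lookahead stops "https://openei.org" from also matching
        # the start of "https://openei.org/doc".
        pattern = re.compile(re.escape(old) + r"(?=[\s)\]>]|$)")
        markdown = pattern.sub(lambda _m, new=new: new, markdown)
    return markdown
```

Usage would be along the lines of reading README.md, calling `apply_redirects(text, parse_redirects(log))`, and writing the result back.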
Here's an example GitHub Action file which uses awesome_bot. I've set a monthly schedule and am using the Ruby gem method. Let me know what you think and if you would prefer using a different implementation.
GitHub Action
name: linkcheck
on:
  schedule:
    - cron: "0 3 20 * *"
jobs:
  test:
    runs-on: ubuntu-latest
    # container: dkhamsing/awesome_bot # Docker method
    steps:
      - name: Check out Git repository
        uses: actions/checkout@v2
      # begin Ruby gem method
      - name: Set up Ruby
        uses: ruby/setup-ruby@v1
        with:
          ruby-version: 2.7
      - name: Install awesome_bot and dependencies
        run: |
          gem install awesome_bot
      # end Ruby gem method
      - name: Check links using awesome_bot
        run: |
          awesome_bot --allow-dupe --skip-save-results --request-delay 1 \
          --white-list \
          tabletopwhale.com,protontypes.eu,opensustain.tech,gitter.im,\
          badgen.net,github.com/protontypes/open-sustainable-technology,\
          contrib.rocks,github.com/eleanorlutz/earth_atlas_of_space,\
          twitter.com/eleanor_lutz \
          README.md
from open-sustainable-technology.
Hi @nmstreethran,
Thanks for your work! This is really good feedback. I will look into this and probably implement your suggested workflow soon.
If you are interested in doing more with us, feel free to join any of our online meetings.
@nmstreethran, we use the organization-documents README just for logging. Normally we use our Gitter chat to announce the next meeting: https://gitter.im/protontypes/community
We usually meet at least once a week, on Thursdays at 18:30 CET. If you would like to join and this time does not suit you, we could also change it.
I will try to fix the conflicts :)
Hey @nmstreethran,
that's a good suggestion. It is easily implemented and should help to remove dead projects or redirected URLs.
We should definitely test this. It would also help us read the list in a scripted way, as we are planning to do in #70.
Things that could be relevant:
- GitHub's servers could start blocking the many requests (>1000) that we send.
- In principle, you need a log history to check which URLs have been inaccessible over a longer period of time. Some projects may just be undergoing maintenance at the moment. However, I do not consider this to be problematic. We can also check the action logs manually at regular intervals to see whether the same project is always unavailable. Still, the action could fail even though everything is fine.
- We also have to consider how to deal with other error codes such as 403 and 429.
If you like, you are very welcome to create a pull request and also test our new Continuous Reforestation implementation 🌳. I can also do it so that we have an implementation to discuss.
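To make the log-history point more concrete, here is a minimal sketch of how failures could be counted across runs so that only links failing several times in a row are flagged. Everything in it is an assumption for discussion: the helper name `update_history`, the threshold of 3 runs, and the choice to treat 403 and 429 as inconclusive rather than as dead links. It is not something awesome_bot provides.

```python
# Status codes that usually mean "blocked or throttled", not "link is dead".
INCONCLUSIVE = {403, 429}


def update_history(history, results, threshold=3):
    """Update consecutive-failure counts and return URLs worth reviewing.

    history: {url: consecutive_failures} carried over from previous runs
    results: {url: status_code} from the current run (None = no response)
    """
    flagged = []
    for url, status in results.items():
        if status in INCONCLUSIVE:
            continue  # this run says nothing; keep the previous count
        if status is not None and status < 400:
            history.pop(url, None)  # reachable again, so reset the count
            continue
        history[url] = history.get(url, 0) + 1
        if history[url] >= threshold:
            flagged.append(url)
    return sorted(flagged)
```

The `history` dictionary would need to be stored between runs, for example as a small JSON file committed to the repository or kept as a workflow artifact.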
Thanks @Ly0n.
Regarding the GitHub server blocking the large number of requests, and error code 429 (Too Many Requests), awesome_bot has a --request-delay option to delay each request. Setting it to a reasonable value (maybe 0.5 seconds?) will probably fix this, but it has to be tested. The action will take longer to complete, though.
I'll think about point 2 and other error codes and let you know if I come up with something. I'll check out the issue you referenced and the Continuous Reforestation repository as well in the coming week.
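For a rough idea of the cost of the delay, the minimum run time can be estimated from the number of links alone. This back-of-the-envelope sketch assumes the delay is applied once per link and that requests run one after another; the function name is made up, and the 931 links are the count from my local test.

```python
def minimum_runtime_minutes(link_count, request_delay_s):
    """Lower bound on run time from the delay alone, ignoring response time."""
    return link_count * request_delay_s / 60


for delay in (1.0, 0.5):
    minutes = minimum_runtime_minutes(931, delay)
    print(f"{delay:.1f} s delay: at least {minutes:.1f} min")
# prints:
# 1.0 s delay: at least 15.5 min
# 0.5 s delay: at least 7.8 min
```

The 1-second figure is close to the roughly 16 minutes I measured. The actual time at shorter delays also depends on how long each server takes to respond.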
Thank you for the feedback, @tjarkdoering! I'm happy to contribute further and will join the meetings when possible.
That's really amazing and very important for our future work, since we are planning to read metadata via the GitHub API to create a database from it. That's why it is very important that the list always stays clean and machine-readable. @nmstreethran You are very welcome to join our next session. Check out the slides from the LF Energy conference yesterday to get an idea of what we are going to do with the list in the future (slide 10):
https://github.com/protontypes/organization-documents/blob/master/slides/protontypes_measuring_the_open_and_sustainable_technology_world.pdf
@nmstreethran, I checked your GitHub Action script and the URL issues you found. Again, great work. I would like to implement it today but do not want to steal your PR. It is no problem for me to implement it, but in the end it is your work. What are your thoughts on it?
Hi @Ly0n, I do not mind either way. I can create a PR tomorrow, but if you wish to implement it today, then please go ahead.
By the way, are your meetings every Thursday at 18:30 CET? Just a heads up, the next meeting's date is incorrect in the organization-documents repository.
Sorry @tjarkdoering, I just noticed that you have made a commit regarding this issue!
Just one minute ago 😄
But it was only for the redirect links.
Thank you!
I think it has been fixed now. Let me know if there's anything else I can do!