Comments (9)
"CourseCollectionsJson and CourseTopicUrlsList Urls are not equal"
This error appears because some courses have assessment pages, which this scraper may not support.
A simple workaround is to skip the assessment pages (I don't actually need them).
In src/ScraperType/CourseTopicScraper/CourseTopicScraperMain.py, in the function "scrapeCourse",
below the line "topicUrlsList = self.apiUtils.getCourseTopicUrlsList(textFileUrl, courseUrl)",
add this code:
    # Build a new list instead of calling remove() while iterating, which can skip adjacent matches.
    topicUrlsList = [item for item in topicUrlsList if 'assessment' not in item]
Then restart the scraper.
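One caveat with this kind of workaround: removing items from a list while looping over that same list can skip the element right after each removed one, so two consecutive assessment URLs may not both be dropped. The following is a minimal standalone sketch of that effect and of a safer filter. The helper name drop_assessment_urls and the sample URLs are hypothetical and are not part of the scraper.

```python
def drop_assessment_urls(topic_urls):
    """Return a new list without any URL that contains 'assessment'."""
    return [url for url in topic_urls if "assessment" not in url]


# Hypothetical sample data with two consecutive assessment pages.
sample_urls = [
    "https://www.educative.io/courses/example-course/intro",
    "https://www.educative.io/courses/example-course/assessment-one",
    "https://www.educative.io/courses/example-course/assessment-two",
    "https://www.educative.io/courses/example-course/summary",
]

# Remove-while-iterating: after "assessment-one" is removed, the loop index
# moves past "assessment-two", so that URL is never checked.
removed_in_place = list(sample_urls)
for item in removed_in_place:
    if "assessment" in item:
        removed_in_place.remove(item)

# Filtering into a new list checks every URL exactly once.
filtered = drop_assessment_urls(sample_urls)

print(removed_in_place)
print(filtered)
```

Inside scrapeCourse, the equivalent would be to reassign topicUrlsList to the filtered list right after it is fetched.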
from educative.io_scraper.
Thank you for your answer and suggestion.
I edited that file as you suggested but now I get the following error:
$ python setup.py --run
Educative Scraper (v3.2.5 Master Branch), developed by Anilabha Datta
Project Link: https://github.com/anilabhadatta/educative.io_scraper/tree/v3-dev
Check out ReadMe for more information about this project.
Use the GUI to start scraping.
2023-11-24 16:44:24,446 - INFO - HomeScreen - Creating Home Screen...
2023-11-24 16:44:29,853 - INFO - StartScraper - StartScraper Initiated...
To Terminate, Click on Stop ScraperType Button
2023-11-24 16:44:29,942 - INFO - ExtensionScraper - ExtensionScraper initiated...
2023-11-24 16:44:29,942 - INFO - ExtensionScraper - Started Scraping from Text File URL: https://www.educative.io/courses/intermediate-javascript?showContent=true
2023-11-24 16:44:29,943 - INFO - BrowserUtility - Loading Browser...
2023-11-24 16:44:29,946 - ERROR - StartScraper - start: 20: ExtensionScraper:start: 49: BrowserUtility:loadBrowser: 48: Chromedriver might not be running in background, Please click on Start Chromedriver.
2023-11-24 16:44:35,255 - INFO - HomeScreen - Starting Chrome Driver...
Path: /home/marcello/Documents/Softwares/educative.io_scraper/src/ChromeDrivers/linux/chromedriver-linux64/chromedriver
Option “-e” is deprecated and might be removed in a later version of gnome-terminal.
Use “-- ” to terminate the options and put the command line to execute after it.
2023-11-24 16:44:37,384 - INFO - StartScraper - StartScraper Initiated...
To Terminate, Click on Stop ScraperType Button
2023-11-24 16:44:37,467 - INFO - ExtensionScraper - ExtensionScraper initiated...
2023-11-24 16:44:37,467 - INFO - ExtensionScraper - Started Scraping from Text File URL: https://www.educative.io/courses/intermediate-javascript?showContent=true
2023-11-24 16:44:37,467 - INFO - BrowserUtility - Loading Browser...
2023-11-24 16:44:38,087 - INFO - BrowserUtility - Browser Initiated
2023-11-24 16:44:41,712 - INFO - ApiUtility - Course Type Selector: a[href*='/courses/']
2023-11-24 16:44:51,876 - ERROR - StartScraper - start: 20: ExtensionScraper:start: 52: ExtensionScraper:scrapeCourse: 64: ApiUtility:getCourseUrl: 134: Message:
Stacktrace:
#0 0x5596f119be23
#1 0x5596f0ec47a7
#2 0x5596f0f031d3
#3 0x5596f0f032c1
#4 0x5596f0f3ea04
#5 0x5596f0f2403d
#6 0x5596f0f3c369
#7 0x5596f0f23de3
#8 0x5596f0ef7a7b
#9 0x5596f0ef881e
#10 0x5596f115d638
#11 0x5596f1161507
#12 0x5596f116bc4c
#13 0x5596f1162136
#14 0x5596f11309cf
#15 0x5596f1185b98
#16 0x5596f1185d68
#17 0x5596f1194cb3
#18 0x7f4dcf494ac3
I tried with another course and it works.
The above error seems to be specific to that course.
Do not use: https://www.educative.io/courses/intermediate-javascript
Use: https://www.educative.io/courses/intermediate-javascript/getting-started
Use the course's first topic URL, not the course URL.
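To catch this mistake before starting a run, a small check along these lines could help. It is only a sketch that assumes topic URLs have the shape /courses/<course>/<topic>, as in the two examples above; other URL shapes on the site are not covered, and is_topic_url is a hypothetical helper, not part of the scraper.

```python
from urllib.parse import urlparse


def is_topic_url(url):
    """Return True if the URL path looks like /courses/<course>/<topic>."""
    # Split the path into non-empty segments; the query string is ignored.
    parts = [part for part in urlparse(url).path.split("/") if part]
    return len(parts) >= 3 and parts[0] == "courses"


course_url = "https://www.educative.io/courses/intermediate-javascript?showContent=true"
topic_url = "https://www.educative.io/courses/intermediate-javascript/getting-started"

print(is_topic_url(course_url))
print(is_topic_url(topic_url))
```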
ok, thank you!
I actually have a similar problem with this course: it does not work even if I set the first page of the course as the download link.
https://www.educative.io/courses/fundamentals-of-digital-signal-processing
or
https://www.educative.io/courses/fundamentals-of-digital-signal-processing/the-world-of-signals
@buscon checking
@buscon fixed in latest commit, do git pull
I just tested it and can confirm it works. Thank you!