mjbright / futurelearn-dl

A script to download materials from the FutureLearn website (for enrolled courses)

License: GNU General Public License v3.0

Shell 2.72% Python 97.28%
downloader futurelearn-dl futurelearn-website enrolled-courses

futurelearn-dl's Introduction

futurelearn-dl

An early Python 3 attempt at automating downloads from the FutureLearn website (for enrolled courses).

There are no doubt problems with this, but it seems to work in my initial tests.

TESTED: On Windows 8 under Cygwin, using Anaconda Python 3. It should work with other installations, but YMMV.

futurelearn-dl.py:

First attempt at a Python 3 version.

It currently succeeds in obtaining the authenticity_token and in logging in with this token.

It then

  • downloads the appropriate course page
  • downloads each 'week' page for the course
  • downloads each 'step' page for each week of the course
  • finds downloadable URLs (pdf and mp4 for the moment) in each 'step' page
  • chooses a filename (not a meaningful one for mp4) and downloads to that file
    • skips already downloaded files
    • skips the file if it contains "request signature", which seems to indicate a video file that isn't available yet
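
The last two steps of the list above can be sketched as follows. This is a minimal illustration, not the script's actual code: the regular expression, `find_downloadable_urls` and `should_skip` are hypothetical names, and the real matching logic may differ.

```python
import os
import re

# Hypothetical pattern: absolute http(s) URLs ending in .pdf or .mp4 that
# appear inside double-quoted HTML attributes.
URL_PATTERN = re.compile(r'"(https?://[^"\s]+\.(?:pdf|mp4))"', re.IGNORECASE)


def find_downloadable_urls(page_html):
    """Return the unique pdf/mp4 URLs found in a step page, in page order."""
    seen = []
    for url in URL_PATTERN.findall(page_html):
        if url not in seen:
            seen.append(url)
    return seen


def should_skip(target_path, body):
    """Skip files that already exist, or bodies that look like a placeholder."""
    if os.path.exists(target_path):
        return True
    # Per the note above, "request signature" seems to mark unavailable videos.
    return b"request signature" in body


sample = ('<a href="https://example.org/a/notes.pdf">notes</a> '
          '<video src="https://example.org/v/1.mp4">')
print(find_downloadable_urls(sample))
```

A regular expression is enough for a sketch; an HTML parser would be more robust against unusual attribute quoting.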

TEST_futurelearn-dl.py.sh:

This is simply a template for calling futurelearn-dl.py.

Put your email, password and course_id as arguments within this file

Usage:

''' futurelearn-dl.py <email> <password> <course_id> <course_run> [<week_num>]

e.g.

for run 1 of data-to-insight

futurelearn-dl.py  user password data-to-insight 1

or to get just week1:

futurelearn-dl.py  user password data-to-insight 1 1

'''

Note: To override the temp file directory, set TMP_DIR, e.g. export TMP_DIR=/tmp

Note: To override the output file root directory, set OP_DIR, e.g. export OP_DIR=/e/Education/FUTURELEARN

Note: Under Cygwin with Anaconda, I needed to set the path in the form DRIVE:/path, e.g. export OP_DIR=e:/Education/FUTURELEARN

TODO:

  • Fix unicode errors
  • Extend to more download types
  • Lots more ...

futurelearn-dl's People

Contributors

mjbright

futurelearn-dl's Issues

SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)')))

Hi,

I am getting the following SSL certificate error even though the proper certificates are installed:

$ pip install --upgrade certifi
Defaulting to user installation because normal site-packages is not writeable
Requirement already up-to-date: certifi in /usr/lib/python3.8/site-packages (2020.4.5.1)
Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 376, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 994, in _validate_conn
    conn.connect()
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 352, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 370, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/usr/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 719, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3.8/site-packages/urllib3/util/retry.py", line 436, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='elearning.rcgp.org.uk', port=443): Max retries exceeded with url: /pluginfile.php/148915/mod_resource/content/3/NHS_VC_Info%20for%20GPs_v06.pdf (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./futurelearn-dl.py", line 625, in <module>
    getCourseWeekStepPage(course_id, week_id, step_id, week_num, title)
  File "./futurelearn-dl.py", line 232, in getCourseWeekStepPage
    downloadURLsInPage(course_id, week_id, step_id, week_num, content, DOWNLOAD_TYPE, page_title)
  File "./futurelearn-dl.py", line 386, in downloadURLsInPage
    downloadURLInPage(url, download_dir, DOWNLOAD_TYPE, page_title)
  File "./futurelearn-dl.py", line 452, in downloadURLInPage
    downloadURLToFile(url, ofile, DOWNLOAD_TYPE)
  File "./futurelearn-dl.py", line 405, in downloadURLToFile
    response = session.get(url, headers=headers)
  File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 543, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='elearning.rcgp.org.uk', port=443): Max retries exceeded with url: /pluginfile.php/148915/mod_resource/content/3/NHS_VC_Info%20for%20GPs_v06.pdf (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)')))

I have also tried a single certificate, but got the same SSL verification error:

curl -sO http://cacerts.digicert.com/DigiCertHighAssuranceEVRootCA.crt 
openssl x509 -inform DES -in DigiCertHighAssuranceEVRootCA.crt -out DigiCertHighAssuranceEVRootCA.pem -text
export PIP_CERT=`pwd`/DigiCertHighAssuranceEVRootCA.pem

Any input?

Cheers, and stay safe,
/z
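
One thing worth noting about the attempt above: `PIP_CERT` is read by pip only, so exporting it does not change how the `requests` calls in the traceback verify certificates. `requests` consults the `REQUESTS_CA_BUNDLE` and `CURL_CA_BUNDLE` environment variables, or the `verify` argument. A minimal sketch, assuming a PEM bundle that contains the missing issuer certificate is available locally (`pick_verify` is a hypothetical helper, not part of the script):

```python
import os


def pick_verify(environ=os.environ):
    """Return a value suitable for the `verify` argument of requests calls.

    Hypothetical helper: prefers an explicit CA bundle path from the
    environment, otherwise True (use the default bundle).
    """
    for name in ("REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE"):
        path = environ.get(name)
        if path and os.path.isfile(path):
            return path
    return True


# Usage with requests (not executed here):
#   session = requests.Session()
#   session.verify = pick_verify()
#   response = session.get(url, headers=headers)
print(pick_verify({}))
```

"unable to get local issuer certificate" is often a sign that the server does not send its intermediate certificate, in which case the bundle would need to include that intermediate, not only the root.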

Any way that one can skip an offending file?

Hi,

The download frequently gets interrupted when a specific upstream file has a % sign in its filename. A case in point is #17.

Is there a way to specify on the command line that a specific offending file should be skipped, so that the rest of the course keeps downloading?

Thanks in advance.
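
One way such a skip could work is sketched below: each download runs inside its own try/except so a single bad file is recorded rather than ending the run, and shell-style patterns allow files to be excluded up front. The names `download_all`, `fetch` and `skip_patterns` are hypothetical and not options of the script. This also assumes that failures surface as exceptions; if the script's `fatal()` exits the process, it would have to raise instead.

```python
import fnmatch


def download_all(urls, fetch, skip_patterns=()):
    """Try every URL; record failures instead of aborting the whole run.

    `fetch` is any callable that downloads one URL and raises on error.
    `skip_patterns` are shell-style wildcards for URLs to leave out.
    """
    done, skipped, failed = [], [], []
    for url in urls:
        if any(fnmatch.fnmatch(url, pattern) for pattern in skip_patterns):
            skipped.append(url)
            continue
        try:
            fetch(url)
            done.append(url)
        except Exception as exc:  # keep going, but remember what went wrong
            failed.append((url, str(exc)))
    return done, skipped, failed


def fake_fetch(url):
    # Stand-in for a real download; fails on '%' like the reported case.
    if "%" in url:
        raise ValueError("Unhandled escape sequence")


result = download_all(
    ["http://example.org/a.pdf", "http://example.org/b%281%29.pdf", "http://example.org/c.mp4"],
    fake_fetch,
    skip_patterns=["*.mp4"],
)
print(result)
```

Returning the three lists lets the caller print a summary of skipped and failed files at the end of the run.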

error and mp4 download issue

Hi Bright,

I think the previous error was caused by my local network restrictions. But the error below always appears, whether or not I use a VPN to bypass the restrictions:

ERROR: UnicodeEncodeError - 'ascii' codec can't encode character '\xa5' in position 21140: ordinal not in range(128)

When I use a VPN, the SSL error below doesn't appear.

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 559, in urlopen
body=body, headers=headers)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 345, in _make_request
self._validate_conn(conn)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 784, in _validate_conn
conn.connect()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connection.py", line 252, in connect
ssl_version=resolved_ssl_version)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/util/ssl_.py", line 305, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 376, in wrap_socket
_context=self)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 747, in __init__
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 983, in do_handshake
self._sslobj.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 628, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:646)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/adapters.py", line 376, in send
timeout=timeout
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 588, in urlopen
raise SSLError(e)
requests.packages.urllib3.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:646)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "futurelearn-dl.py", line 536, in <module>
token, cookies = getToken(session, SIGNIN_URL)
File "futurelearn-dl.py", line 144, in getToken
response = session.get(url, headers=headers)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/sessions.py", line 480, in get
return self.request('GET', url, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/adapters.py", line 447, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:646)

But when it downloads an mp4, it takes so long that I'm not sure whether it's working.

Downloading 6-week course 'inside-cancer'
ERROR: UnicodeEncodeError - 'ascii' codec can't encode character '\xa5' in position 21140: ordinal not in range(128)
Downloading url<https://ugc.futurelearn.com/uploads/files/23/bc/23bc970a-2eaa-457e-90b8-564dc5de5c24/IC6S1_Models_used_in_Cancer_Research_6_1.pdf> ...
type=pdf, content.len=191414
Downloading url<https://ugc.futurelearn.com/uploads/files/06/69/066931f3-d9cb-4008-85a4-34362eeb53c5/IC6_S1_Models_used_in_Cancer_Research.pdf> ...
type=pdf, content.len=1631448
Downloading url<https://view.vzaar.com/1435674/video> ...

I waited 30 minutes and it was still on that line. How can I see the progress and know whether the file is downloading? Also, could you add an option that writes a list of the URLs and their corresponding file names to a file? Tools like coursera-dl and edx-dl have an option like that. Thank you.

Best regards.

Rui Guo
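
A progress display of the kind asked for here could be sketched as below: the body is consumed in chunks and a running byte count is reported after each one. `copy_with_progress` is a hypothetical helper, not part of the script; the commented lines show how it might be fed from a streamed `requests` response, assuming the server sends a `Content-Length` header.

```python
import io


def copy_with_progress(chunks, out, total=None, report=print):
    """Write an iterable of byte chunks to `out`, reporting bytes written."""
    written = 0
    for chunk in chunks:
        out.write(chunk)
        written += len(chunk)
        if total:
            report("{} / {} bytes ({:.0f}%)".format(written, total, 100 * written / total))
        else:
            report("{} bytes".format(written))
    return written


# With requests, the chunks would typically come from a streamed response:
#   response = session.get(url, headers=headers, stream=True)
#   total = int(response.headers.get("Content-Length", 0)) or None
#   with open(ofile, "wb") as out:
#       copy_with_progress(response.iter_content(chunk_size=65536), out, total)
buffer = io.BytesIO()
messages = []
copy_with_progress([b"abcd", b"ef"], buffer, total=6, report=messages.append)
print(messages)
```

Streaming also avoids holding a whole video in memory before it is written to disk.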

FATAL:downloadURLInPage: Unhandled escape sequence in filename (how to sort this out?)

Hi,

I would appreciate any input on getting past this specific issue, which has been haunting me for a long time (also see #14):

$ ././futurelearn-dl.py EMAIL PASSWORD instructional-methods-in-health-professions-education 1
Downloading 8-week course 'instructional-methods-in-health-professions-education'
FATAL:downloadURLInPage: Unhandled escape sequence in filename <1.4-Weekly-Overview_Philosophy_of_Adult_Education_Inventory_%281%29.pdf>
Look for new files with - find /home/zenny/Downloads/Education/FUTURELEARN/instructional-methods-in-health-professions-education -type f -exec ls -altr {} \;

This issue reappeared when the referenced link contains `()` symbols.

The link in question is:

https://pbea.agron.iastate.edu/files/Philosophy%20of%20Adult%20Education%20Inventory%20%281%29.pdf

I manually downloaded the pdf file into the week 1 folder and also copied it under multiple names such as 1.4-Weekly-Overview_Philosophy_of_Adult_Education_Inventory_%281%29.pdf, yet no go!

Related code

A search in the futurelearn-dl.py script leads to the following lines, which are responsible for the fatal error:

    if '%' in filename:
        fatal("downloadURLInPage: Unhandled escape sequence in filename <{}>".format(filename))

How can this be sorted out in a Python script when the upstream link contains %?

Cheers and stay safe,
/z
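
One possible approach, sketched here under the assumption that the filename is derived from the last path segment of the URL, is to percent-decode the segment with `urllib.parse.unquote` rather than treating `%` as fatal. `filename_from_url` is a hypothetical helper, not a function from the script.

```python
import re
from urllib.parse import unquote, urlsplit


def filename_from_url(url):
    """Derive a filesystem-friendly name from the last path segment of a URL."""
    last_segment = urlsplit(url).path.rsplit("/", 1)[-1]
    decoded = unquote(last_segment)  # '%20' -> ' ', '%28' -> '(', '%29' -> ')'
    # Replace whitespace and characters that are awkward in filenames.
    return re.sub(r'[\s/\\:*?"<>|]+', "_", decoded)


url = ("https://pbea.agron.iastate.edu/files/"
       "Philosophy%20of%20Adult%20Education%20Inventory%20%281%29.pdf")
print(filename_from_url(url))  # Philosophy_of_Adult_Education_Inventory_(1).pdf
```

The URL itself should still be requested in its original, encoded form; only the local filename is decoded.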

[SOLVED] Download dies with 'FATAL:No quote(char=\) in <<t;<a href=\"/profile...>>'

Retrieval dies with 'Unhandled escape sequence in filename'

$ futurelearn-dl.py EMAIL PASSWORD diabetes-genomic-medicine 14
...
Downloading 4-week course 'diabetes-genomic-medicine'
Downloading url<https://ugc.futurelearn.com/uploads/files/86/fc/86fc7901-2955-4735-a9ed-ca82fd83fd45/Glossary_16.05.16.pdf>
	to file <diabetes-genomic-medicine/week1/1.1-Welcome-and-introduction-to-the-course_Glossary_16.05.16.pdf> ...
type=pdf, content.len=346813
Downloading url<https://ugc.futurelearn.com/uploads/files/e7/51/e7510d14-13cf-4d05-b2d0-c10585c67db5/Resources.pdf>
	to file <diabetes-genomic-medicine/week1/1.1-Welcome-and-introduction-to-the-course_Resources.pdf> ...
type=pdf, content.len=289436
Downloading url<https://view.vzaar.com/5894540/video>
	to file <diabetes-genomic-medicine/week1/1.2-&quot;It&#39;s-changed-our-lives&quot;_5894540.mp4> ...
type=mp4, content.len=5417214
Downloading url<https://view.vzaar.com/6187238/video>
	to file <diabetes-genomic-medicine/week1/1.5-Understanding-the-pathophysiology_6187238.mp4> ...
type=mp4, content.len=4337309
Downloading url<https://view.vzaar.com/6181836/video>
	to file <diabetes-genomic-medicine/week1/1.8-The-biggest-change-for-us_6181836.mp4> ...
type=mp4, content.len=3585832
Downloading url<https://view.vzaar.com/6069128/video>
	to file <diabetes-genomic-medicine/week1/1.9-What-is-the-impact-for-clinicians_6069128.mp4> ...
type=mp4, content.len=18589056
Downloading url<http://www.diabetesatlas.org/resources/2015-atlas.html>
	to file <diabetes-genomic-medicine/week1/1.10-What-is-the-prevalence-of-diabetes_2015-atlas.html> ...
downloadURLToFile: Failed to download url <http://www.diabetesatlas.org/resources/2015-atlas.html> => 404
Downloading url<https://diabetes-resources-production.s3-eu-west-1.amazonaws.com/diabetes-storage/migration/pdf/DiabetesUK_Facts_Stats_Oct16.pdf>
	to file <diabetes-genomic-medicine/week1/1.12-How-does-a-family-history-affect-my-risk-of-diabetes_DiabetesUK_Facts_Stats_Oct16.pdf> ...
type=pdf, content.len=568295
Downloading url<https://view.vzaar.com/5952300/video>
	to file <diabetes-genomic-medicine/week1/1.13-Sharing-stories_5952300.mp4> ...
type=mp4, content.len=31911949
Downloading url<https://view.vzaar.com/5984535/video>
	to file <diabetes-genomic-medicine/week2/2.3-What-genomics-can-teach-us-about-polygenic-diabetes_5984535.mp4> ...
type=mp4, content.len=37612216
Downloading url<http://www.nature.com/nrg/journal/v6/n3/abs/nrg1556.html>
	to file <diabetes-genomic-medicine/week2/2.4-The-obesity-epidemic_nrg1556.html> ...
type=html, content.len=493548
Downloading url<http://care.diabetesjournals.org/content/early/2015/10/30/dc15-1111.full.pdf>
	to file <diabetes-genomic-medicine/week2/2.6-What-can-we-learn-from-polygenics_dc15-1111.full.pdf> ...
type=pdf, content.len=1098042
Downloading url<https://view.vzaar.com/5943210/video>
	to file <diabetes-genomic-medicine/week2/2.9-C-peptide_5943210.mp4> ...
type=mp4, content.len=3445916
Downloading url<https://view.vzaar.com/5953006/video>
	to file <diabetes-genomic-medicine/week2/2.11-Family-Trees_5953006.mp4> ...
type=mp4, content.len=7580428
Downloading url<https://view.vzaar.com/6134563/video>
	to file <diabetes-genomic-medicine/week3/3.2-Dan’s-story_6134563.mp4> ...
type=mp4, content.len=6563552
Downloading url<https://view.vzaar.com/6134585/video>
	to file <diabetes-genomic-medicine/week3/3.4-Misdiagnosis_6134585.mp4> ...
type=mp4, content.len=12608396
Downloading url<http://www.nature.com/nrneph/journal/v11/n2/full/nrneph.2014.232.html>
	to file <diabetes-genomic-medicine/week3/3.5-Further-reading-on-the-different-MODY-subtypes_nrneph.2014.232.html> ...
type=html, content.len=290881
FATAL:downloadURLInPage: Unhandled escape sequence in filename <3.5-Further-reading-on-the-different-MODY-subtypes_PIIS0140-6736%2803%2914571-0.pdf>

The same happens with digital-cancer-management run 1, which died with 'FATAL:No quote(char=\) in <<t;<a href=\"/profile...>>', as shown below:

$ ./futurelearn-dl.py EMAIL PASSWORD digital-cancer-management 1
Downloading 4-week course 'digital-cancer-management'
Downloading url<https://view.vzaar.com/12215674/video>
	to file <digital-cancer-management/week1/1.1-Course-introduction_12215674.mp4> ...
type=mp4, content.len=55341081
Downloading url<https://view.vzaar.com/12148582/video>
	to file <digital-cancer-management/week1/1.4-Showcase-of-Digital-Health-Technologies_12148582.mp4> ...
type=mp4, content.len=46315912
Downloading url<https://view.vzaar.com/12177169/video>
	to file <digital-cancer-management/week1/1.5-Patient-advocacy:-working-with-and-for-the-patient_12177169.mp4> ...
type=mp4, content.len=38991678
Downloading url<https://view.vzaar.com/12177170/video>
	to file <digital-cancer-management/week1/1.6-Patient-involvement-makes-a-better-healthcare-technology-ecosystem-_12177170.mp4> ...
type=mp4, content.len=17914038
Downloading url<https://view.vzaar.com/12201211/video>
	to file <digital-cancer-management/week2/2.1-Mind-body-connections_12201211.mp4> ...
type=mp4, content.len=26357533
Downloading url<https://view.vzaar.com/12213553/video>
	to file <digital-cancer-management/week2/2.3-Sleep,-rest,-eat-well-and-reduce-stress_12213553.mp4> ...
type=mp4, content.len=19372637
FATAL:No quote(char=\) in <<t;<a href=\"/profile...>>

Thanks for the wonderful tool, @mjbright. Season's greetings!

Failing to Download PDFs Ends Download: Long outstanding issue!

Trying to download https://www.futurelearn.com/courses/research-question fails while downloading a pdf (possibly non-existent). Is there a way to skip that specific pdf and continue downloading? Thanks!

$ ./TEST_futurelearn-dl.py_research-question.sh 
Downloading 5-week course 'research-question'
Downloading url<https://view.vzaar.com/11288441/video>
	to file <research-question/week1/5.2-What-our-current-students-say_11288441.mp4> ...
type=mp4, content.len=36893991
Downloading url<https://view.vzaar.com/13650997/video>
	to file <research-question/week1/5.3-Studying-for-a-PhD-by-distance-learning_13650997.mp3> ...
type=mp3, content.len=6218010
Downloading url<https://view.vzaar.com/11288166/video>
	to file <research-question/week1/5.4-Professor-Martin-Parker,-Director-of-Research-in-the-School-of-Business_11288166.mp4> ...
type=mp4, content.len=78980655
Downloading url<https://view.vzaar.com/11369808/video>
	to file <research-question/week1/5.5-Professor-Kirsten-Malmkjaer,-Professor-of-Translation-Studies_11369808.mp3> ...
type=mp3, content.len=14569992
Downloading url<"http://fass.open.ac.uk/sites/fass.open.ac.uk/files/files/research/sample-research-proposal.pdf>
	to file <research-question/week1/5.15-Bringing-it-all-together_sample-research-proposal.pdf> ...
Traceback (most recent call last):
  File "./futurelearn-dl.py", line 625, in <module>
    getCourseWeekStepPage(course_id, week_id, step_id, week_num, title)
  File "./futurelearn-dl.py", line 232, in getCourseWeekStepPage
    downloadURLsInPage(course_id, week_id, step_id, week_num, content, DOWNLOAD_TYPE, page_title)
  File "./futurelearn-dl.py", line 386, in downloadURLsInPage
    downloadURLInPage(url, download_dir, DOWNLOAD_TYPE, page_title)
  File "./futurelearn-dl.py", line 452, in downloadURLInPage
    downloadURLToFile(url, ofile, DOWNLOAD_TYPE)
  File "./futurelearn-dl.py", line 405, in downloadURLToFile
    response = session.get(url, headers=headers)
  File "/home/zenny/.local/lib/python3.4/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/home/zenny/.local/lib/python3.4/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/zenny/.local/lib/python3.4/site-packages/requests/sessions.py", line 640, in send
    adapter = self.get_adapter(url=request.url)
  File "/home/zenny/.local/lib/python3.4/site-packages/requests/sessions.py", line 731, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for '"http://fass.open.ac.uk/sites/fass.open.ac.uk/files/files/research/sample-research-proposal.pdf'

Same here:
https://www.futurelearn.com/courses/mindfulness-wellbeing-performance/

Downloading 4-week course 'mindfulness-wellbeing-performance'
Downloading url<http://www.danielgilbert.com/KILLINGSWORTH%20&amp;%20GILBERT%20(2010).pdf>
	to file <mindfulness-wellbeing-performance/week1/1.4-What-is-mindfulness-and-why-does-it-matter_KILLINGSWORTH_&amp;_GILBERT_(2010).pdf> ...
downloadURLToFile: Failed to download url <http://www.danielgilbert.com/KILLINGSWORTH%20&amp;%20GILBERT%20(2010).pdf> => 403
Downloading url<"http://www.danielgilbert.com/KILLINGSWORTH%20&amp;amp;%20GILBERT%20(2010).pdf>
	to file <mindfulness-wellbeing-performance/week1/1.4-What-is-mindfulness-and-why-does-it-matter_KILLINGSWORTH_&amp;amp;_GILBERT_(2010).pdf> ...
Traceback (most recent call last):
  File "./futurelearn-dl.py", line 625, in <module>
    getCourseWeekStepPage(course_id, week_id, step_id, week_num, title)
  File "./futurelearn-dl.py", line 232, in getCourseWeekStepPage
    downloadURLsInPage(course_id, week_id, step_id, week_num, content, DOWNLOAD_TYPE, page_title)
  File "./futurelearn-dl.py", line 386, in downloadURLsInPage
    downloadURLInPage(url, download_dir, DOWNLOAD_TYPE, page_title)
  File "./futurelearn-dl.py", line 452, in downloadURLInPage
    downloadURLToFile(url, ofile, DOWNLOAD_TYPE)
  File "./futurelearn-dl.py", line 405, in downloadURLToFile
    response = session.get(url, headers=headers)
  File "/home/zenny/.local/lib/python3.4/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/home/zenny/.local/lib/python3.4/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/zenny/.local/lib/python3.4/site-packages/requests/sessions.py", line 640, in send
    adapter = self.get_adapter(url=request.url)
  File "/home/zenny/.local/lib/python3.4/site-packages/requests/sessions.py", line 731, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for '"http://www.danielgilbert.com/KILLINGSWORTH%20&amp;amp;%20GILBERT%20(2010).pdf'
Look for new files with - find /home/zenny/Downloads//Education/FUTURELEARN/mindfulness-wellbeing-performance -type f -exec ls -altr {} \;
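
Both `InvalidSchema` failures above involve a URL that still carries a leading `"` from the page source, and the second one is also HTML-escaped twice (`&amp;amp;`). A minimal sketch of cleaning an extracted URL before requesting it is shown below; `clean_extracted_url` is a hypothetical helper, and repeated unescaping is a heuristic that could over-decode a URL whose query string legitimately contains entity-like text.

```python
import html


def clean_extracted_url(raw):
    """Strip stray quotes/whitespace and undo (possibly repeated) HTML escaping."""
    url = raw.strip().strip("\"'")
    # '&amp;amp;' needs two passes; stop once another pass changes nothing.
    while True:
        unescaped = html.unescape(url)
        if unescaped == url:
            return url
        url = unescaped


raw = '"http://www.danielgilbert.com/KILLINGSWORTH%20&amp;amp;%20GILBERT%20(2010).pdf'
print(clean_extracted_url(raw))
```

With the stray quote removed, the URL starts with `http://` again, so `requests` can find a connection adapter for it.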

unable to download

OS: Ubuntu 14.04
Python: 3

the course that I'm trying to download: https://www.futurelearn.com/courses/ageing/1/todo

gar@gar-B85M:~/Downloads/futurelearn-dl-master$ python futurelearn-dl.py myEmailId MyPassword ageing 1
Downloading course 'ageing' - seems to comprise of 6 weeks
Traceback (most recent call last):
  File "futurelearn-dl.py", line 552, in <module>
    getCourseWeekStepPage(course_id, week_id, step_id, week_num)
  File "futurelearn-dl.py", line 226, in getCourseWeekStepPage
    getDownloadableURLs(course_id, week_id, step_id, week_num, content, DOWNLOAD_TYPE)
  File "futurelearn-dl.py", line 359, in getDownloadableURLs
    downloadFile(url, download_dir, DOWNLOAD_TYPE)
  File "futurelearn-dl.py", line 419, in downloadFile
    downloadURLToFile(url, ofile, DOWNLOAD_TYPE)
  File "futurelearn-dl.py", line 370, in downloadURLToFile
    debug(1, "Downloading url<{}> ...".format(url), end='')
TypeError: debug() got an unexpected keyword argument 'end'
gar@gar-B85M:~/Downloads/futurelearn-dl-master$

gar@gar-B85M:~/Downloads/futurelearn-dl-master$ python futurelearn-dl.py myEmailId MyPassword ageing 1 1
Downloading course 'ageing' - seems to comprise of 6 weeks
Downloading week 1
Downloading available week1 material
Traceback (most recent call last):
  File "futurelearn-dl.py", line 563, in <module>
    getCourseWeekStepPage(course_id, week_id, step_id, week_num)
  File "futurelearn-dl.py", line 226, in getCourseWeekStepPage
    getDownloadableURLs(course_id, week_id, step_id, week_num, content, DOWNLOAD_TYPE)
  File "futurelearn-dl.py", line 359, in getDownloadableURLs
    downloadFile(url, download_dir, DOWNLOAD_TYPE)
  File "futurelearn-dl.py", line 419, in downloadFile
    downloadURLToFile(url, ofile, DOWNLOAD_TYPE)
  File "futurelearn-dl.py", line 370, in downloadURLToFile
    debug(1, "Downloading url<{}> ...".format(url), end='')
TypeError: debug() got an unexpected keyword argument 'end'
gar@gar-B85M:~/Downloads/futurelearn-dl-master$

my python version

gar@gar-B85M:~/Downloads/futurelearn-dl-master$ python
Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.

Let me know if I'm going wrong.
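
The `TypeError` in both runs says that `debug()` is called with `end=''` but its definition does not accept that keyword. The script's actual `debug()` is not shown in this thread, so the following is only a sketch of one way to write such a helper so that `print()` options pass through; `DEBUG_LEVEL` is a hypothetical setting.

```python
import sys

DEBUG_LEVEL = 1  # hypothetical module-level verbosity setting


def debug(level, *args, **kwargs):
    """Print when `level` is within DEBUG_LEVEL, forwarding print() options."""
    if level <= DEBUG_LEVEL:
        kwargs.setdefault("file", sys.stderr)
        print(*args, **kwargs)


debug(1, "Downloading url<{}> ...".format("https://example.org/a.pdf"), end="")
debug(1, " done")
debug(2, "not shown at level 1")
```

Forwarding `**kwargs` means keywords such as `end`, `sep` and `file` behave exactly as they do with `print()`.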

Incomplete download

Hi, I've used udemy-dl and coursera-dl, so thought I'd try this. I don't get any errors, but from what I can tell only about half of the videos download. Those that do get downloaded are complete.

None of the articles download, although I think that they are being presented differently than before, which may explain it. As an example, forensic-psychology 4 downloads 14 of the 35 videos and nothing else.

Thanks for your work on this so far. If you want me to try anything, just ask.

Token Failure

Hi, and once again thank you very much for your program. I'm keeping myself busy during the coronavirus pandemic by taking more courses online!

Got the following error after I downloaded the source code and pasted it into a file that I called Futurelearn-dl.py

Also using Python3.6

C:\Users\ivan\Pythonprograms>python futurelearn-dl.py email password battery-storage-applications 7
  File "futurelearn-dl.py", line 149
    fatal("getToken: No authenticity_token in response")
    ^
IndentationError: unexpected indent

Can you help?
Many thanks,
Ivan

ssl error

Hi Bright. I'm using OS X 10.9, and the error message is below:

Downloading 6-week course 'inside-cancer'
ERROR: UnicodeEncodeError - 'ascii' codec can't encode character '\xa5' in position 21140: ordinal not in range(128)
Downloading url<https://view.vzaar.com/1435674/video> ...
type=mp4, content.len=53541260
ERROR: UnicodeEncodeError - 'ascii' codec can't encode character '\xa5' in position 21133: ordinal not in range(128)
Downloading url<https://ugc.futurelearn.com/uploads/files/da/20/da20c945-e551-4985-b4b2-b91fd601a887/IC_-_Glossary.pdf> ...
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 559, in urlopen
body=body, headers=headers)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 345, in _make_request
self._validate_conn(conn)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 784, in _validate_conn
conn.connect()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connection.py", line 252, in connect
ssl_version=resolved_ssl_version)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/util/ssl_.py", line 305, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 376, in wrap_socket
_context=self)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 747, in __init__
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 983, in do_handshake
self._sslobj.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 628, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:646)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/adapters.py", line 376, in send
timeout=timeout
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py", line 588, in urlopen
raise SSLError(e)
requests.packages.urllib3.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:646)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "futurelearn-dl.py", line 569, in <module>
getCourseWeekStepPage(course_id, week_id, step_id, week_num)
File "futurelearn-dl.py", line 240, in getCourseWeekStepPage
getDownloadableURLs(course_id, week_id, step_id, week_num, content, DOWNLOAD_TYPE)
File "futurelearn-dl.py", line 373, in getDownloadableURLs
downloadFile(url, download_dir, DOWNLOAD_TYPE)
File "futurelearn-dl.py", line 433, in downloadFile
downloadURLToFile(url, ofile, DOWNLOAD_TYPE)
File "futurelearn-dl.py", line 388, in downloadURLToFile
response = session.get(url, headers=headers)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/sessions.py", line 480, in get
return self.request('GET', url, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/requests/adapters.py", line 447, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:646)

Can you figure out how to make it right?
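
Separately from the SSL failure, the `UnicodeEncodeError` lines in the log above are typical of text containing non-ASCII characters (here `'\xa5'`) being printed or written through an ASCII-only stream. Where exactly the script hits this is not shown, so the following is only a sketch of a defensive conversion; `safe_text` is a hypothetical helper.

```python
import sys


def safe_text(text, encoding=None):
    """Return `text` with characters the target encoding lacks escaped."""
    encoding = encoding or sys.stdout.encoding or "ascii"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


print(safe_text("price: \xa5100", encoding="ascii"))
```

Setting the `PYTHONIOENCODING=utf-8` environment variable before running the script is another common way to avoid the error for console output.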

what am I doing wrong?

I downloaded futurelearn-dl.py to C:\Python27\Scripts
From that folder I ran:

futurelearn-dl.py myemail mypw big-data-visualisation 2
(with myemail and mypw replaced by the actual email address and pw)

This yields the following error:

Traceback (most recent call last):
File "C:\Python27\Scripts\futurelearn-dl.py", line 585, in <module>
OP_DIR = os.getenv('OP_DIR', default=os.getenv('HOME') + '/Education/FUTURELEARN')
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
<<<<<<<<<<<<<<<<<

What is going wrong? I am on Windows10, x64, Python 2.7.11 (and I am enrolled in this particular course). The same error pops up when running under Python 3.5.2.

Thank you.
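
The traceback points at `os.getenv('HOME')` returning `None`, which is the usual case in a plain Windows shell where `HOME` is not set. Because the default expression is evaluated before `os.getenv('OP_DIR', ...)` is called, the error occurs even when `OP_DIR` is set. A minimal sketch of a more forgiving default (`default_output_dir` is a hypothetical helper):

```python
import os


def default_output_dir(environ=os.environ):
    """Pick the output root: OP_DIR if set, else a folder under the home dir."""
    explicit = environ.get("OP_DIR")
    if explicit:
        return explicit
    # expanduser('~') also works where HOME is unset, e.g. a plain Windows shell.
    home = os.path.expanduser("~")
    return os.path.join(home, "Education", "FUTURELEARN")


print(default_output_dir({"OP_DIR": "e:/Education/FUTURELEARN"}))
```

With the script as posted, setting `HOME` in the shell before running it should also avoid the `TypeError`.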
