
@EMG70's YouTube playlist downloading problems 2024-06-10: (1) (This is taking longer than expected) (2) Metadata Fetch: [YouTube video URL] failed: 'NoneType' object cannot be interpreted as an integer [@avni results: (3) failed to download: None (4) it keeps trying to redownload failed videos] about calibre-web HOT 20 OPEN

holta commented on August 15, 2024
@EMG70's YouTube playlist downloading problems 2024-06-10: (1) (This is taking longer than expected) (2) Metadata Fetch: [YouTube video URL] failed: 'NoneType' object cannot be interpreted as an integer [@avni results: (3) failed to download: None (4) it keeps trying to redownload failed videos]

from calibre-web.

Comments (20)

deldesir commented on August 15, 2024

@avni what happened is that this specific video started to download but stopped due to unavailable fragments. The download process left residual fragments that are reused each time you try to download again. If you remove the video downloads directory and try again, the download will restart from the beginning but will get stuck at some point, leaving an incomplete file again. You won't see any error message because xklb does not report one at this point, hence the "None" retrieved from the database. This is an issue I was trying to address in #157, because each failed video must be accompanied by an error message explaining what happened.
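
To illustrate the "None" case described above, here is a minimal sketch of a more defensive error lookup. It is not the actual cps/tasks/download.py code: the `media` table and its `webpath` and `error` columns are taken from the traceback in this thread, while the `db_path` parameter and the fallback wording are assumptions made for illustration.

```python
import sqlite3
from contextlib import closing

# Hypothetical fallback text; the real wording is up to the maintainers.
FALLBACK_MESSAGE = "no error recorded (possibly unavailable fragments)"


def read_error_from_database(db_path, media_url):
    """Return a readable reason for a failed download instead of None."""
    try:
        with closing(sqlite3.connect(db_path)) as conn:
            row = conn.execute(
                "SELECT error FROM media WHERE webpath = ?", (media_url,)
            ).fetchone()
    except sqlite3.OperationalError as exc:
        # Covers e.g. "no such column: error" when the schema lacks the column.
        return f"could not read error from database: {exc}"
    # No matching row, or a row whose error column is NULL.
    if row is None or row[0] is None:
        return FALLBACK_MESSAGE
    return row[0]
```

With something like this, the Tasks view would show the fallback text rather than "failed to download: None" when xklb stored no error.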

deldesir commented on August 15, 2024

It's the latter. I'll resume work on it ASAP.

holta commented on August 15, 2024

@deldesir

Please see the Python errors between Line 1082 and Line 1184 here:

-rw-r--r-- 1 root root 30337 Jun 10 20:27 /var/log/calibre-web.log
                        ...ITS LAST 100 LINES FOLLOW...

  File "/usr/local/calibre-web-py3/cps/tasks/download.py", line 123, in run
    self.message = f"{self.media_url_link} failed to download: {self.read_error_from_database()}"
                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/calibre-web-py3/cps/tasks/download.py", line 139, in read_error_from_database
    error = conn.execute("SELECT error FROM media WHERE webpath = ?", (self.media_url,)).fetchone()[0]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: no such column: error
[2024-06-10 20:12:22,588] DEBUG {cps.services.worker:91} Add Task for user: Admin - Metadata fetch task for https://www.youtube.com/playlist?list=PLfiHW0cuXSvQ0s-oNi2syJNAPTN8gt4oi
[2024-06-10 20:12:22,589]  INFO {cps.tasks.metadata_extract:131} Starting to fetch metadata for URL: https://www.youtube.com/playlist?list=PLfiHW0cuXSvQ0s-oNi2syJNAPTN8gt4oi
[2024-06-10 20:12:25,447]  INFO {cps.editbooks:385} Received metadata request: ImmutableMultiDict([('current_user_name', 'Admin'), ('shelf_title', 'Physics Form 1')])
[2024-06-10 20:12:25,459]  INFO {cps.editbooks:374} Shelf Physics Form 1 created
[2024-06-10 20:13:09,080] ERROR {cps.tasks.metadata_extract:113} An error occurred during the calculation of views per day for https://www.youtube.com/watch?v=DMTQUGYoHHQ: 'NoneType' object cannot be interpreted as an integer
[2024-06-10 20:13:09,081] ERROR {cps.services.worker:202} 'views_per_day'
Traceback (most recent call last):
  File "/usr/local/calibre-web-py3/cps/services/worker.py", line 199, in start
    self.run(*args)
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 154, in run
    requested_urls = self._sort_and_limit_requested_urls(requested_urls)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 117, in _sort_and_limit_requested_urls
    return dict(sorted(requested_urls.items(), key=lambda item: item[1]["views_per_day"], reverse=True)[:min(MAX_VIDEOS_PER_DOWNLOAD, len(requested_urls))])
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 117, in <lambda>
    return dict(sorted(requested_urls.items(), key=lambda item: item[1]["views_per_day"], reverse=True)[:min(MAX_VIDEOS_PER_DOWNLOAD, len(requested_urls))])
                                                                ~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'views_per_day'
[2024-06-10 20:14:02,140] DEBUG {cps.services.worker:91} Add Task for user: Admin - Metadata fetch task for https://www.youtube.com/playlist?list=PLfiHW0cuXSvQSZHsA5zpo3JdrY5EJp1yT
[2024-06-10 20:14:03,096]  INFO {cps.tasks.metadata_extract:131} Starting to fetch metadata for URL: https://www.youtube.com/playlist?list=PLfiHW0cuXSvQSZHsA5zpo3JdrY5EJp1yT
[2024-06-10 20:14:05,944]  INFO {cps.editbooks:385} Received metadata request: ImmutableMultiDict([('current_user_name', 'Admin'), ('shelf_title', 'Maths Form 1')])
[2024-06-10 20:14:05,954]  INFO {cps.editbooks:374} Shelf Maths Form 1 created
[2024-06-10 20:15:56,672] ERROR {cps.tasks.metadata_extract:113} An error occurred during the calculation of views per day for https://www.youtube.com/watch?v=DMTQUGYoHHQ: 'NoneType' object cannot be interpreted as an integer
[2024-06-10 20:15:56,672] ERROR {cps.tasks.metadata_extract:113} An error occurred during the calculation of views per day for https://www.youtube.com/watch?v=IeJjQwdbUnw: 'NoneType' object cannot be interpreted as an integer
[2024-06-10 20:15:56,673] ERROR {cps.services.worker:202} 'views_per_day'
Traceback (most recent call last):
  File "/usr/local/calibre-web-py3/cps/services/worker.py", line 199, in start
    self.run(*args)
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 154, in run
    requested_urls = self._sort_and_limit_requested_urls(requested_urls)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 117, in _sort_and_limit_requested_urls
    return dict(sorted(requested_urls.items(), key=lambda item: item[1]["views_per_day"], reverse=True)[:min(MAX_VIDEOS_PER_DOWNLOAD, len(requested_urls))])
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 117, in <lambda>
    return dict(sorted(requested_urls.items(), key=lambda item: item[1]["views_per_day"], reverse=True)[:min(MAX_VIDEOS_PER_DOWNLOAD, len(requested_urls))])
                                                                ~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'views_per_day'
[2024-06-10 20:20:56,562] DEBUG {cps.services.worker:91} Add Task for user: Admin - Metadata fetch task for https://www.youtube.com/watch?v=iaQV0tehds4
[2024-06-10 20:20:56,740]  INFO {cps.tasks.metadata_extract:131} Starting to fetch metadata for URL: https://www.youtube.com/watch?v=iaQV0tehds4
[2024-06-10 20:21:01,220] DEBUG {cps.services.worker:91} Add Task for user: Admin - Download task for https://www.youtube.com/watch?v=iaQV0tehds4
[2024-06-10 20:21:01,232]  INFO {cps.tasks.download:41} Subprocess args: ['lb-wrapper', 'dl', 'https://www.youtube.com/watch?v=iaQV0tehds4']
[2024-06-10 20:21:12,539]  INFO {cps.editbooks:385} Received metadata request: ImmutableMultiDict([('requested_file', '/library/downloads/calibre-web/Youtube/Foundations For Farming Zimbabwe/Want to learn how to increase your crop yield to 14 tonnes per HaM-oM-<M-^__105.00_[iaQV0tehds4].mp4'), ('current_user_name', 'Admin')])
[2024-06-10 20:21:12,540]  INFO {cps.editbooks:387} Requested file: /library/downloads/calibre-web/Youtube/Foundations For Farming Zimbabwe/Want to learn how to increase your crop yield to 14 tonnes per HaM-oM-<M-^__105.00_[iaQV0tehds4].mp4
[2024-06-10 20:21:12,540]  INFO {cps.editbooks:392} Processing file: <_io.BufferedReader name='/library/downloads/calibre-web/Youtube/Foundations For Farming Zimbabwe/Want to learn how to increase your crop yield to 14 tonnes per HaM-oM-<M-^__105.00_[iaQV0tehds4].mp4'>
[2024-06-10 20:21:12,540] DEBUG {cps.uploader:374} Temporary file: /tmp/calibre_web/c141b4032bcb0d6641fee3a9bf95596e
[2024-06-10 20:21:12,565]  WARN {py.warnings:110} /usr/local/calibre-web-py3/cps/editbooks.py:1535: SAWarning: Object of type <Books> not in session, add operation along 'Authors.books' won't proceed (This warning originated from the Session 'autoflush' process, which was invoked automatically in response to a user-initiated operation.)
  db_element = db_session.query(db_object).filter((func.lower(db_filter).ilike(add_element))).first()

[2024-06-10 20:21:12,585] DEBUG {cps.helper:548} Moving title: /tmp/calibre_web/c141b4032bcb0d6641fee3a9bf95596e to /library/calibre-web/Foundations For Farming Zimbabwe/Want to learn how to increase your crop yield to 14 tonnes per Ha_ (6)/Want to learn how to increase your crop yield to 14 tonnes per Ha_ - Foundations For Farming Zimbabwe
[2024-06-10 20:21:12,612]  INFO {cps.tasks.download:106} Successfully sent the requested file to http://192.168.0.212/books/meta
[2024-06-10 20:21:12,625]  INFO {cps.tasks.download:129} Download task for https://www.youtube.com/watch?v=iaQV0tehds4 completed successfully
[2024-06-10 20:21:46,783] DEBUG {cps.services.worker:91} Add Task for user: Admin - Metadata fetch task for https://www.youtube.com/playlist?list=PLfiHW0cuXSvQ0s-oNi2syJNAPTN8gt4oi
[2024-06-10 20:21:46,784]  INFO {cps.tasks.metadata_extract:131} Starting to fetch metadata for URL: https://www.youtube.com/playlist?list=PLfiHW0cuXSvQ0s-oNi2syJNAPTN8gt4oi
[2024-06-10 20:21:49,236]  INFO {cps.editbooks:385} Received metadata request: ImmutableMultiDict([('current_user_name', 'Admin'), ('shelf_title', 'Physics Form 1')])
[2024-06-10 20:21:49,250]  INFO {cps.editbooks:374} Shelf Physics Form 1 (2) created
[2024-06-10 20:23:41,093] ERROR {cps.tasks.metadata_extract:113} An error occurred during the calculation of views per day for https://www.youtube.com/watch?v=DMTQUGYoHHQ: 'NoneType' object cannot be interpreted as an integer
[2024-06-10 20:23:41,094] ERROR {cps.tasks.metadata_extract:113} An error occurred during the calculation of views per day for https://www.youtube.com/watch?v=IeJjQwdbUnw: 'NoneType' object cannot be interpreted as an integer
[2024-06-10 20:23:41,094] ERROR {cps.services.worker:202} 'views_per_day'
Traceback (most recent call last):
  File "/usr/local/calibre-web-py3/cps/services/worker.py", line 199, in start
    self.run(*args)
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 154, in run
    requested_urls = self._sort_and_limit_requested_urls(requested_urls)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 117, in _sort_and_limit_requested_urls
    return dict(sorted(requested_urls.items(), key=lambda item: item[1]["views_per_day"], reverse=True)[:min(MAX_VIDEOS_PER_DOWNLOAD, len(requested_urls))])
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 117, in <lambda>
    return dict(sorted(requested_urls.items(), key=lambda item: item[1]["views_per_day"], reverse=True)[:min(MAX_VIDEOS_PER_DOWNLOAD, len(requested_urls))])
                                                                ~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'views_per_day'
[2024-06-10 20:25:16,376] DEBUG {cps.services.worker:91} Add Task for user: Admin - Metadata fetch task for https://www.youtube.com/playlist?list=PLbeRJIjdJ7QEuVxTNfQsPBpUv4UcQdAzJ
[2024-06-10 20:25:17,114]  INFO {cps.tasks.metadata_extract:131} Starting to fetch metadata for URL: https://www.youtube.com/playlist?list=PLbeRJIjdJ7QEuVxTNfQsPBpUv4UcQdAzJ
[2024-06-10 20:25:19,790]  INFO {cps.editbooks:385} Received metadata request: ImmutableMultiDict([('current_user_name', 'Admin'), ('shelf_title', 'Word from Brian')])
[2024-06-10 20:25:19,800]  INFO {cps.editbooks:374} Shelf Word from Brian created
[2024-06-10 20:27:23,994] ERROR {cps.tasks.metadata_extract:113} An error occurred during the calculation of views per day for https://www.youtube.com/watch?v=DMTQUGYoHHQ: 'NoneType' object cannot be interpreted as an integer
[2024-06-10 20:27:23,994] ERROR {cps.tasks.metadata_extract:113} An error occurred during the calculation of views per day for https://www.youtube.com/watch?v=IeJjQwdbUnw: 'NoneType' object cannot be interpreted as an integer
[2024-06-10 20:27:23,995] ERROR {cps.services.worker:202} 'views_per_day'
Traceback (most recent call last):
  File "/usr/local/calibre-web-py3/cps/services/worker.py", line 199, in start
    self.run(*args)
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 154, in run
    requested_urls = self._sort_and_limit_requested_urls(requested_urls)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 117, in _sort_and_limit_requested_urls
    return dict(sorted(requested_urls.items(), key=lambda item: item[1]["views_per_day"], reverse=True)[:min(MAX_VIDEOS_PER_DOWNLOAD, len(requested_urls))])
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/calibre-web-py3/cps/tasks/metadata_extract.py", line 117, in <lambda>
    return dict(sorted(requested_urls.items(), key=lambda item: item[1]["views_per_day"], reverse=True)[:min(MAX_VIDEOS_PER_DOWNLOAD, len(requested_urls))])
                                                                ~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'views_per_day'
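
The log above suggests that when the views-per-day calculation fails for a video, its `views_per_day` key is never set, and the later sort then raises `KeyError`. Below is a minimal sketch of one way to make both steps tolerant of missing metadata. It is not the project's actual fix: the function signatures, the 0.0 default, and the value of `MAX_VIDEOS_PER_DOWNLOAD` are assumptions for illustration.

```python
from datetime import datetime

MAX_VIDEOS_PER_DOWNLOAD = 10  # hypothetical value; the real constant lives in calibre-web


def views_per_day(view_count, upload_date, now=None):
    """Return average views per day, or 0.0 when metadata is missing."""
    if view_count is None or upload_date is None:
        # Unavailable or restricted videos may come back without these fields.
        return 0.0
    now = now or datetime.now()
    days = max((now - upload_date).days, 1)  # avoid dividing by zero
    return view_count / days


def sort_and_limit_requested_urls(requested_urls, limit=MAX_VIDEOS_PER_DOWNLOAD):
    """Keep the `limit` most-viewed entries; entries without a score sort last."""
    ranked = sorted(
        requested_urls.items(),
        key=lambda item: item[1].get("views_per_day") or 0.0,
        reverse=True,
    )
    return dict(ranked[:limit])
```

Using `.get(...) or 0.0` means one video with missing or `None` data no longer aborts the whole playlist task.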

EMG70 commented on August 15, 2024

Now that both these PRs are merged... 💯

* PR [Handle restricted/unavailable videos [and list them in "Tasks" view, when downloading (stale!) playlists] #179](https://github.com/iiab/calibre-web/pull/179)

* PR [Build list of URLs simpler [refactor PR #179 for readability / maintainability] #180](https://github.com/iiab/calibre-web/pull/180)

...where do we stand, helping @EMG70 move forward here?⚡

New VM installed after merging of PRs #179 and #180

SUDO IIAB-DIAGNOSTICS - https://dpaste.com/4U8T9242B

I have attempted to download exactly the same videos that failed on 10/06/24. There is a huge improvement on most of them. Please see the screenshots.
Screenshot 2024-06-11 at 22-57-35 Internet in a Box Tasks
One of the videos gave an unfamiliar, long error message, which seems to be part of the Calibre-Web log.
Screenshot from 2024-06-11 23-04-46

avni commented on August 15, 2024

Testing June 11, 2024. YouTube Playlist Part 1

iiab-diagnostics: https://dpaste.com/4RDY4S535

Image

avni commented on August 15, 2024

Testing June 11, 2024. YouTube Playlist Part 2

[Updated]

iiab-diagnostics: https://dpaste.com/4RDY4S535

  • Tested the Physics Form 1 Playlist of 29 videos. Downloads were attempted for only 9 of the 28 available videos. There is one unavailable video, but that matches what the YouTube website shows.
  • UPDATE: The failed video is from the original playlist I tried on this IIAB instance: https://www.youtube.com/watch?v=7qiwh4ybglo. This makes me think the system is retrying failed videos periodically.

Image

avni commented on August 15, 2024

Testing June 11, 2024. YouTube Playlist Part 3

[Updated]

iiab-diagnostics: https://dpaste.com/GRX3DRTYX

  • Tested the Maths Form 1 Playlist of 22 videos. Downloads were attempted for only 18 of the 21 available videos, and all 18 succeeded. There are two unavailable videos listed, which does not match the YouTube website, where just 1 shows as unavailable. The playlist itself has one duplicate video (Videos #18 and #19 in the playlist). Calibre-Web, I think, is smart enough to ignore duplicates.

  • There is 1 failed video that shows up in the queue. The failed video is from the original playlist I tried on this IIAB instance: https://www.youtube.com/watch?v=7qiwh4ybglo. This makes me think the system is retrying failed videos periodically.

Image

avni commented on August 15, 2024

Testing June 11, 2024. YouTube Playlist Part 4

iiab-diagnostics: https://dpaste.com/CW6GJTBBZ

  • Tested the Word from Brian Playlist of 3 videos.
  • The 3 videos in the playlist were downloaded successfully.
  • There is 1 failure listed, but that is the failed video from the original playlist I tried on this IIAB instance: https://www.youtube.com/watch?v=7qiwh4ybglo. This makes me think the system is retrying failed videos periodically.

Image

EMG70 commented on August 15, 2024

Testing June 11, 2024. YouTube Playlist Part 4

iiab-diagnostics: https://dpaste.com/CW6GJTBBZ

  • Tested the Word from Brian Playlist of 3 videos.
  • The 3 videos in the playlist were downloaded successfully.
  • There is 1 failure listed, but that is the failed video from the original playlist I tried on this IIAB instance: https://www.youtube.com/watch?v=7qiwh4ybglo. This makes me think the system is retrying failed videos periodically.

Image

I agree with this observation: the system seems to retry a previously failed video.

deldesir commented on August 15, 2024

makes me think the system is retrying failed videos periodically

⬆️

@deldesir do you agree?

xklb does retry failed downloads, but Calibre-Web does not. We don't have this implemented yet but will eventually.

holta commented on August 15, 2024

@deldesir

Are these 2 other issues related — also with the 'NoneType' object cannot be interpreted as an integer error?

holta commented on August 15, 2024

Are these recent cps/tasks/download.py PRs related?

Is this cps/tasks/metadata_extract.py PR related?

holta commented on August 15, 2024

Now that both these PRs are merged... 💯

...where do we stand, helping @EMG70 move forward here?⚡

holta commented on August 15, 2024

@deldesir

Why do 3 videos show failed to download: None ?

Can you investigate, and help improve this error message?

(In @EMG70's big screenshot, just above.)

holta commented on August 15, 2024

makes me think the system is retrying failed videos periodically

⬆️

@deldesir do you agree?

deldesir commented on August 15, 2024

@deldesir

Why do 3 videos show failed to download: None ?

Can you investigate, and help improve this error message?

(In @EMG70's big screenshot, just above.)

I am pretty sure the failed to download: None error refers to the "unavailable fragments" error. In my test, I could see the video took longer than expected but eventually downloaded.

holta commented on August 15, 2024

I am pretty sure the failed to download: None error refers to the "unavailable fragments" error. In my test, I could see the video took longer than expected but eventually downloaded.

🧩

Possible background: (for others!)

avni commented on August 15, 2024

xklb does retry failed downloads, but Calibre-Web does not. We don't have this implemented yet but will eventually.

If you look at the screenshots I posted, the same YouTube file fails multiple times, implying that failed files are being retried automatically. Do you know why that is?

avni commented on August 15, 2024

@deldesir fascinating. I don't know the inner workings of the system but am curious to understand more. For example, is there a downloads file for each video you download, or just one for all downloads? If the latter, I wonder if you can just remove those fragments from the downloads directory for any files that fail. Very cool that you are diving deep into this in #157. Thank you for the thorough explanation. 🙏
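
As a rough illustration of the cleanup idea above (removing leftover fragments for a video that failed), here is a minimal sketch. It is not how calibre-web or xklb handle this today. It assumes that partial files carry the video ID in square brackets, as the completed file in the log does (for example `[iaQV0tehds4]`), and that leftovers can be recognized by `.part` in the name or a `.ytdl` suffix; both are assumptions based on typical yt-dlp naming, and `remove_partial_downloads` is a hypothetical helper.

```python
from pathlib import Path


def remove_partial_downloads(download_dir, video_id):
    """Delete leftover partial files for one video; return the removed file names."""
    tag = f"[{video_id}]"
    removed = []
    # Materialize the listing first so deleting files does not disturb iteration.
    for path in sorted(Path(download_dir).rglob("*")):
        name = path.name
        # Assumed naming: "<title>[<id>]....part", "...part-FragN", or "....ytdl".
        is_partial = ".part" in name or name.endswith(".ytdl")
        if path.is_file() and tag in name and is_partial:
            path.unlink()
            removed.append(name)
    return removed
```

Completed files (no `.part` or `.ytdl` marker) and files belonging to other videos are left untouched, so this could in principle run right after a download is marked as failed.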

holta commented on August 15, 2024

It's the latter. I'll resume work on it ASAP.

Awesome.

And advance apologies to everyone: bug fixes are urgent, yes, but they take time to be architected properly ✊
