Coder Social home page Coder Social logo

Comments (5)

inteoryx avatar inteoryx commented on July 28, 2024 1

Thank you very much for all this. Excellent bug report and, even better, it came with a fix. I have pulled in your fix and updated the test cases file. Seems to all be working now. I really appreciate your contribution.

from twitter-video-dl.

Perkinss27 avatar Perkinss27 commented on July 28, 2024

I'm working on a fix, I'll share it here later this day or tomorrow.

from twitter-video-dl.

Perkinss27 avatar Perkinss27 commented on July 28, 2024

Fix commited to my fork.
Code might be messy, and we might have to do further checking before assuming it works correctly.
It's an early fix that worked for the case I explained earlier.
For more details about what's modified and what's added please read my comment.

from twitter-video-dl.

inteoryx avatar inteoryx commented on July 28, 2024

Awesome, thanks very much for finding this issue and submitting a fix. There is an issue with your fix which is that it can download lower resolution than the max available, for example try your code with https://twitter.com/Queen__Deshda/status/1452467693517426697 - it downloads a lower resolution version of the video than the mainline. Ideally I would like to keep the max resolution behavior.

When I have a chance I will try to update your code to keep the max resolution version and merge it. Or, if you want to take that on, that would be great. From the tests directory you can run "python TestVideoDownloader.py" and it will catch errors like this.

I really do appreciate your PR and will merge it soon once this issue is ironed out - thanks again.

from twitter-video-dl.

Perkinss27 avatar Perkinss27 commented on July 28, 2024

Hi, thank you for your feedback. I took time to find what causes the bug you describe here and it looks like it does not come from my update. The bug occurs because sometime ext_tw_pattern and amplitude_pattern do not find the highest resolution video link.
I looked into the json data received from requesting https://twitter.com/Queen__Deshda/status/1452467693517426697 and manually found those 3 video links (+ 1 container) :

{
    "bitrate": 256000,
    "content_type": "video/mp4",
    "url": "https://video.twimg.com/ext_tw_video/1452467633727627271/pu/vid/480x270/mJYI0tdF0rk4Boxq.mp4?tag=12"
},
{
    "bitrate": 2176000,
    "content_type": "video/mp4",
    "url": "https://video.twimg.com/ext_tw_video/1452467633727627271/pu/vid/1280x720/-Rk-UWbHoRufnCUr.mp4?tag=12"
},
{
    "bitrate": 832000,
    "content_type": "video/mp4",
    "url": "https://video.twimg.com/ext_tw_video/1452467633727627271/pu/vid/640x360/JSZJP1JdQJDhuEdr.mp4?tag=12"
}

In this case only https://video.twimg.com/ext_tw_video/1452467633727627271/pu/vid/480x270/mJYI0tdF0rk4Boxq.mp4?tag=12 and https://video.twimg.com/ext_tw_video/1452467633727627271/pu/vid/640x360/JSZJP1JdQJDhuEdr.mp4?tag=12 are matched by your regular expression, so I started comparing links and spotted a difference.
In fact, the highest resolution link contains '-' in its field just after its resolution (1280x720) : -Rk-UWbHoRufnCUr. Comparing these fields for the 3 video links we have :

mJYI0tdF0rk4Boxq
-Rk-UWbHoRufnCUr
JSZJP1JdQJDhuEdr

Next step was to look at how this field is handled by your regular expression :

ext_tw_pattern = re.compile(r'(https://video.twimg.com/ext_tw_video/(\d+)/pu/vid/(\d+x\d+)/\w+.mp4\?tag=\d+)')

By quickly looking at it, ext_tw_pattern handles the previously described field with the \w+ statement.
According to google : \w (lowercase w) matches a "word" character: a letter or digit or underbar [a-zA-Z0-9_].
So the bug occurs because \w does not match a "word" with a '-'.
I then updated the regular expression to :

ext_tw_pattern = re.compile(r'(https://video.twimg.com/ext_tw_video/(\d+)/pu/vid/(\d+x\d+)/[^.]+.mp4\?tag=\d+)')

replacing \w by [^.] which means anything except a '.' .
And ran a quick test comparing deprecated regular expression and the updated one on the 3 above links.
scr42552 (2)
Bingo, highest resolution matched ! The new pattern should fix that bug.

So I changed both ext_tw_pattern and amplitude_pattern because they were using \w.
You can see that fix on my fork's last commit.

from twitter-video-dl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.