Hey
So with 4.3.2 these properties are available on all video objects: watched, is_free_premium and preview (and some more but they are not related to the issue).
Normally, any property comes from one of these sources:
- The Pornhub HubTraffic API, used by default whenever we can because it is lighter and contains enough data for basic needs
- The Pornhub JSON video data that can be found on the page (using the parser.resolve function), which contains data intended for the video player (video segments, hotspots, thumbnails, etc.)
- The page itself, see video.is_favorite
However, the properties I mentioned above are either unavailable or too complicated to get from any of these sources. Instead, they only come from a query page.
For instance, the watched property comes from the "watched" tag you can see on a video thumbnail before clicking on it. Obviously, we can't get this information anywhere other than from a query (unless we decide to iterate over the entire account history).
If the video comes from a query
This is fine: the properties found on the query page are already stored and can be parsed on demand.
If the video does not come from a query (e.g. from Client.get)
PHUB will "simulate" a query. This is what the video._as_query property does (unless overridden by a query). It creates a temporary playlist on the client account, adds the video to it and creates a new Query object from the playlist page.
Apart from the fact that this method is really not optimized (it costs 4 requests just to find out whether one video has been watched), it also does not work: it uses the video page token, which means it has to fetch the video page, which in turn marks the video as already watched. The whole playlist procedure is therefore useless.
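To make the problem concrete, here is a toy model of the procedure. It does not use PHUB or the real site: FakeSite and all of its methods are hypothetical, and the split into four requests (video page, playlist creation, playlist insertion, playlist page) is my assumption of how the 4 requests add up.

```python
class FakeSite:
    """Toy stand-in for the remote site. Nothing here is real PHUB or site API."""

    def __init__(self):
        self.history = set()  # keys of videos the account has watched
        self.requests = 0     # number of HTTP-like calls made so far

    def fetch_video_page(self, key):
        # Loading a video page is what marks the video as watched.
        self.requests += 1
        self.history.add(key)
        return {"token": f"token-{key}"}

    def create_playlist(self, token):
        self.requests += 1
        return []

    def add_to_playlist(self, playlist, key, token):
        self.requests += 1
        playlist.append(key)

    def fetch_playlist_page(self, playlist):
        # The playlist page shows a "watched" tag for videos in the history.
        self.requests += 1
        return {key: key in self.history for key in playlist}


def watched_via_simulated_query(site, key):
    """Mimic the simulated query: token first, then the temporary playlist."""
    token = site.fetch_video_page(key)["token"]  # side effect: now watched
    playlist = site.create_playlist(token)
    site.add_to_playlist(playlist, key, token)
    return site.fetch_playlist_page(playlist)[key]


site = FakeSite()
result = watched_via_simulated_query(site, "never_seen_before")
print(result, site.requests)  # True 4
```

Even for a video the account has never opened, the answer is True, because getting the token is itself what marks the video as watched.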
This bug also affects other query-based properties, even though for some of them we can find a workaround (e.g. for the preview property, there might be a way to reconstruct the source URL from the video key/id and other information).
It might be possible to use a different token (like the query token), but this would require each video to keep a reference to its query.
What to do
You decide where these query properties should be implemented. Some of them (like watched and is_free_premium) are most likely to be used while iterating a query.
For example, as of now you can do:
for video in query.sample(filter=lambda vid: not vid.watched and vid.is_free_premium): ...
But another possible implementation could be:
for video in query.sample(watched=False, free_premium=True, ...): ...
This second implementation might look less user-friendly because it is less OOP in style, but it would make sure that these properties are safe to use. It would also make it possible to skip wrapping each video in a Video object, as an optimization, since the query data we want is read directly from a regex (consts.re.eval_video).
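A rough sketch of what the keyword-based filtering could look like, working on raw regex matches without building any Video object. Everything here is hypothetical: the markup in PAGE, the ITEM_RE pattern and this standalone sample function are made up for illustration, and the real query page and consts.re.eval_video will differ.

```python
import re
from typing import Iterator, Optional

# Hypothetical pattern and markup, for illustration only.
ITEM_RE = re.compile(
    r'<li data-key="(?P<key>\w+)"'
    r'(?P<watched> data-watched)?'
    r'(?P<free_premium> data-free-premium)?>'
)

PAGE = (
    '<li data-key="aaa" data-watched>'
    '<li data-key="bbb" data-free-premium>'
    '<li data-key="ccc" data-watched data-free-premium>'
    '<li data-key="ddd">'
)


def sample(page: str,
           watched: Optional[bool] = None,
           free_premium: Optional[bool] = None) -> Iterator[str]:
    """Yield the keys of items matching the filters. None means "don't care"."""
    wanted = {"watched": watched, "free_premium": free_premium}
    for match in ITEM_RE.finditer(page):
        # Flags are read straight from the regex match, no Video needed.
        flags = {name: match.group(name) is not None for name in wanted}
        if all(value is None or flags[name] == value
               for name, value in wanted.items()):
            # Only at this point would a full Video object have to be built.
            yield match.group("key")


print(list(sample(PAGE, watched=False, free_premium=True)))  # ['bbb']
```

With this shape, the query-only flags never have to exist as Video properties at all, so there is nothing to simulate for videos that did not come from a query.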
Have fun