Comments (2)
Captions can be retrieved from the following route:
https://app.pluralsight.com/transcript/api/v1/caption/webvtt/<clipId>/<versionId>/<language>/
clipId: Already used within the script, so nothing new is needed.
versionId: The versionId can be retrieved from the /viewClip endpoint, but the script will need some refactoring. Instead of getVideoUrl we will need getVideoData or something similar, as we absolutely cannot hit that endpoint more than necessary. The response contains version in the root of the object (second-to-last property).
language: The language comes from a set of hard-coded values in the embedded player itself. The values are:
[{"name":"Afrikaans","code":"af"},{"name":"Albanian","code":"sq"},{"name":"Amharic","code":"am"},{"name":"Arabic","code":"ar"},{"name":"Armenian","code":"hy"},{"name":"Azeerbaijani","code":"az"},{"name":"Basque","code":"eu"},{"name":"Belarusian","code":"be"},{"name":"Bengali","code":"bn"},{"name":"Bosnian","code":"bs"},{"name":"Bulgarian","code":"bg"},{"name":"Catalan","code":"ca"},{"name":"Cebuano","code":"ceb"},{"name":"Chinese (Simplified)","code":"zh-CN"},{"name":"Chinese (Traditional)","code":"zh-TW"},{"name":"Corsican","code":"co"},{"name":"Croatian","code":"hr"},{"name":"Czech","code":"cs"},{"name":"Danish","code":"da"},{"name":"Dutch","code":"nl"},{"name":"English","code":"en"},{"name":"Esperanto","code":"eo"},{"name":"Estonian","code":"et"},{"name":"Finnish","code":"fi"},{"name":"French","code":"fr"},{"name":"Frisian","code":"fy"},{"name":"Galician","code":"gl"},{"name":"Georgian","code":"ka"},{"name":"German","code":"de"},{"name":"Greek","code":"el"},{"name":"Gujarati","code":"gu"},{"name":"Haitian Creole","code":"ht"},{"name":"Hausa","code":"ha"},{"name":"Hawaiian","code":"haw"},{"name":"Hebrew","code":"iw"},{"name":"Hindi","code":"hi"},{"name":"Hmong","code":"hmn"},{"name":"Hungarian","code":"hu"},{"name":"Icelandic","code":"is"},{"name":"Igbo","code":"ig"},{"name":"Indonesian","code":"id"},{"name":"Irish","code":"ga"},{"name":"Italian","code":"it"},{"name":"Japanese","code":"ja"},{"name":"Javanese","code":"jw"},{"name":"Kannada","code":"kn"},{"name":"Kazakh","code":"kk"},{"name":"Khmer","code":"km"},{"name":"Korean","code":"ko"},{"name":"Kurdish","code":"ku"},{"name":"Kyrgyz","code":"ky"},{"name":"Lao","code":"lo"},{"name":"Latin","code":"la"},{"name":"Latvian","code":"lv"},{"name":"Lithuanian","code":"lt"},{"name":"Luxembourgish","code":"lb"},{"name":"Macedonian","code":"mk"},{"name":"Malagasy","code":"mg"},{"name":"Malay","code":"ms"},{"name":"Malayalam","code":"ml"},{"name":"Maltese","code":"mt"},{"name":"Maori","code":"mi"},{"name":"Marathi","code":"mr"},{"name":"Mongolian","code":"mn"},{"name":"Myanmar (Burmese)","code":"my"},{"name":"Nepali","code":"ne"},{"name":"Norwegian","code":"no"},{"name":"Nyanja (Chichewa)","code":"ny"},{"name":"Pashto","code":"ps"},{"name":"Persian","code":"fa"},{"name":"Polish","code":"pl"},{"name":"Portuguese (Portugal, Brazil)","code":"pt"},{"name":"Punjabi","code":"pa"},{"name":"Romanian","code":"ro"},{"name":"Russian","code":"ru"},{"name":"Samoan","code":"sm"},{"name":"Scots Gaelic","code":"gd"},{"name":"Serbian","code":"sr"},{"name":"Sesotho","code":"st"},{"name":"Shona","code":"sn"},{"name":"Sindhi","code":"sd"},{"name":"Sinhala (Sinhalese)","code":"si"},{"name":"Slovak","code":"sk"},{"name":"Slovenian","code":"sl"},{"name":"Somali","code":"so"},{"name":"Spanish","code":"es"},{"name":"Sundanese","code":"su"},{"name":"Swahili","code":"sw"},{"name":"Swedish","code":"sv"},{"name":"Tagalog (Filipino)","code":"tl"},{"name":"Tajik","code":"tg"},{"name":"Tamil","code":"ta"},{"name":"Telugu","code":"te"},{"name":"Thai","code":"th"},{"name":"Turkish","code":"tr"},{"name":"Ukrainian","code":"uk"},{"name":"Urdu","code":"ur"},{"name":"Uzbek","code":"uz"},{"name":"Vietnamese","code":"vi"},{"name":"Welsh","code":"cy"},{"name":"Xhosa","code":"xh"},{"name":"Yiddish","code":"yi"},{"name":"Yoruba","code":"yo"},{"name":"Zulu","code":"zu"},{"name":"Chinese (Simplified)","code":"zh"},{"name":"Hebrew","code":"he"}]
Example when everything is put together: https://app.pluralsight.com/transcript/api/v1/caption/webvtt/b8648aa4-41aa-432f-a37e-b51b8ae361fc/345312ba-591d-4216-8846-477fb7d51459/en
The format is not SRT, but WebVTT.
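To make the route concrete, here is a minimal sketch of a helper that assembles the caption URL from its three parts. The names `CAPTION_BASE` and `buildCaptionUrl` are hypothetical and not part of the script; the default of `en` is also just an assumption. The result has no trailing slash, matching the working example above rather than the route template.

```javascript
// Base of the caption route quoted above.
const CAPTION_BASE = 'https://app.pluralsight.com/transcript/api/v1/caption/webvtt';

// Hypothetical helper: joins clipId, versionId and language into a caption URL.
function buildCaptionUrl(clipId, versionId, language = 'en') {
  if (!clipId || !versionId) {
    throw new Error('Both clipId and versionId are required');
  }
  // Encode each segment so unexpected characters cannot break the path.
  const segments = [clipId, versionId, language].map(encodeURIComponent);
  return [CAPTION_BASE, ...segments].join('/');
}
```

Calling it with the clipId and versionId from the example above and `'en'` should reproduce that example URL.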
Summary of tasks:
- Refactor getVideoUrl -> getVideoData and expose the URLs for both the video and the subtitles.
- Download subtitles alongside the videos.
- Add an option to choose the language (could be hard-coded).
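The refactoring in the first task could look roughly like the sketch below. It is only an illustration of the idea of hitting /viewClip once per clip: `fetchViewClip` (whatever currently performs the request) and `pickVideoUrl` (whatever currently extracts the video URL from the response) are hypothetical callbacks supplied by the caller, since the layout of the response beyond the root-level `version` property is not described here.

```javascript
// Cache of in-flight or completed /viewClip lookups, keyed by clipId,
// so that each clip triggers at most one request.
const videoDataCache = new Map();

// Hypothetical replacement for getVideoUrl.
// fetchViewClip(clipId) -> response object (or a promise of one)
// pickVideoUrl(response) -> video URL string
function getVideoData(clipId, fetchViewClip, pickVideoUrl) {
  if (!videoDataCache.has(clipId)) {
    const pending = Promise.resolve(fetchViewClip(clipId)).then((response) => ({
      videoUrl: pickVideoUrl(response),
      // "version" sits in the root of the /viewClip response.
      versionId: response.version,
    }));
    // Do not keep failed lookups around, so a later retry is possible.
    pending.catch(() => videoDataCache.delete(clipId));
    videoDataCache.set(clipId, pending);
  }
  return videoDataCache.get(clipId);
}
```

Both the video download and the subtitle download can then await the same `getVideoData(clipId, ...)` result without causing a second request.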
Bonus tasks:
- Allow passing the language selection as a CLI argument (no language specified, no subtitles downloaded)
- Optionally convert WebVTT to SRT
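Both bonus tasks are small enough to sketch. The code below is a rough illustration, not a tested part of the script: the `--language` flag name and the function names are assumptions, and the converter only handles plain cues (it drops the WEBVTT header, NOTE/STYLE blocks and cue settings, and leaves any inline tags in the text untouched).

```javascript
// Returns the language code passed as "--language xx" or "--language=xx",
// or null when the flag is absent (meaning: do not download subtitles).
function parseLanguageArg(argv) {
  for (let i = 0; i < argv.length; i += 1) {
    if (argv[i] === '--language') return argv[i + 1] || null;
    if (argv[i].startsWith('--language=')) {
      return argv[i].slice('--language='.length) || null;
    }
  }
  return null;
}

// WebVTT allows "mm:ss.ttt"; SRT needs "hh:mm:ss,ttt".
function toSrtTimestamp(timestamp) {
  const full = timestamp.split(':').length === 2 ? `00:${timestamp}` : timestamp;
  return full.replace('.', ',');
}

// Converts WebVTT text to SRT text, renumbering cues from 1.
function vttToSrt(vtt) {
  const blocks = vtt.replace(/\r\n?/g, '\n').split(/\n{2,}/);
  const cues = [];
  for (const block of blocks) {
    const lines = block.split('\n');
    const timingIndex = lines.findIndex((line) => line.includes('-->'));
    if (timingIndex === -1) continue; // header, NOTE or STYLE block
    const [start, rest] = lines[timingIndex].split('-->').map((part) => part.trim());
    const end = rest.split(/\s+/)[0]; // drop cue settings such as "align:start"
    const text = lines.slice(timingIndex + 1).join('\n').trim();
    cues.push(`${cues.length + 1}\n${toSrtTimestamp(start)} --> ${toSrtTimestamp(end)}\n${text}`);
  }
  return `${cues.join('\n\n')}\n`;
}
```

In the script this could be wired up as `const language = parseLanguageArg(process.argv.slice(2));`, skipping the subtitle download entirely when the result is `null`.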
from pluralsight-scraper.
OK, I will try. Thank you for the very detailed task steps.