
ikeboy / pluralsight-scraper


Pluralsight video downloader

Home Page: https://www.knyz.org/blog/post/pluralsight-scraper-released/

License: GNU General Public License v2.0

JavaScript 100.00%
pluralsight offline pluralsight-scraper education ripper

pluralsight-scraper's Introduction

Pluralsight Scraper

What is this?

This is the second iteration of the Pluralsight scraper. It retrieves mp4 video files by scraping Pluralsight's website through its own API. This project does not endorse piracy and requires a valid Pluralsight login to function!

Sample Output

Why?

As far as I know, Pluralsight doesn't have a way to play videos offline on Linux, and I wanted to do exactly that, hence this project.

Pluralsight.com FAQ: Can I watch your videos on Linux? Available apps

How?

The script launches a Chromium instance controlled by Puppeteer, which lets you authenticate interactively with the real website. Once you are logged in, it saves your cookies to a separate file (cookies.txt) and uses them to authenticate the API requests required to download the video files.

Usage

  1. Clone the repo: git clone https://github.com/knyzorg/pluralsight-scraper

  2. Run npm install to install the dependencies

  3. Run npm run login to open a session

  4. Run npm run get -- "https://app.pluralsight.com/library/courses/rust-fundamentals/table-of-contents" to begin downloading the course

Isn't this against Pluralsight's Terms of Service?

Yes, it is. Refer to Section 5:

The applicable License granted you by these Terms of Use is a right of access through the Site only, and does not grant to you any right to download or store any Proprietary Materials in any medium[...]

Detection Evasion

There is a relatively high likelihood that your account will be flagged for running this script. Evading detection is very difficult, and the current strategy is to naively wait 30 seconds between requests.

pluralsight-scraper's People

Contributors

darkle, javajohnhub, jubayerarefin, maksfn, prostopasta, siriokun, vezaynk


pluralsight-scraper's Issues

Feature request: resume after error/interruption

Currently, if the script breaks, I have to restart downloading from the beginning of a course. I would love to be able to resume from where it broke.

Perhaps it could skip existing files, even if I have to manually confirm Y/N for overwriting existing files?
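A minimal sketch of the "skip existing files" idea, assuming each clip maps to a known target path before it is downloaded. The helper names `alreadyDownloaded` and `pendingFiles` are hypothetical and are not part of the scraper.

```javascript
const fs = require('fs');

// Returns true when a non-empty file already exists at filePath.
// Caveat: a partially written file from an interrupted run also passes this
// check, so writing to a temporary name and renaming on completion is safer.
function alreadyDownloaded(filePath) {
  try {
    return fs.statSync(filePath).size > 0;
  } catch (err) {
    if (err.code === 'ENOENT') return false; // file does not exist yet
    throw err; // permission or other I/O problems should surface
  }
}

// Hypothetical helper: keeps only the target paths that still need downloading.
function pendingFiles(filePaths) {
  return filePaths.filter((p) => !alreadyDownloaded(p));
}
```

With a check like this, a restarted run could move straight to the first missing clip instead of starting from the beginning of the course.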

Scraper is not finding correct url for courses

This is the output of a console.log I added on line 54 to inspect the module data. As you can see, the titles are correct, but almost every link points to table-of-contents.
Mac OS

Logging in...
{ courses:
[ { name: 'Course Overview',
url: 'https://app.pluralsight.com/player?course=angular-2-getting-started&author=deborah-kurata&name=angular-2-getting-started-m0&clip=0&mode=live' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Anatomy of an Angular 2 Application ',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Get the Most from This Course',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Sample Application',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Course Outline',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Selecting a Language',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Selecting an Editor',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Setting up Our Environment',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Setting up an Angular 2 Application',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Running an Angular 2 Application',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'About Modules',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Loading Modules and Hosting our Application',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Summary',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'What Is a Component?',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Creating the Component Class',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Defining the Metadata with a Decorator',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Importing What We Need',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Demo: Creating the App Component',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Bootstrapping the App Component',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Demo: Bootstrapping the App Component',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Checklists and Summary',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Building a Template',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Building the Component',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Using a Component as a Directive',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Binding with Interpolation',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Adding Logic with Directives: ngIf',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Adding Logic with Directives: ngFor',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Checklists and Summary',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Property Binding',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Handling Events with Event Binding',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Handling Input with Two-way Binding',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Transforming Data with Pipes',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Checklists and Summary',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Defining Interfaces',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Encapsulating Component Styles',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Using Lifecycle Hooks',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Building Custom Pipes',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Checklists and Summary',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Building a Nested Component',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Using a Nested Component',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Passing Data to a Nested Component Using @input',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Passing Data from a Component Using @output',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Checklists and Summary',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'How Does It Work?',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Building a Service',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Registering the Service',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Injecting the Service',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Checklists and Summary',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Observables and Reactive Extensions',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Setting Up',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Sending an Http Request',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Subscribing to an Observable',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Checklists and Summary',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Setting Up',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Configuring Routes',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Tying Routes to Actions',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Placing the Views',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Passing Parameters to a Route',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Activating a Route with Code',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Checklists and Summary',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'TypeScript Configuration File (tsconfig.json)',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'npm Package File (package.json) and TypeScript Definitions File (typings.json)',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'index.html File Libraries',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Summary',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Introduction',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Recapping Our Journey',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Learning More',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'What Is Angular? (Revisited)',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' },
{ name: 'Closing',
url: 'https://app.pluralsight.com/library/courses/angular-2-getting-started/table-of-contents' } ],
title: 'Angular 2: Getting Started | Pluralsight' }
Logged in!

Directory not found (path too long; Windows)

Titles can be very long, and Windows still limits paths to 260 characters (Windows 10 has an option to raise the limit, but you have to opt in by editing the registry).

I recommend the CWD to be <= 50 characters.

As for future modification/parameters:

  • Use the course URL for naming (bonus: it is human-friendly)
  • In extreme cases, export the files under their GUID plus one or more mapping files with the titles, so users can rename them as they want.
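As an alternative to the options above, the title could be shortened so that the full path stays under the limit. This is only a sketch: `fitPath` is a hypothetical helper, and the default budget of 259 characters assumes the 260-character limit includes a terminating null character.

```javascript
const path = require('path');

const WINDOWS_MAX_PATH = 260; // classic limit mentioned in the report above

// Shortens the title so that dir + separator + title + ext fits within maxLen.
// Returns null when the directory alone leaves no room for a file name.
function fitPath(dir, title, ext, maxLen = WINDOWS_MAX_PATH - 1) {
  const budget = maxLen - dir.length - 1 - ext.length; // 1 for the separator
  if (budget < 1) return null;
  const name = title.length > budget ? title.slice(0, budget).trimEnd() : title;
  return path.join(dir, name + ext);
}
```

Truncation can make two long titles collide, so keeping the clip's numeric prefix at the start of the title helps keep names unique.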

Script stops if the title contains a slash character

Downloading video file for: 15. Destructuring and Rest/Spread.mp4
events.js:174
      throw er; // Unhandled 'error' event
      ^

Error: ENOENT: no such file or directory, open '/Users/siriokun/Movies/pluralsight/videos/react-js-getting-started/15. Destructuring and Rest/Spread.mp4'
Emitted 'error' event at:
    at WriteStream.onerror (_stream_readable.js:713:12)
    at WriteStream.emit (events.js:189:13)
    at lazyFs.open (internal/fs/streams.js:272:12)
    at FSReqWrap.args [as oncomplete] (fs.js:140:20)
npm ERR! code ELIFECYCLE
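The ENOENT above occurs because the "/" in "Rest/Spread" is treated as a directory separator. One possible fix is to replace characters that are unsafe in file names before building the path. The sketch below uses a hypothetical `sanitizeTitle` helper; the character set covers the path separators plus the characters Windows reserves in file names.

```javascript
// "/" and "\" are path separators; the remaining characters are reserved in
// Windows file names.
const UNSAFE_CHARS = /[<>:"\/\\|?*]/g;

// Replaces every unsafe character with the given replacement string.
function sanitizeTitle(title, replacement = '-') {
  return title.replace(UNSAFE_CHARS, replacement);
}

console.log(sanitizeTitle('15. Destructuring and Rest/Spread'));
// prints: 15. Destructuring and Rest-Spread
```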

Imitating Pluralsight

I am interested in developing a supplementary program to the scraper to imitate pluralsight's server.

This will have the advantages of supporting unit tests and avoiding getting banned during development.

It will need to

  • Mimic login screen
  • Mimic API
  • Mimic cookie-based authentication
  • Test for various scenarios (such as suspension)
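A rough sketch of what such a test double could look like, using only Node's built-in http module. Every route, cookie name, and response body here is a placeholder invented for illustration; none of them reflect Pluralsight's actual API.

```javascript
const http = require('http');

// Placeholder session cookie for the test double.
const SESSION_COOKIE = 'session=test-token';

// Pure routing function, so scenarios can be unit-tested without a socket.
function route(method, url, cookieHeader = '') {
  if (method === 'POST' && url === '/login') {
    return { status: 200, headers: { 'Set-Cookie': SESSION_COOKIE }, body: { ok: true } };
  }
  if (url.startsWith('/api/')) {
    const cookies = cookieHeader.split(';').map((c) => c.trim());
    if (!cookies.includes(SESSION_COOKIE)) {
      return { status: 401, headers: {}, body: { error: 'not authenticated' } };
    }
    if (url === '/api/suspended') {
      // Simulates the suspension scenario.
      return { status: 403, headers: {}, body: { error: 'account suspended' } };
    }
    return { status: 200, headers: {}, body: { clips: [] } };
  }
  return { status: 404, headers: {}, body: { error: 'not found' } };
}

// Wraps the routing function in an HTTP server; the caller decides the port.
function createMockServer() {
  return http.createServer((req, res) => {
    const { status, headers, body } = route(req.method, req.url, req.headers.cookie);
    res.writeHead(status, { 'Content-Type': 'application/json', ...headers });
    res.end(JSON.stringify(body));
  });
}
```

Calling `createMockServer().listen(3000)` would start it locally; keeping `route` separate from the server lets unit tests exercise each scenario directly.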

Videos with resolution of 1024x768 cannot be downloaded

Observed behavior:
Currently the script is not able to download (older) videos that have a resolution of 1024x768. The error message is not helpful: "Something went wrong. Double check the URL and try logging in again."

Expected behavior:
Videos with a resolution of 1024x768 can be downloaded.

Analysis
The resolution of 1280x720 is hard-coded here:

https://github.com/knyzorg/pluralsight-scraper/blob/8388d58bfdb760d55f50e60ed852fbe063546ea4/index.js#L21

Changing this manually to 1024x768 works. The solution would be to try the higher resolution first, then fall back to the lower one. This is exactly what the web player does (I checked).
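The fallback described above could be sketched as follows. `fetchClip` is a hypothetical caller-supplied async function that rejects when a resolution is unavailable; it stands in for whatever the script does at the linked line and is not the project's actual code.

```javascript
// Tries each resolution in order and returns the first one that succeeds.
async function fetchWithFallback(fetchClip, resolutions = ['1280x720', '1024x768']) {
  let lastError;
  for (const resolution of resolutions) {
    try {
      return { resolution, data: await fetchClip(resolution) };
    } catch (err) {
      lastError = err; // remember why this resolution failed, then try the next
    }
  }
  const reason = lastError ? lastError.message : 'no resolutions given';
  throw new Error(`No resolution available (tried ${resolutions.join(', ')}): ${reason}`);
}
```

Including the list of attempted resolutions in the final error also addresses the unhelpful message mentioned in the observed behavior.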

Tab symbol error handling

Experienced 'Tab' symbol in title name, got the following error:
"""
Retrieving metadata for:

  1.      What We're Going to Learn
    

Downloading video file for:

  1.      What We're Going to Learn.mp4
    

events.js:292
throw er; // Unhandled 'error' event
^

Error: ENOENT: no such file or directory, open

F:\!Programming!\pluralsight-scraper\videos\incident-response-handling-performing\73.
What We're Going to Learn.mp4

Emitted 'error' event on WriteStream instance at:
at errorOrDestroy (internal/streams/destroy.js:108:12)
at WriteStream.onerror (_stream_readable.js:753:7)
at WriteStream.emit (events.js:315:20)
at internal/fs/streams.js:376:12
at FSReqCallback.oncomplete (fs.js:155:23) {
errno: -4058,
code: 'ENOENT',
syscall: 'open',
path: "F:\\!Programming!\\pluralsight-scraper\\videos\\incident-response-handling-performing\\73. \tWhat We're Going to Learn.mp4"
"""
Which character should I add to the exclusion list? Would adding [\t] work, or does the tab character need some other encoding?
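In a JavaScript regular expression, `\t` matches a tab directly, and `\s` matches any whitespace character including tabs and newlines. A small sketch of normalizing a title that way, where `normalizeWhitespace` is a hypothetical helper rather than the project's existing exclusion logic:

```javascript
// Collapses every run of whitespace (tabs, newlines, repeated spaces) into a
// single space and trims both ends.
function normalizeWhitespace(title) {
  return title.replace(/\s+/g, ' ').trim();
}

console.log(normalizeWhitespace("73. \tWhat We're Going to Learn"));
// prints: 73. What We're Going to Learn
```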

npm ERR! code ELIFECYCLE

Hi, I couldn't download linux-system-security-lpic-2; it gave me the error that I pasted below. I first tried downloading your example course, and that worked fine.

aryan@Ubuntu:~/pluralsight-scraper$ npm run get -- "https://app.pluralsight.com/library/courses/rust-fundamentals/table-of-contents"
 
> [email protected] get /home/aryan/pluralsight-scraper
> node index.js get "https://app.pluralsight.com/library/courses/rust-fundamentals/table-of-contents"

Downloading course: rust-fundamentals
Retrieving metadata for: 1. Course Overview
Downloading video file for: 1. Course Overview.mp4
Retrieving metadata for: 2. Module Overview
Downloading video file for: 2. Module Overview.mp4
Retrieving metadata for: 3. Installing Rust
Downloading video file for: 3. Installing Rust.mp4
^Caryan@Ubuntu:~/pluralsight-scraper$ npm run get -- "https://app.pluralsight.com/library/courses/linux-system-security-lpic-2/table-of-contents"

> [email protected] get /home/aryan/pluralsight-scraper
> node index.js get "https://app.pluralsight.com/library/courses/linux-system-security-lpic-2/table-of-contents"

Downloading course: linux-system-security-lpic-
Something went wrong. Double check the URL and try logging in again.
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] get: `node index.js get "https://app.pluralsight.com/library/courses/linux-system-security-lpic-2/table-of-contents"`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the [email protected] get script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /home/aryan/.npm/_logs/2020-04-10T09_33_13_237Z-debug.log
aryan@Ubuntu:~/pluralsight-scraper$ 

Thanks in advance!
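In the log above, the course name is printed as "linux-system-security-lpic-", without the trailing "2", which suggests the slug is being cut short when it is parsed from the URL. That is a guess from the output, not a confirmed diagnosis. A sketch of extracting the slug with the built-in URL class, using a hypothetical `courseSlug` helper:

```javascript
// Returns the path segment that follows "courses" in a course URL.
function courseSlug(courseUrl) {
  const parts = new URL(courseUrl).pathname.split('/').filter(Boolean);
  const i = parts.indexOf('courses');
  if (i === -1 || !parts[i + 1]) {
    throw new Error(`Not a course URL: ${courseUrl}`);
  }
  return decodeURIComponent(parts[i + 1]);
}

console.log(courseSlug('https://app.pluralsight.com/library/courses/linux-system-security-lpic-2/table-of-contents'));
// prints: linux-system-security-lpic-2
```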

Cannot read property 'focus' of null

>npm start

[email protected] start C:\proyectos\nodejs\pluralsight-scraper
node index.js

This program was written for EDUCATION and PERSONAL USE ONLY
Please be respectful of the original authors' intellectual property

This scraper is open source and licensed under GPLv2 on Github
https://github.com/knyzorg/pluralsight-scraper

Logging in...
(node:16788) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Cannot read property 'focus' of null

Improve error handling

Currently, every failure produces a generic-looking error message. More specific errors should be generated for:

  • Suspensions
  • 404s
  • Rate-limiting
  • Unknown errors
  • Resolution not available #18
    This should also be implemented for #15
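One way to approach this is a single error type with a `kind` field, created from the HTTP status of a failed request. The mapping of status codes to cases below is an assumption for illustration; which codes the service actually returns for suspensions or rate limiting would need to be verified.

```javascript
// Error type carrying a machine-readable kind alongside the message.
class ScraperError extends Error {
  constructor(kind, message) {
    super(message);
    this.name = 'ScraperError';
    this.kind = kind;
  }
}

// Maps an HTTP status code to a descriptive error (assumed mapping).
function errorForStatus(status, url) {
  switch (status) {
    case 403:
      return new ScraperError('suspended', `Access denied for ${url}; the account may be suspended.`);
    case 404:
      return new ScraperError('not-found', `Nothing found at ${url}; double check the course URL.`);
    case 429:
      return new ScraperError('rate-limited', `Rate limited while requesting ${url}; wait before retrying.`);
    default:
      return new ScraperError('unknown', `Unexpected HTTP status ${status} for ${url}.`);
  }
}
```

A caller could then switch on `err.kind` to decide whether to retry, stop, or ask the user to log in again.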

Error when I type npm run login

npm ERR! missing script: login

npm ERR! A complete log of this run can be found in:
npm ERR! C:\Users\Sam\AppData\Roaming\npm-cache\_logs\2020-04-30T18_09_50_653Z-debug.log

What should I do?

Option to ignore login for free courses

Every week Pluralsight makes 5 free courses available. These do not require a login to query, whereas downloading them while logged in may cause the account to get flagged. An option to download them without logging in would be helpful.
