canvas-student-data-export's Introduction

Introduction

The Canvas Student Data Export Tool can export nearly all of a student's data from the Instructure Canvas Learning Management System (Canvas LMS). This is useful when you are graduating or leaving your college or university and would like a backup of all the data you had in Canvas.

The tool exports all of the following data:

  • Course Assignments
  • Course Announcements
  • Course Discussions
  • Course Pages
  • Course Files
  • Course Modules
  • SingleFile HTML of Assignments, Announcements, Discussions, Modules

The tool will export your data in JSON format, and will organize it nicely into folders named for every term of every year. Example:

  • Fall 2013
    • Econ 101
      • course files
      • modules
      • Econ 101.json
    • English 101
      • course files
      • modules
      • English 101.json
  • Fall 2014
  • Fall 2015
  • Fall 2016
  • Spring 2014
  • Spring 2015
  • Spring 2016
  • Spring 2017
  • Winter 2014
  • Winter 2015
  • Winter 2016
  • Winter 2017
  • all_output.json

Getting Started

Dependencies

To run the program, you will need the following dependencies:
pip install requests
pip install jsonpickle
pip install canvasapi
pip install python-dateutil
pip install PyYAML

npm i github:gildas-lormeau/SingleFile

You can install all of these dependencies at once by running pip install -r requirements.txt and then npm i

Then run from the command line: python export.py

Configuration

These are the configuration parameters for the program:

  • Canvas API URL - this is the URL of your institution, for example https://example.instructure.com
  • Canvas API key - this can be created by going to Canvas and navigating to Account > Settings > Approved Integrations > New Access Token
  • Canvas User ID - this can be found at https://example.instructure.com/api/v1/users/self in the id field
  • Path to Cookies File - the file needs to be in Netscape format; you can export your cookies with a tool like "Get cookies.txt Clean" on Chrome. This can be left blank if HTML snapshots with images are not wanted.
  • Directory in which to save the downloaded course information (will be created if not present)
  • List of Course IDs that should be skipped
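
As a rough illustration of where the User ID comes from, the sketch below requests the /api/v1/users/self endpoint mentioned above and reads the id field. It assumes the API key is sent as a Bearer token in the Authorization header; the function names build_self_request and fetch_user_id are hypothetical and are not part of export.py.

```python
import json
import urllib.request


def build_self_request(api_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the current user's profile (hypothetical helper)."""
    url = api_url.rstrip("/") + "/api/v1/users/self"
    # Assumption: the access token is passed as a Bearer token.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def fetch_user_id(api_url: str, api_key: str) -> int:
    """Return the value of the id field from the users/self response."""
    request = build_self_request(api_url, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["id"]


if __name__ == "__main__":
    # Replace both values with your own institution URL and access token.
    print(fetch_user_id("https://example.instructure.com", "YOUR_API_KEY"))
```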

If SingleFile fails to find your browser, you can set the browser path in singlefile.py. Additional SingleFile arguments can also be added there.

Loading credentials from a file

To avoid manually entering credentials every time you run the program, you can create a credentials.yaml file in the same directory as the script that has the following fields:

API_URL: < URL of your institution >
API_KEY: < API Key from Canvas >
USER_ID: < User ID from Canvas >
COOKIES_PATH: < Path to cookies file >

You can then run the script as normal: python export.py
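
For illustration only, a file with these fields could be read and checked roughly as follows. This is a sketch, not the code export.py actually uses: the helper names are hypothetical, and treating COOKIES_PATH as optional is an assumption based on the note that the cookies path may be left blank.

```python
from pathlib import Path

# Assumption: these three keys are mandatory, COOKIES_PATH may be omitted or blank.
REQUIRED_FIELDS = ("API_URL", "API_KEY", "USER_ID")


def missing_fields(credentials: dict) -> list:
    """Return the required keys that are absent from the mapping."""
    return [key for key in REQUIRED_FIELDS if key not in credentials]


def load_credentials(path: str = "credentials.yaml") -> dict:
    """Parse the YAML file and fail early if a required field is missing."""
    import yaml  # PyYAML, already listed in the dependencies

    with Path(path).open(encoding="utf-8") as handle:
        data = yaml.safe_load(handle) or {}
    missing = missing_fields(data)
    if missing:
        raise KeyError("credentials.yaml is missing: " + ", ".join(missing))
    return data
```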

Contribute

I would love to see this script's functionality expanded and improved! I welcome all pull requests :) Thank you!

canvas-student-data-export's People

Contributors

17acres, alex-bellon, davekats, dj346, mjforan, moorepants, nafeej

canvas-student-data-export's Issues

Cannot find chrome

When I run the program, I occasionally get this error message. I suspect it means that Puppeteer cannot find Chrome. I have updated my singlefile.py file to include these lines:

from subprocess import run

SINGLEFILE_BINARY_PATH = "./node_modules/single-file/cli/single-file"
CHROME_PATH = "/opt/google/chrome/google-chrome" #Uncomment this and set your browser exe if it can't find yours.

def addQuotes(str):
    return "\"" + str.strip("\"") + "\""

def download_page(url, cookies_path, output_path, output_name_template = ""):
    args = [
        addQuotes(SINGLEFILE_BINARY_PATH),
        #"--browser-executable-path=" + addQuotes(CHROME_PATH.strip("\"")), #Uncomment this and set your browser exe if it can't find yours.
        "--browser-cookies-file=" + addQuotes(cookies_path),
        "--output-directory=" + addQuotes(output_path),
        addQuotes(url)
        ]

    if(output_name_template != ""):
        args.append("--filename-template=" + addQuotes(output_name_template))

    try:
        run("node " + " ".join(args), shell=True)
    except Exception as e:
        # str(e) works for any exception; e.strerror only exists on OSError.
        print("Was not able to save the URL " + url + " using singlefile. The reported error was " + str(e))

if __name__ == "__main__":
    download_page("https://www.google.com/", "", "./output/test", "test.html")

(I changed the file location of the Chrome binary.)

After doing this, I still get the error message. I don't know if I am doing anything wrong.
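
Note that in the snippet above the --browser-executable-path line is still commented out, so CHROME_PATH is defined but never passed to SingleFile. The following is a minimal sketch of one way to pass it, using only the flags that already appear in that snippet. The build_command helper is hypothetical, and passing the arguments as a list (without shell=True) is a swapped-in technique that avoids the manual quoting.

```python
import subprocess

SINGLEFILE_BINARY_PATH = "./node_modules/single-file/cli/single-file"
CHROME_PATH = "/opt/google/chrome/google-chrome"  # set to your own browser binary


def build_command(url, cookies_path, output_path, output_name_template="",
                  browser_path=CHROME_PATH):
    """Assemble the node command as a list, so no shell quoting is needed."""
    cmd = ["node", SINGLEFILE_BINARY_PATH]
    if browser_path:
        cmd.append("--browser-executable-path=" + browser_path)
    if cookies_path:
        cmd.append("--browser-cookies-file=" + cookies_path)
    cmd.append("--output-directory=" + output_path)
    cmd.append(url)
    if output_name_template:
        cmd.append("--filename-template=" + output_name_template)
    return cmd


def download_page(url, cookies_path, output_path, output_name_template=""):
    """Run SingleFile and report failures based on the exit code."""
    cmd = build_command(url, cookies_path, output_path, output_name_template)
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        print("Could not start node. Is Node.js installed and on PATH?")
        return False
    if result.returncode != 0:
        print("SingleFile failed for " + url + ": " + result.stderr.strip())
    return result.returncode == 0
```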

Suggestion: download user files

Canvas stores your submissions and other misc. files on the platform and it's pretty easy to download those using their API. Here's how I did it:

https://github.com/Cyberes/canvas-student-data-export/blob/master/module/user_files.py

from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

import canvasapi
from tqdm import tqdm

from module.helpers import make_valid_folder_path


def do_download(task):
    task[1].parent.mkdir(parents=True, exist_ok=True)
    task[0].download(task[1])


def download_user_files(canvas: canvasapi.Canvas, base_path: str):
    base_path = Path(base_path)
    user = canvas.get_current_user()
    folders = []
    for folder in user.get_folders():
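        # removeprefix (Python 3.9+) drops only the literal prefix; lstrip('my files/')
        # strips any of those characters and would turn "my files/images" into "ages".
        n = folder.full_name.removeprefix('my files').lstrip('/')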
        if n:
            c_n = make_valid_folder_path(n)
            folders.append((folder, c_n))

    files = []
    for folder, folder_name in tqdm(folders, desc='Fetching User Files'):
        for file in folder.get_files():
            out_path = base_path / folder_name / file.display_name
            files.append((file, out_path))

    with ThreadPoolExecutor(max_workers=10) as executor:
        futures = [executor.submit(do_download, task) for task in files]
        with tqdm(total=len(files), desc='Downloading User Files') as bar:
            for future in as_completed(futures):
                future.result()  # re-raise any download error instead of silently dropping it
                bar.update()

cannot import name 'Finder' from 'importlib.abc'

I'm getting this error:

line 10, in <module>
    from singlefile import download_page
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/singlefile/__init__.py", line 5, in <module>
    from importlib.abc import Finder, Loader
ImportError: cannot import name 'Finder' from 'importlib.abc' (/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/importlib/abc.py)
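
The traceback shows the import resolving to a singlefile package in site-packages rather than to the project's own singlefile.py, and importlib.abc.Finder no longer exists in Python 3.12. A small sketch for checking which file a module name resolves to, without importing it, is given below. The module_origin helper is hypothetical, and whether an unrelated pip package is the cause here is an assumption.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module name resolves to, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Run this from the directory that contains export.py.
    print(module_origin("singlefile"))
```

If the printed path points into site-packages, running the script from the project directory or uninstalling the unrelated package may be worth trying.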

Unexpected token '??=' when downloading pages

It successfully builds the folder hierarchy, but fails to download any pages because of errors of the form:

Downloading course home page
Unexpected token '??='
  Downloading assignment pages
Unexpected token '??='
(the line above appears 16 times)
  Downloading course module pages
  Downloading course announcements pages
Unexpected token '??='
(the line above appears 44 times)
  Downloading course discussion pages
Unexpected token '??='
Unexpected token '??='

and so on. It seems to correctly populate the json files, but not download any HTML pages.
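
One plausible cause, not confirmed in this report, is an older Node.js runtime: the ??= operator is only parsed by Node.js 15 and later, so earlier versions reject any script that uses it. The sketch below checks the installed version; the helper names are hypothetical.

```python
import re
import shutil
import subprocess

MIN_NODE_MAJOR = 15  # first Node.js major version that parses the ??= operator


def node_major(version_text):
    """Extract the major version from text such as 'v14.21.3'."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    return int(match.group(1)) if match else None


def installed_node_major():
    """Return the major version of the node found on PATH, or None."""
    node = shutil.which("node")
    if node is None:
        return None
    output = subprocess.run([node, "--version"], capture_output=True, text=True).stdout
    return node_major(output)


if __name__ == "__main__":
    major = installed_node_major()
    if major is None:
        print("node was not found on PATH")
    elif major < MIN_NODE_MAJOR:
        print(f"Node.js {major} is too old to parse ??=; consider upgrading")
    else:
        print(f"Node.js {major} should parse ??=")
```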

Probably unrelated, but some other pages also report this error:

Failed to retrieve submissions for this assignment
Forbidden

Since that is less frequent, it seems more likely to be an issue with Canvas than with the exporter.

ModuleNotFoundError: No module named 'canvasapi'

I'm running this on my Mac. I've already installed canvasapi module to my computer by typing in the terminal:
pip install canvasapi

Later on, after having installed everything, I went ahead to run export.py by typing:
python3 export.py

Then I was greeted with the following:

Traceback (most recent call last):
  File "/Users/raymu/Downloads/canvas-student-data-export-master/export.py", line 7, in <module>
    from canvasapi import Canvas
ModuleNotFoundError: No module named 'canvasapi'
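
A common reason for this pattern is that pip and python3 belong to different Python installations, although the report does not confirm that. The sketch below, run with the same python3 used for export.py, prints which interpreter is active and which of the listed dependencies it cannot find. The missing_packages helper is hypothetical.

```python
import importlib.util
import sys

# Import name -> name to pass to pip, for the dependencies listed in the README.
REQUIRED = {
    "requests": "requests",
    "jsonpickle": "jsonpickle",
    "canvasapi": "canvasapi",
    "dateutil": "python-dateutil",
    "yaml": "PyYAML",
}


def missing_packages(modules=REQUIRED):
    """Return pip names of modules the current interpreter cannot find."""
    return [pip_name for module, pip_name in modules.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    missing = missing_packages()
    if missing:
        # Installing through the interpreter itself keeps pip and python in sync.
        print("Try:", sys.executable, "-m pip install", *missing)
    else:
        print("All listed dependencies are importable.")
```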

npm i github command failure

C:\Users\thevi>npm i github:gildas-lormeau/SingleFile
npm ERR! code ENOENT
npm ERR! syscall spawn git
npm ERR! path git
npm ERR! errno -4058
npm ERR! enoent An unknown git error occurred
npm ERR! enoent This is related to npm not being able to find a file.
npm ERR! enoent
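
The "syscall spawn git" and ENOENT lines suggest that npm could not launch git, which it needs in order to install a package from a github: specifier. Assuming that is the cause, a quick check for the command-line tools involved could look like this sketch; the missing_tools helper is hypothetical.

```python
import shutil


def missing_tools(tools=("git", "node", "npm")):
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("git, node and npm are all available.")
```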

Cannot bypass login page for any downloaded HTML files

I am able to download the module HTML files from Canvas, however all of them just show the login page where you type in your university credentials. Even the credentials that I know currently work will not work on the local HTML files.

I opened one in Visual Studio Code to see what it looks like, and it seems that everything except the login portion is base64 encoded, and is likely encrypted. Is there a way around this?
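
A saved login page usually means the browser session was not authenticated, for example because the cookies file is missing, expired, or exported for a different domain. The base64 content is most likely inlined page resources, which is how SingleFile embeds images and fonts, rather than encryption. As a rough check, the sketch below loads a Netscape-format cookies file with the standard library and counts how many cookies match a domain and how many of those have expired. The summarize_cookies helper and the instructure.com hint are assumptions.

```python
import http.cookiejar


def summarize_cookies(path, domain_hint=""):
    """Return (matching, expired) cookie counts for a Netscape-format cookies file."""
    jar = http.cookiejar.MozillaCookieJar()
    # Keep expired and session cookies so they can be counted instead of dropped.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    matching = [cookie for cookie in jar if domain_hint in cookie.domain]
    expired = [cookie for cookie in matching if cookie.is_expired()]
    return len(matching), len(expired)


if __name__ == "__main__":
    total, expired = summarize_cookies("cookies.txt", "instructure.com")
    print(f"{total} cookies for the domain, {expired} expired")
```

If no cookies match, or all of them are expired, exporting a fresh cookies file while logged in to Canvas is a reasonable next step.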

Issue with packages (I think)

Error after running export.py:

Getting list of all courses

Downloading course list page
node:internal/modules/cjs/loader:936
throw err;
^

Error: Cannot find module 'C:\Users\Nick\ALLCOURSES\node_modules\single-file\cli\single-file'
at Function.Module._resolveFilename (node:internal/modules/cjs/loader:933:15)
at Function.Module._load (node:internal/modules/cjs/loader:778:27)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:77:12)
at node:internal/main/run_main_module:17:47 {
code: 'MODULE_NOT_FOUND',
requireStack: []
}

Having a hard time interpreting this. I have all the packages installed.
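
The error shows node looking for the SingleFile CLI under the current working directory (C:\Users\Nick\ALLCOURSES\node_modules), so either npm i was run in a different folder or the installed package no longer has that layout. Both explanations are assumptions. A small sketch to check whether the expected path exists is shown below; the resolve_cli helper is hypothetical, and the relative path is the one used in singlefile.py.

```python
from pathlib import Path

SINGLEFILE_BINARY_PATH = "./node_modules/single-file/cli/single-file"


def resolve_cli(base_dir=".", relative=SINGLEFILE_BINARY_PATH):
    """Return the resolved CLI path if it exists (with or without .js), else None."""
    candidate = (Path(base_dir) / relative).resolve()
    for path in (candidate, candidate.with_name(candidate.name + ".js")):
        if path.exists():
            return path
    return None


if __name__ == "__main__":
    found = resolve_cli()
    print(found if found else "SingleFile CLI not found relative to " + str(Path.cwd()))
```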

Was not able to save the URL using singlefile. The reported error was No such file or directory

Describe the bug
% python export.py

Downloading course list page
Was not able to save the URL https://xxxx.instructure.com/courses/ using singlefile. The reported error was No such file or directory

The issue is only with single-file; the rest of the program still runs to completion with the expected output.

To Reproduce
Steps to reproduce the behavior:

  1. Manual installation:
    pip install -r requirements.txt
    sudo apt install git
    npm i github:gildas-lormeau/SingleFile

  2. Run: python export.py

  3. output snippet below

Welcome to the Canvas Student Data Export Tool

Connecting to canvas

Creating output directory: ./output

Getting list of all courses

  Downloading course list page
Was not able to save the URL https://xxxx.instructure.com/courses/ using singlefile. The reported error was No such file or directory
Working on 2022 Fall C: CSE XXX: Algorithms (2022 Fall - C)
  Getting assignments

Expected behavior
Save https://xxxx.instructure.com/courses/ to courses_list.html in ./output.

Environment
