
Scripts to support homework submission via Git and GitHub

git-hw-submissions's Introduction

Submission Management Support Scripts

REPO: https://github.com/ssardina-teaching/git-hw-submissions

These are some useful scripts that I use in teaching:

  • gh_classroom_collect.py: will collect all repos in a given GitHub Classroom/Organization for a given assignment.
  • git_clone_submissions.py: will clone and update a set of repositories (provided in a CSV file) for a given submission tag.
  • gh_authors_collect.py: extract the number of commits per author in a set of GitHub repositories. This can be used to analyse student contributions.
  • git_create_wiki.py: will push a template Wiki to a list of GitHub repos.
  • gh_member_bulk_team.py: add/delete GH username to a list of teams in an organization (e.g., to add tutors to groups so they can see student repos).
  • gh_pr_feedback_check_merged.py: check whether the GH Classroom Feedback PRs have been (wrongly) merged in each repo.
  • gh_pr_feedback_comment.py: push feedback marking to repos' Feedback PRs.
  • gg_get_worksheet.py: dump a Google Sheet worksheet as a CSV file (usually the marking sheet).
  • git_batch_commit.sh: a shell script template to make changes to a collection of repos.

Other scripts (under other-scripts/ folder):

  • gh_scrape_scrape.py: Scrape GitHub for repo info via searches.
  • gh_clone_repos.py: Clones a set of GitHub repos.

All these scripts were tested under Python 3.6+.

To install all requirements:

$ sudo pip install -r requirements.txt

gh_classroom_collect.py: collect repos from a GH Organization

The script gh_classroom_collect.py produces a CSV file with all the repos in a given GitHub Classroom for a particular assignment.

For each repo, the CSV file contains the following information:

  • the organization name of the GitHub classroom;
  • the assignment prefix name;
  • the user of the repo;
  • the GitHub name of the repo; and
  • the full SSH git URL specification.

The script requires a username and either its password or a file containing a GitHub access token that allows access to the organization.

For example, to get all the repos submitted for assignment with prefix p0-warmup into a CSV file p0-repos.csv:

$ python3 ../git-hw-submissions.git/gh_classroom_collect.py -u ssardina -t ~/.ssh/keys/github-token-ssardina-new-May_5-2021.txt RMIT-COSC1127-1125-AI21  p0-warmup p0-repos.csv

This will produce a CSV of this form:

ORG_NAME,ASSIGNMENT,REPO_ID,REPO_NAME,REPO_URL
RMIT-COSC1127-1125-AI21,p0-warmup,CallumA3791362,RMIT-COSC1127-1125-AI21/p0-warmup-CallumA3791362,git@github.com:RMIT-COSC1127-1125-AI21/p0-warmup-CallumA3791362.git
RMIT-COSC1127-1125-AI21,p0-warmup,eolivesjo,RMIT-COSC1127-1125-AI21/p0-warmup-eolivesjo,git@github.com:RMIT-COSC1127-1125-AI21/p0-warmup-eolivesjo.git
RMIT-COSC1127-1125-AI21,p0-warmup,bivhitscar,RMIT-COSC1127-1125-AI21/p0-warmup-bivhitscar,git@github.com:RMIT-COSC1127-1125-AI21/p0-warmup-bivhitscar.git
...
...
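
A minimal sketch of how a CSV of this form could be consumed in Python, assuming the header names shown above. The helper name load_repos is hypothetical and is not part of the scripts:

```python
import csv
import io


def load_repos(csv_file):
    """Return a list of (REPO_ID, REPO_URL) pairs from an open CSV file.

    Assumes the header row shown above; load_repos is a hypothetical helper.
    """
    reader = csv.DictReader(csv_file)
    return [(row["REPO_ID"], row["REPO_URL"]) for row in reader]


# In-memory sample mirroring one row of the output above.
sample = io.StringIO(
    "ORG_NAME,ASSIGNMENT,REPO_ID,REPO_NAME,REPO_URL\n"
    "RMIT-COSC1127-1125-AI21,p0-warmup,eolivesjo,"
    "RMIT-COSC1127-1125-AI21/p0-warmup-eolivesjo,"
    "git@github.com:RMIT-COSC1127-1125-AI21/p0-warmup-eolivesjo.git\n"
)
repos = load_repos(sample)
print(repos)
```

For a real file, pass an object opened with open("p0-repos.csv", newline="") instead of the in-memory sample.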

If we want to map repo suffixes (github_username) to student ids (identifier), we can use --student-map cosc1127-map.csv.

Next, we can use that CSV file to clone the corresponding repos at a given tag submission using the script git_clone_submissions.py.

$ python ../git-hw-submissions.git/git_clone_submissions.py --file-timestamps test/cosc1127_timestamps.csv p0/cosc1127-repos-p0.csv submission p0/

git_clone_submissions.py: batch git cloning

The script git_clone_submissions.py clones a set of student/team repositories, listed in a CSV file, at a given tagged commit. The CSV file should contain the team name (under column name TEAM) and a Git SSH link (under column name GIT-URL).

If a repository already exists, it will be updated automatically:

  • if the tag changed to a different commit, the new commit will be pulled;
  • if the repo does not have the tag anymore (the student has withdrawn the submission), the local copy will be removed from disk.
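
The update rules above can be sketched as a small decision function. This is an illustration only: the function name sync_action and the action labels are hypothetical, and the actual script may organise this logic differently.

```python
from typing import Optional


def sync_action(local_commit: Optional[str], remote_tag_commit: Optional[str]) -> str:
    """Decide what to do with one repo.

    local_commit: commit currently checked out locally, or None if no clone exists.
    remote_tag_commit: commit the submission tag points to, or None if the tag is gone.
    """
    if remote_tag_commit is None:
        # Tag withdrawn: drop any local copy, otherwise there is nothing to do.
        return "remove" if local_commit is not None else "skip"
    if local_commit is None:
        return "clone"
    if local_commit != remote_tag_commit:
        # The tag moved to a different commit since the last run.
        return "update"
    return "unchanged"


print(sync_action("abc123", "def456"))
```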

At the end, the script produces a CSV file with the information of each repo successfully cloned, including commit id (commit), time of the commit (submitted_at), and time of the tagging (tagged_at).

The script depends on the GitPython module:

$ pip3 install gitpython --user

For example, to clone Project 0 at commit with tag "submission" using the database of repos p0-repos.csv:

$ python ../git-hw-submissions.git/git_clone_submissions.py --file-timestamps p0/cosc1127_timestamps.csv p0-repos.csv submission p0/ 2>&1 | tee p0/clone-p0.txt

All repos will be cloned into folder p0/, and the file p0/cosc1127_timestamps.csv will contain the timestamps and commits of each repo cloned successfully. The file records the date of the commit the tag points to and, if the tag is annotated (rather than lightweight), also the date of tagging; for lightweight tags the two dates are assumed to be the same. See the Git documentation on annotated vs lightweight tags.
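
A minimal sketch of the annotated vs lightweight distinction, assuming an object shaped like a GitPython TagReference (its .tag attribute is None for a lightweight tag). The function name tag_times is hypothetical, and the stand-in objects below only mimic the attributes used, so the sketch runs without a repository:

```python
from types import SimpleNamespace


def tag_times(tag_ref):
    """Return (commit_sha, committed_epoch, tagged_epoch) for a tag reference.

    tag_ref is expected to look like a GitPython TagReference: `.commit` is the
    tagged commit and `.tag` is the annotated tag object, or None for a
    lightweight tag.
    """
    commit = tag_ref.commit
    if tag_ref.tag is None:
        # Lightweight tag: no tagging date is stored, so reuse the commit date.
        tagged = commit.committed_date
    else:
        tagged = tag_ref.tag.tagged_date
    return commit.hexsha, commit.committed_date, tagged


# Stand-ins for GitPython objects (values are illustrative).
commit = SimpleNamespace(hexsha="2413c62", committed_date=1_595_751_824)
lightweight = SimpleNamespace(commit=commit, tag=None)
annotated = SimpleNamespace(commit=commit, tag=SimpleNamespace(tagged_date=1_595_800_000))

print(tag_times(lightweight))
print(tag_times(annotated))
```

With GitPython installed, Repo(path).tags["submission"] should yield an object of this shape.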

To just clone the last commit in the master branch, use master as the tag.

The timezone used is defined by the constant TIMEZONE in the script (defaults to the Australia/Melbourne time zone).
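
As an illustration of timezone-aware conversion, the sketch below turns a UTC epoch timestamp into local time for a named zone. It assumes Python 3.9+ (for zoneinfo) and an available time zone database; the function name to_local is hypothetical, and the script itself may use a different library.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

TIMEZONE = "Australia/Melbourne"  # mirrors the constant named above


def to_local(epoch_seconds: int, tz_name: str = TIMEZONE) -> datetime:
    """Convert a UTC epoch timestamp to an aware datetime in tz_name."""
    utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return utc.astimezone(ZoneInfo(tz_name))


local = to_local(1_595_751_824)  # 2020-07-26 08:23:44 UTC
print(local.isoformat())
```

Starting from an explicit UTC datetime keeps the result independent of the host machine's local time zone setting.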

gh_authors_collect.py: extract author commit stats

Given a CSV file with a collection of repositories, extract in a CSV file how many commits each author has done per repo. For example:

python3 git-hw-submissions.git/gh_authors_collect.py -u ssardina \
    -t ~/.ssh/keys/github-token-ssardina.txt \
    --tag submission ai20-p2-repos.csv ai20-p2-authors.csv

The --tag option restricts processing to tags whose names end with the given string. If no tag is given, the whole repo is parsed.

The input csv file must have the fields:

  • REPO_NAME: the full repo name: owner/organization + name of repo.
  • REPO_ID: the id of the repo (e.g., team name).
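
A rough sketch of the counting step, assuming the commits have already been fetched as (repo id, author) pairs. The output column names and the sample data are hypothetical; the real script's output format may differ.

```python
import csv
import io
from collections import Counter


def count_commits(commits):
    """Count commits per (repo_id, author) from an iterable of such pairs."""
    return Counter(commits)


def write_counts(counts, out_file):
    """Write one row per (repo, author) pair, sorted for stable output."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["REPO_ID", "AUTHOR", "COMMITS"])  # hypothetical columns
    for (repo_id, author), n in sorted(counts.items()):
        writer.writerow([repo_id, author, n])


# Made-up data: the repo ids and author names are placeholders.
commits = [("team-a", "alice"), ("team-a", "bob"), ("team-a", "alice")]
out = io.StringIO()
write_counts(count_commits(commits), out)
print(out.getvalue())
```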

gh_create_wiki.py: push Wiki template to list of repos

Example:

$ python3 gh_create_wiki.py ../ai20-contest-repos.csv ~/AI20/assessments/project-contest/updated-src/wiki-template/

gh_member_bulk_team.py: add/delete GH username to GH teams

For example, to add Axel to all the teams except teachers:

$ python gh_member_bulk_team.py RMIT-COSC2780-2973-IDM24  axelahmer  --nteams  "teachers" "headtutor"
Running the script on: 2024-05-18-00-35-27
Sat, 18 May 2024 00:35:27 INFO     Getting organization RMIT-COSC2780-2973-IDM24...
Sat, 18 May 2024 00:35:27 INFO     Getting GH user for axelahmer...
Teams available: ['AI NPCs', 'ASP Dads', 'Galacticos', 'gASP', 'Harry Ron and Hermoine', 'IDM Project', 'Intellect Realm', 'Inter-Dimensional Masochists (IDM)', 'Logic Nexus', 'Lorem Ipsum', 'Mister World Wide', 'Prolog nightmares again', 'sajeevan', 'Super awesome team name', 'teachers', 'TRY']
Adding user **axelahmer** to team AI NPCs
Adding user **axelahmer** to team ASP Dads
Adding user **axelahmer** to team Galacticos
Adding user **axelahmer** to team gASP
Adding user **axelahmer** to team Harry Ron and Hermoine
Adding user **axelahmer** to team IDM Project
Adding user **axelahmer** to team Intellect Realm
Adding user **axelahmer** to team Inter-Dimensional Masochists (IDM)
Adding user **axelahmer** to team Logic Nexus
Adding user **axelahmer** to team Lorem Ipsum
Adding user **axelahmer** to team Mister World Wide
Adding user **axelahmer** to team Prolog nightmares again
Adding user **axelahmer** to team sajeevan
Adding user **axelahmer** to team Super awesome team name
Adding user **axelahmer** to team TRY
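
The exclusion shown in the run above amounts to filtering the list of available teams. A minimal sketch, with a hypothetical function name and a shortened team list:

```python
def teams_to_update(available, excluded):
    """Return the teams in `available` whose names are not in `excluded`."""
    skip = set(excluded)
    return [team for team in available if team not in skip]


available = ["AI NPCs", "ASP Dads", "teachers", "TRY"]
selected = teams_to_update(available, ["teachers", "headtutor"])
print(selected)
```

Excluded names that do not exist in the organization (here "headtutor") are simply ignored.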

Some useful commands

Once all git repos have been cloned in git-submissions/, one can build zip files from the submissions into directory zip-submissions/ as follows:

for d in git-submissions-p2/*; do echo "============> Processing ${d}" ; zip -q -j "./zip-submissions-p2/`basename "$d.zip"`" "${d}"/p2-multiagent/* ;done

or for the final CTF project:

for d in git-submissions-p4/*; do echo "============> Processing ${d}" ; zip -q -j "./zip-submissions-p4/`basename "$d.zip"`" "${d}"/pacman-contest/* ;done

To count the number of commits between dates:

git log --after="2018-03-26T00:00:00+11:00" --before="2018-03-28T00:00:00+11:00" | grep "Date:" | wc -l
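
The same count could be obtained from Python with git rev-list --count, which avoids matching "Date:" text inside commit messages. This is a sketch under assumptions: both function names are hypothetical, and it requires Python 3.7+ and a git executable on the PATH.

```python
import subprocess


def rev_list_count_cmd(after, before, rev="HEAD"):
    """Build a `git rev-list` command that counts commits in a date window."""
    return ["git", "rev-list", "--count", f"--after={after}", f"--before={before}", rev]


def count_commits_between(repo_dir, after, before, rev="HEAD"):
    """Run the command inside repo_dir and return the count as an int."""
    result = subprocess.run(
        rev_list_count_cmd(after, before, rev),
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return int(result.stdout.strip())


cmd = rev_list_count_cmd("2018-03-26T00:00:00+11:00", "2018-03-28T00:00:00+11:00")
print(" ".join(cmd))
```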

To copy just the new zip files:

rsync  -avt --ignore-existing  zip-submissions-p4/*.zip AI18-assessments/project-4/zip-submissions/

git_batch_commit.sh: a shell script template to make changes to a collection of repos.

This script commits and pushes changes to a collection of repos, for example to edit students' repos after they have been created.

git-hw-submissions's People

Contributors

andrewpaulchester, ssardina, thundergolfer-two

git-hw-submissions's Issues

Script to post comments on Feedback PRs

As discussed with @AndrewPaulChester just now, we want a script that goes through each repo in the repo.csv database and makes a comment in the repo Feedback PR to paste the marking results.

The input will be:

  • CSV file of the MARKING spreadsheet
  • Text file of the autograder marking

This way we get rid of bulk email via YAMM and links to Google Drive for the report; everything stays within the Feedback PR in the GitHub repos.

We will use the PyGithub Python library. Here is an example of how to add a comment:

https://pygithub.readthedocs.io/en/latest/examples/PullRequest.html?highlight=pull%20request

Basically, we want to paste a form of this table which used to be sent by email to students:

image
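
One possible shape for such a script, sketched under assumptions: the table columns, the function names, and the PR title "Feedback" are all hypothetical choices for illustration. The repo argument is expected to behave like a PyGithub Repository, using only get_pulls() and the pull request's title and create_issue_comment().

```python
def format_feedback(marks):
    """Render a dict of {item: mark} as a Markdown table for a PR comment."""
    lines = ["| Item | Mark |", "| --- | --- |"]
    lines += [f"| {item} | {mark} |" for item, mark in marks.items()]
    return "\n".join(lines)


def post_feedback(repo, body, pr_title="Feedback"):
    """Comment on the first PR titled pr_title; return True if one was found.

    repo is expected to behave like a PyGithub Repository object.
    """
    for pr in repo.get_pulls(state="all"):
        if pr.title == pr_title:
            pr.create_issue_comment(body)
            return True
    return False


# Made-up marking row, for illustration only.
body = format_feedback({"Q1": "3/3", "Q2": "2/4"})
print(body)
```

With PyGithub, a repo object would typically come from Github(token).get_repo("owner/name").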

let's do it!! 💪

Cloning and date of tagging

It seems the script does not differentiate the date of the commit being tagged (submitted_at) from the date the tag was created (tagged_at):

team | submitted_at | commit | tag | tagged_at
3427684 | 26/7/2020 18:23:44 | 2413c6288fe3eb74eaad902588a08704b754ef44 | submission | 26/7/2020 18:23:44

This comes from this repo whose tag was changed later than 26/7:

https://github.com/RMIT-COSC1127-1125-AI/p0-warmup-sunilXY

Better handling of individual/team assignments

Add an explicit --team feature or something to treat individual and team assignment differently.

When we get the list of repos in a class organization with an assignment prefix, all we obtain is the suffix of the repo. For example: p1-multiagent-<suffix>

When exporting the student/team class list in GitHub Classroom (GC) we get:

identifier | github_username | github_id | name | group_name

The identifier is basically the student number or id given to GC.

  • In individual assignments, the <suffix> is the GitHub username of the student who accepted the assignment, stored in the column github_username
  • In a team assignment, the <suffix> is the group/team name, stored in the column group_name

So, the script should behave like this:

  • If it is a team assignment, then just use the suffix of the repo as the clone name. These suffixes correspond to the group_name, and many student identifiers will share the same group name.
  • If it is an individual assignment, then we can either take the suffix as is (in which case the mapping to student identifier/number has to be done outside), or use the identifier as the name of the clone dir (which gives, for example, student numbers directly).
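
The proposed behaviour could look roughly like the sketch below. The function name, the team flag, and the roster dictionary are hypothetical; the roster would be built from the github_username and identifier columns of the GitHub Classroom export.

```python
def clone_dir_name(suffix, team=False, id_by_username=None):
    """Pick the local directory name for a repo with the given suffix.

    team: True for team assignments, where the suffix is already the group name.
    id_by_username: optional {github_username: identifier} map built from the
    GitHub Classroom roster export; used only for individual assignments.
    """
    if team or not id_by_username:
        return suffix
    # Fall back to the raw suffix when the username is not in the roster.
    return id_by_username.get(suffix, suffix)


roster = {"octocat": "s1234567"}  # made-up username and student number
print(clone_dir_name("octocat", id_by_username=roster))
```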

Orphan branches and missing commits

Consider this repo:

https://github.com/RMIT-COSC1127-1125-AI/p2-multiagent-aoligei

It is giving me only 4 commits to tag submission but there are more:

[ssardina@pacman-ai20-1 aoligei]$ gitlog --all
* bac5b3d (HEAD, tag: submission, origin/final) ready to submit
* 48eb03e questions done
* eb3485e (origin/q3_q4) q5 done, but I think it still can be improved
* 6a6ae66 q3 and q4
* 90ff36e (origin/q5) q5
*   5fba808 (origin/master, origin/HEAD) Merge pull request #2 from RMIT-COSC1127-1125-AI/q2
|\  
| * 5be6c6b (origin/q2) implemented q2, passed all tests
* |   445c04e Merge pull request #1 from RMIT-COSC1127-1125-AI/q1
|\ \  
| |/  
|/|   
| * 20722dc (origin/q1) deleted debug comments, formatted code for easier read
| * 2cfae3c implented q1, passed all test, left debug codes as comments for future use.
|/  
* 03f3a7f Initial commit

It only recognizes commits starting from 6a6ae66! In fact commit 6a6ae66 does not seem to have any parent when I process it one by one via PyGithub.

This is what I get:

[ssardina@pacman-ai20-1 aoligei]$ git shortlog   submission
Weiyi (4):
      q3 and q4
      q5 done, but I think it still can be improved
      questions done
      ready to submit

So how do we handle those cases to get the number of commits per student correctly?

Problem with commit dates and timezones

Under some systems (e.g., nectar) the timestamp produced for the tagged commit is in UTC, without the +10 offset added.

In this case it caused all timestamps to be behind by 10hrs...

we need to make this robust

Delete students in GH Classroom

There are bad students in GH added with suffix -1, basically duplicates:

image

Can we virtually scrape the pages and virtually hit the "delete" user button and then confirm:

image

Of course a challenge will be to authenticate in GH, but it should be possible with an access token.

Parse all branches

When getting the commits of all authors, extract branches first and then parse all branches

This will allow getting the number of commits a user did in ALL the repo, not just the main branch
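
A sketch of the idea on an in-memory commit graph: start from every branch tip and count each commit once, so commits shared between branches are not double-counted. The graph format and function name are hypothetical; in practice the tips and parents would come from the GitHub API or a local clone.

```python
from collections import Counter


def commits_per_author(graph, tips):
    """Count each reachable commit once, starting from every branch tip.

    graph maps sha -> (author, [parent shas]); tips is an iterable of shas.
    """
    seen = set()
    stack = list(tips)
    while stack:
        sha = stack.pop()
        if sha in seen:
            continue
        seen.add(sha)
        stack.extend(graph[sha][1])
    return Counter(graph[sha][0] for sha in seen)


# Made-up history: "c" is a root commit with no parent, so walking from the
# tip "d" alone never reaches "a" and "b" on the other branch.
graph = {
    "a": ("alice", []),
    "b": ("bob", ["a"]),
    "c": ("alice", []),
    "d": ("alice", ["c"]),
}
only_one_tip = commits_per_author(graph, ["d"])
all_branches = commits_per_author(graph, ["b", "d"])
print(only_one_tip, all_branches)
```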

SSH access error to GH when cloning/updating

In nectar, (not at home), after cloning some repos, it starts throwing access/ssh errors:

image

I think this may have to do with having too many ssh connections one after another....

The manual fix I have now is to add a sleep after every repo clone:

image

This is git_clone_submissions.py
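
An alternative to sleeping after every clone is to pause only when a clone fails and then retry. The sketch below assumes the SSH errors are transient, which has not been confirmed; the function name and default values are hypothetical, and clone stands for any callable that raises on failure.

```python
import time


def clone_with_retry(clone, url, attempts=3, delay=5.0, sleep=time.sleep):
    """Call clone(url), pausing and retrying when it raises.

    The delay doubles after each failed attempt. The last error is re-raised
    if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return clone(url)
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= 2
```

The sleep parameter is injectable so the behaviour can be checked without actually waiting.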

Weird error when creating os.makedirs

Original report by Sebastian Sardina (Bitbucket).

@thundergolfer, this is a bit strange but somehow it fails to make the directory for a team:

[ssardina@Thinkpad-X1 script-tools.git]$ python3 process-ai-teams.py ai18-repos-full.csv submission git-repos
WARNING:__main__:Student number ID '0' is invalid
git-repos/ABPK
Traceback (most recent call last):
  File "process-ai-teams.py", line 260, in <module>
    main()
  File "process-ai-teams.py", line 256, in main
    team.process(config)
  File "process-ai-teams.py", line 146, in process
    os.makedirs(dest_repo_folder)
  File "/usr/lib/python3.5/os.py", line 226, in makedirs
    head, tail = path.split(name)
  File "/usr/lib/python3.5/posixpath.py", line 103, in split
    i = p.rfind(sep) + 1
AttributeError: 'PosixPath' object has no attribute 'rfind'

Any clue what may be happening? strange bc it must be working for you!
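
A likely cause, given the python3.5 paths in the traceback: os.makedirs only accepts path-like objects such as pathlib.Path from Python 3.6 onwards. A minimal workaround sketch, with a hypothetical helper name, converts the path to a string first:

```python
import os
import tempfile
from pathlib import Path


def ensure_dir(path):
    """Create path (and any missing parents) whether it is a str or a Path."""
    # str() keeps os.makedirs happy on Python 3.5, which predates os.PathLike.
    os.makedirs(str(path), exist_ok=True)
    return Path(path)


with tempfile.TemporaryDirectory() as tmp:
    created = ensure_dir(Path(tmp) / "git-repos" / "ABPK")
    existed = created.is_dir()
    print(existed)
```

Calling path.mkdir(parents=True, exist_ok=True) directly on the Path object is another option.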

Timestamp backup fails with underscore in names of files

When the script copies the current timestamp file to a backup file, it leaves an empty file if the filename contains an underscore, like cosc1127_timestamp.csv. The problem is strange and shutil.copy does the job when running it interactively:

image

Issues with git submissions that already exist

Original report by Sebastian Sardina (Bitbucket).

@thundergolfer , I think I answered the question I just sent you:

fatal: destination path 'git-repos/ABPK' already exists and is not an empty directory.
ERROR:__main__:Repo clone failed for '[email protected]:AlexBorg/ai18-pacman-projects-3490810.git'. exit code: 128
ERROR:__main__:Failed to complete processing for team: ABPK

One would not be cloning from scratch every repo, that would only happen once at the start. Then if a repo already exists locally then it should be OK. Script should pull/update and check for updated tags.

Does it make sense?
