pedal-edu / pedal

A collection of tools to analyze students' Python source code

Home Page: https://pedal-edu.github.io/pedal/

License: MIT License

Languages: Python 98.33%, Jupyter Notebook 1.51%, Makefile 0.09%, Batchfile 0.06%
Topics: feedback, education, learning-python, python, grading, autograder


pedal's Issues

Custom Environments

[ ] Set up the infrastructure for custom environments (a rough sketch follows below)
[ ] Make the BlockPy environment
[ ] Make the Command Line environment
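
A minimal sketch of what that infrastructure could look like; the class names and the resolve hook are assumptions, not a committed API:

class Environment:
    """Bundles a platform's resolver and output conventions."""
    def __init__(self, report):
        self.report = report

    def resolve(self):
        raise NotImplementedError

class BlockPyEnvironment(Environment):
    def resolve(self):
        # Assumed hook: ask the report for its final, resolved feedback
        final = self.report.resolve()
        return {"message": final.message, "success": final.success}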

Initial Major Restructure

Here's what I propose for the directory layout (based on the advice at http://docs.python-guide.org/en/latest/writing/structure/):

Top-Level:

  • tests/
  • docs/
  • pedal/
  • README.rst
  • LICENSE
  • Makefile
  • requirements.txt
  • setup.py

All tests will go in the tests/ directory. We're already using unittest, so we can keep using that. I'll set up the Makefile to run the relevant tests conveniently. I'll also be setting up a code coverage reporter.

We should probably use Sphinx to generate our documentation. I'm going to vote for Google style docstrings - I've been using them in Tifa and I think they're a bit more concise than the regular Sphinx ones. https://pythonhosted.org/an_example_pypi_project/sphinx.html#function-definitions

The Pedal module will have the following submodules:

  • cait/: AST matching and code configuration
  • tifa/: Type inferencing, flow analysis
  • environments/: Compatibility interfaces for BlockPy, Jupyter Notebooks, Web-CAT, etc. I'm not entirely clear what exactly goes in here, but one example is the Magic stuff for Jupyter Notebooks (but maybe that'll end up being distributed across other modules).
  • feedback/: The logic from feedback.js that takes all the reports and dispatches them.
  • mistakes/: Your collected mistake identifications (formerly the instructor_* pattern groups)
  • report/: The functions for triggering feedback strings (e.g., explain, gently, set_success)
  • sandbox/: A module for safely executing student code and checking the result.

These should be organized with absolute imports. The __init__.py file in each module should expose the client-facing functions/variables/classes.
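
As a sketch of that exposure pattern (the submodule and function names are illustrative):

# pedal/report/__init__.py
# Re-export the client-facing functions with absolute imports, so that
# instructors can simply write `from pedal.report import explain`.
from pedal.report.imperative import explain, gently, set_success

__all__ = ['explain', 'gently', 'set_success']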

Let me know if you have a preferred name for mistakes/ - perhaps misconceptions/ is better for that.

Here's a rough outline of dependencies between these libraries:

  • Report can be used by TIFA and Sandbox
  • TIFA and Sandbox can be used by CAIT
  • CAIT can be used by Mistakes
  • Mistakes can be used by Feedback and Environments

Data State for BoolType

set_source("var = 14\n"
           "var = '14'\n"
           "var = False\n")
match = find_match("_var_")
data_state = match['_var_']
data_state = match['_var_'].get_data_state()
print(data_state.type) #UnknownType
print(data_state.trace[0].type)#StrType
print(data_state.trace[0].trace[0].type)#NumType

I expected the first print statement to print BoolType instead of UnknownType

Better Assertions

[ ] Move utility functions into a Static assertions module
[ ] Move existing assertions into a Runtime assertions module
[ ] Clean up existing code to use new feedback style
[ ] Finish implementing all the classic assert* functions (a rough sketch follows below)
[ ] Unit tests
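
For a sense of the target, here's a minimal sketch of one classic assert* function in the new feedback style; the exact signature and message are assumptions:

# Sketch only: `gently` is the Report-triggering function named in the
# restructure plan; its import path and this signature are assumptions.
from pedal.report import gently

def assert_equal(left, right):
    """Runtime assertion: attach feedback when two values differ."""
    if left == right:
        return True
    gently("I ran your code and expected {!r}, but it produced {!r}."
           .format(right, left))
    return False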

Better traceback output for runtime errors

I'm thinking we should make the following changes:

  1. Flip the order so that the call stack starts at the top instead of the bottom.
  2. Use Markdown formatting for each line:
     • Bullet points for each row of the traceback
     • Syntax highlighting for the Python code lines
     • Set the actual error message apart more clearly
  3. Make sure that the explanatory text appears.
  4. Put the "When I tried running X" line at the top.

Indexing mistake in Tifa

This indexing shouldn't be valid: 'title' is a string literal, and indexing a string with another string raises a TypeError at runtime, so TIFA should flag it.

movie_list = [{'Title':'Iron Man', 'Release Date':'May 2, 2008','Budget':'140','Running Time':'126'},
              {'Title':'Iron Man 2', 'Release Date':'May 7, 2010','Budget':'200', 'Running Time':'125'},
              {'Title':'Iron Man 3', 'Release Date':'May 3, 2013','Budget':'200', 'Running Time':'130'}]
total=0
for title in movie_list:
    total=total + 'title'['Running Time']
print(total)

Category/Tool/Label/Priority Naming Rules

I think we should probably establish naming conventions for these symbols. Here's my current ruleset:

Tool: The internal name for a Tool. This won't be shown to a student, but could be useful for internal tracking purposes.
Category: A human-readable general description of feedback, such as "syntax" or "runtime".
Label: A human-readable specific description of the feedback, such as the specific error type or analyzer issue. If it begins with an asterisk, it is meant to be used more internally than student-facing.
Priority: A priority level (low, medium, high) or a category to put this feedback before.

All of the above:

  • Are case-insensitive
  • Favor spaces over camelCase and underscores
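
For instance, a couple of hypothetical feedback entries following these rules:

# Hypothetical examples of the naming rules applied (all values illustrative)
feedback_one = {"tool": "tifa", "category": "analyzer",
                "label": "incompatible types", "priority": "high"}
# The leading asterisk marks a label meant for internal use, not students
feedback_two = {"tool": "sandbox", "category": "runtime",
                "label": "*sandbox failure", "priority": "syntax"}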

Consider bundling a markdown parser/generator

Since the new message format allows Markdown (and, by extension, HTML), we should consider having a Markdown->HTML tool as an optional dependency. Potentially, we should even bundle it for those who want pure Python. We need to investigate whether that's a necessity, though. Perhaps all of this should just live in a simple "HTML Only Environment"?
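
For example, a minimal sketch treating the third-party markdown package as the optional dependency, with a plain-text fallback:

# Treat the `markdown` package as an optional dependency
try:
    import markdown

    def to_html(message):
        return markdown.markdown(message)
except ImportError:
    def to_html(message):
        # No parser available: pass the raw message through unchanged
        return message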

TIFA could map AST Nodes to Values

TIFA walks the AST and abstractly evaluates the program. At each step of the tree, it has a best guess (or guesses, technically) about what the current type of that expression is. We could easily produce a dictionary that maps instances of AST nodes from the tree to the type that was determined.

The biggest potential application of this is connecting that data back to CAIT. If you found an expression match in the source code, TIFA could tell you what type of value it is - not just when it is a Name, but when it's anything!

Student code:

my_list = ["Ice", "Fire", "Thunder"]
my_value = my_list[1]

Instructor code:

match = find_match("my_value = __expr__")
tifa_check_type(match['__expr__'])
# Would be a Tifa `StrType`

I'm not sure what the exact interface should look like; I kind of prefer the idea that TIFA would expose a function, rather than this being a method inside CAIT. However, I'm open to alternative ideas.
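
For concreteness, a minimal sketch of the TIFA-side approach; node_types and check_type are assumptions, not existing API:

class Tifa:
    def __init__(self):
        # Filled in during the AST walk: id(ast_node) -> inferred Type
        self.node_types = {}

    def check_type(self, cait_node):
        """Look up the inferred type for a CAIT match's underlying AST node."""
        return self.node_types.get(id(cait_node.ast_node))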

Incompatible types could talk more about the values

If you have code like the following:

(1 + 2) + ""

Tifa reports that you can't add a number and a string. The feedback could be more precise, however, and explain that (1 + 2) produced a number. At the very least, it could point to the expression as one of the two components (heck, they could even be highlighted).

Update examples

Most of the examples in the examples/ folder are from earlier versions of Pedal, and don't reflect current practice. We need to update them.

Unify Pedal's Type System

I find that Pedal's type system is growing increasingly complex. The logic started in TIFA, was partially included in CAIT, became relevant to many functions in the Toolkit, and is now also a very big part of Assertions. I've backed into supporting both static types and runtime types, from a wide variety of sources:

  • Inferred types from static analysis (half of TIFA's business)
  • AST annotation nodes (the new optional static typing feature)
  • Python type functions (e.g., list, int)
  • A JSON-encoded representation from CORGIS
  • String representations of the types (which, at some point in Python's future, will be the only representation of AST annotations!)
  • The Python typing module (well, not really, but we could)

It seems like it might be time to promote Pedal's type system to something more sophisticated and customizable, extracting it out of TIFA into its own Tool. I'm imagining a system where the savvy teacher could set flags and make modifications to the Type system in order to achieve certain experiences - for example, you might disable support for my wacky [int] list annotations or allow None to be used for certain types (default Optional behavior). The main value would be a consistent API for other Tools to refer to types, and make it clear what type-consuming functions are allowed to consume (e.g., can you pass in a string-represented-type to assert_is_instance?).
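
A rough sketch of the kind of single entry point this could provide; every name here is an assumption about a future API, not existing Pedal code:

import ast

class Type:
    """Stand-in for a canonical Pedal type."""
    def __init__(self, name):
        self.name = name

def normalize_type(value):
    """Accept any supported representation and return a canonical Type."""
    if isinstance(value, Type):        # Already canonical
        return value
    if isinstance(value, type):        # Python type functions (list, int)
        return Type(value.__name__)
    if isinstance(value, str):         # String representations ("int")
        return Type(value)
    if isinstance(value, ast.Name):    # AST annotation nodes
        return Type(value.id)
    raise ValueError("Unrecognized type representation: {!r}".format(value))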

I'm not sure this is the right time for this change, since we're hitting the semester soon. However, I don't know that we can rewrite the last major Toolkit submodule (functions) without making some kind of coherent decision here. But perhaps we can push it off until later if we're a little clever. Regardless, we need to make some decisions about this long term!

Things that TIFA should catch

Here's a few common issues that students have that TIFA should handle more intelligently:

  1. Redefining a parameter:

def x(a):
    a = 0
x(5)

  2. Initializing a function-local variable outside of the function:

sum = 0
def summate(a_list):
    for value in a_list:
        sum += value

  3. Unused parameters

HTML Classes/Text vs. Message

These might be separate issues, but for now they overlap.

Pedal's whole deal is generating a Response based on Conditions. In BlockPy, this Response can be a mix of Markdown and HTML, so that it looks nice - we call this the message. However, we are anticipating supporting Environments that might be text-only, so we added the text field to Feedback, with the idea that responsible citizens would generate both and the Environment would decide which was more appropriate. Unfortunately, this is a difficult, cross-cutting concern, because a lot of the logic for generating text (e.g., in Sandbox and Assertions) is still hard-coded.

In a similar vein, I believe that many Environments would like to use CSS to style their output. At the very least, BlockPy would like to make tracebacks more appealing and syntax highlight Python code. This suggests that we need to give Environments a mechanism for changing the HTML classes attached to certain structured blocks, or at the very least output default classes (e.g., "pedal-sandbox-traceback-line pedal-python-code") in the generated HTML.

What is the most elegant way of making all this text generation more sophisticated?

My current idea is a pedal.utilities.html module with a function like make_html that intelligently adds an HTML tag with the appropriate CSS classes - or perhaps several functions, for various use cases like wrapping Output, Code, emphasized text, Names, and so on. This function could also support a "fall-back" mode that produces text instead of HTML (removing the need for the text/text_template field entirely).
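
Roughly, such a sketch (the module path, names, and signature are all assumptions):

HTML_MODE = True  # An Environment could toggle this for text-only output

def make_html(contents, kind="code", tag="span"):
    """Wrap contents in a tag with a standard Pedal CSS class, or fall
    back to plain text when HTML output is disabled."""
    if not HTML_MODE:
        return contents
    return "<{tag} class='pedal-{kind}'>{contents}</{tag}>".format(
        tag=tag, kind=kind, contents=contents)

# Usage: make_html("x = undefined", kind="python-code", tag="code")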

Documentation in separate repo

I don't think we should be stuffing the documentation in the core repo, at least not the generated files. Probably the whole thing gets pushed out to its own separate repo. Need to look into best practices.

Pygame/Arcade Mocking

Find out whatever the best Python game library is nowadays, and mock it like we did MatPlotLib.

Stylistic checks

Terry Harvey said that some of the stylistic checks he does are:

  • Bad capitalization
  • Declaring a new variable for every intermediate result
  • Code that is never executed
  • Pointless else clauses
  • Magic numbers
  • Repeating code instead of using a loop
  • Giant functions
  • Creating "type" vars instead of using OO (in 181/275)

I'm betting these are largely checkable with Pedal. For some of them, you can definitely imagine that instructors would want a "slider" to adjust how picky they are. Intermediate temporary variables, for instance, are only a problem for certain sizes, really.

He also wants to look at names, but that's harder. Still, we could do something like Elena Glassman's name checking tool.

"@" Breaks Cait

Student had code like this:

my_email = person@udel.edu

Amusingly, this breaks CAIT because @ is now an operator (matrix multiplication, added in Python 3.5). Email addresses are now technically valid Python expressions...
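
You can see why it parses:

import ast

# `person@udel.edu` parses as matrix multiplication (the @ operator) of
# `person` with the attribute lookup `udel.edu`:
tree = ast.parse("person@udel.edu", mode="eval")
print(ast.dump(tree.body))
# BinOp(left=Name(id='person', ...), op=MatMult(),
#       right=Attribute(value=Name(id='udel', ...), attr='edu', ...))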

Tags

We have long talked about a sophisticated tagging system for Pedal's feedback functions. However, we have yet to talk about what it looks like.

MyPy integration

I'd like folks to be able to swap out TIFA for MyPy, in case they want students to do "real" type checking instead of TIFA-style analysis.

Record all feedbacks

The Report currently does not record failing feedback objects, but I think we want to start doing so. They will be used by things like the custom analyzers of #40.

Sectional Resolver

We have a halfway-working sectional resolver that supports a file broken up into sections via some delimiter string (e.g., #########). Let's get that working all the way.
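
For reference, the kind of input it should handle is a single submission split by the delimiter, e.g.:

# An illustrative student file broken into sections by a delimiter string
name = input("What is your name?")
#########
print("Hello,", name)
#########
print("Goodbye,", name)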

explain and explain_r

We should take the time to consolidate these functions into one, and add the 'code' key to the dictionary so that feedback is more easily analyzed. This should probably happen over the winter break, when BlockPy isn't in use, so we can fix all of the 1064 Curriculum.

function diving in Cait

pedal.cait

Tree matching isn't working properly when diving into function arguments. Created a test_function_diving test case (which should currently fail).

# Assuming StretchyTreeMatcher comes from pedal.cait (import path may vary)
from pedal.cait.stretchy_tree_matching import StretchyTreeMatcher

student_code1 = "print(report['station'])"
matcher1 = StretchyTreeMatcher("___[__expr__]")
matches01 = matcher1.find_matches(student_code1)

HTML vs. Text mode for reporting

Most of our tools currently generate HTML, but this is not necessarily ideal for Pedal being used in command line environments. It seems to me like we might want settings to control whether Tools should report in plain-text or HTML. I can't think of other output types, however.

In our definitions, we describe most tools as generating Feedback Components. These Components could be a string, but they can also be some kind of object with a "message" field, or a list of Components. I think we should elevate this a little into something more coherent with the different data models being passed around. One thought could be that a Component is an object with these possible fields:

{
    "message": "<an HTML representation of the message>",
    "text": "<an explicitly plain-text representation of the message>",
    ...
}

Where other fields could include a line number, column position, error type, etc.

Tifa incorrect inference: iterating over empty list

Although list1 is initialized to an empty list, it does get populated before the second loop. However, Tifa still reports this as iterating over an empty list.

import weather
import matplotlib.pyplot as plt

weather_reports = weather.get_weather()
list1 = []
list2 = []
for weather in weather_reports:
    if weather["Station"]["City"] == "Chicago":
        list1.append(weather)
for weather in list1 :
    list2.append(weather["Data"]["Precipitation"])
plt.plot (list2)
plt.title ("Precipitation in Chicago")
plt.xlabel ("labelX")
plt.ylabel ("labelY")
plt.show ()

Deprecate cait.ast_helpers.py

Do we still need ast_helpers.py? If it's necessary for development but isn't meant to be used by instructors, perhaps it can be moved to some other location, like the tools' main directory?

Internationalization

It's 2020 and we don't support other languages besides English? Well, that's no good.

Going from memory, the strategy is usually just to use gettext (see the sketch after this list). So, first, we must:

[ ] Have gettext support in Skulpt. It would be easy to port a pure Python version over, but I worry about speed and efficiency - I'd like to get a JavaScript version. This might be worth putting a bounty on and making it a full PR back to upstream Skulpt.

From there, we'll have to:
[ ] Go through all the text strings in the project and wrap them with _(...).
[ ] Look up key places with pluralization and use ngettext
[ ] Handle the stuff found in pedal.utilities.text; for instance, how do we deal with articles?
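
For reference, the standard-library pattern looks roughly like this (the domain name and locale directory are assumptions):

import gettext

# Load a translation catalog; fall back to the original strings if none exists
translation = gettext.translation('pedal', localedir='locales',
                                  languages=['es'], fallback=True)
_ = translation.gettext
ngettext = translation.ngettext

print(_("No errors were detected."))
print(ngettext("{} error was detected.",
               "{} errors were detected.", 2).format(2))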

Cait2 branch changes

The following is a summary of all the changes that I've made in the new branch cait2, so far. Need to walk through them and make sure that they are all valid changes.

  1. Top-level API state has been restructured. There's no longer a Cait object, since that didn't appear to be doing anything. Instead, state is stored through whatever Report gets passed into the main Cait functions (parse_program, find_match, find_matches, etc.), defaulting to the MAIN_REPORT.
  2. I changed the way TIFA's data is loaded. If TIFA hasn't been run, attempting to use its functions will just cause an error. The argument is that the user should know to run TIFA themselves before CAIT, not expect CAIT to run TIFA.
  3. Renamed EasyNode to CaitNode.
  4. Removed CtMap in favor of built-in Dictionaries. This required changing a couple tests - it seemed they relied on the particular ordering of CtMap's keys and values, which seemed like it might be fragile to me. An AstMap now has a match_lineno property to support this better.
  5. Added CaitNode.is_method(), get_data_state(), and get_data_type()
  6. StretchyTreeMatcher now returns empty lists instead of False values - I don't think this has any negative effects, and allows you to remove if checks before iterating through the matches.
  7. AstSymbol now has a __getattr__ that forwards attribute lookups to its .ast_node attribute. This allows you to use the results of AstMap lookups more easily (see the sketch after this list).
  8. Changed AstSymbol.__str__ to be __repr__, and made __str__ just return the id attribute. This allows you to .format the results more easily.
  9. Added CaitNode.find_matches which farms out a call to a new StretchyTreeMatcher with find_matches. The idea is to simplify subexpression matching:
# Old
submatches = find_expr_sub_matches("_target_.append(___)", __expr__)
# New
submatches = __expr__.find_matches("_target_.append(___)")
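
A minimal sketch of the delegation described in items 7 and 8 (the real AstSymbol has more going on; this is illustrative only):

class AstSymbol:
    def __init__(self, id, ast_node):
        self.id = id
        self.ast_node = ast_node

    def __getattr__(self, attr):
        # Forward unknown attribute lookups to the underlying AST node
        return getattr(self.ast_node, attr)

    def __str__(self):
        # Just the identifier, so results .format() nicely
        return self.id

    def __repr__(self):
        return "AstSymbol({!r})".format(self.id)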

Changes I'm still considering:

  1. AstMap.__getitem__ could automatically return the first item from the symbol_table's list, rather than returning a list of all potential items. I observe that almost every single Mistakes function and unittest use the first element. If someone needed a later element, they could just use AstMap.symbol_table to access it directly. This would require touching almost every test, but it seems like it'd be a nice change.
  2. Fixing all the camelCase attributes. They're inconsistent with the rest of the API and bad Python style, but renaming them would be a fairly big refactor. In the meantime, I tried to add under_score_case attributes in a few places that allow access either way. Documentation should favor the under_score_case versions.

Kill the Toolbox

I was once advised to never have a "utilities" module. Well, we probably will have such a thing for Pedal's core, but we shouldn't have one that is teacher-facing. The existing Toolbox module is effectively a hodge-podge of checks, and needs to be broken up and redistributed. I think most of these will become either Assertions or Extensions.

Extensions

Currently, my vague vision for things like MatPlotLib, Turtle, and Pygame is to support them as "Extensions". Handling these external modules requires cross-cutting concerns - stuff in both Sandbox and Assertions, at the very least. It seems to me like they shouldn't be stuck inside of those submodules.

PyPI pushes

We need to start a formal process for pushing to PyPI and managing releases. Currently, we have no process, and it's chaos.

CI/CD workflow on push

We should set up a GitHub Action to automatically run our unit tests, style checker, coverage, and such whenever we push code to the repo. It's way more professional. It will also encourage us to use branching.

Built-in Sandbox Exceptions should print more nicely

Currently, they expose Pedal's module system in an ugly way. By default, we should catch these and give better messages. But we need to provide hooks for instructors who want to handle those errors differently (different message, different control flow).

Function Definition matching

Function definitions are kind of complex... so argument matching and recording function arguments as symbols are not yet implemented.

CLI Analyzers for research/teaching

[ ] Ability to verify that a bunch of (instructor created) student submissions have the expected feedback.
[ ] Ability to experimentally run a Progsnap file of student submissions against a new bit of instructor code, generating a CSV of all the recorded feedback events.
[ ] Ability to (re-)grade a bunch of student submissions from the command line.
