pedal-edu / pedal

A collection of tools to analyze students' Python source code

Home Page: https://pedal-edu.github.io/pedal/

License: MIT License

Languages: Python 98.33%, Jupyter Notebook 1.51%, Makefile 0.09%, Batchfile 0.06%
Topics: feedback, education, learning-python, python, grading, autograder


pedal's Issues

Custom Environments

[ ] Set up the infrastructure for custom environments (a rough sketch follows below)
[ ] Make the BlockPy environment
[ ] Make the Command Line environment
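
A minimal sketch of what that infrastructure could look like; the class names and the resolve hook are assumptions, not a committed API:

class Environment:
    """Bundles a platform's resolver and output conventions."""
    def __init__(self, report):
        self.report = report

    def resolve(self):
        raise NotImplementedError

class BlockPyEnvironment(Environment):
    def resolve(self):
        # Assumed hook: ask the report for its final, resolved feedback
        final = self.report.resolve()
        return {"message": final.message, "success": final.success}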

Initial Major Restructure

Here's what I propose for the directory layout (based on the advice at http://docs.python-guide.org/en/latest/writing/structure/):

Top-Level:

  • tests/
  • docs/
  • pedal/
  • README.rst
  • LICENSE
  • Makefile
  • requirements.txt
  • setup.py

All tests will go in the tests/ directory. We're already using unittest, so we can keep using that. I'll set up the Makefile to run the relevant tests conveniently. I'll also be setting up a code coverage reporter.

We should probably use Sphinx to generate our documentation. I'm going to vote for Google style docstrings - I've been using them in Tifa and I think they're a bit more concise than the regular Sphinx ones. https://pythonhosted.org/an_example_pypi_project/sphinx.html#function-definitions

The Pedal module will have the following submodules:

  • cait/: AST matching and code configuration
  • tifa/: Type inferencing, flow analysis
  • environments/: Compatibility interfaces for BlockPy, Jupyter Notebooks, Web-CAT, etc. I'm not entirely clear what exactly goes in here, but one example is the Magic stuff for Jupyter Notebooks (but maybe that'll end up being distributed across other modules).
  • feedback/: The logic from feedback.js that takes all the reports and dispatches them.
  • mistakes/: Your collected mistake identifications (formerly the instructor_* pattern groups)
  • report/: The functions for triggering feedback strings (e.g., explain, gently, set_success)
  • sandbox/: A module for safely executing student code and checking the result.

These should be organized with absolute imports. The __init__.py file in each module should expose the client-facing functions/variables/classes.
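
As a sketch of that exposure pattern (the submodule and function names are illustrative):

# pedal/report/__init__.py
# Re-export the client-facing functions with absolute imports, so that
# instructors can simply write `from pedal.report import explain`.
from pedal.report.imperative import explain, gently, set_success

__all__ = ['explain', 'gently', 'set_success']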

Let me know if you have a preferred name for mistakes/ - perhaps misconceptions/ is better for that.

Here's a rough outline of dependencies between these libraries:

  • Report can be used by TIFA and Sandbox
  • TIFA and Sandbox can be used by CAIT
  • CAIT can be used by Mistakes
  • Mistakes can be used by Feedback and Environments

Data State for BoolType

set_source("var = 14\n"
           "var = '14'\n"
           "var = False\n")
match = find_match("_var_")
data_state = match['_var_']
data_state = match['_var_'].get_data_state()
print(data_state.type) #UnknownType
print(data_state.trace[0].type)#StrType
print(data_state.trace[0].trace[0].type)#NumType

I expected the first print statement to print BoolType instead of UnknownType

Better Assertions

[ ] Move utility functions into a Static assertions module
[ ] Move existing assertions into a Runtime assertions module
[ ] Clean up existing code to use new feedback style
[ ] Finish implementing all the classic assert* functions (a rough sketch follows below)
[ ] Unit tests
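
For a sense of the target, here's a minimal sketch of one classic assert* function in the new feedback style; the exact signature and message are assumptions:

# Sketch only: `gently` is the Report-triggering function named in the
# restructure plan; its import path and this signature are assumptions.
from pedal.report import gently

def assert_equal(left, right):
    """Runtime assertion: attach feedback when two values differ."""
    if left == right:
        return True
    gently("I ran your code and expected {!r}, but it produced {!r}."
           .format(right, left))
    return False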

Better traceback output for runtime errors

I'm thinking we should make the following changes:

  1. Flip the order so that the call stack starts at the top instead of the bottom.
  2. Use Markdown formatting for each line:
     • Bullet points for each row of the traceback
     • Syntax highlighting for the Python code lines
     • Set the actual error message apart more clearly
  3. Make sure that the explanatory text appears.
  4. Put the "When I tried running X" line at the top.

Indexing mistake in Tifa

This indexing shouldn't be valid: 'title' is a string literal, and indexing a string with another string raises a TypeError at runtime, so TIFA should flag it.

movie_list = [{'Title':'Iron Man', 'Release Date':'May 2, 2008','Budget':'140','Running Time':'126'},
              {'Title':'Iron Man 2', 'Release Date':'May 7, 2010','Budget':'200', 'Running Time':'125'},
              {'Title':'Iron Man 3', 'Release Date':'May 3, 2013','Budget':'200', 'Running Time':'130'}]
total=0
for title in movie_list:
    total=total + 'title'['Running Time']
print(total)

Category/Tool/Label/Priority Naming Rules

I think we should probably establish naming conventions for these symbols. Here's my current ruleset:

Tool: The internal name for a Tool. This won't be shown to a student, but could be useful for internal tracking purposes.
Category: A human-readable general description of feedback, such as "syntax" or "runtime".
Label: A human-readable specific description of the feedback, such as the specific error type or analyzer issue. If it begins with an asterisk, it is meant to be used more internally than student-facing.
Priority: A priority level (low, medium, high) or a category to put this feedback before.

All of the above:

  • Are case-insensitive
  • Favor spaces over camelCase and underscores
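
For instance, a couple of hypothetical feedback entries following these rules:

# Hypothetical examples of the naming rules applied (all values illustrative)
feedback_one = {"tool": "tifa", "category": "analyzer",
                "label": "incompatible types", "priority": "high"}
# The leading asterisk marks a label meant for internal use, not students
feedback_two = {"tool": "sandbox", "category": "runtime",
                "label": "*sandbox failure", "priority": "syntax"}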

Consider bundling a markdown parser/generator

Since the new message format allows Markdown (and, by extension, HTML), we should consider having a Markdown->HTML tool as an optional dependency. Potentially, we should even bundle it for those who want pure Python. We need to investigate whether that's a necessity, though. Perhaps all of this should just live in a simple "HTML Only Environment"?
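
For example, a minimal sketch treating the third-party markdown package as the optional dependency, with a plain-text fallback:

# Treat the `markdown` package as an optional dependency
try:
    import markdown

    def to_html(message):
        return markdown.markdown(message)
except ImportError:
    def to_html(message):
        # No parser available: pass the raw message through unchanged
        return message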

TIFA could map AST Nodes to Values

TIFA walks the AST and abstractly evaluates the program. At each step of the tree, it has a best guess (or guesses, technically) about what the current type of that expression is. We could easily produce a dictionary that maps instances of AST nodes from the tree to the type that was determined.

The biggest potential application of this is connecting that data back to CAIT. If you found an expression match in the source code, TIFA could tell you what type of value it is - not just when it is a Name, but when it's anything!

Student code:

my_list = ["Ice", "Fire", "Thunder"]
my_value = my_list[1]

Instructor code:

match = find_match("my_value = __expr__")
tifa_check_type(match['__expr__'])
# Would be a Tifa `StrType`

I'm not sure what the exact interface should look like; I kind of prefer the idea that TIFA would expose a function, rather than this being a method inside CAIT. However, I'm open to alternative ideas.
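
For concreteness, a minimal sketch of the TIFA-side approach; node_types and check_type are assumptions, not existing API:

class Tifa:
    def __init__(self):
        # Filled in during the AST walk: id(ast_node) -> inferred Type
        self.node_types = {}

    def check_type(self, cait_node):
        """Look up the inferred type for a CAIT match's underlying AST node."""
        return self.node_types.get(id(cait_node.ast_node))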

Incompatible types could talk more about the values

If you have code like the following:

(1 + 2) + ""

Tifa reports that you can't add a number and a string. The feedback could be more precise, however, and explain that (1 + 2) produced a number. At the very least, it could point to the expression as one of the two components (heck, they could even be highlighted).

Update examples

Most of the examples in the examples/ folder are from earlier versions of Pedal, and don't reflect current practice. We need to update them.

Unify Pedal's Type System

I find that Pedal's type system is growing increasingly complex. The logic started in TIFA, was partially included in CAIT, became relevant to many functions in the Toolkit, and is now also a very big part of Assertions. I've backed into supporting both static types and runtime types, from a wide variety of sources:

  • Inferred types from static analysis (half of TIFA's business)
  • AST annotation nodes (the new optional static typing feature)
  • Python type functions (e.g., list, int)
  • A JSON-encoded representation from CORGIS
  • String representations of the types (which, at some point in Python's future, will be the only representation of AST annotations!)
  • The Python typing module (well, not really, but we could)

It seems like it might be time to promote Pedal's type system to something more sophisticated and customizable, extracting it out of TIFA into its own Tool. I'm imagining a system where the savvy teacher could set flags and make modifications to the Type system in order to achieve certain experiences - for example, you might disable support for my wacky [int] list annotations or allow None to be used for certain types (default Optional behavior). The main value would be a consistent API for other Tools to refer to types, and make it clear what type-consuming functions are allowed to consume (e.g., can you pass in a string-represented-type to assert_is_instance?).
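
A rough sketch of the kind of single entry point this could provide; every name here is an assumption about a future API, not existing Pedal code:

import ast

class Type:
    """Stand-in for a canonical Pedal type."""
    def __init__(self, name):
        self.name = name

def normalize_type(value):
    """Accept any supported representation and return a canonical Type."""
    if isinstance(value, Type):        # Already canonical
        return value
    if isinstance(value, type):        # Python type functions (list, int)
        return Type(value.__name__)
    if isinstance(value, str):         # String representations ("int")
        return Type(value)
    if isinstance(value, ast.Name):    # AST annotation nodes
        return Type(value.id)
    raise ValueError("Unrecognized type representation: {!r}".format(value))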

I'm not sure this is the right time for this change, since we're hitting the semester soon. However, I don't know that we can rewrite the last major Toolkit submodule (functions) without making some kind of coherent decision here. But perhaps we can push it off until later if we're a little clever. Regardless, we need to make some decisions about this long term!

Things that TIFA should catch

Here's a few common issues that students have that TIFA should handle more intelligently:

  1. Redefining a parameter:

def x(a):
    a = 0
x(5)

  2. Initializing a function-local variable outside of the function:

sum = 0
def summate(a_list):
    for value in a_list:
        sum += value

  3. Unused parameters

HTML Classes/Text vs. Message

These might be separate issues, but for now they overlap.

Pedal's whole deal is generating a Response based on Conditions. In BlockPy, this Response can be a mix of Markdown and HTML, so that it looks nice - we call this the message. However, we are anticipating supporting Environments that might be text-only, so we added the text field to Feedback, with the idea that responsible citizens would generate both and the Environment would decide which was more appropriate. Unfortunately, this is a difficult, cross-cutting concern, because a lot of the logic for generating text (e.g., in Sandbox and Assertions) is still hard-coded.

In a similar vein, I believe that many Environments would like to use CSS to style their output. At the very least, BlockPy would like to make tracebacks more appealing and syntax highlight Python code. This suggests that we need to give Environments a mechanism for changing the HTML classes attached to certain structured blocks, or at the very least output default classes (e.g., "pedal-sandbox-traceback-line pedal-python-code") in the generated HTML.

What is the most elegant way of making all this text generation more sophisticated?

My current idea is a pedal.utilities.html module with a function like make_html that intelligently adds an HTML tag with the appropriate CSS classes - or perhaps several functions, for various use cases like wrapping Output, Code, emphasized text, Names, and so on. This function could also support a "fall-back" mode that produces text instead of HTML (removing the need for the text/text_template field entirely).
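
Roughly, such a sketch (the module path, names, and signature are all assumptions):

HTML_MODE = True  # An Environment could toggle this for text-only output

def make_html(contents, kind="code", tag="span"):
    """Wrap contents in a tag with a standard Pedal CSS class, or fall
    back to plain text when HTML output is disabled."""
    if not HTML_MODE:
        return contents
    return "<{tag} class='pedal-{kind}'>{contents}</{tag}>".format(
        tag=tag, kind=kind, contents=contents)

# Usage: make_html("x = undefined", kind="python-code", tag="code")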

Documentation in separate repo

I don't think we should be stuffing the documentation in the core repo, at least not the generated files. Probably the whole thing gets pushed out to its own separate repo. Need to look into best practices.

Pygame/Arcade Mocking

Find out whatever the best Python game library is nowadays, and mock it like we did MatPlotLib.

Stylistic checks

Terry Harvey said that some of the stylistic checks he does are:

  • Bad capitalization
  • Declaring a new variable for every intermediate result
  • Code that is never executed
  • Pointless else clauses
  • Magic numbers
  • Repeating code instead of using a loop
  • Giant functions
  • Creating "type" vars instead of using OO (in 181/275)

I'm betting these are largely checkable with Pedal. For some of them, you can definitely imagine that instructors would want a "slider" to adjust how picky they are. Intermediate temporary variables, for instance, are only a problem for certain sizes, really.

He also wants to look at names, but that's harder. Still, we could do something like Elena Glassman's name checking tool.

"@" Breaks Cait

Student had code like this:

my_email = person@udel.edu

Amusingly, this breaks CAIT because @ is now an operator (matrix multiplication, added in Python 3.5). Email addresses are now technically valid Python expressions...
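
You can see why it parses:

import ast

# `person@udel.edu` parses as matrix multiplication (the @ operator) of
# `person` with the attribute lookup `udel.edu`:
tree = ast.parse("person@udel.edu", mode="eval")
print(ast.dump(tree.body))
# BinOp(left=Name(id='person', ...), op=MatMult(),
#       right=Attribute(value=Name(id='udel', ...), attr='edu', ...))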

Tags

We have long talked about a sophisticated tagging system for Pedal's feedback functions. However, we have yet to talk about what it looks like.

MyPy integration

I'd like folks to be able to swap out TIFA for MyPy, in case they want students to do "real" type checking instead of TIFA-style analysis.

Record all feedbacks

The Report currently does not record failing feedback objects, but I think we want to start doing so. They will be used by things like the custom analyzers of #40.

Sectional Resolver

We have a halfway-working sectional resolver that supports a file broken up into sections via some delimiter string (e.g., #########). Let's get that working all the way.
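
For reference, the kind of input it should handle is a single submission split by the delimiter, e.g.:

# An illustrative student file broken into sections by a delimiter string
name = input("What is your name?")
#########
print("Hello,", name)
#########
print("Goodbye,", name)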

explain and explain_r

We should take the time to consolidate these functions into one, and add the 'code' key to the dictionary so that feedback is more easily analyzed. This should probably happen over the winter break, when BlockPy isn't in use, so we can fix all of the 1064 Curriculum.

function diving in Cait

pedal.cait

Tree matching isn't working properly when diving into function arguments. Created a test_function_diving test case (which should currently fail).

# Assuming StretchyTreeMatcher comes from pedal.cait (import path may vary)
from pedal.cait.stretchy_tree_matching import StretchyTreeMatcher

student_code1 = "print(report['station'])"
matcher1 = StretchyTreeMatcher("___[__expr__]")
matches01 = matcher1.find_matches(student_code1)

HTML vs. Text mode for reporting

Most of our tools currently generate HTML, but this is not necessarily ideal for Pedal being used in command line environments. It seems to me like we might want settings to control whether Tools should report in plain-text or HTML. I can't think of other output types, however.

In our definitions, we describe most tools as generating Feedback Components. These Components could be a string, but they can also be some kind of object with a "message" field, or a list of Components. I think we should elevate this a little into something more coherent with the different data models being passed around. One thought could be that a Component is an object with these possible fields:

{
    "message": "<an HTML representation of the message>",
    "text": "<an explicitly plain-text representation of the message>",
    ...
}

Where other fields could include a line number, column position, error type, etc.

Tifa incorrect inference: iterating over empty list

Although list1 is initialized to an empty list, it does get populated before the second loop. However, Tifa still reports this as iterating over an empty list.

import weather
import matplotlib.pyplot as plt

weather_reports = weather.get_weather()
list1 = []
list2 = []
for weather in weather_reports:
    if weather["Station"]["City"] == "Chicago":
        list1.append(weather)
for weather in list1 :
    list2.append(weather["Data"]["Precipitation"])
plt.plot (list2)
plt.title ("Precipitation in Chicago")
plt.xlabel ("labelX")
plt.ylabel ("labelY")
plt.show ()

Deprecate cait.ast_helpers.py

Do we still need ast_helpers.py? If it's necessary for development but isn't meant to be used by instructors, perhaps it can be moved to some other location, like the tools' main directory?

Internationalization

It's 2020 and we don't support other languages besides English? Well, that's no good.

Going from memory, the strategy is usually just to use gettext (see the sketch after this list). So, first, we must:

[ ] Have gettext support in Skulpt. It would be easy to port a pure Python version over, but I worry about speed and efficiency - I'd like to get a JavaScript version. This might be worth putting a bounty on and making it a full PR back to upstream Skulpt.

From there, we'll have to:
[ ] Go through all the text strings in the project and wrap them with _(...).
[ ] Look up key places with pluralization and use ngettext
[ ] Handle the stuff found in pedal.utilities.text; for instance, how do we deal with articles?
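
For reference, the standard-library pattern looks roughly like this (the domain name and locale directory are assumptions):

import gettext

# Load a translation catalog; fall back to the original strings if none exists
translation = gettext.translation('pedal', localedir='locales',
                                  languages=['es'], fallback=True)
_ = translation.gettext
ngettext = translation.ngettext

print(_("No errors were detected."))
print(ngettext("{} error was detected.",
               "{} errors were detected.", 2).format(2))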

Cait2 branch changes

The following is a summary of all the changes that I've made in the new branch cait2, so far. Need to walk through them and make sure that they are all valid changes.

  1. Top-level API state has been restructured. There's no longer a Cait object, since that didn't appear to be doing anything. Instead, state is stored through whatever Report gets passed into the main Cait functions (parse_program, find_match, find_matches, etc.), defaulting to the MAIN_REPORT.
  2. I changed the way TIFA's data is loaded. If TIFA hasn't been run, attempting to use its functions will just cause an error. The argument is that the user should know to run TIFA themselves before CAIT, not expect CAIT to run TIFA.
  3. Renamed EasyNode to CaitNode.
  4. Removed CtMap in favor of built-in Dictionaries. This required changing a couple tests - it seemed they relied on the particular ordering of CtMap's keys and values, which seemed like it might be fragile to me. An AstMap now has a match_lineno property to support this better.
  5. Added CaitNode.is_method(), get_data_state(), and get_data_type()
  6. StretchyTreeMatcher now returns empty lists instead of False values - I don't think this has any negative effects, and allows you to remove if checks before iterating through the matches.
  7. AstSymbol now has a __getattr__ that forwards attribute lookups to its .ast_node attribute. This allows you to use the results of AstMap lookups more easily (see the sketch after this list).
  8. Changed AstSymbol.__str__ to be __repr__, and made __str__ just return the id attribute. This allows you to .format the results more easily.
  9. Added CaitNode.find_matches which farms out a call to a new StretchyTreeMatcher with find_matches. The idea is to simplify subexpression matching:
# Old
submatches = find_expr_sub_matches("_target_.append(___)", __expr__)
# New
submatches = __expr__.find_matches("_target_.append(___)")
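
A minimal sketch of the delegation described in items 7 and 8 (the real AstSymbol has more going on; this is illustrative only):

class AstSymbol:
    def __init__(self, id, ast_node):
        self.id = id
        self.ast_node = ast_node

    def __getattr__(self, attr):
        # Forward unknown attribute lookups to the underlying AST node
        return getattr(self.ast_node, attr)

    def __str__(self):
        # Just the identifier, so results .format() nicely
        return self.id

    def __repr__(self):
        return "AstSymbol({!r})".format(self.id)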

Changes I'm still considering:

  1. AstMap.__getitem__ could automatically return the first item from the symbol_table's list, rather than returning a list of all potential items. I observe that almost every single Mistakes function and unittest use the first element. If someone needed a later element, they could just use AstMap.symbol_table to access it directly. This would require touching almost every test, but it seems like it'd be a nice change.
  2. Fixing all the camelCase attributes. They're inconsistent with the rest of the API and bad Python style, but renaming them would be a fairly big refactor. In the meantime, I tried to add under_score_case attributes in a few places that allow access either way. Documentation should favor the under_score_case versions.

Kill the Toolbox

I was once advised to never have a "utilities" module. Well, we probably will have such a thing for Pedal's core, but we shouldn't have one that is teacher-facing. The existing Toolbox module is effectively a hodge-podge of checks, and needs to be broken up and redistributed. I think most of these will become either Assertions or Extensions.

Extensions

Currently, my vague vision for things like MatPlotLib, Turtle, and Pygame is to support them as "Extensions". Handling these external modules requires cross-cutting concerns - stuff in both Sandbox and Assertions, at the very least. It seems to me like they shouldn't be stuck inside of those submodules.

PyPI pushes

We need to start a formal process for pushing to PyPI and managing releases. Currently, we have no process, and it's chaos.

CI/CD workflow on push

We should set up a GitHub Action to automatically run our unit tests, style checker, coverage, and such whenever we push code to the repo. It's way more professional. It will also encourage us to use branching.

Built-in Sandbox Exceptions should print more nicely

Currently, they expose Pedal's module system in an ugly way. By default, we should catch these and give better messages. But we need to provide hooks for instructors who want to handle those errors differently (different message, different control flow).

Function Definition matching

Function definitions are kind of complex... so argument matching and recording function arguments as symbols are not yet implemented.

CLI Analyzers for research/teaching

[ ] Ability to verify that a bunch of (instructor created) student submissions have the expected feedback.
[ ] Ability to experimentally run a Progsnap file of student submissions against a new bit of instructor code, generating a CSV of all the recorded feedback events.
[ ] Ability to (re-)grade a bunch of student submissions from the command line.
