pedal-edu / pedal
A collection of tools to analyze students' Python source code
Home Page: https://pedal-edu.github.io/pedal/
License: MIT License
KeyError et al should be available as built-in variables.
[ ] Set up the infrastructure for custom environments
[ ] Make the BlockPy environment
[ ] Make the Command Line environment
Here's what I propose for the directory layout (based on advice by http://docs.python-guide.org/en/latest/writing/structure/):
Top-Level:
All tests will go in the tests/ directory. We're already using unittest, so we can keep using that. I'll set up the Makefile to run the relevant tests conveniently. I'll also be setting up a code coverage reporter.
We should probably use Sphinx to generate our documentation. I'm going to vote for Google style docstrings - I've been using them in Tifa and I think they're a bit more concise than the regular Sphinx ones. https://pythonhosted.org/an_example_pypi_project/sphinx.html#function-definitions
The Pedal module will have the following submodules:
cait/: AST matching and code configuration
tifa/: Type inferencing, flow analysis
environments/: Compatibility interfaces for BlockPy, Jupyter Notebooks, Web-CAT, etc. I'm not entirely clear what exactly goes in here, but one example is the Magic stuff for Jupyter Notebooks (but maybe that'll end up being distributed across other modules).
feedback/: The logic from feedback.js that takes all the reports and dispatches them.
mistakes/: Your collected mistake identifications (formerly the instructor_* pattern groups)
report/: The functions for triggering feedback strings (e.g., explain, gently, set_success)
sandbox/: A module for safely executing student code and checking the result.

These should be organized with absolute imports. The __init__.py file in each module should expose the client-facing functions/variables/classes.

Let me know if you have a preferred name for mistakes/ - perhaps misconceptions/ is better for that.
Here's a rough outline of dependencies between these libraries:
Report
  can be used by TIFA, Sandbox
    can be used by CAIT
      can be used by Mistakes
        can be used by Feedback, Environments
Syntax error in assertions.py when dealing with Python 3.4 because of loop unrolling.
set_source("var = 14\n"
           "var = '14'\n"
           "var = False\n")
match = find_match("_var_")
data_state = match['_var_']
data_state = match['_var_'].get_data_state()
print(data_state.type)  # UnknownType
print(data_state.trace[0].type)  # StrType
print(data_state.trace[0].trace[0].type)  # NumType
I expected the first print statement to print BoolType instead of UnknownType
It gets very confused
[ ] Move utility functions into a Static assertions module
[ ] Move existing assertions into a Runtime assertions module
[ ] Clean up existing code to use new feedback style
[ ] Finish implementing all the classic assert* functions
[ ] Unit test
syntax_error and SyntaxError tags are currently both used. At some point these should be unified.
I'm thinking we should make the following changes:
It seems to think of it as an undefined variable.
If Cait is looking at a given expression node, TIFA should be able to report the anticipated type of that node. It could give information about it, even if it wasn't a node.
This indexing shouldn't be valid.
movie_list = [{'Title':'Iron Man', 'Release Date':'May 2, 2008','Budget':'140','Running Time':'126'},
{'Title':'Iron Man 2', 'Release Date':'May 7, 2010','Budget':'200', 'Running Time':'125'},
{'Title':'Iron Man 3', 'Release Date':'May 3, 2013','Budget':'200', 'Running Time':'130'}]
total=0
for title in movie_list:
    total = total + 'title'['Running Time']
print(total)
I think we should probably establish naming conventions for these symbols. Here's my current ruleset:
Tool: The internal name for a Tool. This won't be shown to a student, but could be useful for internal tracking purposes.
Category: A human-readable general description of feedback, such as "syntax" or "runtime".
Label: A human-readable specific description of the feedback, such as the specific error type or analyzer issue. If it begins with an asterisk, it is meant to be used more internally than student-facing.
Priority: A priority level (low, medium, high) or a category to put this feedback before.
All of the above are:
Since the new message format allows Markdown (and HTML by extension), we should consider having a Markdown->HTML tool as an optional dependency. Potentially, we should even bundle it for those who want pure Python. We need to investigate whether that's a necessity, though. Perhaps all of this should just be in a simple "HTML Only Environment"?
TIFA walks the AST and abstractly evaluates the program. At each step of the tree, it has a best guess (or guesses, technically) about what the current type of that expression is. We could easily produce a dictionary that maps instances of AST nodes from the tree to the type that was determined.
The biggest potential application of this is connecting that data back to CAIT. If you found an expression match in the source code, TIFA could tell you what type of value it is - not just when it is a Name, but when it's anything!
Student code:
my_list = ["Ice", "Fire", "Thunder"]
my_value = my_list[1]
Instructor code:
match = find_match("my_value = __expr__")
tifa_check_type(match['__expr__'])
# Would be a Tifa `StrType`
Not sure what the exact interface should look like; I kind of prefer the idea that Tifa would have a function rather than this being a method inside CAIT. However, I'm open to alternative ideas.
If you have code like the following:
(1 + 2) + ""
Tifa reports that you can't add a number and a string. The feedback could be more precise, however, and explain that (1 + 2) produced a number. At the very least, it could point to the expression as being one of the two components (heck, they could even be highlighted).
When a student submits code that doesn't compile, make sure the feedback messages all say the same kind of thing, a bit more clearly.
Checking output doesn't work correctly because output is duplicated when printing, leading to more outputs than the notebook created
hypothetical solution: call a clear method on the jupyter side
Most of the examples in the examples/ folder are from earlier versions of Pedal, and don't reflect current practice. We need to update them.
I find that Pedal's type system is growing increasingly complex. The logic started in TIFA, was partially included in CAIT, became relevant to many functions in the Toolkit, and is now also a very big part of Assertions. I've backed into supporting both static types and runtime types, from a wide variety of sources:
- Built-in type constructors (e.g., list, int)
- The typing module (well, not really, but we could)

It seems like it might be time to promote Pedal's type system to something more sophisticated and customizable, extracting it out of TIFA into its own Tool. I'm imagining a system where the savvy teacher could set flags and make modifications to the Type system in order to achieve certain experiences - for example, you might disable support for my wacky [int] list annotations, or allow None to be used for certain types (default Optional behavior). The main value would be a consistent API for other Tools to refer to types, making it clear what type-consuming functions are allowed to consume (e.g., can you pass in a string-represented type to assert_is_instance?).
I'm not sure this is the right time for this change, since we're hitting the semester soon. However, I don't know that we can rewrite the last major Toolkit submodule (functions) without making some kind of coherent decision here. But perhaps we can push it off until later if we're a little clever. Regardless, we need to make some decisions about this long term!
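The "consistent API" idea above could start as small as a normalization helper. Here is a hypothetical sketch (none of these names exist in Pedal today) that accepts a constructor, a string name, or a live runtime value and returns one canonical spelling:

```python
def normalize_type(t):
    """Hypothetical: map a type constructor, a string name, or a runtime
    value to one canonical type name, so that type-consuming functions
    (like an assert_is_instance) could accept any of the three forms."""
    if isinstance(t, str):       # string-represented type, e.g. "int"
        return t
    if isinstance(t, type):      # constructor, e.g. int, list
        return t.__name__
    return type(t).__name__      # runtime value, e.g. [1, 2]
```

A real version would also need to handle parameterized types like [int], which is exactly where the flags-and-modifications design would come in.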
Here's a few common issues that students have that TIFA should handle more intelligently:
def x(a):
    a = 0
x(5)

sum = 0
def summate(a_list):
    for value in a_list:
        sum += value
These might be separate issues, but for now they overlap.
Pedal's whole deal is generating a Response based on Conditions. In BlockPy, this Response can be a mix of Markdown and HTML, so that it looks nice - we call this the message. However, we are anticipating supporting Environments that might be text only, so we added the text field to Feedback, with the idea that responsible citizens would generate both and the Environment would decide which was more appropriate. Unfortunately, this is a difficult, cross-cutting concern because a lot of the logic for generating text (e.g., in Sandbox and Assertions) is still stuck in code.
In a similar vein, I believe that many Environments would like to use CSS to style their output. At the very least, BlockPy would like to make Tracebacks more appealing and syntax highlight Python code. This suggests that we need to give a mechanism to Environments to change the HTML classes that are attached to certain structured blocks, or at the very least be outputting default classes (e.g., "pedal-sandbox-traceback-line pedal-python-code") in the generated HTML.
What is the most elegant way of making all this text generation more sophisticated?
My current idea is a pedal.utility.html module with a function like make_html that intelligently adds an HTML tag with the appropriate CSS classes (or perhaps several functions, for various use cases like wrapping Output, Code, emphasized text, Name, ...). And then this function could also support a "fall-back" mode that produces text instead of HTML (removing the need for the text/text_template field entirely).
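As a sketch of that idea (make_html and the pedal- class prefix are assumptions here, not a shipped API):

```python
def make_html(content, tag="span", classes=(), html_mode=True):
    """Hypothetical: wrap content in an HTML tag with pedal-prefixed CSS
    classes that an Environment can style; in fall-back mode, return the
    plain text unchanged."""
    if not html_mode:
        return content
    class_attr = " ".join("pedal-" + c for c in classes)
    return '<{tag} class="{cls}">{body}</{tag}>'.format(
        tag=tag, cls=class_attr, body=content)
```

A BlockPy-style Environment might call make_html("x = 1", tag="code", classes=("python-code",)) to get syntax-highlightable markup, while a command-line Environment would pass html_mode=False and get the raw text.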
I don't think we should be stuffing the documentation in the core repo, at least not the generated files. Probably the whole thing gets pushed out to its own separate repo. Need to look into best practices.
Find out whatever the best python game library is nowadays, and mock it like MatPlotLib.
Terry Harvey said that some of the stylistic checks he does are:
I'm betting these are largely checkable with Pedal. For some of them, you can definitely imagine that instructors would want a "slider" to adjust how picky they are. Intermediate temporary variables, for instance, are only a problem for certain sizes, really.
He also wants to look at names, but that's harder. Still, we could do something like Elena Glassman's name checking tool.
I assume most instructors have already written unit tests in Python using the built-in unittest library. If we subclass the unittest module's classes correctly, we could get folks to integrate Pedal without doing more than changing a single line.
If a function returns None directly, then the type checker has no issue. However, if it returns a variable with a None value, it breaks the contract. That shouldn't happen.
Student had code like this:
my_email = person@udel.edu
Amusingly, this breaks CAIT because @ is now an operator. Email addresses are now technically valid Python code...
We have long talked about a sophisticated tagging system for Pedal's feedback functions. However, we have yet to talk about what it looks like.
I'd like folks to be able to swap out Tifa for MyPy, in case they want students to do "real" type checking instead of TIFA style.
The Report currently does not record failing feedbacks, but I think we want to start doing so. They will be used by things like the custom analyzers of #40
We have a halfway-working sectional resolver, that supports a file that has been broken up into sections via some text string (e.g., #########). Let's get that working all the way.
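The splitting step itself is small; here is a minimal sketch (the divider pattern is an assumption, and the real resolver has to do more than split, e.g. track line offsets per section):

```python
import re

def split_sections(source, divider=r"^#{5,}\s*$"):
    """Hypothetical: split a submission into sections on lines made of
    five or more '#' characters (the kind of divider described above)."""
    return re.split(divider, source, flags=re.MULTILINE)
```

For example, split_sections("x = 1\n#########\nprint(x)\n") yields two sections, one per side of the divider line.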
We should take the time to consolidate these functions into one and add the 'code' key into the dictionary so that feedback is more easily analyzed. This should probably happen over the winter break when BlockPy currently isn't in use so we can fix all the 1064 Curriculum.
pedal.cait
Tree matching isn't working properly when diving into function arguments. Created test_function_diving test case (that should currently fail).
student_code1 = "print(report['station'])"
matcher1 = StretchyTreeMatcher("___[__expr__]")
matches01 = matcher1.find_matches(student_code1)
Most of our tools currently generate HTML, but this is not necessarily ideal for Pedal being used in command line environments. It seems to me like we might want settings to control whether Tools should report in plain-text or HTML. I can't think of other output types, however.
In our definitions, we describe most tools generating Feedback Components. These Components could be a string, but they can also be some kind of object with a "message" field or a list of Components. I think that we should probably elevate this a little to be something more coherent with the different data models being passed around. One thought could be that a Component is an object with possible fields:
{
    "message": An HTML representation of the message,
    "text": An explicitly plain text representation of the message,
    ...
}
Where other fields could include a line number, column position, error type, etc.
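One way to make that model concrete is a small dataclass. This is a hypothetical sketch: the "message" and "text" fields come from the description above, while the class itself, the render method, and the line field are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Component:
    """Hypothetical Feedback Component: paired HTML and plain-text forms."""
    message: str                 # HTML representation of the message
    text: str = ""               # explicitly plain-text representation
    line: Optional[int] = None   # example of an "other field": a line number

    def render(self, html=True):
        """Let the Environment pick a representation, defaulting to HTML."""
        return self.message if html or not self.text else self.text
```

A text-only Environment would call component.render(html=False); if no text form was supplied, it would still get the HTML rather than nothing.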
While list1 was initialized to an empty list, the list does get populated. However, Tifa detects this as iterating over an empty list.
import weather
import matplotlib.pyplot as plt

weather_reports = weather.get_weather()
list1 = []
list2 = []
for weather in weather_reports:
    if weather["Station"]["City"] == "Chicago":
        list1.append(weather)
for weather in list1:
    list2.append(weather["Data"]["Precipitation"])
plt.plot(list2)
plt.title("Precipitation in Chicago")
plt.xlabel("labelX")
plt.ylabel("labelY")
plt.show()
Do we still need ast_helpers.py? If it's necessary for dev, but isn't meant to be used by instructors, perhaps it can be moved to some other location, like the tools main directory?
Functions that do not return currently don't register as issues when they have a return type annotation.
It's 2020 and we don't support other languages besides English? Well, that's no good.
Going from memory, the strategy is usually to just use gettext. So, first, we must:
[ ] Have gettext support in Skulpt. It would be easy to port a pure-Python version over, but I worry about speed and efficiency - I'd like to get a JavaScript version. This might be worth putting a bounty on and making it a full PR back to upstream Skulpt.
From there, we'll have to:
[ ] Go through all the text strings in the project and wrap them with _(...).
[ ] Look up key places with pluralization and use ngettext
[ ] Handle stuff found in pedal.utilities.text; for instance, how to deal with articles?
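The wrapping step is low-risk because of how gettext degrades: with no translation catalogs installed, gettext.gettext is an identity function, so wrapped strings behave exactly as before until a locale is added. A minimal sketch (the message strings are made up for illustration):

```python
import gettext

# Standard aliases; a real setup would install a catalog via
# gettext.translation(...) and bind these to it instead.
_ = gettext.gettext
ngettext = gettext.ngettext

print(_("You created a variable."))
print(ngettext("Found {n} issue.", "Found {n} issues.", 2).format(n=2))
```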
The following is a summary of all the changes that I've made in the new branch cait2, so far. Need to walk through them and make sure that they are all valid changes.
- Removed the Cait object, since that didn't appear to be doing anything. Instead, state is stored through whatever Report gets passed into the main Cait functions (parse_program, find_match, find_matches, etc.), defaulting to the MAIN_REPORT.
- Renamed EasyNode to CaitNode.
- Removed CtMap in favor of built-in dictionaries. This required changing a couple of tests - it seemed they relied on the particular ordering of CtMap's keys and values, which seemed like it might be fragile to me. An AstMap now has a match_lineno property to support this better.
- Added CaitNode.is_method(), get_data_state(), and get_data_type().
- StretchyTreeMatcher now returns empty lists instead of False values - I don't think this has any negative effects, and it allows you to remove if checks before iterating through the matches.
- AstSymbol now has a __getattr__ that will call getattr on its .ast_node attribute. This allows you to use the result of AstMap lookups more easily.
- Renamed AstSymbol.__str__ to __repr__, and made __str__ just return the id attribute. This allows you to .format the results more easily.
- Added CaitNode.find_matches, which farms out a call to a new StretchyTreeMatcher with find_matches. The idea is to simplify subexpression matching:

# Old
submatches = find_expr_sub_matches("_target_.append(___)", __expr__)
# New
submatches = __expr__.find_matches("_target_.append(___)")

Changes I'm still considering:
- AstMap.__getitem__ could automatically return the first item from the symbol_table's list, rather than returning a list of all potential items. I observe that almost every single Mistakes function and unit test uses the first element. If someone needed a later element, they could just use AstMap.symbol_table to access it directly. This would require touching almost every test, but it seems like it'd be a nice change.

I was once advised to never have a "utilities" module. Well, we probably will have such a thing for Pedal's core, but we shouldn't have one that is teacher-facing. The existing Toolbox module is effectively a hodge-podge of checks, and needs to be broken up and redistributed. I think most of these will become either Assertions or Extensions.
Currently, my vague vision for things like MatPlotLib, Turtle, and Pygame are to support them as "Extensions". Handling these external modules require cross-cutting concerns - stuff in both Sandbox and Assertions, at the very least. It seems to me like they shouldn't be stuck inside of those submodules.
We need to start a formal process for pushing to PyPI and managing releases. Currently, we have no process and it's chaos.
We should set up an Action to automatically run our unit tests, style checker, coverage, and such whenever we push our code to the repo. It's way more professional. That will also encourage us to use branching.
Currently they show off pedal's module system in an ugly way. By default, we should catch these and give better messages. But we need to provide hooks for instructors who want to handle those errors differently (different message, different control flow).
Function definitions are kind of complex...so argument matching and recording function arguments as symbols is not yet implemented.
Make them look like actual tracebacks
Previously, CAIT's api allowed you to extract out the root node (the module) using parse_program
and then use find_all
to traverse it. It seems like that is no longer supported:
https://github.com/acbart/pedal/blob/master/tests/test_cait.py#L370
Also, am I correct in understanding that CAIT rebuilds its parse tree every time you call find_matches
? As opposed to building it once during parse_program
, like it used to?
Need a handler for making wildcard names for functions.
[ ] Ability to verify that a bunch of (instructor created) student submissions have the expected feedback.
[ ] Ability to experimentally run a Progsnap file of student submissions against a new bit of instructor code, generating a CSV of all the recorded feedback events.
[ ] Ability to (re-)grade a bunch of student submissions from the command line.