dalpy-developers / dalpy Goto Github PK

View Code? Open in Web Editor NEW

14.0 3.0 3.0 121 KB

A Python package for learning data structures and algorithms.

Home Page: https://pypi.org/project/dalpy/

License: MIT License

Batchfile 0.46% Python 99.54%

python data-structures-and-algorithms cormen-algorithms cormen unit-testing academic grading-system

dalpy's Introduction

DALPy

DALPy is a Python package for learning data structures and algorithms. It is based off of Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. This library was made specifically for administering and grading assignments related to data structures and algorithms in computer science.

With this library students can receive progress reports on their problem sets in real time as they complete assignments. Additionally, student submission assessment is done with unit tests, instead of hand-tracing, ensuring that the grades that students receive accurately reflect their submissions.

The DALPy testing suite offers extremely lightweight and flexible unit testing utilities that can be used on any kind of assignment, whether to write functions or build classes. Course administration can be easily streamlined by restricting which library data structures students are allowed to use on any particular assignment.

DALPy began as a project by two Brandeis University undergraduate students to replace hand-written problem sets written in pseudocode.

Provided Data Structures

The DALPy library offers a set of fundamental data structures and algorithms, with behavior as specified by Cormen et al.'s Introduction to Algorithms. The following structures (separated by module) are supported:

arrays
- Array
- Array2D
queues
- Queue
stacks
- Stack
sets
- Set
linked_lists
- SinglyLinkedListNode
- DoublyLinkedListNode
trees
- BinaryTreeNode
- NaryTreeNode
- depth(NaryTreeNode)
hashing
- HashTable
heaps
- PriorityQueue
- build_min_heap(Array)
graphs
- Vertex
- Graph

Unit Testing

Along with the DALPy data structures come test utilities for writing test cases. The testing framework allows a course administrator to easily write test cases for either expected function output or general class behavior. Test cases can then be combined into a testing suite. The testing suite has the capability to set a test case run-time timeout and to record comma-separated test results for administrative use.

Consider the example test case below:

import unittest
from dalpy.factory_utils import make_stack
from dalpy.test_utils import build_and_run_watched_suite, generic_test

from student_submission import student_function

# TestCase class for testing student_function
class StudentFunctionTest(unittest.TestCase):

    # A single test case
    def simple_test_case(self):
        stack = make_stack([1, 2, 3])
        expected = make_stack([1, 1, 2, 2, 3, 3])
        generic_test(stack, expected, student_function, in_place=True)

# Run the test cases using build_and_run_watched_suite with a timeout of 4 seconds
if __name__ == '__main__':
    build_and_run_watched_suite([StudentFunctionTest], 4)

Installation

DALPy is available on PyPI, and can be installed with pip.

pip install dalpy

DALPy has the following dependencies:

Python >= 3.6

Issues

We encourage you to report issues using the GitHub tracker. We welcome all kinds of issues, especially those related to correctness, documentation and feature requests.

Academic Usage

If you are planning to use DALPy for a university course and have questions, feel free to reach out by email.

Documentation

The full documentation for DALPy is available here.

Sample Usage

To view sample assignments using DALPy browse the DALPy sample problems repository on GitHub.

Notes

This project was formerly known as Cormen-Lib.

dalpy's People

Contributors

Stargazers

Watchers

Forkers

masonrware wwxiao09 zeng-lecheng

dalpy's Issues

GraphEdgeError message does not use dalpy_to_string for vertices.

Describe the bug
The GraphEdgeError class does not produce a readable output. This is because the vertices are not displayed properly with their data. An example output:

dalpy.graphs.GraphEdgeError: graph does not have an edge from a vertex with name <dalpy.graphs.Vertex object at 0x7faa24830340> to a vertex with name <dalpy.graphs.Vertex object at 0x7faa24833670>

To Reproduce
Steps to reproduce the behavior:

g = Graph()
v1 = Vertex("a")
v2 = Vertex("b")
g.add_vertex(v1)
g.add_vertex(v2)
g.weight(v2, v1)
Observe the error message.

Expected behavior
The output should display the Vertex objects using the data of the vertex.

Enforcing No Argument Modification in Testing Utilities

Is your feature request related to a problem? Please describe.
For problems that require students to write functions, we sometimes want to impose that the students' functions should not modify some or all of their arguments. To currently enforce this, a copy needs to be made of the arguments that aren't meant to be modified before running the students' function. Then, after the function is run, the copy needs to be compared to the arguments that were passed to the function. This occurs in cases where the arguments are data structures of some kind (e.g. Array, Stack, etc.).

Describe the solution you'd like
Since this process is easily generalizable, I believe that this should be added to the testing utilities. In particular, it could be added to the run_generic_test function.

The first question is how the user should specify which arguments should be checked for modification. This could be specified in a parameter enforce_no_mod:

By default, enforce_no_mod is False indicating that all of the arguments can be modified.
If enforce_no_mod is True, then none of the arguments can be modified and should all be checked.
Lastly, enforce_no_mod can be a list of bool of equal length to the existing params parameter. enforce_no_mod[i] = True means params[i] should be checked for modification. enforce_no_mod[i] = False means that it should not.

The second question is how run_generic_test will know how to copy any arguments that need to be checked. This can be specified by a list copy_fns such that copy_fns[i] is a function that can be used to make a copy of params[i]. Since params can be a single value (i.e. not a list), it may be good to allow copy_fns to be a single function also. This should be provided a default value as to not have to change existing testing code (None is sufficient).

The third question is how run_generic_test will know how to compare the argument passed to the students' function and its copy. This also can be specified by a parameter copy_eqs_fns that operates in a similar manner to copy_fns except where copy_eqs_fns[i], in the instance it's a list, is a function that checks the equality of params[i] to its copy. This should be provided a default value as to not have to change existing testing code (None is sufficient).

Describe alternatives you've considered
The alternative is the current process described in my answer to the first prompt.

Additional context

Rename Cormen-Lib to DALPy

Description
I believe that we should rename Cormen-Lib to DALPy. DALPy would stand for a Python module for learning data structures and algorithms. The module will remain based off of Introduction to Algorithms by Cormen et al. I believe that DALPy sounds more professional as it does not associate Introduction to Algorithms to only one of its four authors. Also, it follows the convention of existing Python libaries such as NumPy, SciPy, PyTorch etc.

Additional Context
This will involve renaming the module on PyPI as well as the associated organization and repositories on GitHub.

Giving clear hint of the renaming

Is your feature request related to a problem? Please describe.
Yes it is a problem. I was installing cormen-lib today and found my code not working at all. The import cormen_lib statement threw an ModuleNotFoundError to me which led me to troubleshoot on my Python installation. After 20 minutes of configuring and reinstalling cormen-lib and Python and even my IDE back and forth, I finally saw your PyPI page and realized that the project was renamed.

Describe the solution you'd like
Giving clear hints of the renaming is much better, instead of removing all the code under the name cormen_lib and throwing ModuleNotFoundError. We could have a print statement in cormen_lib.py: print('cormen_lib is renamed to dalpy. See https://github.com/DALPy-Developers/DALPy for detail'), as well as all the names of the classes in the library. When users try to import something from cormen_lib (e.g. running their previous homework), they would know what to do to solve it.

Reimplement Stack, Queue to Match Theoretical Runtime

Is your feature request related to a problem? Please describe.
Currently, Stack and Queue are backed by list which is assumed to be a dynamic array. Therefore, Queue:dequeue degrades to linear time. It is also possible for Stack:push and Queue:enqueue to be linear time if list must perform an array resize.

Describe the solution you'd like
Stack and Queue are reimplemented to have constant time operations and infinite capacity.

cormen_equals and copy_stack throwing error when applied to two Stacks

Describe the bug
Stack equality using cormen_equals has a bug. The __stack_equals method uses copy_stack from factory_utils which does not properly copy a stack's contents, and always throws an error stating that "one can not pop from an empty stack" unless the input stack is empty.

To Reproduce
Steps to reproduce the behavior:

Create a Stack with more than one element.
Call cormen_equals on that Stack with itself.
See error

Expected behavior
The cormen_equals method should return true if the contents of the Stacks are equal and false otherwise.

Solution
See line 43 of the factory_utils module - it should store s.pop() in a variable and push it to the buf Stack twice.

Automatic Proper Floating Point Comparison

Is your feature request related to a problem? Please describe.
Currently, most test cases are built around the run_generic_test function in the cormen_lib.test_utils module. Some test cases will involve the comparison of floating point numbers. If no custom_comparator is specified in run_generic_test, cormen_equals is used instead. cormen_equals uses == to compare floats. This is not the proper way to compare floats.

Describe the solution you'd like
I believe that cormen_equals should make use of math.isclose() when passed two floats.

Describe alternatives you've considered
The alternative is for the user to specify in each test case that compares floats custom_comparator = math.isclose. This is not only tedious, but sometimes a user may forget to do this. This is especially likely if the tests happen to pass on their machine.

Additional context
StackOverflow article on float comparison in Python

Function to Add Children to a NaryTreeNode

Is your feature request related to a problem? Please describe.
Building a n-ary tree is somewhat difficult for the purpose of unit testing. Suppose you have an NaryTreeNode a and wish to add children [b,c,d] (all NaryTreeNodes as well). Currently you have to do the following:

a.leftmost_child = b
b.parent = a
c.parent = a
d.parent = a
b.right_sibling = c
c.right_sibling = d

When there are more children, or when one adds children to [b,c,d], one can easily forget to set one of these attributes.

Describe the solution you'd like
I think a function should be added to factory_utils that takes a NaryTreeNode node and a list of NaryTreeNode children and properly adds all the elements of children as children of root.

Describe alternatives you've considered
The alternative is to add a method that does this to NaryTreeNode. However, this does not follow the simple nature of the NaryTreeNode class as described in Introduction to Algorithms.

Additional context
The motivation for this issue came from finding a bug in test cases built for a problem related to NaryTreeNodes.

Add a warning when students return a value from a function being evaluated with in-place=True

Is your feature request related to a problem? Please describe.
Some problems may take a data structure as input, passed by reference, to be updated by a function implemented by a student. In these situations, test_utils.run_generic_test offers an optional parameter in_place that if set to True causes the test case to compare the expected against the input parameter. Any returned value from the tested function will then get discarded, which can cause confusion for students who believe that the function should have a return type.

Describe the solution you'd like
In the case that a student has a function return a non-None value when the in_place flag is set to True, a warning could appear to indicate to the student that they should be updating the input parameter by reference instead of returning a new value. Another option could be to allow for the distinction between a "return comparator" and "parameter comparator", which both need to be satisfied. This could also be used for the dual purpose of ensuring that an input data structure is not altered when it should not be (for example the accidental destruction of a queue).

Describe alternatives you've considered
The only way to clarify this at the moment is to add a note to the problem description (in English) explaining that the input should be updated, and no return type is required for such problems.

Additional context
Reported by a student in Brandeis University COSI21a of Spring 2022

dalpy_to_string on trees of integers shows quotes

Describe the bug
The BinaryTreeNode and NaryTreeNode dalpy_to_string methods improperly display trees of integers. They display the integers in the trees as though they were strings, such as ['10'] instead of [10].

To Reproduce
Steps to reproduce the behavior:

Instantiate BinaryTreeNode with an integer argument, i.e. t = BinaryTreeNode(1)
Observe result of dalpy_to_string on that argument, i.e. dalpy_to_string(t)
Repeat for NaryTreeNode.

Expected behavior
The dalpy_to_string method should display trees of integers without quotation marks around the tree element data fields.

Screenshots

Additional context
Check this behavior for other dalpy objects as well.

Fix misaligned PriorityQueue runtimes

Is your feature request related to a problem? Please describe.
Currently, PriorityQueue is backed by a sorted list. Therefore, the actual runtimes of DALPy PriorityQueue functions is misaligned with their assumed runtimes.

Describe the solution you'd like
PriorityQueue should be reimplemented using Python's base library heapq with a custom decrease_key function that mantains O(log(n)) runtime.

Changes to HashTable and Set

Is your feature request related to a problem? Please describe.
Currently there is no way to iterate over the keys of a HashTable.

Also, I believe that Set should operate more like a mathematical set versus a Java/Python set. That is, modification to the Set should be done via difference and union operations with other Sets.

Describe the solution you'd like
One should be able to get either get the keys (have HashTable have a function that returns a collection of them) or be able to iterate over the keys (e.g. for k in t).

In regard to Set, add() should be removed and replace with union() which takes a Set. That way, Set will operate more how you would use it in pseudocode:

S = S U {1}
T = T - {2}
S = S U T

Describe alternatives you've considered
To iterate over a HashTable now, the keys must be stored in an external data structure like an Array or Set. I also think it is best that iteration is kept to over keys to avoid confusion with tuples (like the behavior of Python dict's items()).

For Sets the alternative is to keep add() (and possible inclue a remove()) but those operate on single elements. Since singleton Sets can be easily created, I think it is suitable to use union and difference operations in their place.

Additional context
None

Addition of a Miscellaneous Utilities Module

Is your feature request related to a problem? Please describe.
In some scenarios (e.g. finding a max, initializing graph vertex priorities in Dijkstra's, priority queue insertion) users may want to make use of positive or negative infinity. This also appears in the Cormen textbook. Users may also want to make use of the floor and ceiling mathematical operations.

Describe the solution you'd like
I think a new module should be added to the library that includes (positive) infinity as a constant and implementations of the floor and ceiling operations. I believe the module should be named something like utilities or miscellaneous.

Describe alternatives you've considered
Currently, when building the starter code for a problem set, the math module is imported or an infinity constant is added in the instance where it is known they would be used. However, to allow users to have these capabilities available in any Cormen Lib code they write, these capabilities should be added to the package. This will provide a better Python implementation of the pseudocode used in the Cormen textbook. I believe these capabilities should be added in a new module in the package as they do not belong in any of the existing modules. I do not believe the new module should be called math because additional functions or constants could be added to it (e.g. functions operating on strings).

Additional context

Update dalpy_to_string to display object type

Is your feature request related to a problem? Please describe.
The current iteration of the dalpy_to_string function does not indicate the object type in the string representation of DALPy objects. This can be confusing when the dalpy_to_string representation and other base python object representations coincide, for example with DALPy Array and Python list.

Describe the solution you'd like
Some indication of the object type should be added to the dalpy_to_string function's string representation generation for each DALPy provided data structure. For example:

Array[1,2,3]
Queue[1,2,3]

Additional context
Here is a screenshot from a Brandeis University Spring 2022 COSI21a student's submission for which they were confused as to why they were failing test cases, despite the output and expected appearing identical.

README.md

dalpy-developers / dalpy Goto Github PK

dalpy's Introduction

DALPy

Provided Data Structures

Unit Testing

Installation

Issues

Academic Usage

Documentation

Sample Usage

Notes

dalpy's People

Contributors

Stargazers

Watchers

Forkers

dalpy's Issues

Recommend Projects

Recommend Topics

Recommend Org