Regex2NFA is a Python program that converts regular expressions into NFAs and simulates a given string on the resulting NFA. This project was developed as an assignment for a course on theoretical informatics.
- Supports a wide range of regular expressions, including alternation, concatenation, and Kleene star.
- Uses Thompson's algorithm to convert a regular expression to an NFA.
- Provides an
NFA
class implementation with functions for alternation, concatenation, and Kleene star for use with Thompson's algorithm. - Includes a
build.py
script that reads input for a regular expression, creates an NFA for it, removes epsilon, and callsreduce
. - Includes a
run.py
script that reads an NFA generated bybuild.py
and simulates a string on it, printing 'N' and 'Y' for each character of the string, depending on whether the NFA accepts the string up to that character. - Includes
manual_tests.py
andautomatic_tests.py
for testing the program manually and automatically.
The following bash script allows you to run the conversion and simulation of a regex in a single command.
./regex.sh <String> <Regular Expression>
Regex2NFA is a Python program designed to facilitate the conversion of regular expressions into epsilon NFAs (ε-NFAs). It further provides functionality to remove epsilon transitions, minimize the resulting NFA, and simulate a given input string on the transformed automaton. The project was initially developed as an assignment for a theoretical informatics course, aiming to demonstrate the practical implementation of various concepts in automata theory.
The conversion process begins by taking a regular expression as input and generating an epsilon NFA, which is an NFA that can contain epsilon transitions—transitions that can be taken without consuming any input symbol. This allows for a more concise representation of the regular expression.
Once the epsilon NFA is constructed, the program applies an algorithm to remove epsilon transitions, resulting in a standard NFA. This step is important as it simplifies the automaton and allows for more efficient operations and simulations.
To optimize the NFA further, the program employs a minimization algorithm, reducing the number of states and transitions while preserving the language recognized by the automaton. Minimization helps in creating a more compact and manageable representation of the regular expression.
Finally, the program provides the capability to simulate a given input string on the minimized NFA. This simulation determines whether the input string is accepted by the regular expression and automaton, providing insights into the matching behavior of the given pattern.
By incorporating these steps, Regex2NFA offers a comprehensive solution for converting regular expressions into epsilon NFAs, removing epsilon transitions, minimizing the resulting NFA, and simulating input strings.
The NFA class is an implementation of a nondeterministic finite automaton. It provides methods for defining and manipulating NFAs, including alternation, concatenation, and Kleene star operations. These methods are used in conjunction with Thompson's algorithm to convert a regular expression to an NFA.
The build.py script takes a regular expression as input and constructs an NFA that accepts the same language as the regular expression. The script first formats the input regex, converting it into a list of tokens. The list is then processed to create an NFA for each symbol in the regex, and the NFAs are combined using the operations specified in the regex.
The resulting NFA is simplified by removing epsilon transitions, and the reduce method is called to further simplify the NFA. The script then outputs the simplified NFA.
The run.py script takes an NFA generated by build.py and simulates an input string on the NFA. For each character of the input string, the script prints 'Y' if the NFA accepts the string up to that character, and 'N' otherwise.
The project includes manual_tests.py and automatic_tests.py for manually and automatically testing the implementation. These scripts help ensure the correctness of the NFA conversion and simulation processes.