This is project consists of two implementations of a sudoku solver, one uses only the CPU wheras the other uses both the CPU and the GPU.
The CPU version takes around 25 seconds whereas the parallel version takes around 4 seconds, to solve 16 of the hardest puzzles collected from this dataset.
The algorithm used for the solver is Breadth First Search (BFS), in which new possible grids are generated and stored based on the initial grid. Each existing grid is assigned a thread, which adds the new grids generated by that grid to the list of new grids. The challenge in this implementation is efficiently storing the new grids in a data structure asynchronously. To solve this, the exclusive scan (prefix sum) method is used to determine the start positions for insertion of the new grids for the current grid. Overall, this project demonstrates the benefits of utilizing both the CPU and GPU in parallel processing for solving sudoku puzzles in a more efficient manner.