Coder Social home page Coder Social logo

edxz7 / pattern-recognition-2d-collinear-points-finder-with-python-pyspark Goto Github PK

View Code? Open in Web Editor NEW
6.0 1.0 2.0 3.27 MB

Find all the line segments formed by k-collinear points presents in a set of 2D points

License: MIT License

Python 100.00%
pyspark pattern-recognition python3 pygame artificial-intelligence collinear-points learn-by-examples

pattern-recognition-2d-collinear-points-finder-with-python-pyspark's Introduction

Pattern Recognition: Find all unique segments made with 2d collinear points (python/pyspark)

Pattern recognition algorithm. Finds all unique line segments made with at least k collinear points found in a set of 2D points expressed in cartesian coordinates. k must be greater than or equal to 3 (why?, becuase any pair of points defines a line, so the concept of collinearity is not well define for two points)

Examples of the output:

Getting Started

The python code (fastSearchCollinearPoints.py) and pyspark code (pysparkSearchCollinearPoints.py) are independent from each other and each one represents an algorithm that solves this task.

The complexity of the python version is O(nln(n)) whereas the complexity of the pyspark version is O(n^2) where n is the size of the input

To run the code using your own data you must provide a plain text file in the following format

3
12 24
31 45
45 89

the first line indicates the number of points in the file. Inside the data folder there are many test cases to play around.

You can also use a random generator functions to generate the points at random. Search for the following snippets of code to use them

[a snippet of the code inside fastSearchCollinearPoints.py]

# * Random points generator
# Instead of read the data from files, uncomment the lines below to generate them randomly
# and set k to search for at leat k collinear points
# k = 4
# points = randomPointsGenerator.random_points(2000)

[a snippet of the code inside pysparkSearchCollinearPoints.py]

# * Random points generator
# Instead of read the data from a file, uncomment the line below to generate them randomly
# k = 4
# pts, points_to_draw = points_from_random(200)

Usage

For the python code you can use the followimg command

python fastSearchCollinearPoints.py k filename

Whete k is the number of collinear point you want to detect and filename is the name of the file you want to inspect (this file must be inside the data direcctory).

If you don't provide any arguments, the ones hardcoded inside the code will be used instead. You can use this option if you don't want to provide any argument. You can find where to select files manually in the line 100 which contains the following piece of code

except:
    k = 4
    file_name = 'input20.txt'          
    path      = Path.cwd()
    file_path = path / 'data' / file_name

For the pyspark version

spark-submit pysparkSearchCollinearPoints.py

to choose the files from the data folder you can use the function points_from_file (see the line 147 of the file pysparkSearchCollinearPoints.py for more info)

Installation

To run the python code you must have install the libraries: sys, numpy, pathlib, pygame

If you have installed anaconda, you can get pygame running the following code

pip install pygame 

To run the pyspark code you must have intalled properly spark and pyspark


FAQ

  • I have troubles with pygame, Can I avoid used it?

    • If you have troubles with pygame, you can rip off all the references to the stddraw.py file. You still have access to the list of the collinear points found and you can use this data to feed another plotting library like matplotlib.
  • Why do you choose pygame instead of matplotlib?

    • Because in my opinion pygame offers a cleaner canvas to represent the results and it can render thousand of points and lines more efficiently than matplotlib
  • Does matplotlib will be supported in the future?

    • Maybe, if I have time, I certainly do it
  • will it be useful to look inside this code or it is of that kind of unreadable and ugly code?

    • Yeah, this code has some nice explanations and you certanly should see it if you want to learn to do similar things, particularly the pyspark version. The code written in python was a clever algorithm that I wrote for the Coursera course Algorithms I and I want to share it with the world. (The original version of this code was written in java, so if you are taking the Algorithms I course this python version don't will be graded by the autograder)

Acknowledgments and Credits


License

License

pattern-recognition-2d-collinear-points-finder-with-python-pyspark's People

Contributors

edxz7 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

Forkers

sethips mmsengul

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.