python-party's Introduction

Task for a Great Python Developer

If you found this task, it means we are looking for you!

Task

Word Search

Implement a CLI tool that finds words in a given text file that match a search phrase. The search phrase is a single word and may be misspelled. Use the American Soundex algorithm (https://en.wikipedia.org/wiki/Soundex) to match words. The tool should return the top 5 unique matched words.
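
As a starting point, American Soundex could be sketched roughly as follows. This is a minimal illustration, not a reference solution: the function name `soundex` is hypothetical, and the sketch assumes that non-ASCII letters (such as the IPA symbols in the sample file) are simply ignored and that a word without any ASCII letters yields an empty code.

```python
from typing import Dict

# Letter-to-digit table of American Soundex; vowels, h, w and y have no digit.
_GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
_CODES: Dict[str, str] = {
    letter: digit for digit, letters in _GROUPS.items() for letter in letters
}


def soundex(word: str) -> str:
    """Return the 4-character American Soundex code, or "" without ASCII letters."""
    # Assumption: anything outside a-z is dropped before encoding.
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    # The digit of the first letter still suppresses an identical digit after it.
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            # h and w are skipped and do not separate equal digits.
            continue
        digit = _CODES.get(ch, "")
        if digit and digit != previous:
            code += digit
            if len(code) == 4:
                break
        # A vowel resets `previous`, so an equal digit after it is kept.
        previous = digit
    # Pad short codes with zeros to one letter plus three digits.
    return code.ljust(4, "0")
```

With this sketch, `soundex("lituania")` and `soundex("Lithuania")` both give `L350`, which is why the misspelled query in the sample usage can still find the intended word.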

Sample file wiki_lt.txt:

   Lithuania (UK and US: Listeni/ˌlɪθuːˈeɪniə/,[11][12][13] Lithuanian: Lietuva
   [lʲɪɛtʊˈvɐ]), officially the Republic of Lithuania (Lithuanian: Lietuvos
   Respublika), is a country in Northern Europe.[14] One of the three Baltic
   states, it is situated along the southeastern shore of the Baltic Sea, to the
   east of Sweden and Denmark. It is bordered by Latvia to the north, Belarus to
   the east and south, Poland to the south, and Kaliningrad Oblast (a Russian
   exclave) to the southwest. Lithuania has an estimated population of 2.9 million
   people as of 2015, and its capital and largest city is Vilnius. Lithuanians are
   a Baltic people. The official language, Lithuanian, along with Latvian, is one
   of only two living languages in the Baltic branch of the Indo-European language
   family.

Sample usage:

$ ./find.py wiki_lt.txt lituania

Sample output:

   Lithuania
   Lithuanian
   Lietuva
   Listeni
   living

Exact results may differ depending on the scoring, matching, and sorting algorithms used.
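
One possible way to score and sort candidates is sketched below; it is only an assumption about how ranking could work, not the expected scoring. The names `code_similarity` and `top_matches` are hypothetical, `encode` stands for any phonetic key function (for example a Soundex implementation), and spelling similarity from the standard `difflib` module is used as a tie-breaker. Words that differ only in letter case are treated as distinct here.

```python
import difflib
import heapq
from typing import Callable, Dict, Iterable, List, Tuple


def code_similarity(left: str, right: str) -> int:
    """Count how many leading characters two phonetic codes share."""
    shared = 0
    for a, b in zip(left, right):
        if a != b:
            break
        shared += 1
    return shared


def top_matches(
    query: str,
    words: Iterable[str],
    encode: Callable[[str], str],
    limit: int = 5,
) -> List[str]:
    """Return up to `limit` unique words, best match first."""
    query_code = encode(query)
    scores: Dict[str, Tuple[int, float]] = {}
    for word in words:
        if word in scores:
            continue  # keep each word only once
        phonetic = code_similarity(query_code, encode(word))
        if phonetic == 0:
            continue  # codes share nothing, so the word is not a match
        spelling = difflib.SequenceMatcher(None, query.lower(), word.lower()).ratio()
        # Tuples compare element by element: phonetic score first, spelling second.
        scores[word] = (phonetic, spelling)
    return heapq.nlargest(limit, scores, key=scores.__getitem__)
```

The dictionary holds one entry per distinct matching word, so its size follows the vocabulary of the file rather than the file size.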

NOTE that the given text file might be larger than the RAM available on the machine, e.g. 4 GB or more. Parallelized optimizations are encouraged.
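
To keep memory use bounded, the file can be read in fixed-size chunks instead of all at once. The sketch below is one way to do that, under the assumptions that a "word" is a run of Unicode letters and that the hypothetical helper `iter_words` receives an already opened text stream. It reads by chunk rather than by line because a large file is not guaranteed to contain line breaks.

```python
import io
import re
from typing import IO, Iterator

# Assumption: a word is a run of letters (no digits, underscores or punctuation).
WORD = re.compile(r"[^\W\d_]+")


def iter_words(stream: IO[str], chunk_size: int = 1 << 20) -> Iterator[str]:
    """Yield words from a text stream without loading it fully into memory."""
    tail = ""  # a word cut off at the end of the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        text = tail + chunk
        matches = list(WORD.finditer(text))
        if matches and matches[-1].end() == len(text):
            # The last word touches the chunk boundary and may continue.
            tail = matches.pop().group()
        else:
            tail = ""
        for match in matches:
            yield match.group()
    if tail:
        yield tail


# Demo with an in-memory stream and a deliberately tiny chunk size.
sample = io.StringIO("Lithuania is a country in Northern Europe.")
print(list(iter_words(sample, chunk_size=4)))
```

For a real file, the stream would come from something like `open(path, encoding="utf-8", errors="replace")`. Splitting the file into byte ranges and handing them to worker processes is a possible next step for parallelism, but it is not shown here.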

A few simple steps

  1. Fork this repo
  2. Do your best
  3. Prepare a pull request and let us know that you are done

A few simple requirements

  • Use Python 3.
  • It should be easy to identify the candidate's code. It should not be buried inside some framework directory structure.
  • Proper error handling must be implemented, and user input must be validated.
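
Input validation and error handling could look roughly like the sketch below, which uses the standard `argparse` module. The helper names (`single_word`, `readable_file`, `parse_args`, `main`) and the exact validation rules are assumptions made for illustration: the search phrase must consist of letters only, and the path must point to an existing, readable file.

```python
import argparse
import os
import sys
from typing import List, Optional


def single_word(value: str) -> str:
    """Accept the search phrase only if it is one word made of letters."""
    word = value.strip()
    if not word.isalpha():
        raise argparse.ArgumentTypeError(
            f"expected a single word made of letters, got {value!r}"
        )
    return word


def readable_file(value: str) -> str:
    """Accept the path only if it is an existing, readable regular file."""
    if not os.path.isfile(value):
        raise argparse.ArgumentTypeError(f"{value!r} is not an existing file")
    if not os.access(value, os.R_OK):
        raise argparse.ArgumentTypeError(f"{value!r} is not readable")
    return value


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        prog="find.py", description="Find words that sound like the given word."
    )
    parser.add_argument("path", type=readable_file, help="text file to search")
    parser.add_argument("word", type=single_word, help="word to look for")
    return parser.parse_args(argv)


def main(argv: Optional[List[str]] = None) -> int:
    args = parse_args(argv)
    try:
        with open(args.path, encoding="utf-8", errors="replace") as stream:
            stream.read(1)  # placeholder for the actual search
    except OSError as error:
        # The file can still disappear or become unreadable after validation.
        print(f"find.py: cannot read {args.path}: {error}", file=sys.stderr)
        return 1
    return 0
```

When a validator raises `argparse.ArgumentTypeError`, `argparse` prints a usage message with the error text and exits with status 2, so invalid input never reaches the search itself.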

Bonus

  • The more code is covered with automated tests, the better. Unit, integration, and end-to-end tests are all encouraged.
  • Follow the PEP 8 style guide.
  • Use pylint, flake8, or any other tool to statically check the code.
  • Use type hints and check them with mypy.
