matthew-murphy / se-q3-wordcount

This project forked from jedeness/se-q3-wordcount

Create a command line utility to count words in a text file, and show the top 20 most frequent words.

License: MIT License

Python 100.00%

Wordcount

In this assignment you will use your knowledge of basic Python strings, arithmetic, file reading, and dictionaries to create a command-line utility that counts the words in the text files in the books directory (small.txt and alice.txt).

Complete the command-line Python program named wordcount.py so that it counts the words in a text file, using the optional flags --count and --topcount.
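As a rough illustration of how the two flags could be read from the command line, here is a minimal sketch using the standard argparse module. The starter wordcount.py may well parse its arguments differently (for example by reading sys.argv directly); the function name parse_args and the choice to require exactly one flag are assumptions made for this example only.

```python
import argparse


def parse_args(argv):
    """Parse one of --count / --topcount followed by a filename.

    Hypothetical helper: not part of the provided starter code.
    """
    parser = argparse.ArgumentParser(description="Count words in a text file.")
    # Assumption: exactly one of the two flags must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--count", action="store_true",
                       help="print every word with its count")
    group.add_argument("--topcount", action="store_true",
                       help="print the 20 most frequent words")
    parser.add_argument("filename", help="text file to read, e.g. books/alice.txt")
    return parser.parse_args(argv)
```

Calling parse_args(["--count", "books/alice.txt"]) returns a namespace whose count attribute is True and whose filename attribute is "books/alice.txt"; in a script you would typically pass sys.argv[1:].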

Example

$ python wordcount.py --count books/alice.txt
"'tis : 1
"--said : 1
"come : 2
"coming : 1
"edwin : 1
"french, : 1
"he's : 1
"how : 2
"i : 8
"i'll : 2
"it" : 2
"keep : 1
"let : 1
"much : 1
"poison" : 1
"purpose"?' : 1
$ python wordcount.py --topcount books/alice.txt
Top 20 most frequent words in books/alice.txt
the : 1605
and : 766
to : 706
a : 614
she : 518
of : 493
said : 421
it : 362
in : 352
was : 333
you : 265
i : 261
as : 249
that : 222
alice : 221
her : 208
at : 206
had : 176
with : 169
all : 155

Part A

For the --count flag, implement a print_words() function that counts how often each word appears in the text and prints:

word1 : count1
word2 : count2
...

Print the above list in order, sorted alphabetically by word (Python sorts leading punctuation such as quotation marks before letters, which is fine; do not strip out punctuation). Store all the words as lowercase, so that 'The' and 'the' count as the same word.
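One possible shape for Part A is sketched below. It is not the reference solution: the helper name create_word_dict is hypothetical, and the sketch assumes that words are separated by whitespace and that the file can be read with the default text encoding.

```python
def create_word_dict(filename):
    """Return a dict mapping each lowercased word in the file to its count.

    Hypothetical helper: the name is an assumption, not part of the assignment.
    """
    word_counts = {}
    with open(filename) as f:
        for line in f:
            # split() with no arguments splits on any run of whitespace,
            # so punctuation stays attached to the word.
            for word in line.split():
                word = word.lower()
                word_counts[word] = word_counts.get(word, 0) + 1
    return word_counts


def print_words(filename):
    """Print 'word : count' for every word, sorted alphabetically by word."""
    word_counts = create_word_dict(filename)
    for word in sorted(word_counts):
        print(f"{word} : {word_counts[word]}")
```

Keeping the counting in its own function means the same dictionary can be reused by print_top() in Part B.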

Part B

For the --topcount flag, implement a print_top() function similar to print_words(), but one that prints only the 20 most common words, sorted so that the most common word comes first, then the next most common, and so on.
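A minimal sketch of Part B follows, again under stated assumptions rather than as the expected solution. The helper count_words and the limit parameter are hypothetical, the header line is modeled on the example output above, and ties are broken alphabetically because the assignment does not say how equal counts should be ordered.

```python
def count_words(filename):
    """Hypothetical helper: map each lowercased word in the file to its count."""
    counts = {}
    with open(filename) as f:
        for word in f.read().lower().split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def print_top(filename, limit=20):
    """Print the `limit` most common words, most frequent first."""
    counts = count_words(filename)
    # Sort by descending count; break ties alphabetically (an assumption,
    # since the assignment does not specify tie ordering).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    print(f"Top {limit} most frequent words in {filename}")
    for word, count in ranked[:limit]:
        print(f"{word} : {count}")
```

Sorting on the tuple (-count, word) gives a deterministic order in a single pass; collections.Counter.most_common() would be an alternative if the course allows it.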

Testing with Unittest

This assignment also has separate unit tests to help you during development. The unit tests are located in the tests folder; you should not modify these. Make sure all unit tests are passing before you submit your solution. You can invoke the unit tests from the command line at the root of your project folder:

$ python -m unittest discover tests

You can also run these same tests using the Test Explorer built into the VS Code editor, by enabling automatic test discovery. This is a really useful tool, and we highly recommend learning it.

https://code.visualstudio.com/docs/python/testing#_test-discovery

  • Test framework is unittest
  • Test folder pattern is tests
  • Test name pattern is test*

Submitting your work

To submit your solution for grading, you will need to create a GitHub Pull Request (PR). Refer to the PR Workflow article in your course content for details.

Contributors

madarp
