
log-processing's People

Contributors

dolan-peter, nicmcphee

log-processing's Issues

Create a `submodule` script for `bats`

Instead of having all the confusion about submodules caused by bats, we should make a setup-bats.sh script (or similar) that runs the desired commands. Then people can run that once when they first clone the repo. Thanks to @biruk741 for the suggestion!

This should happen in earlier labs, like the command line intro and its pre-lab, and continue out across the subsequent labs that use bats.
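A minimal sketch of what such a script might contain, assuming the bats helpers are already registered as submodules in .gitmodules. The file name setup-bats.sh comes from the suggestion above; the function name setup_bats is hypothetical.

```shell
# setup-bats.sh (sketch): fetch the bats-related submodules after a plain `git clone`.
# Assumption: the submodules are already listed in .gitmodules.

setup_bats() {
  # $1 is the root of the cloned repository (defaults to the current directory).
  repo_root="${1:-.}"
  # --init registers any submodules that are not yet initialized;
  # --recursive also handles submodules nested inside other submodules.
  git -C "$repo_root" submodule update --init --recursive
}
```

The real script would end by calling setup_bats, so that running it once from the top of a fresh clone is all a student has to do.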

Unit tests should normalize whitespace before sorting lines, rather than after

Right now, process_client_logs.bats and process_logs.bats each take the actual result, sort it, and diff it with the expected result ignoring whitespace.

  # process_client_logs.bats
  sort data/discovery/failed_login_data.txt > data/discovery_sorted.txt
  sort tests/discovery_failed_login_data.txt > data/target_sorted.txt
  run diff -wbB data/target_sorted.txt data/discovery_sorted.txt
  assert_success
  # process_logs.bats
  sort tests/summary_plots.html > targets/target.txt
  sort failed_login_summary.html > targets/sorted.txt
  run diff -wbB targets/target.txt targets/sorted.txt
  assert_success

There's a problem, though—if a line has extra whitespace, it might get sorted differently than the canonical, whitespace-normalized version!

So, to compare two files in a whitespace-insensitive way, we need to normalize whitespace and then sort :)
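One way this could look, as a sketch rather than the final fix. The helper name normalized_sort is hypothetical, and "normalize" is taken here to mean collapsing runs of spaces and tabs, trimming line ends, and dropping blank lines. That is close to, but not identical to, what diff -wbB ignores, since -w also ignores whitespace inside tokens.

```shell
# Print a file with whitespace normalized first and sorted second.
normalized_sort() {
  # 1. collapse each run of spaces/tabs into a single space
  # 2. strip the single leading/trailing space that step 1 can leave behind
  # 3. drop lines that are now empty
  # 4. only then sort, so the ordering no longer depends on stray whitespace
  sed -e 's/[[:space:]]\{1,\}/ /g' -e 's/^ //' -e 's/ $//' -e '/^$/d' "$1" | LC_ALL=C sort
}

# Possible use inside process_client_logs.bats:
#   normalized_sort tests/discovery_failed_login_data.txt > data/target_sorted.txt
#   normalized_sort data/discovery/failed_login_data.txt > data/discovery_sorted.txt
#   run diff data/target_sorted.txt data/discovery_sorted.txt
#   assert_success
```

Keeping the -wbB flags on the final diff would still be harmless. The important change is that the sort now sees the normalized lines.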

Expand user name tests to include _ and -

There are user names with _ and - in the full data set, but none in the test data until you get to the final test. It might be good to have examples in the partial tests that include those so people get a heads up sooner.

Document cloning with `--recurse-submodules`

I think that if they include --recurse-submodules when they do their git clone (presumably on the command line), it will avoid all the submodule weirdness. I'm not 100% sure, though, so I'm going to leave this be for now.
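For reference, the command in question would look roughly like this. The wrapper name clone_with_submodules is hypothetical, and the repository URL is left as a parameter rather than guessed. Git documents --recurse-submodules as initializing and cloning the submodules once the main clone is created, but whether that removes all of the confusion students run into is still untested here.

```shell
clone_with_submodules() {
  # $1: repository URL, $2: destination directory
  # Clones the repository and then initializes and clones its submodules.
  git clone --recurse-submodules "$1" "$2"
}
```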

Document `diff` output from `bats` tests

Every semester students are confused by the diff output from the bats tests, and TBH I typically have to reconstruct what's going on there each time.

Maybe we should document that output in the write-up (or a separate document), including:

  • How to read the XdY, XcY, etc., lines
  • How to read the < and > lines
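A small worked example that such documentation could start from. The file names and contents are made up for illustration. In the bats tests the target file is the first argument to diff, so "<" lines are what the test expected and ">" lines are what the script produced.

```shell
demo_dir=$(mktemp -d)
printf 'a\nb\nc\nd\n' > "$demo_dir/expected.txt"   # first argument to diff
printf 'a\nx\nc\n'    > "$demo_dir/actual.txt"     # second argument to diff

# diff exits with status 1 when the files differ, hence the `|| true`.
diff_output=$(diff "$demo_dir/expected.txt" "$demo_dir/actual.txt" || true)
printf '%s\n' "$diff_output"
# Output, annotated:
#   2c2    line 2 of the first file must Change to match line 2 of the second
#   < b    "<" lines come from the first file (expected.txt)
#   ---
#   > x    ">" lines come from the second file (actual.txt)
#   4d3    line 4 of the first file was Deleted; it would follow line 3 of the second
#   < d
```

An XaY line is the mirror image of XdY: the lines at Y in the second file would have to be added after line X of the first.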

Have students create releases for the two parts

If we're going to keep this as a two-week lab, with different parts due in consecutive weeks, we should have them create releases (or at least tags) after the first part is done to make it easier for us to check out the first part separately from the second. Otherwise it's really hard to figure out where their first half "ends" and their work on the second half "begins". If we have them create releases or tags, then we can check out the appropriate release/tag and grade that without worrying about how things get muddied by their work on the second part.
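A sketch of the tag-based variant, under the assumption that the student's remote is named origin. The function names and the example tag name part1 are hypothetical.

```shell
# Student side: mark the commit where a part is finished.
tag_submission() {
  # $1: repository directory, $2: tag name (for example "part1")
  git -C "$1" tag -a "$2" -m "Submission for $2"
  # Tags are not pushed by default, so push this one explicitly.
  git -C "$1" push origin "$2"
}

# Grader side: check out exactly what was tagged, ignoring later commits.
checkout_submission() {
  # $1: repository directory, $2: tag name
  git -C "$1" checkout "$2"
}
```

Checking out a tag leaves the grader's clone in a detached-HEAD state, which is fine for running the tests against that snapshot.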

Convert to `bats-core`

We need to convert from the old-style bats to the new bats-core. That's a fair bit of work because there are a lot of tests in this repo, all of which will have to be updated. We'll also need to mention the git submodule add commands in the README.

Test files to update:

  • process_client_logs.bats
  • create_username_dist.bats
  • create_hours_dist.bats
  • create_country_dist.bats
  • assemble_report.bats
  • process_logs.bats

Other tasks:

  • Make sure everything is clean with shellcheck
  • Update README to include git submodule add commands
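The git submodule add commands for the README might look something like the sketch below. The three repository URLs are the public bats-core, bats-support, and bats-assert projects. The destination paths under tests/ are an assumption modeled on the layout used in the bats-core documentation, and the function name is hypothetical.

```shell
add_bats_submodules() {
  # $1: directory that holds the tests (assumed to be "tests" in this repo)
  test_dir="${1:-tests}"
  git submodule add https://github.com/bats-core/bats-core.git "$test_dir/bats"
  # bats-assert (assert_success and friends) depends on bats-support.
  git submodule add https://github.com/bats-core/bats-support.git "$test_dir/test_helper/bats-support"
  git submodule add https://github.com/bats-core/bats-assert.git "$test_dir/test_helper/bats-assert"
}
```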

README.MD assemble_report.sh writeup

This order is incorrect:

  • username_dist.html
  • hours_dist.html
  • country_dist.html

The test actually wants to see:

  • country_dist.html
  • hours_dist.html
  • username_dist.html
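The order the test wants is alphabetical, which is also the order a shell glob produces. The sketch below only demonstrates that glob behavior with placeholder files. Whether assemble_report.sh actually builds the report this way is an assumption.

```shell
report_dir=$(mktemp -d)
# Create the files in the order the write-up lists them.
for name in username hours country; do
  printf '<!-- %s -->\n' "$name" > "$report_dir/${name}_dist.html"
done

# Pathname expansion sorts its matches, so creation order does not matter.
body=$(cat "$report_dir"/*_dist.html)
printf '%s\n' "$body"
# Prints the country, hours, and username placeholders, in that order.
```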

Break the lab write-up into separate pages

This write-up is really long, which makes it intimidating, and hard to scan through for particular pieces of information. It would probably be a lot more readable if we cut it into multiple pages, with the "main" README having lots of links into those pages.

Reduce the number of points in Canvas for this lab

The two parts of this lab are collectively worth nearly 100 points, which is a bit out of proportion to the value of the other labs. The Segmented File Server is only worth 24, i.e., a quarter of this, and it really should be more like 50-75 points compared to this.

Fix the broken country code example in the README

In the Write create_country_dist.sh section, I say:

After you've converted IP addresses to country codes, you can extract the country codes, count their occurrences (like we counted usernames before), and generate the necessary data.addRow lines, which again look like data.addRow(['04', 87]);. Remember to then wrap those with the appropriate header and footer, and you're done with this part.

The example is presumably a copy/paste from the hours section. The string '04' should be changed to a country code like 'FR'.
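To show what the corrected example line would be part of, here is a rough sketch of the counting step the README describes. It assumes one country code per line on standard input; the function name country_rows and the sample codes are hypothetical.

```shell
country_rows() {
  # sort groups identical codes, uniq -c prefixes each with its count,
  # and awk rewrites "count code" as a data.addRow line.
  sort | uniq -c | awk -v q="'" '{ printf "data.addRow([%s%s%s, %d]);\n", q, $2, q, $1 }'
}

printf 'FR\nUS\nFR\n' | country_rows
# data.addRow(['FR', 2]);
# data.addRow(['US', 1]);
```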
