
halflife's People

Contributors

tripleee

Forkers

tripleee

halflife's Issues

Rearchitect Halflife

The internal architecture is rather messy, and should be rearchitected.

Must:

  • Report results when they are established. The current architecture collects results into a dict and reports the contents of the dict separately once everything is done, which means the analysis code and the reporting code always live in two different places, usually far apart. The original rationale for this design was that the analysis code should not need to know what it is reporting on: for example, a function which queries DNS for information about an IP address doesn't, and shouldn't, know which domain name or Metasmoke post the address is related to, even though those details are relevant and important when reporting the result.
  • Modularize the ActionCable and Metasmoke APIs. There is a rudimentary attempt at encapsulating these, but it's not very intuitive or elegant.
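
One possible shape for the first item above, as a minimal sketch only: the analysis yields findings as it establishes them, and a thin wrapper attaches the post and domain context before reporting. None of these names (`Context`, `analyze_ip`, `run`) come from the Halflife code base, and the findings are made-up placeholder strings rather than real DNS output.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass(frozen=True)
class Context:
    """What a result relates to. Both fields are hypothetical."""
    post_id: int
    domain: str


def analyze_ip(ip: str) -> Iterator[str]:
    """Stand-in for a DNS analysis; it knows nothing about posts or domains."""
    # A real implementation would query DNS here; these findings are invented.
    yield f"{ip} has reverse DNS host.example.net"
    yield f"{ip} is not on the watchlist"


def run(analysis: Iterator[str], context: Context,
        report: Callable[[str], None]) -> None:
    """Report each finding as soon as the analysis produces it."""
    for finding in analysis:
        report(f"[post {context.post_id}, {context.domain}] {finding}")


def demo() -> List[str]:
    """Collect reports in a list; a real reporter would send them to chat."""
    sent: List[str] = []
    run(analyze_ip("192.0.2.1"), Context(42, "example.com"), sent.append)
    return sent
```

The analysis stays ignorant of the post, but each finding is reported next to the code that produced it instead of being parked in a dict.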

Should:

  • Break out more of the analysis into separate routines or even separate code modules (DNS, URL analysis, etc).
  • Report results as JSON instead of, or in addition to, the current ad-hoc human-readable formatting.
  • Explore async (#1).
  • Explore breaking out URL fetching into a separate microservice.
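
For the JSON item, a small sketch of what "instead of, or in addition to" could look like. The record fields (`post_id`, `check`, `outcome`) and the function name are assumptions for illustration, not the current output format.

```python
import json


def format_result(post_id, check, outcome, as_json=False):
    """Render one result either as JSON or as a human-readable line."""
    record = {"post_id": post_id, "check": check, "outcome": outcome}
    if as_json:
        # sort_keys keeps the output stable, which helps when diffing logs
        return json.dumps(record, sort_keys=True)
    return f"post {post_id}: {check}: {outcome}"
```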

Refactored hit rate wrong

The restored "x/y over timespan" calculation is wrong: it counts x and y per individual feedback, not per post!
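
To illustrate the difference, a sketch under stated assumptions: feedback arrives as `(post_id, is_true_positive)` pairs, and a post counts as a hit if any of its feedback is a true positive. Both the data shape and that rule are assumptions, not necessarily what Halflife does.

```python
from collections import defaultdict


def hit_rate_per_feedback(feedbacks):
    """The buggy variant: every feedback counts separately."""
    feedbacks = list(feedbacks)
    hits = sum(1 for _, is_tp in feedbacks if is_tp)
    return hits, len(feedbacks)


def hit_rate_per_post(feedbacks):
    """The intended variant: group feedback by post first."""
    by_post = defaultdict(list)
    for post_id, is_tp in feedbacks:
        by_post[post_id].append(is_tp)
    # Assumption: one true-positive feedback is enough to count the post.
    hits = sum(1 for verdicts in by_post.values() if any(verdicts))
    return hits, len(by_post)
```

With three true-positive feedbacks on post 1 and one false-positive feedback on post 2, the per-feedback count gives 3/4 while the per-post count gives 1/2.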

Weird disk outages around end of month

A month ago I had to restart Halflife a number of times after the month had rolled over, and now I'm seeing the same thing again.

In brief, it seems to eat up all the disk space, and requires a number of restarts before the space is properly reclaimed.

This could be a weird artifact of the Docker deployment model and/or how it works on the EC2 instance where I'm running this. Probably the deployment model should be reworked altogether.

Halflife skips many posts

When two or more posts are reported in rapid succession, the second and subsequent posts are sometimes missed by Halflife.

Searching for tagged/skipped shows many such incidents.

I originally thought this was a problem with Metasmoke, but in fact, it's probably a bug in Halflife. It should pick up the websocket again quickly after a message is received in order to keep on listening properly.

Moving to async processing would probably fix this. (#2 #1)
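
A sketch of how async processing could avoid the gap, using only `asyncio` from the standard library: the receiver hands each message to a queue and goes straight back to listening, while a separate worker does the slow analysis. `fake_socket` is a stand-in for the real websocket, and all names here are hypothetical.

```python
import asyncio


async def fake_socket(messages):
    """Stand-in for the websocket: yields messages in rapid succession."""
    for message in messages:
        yield message


async def receiver(incoming, queue):
    """Hand each message off immediately, then resume listening."""
    async for message in incoming:
        queue.put_nowait(message)
    queue.put_nowait(None)  # sentinel: the source has closed


async def worker(queue, handled):
    """Process queued messages one at a time, however long each takes."""
    while True:
        message = await queue.get()
        if message is None:
            break
        await asyncio.sleep(0.01)  # stand-in for slow analysis
        handled.append(message)


async def main(messages):
    queue = asyncio.Queue()
    handled = []
    await asyncio.gather(
        receiver(fake_socket(messages), queue),
        worker(queue, handled),
    )
    return handled


def demo():
    return asyncio.run(main(["post 1", "post 2", "post 3"]))
```

Because receiving and analysis are decoupled, a burst of reports queues up instead of arriving while nobody is listening.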

Restore domain and URL tail search

Regex search over the Metasmoke API was turned off for stability and performance reasons. (See #6.)

I have a fix in the works for domains, but URL tails (mostly useful for updating drug brands) don't look like they are going to be easy.

Chat is very slow

Frequently when a post is reported, you would like to see very quickly the first results from the analysis: is the domain name blacklisted? did the URL tail match any keywords? and only then proceed to report the more-detailed analysis.

The chat interface throttles sending to one message per second. Some messages could be collected into multi-line chat messages (but then you cannot use formatting -- no bold, italics, links etc) to make reports appear quicker.
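
Collecting lines into multi-line messages might look like the sketch below. The 500-character limit is an assumed placeholder, not the chat system's real limit, and `batch_lines` is a hypothetical helper.

```python
def batch_lines(lines, max_chars=500):
    """Group report lines into as few multi-line messages as possible."""
    batches, current = [], ""
    for line in lines:
        candidate = line if not current else current + "\n" + line
        if current and len(candidate) > max_chars:
            # Adding this line would overflow; close the current batch.
            batches.append(current)
            current = line
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches
```

Each batch then costs one throttled send instead of one per line, at the price of losing chat formatting within the batch.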

Ultimately, I'm thinking the back-end queries should be done using some sort of async framework, so that multiple queries can be pending at the same time and you don't have to wait for the queries to execute serially before you get the result from the one you actually care about for this particular post.
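
As a rough sketch of that idea with `asyncio`: start all queries at once and report each result as soon as it completes. The query names and delays are invented, and `asyncio.sleep` stands in for real network calls.

```python
import asyncio


async def query(name, delay):
    """Stand-in for a back-end query that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # All three queries are pending at the same time.
    tasks = [
        asyncio.create_task(query("whois", 0.10)),
        asyncio.create_task(query("blacklist", 0.01)),
        asyncio.create_task(query("dns", 0.05)),
    ]
    reported = []
    # as_completed yields results in completion order, not start order.
    for finished in asyncio.as_completed(tasks):
        reported.append(await finished)
    return reported


def demo():
    return asyncio.run(main())
```

The fast blacklist check is reported first even though the slow whois query was started before it.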

Feature request: watched IP address update logic

sevenmentor.com in particular is notorious for switching IP addresses like others change underwear. It would be nice if Halflife could periodically check their IP address, and perhaps even automatically add the new address to the IP watchlist when it changes.

(My practice has been to keep the old addresses on the watch list still. If we find that this is problematic, maybe also eventually figure out a way to remove old ones.)
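
A minimal sketch of the update step, assuming the watchlist can be treated as a set of IPv4 address strings (the real storage format is not described here). `update_watchlist` is a hypothetical name; `socket.gethostbyname_ex` is the standard-library call for resolving a host name to its IPv4 addresses.

```python
import socket


def resolve(domain):
    """Return the set of IPv4 addresses the domain currently resolves to."""
    _, _, addresses = socket.gethostbyname_ex(domain)
    return set(addresses)


def update_watchlist(domain, watchlist, resolver=resolve):
    """Add newly seen addresses to the watchlist and return them."""
    new = resolver(domain) - watchlist
    # Old addresses are deliberately kept, matching the practice noted above.
    watchlist |= new
    return new
```

A periodic job could call `update_watchlist` for each watched domain and report whenever the returned set is non-empty. The `resolver` parameter exists so that the lookup can be replaced in tests.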
