cisagov / malcolm

Malcolm is a powerful, easily deployable network traffic analysis tool suite for full packet capture artifacts (PCAP files), Zeek logs and Suricata alerts.

Home Page: https://cisagov.github.io/Malcolm/

License: Other

Dockerfile 5.63% Shell 17.70% CSS 16.44% HTML 1.55% PHP 0.23% Python 43.45% JavaScript 4.65% Zeek 6.27% Ruby 2.72% Vim Script 0.01% PowerShell 1.08% Makefile 0.10% Perl 0.18%
network-security pcap security arkime cybersecurity infosec network-traffic-analysis networksecurity opensearch opensearch-dashboards

malcolm's Introduction

Malcolm

Malcolm is a powerful network traffic analysis tool suite designed with the following goals in mind:

  • Easy to use – Malcolm accepts network traffic data in the form of full packet capture (PCAP) files and Zeek logs. These artifacts can be uploaded via a simple browser-based interface or captured live and forwarded to Malcolm using lightweight forwarders. In either case, the data is automatically normalized, enriched, and correlated for analysis.
  • Powerful traffic analysis – Visibility into network communications is provided through two intuitive interfaces: OpenSearch Dashboards, a flexible data visualization plugin with dozens of prebuilt dashboards providing an at-a-glance overview of network protocols; and Arkime (formerly Moloch), a powerful tool for finding and identifying the network sessions comprising suspected security incidents.
  • Streamlined deployment – Malcolm operates as a cluster of Docker containers – isolated sandboxes that each serve a dedicated function of the system. This Docker-based deployment model, combined with a few simple scripts for setup and run-time management, makes Malcolm suitable to be deployed quickly across a variety of platforms and use cases; whether it be for long-term deployment on a Linux server in a security operations center (SOC) or for incident response on a Macbook for an individual engagement.
  • Secure communications – All communications with Malcolm, both from the user interface and from remote log forwarders, are secured with industry standard encryption protocols.
  • Permissive license – Malcolm comprises several widely used open-source tools, making it an attractive alternative to security solutions requiring paid licenses.
  • Expanding control systems visibility – While Malcolm is great for general-purpose network traffic analysis, its creators see a particular need in the community for tools providing insight into protocols used in industrial control systems (ICS) environments. Ongoing Malcolm development will aim to provide additional parsers for common ICS protocols.

Although all the open-source tools that make up Malcolm are already available and in general use, Malcolm provides a framework of interconnectivity that makes it greater than the sum of its parts.

In short, Malcolm provides an easily deployable network analysis tool suite for full PCAP files and Zeek logs. While internet access is required to build Malcolm, it is not required at runtime.
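The helper scripts mentioned throughout this page (install.py, auth_setup.sh, start.sh, stop.sh, wipe.sh) drive that deployment model. A minimal sketch of a typical workflow, assuming the scripts live under scripts/ in a checked-out copy of the repository (paths and prompts may differ between releases):

    # Clone the repository and run the installer, which checks prerequisites
    # and can tune Docker/system settings interactively.
    git clone https://github.com/cisagov/Malcolm.git
    cd Malcolm
    ./scripts/install.py

    # Configure authentication for the web interfaces and forwarders,
    # then bring the container cluster up.
    ./scripts/auth_setup.sh
    ./scripts/start.sh

    # Stop the containers when finished; wipe.sh additionally clears stored data.
    ./scripts/stop.sh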

Documentation

See the Malcolm documentation.

Share your feedback

You can help steer Malcolm's development by sharing your ideas and feedback. Please take a few minutes to complete this survey ↪ (hosted on Google Forms) so we can understand the members of the Malcolm community and their use cases for this tool.

Copyright and License

Malcolm is Copyright 2024 Battelle Energy Alliance, LLC, and is developed and released through the cooperation of the Cybersecurity and Infrastructure Security Agency of the U.S. Department of Homeland Security.

Malcolm is licensed under the Apache License, version 2.0. See LICENSE.txt for the terms of its release.

Contact information of author(s):

[email protected]

malcolm's People

Contributors

0xflotus, 0xshaft03, aglad-eng, aut0exec, cclauss, dependabot[bot], jadams, jarscott1, kkvarfordt, lgtm-migrator, melaniepierce, mmguero, n8hacks, njinx, obsidianknife, piercema, schallee, supcom234, theenawman


malcolm's Issues

Could not get this to work with Docker Desktop on Mac

Port forwarding from the Mac to the container is working, as I can see the default page with the Readme.htm; the services say they started, but I cannot navigate to the Moloch web interface or any of the others.
I rebuilt the container with different variables and changed the port forwarding to no avail. What am I missing? After the container is running, are there manual steps to run before the Moloch web services are available? The README file seems to make assumptions that apply to Ubuntu rather than Docker.

default file creation permissions may cause filebeat not to start up

User _imp0ster on Reddit discovered this bug, which happens on some systems:

        $ docker-compose logs -f filebeat
        ...
        filebeat_1       | 2019-06-12 17:56:56,442 INFO success: filebeat entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
        filebeat_1       | 2019-06-12 17:56:56,444 INFO exited: filebeat (exit status 1; not expected)
        filebeat_1       | 2019-06-12 17:56:57,448 INFO spawned: 'filebeat' with pid 390
        filebeat_1       | Exiting: error loading config file: config file ("filebeat.yml") can only be writable by the owner but the permissions are "-rw-rw-r--" (to fix the permissions use: 'chmod go-w /usr/share/filebeat/filebeat.yml')
        filebeat_1       | 2019-06-12 17:56:57,491 INFO success: filebeat entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
        filebeat_1       | 2019-06-12 17:56:57,500 INFO exited: filebeat (exit status 1; not expected)
        filebeat_1       | 2019-06-12 17:56:58,504 INFO spawned: 'filebeat' with pid 398
        filebeat_1       | Exiting: error loading config file: config file ("filebeat.yml") can only be writable by the owner but the permissions are "-rw-rw-r--" (to fix the permissions use: 'chmod go-w /usr/share/filebeat/filebeat.yml')
        filebeat_1       | 2019-06-12 17:56:58,629 INFO success: filebeat entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
        filebeat_1       | 2019-06-12 17:56:58,634 INFO exited: filebeat (exit status 1; not expected)
        filebeat_1       | 2019-06-12 17:56:59,638 INFO spawned: 'filebeat' with pid 406
        filebeat_1       | Exiting: error loading config file: config file ("filebeat.yml") can only be writable by the owner but the permissions are "-rw-rw-r--" (to fix the permissions use: 'chmod go-w /usr/share/filebeat/filebeat.yml')
        filebeat_1       | 2019-06-12 17:56:59,695 INFO success: filebeat entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
        filebeat_1       | 2019-06-12 17:56:59,698 INFO exited: filebeat (exit status 1; not expected)
        filebeat_1       | 2019-06-12 17:57:00,733 INFO spawned: 'filebeat' with pid 416
        filebeat_1       | Exiting: error loading config file: config file ("filebeat.yml") can only be writable by the owner but the permissions are "-rw-rw-r--" (to fix the permissions use: 'chmod go-w /usr/share/filebeat/filebeat.yml')
        filebeat_1       | 2019-06-12 17:57:00,863 INFO success: filebeat entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
        filebeat_1       | 2019-06-12 17:57:00,864 INFO exited: filebeat (exit status 1; not expected)
        filebeat_1       | 2019-06-12 17:57:01,868 INFO spawned: 'filebeat' with pid 440
        filebeat_1       | Exiting: error loading config file: config file ("filebeat.yml") can only be writable by the owner but the permissions are "-rw-rw-r--" (to fix the permissions use: 'chmod go-w /usr/share/filebeat/filebeat.yml')
        filebeat_1       | 2019-06-12 17:57:01,900 INFO success: filebeat entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
        filebeat_1       | 2019-06-12 17:57:01,905 INFO exited: filebeat (exit status 1; not expected)
        filebeat_1       | 2019-06-12 17:57:02,908 INFO spawned: 'filebeat' with pid 451

I need to fix the permissions on filebeat.yml in the installer and/or startup script to make sure this doesn't happen.

As a workaround:

  1. ./scripts/stop.sh
  2. chmod go-w ./filebeat/filebeat.yml
  3. start things back up
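A hedged sketch of the kind of guard the installer or startup script could add, assuming filebeat.yml sits at ./filebeat/filebeat.yml relative to the Malcolm directory as in the workaround above:

    #!/usr/bin/env bash
    # Filebeat refuses to load a config file that is group- or world-writable,
    # so strip those bits before bringing the stack up.
    FILEBEAT_YML="./filebeat/filebeat.yml"

    if [ -f "$FILEBEAT_YML" ]; then
      chmod go-w "$FILEBEAT_YML"
    fi

    docker-compose up -d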

add community_id to zeek plugins for conn.log

Add community_id to the Zeek plugins for conn.log to be able to correlate Zeek's conn.log with Moloch sessions.

See:
https://github.com/corelight/bro-community-id
https://github.com/corelight/community-id-spec

To quote:

When processing flow data from a variety of monitoring applications (such as Zeek and Suricata), it's often desirable to pivot quickly from one dataset to another. While the required flow tuple information is usually present in the datasets, the details of such "joins" can be tedious, particularly in corner cases. This spec describes "Community ID" flow hashing, standardizing the production of a string identifier representing a given network flow, to reduce the pivot to a simple string comparison.

This Zeek package provides support for "community ID" flow hashing, a standardized way of labeling traffic flows in network monitors. When loaded, the package adds a community_id string field to conn.log. This is work in progress between the Zeek and Suricata communities, to enable correlation of flows in the outputs of both tools. Feedback is very welcome, also from users/developers of other monitoring software.
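A hedged sketch of pulling the package in with the Zeek package manager (zkg, formerly bro-pkg); the package name is assumed from the corelight repository linked above and may differ by Zeek/Bro version:

    # Install the Community ID package; when loaded it adds a community_id
    # string field to conn.log.
    zkg install corelight/zeek-community-id

    # Confirm the package was installed and is loaded by the site policy.
    zkg list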

move common logstash enrichments to separate pipeline

Right now common Logstash enrichments like GeoIP lookup, MAC OUI lookup, etc. are done in the middle of the Zeek processing.

If we add other pipelines (suricata, for example) we don't want to have to re-invent the wheel. I should move those to another post-parsing pipeline.

turning off AUTO_TAG feature disables tagging altogether

I ran across this the other day testing:

x-common-upload-variables: &common-upload-variables
  AUTO_TAG : 'false'

Setting AUTO_TAG to false in docker-compose.yml not only disables auto-tagging, but also disables tags manually added via the upload web interface. That's not ideal.

automated testing

Currently Malcolm is tested manually by me on a per-change basis. As the project matures, I need to look into implementing some kind of test framework that can be run overnight to ensure builds and functionality don't break without my knowing it.

Elasticsearch/Java Permissions Error

Hello, I'm getting the following errors from Elasticsearch before it exits.

# hostnamectl
  Operating System: CentOS Linux 7 (Core)
  CPE OS Name: cpe:/o:centos:centos:7
  Kernel: Linux 3.10.0-957.21.3.el7.x86_64
  Architecture: x86-64
# docker-compose logs -f elasticsearch                                                                                                                                                                                 
Attaching to malcolm_elasticsearch_1
elasticsearch_1  | OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
elasticsearch_1  | OpenJDK 64-Bit Server VM warning: UseAVX=2 is not supported on this CPU, setting it to UseAVX=1
elasticsearch_1  | [2019-06-24T19:28:15,079][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [unknown] uncaught exception in thread [main]
elasticsearch_1  | org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: Failed to create node environment
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:163) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:116) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  | Caused by: java.lang.IllegalStateException: Failed to create node environment
elasticsearch_1  |      at org.elasticsearch.node.Node.<init>(Node.java:299) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.node.Node.<init>(Node.java:266) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:212) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:212) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      ... 6 more
elasticsearch_1  | Caused by: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes
elasticsearch_1  |      at sun.nio.fs.UnixException.translateToIOException(UnixException.java:90) ~[?:?]
elasticsearch_1  |      at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
elasticsearch_1  |      at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) ~[?:?]
elasticsearch_1  |      at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:389) ~[?:?]
elasticsearch_1  |      at java.nio.file.Files.createDirectory(Files.java:692) ~[?:?]
elasticsearch_1  |      at java.nio.file.Files.createAndCheckIsDirectory(Files.java:799) ~[?:?]
elasticsearch_1  |      at java.nio.file.Files.createDirectories(Files.java:785) ~[?:?]
elasticsearch_1  |      at org.elasticsearch.env.NodeEnvironment.lambda$new$0(NodeEnvironment.java:273) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:206) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:270) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.node.Node.<init>(Node.java:296) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.node.Node.<init>(Node.java:266) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:212) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:212) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      ... 6 more
elasticsearch_1  | OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
elasticsearch_1  | OpenJDK 64-Bit Server VM warning: UseAVX=2 is not supported on this CPU, setting it to UseAVX=1
elasticsearch_1  | [2019-06-24T19:54:24,725][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [unknown] uncaught exception in thread [main]
elasticsearch_1  | org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: Failed to create node environment
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:163) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:116) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  | Caused by: java.lang.IllegalStateException: Failed to create node environment
elasticsearch_1  |      at org.elasticsearch.node.Node.<init>(Node.java:299) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.node.Node.<init>(Node.java:266) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:212) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:212) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      ... 6 more
elasticsearch_1  | Caused by: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes
elasticsearch_1  |      at sun.nio.fs.UnixException.translateToIOException(UnixException.java:90) ~[?:?]
elasticsearch_1  |      at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
elasticsearch_1  |      at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) ~[?:?]
elasticsearch_1  |      at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:389) ~[?:?]
elasticsearch_1  |      at java.nio.file.Files.createDirectory(Files.java:692) ~[?:?]
elasticsearch_1  |      at java.nio.file.Files.createAndCheckIsDirectory(Files.java:799) ~[?:?]
elasticsearch_1  |      at java.nio.file.Files.createDirectories(Files.java:785) ~[?:?]
elasticsearch_1  |      at org.elasticsearch.env.NodeEnvironment.lambda$new$0(NodeEnvironment.java:273) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:206) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:270) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.node.Node.<init>(Node.java:296) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.node.Node.<init>(Node.java:266) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:212) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:212) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-6.8.0.jar:6.8.0]
elasticsearch_1  |      ... 6 more
malcolm_elasticsearch_1 exited with code 1
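The AccessDeniedException on /usr/share/elasticsearch/data/nodes usually points at host-side ownership or SELinux labeling on the bind-mounted data directory rather than anything inside the container. A hedged set of checks, assuming the data volume is bind-mounted from ./elasticsearch in the Malcolm directory (the actual mount path may differ):

    # The elasticsearch container typically runs as UID/GID 1000; make sure
    # the bind-mounted data directory is writable by that user.
    sudo chown -R 1000:1000 ./elasticsearch

    # On CentOS/RHEL with SELinux enforcing, the directory may also need a
    # container-accessible file context (or the :z/:Z volume mount flag).
    getenforce
    sudo chcon -Rt svirt_sandbox_file_t ./elasticsearch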

"to check" queue for carved file can fill up with duplicate files

I need to provide steps to reproduce, but there is a scenario where:

  • the currently checking queue is full
  • the "to check" queue gets filled up with a whole bunch of files with the same SHA

What happens then is that none of these ever clear out; they essentially sit in the queue forever, since each worker thinks someone else is checking the duplicate file but nobody ever does.

Button for wiping Malcolm data "on the fly"

There is scripts/wipe.sh but it would be nice to have a GUI-ish way for wiping data on the fly. Maybe tie this in to Moloch's ES Indices tab, implement something to multi-select indices or pattern-select indices and delete everything with one click?
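Until something GUI-ish exists, a hedged sketch of doing it against the Elasticsearch REST API directly (assuming Elasticsearch is reachable on localhost:9200 as in the default deployment; the sessions2-* pattern is an assumption about Moloch's index naming and should be adjusted to the actual indices):

    # List the indices, then delete everything matching a pattern in one call.
    curl -s 'http://localhost:9200/_cat/indices?v'
    curl -s -XDELETE 'http://localhost:9200/sessions2-*'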

Elasticsearch index curation

Create a docker container to handle periodic calling of curator and allow the user to define rules for index filtering and actions.

Currently it's possible (probably) that, without manual intervention, the disk could eventually fill up with Elasticsearch indices. To use Malcolm as a long-term solution, automated steps need to be put in place to handle this.
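A hedged sketch of what a periodic run inside such a container could look like; the YAML file names are placeholders, with curator.yml holding the connection settings and delete_indices.yml the user-defined filters and actions:

    # Install curator and invoke it (e.g. daily from cron inside the container).
    pip install elasticsearch-curator
    curator --config /etc/curator/curator.yml /etc/curator/delete_indices.yml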

tagging and views for routable (public) IP addresses is incomplete

In Moloch, Malcolm has the view "Public IP Addresses" and, using similar logic, the Logstash parser for Zeek logs (11_zeek_logs.conf) marks things internal_source/external_source based on IP address.

However, my list of IP addresses is incomplete. See https://en.wikipedia.org/wiki/Reserved_IP_addresses

Here is the (as far as I know) complete list for IPv4:

0.0.0.0/8
10.0.0.0/8
100.64.0.0/10
127.0.0.0/8
169.254.0.0/16
172.16.0.0/12
192.0.0.0/24 
192.0.2.0/24
192.88.99.0/24
192.168.0.0/16
198.18.0.0/15
198.51.100.0/24
203.0.113.0/24
224.0.0.0/4
232.0.0.0/8
233.0.0.0/8
234.0.0.0/8
239.0.0.0/8

For the Moloch view creation, this boils down to the following (from user_settings.json):

    "views": {
      "Public IP Addresses": {
        "expression": "(country.dst == EXISTS!) || (country.src == EXISTS!) || (ip.dst == EXISTS! && ip.dst != 0.0.0.0/8 && ip.dst != 10.0.0.0/8 && ip.dst != 100.64.0.0/10 && ip.dst != 127.0.0.0/8 && ip.dst != 169.254.0.0/16 && ip.dst != 172.16.0.0/12 && ip.dst != 192.0.0.0/24  && ip.dst != 192.0.2.0/24 && ip.dst != 192.88.99.0/24 && ip.dst != 192.168.0.0/16 && ip.dst != 198.18.0.0/15 && ip.dst != 198.51.100.0/24 && ip.dst != 203.0.113.0/24 && ip.dst != 224.0.0.0/4 && ip.dst != 232.0.0.0/8 && ip.dst != 233.0.0.0/8 && ip.dst != 234.0.0.0/8 && ip.dst != 239.0.0.0/8 && ip.dst != 240.0.0.0/4 && ip.dst != 255.255.255.255 && ip.dst != :: && ip.dst != ::1 && ip.dst != ff00::/8 && ip.dst != fe80::/10 && ip.dst != fc00::/7 && ip.dst != fd00::/8) || (ip.src == EXISTS! && ip.src != 0.0.0.0/8 && ip.src != 10.0.0.0/8 && ip.src != 100.64.0.0/10 && ip.src != 127.0.0.0/8 && ip.src != 169.254.0.0/16 && ip.src != 172.16.0.0/12 && ip.src != 192.0.0.0/24  && ip.src != 192.0.2.0/24 && ip.src != 192.88.99.0/24 && ip.src != 192.168.0.0/16 && ip.src != 198.18.0.0/15 && ip.src != 198.51.100.0/24 && ip.src != 203.0.113.0/24 && ip.src != 224.0.0.0/4 && ip.src != 232.0.0.0/8 && ip.src != 233.0.0.0/8 && ip.src != 234.0.0.0/8 && ip.src != 239.0.0.0/8 && ip.src != 240.0.0.0/4 && ip.src != 255.255.255.255 && ip.src != :: && ip.src != ::1 && ip.src != ff00::/8 && ip.src != fe80::/10 && ip.src != fc00::/7 && ip.src != fd00::/8)"
      },

In logstash I used the tool rgxg:

$ for i in 0.0.0.0/8 10.0.0.0/8 100.64.0.0/10 127.0.0.0/8 169.254.0.0/16 172.16.0.0/12 192.0.0.0/24  192.0.2.0/24 192.88.99.0/24 192.168.0.0/16 198.18.0.0/15 198.51.100.0/24 203.0.113.0/24 224.0.0.0/4 232.0.0.0/8 233.0.0.0/8 234.0.0.0/8 239.0.0.0/8 240.0.0.0/4; do
>   rgxg cidr -N "$i"
> done
0(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}
10(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}
100\.(12[0-7]|1[01][0-9]|[7-9][0-9]|6[4-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}
127(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}
169\.254(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}
172\.(3[01]|2[0-9]|1[6-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}
192\.0\.0\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])
192\.0\.2\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])
192\.88\.99\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])
192\.168(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}
198\.1[89](\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}
198\.51\.100\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])
203\.0\.113\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])
(23[0-9]|22[4-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}
232(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}
233(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}
234(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}
239(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}
(25[0-5]|24[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}

to generate this logstash code:

    if (([srcIp] =~ "1?0(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}") or
        ([srcIp] =~ "192\.168(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}") or
        ([srcIp] =~ "172\.(3[01]|2[0-9]|1[6-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}") or
        ([srcIp] =~ "127(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}") or
        ([srcIp] =~ "(23[0-9]|22[4-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}") or
        ([srcIp] =~ "23[2-4](\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}") or
        ([srcIp] =~ "239(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}") or
        ([srcIp] =~ "(25[0-5]|24[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}") or
        ([srcIp] =~ "100\.(12[0-7]|1[01][0-9]|[7-9][0-9]|6[4-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}") or
        ([srcIp] =~ "169\.254(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}") or
        ([srcIp] =~ "192\.0\.[02]\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])") or
        ([srcIp] =~ "192\.88\.99\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])") or
        ([srcIp] =~ "198\.1[89](\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}") or
        ([srcIp] =~ "198\.51\.100\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])") or
        ([srcIp] =~ "203\.0\.113\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])") or
        ([srcIp] == "ff02::fb") or
        ([srcIp] == "fe80::20c:29ff:fe19:f7d") or
        ([srcIp] == "::1")) {

For now I haven't tackled IPv6 yet.

I need to make sure that this not only works, but also that its performance isn't terrible.
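A quick way to sanity-check the generated expressions before worrying about logstash performance is to run sample addresses through grep (anchors added here for exact-match testing):

    # Addresses inside 198.18.0.0/15 should match; neighbors should not.
    pattern='^198\.1[89](\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2}$'
    echo '198.19.4.5' | grep -E "$pattern"    # prints the address (match)
    echo '198.20.4.5' | grep -E "$pattern"    # prints nothing (no match)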

rework kibana dashboards to use moloch fields where possible

Some data in the database exists in two fields: for example, zeek_gquic.user_agent and quic.useragent.

The Kibana dashboards right now generally use the Zeek versions. It would be better to rework them to use the Moloch fields, as that would allow more data to be visualized in Kibana (Moloch sessions vs. just Zeek logs).

elasticsearch Name or service not known using install and README

I try to upload to the Malcolm tool, but I keep getting this:
logstash_1_3978e7e78829 | [2019-11-04T11:23:36,470][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://elasticsearch:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::ResolutionFailure] elasticsearch: Name or service not known"}
logstash_1_3978e7e78829 | [2019-11-04T11:23:41,475][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://elasticsearch:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::ResolutionFailure] elasticsearch"}
logstash_1_3978e7e78829 | [2019-11-04T11:23:46,483][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://elasticsearch:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::ResolutionFailure] elasticsearch: Name or service not known"}
logstash_1_3978e7e78829 | [2019-11-04T11:23:51,489][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://elasticsearch:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::ResolutionFailure] elasticsearch"}
logstash_1_3978e7e78829 | [2019-11-04T11:23:56,499][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://elasticsearch:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::ResolutionFailure] elasticsearch: Name or service not known"}
logstash_1_3978e7e78829 | [2019-11-04T11:24:01,503][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://elasticsearch:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::ResolutionFailure] elasticsearch"}
logstash_1_3978e7e78829 | [2019-11-04T11:24:06,518][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://elasticsearch:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::ResolutionFailure] elasticsearch: Name or service not known"}
logstash_1_3978e7e78829 | [2019-11-04T11:24:11,535][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"http://elasticsearch:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://elasticsearch:9200/][Manticore::ResolutionFailure] elasticsearch"}

netstat -nlp ip-172-31-39-19: Mon Nov 4 11:24:54 2019
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:8022 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp6 0 0 :::488 :::* LISTEN -
tcp6 0 0 :::9200 :::* LISTEN -
tcp6 0 0 :::3030 :::* LISTEN -
tcp6 0 0 :::22 :::* LISTEN -
tcp6 0 0 :::443 :::* LISTEN -
tcp6 0 0 :::8443 :::* LISTEN -
tcp6 0 0 :::28991 :::* LISTEN -
tcp6 0 0 :::9600 :::* LISTEN -
tcp6 0 0 :::5601 :::* LISTEN -
udp 0 0 127.0.0.53:53 0.0.0.0:* -
raw6 0 0 :::58 :::* 7 -
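The "Name or service not known" failures mean the logstash container cannot resolve the elasticsearch service name over Docker's embedded DNS on the compose network. A hedged set of checks (service names taken from the log prefixes above):

    # Confirm the elasticsearch service is actually running and healthy.
    docker-compose ps
    docker-compose logs --tail=50 elasticsearch

    # Test name resolution from inside the logstash container
    # (getent is usually available in the logstash image).
    docker-compose exec logstash getent hosts elasticsearch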

Improve documentation on how to use another ES cluster rather than local docker

In some cases it will make more sense for people to use their own elasticsearch deployment rather than Malcolm's dockerized one.

This should be easy to configure, but I should improve the documentation to outline how to do that.

Also, this may require changes to docker-compose.yml (or maybe install.py can tweak docker-compose.yml to do it).

netsniff-ng may be HUP'ed incorrectly if the PCAP processing directory is backlogged

This is very similar to issue #34.

The pcap-capture/scripts/netsniff-roll.sh script which is executed in the context of the pcap-capture container is responsible for making sure that netsniff-ng is HUP'ed if the PCAP file it is writing to exceeds the value for the PCAP_ROTATE_MINUTES environment variable (the PCAP_ROTATE_MEGABYTES value is actually handled by netsniff-ng itself with the -F argument).

However, if the pcap/upload directory is very full, the same files will be examined multiple times. This causes the script to HUP netsniff-ng every few seconds.

Possible fixes:

  • capture to a different directory than the uploads (does slightly complicate things)
  • only HUP netsniff-ng for the most recent PCAP file per-process, rather than all of them (probably the easiest solution?)
  • only HUP netsniff-ng for PCAP files that are newer than the last time we hupped that process (also probably pretty easy)
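A rough sketch of the last option, only HUP'ing when a PCAP newer than the last one acted on appears (illustrative only, not the actual netsniff-roll.sh logic; the capture directory path is assumed):

    # Remember the newest PCAP we last signalled on via a marker file's mtime,
    # and only HUP netsniff-ng when a strictly newer capture file shows up.
    CAPTURE_DIR="/pcap/upload"
    MARKER="/tmp/netsniff-last-hup"

    newest="$(ls -t "$CAPTURE_DIR"/*.pcap 2>/dev/null | head -n 1)"
    if [ -n "$newest" ] && { [ ! -e "$MARKER" ] || [ "$newest" -nt "$MARKER" ]; }; then
      pkill -HUP -x netsniff-ng
      touch -r "$newest" "$MARKER"
    fi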

502 Bad Gateways - nginx upstream error

Hi,

I'm unable to connect to any of the interfaces.

I'm getting the following error:

nginx-proxy_1 | nginx: [emerg] host not found in upstream "moloch:8005" in /etc/nginx/nginx.conf:22
nginx-proxy_1 | nginx.1 | 2019/07/17 13:41:39 [error] 41#41: *3 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.1.132, server: , request: "GET / HTTP/1.1", upstream: "http://172.18.0.4:8005/", host: "192.168.1.85"

I believe this may have something to do with the fact that I'm not actually accessing the interface on the localhost, but from a bridged host elsewhere.

I've tried accessing /etc/nginx/nginx.conf, but the directory doesn't exist?

Thanks
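For what it's worth, the nginx.conf referenced in the error lives inside the nginx-proxy container's filesystem, not on the host, which is why /etc/nginx/nginx.conf isn't found there. A hedged way to inspect it and the moloch upstream it points at (service names taken from the log lines above):

    # The config path from the error is inside the container, not on the host.
    docker-compose exec nginx-proxy cat /etc/nginx/nginx.conf

    # "host not found in upstream moloch:8005" usually means the moloch
    # container isn't up (or not yet up) on the compose network.
    docker-compose ps
    docker-compose logs --tail=50 moloch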

subnet-level view in Moloch Connections

Tracking arkime/arkime#881

For large networks with many hosts, the connections view can get cluttered. It would be useful to be able to specify a CIDR mask, something like "Src IP/24", "Dst IP/16", etc. for the "Field for the source/destination nodes" in the connections view. Then, the connections view could display which subnets are talking to each other, rather than just individual hosts.

There would have to be some thought about how IPv6 would fit into it. I guess you could just scale the subnet mask up from 32 to 128 bits (e.g., if they did Src IP/16 that would be equivalent to /64 for IPv6 addresses?).

This can be done (manually) with subnet/host mapping via cidr-map.txt and host-map.txt, but it would be nice to be able to do it ad-hoc.

Can't Get Docker Compose to Pull

I run the docker compose cmd in the malcolm directory (utilizing the github repo) and I get the following error:
ERROR: Version in "./docker-compose.yml" is unsupported. You might be seeing this error because you're using the wrong Compose file version. Either specify a supported version (e.g. "2.2" or "3.3") and place your service definitions under the services key, or omit the version key and place your service definitions at the root of the file to use version 1. For more on the Compose file format versions, see https://docs.docker.com/compose/compose-file/

When that failed I went ahead and just commented out the version in the docker-compose.yml file and ran the pull again and got this error:

ERROR: The Compose file './docker-compose.yml' is invalid because: Unsupported config option for x-common-upload-variables: 'AUTO_TAG' Unsupported config option for x-common-beats-variables: 'BEATS_SSL' Unsupported config option for x-logstash-variables: 'ES_EXTERNAL_SSL_CERTIFICATE_VERIFICATION' Unsupported config option for x-zeek-file-extraction-variables: 'EXTRACTED_FILE_START_SLEEP' Unsupported config option for x-moloch-variables: 'INITIALIZEDB' Unsupported config option for x-kibana-variables: 'KIBANA_OFFLINE_REGION_MAPS' Unsupported config option for x-pcap-capture-variables: 'PCAP_FILTER' Unsupported config option for services: 'moloch'

Please help!
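Both errors are typical of a docker-compose binary that is too old for the Compose file version and the x-* extension fields used in Malcolm's docker-compose.yml. A hedged check-and-upgrade sketch (the right upgrade path depends on how docker-compose was originally installed):

    # Check the installed version; the x-* extension fields need a
    # reasonably recent docker-compose.
    docker-compose version

    # One common upgrade path is pip; distribution packages or the binaries
    # on the docker/compose GitHub releases page also work.
    sudo python3 -m pip install --upgrade docker-compose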

netsniff-ng may be HUP'ed incorrectly if pre-existing PCAP files exist

The pcap-capture/scripts/netsniff-roll.sh script which is executed in the context of the pcap-capture container is responsible for making sure that netsniff-ng is HUP'ed if the PCAP file it is writing to exceeds the value for the PCAP_ROTATE_MINUTES environment variable (the PCAP_ROTATE_MEGABYTES value is actually handled by netsniff-ng itself with the -F argument).

However, there is a bug in netsniff-roll.sh: if previous pcap files matching the file naming schema exist in the pcap upload directory, they will be detected and cause netsniff-ng to continuously roll.

I need to fix netsniff-roll.sh to only HUP netsniff-ng based on PCAP files that were created after the netsniff-ng process we are examining started.

Kibana Dashboards

Hello,

Interesting project you've got here! :)

The Kibana dashboards look very similar to ours in Security Onion. Is that where they came from? If so, may I ask that you provide proper acknowledgement?

Thanks in advance for your consideration!

Moloch Packet capture fall behind

I have set up Malcolm in a test lab on my home server, and I seem to be having a perpetual issue with the Moloch PCAP data being around 12 hours behind the Zeek events. I installed following the Ubuntu install guide. I have looked into disk/performance issues and don't see anything specific, but would like to get some thoughts if possible.

network map diffs over time

The ability to specify a baseline time range from the Connections view and/or API in Moloch, then show changes/additions to those connections relative to that baseline. This would highlight changes to the logical network diagram and could be used to alert on new devices appearing in the network.

INITIALIZEDB environment variable and restart value in docker-compose.yml could cause moloch container to wipe elasticsearch database on every reboot

Currently the behavior of the moloch container specified in the docker-compose.yml file is that if the INITIALIZEDB environment variable is set, that is a flag to wipe the database. Upon running "start.sh" this value gets changed in the .yml file to "false", and upon running "wipe.sh" the value gets set to "true".

However, the behavior of the restart value is such that docker restarts the container exactly as it was run previously, with the same initial environment variables. So, this could happen.

  1. user runs "wipe.sh", yml file gets INITIALIZEDB=true
  2. user runs "start.sh": moloch container starts with environment variable INITIALIZEDB=true, then start.sh tweaks yml file for next run
  3. user imports lots of super important data
  4. user reboots machine
  5. due to the restart setting, docker runs the container again; the yml file is ignored and INITIALIZEDB is true again
  6. moloch wipes elasticsearch data, super important data is lost

That's the scenario. I need to use something smarter (like a volume or marker that, if not present, signals the wipe and then gets created after things start up) rather than just assuming the yml file is going to be what starts things.
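A rough sketch of that marker idea, with hypothetical names (./moloch-state as a persistent volume, db-initialized as the marker): wipe.sh removes the marker, and the container entrypoint only initializes the database when the marker is absent, creating it afterwards so a plain Docker restart cannot re-trigger the wipe:

    # In wipe.sh (host side): removing the marker requests a one-time re-init.
    rm -f ./moloch-state/db-initialized

    # In the moloch container entrypoint (marker volume mounted at /data):
    if [ ! -f /data/moloch-state/db-initialized ]; then
      echo "initializing (wiping) the database"
      # ... run the actual initialization/wipe here ...
      touch /data/moloch-state/db-initialized
    fi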

fix mapping of Zeek values to Moloch's http.uri

Moloch's http.uri field consists of both the host and the URI portion.

Previously in the mapping of this field I was only using the URI portion (and, sometimes, incorrectly writing over it with the referrer).

I've standardized it to fill this field out correctly as Moloch is doing it.

install.py for macOS will mess up docker's settings.json file

If the user answers "yes" to "Configure Docker resource usage ..." when running install.py on macOS, CPU and RAM values are calculated and written back out into the settings.json file.

If the user accepts the default precalculated values, it works fine, as they are initially set as integers. However, if the user enters their own CPU cores and RAM settings, they are stored as strings and written out (incorrectly) as such to settings.json.

They need to be cast as ints before storing.
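Until that fix lands, an affected user could repair the file with jq; the settings.json location and the cpus/memoryMiB key names are assumptions about Docker Desktop for Mac and may differ between versions:

    # Convert the string-valued resource settings back into numbers.
    SETTINGS="$HOME/Library/Group Containers/group.com.docker/settings.json"
    jq '.cpus |= tonumber | .memoryMiB |= tonumber' "$SETTINGS" > /tmp/settings.json \
      && mv /tmp/settings.json "$SETTINGS"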

Investigate additional scanners

Investigate integrating other scanners into Malcolm:

  • yara
  • rita
  • snort
  • packetbeat
  • suricata

If/when I decide to move on any of these, I'll log a separate issue.

integrate MITRE ATT&CK BZAR into Malcolm's Zeek instance

https://github.com/mitre-attack/car/tree/master/implementations/bzar

The BZAR project uses the Bro/Zeek Network Security Monitor to detect ATT&CK-based adversarial activity.

MITRE ATT&CK is a publicly-available, curated knowledge base for cyber adversary behavior, reflecting the various phases of the adversary lifecycle and the platforms they are known to target. The ATT&CK model includes behaviors of numerous threat groups.

BZAR is a set of Bro/Zeek scripts utilizing the SMB and DCE-RPC protocol analyzers and the File Extraction Framework to detect ATT&CK-like activity, raise notices, and write to the Notice Log.
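A hedged sketch of pulling BZAR in through the Zeek package manager (the package name is assumed from the MITRE repository; the scripts could also be vendored directly into Malcolm's Zeek configuration):

    # Install the BZAR scripts via zkg and confirm they are present.
    zkg install bzar
    zkg list | grep -i bzar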

multiple user accounts and account management

  • need to support multiple user accounts
  • need to support authentication with authentication servers (active directory, etc.?)
  • support with auth_setup.sh
  • add to nginx/htpasswd
  • add to auth.env (array?)
  • support in initmoloch.sh (?)
  • look at differing permissions in nginx.conf (?)
  • what about Moloch's user-level permissions?

Investigate Elasticsearch cluster configurations

Malcolm's Elasticsearch-in-Docker setup is mostly one-node centric: in other words, the whole scale-out usefulness of Elasticsearch isn't set up by default.

I need to look at what Elasticsearch clusters look like, and how that would look from within the context of Malcolm.
