m3047 / shodohflo
Pure Python netflow and DNS correlation, with reusable Frame Streams, DnsTap and Protobuf implementations
License: Apache License 2.0
Have the pcap-agent capture these and display them in the rollover detail for Netflow artifacts.
The packet sniffer captures the remote port as well as the remote address, and this is recorded as part of the flow key in Redis. Surface this in the UI.
How? Rollover? Just display it after the address? Should address + port combos be distinct in the UI, or grouped by the address (DNS A and AAAA records don't distinguish port numbers)?
I'm seeing this sporadically:
ERROR:asyncio:Exception in callback Server.process_data()
handle: <Handle Server.process_data()>
Traceback (most recent call last):
  File "/usr/lib64/python3.6/asyncio/events.py", line 145, in _run
    self._callback(*self._args)
  File "/usr/local/share/shodohflo/agents/pcap_agent.py", line 315, in process_data
    if pkt.p == socket.IPPROTO_TCP and pkt.data.flags & dpkt.tcp.TH_RST:
AttributeError: 'bytes' object has no attribute 'flags'
and I'm adding some code to the fwm branch to assess how often it's occurring and test a possible mitigation.
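The traceback indicates that pkt.data is sometimes a raw bytes object rather than a parsed TCP segment, so the attribute lookup for flags fails. The following is only a sketch of one possible guard, not the mitigation being tested in the fwm branch. The helper name is_tcp_reset is hypothetical, the packets are stand-in objects rather than real dpkt packets, and TH_RST is defined locally with the standard TCP RST bit value so the example runs without dpkt installed.

```python
import socket
from types import SimpleNamespace

TH_RST = 0x04  # standard TCP RST flag bit (assumed equal to dpkt.tcp.TH_RST)

def is_tcp_reset(pkt):
    """Return True only when pkt carries a parsed TCP segment with RST set.

    Hypothetical helper: if pkt.data is raw bytes (no 'flags' attribute),
    the packet is treated as "not a reset" instead of raising.
    """
    if pkt.p != socket.IPPROTO_TCP:
        return False
    flags = getattr(pkt.data, "flags", None)
    if flags is None:
        return False
    return bool(flags & TH_RST)

# Stand-ins for packets: one with a parsed TCP segment (RST|ACK), one unparsed.
parsed = SimpleNamespace(p=socket.IPPROTO_TCP, data=SimpleNamespace(flags=0x14))
unparsed = SimpleNamespace(p=socket.IPPROTO_TCP, data=b"\x00\x50")

print(is_tcp_reset(parsed), is_tcp_reset(unparsed))
```

Counting how often the unparsed case occurs (for example by incrementing a counter where the helper returns False for bytes) would serve the stated goal of assessing frequency.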
Hello. I'm currently working on versions of the DNS and pcap agents which use asyncio. As part of this, the frame streams implementation shodohflo/fstrm.py needs to support asyncio as well.
shodohflo.fstrm will remain backwards compatible.
A new branch, asyncio, will appear in the repository, bringing the total of more or less permanent branches to three.
The present implementation lacks flexibility and only supports 1:1 telemetry capture.
The current dns_agent combines two functionalities:
The Dnstap protocol as architected is not network-aware; the BIND implementation is only capable of writing to either a unix socket or to (rotating) files.
It is desirable to support many-to-one, one-to-many, and many-to-many capture modalities for redundancy and failover, as well as for additional uses. For instance, work is underway to alter Rear View RPZ so that it can optionally ingest telemetry data via UDP datagrams transmitted in the format envisioned here.
The anticipated future architecture is:
This issue exists to inform the community of the anticipated change and to solicit feedback.
UDP datagrams are anticipated as the transmission modality, with each datagram encapsulating one observable event.
An observable event is defined as a (potential) CNAME chain ending in a single IP address; if the underlying Dnstap event resolves the (single) CNAME chain to multiple addresses, then one observable event is generated for each address.
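The rule of one observable event per resolved address can be sketched as below. This is an illustration under assumptions, not the project's implementation: the function name observable_events and the dictionary field names are hypothetical.

```python
def observable_events(chain, addresses):
    """Return one event per resolved address, each carrying the whole chain.

    Hypothetical helper: 'chain' is a list of names in the CNAME chain and
    'addresses' is the list of IP addresses the chain resolved to.
    """
    return [{"chain": list(chain), "address": address} for address in addresses]

# A single chain resolving to two addresses yields two observable events.
events = observable_events(
    ["server.example.com.", "www.example.com."],
    ["10.0.0.1", "10.0.0.2"],
)
for event in events:
    print(event)
```

Each event copies the chain, so a receiver can process any one datagram without needing the others.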
UDP is chosen because it supports not only one-to-one and many-to-one, but also one-to-many and many-to-many (multicast addresses). Additionally, UDP is connectionless and so recovery / tolerance for network or receiver outages is much simpler to build as well as understand.
UDP has an absolute 64K byte limit on datagram size; on the other hand the efficient datagram size is determined by path MTU and can be considerably smaller (the typical ethernet MTU is 1500 bytes). Datagrams larger than the path MTU are dealt with via fragmentation, which increases the possibility of data loss and requires packet reassembly at the receiving end.
Note that the maximum size of a DNS query or response tracks the UDP absolute limit on datagram size.
Dnstap telemetry can capture different versions of a DNS query or response (stub resolver to caching resolver or caching resolver to authoritative, request or response) as well as additional metadata. Since a single DNS message itself can theoretically reach the absolute limit on datagram size, curation of data is required in order to reliably use UDP as a transport.
The datagram content is envisioned as a JSON dictionary with three keys:
... CNAME chain; both IPv4 and IPv6 are supported
... CNAME chain
Given the DNS data:
www.example.com. IN CNAME server.example.com.
server.example.com. IN A 10.0.0.1
a sample (and prettified) datagram payload might look like this:
{ "id": 1,
"address": "10.0.0.1",
"chain": ["server.example.com.", "www.example.com."],
"client": "10.43.11.48"
}
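As a rough sketch of how such a payload might be serialized and checked against a size budget before sending, consider the following. The 1500 byte MTU, the IPv4 and UDP header sizes, and the names encode_event and MAX_PAYLOAD are assumptions for illustration; a real deployment would need to account for the actual path MTU and for IPv6.

```python
import json

ASSUMED_MTU = 1500          # typical Ethernet MTU (assumption)
IP_UDP_OVERHEAD = 20 + 8    # IPv4 header without options + UDP header
MAX_PAYLOAD = ASSUMED_MTU - IP_UDP_OVERHEAD

def encode_event(event):
    """Serialize an event to compact JSON bytes, refusing oversized payloads."""
    payload = json.dumps(event, separators=(",", ":")).encode("utf-8")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError(
            "payload of %d bytes exceeds budget of %d" % (len(payload), MAX_PAYLOAD)
        )
    return payload

sample = {
    "id": 1,
    "address": "10.0.0.1",
    "chain": ["server.example.com.", "www.example.com."],
    "client": "10.43.11.48",
}
payload = encode_event(sample)
print(len(payload), payload)
```

Raising on an oversized payload is only one option; truncating or dropping fields would be another form of the curation discussed above.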
Have an option to export what is displayed in the UI as a dotfile. Dotfiles (https://en.wikipedia.org/wiki/DOT_%28graph_description_language%29) are used by e.g. Graphviz and understood by Gephi.
It wasn't documented (well) in Python 3.6 but the event loop keeps track of Tasks using a weakref.WeakSet. Theoretically this can cause tasks to mysteriously disappear during garbage collection.
So, special measures need to be taken to keep a regular reference to the task around while a request is being serviced.
Affected items are:
examples/dnstap2json.py
shodohflo/fstrm.py
agents/dns_agent.py
agents/pcap_agent.py
Fedora 37 ships with Python 3.11 which no longer supports loop.run_forever(). This was mitigated for trualias in m3047/trualias#6 and the intent is to do something along the same lines here.
This will be a multipart effort involving:
shodohflo/fstrm.py
agents/pcap_agent.py
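A common way to structure a long-running asyncio service without managing the event loop by hand is to wrap it in a coroutine and start it with asyncio.run(). The sketch below shows that general pattern only; it is not necessarily what was done in m3047/trualias#6 or what will be done here. The names serve and main are hypothetical, and the timer stands in for whatever would normally request shutdown (for example a signal handler).

```python
import asyncio

async def serve(stop_event):
    """Hypothetical service body: runs until asked to stop."""
    await stop_event.wait()
    return "stopped"

async def main():
    stop_event = asyncio.Event()
    # Stand-in for a shutdown request; a real agent might set the event
    # from a signal handler instead of a timer.
    asyncio.get_running_loop().call_later(0.01, stop_event.set)
    return await serve(stop_event)

# asyncio.run() creates the loop, runs main() to completion, and closes the loop.
result = asyncio.run(main())
print(result)
```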
Create an example program which outputs JSON to a UDP socket. agents/dns_agent.py provides an example of extracting Dnstap information and writing it to Redis using a ThreadPoolExecutor. This new example would write directly to a UDP socket asynchronously.
Rationale for UDP socket: A UDP socket allows subscribers to connect to and disconnect from the stream independently of the sending application, effectively decoupling them. On the downside, a UDP socket effectively limits the datagram size to (some fraction of) the MTU.
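A minimal sketch of writing JSON to a UDP socket with asyncio follows, using loop.create_datagram_endpoint(). It is an illustration under assumptions rather than the proposed example program: the names send_event, Receiver and demo are hypothetical, the event content is made up, and the loopback listener exists only so the sketch can show a datagram arriving.

```python
import asyncio
import json

class Receiver(asyncio.DatagramProtocol):
    """Test listener: resolves a future with the first datagram received."""
    def __init__(self, received):
        self.received = received

    def datagram_received(self, data, addr):
        if not self.received.done():
            self.received.set_result(data)

async def send_event(event, host, port):
    """Send one event as a single JSON-encoded UDP datagram."""
    loop = asyncio.get_running_loop()
    transport, _ = await loop.create_datagram_endpoint(
        asyncio.DatagramProtocol, remote_addr=(host, port)
    )
    try:
        transport.sendto(json.dumps(event).encode("utf-8"))
    finally:
        transport.close()

async def demo():
    loop = asyncio.get_running_loop()
    received = loop.create_future()
    # Bind to an ephemeral loopback port so the example is self-contained.
    listener, _ = await loop.create_datagram_endpoint(
        lambda: Receiver(received), local_addr=("127.0.0.1", 0)
    )
    port = listener.get_extra_info("sockname")[1]
    try:
        await send_event({"address": "10.0.0.1"}, "127.0.0.1", port)
        return json.loads(await asyncio.wait_for(received, timeout=2))
    finally:
        listener.close()

result = asyncio.run(demo())
print(result)
```

The sender never learns whether anyone is listening, which is the decoupling described above; the cost is that a dropped datagram is simply lost.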
This issue exists to solicit comments on this proposal.