israel-lugo / netforeman
Making sure your network is running smoothly
License: GNU General Public License v3.0
We need to create something that knows how to monitor for events like the connection table being full, and so on, and react to them.
Assuming for example the following configuration:
linux {
    # route-checks is an array, order is important
    route-checks = [
        # make sure we can reach the email relay host
        {
            dest = ${email_relay_host}
            # must lead to somewhere, i.e. not a blackhole
            non_null = true
            # on-error is an array, order is important
            on-error = [
                {
                    action = add_route
                    dest = ${email_relay_host}
                    nexthops = ${core_routers}
                }
                { action = email }
            ]
        }
    ]
}
We have two actions to execute in case of error: add_route and email.
Both actions are supplied by separate modules (in this case, add_route comes from linux and email comes from email). Should we have some kind of action list, into which modules insert their actions when loaded? How do we deal with name conflicts? What if module foo exports action bar, and module bla also exports its own, different action bar?
We could go all out and have namespaces per-module. But it looks weird having to type linux.add_route from within the linux configuration. Perhaps we could do relative lookups, where the name is first looked up in the namespace of the module that defines it, and then in the global namespace. Or we could require that relative names only exist within their own module and that's it, no need to go look in the global namespace.
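The relative lookup idea could look roughly like the following minimal sketch. The ActionRegistry class, its register and resolve methods, and the dotted module.action spelling are all hypothetical names chosen for illustration; none of this is netforeman's actual API. In this variant a bare name is tried in the current module first, and is only accepted from elsewhere if exactly one module exports it.

```python
class ActionRegistry:
    """Hypothetical registry mapping (module, action name) to a callable."""

    def __init__(self):
        self._actions = {}  # {module_name: {action_name: callable}}

    def register(self, module, name, func):
        """Register an action inside its module's own namespace."""
        self._actions.setdefault(module, {})[name] = func

    def resolve(self, name, current_module):
        """Resolve an action name as written in current_module's config.

        A dotted name ("email.sendmail") is treated as absolute. A bare
        name is looked up in current_module first, then accepted from
        another module only if exactly one module exports it.
        """
        if "." in name:
            module, _, action = name.partition(".")
            try:
                return self._actions[module][action]
            except KeyError:
                raise LookupError("unknown action %r" % name) from None

        local = self._actions.get(current_module, {})
        if name in local:
            return local[name]

        owners = sorted(m for m, acts in self._actions.items() if name in acts)
        if len(owners) == 1:
            return self._actions[owners[0]][name]
        if not owners:
            raise LookupError("unknown action %r" % name)
        raise LookupError(
            "ambiguous action %r, exported by: %s" % (name, ", ".join(owners))
        )
```

With this scheme, add_route written inside the linux block resolves locally, email resolves because only one module exports it, and a bare bar exported by both foo and bla is rejected unless it is written as foo.bar or bla.bar.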
Services crash. Bad things happen. We need to create a module that makes sure a certain process is running. Ideally, it should know about services, so it can do things like restart automatically. Perhaps two separate check types, service_check and process_check.
The module should have actions like restartservice, startservice and executecmd (which just runs an arbitrary shell command).
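As a rough idea of what executecmd might do, here is a minimal sketch built on the standard library's subprocess.run (which needs Python 3.5 or later). The function name execute_cmd, its timeout parameter and its return shape are assumptions made for this example only.

```python
import subprocess


def execute_cmd(command, timeout=30):
    """Run a shell command string; return (exit_status, combined_output).

    Hypothetical helper: the name, the timeout default and the return
    value are illustrative choices, not an existing netforeman interface.
    """
    result = subprocess.run(
        command,
        shell=True,                 # the command is a shell string from the config
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold stderr into the captured output
        universal_newlines=True,    # decode output to str
        timeout=timeout,            # raises subprocess.TimeoutExpired on expiry
    )
    return result.returncode, result.stdout
```

Returning the exit status rather than raising lets the caller decide whether a non-zero status counts as a failed action.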
Doing an ipr.route("get", dst='0.0.0.0/0') results in the following:
{'attrs': [('RTA_TABLE', 254),
           ('RTA_DST', '0.0.0.0'),
           ('RTA_OIF', 1),
           ('RTA_PREFSRC', '127.0.0.1'),
           ('RTA_CACHEINFO', {'rta_tsage': 0, 'rta_used': 0, 'rta_id': 0, 'rta_lastuse': 0, 'rta_clntref': 1, 'rta_ts': 0, 'rta_error': 0, 'rta_expires': 0})],
 'dst_len': 32,
 'event': 'RTM_NEWROUTE',
 'family': 2,
 'flags': 2147484160,
 'header': {'error': None,
            'flags': 0,
            'length': 96,
            'pid': 882,
            'sequence_number': 289,
            'type': 24},
 'proto': 0,
 'scope': 0,
 'src_len': 0,
 'table': 254,
 'tos': 0,
 'type': 2}
That is, a 0.0.0.0/32 (local) dev lo proto unspec route. The nexthop is completely missing, and can't be checked against anything. Furthermore, the destination length is 32 instead of 0.
Same thing happens for IPv6.
This will break route_check for 0.0.0.0/0 (or ::/0) with nexthops_any, as the nexthops are nowhere to be found.
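To make the symptom concrete, the sketch below inspects a reply shaped like the one above using plain dictionary access. RTA_GATEWAY is the standard netlink attribute that carries a single nexthop; the helper names get_attr and describe_route are hypothetical, the sample is trimmed to the relevant fields, and multipath routes are deliberately ignored.

```python
def get_attr(msg, name, default=None):
    """Return the first netlink attribute called name, or default."""
    for key, value in msg["attrs"]:
        if key == name:
            return value
    return default


def describe_route(msg):
    """Summarize the fields a route check would want to compare."""
    gateway = get_attr(msg, "RTA_GATEWAY")
    return {
        "dst": get_attr(msg, "RTA_DST"),
        "dst_len": msg["dst_len"],
        "oif": get_attr(msg, "RTA_OIF"),
        "gateway": gateway,
        "has_nexthop": gateway is not None,
    }


# Trimmed copy of the reply shown above.
reply = {
    "attrs": [
        ("RTA_TABLE", 254),
        ("RTA_DST", "0.0.0.0"),
        ("RTA_OIF", 1),
        ("RTA_PREFSRC", "127.0.0.1"),
    ],
    "dst_len": 32,
    "type": 2,
}

summary = describe_route(reply)
```

For this reply, summary reports no gateway and a destination length of 32, which is exactly why a nexthops_any comparison has nothing to match against.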
Try using pydhcplib or something else. Send out DHCP requests and check responses against a whitelist of authorized servers.
Suggested actions would be email.sendmail or even other more complex things such as logging in somewhere and shutting down a port :)
We need some way to send emails. The configuration file (issue #1) already models the required settings. Now we just need to make it do stuff.
We can use the Python email module to create a message, and the smtplib module to send it.
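A minimal sketch of that combination follows, assuming a plain unauthenticated SMTP relay. The function names, parameters and default server and port are placeholders for this example; the real values would come from the configuration file.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipients, subject, body):
    """Create a plain-text message with the email package."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(msg, server="localhost", port=25):
    """Hand the message to an SMTP relay (server and port are placeholders)."""
    with smtplib.SMTP(server, port) as smtp:
        smtp.send_message(msg)
```

Keeping message construction separate from sending makes the first part easy to exercise without a mail server.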
We're dying from netaddr exceptions and so on, e.g. if we define a route_check with dest=0.0.0.0/k, we die with:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.4/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/capi/netforeman/netforeman/cli.py", line 94, in <module>
    sys.exit(main())
  File "/home/capi/netforeman/netforeman/cli.py", line 81, in main
    dispatcher = dispatch.Dispatch(args.config_file)
  File "/home/capi/netforeman/netforeman/dispatch.py", line 50, in __init__
    ok = self.config.load_modules()
  File "/home/capi/netforeman/netforeman/config.py", line 182, in load_modules
    settings = API.settings_from_pyhocon(config_tree, self)
  File "/home/capi/netforeman/netforeman/config.py", line 65, in settings_from_pyhocon
    return cls._SettingsClass.from_pyhocon(conf, configurator)
  File "/home/capi/netforeman/netforeman/fibinterface.py", line 284, in from_pyhocon
    for subconf in conf.get_list('route_checks', default=[])
  File "/home/capi/netforeman/netforeman/fibinterface.py", line 284, in <listcomp>
    for subconf in conf.get_list('route_checks', default=[])
  File "/home/capi/netforeman/netforeman/fibinterface.py", line 248, in from_pyhocon
    dest = netaddr.IPNetwork(cls._get_conf(conf, 'dest'))
  File "/usr/lib/python3/dist-packages/netaddr/ip/__init__.py", line 933, in __init__
    raise AddrFormatError('invalid IPNetwork %s' % addr)
netaddr.core.AddrFormatError: invalid IPNetwork 0.0.0.0/k
We should catch these kinds of errors, and convert them to a config.ParseError.
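The conversion could follow the pattern sketched below. To keep the example self-contained it uses the standard library's ipaddress module as a stand-in for netaddr, and a local ParseError class as a stand-in for config.ParseError; with netaddr, the exception to catch would be the netaddr.core.AddrFormatError seen in the traceback. The helper name parse_dest is hypothetical.

```python
import ipaddress


class ParseError(Exception):
    """Stand-in for config.ParseError, whose real definition is not shown here."""


def parse_dest(value):
    """Parse a route_check destination, reporting bad input as a ParseError."""
    try:
        # ipaddress stands in for netaddr.IPNetwork in this sketch;
        # strict=False accepts host addresses such as 198.51.100.5.
        return ipaddress.ip_network(value, strict=False)
    except ValueError as e:
        # Chain the original exception so the underlying cause is not lost.
        raise ParseError("invalid dest %r: %s" % (value, e)) from e
```

The caller then only needs to handle one exception type while loading the configuration, and can print a short message instead of a traceback.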
Now that LinuxFIBInterface.get_route_to() is capable of dealing with blackhole/unreachable routes (see #5), we can implement this. Remember to check the rest of the code to make sure other places don't break with null routes.
It would be a good idea to separate the plugin modules (e.g. email, linuxfib) from the core framework. Place them in a new subpackage called modules.
Open question: what to do with moduleapi? It's not a plugin module, but defines the API for them. And what about abstract modules such as fibinterface? Perhaps rename these to _moduleapi and _fibinterface and place them inside modules too.
As per title. We're using HOCON syntax as it's both flexible and readable.
Right now, we're not parsing the entire tree before running modules in Dispatch.run(). For example, error handler conf subtrees are only parsed if the error actually occurs. This may result in configuration errors only being detected at inappropriate times, e.g. precisely when a failure has occurred and the handler is needed to fix it.
A simple example. Action foo does not exist, but this will only be detected at runtime, if and when we enter the on_error handler:
modules = [
    linuxfib
]
linuxfib {
    route_checks = [
        {
            dest = 198.51.100.5
            nexthops_any = [ 192.0.2.129, 192.0.2.143 ]
            on_error = [
                {
                    action = foo
                    arg = 1
                }
            ]
        }
    ]
}
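One way to catch this at load time is to walk every on_error subtree before any check runs. The sketch below does this over a configuration represented as plain nested dicts and lists, which is a simplification of whatever tree object the parser really returns; the function name find_unknown_actions and the known_actions set are hypothetical.

```python
def find_unknown_actions(conf, known_actions):
    """Return (module, check index, action name) for every unknown action.

    conf is assumed to be plain nested dicts and lists mirroring the
    configuration file; known_actions is the set of registered action names.
    """
    problems = []
    for module in conf.get("modules", []):
        checks = conf.get(module, {}).get("route_checks", [])
        for index, check in enumerate(checks):
            for handler in check.get("on_error", []):
                action = handler.get("action")
                if action not in known_actions:
                    problems.append((module, index, action))
    return problems


# The example configuration above, written out as Python data.
example_conf = {
    "modules": ["linuxfib"],
    "linuxfib": {
        "route_checks": [
            {
                "dest": "198.51.100.5",
                "nexthops_any": ["192.0.2.129", "192.0.2.143"],
                "on_error": [{"action": "foo", "arg": 1}],
            }
        ]
    },
}

problems = find_unknown_actions(example_conf, {"add_route", "email"})
```

Here problems contains a single entry naming foo, so startup could stop with a configuration error instead of waiting for the route check to fail.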
We get a NetlinkError(22, "Invalid argument") when trying to do a LinuxFIBInterface.get_route_to() and the route happens to be a blackhole. Deal with this somehow.
iproute2 has the same behavior, seems to be netlink-related. However, iproute2 has the ip route list match operator, which can show us routes that match even if they're blackholed. See how they do that. Can't use IPRoute.get_routes(dst=), that fails with a NetlinkError too.
We should use the program exit codes to return useful information to the user.
Example 1: a check failed, and corrective action was taken. What to return?
Example 2: a check failed, and one of the corrective actions failed, but not all. What to return? I think it should be an error, period. Right now we're not even detecting this; FIBModuleAPI._route_check_failed catches the error from the action, logs it, and continues happily. We would need to propagate the error up, but without interrupting the execution of other actions.
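A sketch of propagating failures without interrupting the remaining actions is given below. The helper names and the specific exit code values are placeholders picked for illustration, since which codes to return is exactly the open question here.

```python
import logging

logger = logging.getLogger("exitcodes_sketch")

# Placeholder values; the real mapping is still to be decided.
EXIT_OK = 0             # all checks passed
EXIT_CORRECTED = 1      # a check failed, every corrective action succeeded
EXIT_ACTION_FAILED = 2  # a check failed and at least one action failed


def run_actions(actions):
    """Run every action in order; return the exceptions raised, if any."""
    errors = []
    for action in actions:
        try:
            action()
        except Exception as e:
            # Log and remember the failure, but keep running later actions.
            logger.error("action %r failed: %s", action, e)
            errors.append(e)
    return errors


def exit_status(check_failed, action_errors):
    """Map the outcome of one check to a process exit code."""
    if not check_failed:
        return EXIT_OK
    if action_errors:
        return EXIT_ACTION_FAILED
    return EXIT_CORRECTED
```

The caller would pass the result to sys.exit(); with several checks, taking the highest status across all of them would make the worst outcome win.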