israel-lugo / netforeman

Making sure your network is running smoothly

License: GNU General Public License v3.0

Python 100.00%
automate diagnostics infrastructure-monitoring network network-admin react routing watchdog

netforeman's Issues

How to deal with actions?

Assuming for example the following configuration:

linux {
    # route-checks is an array, order is important
    route-checks = [
        # make sure we can reach the email relay host
        {   
            dest = ${email_relay_host}
            # must lead to somewhere, i.e. not a blackhole
            non_null = true
            # on-error is an array, order is important
            on-error = [
                {   
                    action = add_route
                    dest = ${email_relay_host}
                    nexthops = ${core_routers}
                }
                { action = email }
            ]
        }
    ]
}

We have two actions to execute in case of error:

  1. Add a route (in this case so we can reach the email relay)
  2. Send an email

Both actions are supplied by separate modules (in this case, add_route comes from linux and email comes from email). Should we have some kind of action list, into which modules insert when loaded? How do we deal with name conflicts? What if module foo exports action bar, and module bla also exports its own different action bar?

We could go all out and have namespaces per-module. But it looks weird having to type linux.add_route from within the linux configuration. Perhaps we could do relative lookups, where the name is first looked up in the namespace of the module that defines it, and then in the global namespace. Or we could require that relative names only exist within their own module and that's it, no need to go look in the global namespace.
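A minimal sketch of the last option, where an unqualified name is resolved only inside the module whose configuration mentions it, and anything else must be written as module.action. The ActionRegistry class and its method names are hypothetical, not part of netforeman.

```python
class ActionRegistry:
    """Maps (module, action name) pairs to callables. Hypothetical helper."""

    def __init__(self):
        self._actions = {}

    def register(self, module, name, func):
        key = (module, name)
        if key in self._actions:
            raise ValueError("duplicate action %s.%s" % key)
        self._actions[key] = func

    def lookup(self, name, current_module):
        """Resolve an action name used inside current_module's configuration."""
        if "." in name:
            # Qualified name such as "email.sendmail": use that module only.
            module, _, action = name.partition(".")
        else:
            # Relative name: only the current module's namespace is searched.
            module, action = current_module, name
        try:
            return self._actions[(module, action)]
        except KeyError:
            raise LookupError("unknown action %s.%s" % (module, action)) from None


registry = ActionRegistry()
registry.register("linux", "add_route", lambda **kw: "linux add_route")
registry.register("email", "sendmail", lambda **kw: "email sendmail")
# Two modules exporting the same action name no longer conflict.
registry.register("foo", "bar", lambda **kw: "foo bar")
registry.register("bla", "bar", lambda **kw: "bla bar")
```

With this scheme, add_route works unqualified inside the linux block, while the email action would have to be written as email.sendmail there. Falling back to a global namespace would be a small change to lookup(), at the cost of bringing back the name conflict question.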

New module to monitor services/processes

Services crash. Bad things happen. We need to create a module that makes sure a certain process is running. Ideally, it should know about services, so it can do things like restart automatically. Perhaps two separate check types, service_check and process_check.

The module should have actions like restartservice, startservice and executecmd (which just runs an arbitrary shell command).
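A rough sketch of what a process check and the executecmd action might look like, using only the standard library on a POSIX system. It assumes the process is identified by PID (e.g. read from a pidfile), which is only one possible design. The function names and the timeout parameter are hypothetical.

```python
import os
import subprocess


def process_check(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence/permission check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def execute_cmd(command, timeout=30):
    """Run an arbitrary shell command; return True if it exits with status 0."""
    result = subprocess.run(command, shell=True, timeout=timeout)
    return result.returncode == 0
```

A service_check would probably delegate to the init system instead of looking at PIDs. That part is left out here, since it depends on which init system is in use.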

linuxfib: getting a route to default doesn't work

Calling ipr.route("get", dst='0.0.0.0/0') returns the following:

{'attrs': [('RTA_TABLE', 254),
           ('RTA_DST', '0.0.0.0'),
           ('RTA_OIF', 1),
           ('RTA_PREFSRC', '127.0.0.1'),
           ('RTA_CACHEINFO', {'rta_tsage': 0, 'rta_used': 0, 'rta_id': 0, 'rta_lastuse': 0, 'rta_clntref': 1, 'rta_ts': 0, 'rta_error': 0, 'rta_expires': 0})],
 'dst_len': 32,
 'event': 'RTM_NEWROUTE',
 'family': 2,
 'flags': 2147484160,
 'header': {'error': None,
            'flags': 0,
            'length': 96,
            'pid': 882,
            'sequence_number': 289,
            'type': 24},
 'proto': 0,
 'scope': 0,
 'src_len': 0,
 'table': 254,
 'tos': 0,
 'type': 2}

That is, a 0.0.0.0/32 (local) dev lo proto unspec route. The nexthop is completely missing, and can't be checked against anything. Furthermore, the destination length is 32 instead of 0.

Same thing happens for IPv6.

This will break route_check for 0.0.0.0/0 (or ::/0) with nexthops_any, as the nexthops are nowhere to be found.
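To illustrate the problem, here is a small sketch that pulls out the fields a nexthop check would need from a reply shaped like the one above. It works on plain dicts so it can run without a netlink socket. The helper names are hypothetical; RTA_GATEWAY is the netlink attribute that would normally carry the nexthop.

```python
def get_attr(msg, name):
    """Return the first netlink attribute called name, or None if absent."""
    for key, value in msg["attrs"]:
        if key == name:
            return value
    return None


def summarize_route(msg):
    """Extract the fields a nexthop check would need from a route reply."""
    return {
        "dst": get_attr(msg, "RTA_DST"),
        "dst_len": msg["dst_len"],
        "gateway": get_attr(msg, "RTA_GATEWAY"),
        "oif": get_attr(msg, "RTA_OIF"),
    }


# Abridged copy of the reply shown above.
reply = {
    "attrs": [
        ("RTA_TABLE", 254),
        ("RTA_DST", "0.0.0.0"),
        ("RTA_OIF", 1),
        ("RTA_PREFSRC", "127.0.0.1"),
    ],
    "dst_len": 32,
    "type": 2,
}

summary = summarize_route(reply)
is_default_route = summary["dst_len"] == 0  # False: the reply is a /32
has_nexthop = summary["gateway"] is not None  # False: no RTA_GATEWAY attribute
```

Both flags come out False for this reply, so a nexthops_any check has nothing to compare against.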

New module to detect rogue DHCP servers

Try using pydhcplib or something else. Send out DHCP requests and check responses against a whitelist of authorized servers.
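The whitelist comparison itself is simple. A sketch, assuming the probing code (not shown, and dependent on whichever DHCP library is chosen) yields the addresses of the servers that answered. The function name is hypothetical.

```python
import ipaddress


def find_rogue_servers(responders, authorized):
    """Return the set of responding server addresses that are not whitelisted."""
    allowed = {ipaddress.ip_address(a) for a in authorized}
    seen = {ipaddress.ip_address(r) for r in responders}
    return seen - allowed
```

Parsing the addresses first means that different textual spellings of the same address still compare equal.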

Suggested actions would be email.sendmail or even other more complex things such as logging in somewhere and shutting down a port :)

Implement email sending

We need some way to send emails. The configuration file (issue #1) already models the required settings. Now we just need to make it do stuff.

We can use the Python email module to create a message, and the smtplib module to send it.
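A minimal sketch of that approach with the standard library. The function names, addresses and message text are placeholders, and the real settings would come from the configuration file. EmailMessage assumes a reasonably recent Python 3 (it was still provisional in 3.4).

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipients, subject, body):
    """Create a plain-text message ready to be handed to smtplib."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(msg, host, port=25):
    """Deliver msg through the given SMTP relay."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)


msg = build_message(
    "netforeman@example.com",
    ["admin@example.com"],
    "route check failed",
    "No route to the email relay host.",
)
# send_message(msg, "relay.example.com")  # needs a reachable SMTP relay
```

Authentication and TLS (smtplib's starttls() and login()) are left out and would depend on what the relay requires.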

Normalize configuration exceptions

We're dying from netaddr exceptions and so on, e.g. if we define a route_check with dest=0.0.0.0/k, we die with:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.4/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/capi/netforeman/netforeman/cli.py", line 94, in <module>
    sys.exit(main())
  File "/home/capi/netforeman/netforeman/cli.py", line 81, in main
    dispatcher = dispatch.Dispatch(args.config_file)
  File "/home/capi/netforeman/netforeman/dispatch.py", line 50, in __init__
    ok = self.config.load_modules()
  File "/home/capi/netforeman/netforeman/config.py", line 182, in load_modules
    settings = API.settings_from_pyhocon(config_tree, self)
  File "/home/capi/netforeman/netforeman/config.py", line 65, in settings_from_pyhocon
    return cls._SettingsClass.from_pyhocon(conf, configurator)
  File "/home/capi/netforeman/netforeman/fibinterface.py", line 284, in from_pyhocon
    for subconf in conf.get_list('route_checks', default=[])
  File "/home/capi/netforeman/netforeman/fibinterface.py", line 284, in <listcomp>
    for subconf in conf.get_list('route_checks', default=[])
  File "/home/capi/netforeman/netforeman/fibinterface.py", line 248, in from_pyhocon
    dest = netaddr.IPNetwork(cls._get_conf(conf, 'dest'))
  File "/usr/lib/python3/dist-packages/netaddr/ip/__init__.py", line 933, in __init__
    raise AddrFormatError('invalid IPNetwork %s' % addr)
netaddr.core.AddrFormatError: invalid IPNetwork 0.0.0.0/k

We should catch these kinds of errors, and convert them to a config.ParseError.
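A sketch of the wrapping pattern. To stay runnable without third-party packages it uses the standard library's ipaddress module in place of netaddr, so it catches ValueError; with netaddr the except clause would name netaddr.AddrFormatError instead. The ParseError class here is a local stand-in for config.ParseError, and parse_dest is a hypothetical helper.

```python
import ipaddress


class ParseError(Exception):
    """Local stand-in for config.ParseError."""


def parse_dest(conf, key="dest"):
    """Read a network from conf, turning any failure into a ParseError."""
    try:
        raw = conf[key]
    except KeyError:
        raise ParseError("missing required setting %r" % key) from None
    try:
        return ipaddress.ip_network(raw, strict=False)
    except ValueError as e:
        # Keep the original exception as the cause for debugging.
        raise ParseError("invalid value for %s: %r" % (key, raw)) from e
```

The caller then only has to handle one exception type and can print a short message instead of a traceback.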

route_check: implement non_null check

Now that LinuxFIBInterface.get_route_to() is capable of dealing with blackhole/unreachable routes (see #5), we can implement this. Remember to check the rest of the code to make sure other places don't break with null routes.
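A possible shape for the check, assuming the route is available as a dict with the netlink type field (how get_route_to() actually represents routes may differ). The numeric values are the RTN_* route types from linux/rtnetlink.h. Treating prohibit as a null route is an assumption, and the function names are hypothetical.

```python
# Route types from <linux/rtnetlink.h>.
RTN_BLACKHOLE = 6
RTN_UNREACHABLE = 7
RTN_PROHIBIT = 8

NULL_ROUTE_TYPES = {RTN_BLACKHOLE, RTN_UNREACHABLE, RTN_PROHIBIT}


def is_null_route(route):
    """Return True if the route drops or rejects traffic."""
    return route["type"] in NULL_ROUTE_TYPES


def check_non_null(route):
    """Return True if the check passes: a route exists and leads somewhere."""
    return route is not None and not is_null_route(route)
```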

Move plugin modules to a new subpackage

It would be a good idea to separate the plugin modules (e.g. email, linuxfib) from the core framework. Place them in a new subpackage called modules.

Open question: what to do with moduleapi? It's not a plugin module, but defines the API for them. And what about abstract modules such as fibinterface? Perhaps rename these to _moduleapi and _fibinterface and place them inside modules too.

Detect errors in the entire config tree at parse time

Right now, we're not parsing the entire tree before running modules in Dispatch.run(). For example, error handler conf subtrees are only parsed when the error actually occurs and the handler is entered. As a result, configuration errors may only be detected at the worst possible time, e.g. precisely when a failure has occurred and the handler is needed to fix it.

A simple example. Action foo does not exist, but this will only be detected at runtime, if and when we enter the on_error handler:

modules = [
    linuxfib
]

linuxfib {
    route_checks = [
        {   
            dest = 198.51.100.5
            nexthops_any = [ 192.0.2.129, 192.0.2.143 ]
            on_error = [
                {   
                    action = foo
                    arg = 1
                }
            ]
        }
    ]
}
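A sketch of an eager validation pass that would catch the example above at load time. It walks a plain dict rather than a pyhocon tree. The ParseError class is a local stand-in for config.ParseError, and validate_actions and known_actions are hypothetical names.

```python
class ParseError(Exception):
    """Local stand-in for config.ParseError."""


def validate_actions(module_conf, known_actions):
    """Check every on_error handler up front, before any check runs."""
    for i, check in enumerate(module_conf.get("route_checks", [])):
        for j, handler in enumerate(check.get("on_error", [])):
            action = handler.get("action")
            if action not in known_actions:
                raise ParseError(
                    "route_checks[%d].on_error[%d]: unknown action %r"
                    % (i, j, action)
                )


# The linuxfib block from the example, as a plain dict.
conf = {
    "route_checks": [
        {
            "dest": "198.51.100.5",
            "nexthops_any": ["192.0.2.129", "192.0.2.143"],
            "on_error": [{"action": "foo", "arg": 1}],
        }
    ]
}
```

Calling validate_actions(conf, {"add_route"}) raises ParseError immediately, instead of waiting until the route check fails.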

linuxfib: deal with blackhole/unreachable routes

We get a NetlinkError(22, "Invalid argument") when trying to do a LinuxFIBInterface.get_route_to() and the route happens to be a blackhole. Deal with this somehow.

iproute2 has the same behavior, seems to be netlink-related. However, iproute2 has the ip route list match operator, which can show us routes that match even if they're blackholed. See how they do that. Can't use IPRoute.get_routes(dst=), that fails with a NetlinkError too.

Return meaningful exit codes

We should use the program exit codes to return useful information to the user.

Example 1: a check failed, and corrective action was taken. What to return?

Example 2: a check failed, and one of the corrective actions failed, but not all. What to return? I think it should be an error, period. Right now we're not even detecting this; FIBModuleAPI._route_check_failed catches the error from the action, logs it, and continues happily. We would need to propagate the error up, but without interrupting the execution of other actions.
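One way to propagate action failures without interrupting the remaining actions is to count them and decide the exit code at the end. A sketch follows. The exit code values are placeholders (what Example 1 should return is still an open question), and the function names are hypothetical.

```python
import logging

# Placeholder values only; the actual codes are still to be decided.
EXIT_OK = 0
EXIT_CORRECTED = 1
EXIT_ACTION_FAILED = 2


def run_actions(actions):
    """Run every action even if some fail; return the number of failures."""
    failures = 0
    for action in actions:
        try:
            action()
        except Exception:
            logging.exception("action %r failed", action)
            failures += 1
    return failures


def exit_code(check_failed, action_failures):
    """Map the outcome of a run to a process exit code."""
    if not check_failed:
        return EXIT_OK
    if action_failures:
        return EXIT_ACTION_FAILED
    return EXIT_CORRECTED
```

The result of exit_code() would then be passed to sys.exit() in main().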
