A Few Python Insights

I run a Raspberry Pi on my home network for three reasons:

  • To have a local DNS (dnsmasq) so I don’t have to remember or look up numeric IP addresses
  • To handle VPN connections (OpenVPN) into my home network when I’m traveling
  • To have something where I can play around with Linux

Actually, there’s a fourth reason: having grown up before there was anything like a home PC, I love having an entire computer (sporting 4GB of memory and 32GB of storage, no less) that I can hold in the palm of my hand and that costs less than $100 in 2019 dollars. That’s Star Trek (original series) stuff, baby, just a few centuries early.

That last reason causes me to upgrade the Pi when newer, more powerful models come out…which then forces me to scratch my head trying to remember how the services I run on the Pi are configured, reinventing the wheel as it were. I tried to solve this by writing a bash script but quickly determined my bash skills were, umm, too rudimentary to even think of doing something like what I wanted.

So I decided it would be a fun introduction to python to write an app which could be used to configure the new Pi, suck certain data across from the old Pi, swap their names and DNS entries and make it easier to do the upgrade.

The result was maestro.py, whose source files you can find here. Fair warning, I don’t know python all that well yet so don’t be surprised if you have trouble trying to get it to work as I haven’t included every file within my project directory.

I’ve built it under python 3.7 using a virtual environment. The venv subdirectory contains a lot of files I didn’t upload to GitHub. Hopefully the requirements.txt file will let you use pip to install all the dependencies once you’ve got the basic virtual environment set up on your machine.

Of course as this is a pretty unique app you probably aren’t going to want to just download it and use it anyway. Think of it as a case study from someone who spent many years writing in C# coming to grips with python, hopefully resulting in some useful insights for you the reader.

My First Pythonic Insight

My first pass involved a single rather long file that was structured linearly. It wasn’t easily extensible which was a problem because I kept thinking of more systems I had to configure to achieve my goal.

My second pass involved organizing code into classes, one to a file, as is the natural default in C#. This ran me head on into python’s import system. It was not a pleasant experience, mostly because I kept viewing things in terms of C# namespaces and classes.

I finally got it working, but it still wasn’t easily extensible (and it was messy getting all the import statements done right). What I really wanted was a folder into which I could drop python files for processing various kinds of config files (e.g., /etc/hostname, /etc/openvpn/server.conf) and have them automagically used by the program. I struggled with this for quite a while until I connected a few things I’d been reading online.

“Everything’s an object”. You’ll often hear that about python. And everything is. More importantly, despite the fact that some objects are methods, some are functions, some are classes, etc., they can all be worked with in the same way. It’s just as valid to ask about the attributes (the python term) of a class, which can be fields, methods, etc., as it is to ask about the attributes of a dictionary. Or a list. Or… a module.
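
Here’s a quick sketch of what that looks like in practice. The Greeter class is made up purely for illustration; dir() and getattr() are the standard built-ins.

```python
import os


class Greeter:
    """A tiny class that exists only for this example."""
    greeting = "hello"

    def greet(self, name):
        return "{} {}".format(self.greeting, name)


# dir() returns attribute names as strings, whatever kind of object it is given
class_attrs = dir(Greeter)   # a class
dict_attrs = dir({})         # a dictionary
module_attrs = dir(os)       # a module

# getattr() turns a name back into the object it refers to
greet_func = getattr(Greeter, "greet")
join_func = getattr(os.path, "join")

print("greet" in class_attrs, "keys" in dict_attrs, "path" in module_attrs)
print(callable(greet_func), callable(join_func))
```

The same two calls work on all three kinds of object, which is the property the rest of this post leans on.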

Since calling a function in python is simply a matter of calling an object that happens to be a function all you need to know is its “address” and, possibly, the parameters it requires (functions can have optional and default parameters). The “address” needs to point back to the source code where the function is defined.

That’s why this line:

import os
pathExists = os.path.exists("some path")

and this line:

from os import path
pathExists = path.exists("some path")

and this line:

import os
pathMethod = os.path.exists
pathExists = pathMethod("some path")

all call the exact same method and do the exact same thing.
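
If you want to convince yourself, a small check along these lines should do it. It uses the current working directory as the path to test, which is just a convenient assumption for the example.

```python
import os
from os import path

pathMethod = os.path.exists

# all three names are bound to one and the same function object
same_object = os.path.exists is path.exists is pathMethod
print(same_object)

# so calling through any of them gives the same answer
cwd = os.getcwd()
answers = [os.path.exists(cwd), path.exists(cwd), pathMethod(cwd)]
print(answers)
```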

In and of itself this isn’t particularly insightful — you pick up the need to import stuff quickly in python because almost everything is defined somewhere else — but the ‘aha’ moment occurs when you realize all that the import/from statements are doing is specifying where to look for code…and you can specify how to look for code in lots of different ways.

Panning for Methods

With that hint I went back and studied importlib and came up with this:

fpModules = [name for (_, name, _) in pkgutil.iter_modules([os.path.dirname(__file__) + "/file-processors"])]

# scan the modules for all functions matching that of a file processor
# (i.e., func(configuration.ConfigInfo, configuration.MaestroFile))
for modName in fpModules:
    imported = importlib.import_module('.' + modName, package='file-processors')

    for item in [x for x in dir(imported) if not x.startswith("__")]:
        attribute = getattr(imported, item)

        if matchesFunctionSignature(attribute, reqdParams):
            fileProcessors.append(attribute)

I’ve left the necessary import statements out for clarity. The magic starts with line 1, where pkgutil.iter_modules() finds every module in the file-processors subdirectory of my project directory and the list comprehension keeps just the module names.
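
To see pkgutil.iter_modules() on its own, here is a self-contained sketch that builds a throwaway directory first. The file names are invented for the example and have nothing to do with the real file-processors folder.

```python
import os
import pkgutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # two python files and one file that is not a module
    for filename in ("hostname.py", "openvpn.py", "notes.txt"):
        with open(os.path.join(tmp, filename), "w") as f:
            f.write("# placeholder\n")

    # each item is a (module_finder, name, ispkg) tuple; keep only the name
    found = sorted(name for (_, name, _) in pkgutil.iter_modules([tmp]))

print(found)  # ['hostname', 'openvpn']
```

Note that the names come back without the .py extension and that notes.txt is skipped, since it isn’t something python could import.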

I then loop through all the module names and load each module in lines 5 and 6. Since I’m not using absolute module names I have to tell import_module where to look by giving it a package name (which here corresponds to the file-processors subdirectory, packages being a special kind of directory in python).

That’s also why there’s a leading ‘.’ on the module name argument; the module path is relative to the package path (python requires you to leave out the more traditional file path separators, e.g., ‘/’).
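
The relative form can be tried out against the standard library. This sketch uses the json package and its decoder module only because they ship with python; they are not part of maestro.py.

```python
import importlib

# relative name plus the package it is relative to
relative = importlib.import_module(".decoder", package="json")

# the equivalent absolute name
absolute = importlib.import_module("json.decoder")

print(relative.__name__)     # json.decoder
print(relative is absolute)  # True
```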

Since imported is not just a module but also an object (everything’s an object in python), I can search it for functions, specifically functions that are designed to process configuration files. That’s what I do in lines 8 and 9, working off of the names of all the attributes in the module.

Note that the dir() function is perfectly happy taking a module object as an argument; it returns a list of the names of the module’s attributes as plain strings. I ignore any name which starts with a double underscore (‘__’) because those are attributes python sets up for its own use (e.g., __name__ and __file__) and aren’t needed by me in this instance.

I then get the attribute associated with the item via getattr() and test it in lines 11 and 12 to see whether it’s a function with the right signature. If it is, it’s (hopefully) a file processor that I may need, so I add it to my collection of file processors.

Those can be called, anytime, just by appending parentheses to them containing the correct parameters. They are no different than a python function defined with the def keyword…because, ultimately, that’s what each entry in the file processors collection points back to, a def clause in some python file someplace.
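
Putting the pieces together, here is a stripped-down, runnable sketch of the whole discover-and-call cycle. It is not the maestro.py code: the demo_processors package, the hostname module and the write_hostname function are all invented for the example, and the signature check is reduced to a plain inspect.isfunction() test.

```python
import importlib
import inspect
import os
import pkgutil
import shutil
import sys
import tempfile

# build a throwaway package containing a single module
tmp = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp, "demo_processors")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "hostname.py"), "w") as f:
    f.write("def write_hostname(name):\n    return 'hostname=' + name\n")

sys.path.insert(0, tmp)  # let the import system find the package
importlib.invalidate_caches()

collected = []
for (_, mod_name, _) in pkgutil.iter_modules([pkg_dir]):
    module = importlib.import_module("." + mod_name, package="demo_processors")
    for attr_name in dir(module):
        if attr_name.startswith("__"):
            continue
        attribute = getattr(module, attr_name)
        if inspect.isfunction(attribute):
            collected.append(attribute)

# every entry is an ordinary function object: add parentheses to call it
results = [func("pi4") for func in collected]
print(results)  # ['hostname=pi4']

# tidy up the temporary package
sys.path.remove(tmp)
shutil.rmtree(tmp)
```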

Checking Method Signatures

The last bit of magic I needed to create a simple extensible app involved checking those methods I found to see if they were ones I could (reasonably) safely call. That’s what matchesFunctionSignature() does:

def matchesFunctionSignature(func, reqdParams: [] = []):
    """determines if an object is a function with a specific set of parameters"""
    if not inspect.isfunction(func):
        return False

    signature = inspect.signature(func)
    if len(signature.parameters) != len(reqdParams):
        return False

    for paramName in signature.parameters:
        idx = list(signature.parameters.keys()).index(paramName)
        if signature.parameters[paramName].annotation != reqdParams[idx]:
            return False

    return True

This function takes two parameters, a (purported) function and a list of required parameter types (the parameter names, of course, aren’t important).

The first thing we do, in line 3, is make sure the object we’re studying is, in fact, a function. That’s done by a call to inspect.isfunction(). The inspect module contains a host of similar checks, and other tools, for inspecting python objects.

If the object really is a function we then need to check the parameters it expects. When I write these file creation/modification functions I follow a convention: each one must use python type hints to declare that it expects two specific types of parameters:

def someTypeOfFileProcessor( configInfo:configuration.ConfigInfo, fileInfo:configuration.MaestroFile ):
    pass

I get information about the function’s parameters from a call to inspect.signature(). This lets me first check to ensure the function has two and only two parameters.
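
As a rough illustration of what inspect.signature() hands back, consider this sketch. The sample_processor function and its parameter types are made up for the example.

```python
import inspect


def sample_processor(config: dict, target: str, backup: bool = True):
    pass


signature = inspect.signature(sample_processor)

# parameters is an ordered mapping of name -> inspect.Parameter
names = list(signature.parameters)
annotations = [p.annotation for p in signature.parameters.values()]

print(names)                                   # ['config', 'target', 'backup']
print(annotations == [dict, str, bool])        # True
print(signature.parameters["backup"].default)  # True
```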

I then check each parameter in turn to see if it matches the corresponding required type in the type list we were given. I get the index of each parameter by that somewhat convoluted line 11 which extracts the keys — which are the parameter names — from the parameters dictionary and then figures out their index among all the parameters (the parameters dictionary is an ordered dictionary, so the list that comes out has the parameters in the right sequence).

Once I have the parameter’s index I check the parameter’s type against the type I want it to be in lines 12 and 13. If everything matches I’ve got a valid file processor function.
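
For what it’s worth, the index lookup can be avoided by pairing each parameter with its required type using zip(). This is a sketch of an alternative, not the code maestro.py actually uses; matches_signature and the sample functions are names invented for the example.

```python
import inspect


def matches_signature(func, required_types=()):
    """True if func is a function whose annotations match required_types in order."""
    if not inspect.isfunction(func):
        return False

    params = list(inspect.signature(func).parameters.values())
    if len(params) != len(required_types):
        return False

    # compare each parameter's annotation with the type in the same position
    return all(p.annotation == t for p, t in zip(params, required_types))


def good(a: int, b: str):
    pass


def wrong_type(a: int, b: float):
    pass


def too_few(a: int):
    pass


checks = [
    matches_signature(good, [int, str]),
    matches_signature(wrong_type, [int, str]),
    matches_signature(too_few, [int, str]),
    matches_signature(len, [int, str]),  # built-ins fail inspect.isfunction()
]
print(checks)  # [True, False, False, False]
```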

Decorators As Pseudo Metadata

One of the main reasons I went the class route in my second attempt was to make each file processor self-aware as to what file it handled. That’s easy enough to do with a class by defining a method that returns the target file name.

Making a python function self-aware that way stumped me for a while. I found myself wishing python had something like C#’s Attribute system, where you can decorate anything with metadata, and functionality, that is tied to the decorated entity but can be accessed without having to create or call the entity.

It turns out you can do something loosely analogous to that in python using decorators, which are actually just functions that wrap other functions:

import functools
import logging

def target_path(targetPath:str):
    """wraps a file processing function so that the file to be processed is specified as kind of metadata
    in the decorator"""
    logger = logging.getLogger("maestro")

    def decorator_target_path(func):
        @functools.wraps(func)
        def wrapper_target_path(*args, **kwargs):
            if len(args) != 2:
                logger.warning("target_path decorated function has {} arguments when 2 are expected".format(len(args)))
            elif isinstance(args[0], configuration.ConfigInfo):
                configInfo = args[0]
                # find the entry for targetPath in the master list of files
                fileInfo = next((x for x in configInfo.files if x.targetPath == targetPath), None)
                if fileInfo is None:
                    logger.warning("No configuration defined for {}".format(targetPath))
                else:
                    logger.log(0, "processing {}".format(targetPath))
                    newArgs = (args[0], fileInfo)
                    func(*newArgs, **kwargs)
            else:
                logger.warning("1st argument passed to target_path wrapped function is not a ConfigInfo")

            return targetPath
        return wrapper_target_path
    return decorator_target_path

This decorator will wrap any function. But it “wants” to wrap one which accepts two parameters, a ConfigInfo object and a MaestroFile object. If it finds it’s been given such a function it replaces the provided MaestroFile object with one drawn from the “master list” of files to be configured (e.g., /etc/hostname) held by the ConfigInfo object, and then calls the wrapped/decorated function with the revised parameter list.

The trick comes in using a parameter from the decorator call to look up the correct MaestroFile object, as you can see in the first line below:

@target_path("/etc/hostname")
def WriteHostname(configInfo: configuration.ConfigInfo, fileInfo: configuration.MaestroFile):
    # fileInfo gets replaced by the targeted entry from configInfo.files by the decorator
    backupFile(fileInfo)

    # the with statement closes the file even if the write fails
    with open(fileInfo.targetPath, "w") as file:
        file.write("{}\n".format(configInfo.local.name))
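
To watch the mechanism run end to end without the rest of the project, here is a miniature version. Config and FileEntry are hypothetical stand-ins for ConfigInfo and MaestroFile, and the logging and argument checks are left out, so treat it as a sketch of the pattern rather than the real thing.

```python
import functools
import inspect
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileEntry:  # hypothetical stand-in for MaestroFile
    targetPath: str


@dataclass
class Config:  # hypothetical stand-in for ConfigInfo
    files: List[FileEntry] = field(default_factory=list)


def target_path(targetPath: str):
    """Outer layer: receives the decorator's own argument."""
    def decorator(func):
        """Middle layer: receives the function being decorated."""
        @functools.wraps(func)
        def wrapper(config, fileInfo=None):
            """Inner layer: runs every time the decorated function is called."""
            match = next((f for f in config.files if f.targetPath == targetPath), None)
            if match is not None:
                func(config, match)
            return targetPath
        return wrapper
    return decorator


seen = []


@target_path("/etc/hostname")
def write_hostname(config: Config, fileInfo: FileEntry):
    seen.append(fileInfo.targetPath)


config = Config(files=[FileEntry("/etc/hostname"), FileEntry("/etc/hosts")])
returned = write_hostname(config, None)
print(returned, seen)  # /etc/hostname ['/etc/hostname']

# functools.wraps lets inspect.signature() see the original annotations
hinted = inspect.signature(write_hostname).parameters["fileInfo"].annotation
print(hinted is FileEntry)  # True
```

The last two lines matter for the signature check described earlier: because of functools.wraps, a decorated function still reports the type hints of the function it wraps.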

Wrap Up

There’s more to the overall project than just the bits described above. But the most interesting parts — and the biggest sources of insight into python for me — were the bits I described. Everything else is reasonably straightforward, although I will briefly point out a few more things:

  • I originally had a lot of print() statements so the console output would show what was happening. I switched that over to calls to python’s built-in logging facility because that let me easily define verbose/normal modes of status reporting.
  • The configuration file that drives this app is JSON-structured. Getting that into a set of python classes was more of a pain than I expected coming from the C# world. The most reliable way I found to do this was a two-step: use json.load() to read the file in as a nested dictionary and then dacite.from_dict() to convert that dictionary over to a set of interconnected python data class instances.
  • At one point I need to use scp to grab a file of dhcp leases from a different Raspberry Pi. Since I didn’t want to hardcode sudo passwords in my source code I ended up using the pexpect package to interact programmatically with a separately spawned shell to do the scp call. That way I could ask for the password at runtime and discard it once I’m done with it.
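
The verbose/normal switch mentioned in the first bullet could be sketched roughly like this. The configure_logging helper, its verbose flag and the message format are assumptions made for the example; only the “maestro” logger name comes from the code above.

```python
import logging
import sys


def configure_logging(verbose: bool = False) -> logging.Logger:
    """Hypothetical helper: DEBUG and up when verbose, INFO and up otherwise."""
    logger = logging.getLogger("maestro")
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)

    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)

    return logger


logger = configure_logging(verbose=False)
logger.info("configuring /etc/hostname")  # shown in both modes
logger.debug("backing up existing file")  # shown only in verbose mode
```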

