Intro to the Immutable Log

The Gold Standard for Recording Access to Company Data

The Immutable Log, a term I first came across in a Forbes article many years ago, is such an amazing concept, yet virtually nobody (statistically speaking) knows about it! An Immutable Log is a complete record of each and every interaction with any data within a data store. We have seen time and time again that people are prone to misusing sysadmin power, especially when it pertains to someone they know or care about (or might be stalking). You must have tiered access when it comes to a database (or data store), but there should always be an umbrella layer that holds absolutely everyone in an organization accountable for all interactions (a relative Panopticon). Without this, a very sophisticated user could wreak havoc that is almost unobservable to the vast majority of the organization, especially if those members aren’t, at the very least, highly competent technologists in their own right. Technical expertise can be so hard to evaluate, both in an interview and in daily work, that entire subcultures can crop up within large companies where the key to membership is simply knowing the intricacies of your domain, cold.

More and more companies are implementing some form of an Immutable Log (IL). A properly designed Immutable Log is very simple in theory: it records every interaction with a database (like the ones that store user records), it covers everyone in the company with access, and it cannot be modified, edited, or deleted. And you’d better believe there are solid redundant copies (geographically distributed), should you be dealing with especially crafty folk who want to erase their tracks.
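To make the "cannot be modified, edited, or deleted" property a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module, where triggers reject any UPDATE or DELETE on the log table. The table name `access_log`, its columns, and the helper functions are all hypothetical, chosen only for illustration. This is nowhere near a full IL: anyone who can drop the triggers or replace the database file can still defeat it, which is exactly why the redundant, distributed copies matter.

```python
import sqlite3


def open_append_only_log(path=":memory:"):
    """Create a hypothetical access_log table that rejects UPDATE and DELETE."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS access_log (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            accessed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            accessor    TEXT NOT NULL,
            reason      TEXT NOT NULL
        );
        -- Any attempt to change an existing row is aborted.
        CREATE TRIGGER IF NOT EXISTS access_log_no_update
        BEFORE UPDATE ON access_log
        BEGIN
            SELECT RAISE(ABORT, 'access_log is append-only');
        END;
        -- Any attempt to remove a row is aborted.
        CREATE TRIGGER IF NOT EXISTS access_log_no_delete
        BEFORE DELETE ON access_log
        BEGIN
            SELECT RAISE(ABORT, 'access_log is append-only');
        END;
    """)
    return conn


def record_access(conn, accessor, reason):
    """Append one row; inserting is the only write the table allows."""
    with conn:
        conn.execute(
            "INSERT INTO access_log (accessor, reason) VALUES (?, ?)",
            (accessor, reason),
        )


def try_tamper(conn, sql):
    """Run a statement and return True if the database rejected it."""
    try:
        with conn:
            conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False
```

With this in place, `record_access(conn, "alice", "support ticket")` succeeds, while `try_tamper(conn, "DELETE FROM access_log")` returns True and leaves the row where it was.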

Let’s consider the design of a theoretical database that stores an ID, an access/modification timestamp, the accessor’s user account, and a description of why the data was accessed. A deployed IL would go much deeper, for example by validating each action against some unchangeable piece of information, such as a hash, that subsequent logged actions could reference (which could also harness a blockchain-type implementation).
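The hash idea can be sketched in a few lines of Python. What follows is only an illustration under my own assumptions: the log is a plain in-memory list, the field names mirror the four columns above, and `GENESIS_HASH`, `append_entry`, and `verify_chain` are hypothetical names. Each entry stores the SHA-256 hash of its own contents plus the hash of the previous entry, so editing any earlier entry breaks every link after it.

```python
import hashlib
import json
from datetime import datetime, timezone

# Placeholder "previous hash" for the very first entry (an assumption of this sketch).
GENESIS_HASH = "0" * 64


def _digest(body, prev_hash):
    """Hash the entry's fields together with the previous entry's hash."""
    payload = json.dumps(body, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, accessor, reason, timestamp=None):
    """Append a new entry that is chained to the last one in the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {
        "id": len(log) + 1,
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "accessor": accessor,
        "reason": reason,
    }
    entry = dict(body, prev_hash=prev_hash, hash=_digest(body, prev_hash))
    log.append(entry)
    return entry


def verify_chain(log):
    """Return True only if every entry still matches its recorded hashes."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: entry[k] for k in ("id", "timestamp", "accessor", "reason")}
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(body, prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```

Note that a chain like this only makes tampering detectable, not impossible: someone with full write access could rewrite an entry and recompute every hash after it, which is where independent copies of the log (or at least of the latest hash) come in.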

If one were creating such a log with PostgreSQL (often considered the big brother of MySQL), for example, a significant portion of the system design should focus on the very top of the pyramid: how do you design the IL to account for root access? Of course, some user on the server must have sufficient permissions when the IL database is created, and again whenever the IL is updated (since the IL will be improved like all schemas), but what is the order of operations for this user? More likely than not, this user wouldn’t be called “root” in practice, but let’s refer to this user as root for the sake of this post.

The root user would be used to conjure up the IL schema and, let’s say, edit it periodically. The first consideration must then be: how do we make sure root is bound by the rules of the IL after the agreed-upon implementation (or launch) of the IL? That is to say, any edits of the IL must themselves be recorded within the IL, so there are no relative superusers and everyone plays by the same rules. By definition, if any employee or consultant could exist outside of the log, the log was not designed correctly. There should be no opportunities for this kind of error during the upgrading of account permissions, or any sudo activity (which, if you didn’t know, means that some Linux user can put on a crown and run around with the keys to the castle, or burn it all down). Everyone’s actions are logged from the inception date of the IL. Okay, this is starting to sound a little dystopian, but actually, the IL is incredibly freeing. Why? Because everyone then knows that everything they are doing is being logged, and that simple fact alone will curb potential errant behavior, though it could very well take other forms, like kidnapping the IL itself. I should never have watched WarGames as a kid.

Maybe that’s a little too high level, but if a company is using an IL correctly, we can at least say we have data on employee (or consultant) activity, which is a completely separate issue from how we detect errant behavior from really smart people! What are the rules of the game, once you know the game is live? How do you define errant behavior, and predict it? Well, in short, that’s where it becomes much trickier. One could coyly say, “let’s just run the IL for some amount of time and treat it as a training dataset for a (simple) machine learning algorithm that looks for outliers in access patterns.” But again, this could easily be defeated if the company’s access control list (aka the rules of the game) is poorly guarded, if employees swapped login credentials or permissions, or if an outside entity somehow obtained internal access, masqueraded as a sysadmin, and used that plain vanilla access to start downloading critical system details, which wouldn’t set off any red flags. I feel like I’ve been watching too much Mr. Robot and Person of Interest, and am now having Daemon flashbacks…
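For a feel of what "looking for outliers in access patterns" might mean at its very simplest, here is a rough sketch. It is not a trained machine learning model. Instead I am swapping in a robust statistic, the modified z-score based on the median absolute deviation, and the function name, the input shape (accessor mapped to number of record accesses in a day), and the 3.5 cutoff are assumptions made for illustration.

```python
from statistics import median


def flag_outliers(daily_counts, threshold=3.5):
    """Return accessors whose access count is unusually high for the group.

    daily_counts maps an accessor name to how many records they touched.
    Uses the modified z-score: 0.6745 * (x - median) / MAD.
    """
    values = list(daily_counts.values())
    if not values:
        return []
    med = median(values)
    # Median absolute deviation: a spread measure that one huge value cannot inflate.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread to measure against, so this sketch declines to flag anyone.
        return []
    flagged = [
        accessor
        for accessor, count in daily_counts.items()
        if 0.6745 * (count - med) / mad > threshold
    ]
    return sorted(flagged)
```

Given counts of 12, 15, 11, and 14 for four people and 240 for a fifth, only the fifth is flagged. And, as the paragraph above warns, a check like this does nothing against someone whose stolen or borrowed access stays within normal-looking volumes.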

Needless to say, creating and implementing an IL properly is system-specific. It requires a comprehensive understanding of the players and their relative boundaries of “acceptable” activity, a separate system to quantify risk, and then, sigh, some way to notify multiple individuals when problematic activity is detected, because a rogue sysadmin receiving a notification that the IL detected some risky behavior related to their own account doesn’t really help much, does it? Don’t even get me started on having a backup plan should something bad happen, or how to remove user access quickly during a DEFCON 1 event.
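The notification point is simple enough to sketch. Assuming a hypothetical list of reviewer accounts, the function below picks who should hear about an alert: never the flagged account itself, and never fewer than some minimum number of other people. The names and the default minimum of two are mine, not part of any real alerting system.

```python
def alert_recipients(reviewers, flagged_account, minimum=2):
    """Pick alert recipients, excluding the account the alert is about."""
    recipients = sorted(set(reviewers) - {flagged_account})
    if len(recipients) < minimum:
        # Too few independent reviewers: fail loudly rather than alert only one person.
        raise ValueError(
            f"need at least {minimum} reviewers other than {flagged_account!r}"
        )
    return recipients
```

So with reviewers alice, bob, and carol, an alert about bob goes to alice and carol, while an organization with a single reviewer finds out about that gap at configuration time rather than during an incident.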

An IL is a really great tool for preventing potential security threats and misuse, but implementing one that is applicable to many projects, or to companies across the nation, is a really difficult effort (and way overkill from a cybersecurity standpoint). I have been toying with the idea of creating an open-source IL skeleton for a little while, but again, creating a generalizable form that could be used in practice is where it gets much, much harder. Until then, the best I can recommend is to think deeply about who has “God mode” access within your organization, and why, because no one should, unless their access is logged in an immutable way.

Nick Warren

Nick is the Founder & CEO of MetaSensor, a venture-backed internet of things startup located in Silicon Valley, and a Behavioural Product Designer at Duke's Center for Advanced Hindsight (with Dan Ariely et al.).
