Atomic Shelters for Active Observers – Hardening by Example (part 2)

Boris Lukashev
Aug 6, 2018

In part 1 of this series we defined a valuable target objective for attackers seeking to advance compromise of a client environment, resolved its value and approach vectors to the valuable assets (inside and into the system), and defined a basic scope of defensive measures required to harden the objective. Now that we have defined a set of hardening requirements, how will we attain them?

In part out of sound technical reasoning, and in part because shamelessly advancing ourselves as a solution provider is part of a rational business model, our first-order approach is to define per-objective scopes of work, consolidate them with deduplication of efforts into a phased project, and create a deliverable crafted in detail to the defined specification... A waning number of enterprises and governments can find the talent and budget to hire or contract entities capable of doing such work in narrow mission-critical verticals. Outside of those niche/valued spaces, there is a massive talent shortage – this sort of work requires an in-depth understanding of a plethora of components, interactions, and levels of abstraction to effectively define the scope of objectives, much less execute in a provable and repeatable manner.

All too often we see the results of this talent shortage as the second option – a “generic set of restrictions” placed on all systems under the jurisdiction of a department or team under the guise of systems hardening, which may or may not have any application to the assets of value residing on the host, and may or may not impact service operations on the host. Defining a universally functional generic defense model using a single tool like SELinux (a Linux Security Module) or iptables (the generic Linux firewall interface) is infeasible given the domains of concern outlined in part 1 – traffic still needs to come in and out, and internal gearing needs to maintain friction to work (SELinux can’t be overly restrictive without breaking things).

Our friends and partners over at Atomicorp just outside of DC have come to many of the same conclusions and defensive tactics – they are also informed by work in critical infrastructure, military service in ground units, and decades of engagement across industries and clients with their own unique perspective on technology and how it applies to what they do. Unlike us, however, they decided to take the proverbial high road in the paradigm of “tailor vs fabricate” – they chose to take on the effort required to build a generically applicable set of intertwined hardening methods providing domain-specific zones of defense on each vector of ingress to a system and access between components of the system. Diving into the deep end of that particular quicksand pit, they also elected to make configuration of these measures user-accessible, which, as any engineer knows, increases complexity by orders of magnitude since the human factor is usually the root cause of issues.

Atomicorp Secured Linux (ASL) is the branded name for a cohesive layered hardening approach to Linux systems consisting of the following:

1 – Kernels hardened against direct attack (deterministic and probabilistic binary mechanisms along with reactive measures) and against userspace malfeasance (like memory operations involved in staging shellcode), providing a learning role-based access control mechanism which can work alongside existing LSMs (or perfectly well all alone).

2 – Host Intrusion Detection/Prevention System (HIDS/HIPS) fed by custom rules and near-real-time threat telemetry providing userspace monitoring of/response to logs, file integrity checksums, and states such as network connections, processes, and configurations enacted.

3 – A firewall manager providing a user-friendly interface to an otherwise draconian white-listing implementation for network access control and integrating with the HIPS component to provide reactive firewall block rules against addresses determined to be attacking the host.

4 – A web application firewall providing contextual filtering of HTTP requests and responses, also fed by a near-real-time feed of their custom telemetry to provide detection and defense against 0-day threats which the web-app/component creators have not been able to patch yet (“vulnerability shielding” in industry slang).

5 – Integration between all of the pieces involved, so that kernel logs trigger actions by the HIPS and firewall, and multiple exploit/bad auth attempts against the web app likewise result in firewall blocks and reactive responses from the intrusion detection system. This is presented by a webUI accessible to the human operator who does not have the time to dive into the thousands of pages of documentation describing each of the components involved.
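To make the integration in item 5 a little more concrete, here is a minimal Python sketch of cross-component correlation: adverse events reported by different pieces (WAF, SSH log analysis, kernel logs) are counted per source address, and an address crossing a threshold inside a time window is marked for blocking. This is not ASL's implementation. The class name `ReactiveBlocker`, the threshold, the window length, and the component labels are all hypothetical, and a real deployment would hand the address to the firewall manager rather than keep it in memory.

```python
from collections import defaultdict, deque
import time


class ReactiveBlocker:
    """Correlate adverse events per source address across components.

    Hypothetical illustration only: names and defaults are assumptions,
    not taken from ASL or any of its components.
    """

    def __init__(self, threshold=3, window_seconds=60.0):
        self.threshold = threshold        # assumed: events needed to block
        self.window = window_seconds      # assumed: sliding window length
        self.events = defaultdict(deque)  # source -> deque of (time, component)
        self.blocked = {}                 # source -> components that reported it

    def report(self, source, component, now=None):
        """Record one adverse event; return True if `source` is newly blocked."""
        now = time.monotonic() if now is None else now
        if source in self.blocked:
            return False  # already blocked, nothing new to decide
        recent = self.events[source]
        recent.append((now, component))
        # Discard events that have aged out of the sliding window.
        while recent and now - recent[0][0] > self.window:
            recent.popleft()
        if len(recent) >= self.threshold:
            # Remember which components contributed to the decision.
            self.blocked[source] = sorted({c for _, c in recent})
            del self.events[source]
            return True
        return False
```

Feeding it one WAF hit, one failed SSH login, and one kernel log event from the same address within the window would trip a threshold of three, even though no single component saw enough on its own. That is the practical value of wiring the pieces together.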

Without getting into proprietary details, suffice it to say that the efforts put toward ensuring compatibility and stability on systems deploying this armada of hardening functions are nothing short of herculean. Configurations for the kernel build, intrusion prevention system, and application firewall are painstakingly tailored to provide the maximum level of compatibility and protection so as to remove the common blocker to hardening implementations – the elephant tears of engineers watching their work product turn to slag under overly restrictive “generic defenses.” A wonderful illustration of such problems can be seen in the PTI roll-out for a well-known OS vendor – anti-virus and other things relying on those virtual mappings suddenly and catastrophically failed to work with the new defense.

With this arsenal of tooling, ASL uses an automated (user-proof) installer to deploy and integrate these components with the system to service the objectives laid out in our defense assessment (in part 1). While it may not map out exactly like a painstakingly tailored defensive design and implementation onto our target system’s components and services, it does provide the desired defensive posture of “being too unpleasant a target to waste time on” for attackers answering to a clock (anyone but a bored teenager, and they don’t make those as clever anymore, or with as much free time). Unless the objective is to deter a nation state, and not one of the soft-shelled utopia-oriented ones either, ASL functions restrict ingress and post-exploitation lateral movement to a degree which should satisfy the needs of most service hosts. “That’s a great marketing pitch...” (was my original thought too) but how does it work, exactly?

Going back to the hardening objectives identified in part 1, we can map ASL functions to those domains of concern and observe the reduction in attack surface/increase in attack complexity.

1 – Network restriction is achieved via the ASL firewall wrappers and informed by a restrictive default configuration, user changes via the UI or CLI, and the host intrusion prevention system leveraging it to create dynamic block rules for addresses determined to be interacting with the host adversely. The administrative services (SSH/HTTPS) can be locked down to permit access only from the administrative subnet, and failed authentication attempts or flat-out attacks result in source IPs being blocked and unable to continue recon or offensive operations without a source address change. Similarly, ingress-only rules can be established for UDP services not relying on bidirectional comms, and any exploits sent along those datapaths resulting in a service crash would find the source address of the exploit/attack blocked as well, preventing further communication (via the shell which the attacker tried to launch).

2 – Contextual filtering of HTTP, the most complex protocol and the one with the most direct access to assets under management, is achieved via the web application firewall. This is an inline implementation sitting inside the web server or proxying connections prior to it, and it both responds itself to interactions with clients and logs information to be processed by the intrusion prevention system. The IPS reviews logs from the web server (and WAF), the SSH service, and other common system services in real time. Comparing current log states (or command outputs) with normalized baselines permits contextual identification of “abnormal” at the least, and utilizing the rule feed from Atomicorp, the HIPS most commonly identifies the actual condition and provides an appropriate response. Attacks bypassing these mechanisms and causing a service crash still have to deal with the kernel’s defenses, and the HIPS wires into those logs too.

3 – ASL provides safe default configurations to minimize privilege in the services it configures, but this is the hardest part to get right in a generic way – without prior knowledge of which services, processes, users, groups, and subjects of access are in play and how they interact, a detailed separation model which does not break functionality is nearly impossible to attain. However, to the degree possible, it does provide hardening measures for isolation which the user elects to implement such as the RBAC system, enforcing boundaries in namespaces (hardening Docker/LXC/etc), restricting access to resources owned by other users/groups, and blocking off common avenues for bypassing existing isolation mechanisms by reading/writing to raw resources (like /dev/mem). Furthermore, ASL utilizes its custom HIPS rules to analyze (crash) logs and other telemetry in order to attempt identification of malicious activity at the process level and reactively respond to the detected condition (such as by effecting block rules for addresses interacting with the service at the time of crashes).

4 – ASL carries multiple mechanisms for mitigating impact and reducing collateral damage when exploits are successful. The kernel’s brute force protection mechanism prevents UIDs which have crashed recently from starting up new PIDs. The HIDS/HIPS watches context such as the process and netstat tables, along with system and application logs to detect signs of compromise and take action to set the system back to a consistent state. As noted above, the firewall functionality is leveraged to sever all comms with “bad hosts” preventing egress even via unconventional methods such as UDP shells, informed by the log analysis performed via the HIDS/HIPS runtime.

5 – This piece ASL does in spades, in no small part due to the fact that Atomicorp actually stewards the HIDS/HIPS project in use, and their CTO is one of the world’s foremost experts in its use. Their collaboration with other projects in the ecosystem has paid off in a fair amount of log output normalization by the component authors, along with addition of indicators critical for directed response (like which container on the host produced the dmesg log for a segfault...). For human-oriented outputs, they have put major efforts into dealing with the JSON streams, and are actively engaged with partners such as SVIT in working out delivery mechanisms for NSM and standalone uses by engineers called up to help fight the fire. By the time most readers see this, they will have already uncased the vast array of new capabilities and data-level interoperability functions which they’ve been crafting under the moonlight...
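The brute force protection mentioned in item 4 is easiest to reason about as a policy: a UID that crashed recently is refused new processes for a while, so every failed exploitation attempt that kills a process costs the attacker wall-clock time. The following Python sketch models only that policy in userspace terms. It is an assumption-laden illustration, not the kernel mechanism itself: `CrashLockout`, its method names, and the 30-second window are hypothetical.

```python
import time


class CrashLockout:
    """Refuse new process starts for a UID that crashed within a lockout window.

    Hypothetical model of the policy described above; the real mechanism
    lives in the kernel and its parameters are not stated in this article.
    """

    def __init__(self, lockout_seconds=30.0):
        self.lockout = lockout_seconds  # assumed value, not an ASL default
        self.last_crash = {}            # uid -> time of most recent crash

    def record_crash(self, uid, now=None):
        """Note that a process owned by `uid` crashed at time `now`."""
        self.last_crash[uid] = time.monotonic() if now is None else now

    def may_start_process(self, uid, now=None):
        """Return True if `uid` is allowed to start a new process at `now`."""
        now = time.monotonic() if now is None else now
        crashed_at = self.last_crash.get(uid)
        if crashed_at is None:
            return True  # no recorded crash for this UID
        return now - crashed_at >= self.lockout
```

Under these assumed numbers, an exploit that needs many guesses and crashes the target on each miss pays at least one lockout window per attempt, which is exactly the kind of delay that makes a host unattractive to an attacker answering to a clock.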

In 4 of the 5 stated requirements, ASL meets the mark head-on without exhausting its bag of tricks. The targeted isolation piece won’t be automated any time soon – we are a very DevOps heavy shop and even with our Ruby-driven flexibility, cannot create the proper execution flows to effect isolation without understanding components, dependencies, and interplay across a wide variety of operating and failure conditions. What’s more, a year of licensing for this fire-and-forget approach is about the equivalent of one hour’s worth of engineering time out of the initial assessment for such an effort (bad for business if you’re us, good for customers who can’t get us). It is one of those rare cases of getting many times more than one pays for. In the final tally, even data-only attacks against pull-based services such as SNMP face a mountain of challenges in post exploitation, which will not endear the subject system to any attacker...

Given that Atomicorp offers such a robust stack as a viable way forward using the 2nd method of defensive correction (well-rounded coverage in hardening measures), why then would anyone call the consultants? In fact, the two are not mutually exclusive, but symbiotic – the tooling and expertise provided by Atomicorp, coupled with our (or their) domain-specific experts working shoulder to shoulder with the client to optimize the solution for their specific use case, provides the best of both worlds. Systems are built with performance and security considerations from the design phase on out, leveraging the components and methods from ASL along with granular isolation via RBAC/namespaces/etc, MPROTECT exceptions specific to the environment, kernels built to the hardware and specification in use, and rule feeds tailored to the applications and systems in play. What’s more, the lessons learned in field work directly inform the product’s design and implementation. How’s that for “vendor backed?”

Whether you’re an enterprise facing nation-state-level threats or a small business caring about the integrity of your systems and data, effective hardening of the infrastructure is not something solely reserved for the megacorps of the world. Give us or our friends at Atomicorp a call; it will be the difference between your organization and the one in the news.
