Return on Security Investment: Top-down and Best-Practice

Klaus Jochem

Klaus Jochem has more than 30 years of IT experience. In 2014, he joined the Bayer Information Security Program, where he was responsible for secure application architecture. He has been a freelance consultant for IT and OT security since July 2019.

Klaus Jochem, IT/security strategist

Klaus Jochem, how long have you worked in cybersecurity and OT security, and what have been your most exciting experiences so far?

I’ve been involved with IT for more than 35 years. And I’ve worked intensively in IT security since 1989. I got into OT security about 15 years ago through my work with manufacturing execution systems.

My most exciting moment was in 1987: I was working on mainframes with the IBM operating system. A system programmer kept blocking certain directories, which disrupted the data transfer between the development locations in Germany and Florida over the IBM SNA network. As a result, about 1,500 software developers couldn’t do their jobs. Using functions inherent in the operating system, I programmed a trigger that displayed the relevant programmer’s username and terminal on my screen. You can’t imagine how shocked he was when I suddenly appeared and confronted him.

You often talk about RoSI (return on security investment) as a central element of cybersecurity. What does this idea mean to you? What benefits does it offer and what can companies learn from it?

I wouldn’t describe RoSI as a central element. It’s a method for evaluating cybersecurity investments. It works in advance, before the investments are actually made. That’s what makes it so powerful.

When it comes to cybersecurity investments, we often forget that costs increase going forward as a result – even for solutions that completely prevent the losses in question. That translates into a long-term drop in the company’s profits, because it’s not usually possible to pass on the cost of cybersecurity measures to customers. Instead, you have to sell more products just to keep profits at the same level. To express it in my favorite currency, gummy bears: if the annual operating costs of a cybersecurity solution add up to EUR 50,000, the company has to sell a lot more gummy bears each year to earn the same profit.  
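The gummy-bear arithmetic can be made concrete with a small sketch. Only the EUR 50,000 operating cost comes from the example above; the profit of EUR 0.25 per bag and the function name are purely hypothetical, chosen for illustration.

```python
import math


def extra_units_needed(annual_security_cost: float, profit_per_unit: float) -> int:
    """Units that must be sold in addition each year to offset a recurring cost."""
    if profit_per_unit <= 0:
        raise ValueError("profit_per_unit must be positive")
    return math.ceil(annual_security_cost / profit_per_unit)


# EUR 50,000 is the operating cost from the example.
# EUR 0.25 profit per bag is an assumed figure, not one from the interview.
extra_bags = extra_units_needed(50_000, 0.25)
print(f"Additional bags per year: {extra_bags:,}")
```

The lower the profit per unit, the more additional sales a recurring security cost requires, which is why operating costs deserve as much attention as the purchase price.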

At the latest, RoSI should come into play when designing risk reduction measures. If a security risk is identified that exceeds the organization’s risk capacity, action should be taken to reduce it to an acceptable level. In essence, the difference between the identified risk and the acceptable level is the value of R: the amount of risk that a security solution must eliminate.

RoSI provides a way to compare different security measures or solutions that can reduce risk by at least R. The criterion for such a comparison is the cost of the tools involved. And the primary goal is to find a solution that optimizes RoSI.

At minimum, tool costs include the purchase price, the cost of rollout, and operating costs, ideally over three or five years. You should also take opportunity costs and effectiveness losses into account.
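A comparison along these lines might look like the following sketch. The interview does not give a formula, so the code assumes one commonly used formulation, RoSI = (risk reduction − cost) / cost, evaluated over a three-year period. All solution names and figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    name: str
    risk_reduction: float      # expected loss avoided per year, EUR (assumed)
    purchase: float            # one-time purchase price, EUR
    rollout: float             # one-time rollout cost, EUR
    operating_per_year: float  # recurring operating cost, EUR
    opportunity: float = 0.0   # opportunity cost over the whole period, EUR

    def total_cost(self, years: int) -> float:
        """Purchase, rollout, and opportunity costs plus operating costs over the period."""
        one_time = self.purchase + self.rollout + self.opportunity
        return one_time + self.operating_per_year * years

    def rosi(self, years: int) -> float:
        """Assumed formulation: (risk reduction over the period - cost) / cost."""
        cost = self.total_cost(years)
        return (self.risk_reduction * years - cost) / cost


def best_by_rosi(solutions, required_reduction, years=3):
    """Among solutions that reduce risk by at least R, return the one with the highest RoSI."""
    eligible = [s for s in solutions if s.risk_reduction >= required_reduction]
    if not eligible:
        return None
    return max(eligible, key=lambda s: s.rosi(years))


# Hypothetical candidates; R is assumed to be EUR 100,000 per year.
candidates = [
    Solution("Solution A", 120_000, purchase=60_000, rollout=20_000, operating_per_year=50_000),
    Solution("Solution B", 100_000, purchase=20_000, rollout=10_000, operating_per_year=30_000),
    Solution("Solution C", 80_000, purchase=5_000, rollout=5_000, operating_per_year=10_000),
]
best = best_by_rosi(candidates, required_reduction=100_000, years=3)
print(best.name, round(best.rosi(3), 2))
```

In this sketch the cheapest option is excluded because it does not reach R, and the option with the largest risk reduction loses out because its operating costs are higher.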

What do effectiveness losses include?

One element is friction. Imagine that you are planning to reduce your identified risk by R. According to the solution manufacturer, doing so will be easy and fast. There might even be a TEI (total economic impact) analysis from Forrester Research available.

But this promise might not hold true for your specific operating environment. Carl von Clausewitz calls the reasons for this “friction,” while for Donald Rumsfeld they were “unknown unknowns.” Two examples:

a)     Let’s say you have an antivirus solution and the weekly full scan is very slow. On these days, your developers take a two-hour lunch break because the solution needs 100 percent of your CPU time to scan the Java libraries in the development environment. As a result, all the developer computers in this environment are subject to a scan exception. An attacker can take advantage of this situation with spear phishing.

b)    Or imagine that rolling out a solution automatically requires write access over the SMB protocol to the admin shares of production network workstations. Enabling this access requires opening the production network segments in the firewall for the SMB protocol and creating a local admin user in each system, all with the same password that is saved in the solution’s management console. Now an attacker could gain access to all computers in the production network over the management console. In the worst case, a WannaCry-type worm could bring the entire production network to a halt even though a security solution is in place.

Friction makes security solutions less effective and, in situations like our second example, reduces security throughout the system. In most cases, the planned level of risk reduction isn’t achieved. And this is something to keep in mind when choosing a product.

Here’s a tip: find a solution that’s a good match for your company and your IT service provider. The planned security gains should not deviate too much from what’s actually achieved. Keep friction in mind when you’re choosing the solution!

Effectiveness losses can also involve evolution and disruption. Attackers continue to develop their tools. New weak points are discovered and patches either don’t completely solve the problem or can’t be implemented. For efficiency reasons, you may also need to allow exceptions. All these factors reduce the solution’s effectiveness over time. As a rule, security solutions are like cars: they lose 15 percent of their value the minute you leave the dealership. It’s important to consider these factors when evaluating solutions in terms of RoSI.
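The effect of such losses on a planned risk reduction can be sketched roughly as follows. This is a simplified model, not a method described in the interview. It treats the 15 percent from the car comparison as an immediate loss and adds an assumed 10 percent loss per year for evolution and exceptions. The planned reduction of EUR 120,000 and the target R of EUR 100,000 are hypothetical.

```python
def effective_reduction(planned, initial_loss, yearly_decay, years):
    """Expected risk reduction at the end of each year.

    planned:      reduction promised on paper (EUR per year)
    initial_loss: fraction lost immediately in the real environment (0..1)
    yearly_decay: fraction lost per year through new attacks and exceptions (0..1)
    """
    value = planned * (1 - initial_loss)
    per_year = []
    for _ in range(years):
        value *= 1 - yearly_decay
        per_year.append(value)
    return per_year


def first_year_below(per_year, required):
    """1-based year in which the reduction first drops below R, or None."""
    for year, value in enumerate(per_year, start=1):
        if value < required:
            return year
    return None


# All figures are assumptions for illustration.
reduction = effective_reduction(planned=120_000, initial_loss=0.15, yearly_decay=0.10, years=3)
print([round(v) for v in reduction])
print("Falls below R in year:", first_year_below(reduction, required=100_000))
```

Under these assumptions, a solution that clears R on paper no longer does so in practice, which is the reason to include effectiveness losses before comparing solutions by RoSI.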

Another tip: pick solutions that are as robust as possible in the face of new attack techniques and weak points. Doing so takes a lot of effort, but can prevent the need to buy yet another solution after two years.

Building on what you’ve said, how is RoSI connected to the risk-based approach you also mention frequently?

Risk is at the center of everything related to cybersecurity. I often hear slogans like “Cybersecurity: just get started.” From a business economics perspective, this approach is a catastrophe.

Here’s an example: imagine a barn with a few mouseholes. The door is always open, too. The farmer complains that sacks of seed are constantly disappearing. “Just get started” could mean that I’ll begin by sealing up the first mousehole and then move on to the next one. By the time I get to the barn door, more sacks of seed have been stolen.

In a risk-based approach, you start by installing a lock on the barn door. You don’t have to worry about the mice since someone is stealing whole sacks of seed. If you really want to limit the damage that the mice are causing, you can get a cat.

As a rule of thumb, you should first address the risks that exceed the organization’s risk capacity. It isn’t necessary to reduce these risks as much as possible – implementing security measures that push the risk level below the organization’s risk limit is enough.

A risk analysis ensures that you identify the weak points (barn door) that, in the current circumstances (door is open), lead to losses (stolen sacks of seed) that the organization (farmer) can no longer bear.

So RoSI is applied when designing security measures to reduce risk below the risk capacity limit. If all the potential measures reduce risk to this extent, then RoSI becomes the method for achieving lasting cost savings.

To continue with our example: a number of different actions could reduce the farmer’s losses. A partial list of options could include:

1.     Video surveillance of the entryway with motion detectors and an alarm.

2.     A fingerprint-activated padlock.

3.     A fence around the barn.

4.     Protecting the barn with a vintage East German self-firing spring gun.

5.     Storing the sacks in a lockable container in the barn.

6.     A regular padlock.

With RoSI, you assess the operating costs for different solutions, ideally over a longer period. In light of various cost aspects, option 6 is the best choice.

But using RoSI in advance doesn’t mean you don’t have to do any follow-up work. After solutions are implemented, you should check whether risk was actually reduced by the amount planned. Such checks rarely take place, but they are crucial in building confidence in security measures.

Combined with RoSI and a risk-based approach, how can companies reduce risk most with the very next euro they invest in cybersecurity?

Assuming that you know what your risks are, I recommend approaching the problem from two directions: top-down and from a best-practice perspective.

Top-down: focus on the three biggest risks.

Analyze the extent to which you can reduce them using standard tools. (See below for ideas.) Then analyze the risk that remains after standard tools have been applied (simulation!). Take action where the remaining risk is greatest.

Look for synergies. Are there security solutions that could reduce more than one risk at once – including the smaller ones?

Use RoSI to assess the effectiveness and costs of solution options with your IT/OT environment and the know-how of your IT/OT employees in mind.
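The top-down steps above can be sketched as a small calculation: take the three biggest risks, subtract what standard tools would remove, and act where the remaining risk is greatest. The risk names, loss figures, and reduction fractions below are all hypothetical. In practice they would come from your own risk analysis and simulation.

```python
def next_investment_target(risks, standard_tool_reduction, top_n=3):
    """Return (name, residual) for the largest remaining risk among the top_n biggest risks.

    risks:                   name -> expected annual loss (EUR)
    standard_tool_reduction: name -> fraction (0..1) removed by standard tools
    """
    top = sorted(risks, key=risks.get, reverse=True)[:top_n]
    residual = {
        name: risks[name] * (1 - standard_tool_reduction.get(name, 0.0))
        for name in top
    }
    worst = max(residual, key=residual.get)
    return worst, residual[worst]


# Assumed figures for illustration only.
risks = {
    "ransomware in production": 900_000,
    "phishing of office users": 600_000,
    "unpatched DMZ service": 500_000,
    "lost laptop": 50_000,
}
reduction_by_standard_tools = {
    "ransomware in production": 0.75,
    "phishing of office users": 0.5,
    "unpatched DMZ service": 0.25,
}
target = next_investment_target(risks, reduction_by_standard_tools)
print(target)
```

With these numbers the largest risk on paper is not where the next euro should go, because standard tools already cover most of it.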

These analyses are time-intensive, but performing them protects you from major surprises during implementation – you find out in advance what costs are likely and what security gains can be expected. Of course, no one is immune from every uncertainty. As Helmuth von Moltke (the Elder) wrote, “No plan of operations extends with certainty beyond the first encounter with the enemy’s main force.”

Best practice: taking full advantage of existing security solutions.

This is the bread and butter of security – and an ongoing issue. Put the standard solutions that you already have – whether technical or organizational – to their full use. I believe you can block more than 60 percent of attacks this way or drastically reduce the damage they cause. A few examples:

a)     You have a firewall around the perimeter of your production network. In many cases you have licensed IDS and IPS but don’t use them. Before you invest in new technologies, turn them on. They will block a lot of attacks coming from the company network – at no additional cost and without affecting production systems.

b)    Keep an eye on what’s happening with weak points. Subscribe to your software suppliers’ security newsfeeds. In addition, subscribe to newsfeeds from US-CERT, BSI, Heise, etc. Every day, look to see whether critical weaknesses have been published for the products you use.

c)     Develop instructions for what to do if a relevant critical weakness is published. When such a case occurs, apply patches to applications at the network’s periphery (Internet DMZ, production DMZ) within 24 hours after the supplier approves them. Prepare an emergency procedure to follow if a supplier can’t provide patches after a critical weakness is published. Consider disconnecting the system from the network to be one option. Discuss how to do it with the units involved, and practice!

d)    Don’t work with permanent administrator rights, especially for domains – and most especially for infrastructure administrators. Limiting such rights drastically reduces the severity of the consequences of an attack. For example: Maersk and Merck would have suffered much less damage from NotPetya in 2017 if the employees who downloaded the malware hadn’t had administrator rights.

e)     Block the use of USB storage devices to reduce the likelihood of an inside attack. You can make the required changes to group policy with just a few clicks. And organize protected channels that employees can use to bring in data from USB drives in a controlled way.

f)     Activate application whitelisting. AppLocker has been part of the Windows operating system since Windows 7. Combined with d), whitelisting can block a large share of attacks with new types of malware that are initiated from the user context. There’s a prerequisite: you need the Enterprise version of Windows. Using application whitelisting is the number one recommendation from the US Department of Homeland Security and the Australian Cyber Security Centre.

You work with both critical infrastructure and manufacturers. What are their similarities and differences in terms of cybersecurity?

I think the risk-based approach makes sense in both contexts. But I would always implement best practices in parallel. The main differences in the two areas are:

a)     Acceptable risk levels for critical infrastructure are specified by law. For example, §8a para. 1 of Germany’s Act on the Federal Office for Information Security (BSIG) requires operators of critical infrastructure “to take appropriate organizational and technical precautionary measures in order to avoid disruptions of the availability, integrity, authenticity and confidentiality of their information technology systems, components or processes that are decisive for the functionality of the critical infrastructures operated by them … organizational and technical precautionary measures shall be considered appropriate, if the required efforts are not disproportionate to the consequences of a failure or an impairment of the critical infrastructure concerned.”

b)    The new version of this legislation places further bounds on the risk-based approach: §8a para. 1a requires the use of attack detection systems, even though with an MTTI (mean time to identify) of 160 days they are largely ineffective (2020 Ponemon study).

c)    Activities related to critical infrastructure focus on ensuring availability.

This focus on availability makes analyzing risk somewhat simpler. In addition, I always recommend assuming the probability of the event in question to be 1 in order to avoid discussions that don’t bring much clarity. Once you know which processes and assets influence the availability of critical services, you can have a less rigorous discussion about the likelihood that a risk event will occur.

What role do you see for company certifications? Who should try to get certified and in what circumstances does doing so make sense – or not?

I think certifications are very helpful – as long as you’re not getting one just because you need to check it off your list. And as long as it doesn’t entail creating an epic of War and Peace proportions.

The main advantage of a certification (ISO 27001) is knowing who needs to do what if a security incident occurs. The point is to limit potential damage as much as possible. Every plant fire brigade has the same goal. Eisenhower, who of course was familiar with the concept of friction, expressed the idea well in 1957 when he said, “Plans are useless, but planning is indispensable!”

By the end of the certification procedure, you are sure to be familiar with your processes, assets, security measures, and risks. The owners of assets and risks have been identified and responsibilities and roles have been defined. You have laid out the steps for handling security events and practiced them, and actions to effectively manage the crisis and ensure business continuity are in place.

It’s important that C-level management wants the certification and initiates it. After all, certification always entails a change process too big to originate at the regular employee level. A member of the management team should also always act as the project sponsor.

Since even a lean certification process involves high costs and a lot of effort, it’s important to weigh the benefits in advance.

What kinds of problems typically motivate clients to seek you out? What issues are especially present in today’s market?

Many clients are looking for assessments before they take steps involving critical infrastructure. Due to the BSIG, clients are also thinking a lot about asset management, weak point management, and attack detection.

The Federal Office for Information Security categorized the 2020 cyberattack on Düsseldorf University Hospital as “preventable.” What’s your view of this incident and what can companies learn from it?

I discussed this topic extensively in a blog. I don’t believe that baseline IT protection (or, as the Office for Information Security calls it, “IT-Grundschutz”) could have prevented an attack like this. In this case, there was no advance warning (the event was a zero-day exploit). The Office for Information Security issued a warning on the day the weakness was published, while Citrix’s warning came a day earlier, when it published the workaround. The workaround itself was implemented immediately. At the time of publication, Citrix said that attacks exploiting the weak point had already been observed. A month passed before a patch was available.

This all adds up to an extremely unfavorable situation! We aren’t generally prepared for attacks with no advance notice. Avoiding attacks with no warning is the objective of every defense strategy – the best example was the Cuban Missile Crisis.

Application whitelisting or EDR solutions can ward off or limit attacks like these, but very few are available for Linux. What’s more, people still consider Linux to be the better operating system – some experts claim you don’t even need antivirus tools for it! So even if relevant solutions were available, people probably wouldn’t use them. And if you work with an appliance, for example, installing them wouldn’t be a simple matter either.

Essentially, the downstream systems are where you need to take action. The Citrix system routes users to a system at the data center. You can take steps at this level by rigorously hardening these systems and implementing strict zero-trust measures. Application whitelisting or EDR solutions could be used here, too.

If you could say something to top managers and the heads of IT and risk management, what would it be?

“Get what matters right the first time!”

Mr. Jochem, thank you for your time and the very interesting discussion. We look forward to talking with you again soon.

Please remember: This article is based on our knowledge at the time it was written – but we learn more every day. Do you think important points are missing or do you see the topic from a different perspective? We would be happy to discuss current developments in greater detail with you and your company’s other experts and welcome your feedback and thoughts.

And one more thing: the fact that an article mentions (or does not mention) a provider does not represent a recommendation from CyberCompare. Recommendations always depend on the customer’s individual situation.