A security concept for a global factory network: Practical considerations in implementation

Authors: Michael Voeth, Clare Patterson and Jannis Stemmann; published in Cyber Security: A Peer-Reviewed Journal, Volume 6, Issue 1.

Authors’ Biographies

Michael Voeth started at Bosch 20 years ago in the special machinery and assembly systems unit, where he was responsible for standardising software, communication and data exchange to increase efficiency and output productivity. Subsequently he acted as product manager in the packaging machine organisation. For more than ten years he has been responsible for IT security for infrastructure and machine connectivity in the manufacturing area for Bosch worldwide.

Clare Patterson is a cyber security researcher, having previously worked for Shell in cyber security strategy and transformation, and as the CIO of Shell Energy. She holds an MSc in cyber security from the University of London and is studying for a PhD at the University of Kent.

Jannis Stemmann previously worked with McKinsey & Co. as a consultant and was responsible for two factories of Bosch as Vice President of Manufacturing and worked as chief of staff for the Bosch board of management. He holds a PhD in engineering sciences from Hamburg University of Technology, Germany.

Abstract

On top of information technology (IT) security risks faced by almost all companies, manufacturers need to deal with additional challenges from operating their own factories. For example, typical difficulties arise from legacy systems, proprietary communication protocols and real-time requirements in highly automated production environments. At the same time, budget constraints need to be considered, as manufacturers often face strong competition (plants are usually unprofitable at low utilisation, and therefore each competitor has a strong incentive to lower prices down to the marginal cost of production). This paper explains the combined IT and operational technology (OT) security concept used by a corporation with a global manufacturing footprint operating in various industry sectors. Lessons learned from testing some security tools are included. In order to scale know-how and make security more affordable for companies in similar situations, the concept of a curated marketplace is introduced, and its implementation described.

INTRODUCTION: A GLOBAL AND DIVERSE MANUFACTURING FOOTPRINT

The Bosch Group is a global corporation founded in 1886 with close to €78bn in revenues and about 400,000 employees. There are currently four main business sectors:
•Mobility solutions (especially for automotive components);
•Industrial technology (electrical and hydraulic drives and controls as well as special machinery such as assembly lines and testing rigs for Bosch plants and research and development [R&D] facilities);
•Energy and building technology (security and communications, heating and water solutions);
•Consumer goods (power tools and household appliances).

Because of the diverse product and customer range, the underlying business units to some extent resemble mid-sized companies with a considerable degree of entrepreneurial independence. Bosch is majority owned by the Robert Bosch Foundation, a kind of charitable institution. In line with the spirit of its founder, management always strives to find a balance between economic, social and ecological interests. A long-term focus is at the core of company values. For example, all Bosch locations have been CO2 neutral since 2020.

The global plant network comprises nearly 250 factories in almost all countries, a typical plant having 500–2,000 local employees. The manufacturing equipment base includes about 125,000 machines with 750,000 information technology (IT) devices from more than 3,000 machine vendors, while about 10,000 machines are added or replaced each year. Manufacturing processes and degrees of automation vary significantly between product categories and countries.
In total, the estimated number of managed virtual local area networks (VLANs) is 4,000. About 30,000 technicians, including external and internal maintenance crews, service the production equipment, and therefore regularly need access to eg human–machine interfaces (HMIs), industrial PCs or programmable logic controllers (PLCs).
In general, Bosch does not operate critical infrastructure. At the same time, most business units face substantial competition and are operating on tight budgets. Both aspects are relevant for IT/operational technology (OT) security and shared with most mid-sized businesses.

PLANTS INCLUDE MANY FUNCTIONS THAT NECESSITATE LOCAL COMPETENCIES

While manufacturing is at the core of each plant, many more functions are required for operation. Major other operation domains are process engineering, product engineering, logistics, quality, facility management (FM), alongside specialist teams such as health and safety executive (HSE). All these domains communicate with each other on the shop floor in a decentralised manner, not centrally coordinated. This is also a main difference to pure enterprise IT setups, which tend to be very centralised. Each plant has to run with a specific combination of national guidelines, products, customer and supplier connections, manufacturing processes and available team capacity. All of these factors influence IT risk assessments.
As a simple example, plants have to define if they need local server rooms which they have to take care of. There are also, however, many more complicated decisions and actions related to IT/OT security which are more efficiently taken on the ground, up to incident response in an emergency, possibly requiring 24/7 availability at short notice.
Responsibility for security of local IT and OT always rests with plant management, who are considered the business owners. Plant managers are supported by local plant IT coordinators, who ensure alignment with central directives and best practice exchange between locations.

PLANT SECURITY IS BASED ON A CONTINUOUS IMPROVEMENT CYCLE

Systematic management for Bosch means a continuous improvement cycle (according to the well-known plan–do–check–act (PDCA) logic). In this sense security is like safety, efficiency or quality: it can never be finished or achieved — the target state is constantly evolving.
The starting point of the improvement cycle is the definition of roles and responsibilities. Today there are easily about 1,600 pages of relevant norms and standards, which need to be filtered for what is relevant and how to implement the guidelines. Examples include COBIT for management responsibilities, the ISO 27001–27035 norm family for security management, TOGAF as a framework for data architectures, applications and infrastructures, as well as internal guidelines like the Bosch Production System. Of course, there are also country-specific regulations which have to be considered. In addition, accountability and responsibilities have to be clarified with vendors and IT service providers as well as between central IT and local plant teams.

As a next step, risk management guidelines such as ISO 31000 or NIST RMF are transferred into practical compliance and audit checklists. That way, implicit expectations are made explicit. Often in this step it becomes clear that guidelines can be interpreted in different ways. Assessments of each location’s IT are scheduled and carried out. Calibration between plants is helpful to decide on unclear cases.
As a result of self-assessments and audits, gaps between the current state and target state of risk management are shown. Action plans to mitigate IT/OT risks are then laid out. These plans have to consider resource constraints (often on both budget and personnel sides) and therefore be prioritised properly.
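
The prioritisation step can be illustrated with a minimal sketch. It assumes that each candidate measure has been given an estimated risk reduction and an estimated cost (both in hypothetical units) and ranks measures by risk reduction per unit of cost until the budget is used up. The measure names and figures are invented for illustration and are not Bosch data; a real plan would also weigh personnel capacity and dependencies between measures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    name: str
    risk_reduction: float  # estimated reduction in risk score (hypothetical units)
    cost: float            # estimated cost (hypothetical units, must be > 0)


def prioritise(measures, budget):
    """Greedy selection: highest risk reduction per unit of cost first."""
    ranked = sorted(measures, key=lambda m: m.risk_reduction / m.cost, reverse=True)
    selected, remaining = [], budget
    for measure in ranked:
        if measure.cost <= remaining:
            selected.append(measure)
            remaining -= measure.cost
    return selected


# Illustrative figures only (assumptions, not taken from the article).
measures = [
    Measure("network segmentation", risk_reduction=40.0, cost=50.0),
    Measure("USB scan station", risk_reduction=15.0, cost=10.0),
    Measure("awareness training", risk_reduction=20.0, cost=8.0),
    Measure("anomaly detection sensors", risk_reduction=25.0, cost=120.0),
]

plan = prioritise(measures, budget=70.0)
for measure in plan:
    print(measure.name)
```

A greedy ranking of this kind is not guaranteed to be optimal, but it keeps the reasoning behind each priority easy to explain to plant management.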

Examples of IT security controls can be found in the ISO 27k series; for ICS-specific controls, IEC 62443 is a standard reference. A typical risk mitigation measure in plants is network segmentation. While this is general knowledge, the exact way to (a) cluster IT/OT assets into adequate groups and (b) define and assess the actually achieved degree of separation requires detailed expertise.
Implementation of agreed action plans in reality often takes longer than expected, and rollouts after initial pilots cover less than the anticipated infrastructure because of exceptions and technical difficulties.
The final step in this model cycle is the anchoring of any new processes in the organisation. Main elements include documentation, workflow automation, monitoring, reporting and competence management, eg by training employees and external partners.
During the course of such a cycle, laws and standards have been updated, new products have been developed, new plants have been built, new attack vectors have been discovered and new lessons have been learned. So the cycle starts again.

ZONE CONCEPT

The architecture used in most Bosch locations in principle is similar to standard reference architectures by, for example, NIST or the IEC. A high-level overview (see Figure 1) of a typical zone separation would show both vertical and horizontal network segmentation with defined conduits.
Enterprise zones shown in light blue are managed by the central IT function. Assets in these zones are patched and hardened according to state-of-the-art practices. As a rule of thumb, there are no legacy systems in these zones. The same rulesets are also relevant for local office zones shown in the darker blue.
Demilitarised zones (DMZ) are shown in yellow. There is a joint responsibility of central IT and central manufacturing IT for conduits between DMZ and office zones. Infrastructure in DMZ is also hardened and patched. The central DMZ contains global services for all plants, eg connectivity, remote access, asset management databases and software distribution. An asset inventory including all machine OT assets (eg up to ten assets per machine, with several industrial PCs) is crucial for systematic management. Local DMZ host servers for eg SCADA, MES or local databases.
Partially connected to the DMZ, machines and machine clusters are operated in security zones. These are separated into VLANs, with large plants having up to 100 different OT VLANs. As a general guideline, direct Internet connections are not permitted from inner zones. For example, no e-mail servers are operated in DMZ or local plant zones. New challenges in this regard are Internet of Things (IoT) devices such as video cameras that bypass the DMZ by directly connecting with an IoT gateway and offering accessibility from the Internet or from untrusted devices (eg service technicians who use webcams to monitor a process). The rate of change in OT environments is also starting to make the management of firewall rules unwieldy. New concepts (‘zero trust’) are therefore under consideration.
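
As a minimal sketch of the zone and conduit idea, the following models zones as an enumeration and conduits as an explicit allowlist of (source, destination) pairs. The zone names follow the description above, but the specific conduits listed are assumptions chosen for illustration, not the actual Bosch ruleset.

```python
from enum import Enum


class Zone(Enum):
    INTERNET = "internet"
    ENTERPRISE = "enterprise"
    LOCAL_OFFICE = "local_office"
    CENTRAL_DMZ = "central_dmz"
    LOCAL_DMZ = "local_dmz"
    MACHINE = "machine"


# Hypothetical conduits: every pair not listed here is denied by default.
ALLOWED_CONDUITS = {
    (Zone.ENTERPRISE, Zone.INTERNET),
    (Zone.ENTERPRISE, Zone.CENTRAL_DMZ),
    (Zone.LOCAL_OFFICE, Zone.LOCAL_DMZ),
    (Zone.CENTRAL_DMZ, Zone.LOCAL_DMZ),
    (Zone.LOCAL_DMZ, Zone.MACHINE),
    (Zone.MACHINE, Zone.LOCAL_DMZ),
}


def is_allowed(src, dst):
    """Traffic inside a zone is allowed; cross-zone traffic needs a conduit."""
    return src == dst or (src, dst) in ALLOWED_CONDUITS


def violations(flows):
    """Return the observed flows that do not match any defined conduit."""
    return [(src, dst) for src, dst in flows if not is_allowed(src, dst)]


# Example: an IoT camera in a machine zone that connects straight to the Internet.
observed = [
    (Zone.MACHINE, Zone.LOCAL_DMZ),
    (Zone.MACHINE, Zone.INTERNET),
]
for src, dst in violations(observed):
    print(f"no conduit defined: {src.value} -> {dst.value}")
```

The deny-by-default allowlist also shows why frequent change makes rule management costly: every new legitimate flow needs an explicit entry.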

SHELL MODEL: DEFENCE IN DEPTH

The zone concept discussed previously can be seen as the centre of a security shell model (also called defence in depth¹). Organisational and technical security controls are used to defend plants against various kinds of cyberattacks and misuse of IT/OT assets – the most common being denial of service (DoS), mass malware, and targeted ransomware attacks.
•Shopfloor employees are trained with special campaigns on IT/OT security awareness and guidelines, using formats that take their situations, languages and requirements into account;
•Intrusion detection systems/intrusion prevention systems (IDS/IPS) are used to monitor conduits between enterprise and local zones;
•For endpoint protection, a general problem for manufacturers is the widespread use of legacy systems with easily exploitable vulnerabilities (eg industrial PCs running with Windows XP or Windows 7 operating systems) or which lack any basic security features like user authentication and authorisation (eg most industrial controllers). Therefore, Bosch developed an antivirus scan stick together with a security vendor which can be used to check machines during operation, so without the need for any interruption. The scan stick works with Windows XP and other common legacy systems and can be updated periodically with the newest signatures. It is obviously not an ideal endpoint protection solution but can remediate malware to a large extent. The tool is made commercially available also to external companies by the security vendor;
•Any universal serial bus (USB) devices or mass storages are checked for malware with a scan station before connection to machines. Mechanical protection caps are used to make it at least more difficult to insert USB connectors unintentionally;
•Another challenge is the heterogeneous machine landscape. Obviously, using individual remote access solutions for each individual machine supplier is not feasible. Therefore Bosch developed a proprietary remote shopfloor access solution, complete with three-factor authentication, ticketing and monitoring system, which works across all plants and most connected machine types worldwide. Work safety is taken into account by having a local operator confirm or shut down the remote access connection with a physical key switch at the machine or assembly line.

All of these defence methods are complemented by other controls as well as regularly tested and revised in line with the continuous improvement mindset described above.
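
The conditions described for remote shopfloor access (three-factor authentication, a ticket, and local confirmation by key switch) can be sketched as a simple gate. The field names and the rule that all conditions must hold at the same time are assumptions made for illustration; the internals of the Bosch solution are not described in this paper.

```python
from dataclasses import dataclass

REQUIRED_FACTORS = 3  # three-factor authentication, as described in the text


@dataclass(frozen=True)
class RemoteAccessRequest:
    factors_verified: int     # number of independent authentication factors passed
    ticket_open: bool         # hypothetical: an approved service ticket exists
    key_switch_enabled: bool  # local operator has turned the physical key switch


def session_allowed(request):
    """Allow a remote session only if every condition holds at the same time."""
    return (
        request.factors_verified >= REQUIRED_FACTORS
        and request.ticket_open
        and request.key_switch_enabled
    )


# The operator can end a session at any time by turning the key switch back.
active = RemoteAccessRequest(factors_verified=3, ticket_open=True, key_switch_enabled=True)
stopped = RemoteAccessRequest(factors_verified=3, ticket_open=True, key_switch_enabled=False)
print(session_allowed(active), session_allowed(stopped))
```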

LESSONS LEARNED ON ASSET INVENTORY AND ICS ANOMALY DETECTION SYSTEMS

An up-to-date IT/OT asset inventory is a basic requirement for any kind of systematic asset management (eg maintenance), not just security. Main objectives of the asset inventory process include completeness of the inventory and manual workload reduction. Information on assets should include items that can be polled automatically from endpoints (eg firmware and OS versions) and items that have to be added from other data sources (such as the asset ‘owner’, ie a responsible employee or team with contact data).
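
A minimal sketch of such an inventory record is shown below. It separates fields that can be polled from the endpoint from fields that come from other data sources, and reports which fields are still missing. The field names are assumptions chosen for illustration and do not reflect a particular CMDB schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AssetRecord:
    asset_id: str
    # Items that can be polled automatically from the endpoint
    firmware_version: Optional[str] = None
    os_version: Optional[str] = None
    # Items that have to be added from other data sources
    owner: Optional[str] = None
    owner_contact: Optional[str] = None

    def missing_fields(self):
        """Names of fields that have not been filled in yet."""
        return [name for name, value in vars(self).items() if value is None]


def completeness(records):
    """Share of records with no missing fields (0.0 for an empty inventory)."""
    if not records:
        return 0.0
    return sum(1 for r in records if not r.missing_fields()) / len(records)


# Hypothetical example records
polled_only = AssetRecord("ipc-001", firmware_version="1.4.2", os_version="Windows 7")
complete = AssetRecord("plc-017", "2.0.1", "vendor RTOS", "line 3 maintenance", "line3@example.com")

print(polled_only.missing_fields())
print(completeness([polled_only, complete]))
```

Tracking missing fields per record supports both stated objectives: it measures completeness and shows where manual work is still needed.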

ICS (OT) anomaly detection systems, on the other hand, generally are not considered as being of similar fundamental importance for security. At the same time, their use is marketed by system vendors and consultants, and new directives in the US or Germany even mandate their use for some critical infrastructure providers. There is some overlap between the two aforementioned product categories, because ICS detection systems can to some extent be used for asset inventory creation and maintenance. Implementation of a detection system without any predefined asset inventory is not recommended. The benefits of a thorough asset inventory reach far beyond security, however, and without a proper inventory it is difficult to prioritise alarms from detection systems. A baseline inventory is also needed to ensure coverage of the detection sensors is as complete as intended.

During tests of several market-leading asset inventory systems at Bosch plants, it became apparent that due to the heterogeneous machine base, 10–15 per cent of actually connected assets were not found or identified correctly by any single asset inventory solution. Diverse and, in some cases, proprietary protocols were partially incompatible with the tested software, in spite of some vendors’ claims. As one of the authors likes to say, the machines talk in different dialects, and some of these dialects are difficult to understand even for native speakers.
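
To make the coverage gap concrete, the following sketch compares what two discovery tools find against a known set of connected assets and shows how merging results reduces, but does not necessarily close, the gap. The asset identifiers and tool results are invented for illustration; in practice the full set of connected assets is not known in advance and has to be established by other means.

```python
def coverage(found, actual):
    """Share of actually connected assets that a tool identified."""
    return len(found & actual) / len(actual)


# Hypothetical ground truth: 20 connected assets.
actual = {f"asset-{i:02d}" for i in range(20)}

# Hypothetical results: each tool misses a few assets, eg due to proprietary protocols.
tool_a = actual - {"asset-03", "asset-11", "asset-17"}
tool_b = actual - {"asset-05", "asset-11"}

combined = tool_a | tool_b
still_missing = actual - combined  # candidates for manual documentation

print(f"tool A: {coverage(tool_a, actual):.0%}")
print(f"tool B: {coverage(tool_b, actual):.0%}")
print(f"combined: {coverage(combined, actual):.0%}")
print(sorted(still_missing))
```

In this made-up example the combined result still misses one asset that neither tool could identify, which is the kind of remainder that has to be entered manually.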

Therefore, Bosch decided on a combination of systems to fill a configuration management database (CMDB), to some extent still manually. All OT assets connected to the OT VLAN (explained previously) must be documented in the same CMDB as used for enterprise IT network components and servers from the central IT organisation. As a result, more than 250,000 assets from manufacturing sites are currently documented in the central CMDB and related to the existing VLAN segments, network infrastructure and server components. To detect attacks on inventoried assets in manufacturing, various systems and suppliers were tested in production sites. The first pilots were installed very close to the manufacturing lines. It turned out, however, that the ICS detection systems triggered a high number of alarms. Even after months of baselining, there were more than 1,000 false positive events per day for large factories. The analysis of the results yielded two main drivers:
•Older machines with undocumented data flows, often running on operating systems and applications out of support;
•Frequent process changes, eg caused by new product variants or process variable changes.

The experience was similar to one described in a recent article, stating ‘static rulesets generated excessive amounts of false positive alerts that led to analyst overload, signal blindness and alert fatigue’.²
Gradual replacement of legacy systems will certainly improve the amount of data flow documentation. The latter point in particular, however, seems to be different to power plant or refinery environments, where the process parameters presumably are more stable and communication protocols are more standardised. In automotive component manufacturing, daily process optimisation (within a controlled cycle including testing and validation) is often necessary to sustain competitiveness. Managing the events (including minuscule differences in signal timing or process values) created by an anomaly detection system therefore takes a vast amount of expert analyst capacity, which most manufacturing plants simply cannot afford. As stated during a recent OT security expert discussion,³ a typical alarm from an ICS anomaly detection system could indicate that a programmable logic controller (PLC) programme was changed. But this is business as usual and happens hundreds of times per month in any significant operation. In order to understand whether this was an issue or not, the asset owner would need to compare this alarm against the PLC update schedule and the schedule of the maintenance operator that was doing the upload. These schedules could actually be very flexible (eg due to regular overtime), so understanding the baseline in reality can be very difficult. Of course, there might be more stable environments that lend themselves to automating alarms. And ideally there would be a rigorous change management process powered by an automated workflow system or similar — but this is also far from reality or economic feasibility for most manufacturers. Additionally, the ICS detection sensors proved to be expensive, as one sensor per subnet is required, and plants use up to a hundred subnets. So the perceived cost/benefit ratio was much worse than for other security controls.
For specific use cases of electronic control units, for example, OT-specific intrusion detection is used; however, a general rollout to all manufacturing sites is not a prioritised action so far, subject to review.
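
The comparison of a PLC programme change alarm against maintenance schedules, as described above, can be sketched as follows. The data model, the asset identifiers and the 30-minute tolerance are assumptions for illustration. The sketch also shows why flexible schedules are a problem: any change made outside a recorded window is flagged, whether or not it was legitimate.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeWindow:
    asset_id: str
    start: datetime
    end: datetime


def is_expected(asset_id, alarm_time, windows, tolerance=timedelta(minutes=30)):
    """True if the alarm falls inside a planned change window for that asset."""
    return any(
        w.asset_id == asset_id
        and w.start - tolerance <= alarm_time <= w.end + tolerance
        for w in windows
    )


# Hypothetical schedule: one planned PLC update on one asset.
windows = [
    ChangeWindow("plc-17", datetime(2022, 5, 3, 14, 0), datetime(2022, 5, 3, 15, 0)),
]

# Slightly late upload (overtime) is still covered by the tolerance.
print(is_expected("plc-17", datetime(2022, 5, 3, 15, 20), windows))
# Night-time change on the same PLC needs an analyst to follow up.
print(is_expected("plc-17", datetime(2022, 5, 3, 22, 0), windows))
```

Automating this check presupposes that change windows are recorded reliably, which is the change management process the text describes as out of reach for many manufacturers.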

CHALLENGES IN THE CYBER SECURITY MARKET: WHY BOSCH ESTABLISHED CYBERCOMPARE

As demonstrated by the few examples discussed in this article, Bosch has faced, and continues to face, the same challenges as most other owners of IT and OT assets:

•IT departments are capacity-constrained and technical security specialists are rare;
•IT security budgets are limited;
•Security vendors actually spend a large share of their budgets on marketing and sales. A recent study shows that on average, more than 40 per cent of revenues of IT security vendors are used for sales and marketing.⁴ In fact, annual reports and initial public offering (IPO) prospectuses of publicly listed security companies show marketing and sales costs of more than 60 per cent of revenues for some very well-known vendors, and only a small fraction of this number being invested in product development;
•There are hundreds of security vendors in each category, and every vendor claims that their solution is technologically leading and crucial for an adequate security posture;
•The asset owners need independent advice on how to get the most security for their budget (or put another way, how to get the maximum amount of risk reduction for their budget). That could very well mean taking a low-cost organisational measure instead of the newest artificial intelligence (AI) solution;
•If a technical solution and/or a consulting service is needed, how should it be specified? Most companies start from scratch when writing a requirement specification for an endpoint detection and response (EDR) system, awareness trainings, phishing campaigns or a penetration test. In fact, IT departments of many mid-sized businesses do not even specify requirements, which makes a comparison of offers difficult;
•Asset owners are reluctant to spread sensitive information on security gaps widely in the market. Negotiating non-disclosure agreements (NDAs) with multiple vendors is additional workload. Taken together, this often effectively leads to a lock-in with existing partners;
•Evaluation of quotes — making them really comparable — is very time-consuming, especially if employees assigned to this task are doing it for the first time for a specific topic. Most companies purchase an endpoint protection platform, a public key infrastructure (PKI) system or similar security solutions only once, or re-request for quote (RfQ) it only every three to five years. The same is true for special consulting services like network architecture design or threat and risk analyses;
•Asset owners are also looking for hints from others, eg how to write an effective emergency guideline, or how to conduct and evaluate proof-of-concept tests of a network detection and response (NDR) system. Learning from others who have gone through the set-up and teething troubles of installing software tools typically is a huge benefit. CyberCompare can take these lessons to ensure companies ask the right questions in RfQs and appreciate what is really required to achieve the benefits in the glossy marketing brochures. For example, lots of detection software vendor material talks about the amazing AI used to correlate events and produce whizzy graphs, but does not mention all the effort of getting the data-feeds working and normalised, approvals needed (eg from workers’ councils), setting up rules for data retention and disposal, fine-tuning the alerts and sustaining it all through different upgrade paths.

In an effort to make the cyber security market more efficient, and security more affordable, Bosch therefore founded a subsidiary to address some of these challenges: Bosch CyberCompare. In essence, this subsidiary works as a curated semi-digital marketplace and vendor-neutral adviser to chief information officers (CIOs):
•In contrast to managed security service providers (MSSPs) or distributors, CyberCompare does not have any exclusive partner agreements or reselling contracts with vendors. At the same time, providers can be recommended based on experience with Bosch and feedback from external CyberCompare clients;
•Specification templates are provided along with input from independent subject matter experts, eg specialists for EDR/managed detection and response (MDR) implementation that have actually tested systems of various vendors in parallel;
•Clients can request quotes under the Bosch brand, so that tenders are anonymised and sensitive information of clients is safeguarded;
•Quotations are evaluated transparently, and recommendations are made to the client;
•For vendors and consultants, CyberCompare is often the most efficient sales channel, as client leads are pre-qualified and all required information is provided transparently upfront. Also, highly qualified specialists can differentiate themselves more easily from low-cost providers with limited qualifications.
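
One common way to make quotes comparable is a weighted scoring of each offer against the criteria from the requirement specification. The sketch below illustrates that general technique; the criteria, weights and scores are invented, and the paper does not state which evaluation method CyberCompare actually uses.

```python
def weighted_score(scores, weights):
    """Weighted average of criterion scores; every weighted criterion must be scored."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[criterion] * weight for criterion, weight in weights.items()) / total_weight


def rank_quotes(quotes, weights):
    """Vendor names ordered from highest to lowest weighted score."""
    return sorted(quotes, key=lambda vendor: weighted_score(quotes[vendor], weights), reverse=True)


# Hypothetical criteria and weights, with scores on a 0-10 scale (higher is better).
weights = {"requirements_fit": 5, "price": 3, "references": 2}
quotes = {
    "vendor_a": {"requirements_fit": 8, "price": 5, "references": 7},
    "vendor_b": {"requirements_fit": 6, "price": 8, "references": 6},
    "vendor_c": {"requirements_fit": 9, "price": 3, "references": 8},
}

for vendor in rank_quotes(quotes, weights):
    print(vendor, weighted_score(quotes[vendor], weights))
```

Fixing the criteria and weights before quotes arrive is what makes such an evaluation transparent; it also shows why a missing requirement specification makes offers hard to compare.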

Examples of successful security projects include standard services like penetration tests or security trainings as well as large-scale efforts like managed security information and event management (SIEM) and security operations centres (SOCs). While CyberCompare was initially aimed at the Bosch supply chain, the service is now being used by a much wider range of external clients. The company started in Germany and has expanded to other parts of Europe such as the UK.

SUMMARY

Businesses with global manufacturing footprints have to cope with more security challenges than either regional companies or enterprises without significant operational technology. For those companies not operating critical infrastructure, resources for IT/OT security are often particularly limited and need to be invested efficiently. Bosch is reviewing its security concepts such as zones, conduits and other controls, in line with a structured process of continuous improvement. At the same time, lessons learned are offered via a subsidiary (Bosch CyberCompare) in order to make security more affordable and the cyber security market more efficient.

References
  1. Mosteiro-Sanchez, A., Barcelo, M., Astorga, J. and Urbieta, A. (2020), ‘Securing IIoT using defence-in-depth: Towards an end-to-end secure industry 4.0’, Journal of Manufacturing Systems, Vol. 57, pp. 367–378.
  2. Kipling, L. (2020), ‘The industrial Internet of Things: From preventive to reactive systems: Redefining your cyber security game plan for the changing world’, Cyber Security: A Peer-Reviewed Journal, Vol. 4, No. 2.
  3. Peterson, D., Interview with Pascal Ackerman, author of Industrial Cybersecurity, Volumes 1 and 2 (2022), ‘Unsolicited Response Podcast’, available at https://unsolicitedresponse.libsyn.com/interview-with-pascal-ackerman-author-of-industrial-cybersecurity-volumes-1-and-2 (accessed 3rd May, 2022).
  4. Debate Security (October 2020), ‘Cybersecurity Technology Efficacy: Is cybersecurity the new “market for lemons”?’, available at https://www.debatesecurity.com/downloads/Cybersecurity-Technology-Efficacy-Research-Report-V1.0.pdf (accessed 3rd May, 2022).

Citation

Voeth, Michael, Patterson, Clare and Stemmann, Jannis (2022, September 1). A security concept for a global factory network: Practical considerations in implementation. In Cyber Security: A Peer-Reviewed Journal, Volume 6, Issue 1. https://doi.org/10.69554/GYZJ2649.

Source/Original Publisher: Henry Stewart Publications.