A shock to the system
When government officials in Wuhan confirmed in late 2019 that they were treating dozens of patients infected by an unfamiliar virus, few predicted that the entire world would be facing the health and economic challenges that we now confront. There is no more striking example today of systemic risk made manifest than the COVID-19 pandemic. A seemingly isolated incident, in a seemingly remote location, created a ripple effect that devastated healthcare systems and crippled even the most resilient of economies. Some have described it as a “black swan” event, but the fact is that the risk – the systemic risk – of a pandemic has been structurally building for decades.
COVID-19 can help us unpack what kinds of structures are prone to systemic risks. The cybersphere is certainly one of them. Here, we explore these parallels and discuss approaches to identifying and mitigating systemic cyber risk within and across organisations.
“Internal and third-party cyber-dependences can be highly opaque.”
What are we actually talking about?
Traditionally found in the lexicon of financial sector analyses, “systemic risk” has been variously defined as the risk of ‘a breakdown of the entire system’,1 ‘the risk connected to the complete failure of a business, a sector, an industry’,2 and ‘the probability that cumulative losses will occur from an event that ignites a series of successive losses along a chain’,3 to cite but a few. According to the World Economic Forum, systemic cyber risk is ‘the risk that a cyber event… at an individual component of a critical infrastructure ecosystem will cause significant delay, denial, breakdown, disruption or loss, such that services are impacted not only in the originating component but consequences also cascade into related… ecosystem components.’4
Across many policy advisory groups and think tanks, “systemic risk” in cyber security has largely been used to describe the potential for a cyber incident, or series of incidents, to create global shockwaves of a devastating nature. However, it can also be a useful exercise to examine systemic risk through a much narrower lens – focusing on what it could look like within a single business, or between clusters of interdependent organisations or industries.
So, what does this look like in practice? If we consider that a “chain-reaction” is a defining property of systemic risk, then a good place to start is looking at what links that chain is made of when we think of cyber risk.
Intra-organisational risk. In this scenario, an isolated cyber incident impacting a system or network within one part of an organisation leads to a total failure of an organisation’s critical infrastructure. Consider the potential impact of just a single employee clicking on a malicious link embedded in a phishing email. This may allow the threat actor to deploy a remote access trojan, granting them a level of control over the infected computer. From this initial access point, the hacker is able to move laterally through the organisation’s network and ultimately execute a ransomware attack that encrypts all company data and brings business to a standstill.
Inter-organisational risk. Here, an organisation is critically impacted by a cyber incident occurring at a third-party service provider or along its supply chain. Consider the implications of the likes of Amazon Web Services (AWS) experiencing a general outage: the more than one million companies5 that rely on AWS for their cloud-computing infrastructure could be impacted. A prolonged outage could see AWS-dependent customers go out of business.
The defining features of systemic risk
Sociologist Charles Perrow attributes organisational failure to two structural factors: complexity and tight coupling.6 Here, complexity refers to systems that are interconnected in ways not immediately obvious or visible. Opaque systems make it harder to diagnose issues and to predict the impact of an incident arising in one part of the system on the rest of it. The linkages between the component parts are obscure. Tight coupling refers to the close integration of different components of a system. The tighter the integration across a system, the harder it is to prevent any single incident from cascading through it.
COVID-19 presents both characteristics as it travels through a highly interconnected and tightly integrated globalised system. Similarly, consider the subprime mortgage meltdown that spurred the 2008 financial crisis. Mortgage-backed securities are financial products created by bundling together several home loans. The bundling and re-bundling of subprime loans in the run-up to the financial crisis made it increasingly difficult to see what these products were made of and challenging to identify the risk of their imminent collapse. Furthermore, the highly interconnected nature of the financial industry meant that when Lehman Brothers fell as a result, the other banks were lined up like dominoes.7
The cybersphere is similarly prone to these two structural features. An organisation’s internal and third-party cyber-dependences can be highly opaque, and the threat landscape noisy and confusing. And we live in a world that has never been more tightly coupled in terms of our connectivity (both logically and geographically). This coupling has only increased since the onset of COVID-19 and continues to accelerate at a breakneck pace. It is therefore imperative that the systemic risks baked into an organisation’s internal structures and supply chain be clearly identified, the connections mapped out, the interdependencies accounted for. But how?
Back to basics: a risk-based approach to cyber security
Enough talk of global meltdowns and organisational failure. Systemic risks abound in the cybersphere, but they can be managed.
The first step is to identify them. Doing so within the confines of your own organisation is one thing, but assessing systemic risk across a long supply chain is a more daunting task. Keeping in mind that systemic risk is the risk of a total meltdown, a good place to start is identifying your own organisation’s key points of critical failure. We can think of critical failure across two categories:
“A good place to start is identifying your own organisation’s key points of critical failure.”
Critical activities. Forget cyber security for a minute. What does your business need to do on an ongoing basis to continue to survive and thrive? Now, work your way backwards to map out all the various dependencies that emanate from these basic, critical activities. Who does your business rely on (internally or externally) to do them? If they were subject to a cyber-attack, could you trace the chain of events all the way back to your business-critical functions?
Critical exposure. Which people, vendors, partners or clients (even if not business-critical functionally) have sufficient access to your networks and/or data to present a systemic vulnerability? For example, if they were hacked, could the threat actors gain access to your networks and/or data?
Completing this exercise will allow you to trace, identify and then rank your internal and external dependencies. In other words, it will remove unnecessary complexity from the equation and clarify your areas of vulnerability. Already, you’re addressing the first of the two structural factors that contribute to systemic risk.
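As a rough illustration of this mapping exercise, the sketch below models dependencies as a directed graph, traces which critical activities would be exposed if a given party were compromised, and ranks parties accordingly. The activity and vendor names and the `DEPENDS_ON` structure are hypothetical placeholders, and counting exposed activities is only one possible way to rank dependencies.

```python
from collections import deque

# Hypothetical dependency map: each key relies directly on the listed parties.
DEPENDS_ON = {
    "process_payments": ["payment_gateway", "customer_db"],
    "ship_orders": ["logistics_partner", "customer_db"],
    "payment_gateway": ["cloud_provider"],
    "customer_db": ["cloud_provider"],
    "logistics_partner": [],
    "cloud_provider": [],
}

# Hypothetical set of business-critical activities.
CRITICAL_ACTIVITIES = {"process_payments", "ship_orders"}


def all_dependencies(node, graph):
    """Return every direct and indirect dependency of `node` (breadth-first)."""
    seen = set()
    queue = deque(graph.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        queue.extend(graph.get(current, []))
    return seen


def activities_exposed_to(party, graph, critical):
    """Return the critical activities that rely, directly or indirectly, on `party`."""
    return {a for a in critical if party in all_dependencies(a, graph)}


def rank_dependencies(graph, critical):
    """Rank parties by the number of critical activities that rely on them."""
    parties = set(graph) - critical
    counts = {p: len(activities_exposed_to(p, graph, critical)) for p in parties}
    # Highest exposure first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_dependencies(DEPENDS_ON, CRITICAL_ACTIVITIES)
```

In this toy map, `cloud_provider` and `customer_db` come out on top because both critical activities ultimately rely on them, even though neither activity lists the cloud provider as a direct dependency.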
Now, build some resilience into these structures to reduce how tightly coupled these critical dependencies are. Foundational cyber risk mitigation techniques can be applied here. Generally speaking, cyber risk treatment covers three broad categories:
- Reduction: This includes implementing cyber security controls to alter, reduce or eliminate a risk. For example, implementing endpoint protection or classifying your data. While these controls may result in a reduction of operational efficiency in the short-run, they will lead to greater resilience in the longer term.
- Transference: Also known as risk assignment, this takes the form of having another entity assume the risk and is usually achieved through insurance.
- Acceptance: Here, the negative impact of the risk is accepted by the organisation. This is a common response to minor risks or ones where the cost of implementing the mitigation outweighs the potential impact.
“Controls may result in a reduction of operational efficiency in the short-run, but will lead to greater resilience in the longer term.”
Returning to the AWS example above, our team recently worked closely with a software company that had moved the bulk of its operations to various cloud platforms. Its product was hosted on AWS and its office network existed solely for the purpose of accessing the internet. Our cyber risk assessment indicated that a sustained AWS outage presented a systemic risk to the organisation through the following:
- First, the availability of their product would be disrupted.
- Second, we discovered that many of their critical vendors also relied on AWS.
In developing a mitigation roadmap for the organisation, we segmented our risk management approach into the three categories of reduction, transference, and acceptance described above. We therefore made the following recommendations:
- We suggested that they reduce this risk by implementing redundancy between the company’s AWS zones and testing their business continuity and disaster recovery plans. This would mitigate the impact of a partial AWS failure; in other words, it would create some space between a partial AWS failure and total business failure.
- We recommended transferring some of the risk through the purchase of an insurance policy to lessen the financial impact of an incident in terms of lost revenue. Again, insurance here serves to partially decouple a cyber incident from an organisation’s critical need to fund its activities.
- Finally, the organisation accepted a certain level of systemic risk. To implement additional controls to further reduce the risk of a total AWS failure would require maintaining so much redundant infrastructure that their product would not be commercially viable.
Indeed, not all systemic risks can be eliminated. But by reducing complexity, introducing clarity, mapping out dependencies and building slack into the system, they can be reduced. And with the right controls and preparation in place, should they manifest, they can be managed.
Ranking your third parties for cyber risk
How much access does the third party have to your network?
Answer two key questions:
1. How much access will the third party inherently have?
Many Managed Service Providers (MSPs) inherently require network access to carry out their function.
2. How much access does a third party need to fulfil their function?
Third parties may sometimes request excessive access simply for the purpose of expediency.
What data types will be accessible to the third party?
Different data types have different operational, reputational and legal implications associated with their theft or exposure. Make sure to consider:
1. The value of the accessible data to your business.
Examples of highly valuable data include Intellectual Property (IP) as well as data essential to business continuity.
2. The regulatory value of that data.
Regulations such as the California Consumer Privacy Act (CCPA) and the General Data Protection Regulation (GDPR) provide for hefty fines against organisations that fail to manage and protect covered data.
3. The reputational value of that data.
Exposing your customers’ personally identifiable information (PII) and/or sensitive financial or healthcare information can result in a devastating loss of trust, tarnishing your brand for an extended period.
How business-critical is the third party?
It is important to ask how the loss of a given third party would impact your business, and ensure that business-critical third parties present robust cyber security postures, because:
1. A cyber-attack against a business-critical third party can lead to significant operational disruption for your organisation.
2. Replacing business-critical third parties (should they fail your cyber due diligence) is likely to be more onerous than replacing the more “nice-to-have” service providers.
In combination, these three elements (access, data-types and business-criticality) can be weighted in line with your business’s priorities and used to group third parties into buckets comprising high, medium, and low priority for cyber due diligence.
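One way to make this weighting concrete is sketched below, assuming each third party is rated from 1 (low) to 5 (high) on each of the three elements. The weights, thresholds, ratings and vendor names are all hypothetical and would need to be set in line with your own business’s priorities.

```python
# Hypothetical weights for the three elements; chosen here to sum to 1.
WEIGHTS = {"access": 0.4, "data": 0.3, "criticality": 0.3}

# Hypothetical cut-offs on the resulting 1-5 weighted score.
HIGH_THRESHOLD = 4.0
MEDIUM_THRESHOLD = 2.5


def weighted_score(ratings, weights=WEIGHTS):
    """Combine the 1-5 ratings for each element into one weighted score."""
    return sum(weights[name] * ratings[name] for name in weights)


def priority_bucket(ratings, weights=WEIGHTS):
    """Map a third party's ratings to a due-diligence priority bucket."""
    score = weighted_score(ratings, weights)
    if score >= HIGH_THRESHOLD:
        return "high"
    if score >= MEDIUM_THRESHOLD:
        return "medium"
    return "low"


# Hypothetical third parties and ratings, for illustration only.
third_parties = {
    "managed_service_provider": {"access": 5, "data": 4, "criticality": 5},
    "payroll_processor": {"access": 2, "data": 4, "criticality": 3},
    "office_catering": {"access": 1, "data": 1, "criticality": 1},
}

buckets = {name: priority_bucket(r) for name, r in third_parties.items()}
```

With these illustrative figures, the managed service provider lands in the high-priority bucket, the payroll processor in medium and the caterer in low. A simple weighted sum is only one option; an organisation might instead treat any single maximum rating as an automatic high priority.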
1. ‘Systemic Risk & Management in Finance’, CFA Institute. Available at: https://www.cfainstitute.org/en/advocacy/issues/systemic-risk
2. ‘Systemic Risk vs. Systematic Risk: What’s the Difference?’, Investopedia. Available at: https://www.investopedia.com/ask/answers/09/systemic-systematic-risk.asp
3. Kaufman, G. ‘Bank failures, systemic risk, and bank regulation’, The Cato Journal, February 1996.
4. ‘Understanding Systemic Cyber Risk’, World Economic Forum, October 2016.
5. ‘Who’s Using Amazon Web Services? [2020 Update]’, Contino, 28 January 2020. Available at: https://www.contino.io/insights/whos-using-aws
6. Charles Perrow, Normal Accidents: Living with High-Risk Technologies, 1984.
7. ‘Warren Buffett: In the 10 years since financial panic, we’ve learned we’re ‘all dominoes’ spaced closely together’, CNBC, 10 September 2018. Available at: https://www.cnbc.com/2018/09/10/warren-buffett-2008-financial-crisis-showed-we-are-all-dominoes.html