Of particular interest to me right now is the appropriate risk amount to report for any given issue. Being IT folks (warning: broad stroke in progress) we prefer “precise” numbers that cannot be refuted by anyone and are supported by the overwhelming amount of electronic data we have at our disposal. However, in reality, and in the information security risk management space, we lack such data. As a result, some information security industry superstars discourage the idea of taking a stand on quantifying information security risk; and from my perspective, this devalues the subject matter expertise (some industry folks water this down to the word “opinion”) that security professionals offer to their organizations. But I am getting off topic, so let’s get back to it: the appropriate risk value to report.
Quite a few risk quantification tools and methodologies produce a risk value often referred to as the “expected loss amount”. Typically, this is the product of a loss event frequency value (LEF, for the FAIR-minded folks) and the average monetary loss magnitude. For most information security risk practitioners and the organizations that employ them, the expected loss amount may be the most appropriate risk value to articulate to decision makers for any given risk issue. However, an additional minute or two spent analyzing your loss distribution could lead you to articulate a risk amount different from the expected loss amount.
Let’s take a look at some phrases and a few examples.
Loss event frequency: The probable frequency with which we expect a loss event to occur.
Average loss magnitude: This is the average (or mean) loss value from a simulation or from actual loss events. For example, if I perform 1001 simulations where a value between $1 and $10 is drawn, I would add up the sum of all the simulated values and divide it by 1001.
Expected loss magnitude: This is the product of the loss event frequency (most often the mean LEF) and the average loss magnitude. For example, if my loss event frequency is 0.1 per year (once every ten years) and my average loss magnitude is $10,000, my expected loss magnitude would be $1,000.
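The arithmetic is simple enough to sketch in a couple of lines. The post's own work was done in Excel; this is just a Python sketch of the same calculation, using the numbers from the example above:

```python
# Expected loss magnitude = loss event frequency x average loss magnitude.
lef = 0.1          # loss events per year (once every ten years)
avg_loss = 10_000  # average (mean) loss magnitude in dollars

expected_loss = lef * avg_loss
print(f"${expected_loss:,.0f}")  # → $1,000
```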
Remember what the median is? The median is the number directly in the middle of an ordered range of numbers. For example, if we perform 1001 simulations where a value between $1,000 and $20,000 could be drawn, and the number in the middle (value number 501, when ordered from lowest to highest) is $10,000, then that is our median.
At this point we have what could be the first comparison in determining which risk value to report. Generally speaking, if the mean and the median are close to each other, then the data set (the loss magnitude values) may not be too skewed. If the mean is a lot higher than the median, this could be the result of large loss magnitude values having a significant impact on the mean, somewhat “inflating” the average loss magnitude. The same concept applies if the mean is a lot lower than the median.
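To see the mean-versus-median comparison in action, here is a small Python sketch. The lognormal distribution and its parameters are hypothetical, chosen only to produce a right-skewed set of losses:

```python
import random
import statistics

random.seed(1)

# 1001 hypothetical loss draws from a right-skewed (lognormal) distribution;
# the parameters are arbitrary, picked only to illustrate skew.
losses = [random.lognormvariate(9.2, 1.0) for _ in range(1001)]

mean_loss = statistics.mean(losses)
median_loss = statistics.median(losses)

# A mean well above the median signals that a few large tail losses are
# "inflating" the average loss magnitude.
print(f"mean:   ${mean_loss:,.0f}")
print(f"median: ${median_loss:,.0f}")
```

With a skewed distribution like this one, the mean lands well above the median; with a symmetric distribution the two would sit close together.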
In some cases, using the mean loss magnitude to calculate the expected loss magnitude is appropriate. In other cases, the median may be more appropriate because the values influencing the mean are so far out in the distribution – or tail – that it would be inappropriate to use the average loss magnitude.
Now let’s look at another example. We have a risk scenario where the average loss value (per event) is $73,400, and we expect, on average, 4 loss events per year. The annual expected loss ($73,400 x 4) is $293,600. However, we are dealing with probabilities and distributions, and in reality there could be one year where we have only one loss event related to this specific issue and other years where we might have 10 loss events. How do we deal with this?
I performed a small experiment to help me better understand this.
From a previous risk issue, I derived the means and standard deviations of the simulated loss event frequency (LEF) values and loss magnitude (LM) values. In Excel, I wrote a small VBA macro that lets me define some simulation parameters and reference both the LEF and LM mean and standard deviation values. For each simulation iteration, the macro generates an LEF value from a distribution built on the LEF mean and standard deviation. Then, for each LEF value (rounded to the nearest integer), the macro generates a loss magnitude value for each loss event and sums those loss magnitude values. For example, if my LEF is two, the macro randomly generates two loss values, using a distribution built on the LM mean and standard deviation, then sums those two values. The simulation continues until the desired number of iterations is complete. For my small experiment, I performed a simulation consisting of 3001 iterations. You can see the LEF and LM means and standard deviations in the image below.
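The macro described above was written in Excel VBA; as a rough illustration, the same nested simulation can be sketched in Python. The LEF mean of 4.17 and LM mean of $73,400 come from the post; the standard deviations and the choice of normal distributions are assumptions for illustration, since the actual parameters live in the image:

```python
import random
import statistics

random.seed(42)

# Parameters: the LEF mean (4.17) and LM mean ($73,400) come from the post;
# the standard deviations and the normal distributions are assumptions.
LEF_MEAN, LEF_SD = 4.17, 1.5
LM_MEAN, LM_SD = 73_400, 20_000
ITERATIONS = 3001

annual_losses = []
for _ in range(ITERATIONS):
    # Draw a loss event frequency and round it to a whole number of events.
    events = max(0, round(random.gauss(LEF_MEAN, LEF_SD)))
    # Draw a loss magnitude per event and sum them for the year.
    annual_losses.append(
        sum(max(0.0, random.gauss(LM_MEAN, LM_SD)) for _ in range(events))
    )

print(f"mean:   ${statistics.mean(annual_losses):,.0f}")
print(f"median: ${statistics.median(annual_losses):,.0f}")
```

Each iteration represents one simulated year: one frequency draw, then one magnitude draw per loss event, summed.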
Now that we have simulated loss values, we want to visually represent them. I want to represent the values two ways.
This is a small scatter plot with a smoothed line. In Excel, we create loss magnitude bins and count the number of times each iteration’s summed loss magnitude fell into each bin. As you can see, the loss magnitude values look normally distributed.
In this chart, I want to show the percentage of loss magnitude values in relation to the loss amounts themselves. So in this chart, my simulated loss is greater than $14,924 99.999% of the time. However, there is roughly a 10% chance that the loss could be greater than $404,924.
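Percentages like those can be read straight off the simulated values. Here is a Python sketch using a stand-in normal distribution (the roughly $308,000 mean matches the post's result; the $75,000 standard deviation is an assumption) rather than the post's actual simulated losses:

```python
import random

random.seed(7)

# Stand-in simulated annual losses: normal with mean ~$308,000 and an
# assumed $75,000 standard deviation (the post's real values came from
# its own simulation).
losses = [random.gauss(308_000, 75_000) for _ in range(3001)]

def exceedance(values, threshold):
    """Fraction of simulated years whose loss exceeds the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)

# Chance that an annual loss exceeds $404,924 (roughly 10% here):
print(f"{exceedance(losses, 404_924):.1%}")
```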
So what does all of this mean? It means that even though our expected loss value was $293,600*, the simulation resulted in the values below:
The lowest simulated loss magnitude was: $14,924.
The largest simulated loss magnitude was: $620,000.
The mean (average) loss magnitude was: $308,636.
The median of the loss magnitude value was: $309,000.
There is a 20% chance (80th percentile, or 1-in-5) that the loss amount could meet or exceed: $380,000.
There is a 5% chance (95th percentile, or 1-in-20) that the loss amount could meet or exceed: $441,900.
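Percentile values like the 1-in-5 and 1-in-20 amounts above can be pulled straight from the sorted simulation output. A Python sketch with stand-in numbers (the $308,000 mean is close to the post's result; the $75,000 standard deviation is an assumption):

```python
import random

random.seed(3)

# Stand-in simulated annual losses (mean ~$308,000 per the post; the
# $75,000 standard deviation is an assumption).
losses = sorted(random.gauss(308_000, 75_000) for _ in range(3001))

def percentile(sorted_values, pct):
    """Value at the given percentile, by the nearest-rank method."""
    idx = min(len(sorted_values) - 1, int(pct / 100 * len(sorted_values)))
    return sorted_values[idx]

print(f"80th percentile (1-in-5):  ${percentile(losses, 80):,.0f}")
print(f"95th percentile (1-in-20): ${percentile(losses, 95):,.0f}")
```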
Note: The values above would change from simulation to simulation – but not significantly assuming the input parameters (LEF and LM mean and standard deviation values) remain constant.
Note: The term “tail risk” is usually associated with values at the 97.5th percentile or greater, that is, amounts exceeded less than 2.5% of the time. While the numbers at the 1-in-20 and various tail risk points are tempting to use, please keep in mind that these are low probability / high magnitude loss amounts. Grandstanding on these values just for the shock factor is the equivalent of crying wolf and undermines the value we can provide to our decision makers.
Now, our decision maker is faced with a harder decision: do I assume or mitigate the risk associated with an expected loss amount of $308,636, or does the 1-in-5 loss magnitude value of $380,000 stand out to me? While it may seem like we are dealing with a small difference between the mean and the 1-in-5 values, risk tolerance, risk thresholds, and risk management strategies vary between decision makers and organizations.
Here is the takeaway: as you start down the risk quantification road, keep the following in mind:
1. There is NO absolute, 100% guaranteed, predictable loss value, especially from a simulated loss distribution; but you have to report something. Thus, choose a tool that lets you see points from across the distribution, not just a single value.
2. Be mindful of how you articulate risk values. A consistent theme I hear and read about on a regular basis is that risk implies uncertainty – always. You need to underscore this when articulating risk to leadership.
3. Have the discussion with your management / decision makers as to what loss value they would prefer to see. Their feedback may highly influence the value you report.
4. Use the right value for the right purpose. For single risk issues, expected loss amounts may be appropriate. For a loss distribution (model) that represents dozens or even hundreds of risks, the 1-in-5, 1-in-10, 1-in-20, and maybe some tail risk values may be the best values to react to or budget for.
Have a great Memorial Day weekend!
* In the interest of transparency, the observant reader will notice that my mean LEF is actually 4.17. For simulation purposes, I rounded generated loss event frequency values to the nearest integer. In a given year, you can’t have 4.17 loss events; you would have either 4 or 5. However, if you take the product of 4.17 and $73,400, which is $306,078, you will notice that it is within a few thousand dollars of the simulation’s mean and median values.