The Risk Is Right.

Of particular interest to me right now is the appropriate risk amount to report for any given issue. Being IT folks – warning, broad stroke in progress – we want “precise” numbers that are not refutable by anyone and are supported by the overwhelming amount of electronic data we have at our disposal. However, in reality – and in the information security risk management space – we lack such data. As such, there are information security industry superstars who discourage the idea of taking a stand on quantifying information security risk and who, from my perspective, devalue the subject matter expertise (some industry folks water this down to the word “opinion”) that security professionals offer to their organizations. I guess I am getting off-topic – so let’s get back on topic: the appropriate risk value to report.

Quite a few risk quantification tools and methodologies produce a risk value often referred to as the “expected loss amount”. Typically, this is the product of a loss event frequency value (LEF, for the FAIR-minded folks) and the average monetary loss magnitude. For most information security risk practitioners and the organizations that employ them, the expected loss amount may be the most appropriate risk value to articulate to decision makers for any given risk issue. However, an additional minute or two of analysis of your loss distribution could result in you wanting to articulate a risk amount different than the expected loss amount.

Let’s take a look at some phrases and a few examples.

Loss event frequency: The probable frequency with which we expect a loss event to occur.

Average loss magnitude: This is the average (or mean) loss value from a simulation or from actual loss events. For example, if I perform 1001 simulations where a value between $1 and $10 is drawn, I would add up the sum of all the simulated values and divide it by 1001.
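
A quick sketch of that calculation in Python rather than Excel; the draw bounds and iteration count are just the toy numbers from the example above:

```python
import numpy as np

rng = np.random.default_rng(42)

# 1001 simulated draws between $1 and $10
draws = rng.uniform(1, 10, size=1001)

# average loss magnitude = sum of all draws / number of draws
average_loss_magnitude = draws.sum() / len(draws)  # same as draws.mean()
print(round(average_loss_magnitude, 2))            # roughly 5.5
```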

Expected loss magnitude: This is the product of the loss event frequency (most often the mean LEF) and the average loss magnitude. For example, if my loss event frequency is 0.1 per year (once every ten years) and my average loss magnitude is $10,000, my expected loss magnitude would be $1,000.

Remember what the median is? The median is the value directly in the middle of an ordered set of numbers. For example, if we perform 1001 simulations where a value between $1,000 and $20,000 could be drawn, and the number in the middle (value number 501, when ordered from lowest to highest) is $10,000 – that is our median.

At this point we have what could be the first comparison in determining which risk value to report. Generally speaking, if the mean and the median are close to each other, then the data set – the loss magnitude values – may not be too skewed. If the mean is a lot higher than the median, this could be the result of large loss magnitude values that are having a significant impact on the mean – somewhat “inflating” the average loss magnitude. The same concept applies if the mean is a lot lower than the median.
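
Here is a tiny, purely hypothetical illustration of that effect in Python; the loss values below are made up and are not from the analysis later in this post:

```python
import numpy as np

# Nine "typical" losses plus one very large tail loss (hypothetical numbers).
losses = np.array([9_000, 9_500, 10_000, 10_000, 10_500,
                   11_000, 11_000, 11_500, 12_000, 250_000])

print(np.median(losses))  # 10,750 -> the "typical" loss
print(losses.mean())      # 34,450 -> pulled up sharply by the single tail value
```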

In some cases, using the mean loss magnitude to calculate the expected loss magnitude is appropriate. In other cases, the median may be more appropriate because the values influencing the mean are so far out in the distribution – or tail – that it would be inappropriate to use the average loss magnitude.

Now let’s look at another example. We have a risk scenario where the average loss value (per event) is $73,400, and you expect, on average, 4 loss events per year. The annual expected loss ($73,400 x 4) is $293,600. However, we are dealing with probabilities and distributions, and in reality there could be one year where we have only one loss event related to this specific issue and other years where we might have 10 loss events. How do we deal with this?

I performed a small experiment to help me better understand this.
From a previous risk issue, I derived the means and standard deviations of the simulated loss event frequency (LEF) values and loss magnitude (LM) values. In Excel, I wrote a small VBA macro that allows me to define some simulation parameters and reference both the LEF and LM mean and standard deviation values. For each simulation iteration, the macro generates an LEF value based on a distribution that leverages the LEF mean and standard deviation. For each LEF value (I round to the nearest integer), the macro then generates a loss magnitude value for each loss event and sums those loss magnitude values. For example, if my LEF is two, the macro randomly generates two loss values, using a distribution that leverages the LM mean and standard deviation, and then sums those two values. The simulation continues until the desired number of iterations is complete. For my small experiment, I performed a simulation consisting of 3001 iterations. You can see the LEF and LM means and standard deviations in the image below.
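
For anyone who wants to tinker without Excel, here is a minimal sketch of the same loop in Python rather than VBA. It assumes normal distributions for both LEF and LM; the standard deviation values below are illustrative placeholders, not the parameters from my risk issue:

```python
import numpy as np

rng = np.random.default_rng(7)

# The means come from the scenario in this post; the standard deviations
# are illustrative placeholders, not the values from my simulation.
LEF_MEAN, LEF_SD = 4.17, 1.5      # loss events per year
LM_MEAN, LM_SD = 73_400, 25_000   # dollars per loss event

ITERATIONS = 3001
annual_losses = np.empty(ITERATIONS)

for i in range(ITERATIONS):
    # Draw a loss event frequency and round it to a whole number of events.
    events = max(int(round(rng.normal(LEF_MEAN, LEF_SD))), 0)
    # Draw a loss magnitude for each event and sum them for the year.
    annual_losses[i] = rng.normal(LM_MEAN, LM_SD, size=events).sum() if events else 0.0

print(f"mean annual loss:   ${annual_losses.mean():,.0f}")
print(f"median annual loss: ${np.median(annual_losses):,.0f}")
```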

[Image: risk_right_1_090521 – simulation parameters (LEF and LM means and standard deviations)]

Now that we have simulated loss values, I want to represent them visually – in two ways.

[Image: risk_right_2_090521 – binned loss magnitude values (scatter plot with smoothed line)]

This is a small scatter plot with a smoothed line. In Excel, we create loss magnitude bins and count the number of times each iteration’s loss magnitude sum fell into each bin. As you can see, the loss magnitude values look normally distributed.
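
The binning step itself is a few lines outside of Excel as well; a sketch, where annual_losses is the simulated array from the sketch earlier in this post and the $50,000 bin width is arbitrary:

```python
import numpy as np

def bin_losses(annual_losses: np.ndarray, bin_width: float = 50_000):
    """Count how many simulated annual losses fall into each loss-magnitude bin."""
    edges = np.arange(0, annual_losses.max() + bin_width, bin_width)
    counts, _ = np.histogram(annual_losses, bins=edges)
    return edges, counts
```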

[Image: risk_right_3_090521 – percentage of simulated loss values in relation to loss amounts (loss exceedance chart)]

In this chart, I want to show the percentage of loss magnitude values in relation to the loss amounts themselves. Here, my simulated loss is greater than $14,924 99.999% of the time; however, there is roughly a 10% chance that the loss could be greater than $404,924.
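
In other words, the chart is just an exceedance calculation over the simulated loss values. A small sketch of that calculation, again assuming annual_losses is the array produced by the earlier simulation sketch (your percentages will differ from mine):

```python
import numpy as np

def exceedance_probability(annual_losses: np.ndarray, threshold: float) -> float:
    """Fraction of simulated years whose total loss exceeds `threshold`."""
    return float((annual_losses > threshold).mean())

# The chart above reads roughly:
#   exceedance_probability(annual_losses, 14_924)  -> very close to 1.0
#   exceedance_probability(annual_losses, 404_924) -> roughly 0.10
# Your simulated values will vary.
```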

So what does all of this mean? What it means is that even though our expected loss value was $293,600* – the simulation resulted in the values below:

[Image: risk_right_4_090521 – simulation summary values]

The lowest simulated loss magnitude was: $14,924.
The largest simulated loss magnitude was: $620,000.
The mean (average) loss magnitude was: $308,636.
The median loss magnitude was: $309,000.
There is a 20% chance (80th percentile, or 1-in-5) that the loss amount could be $380,000 or more.
There is a 5% chance (95th percentile, or 1-in-20) that the loss amount could be $441,900 or more.
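
If you want to reproduce summary values like the ones above from your own simulated losses, the lookups are one-liners. This is a sketch; annual_losses is again the simulated array from earlier, and your numbers will differ from mine:

```python
import numpy as np

def summarize(annual_losses: np.ndarray) -> None:
    print(f"min:     ${annual_losses.min():,.0f}")
    print(f"max:     ${annual_losses.max():,.0f}")
    print(f"mean:    ${annual_losses.mean():,.0f}")
    print(f"median:  ${np.median(annual_losses):,.0f}")
    # 80th percentile -> a 1-in-5 (20%) chance of exceeding this amount
    print(f"1-in-5:  ${np.percentile(annual_losses, 80):,.0f}")
    # 95th percentile -> a 1-in-20 (5%) chance of exceeding this amount
    print(f"1-in-20: ${np.percentile(annual_losses, 95):,.0f}")
```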

Note: The values above would change from simulation to simulation – but not significantly assuming the input parameters (LEF and LM mean and standard deviation values) remain constant.

Note: The term “tail risk” is usually associated with values at the 97.5th percentile or greater – values exceeded less than 2.5% of the time. While the numbers at the 1-in-20 and various tail risk points are tempting to use, please keep in mind that these are low probability / high magnitude loss amounts. Grandstanding on these values just for the shock factor is the equivalent of crying wolf and undermines the value we can provide to our decision makers.

Now, our decision maker is faced with a harder decision. Do I assume or mitigate the risk associated with an expected loss amount of $308,636 or does this 1-in-5 loss magnitude value of $380,000 stand out to me? While it may seem like we are dealing with a small difference between the mean and the 1-in-5 values – risk tolerance, risk thresholds, and risk management strategies vary between decision makers and organizations.

Here is the takeaway: as you start going down the risk quantification road, keep the following in mind:

1.    There is NO absolute 100% guaranteed predictable loss value – especially from a simulated loss distribution; but you have to report something. Thus choose a tool that lets you see the points from the distribution – not just a single value.

2.    Be mindful of how you articulate risk values. A consistent theme I hear and read about on a regular basis is that risk implies uncertainty – always. You need to underscore this when articulating risk to leadership.

3.    Have the discussion with your management / decision makers as to what loss value they would prefer to see. Their feedback may highly influence the value you report.

4.    Use the right value for the right purpose. For single risk issues, expected loss amounts may be appropriate. For a loss distribution (model) that represents dozens or even hundreds of risks, the 1-in-5, 1-in-10, 1-in-20, and maybe some tail risk values may be the best values to react to or budget for.

Have a great Memorial Day weekend!

* In the interest of transparency, the observant reader will notice that my mean LEF is actually 4.17. For simulation purposes, I round the generated loss event frequency values to the nearest integer; in a given year you can’t have 4.17 loss events – you would have either 4 or 5. However, if you take the product of 4.17 and $73,400 – $306,078 – you will notice that it is within a few thousand dollars of the simulation’s mean and median values.


10 Responses to The Risk Is Right.

  1. […] it strikes me as very useful. For those FAIR fans out there, it is very applicable to using FAIR. The Risk Is Right. << Risktical Ramblings Tags: ( risk-management […]

  2. Patrick Florer says:

    Hi, Chris –

    My comments from this morning seem to have disappeared, so here goes again.

    Median: if there are an odd number of values in the sample or population, then it’s the middle value, as you say. If the number of values in the sample is even, then some people say to take the average of the two values closest to the middle. Take your pick.

    Question: What do you mean by the following language?

    “a distribution that leverages the LM mean and standard deviation …”

    Is this a calculation of some sort? If so, what are you doing with the mean and std deviation?

    Comment: since FAIR Lite uses a betaPERT function that gives weight to the most likely estimate, why would you want to go through the exercise that you have? The sampling of the distribution is, by definition, skewed towards the most likely value.

    Comment: for some types of events – a large data breach, say – even if the simulation calculates multiple loss events per year, common sense tells us that once the event happens, it is not likely to happen again, during the same year, at least. In this case, we have to set LEF = 1 and calculate LM from there.

    Comment: I have found it useful to do FAIR Lite analyses in groups of three: a most optimistic, a most likely, and a least likely scenario. By presenting the results together, I can express a range of estimates that I, at least, find useful in making a decision. I have a model that does this, if you would like to see it.

    Thanks for your post!

    Best regards,

    Patrick Florer
    Dallas

  3. Chris Hayes says:

    @Patrick – Thanks for the comment. Your first post was tagged as SPAM by WordPress. From my perspective, quantifying the risk and modeling the risk are two separate things. So when I say leverage – what I am doing is taking the mean and standard deviation of all the loss magnitude values from a risk quantification simulation tool (FAIRLite) and using them for modeling purposes. To be more specific, within Excel, you can use the NORMSINV function to randomly generate values from a normal distribution if you know the mean and standard deviation from the distribution. The beauty of this is that you are now able to account for values further in the tail of the distribution. For example, if the FAIRLite tool generated loss event values of:

    Minimum: 1
    Most Likely: 4
    Maximum: 8

    Using NORMSINV with the MEAN and STDEV from all the simulation values may give you some values further in the tail; possibly zeros, maybe a 9, possibly a 10. So how does this relate to betaPERT? BetaPERT is bounded by the subject matter expert’s estimates. While that may be more than sufficient for generating a distribution that accounts for 90% certainty, leveraging the approach above for modeling purposes extends that certainty further.
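
    For what it is worth, here is a rough sketch of the same inverse-CDF trick outside of Excel, using scipy’s norm.ppf in place of NORMSINV; the mean and standard deviation below are placeholders, not actual FAIRLite output:

    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)

    # Placeholder parameters; pretend they were computed from the FAIRLite
    # simulation output (they are not the actual values from the tool).
    lef_mean, lef_sd = 4.0, 1.8

    # One Excel equivalent: =ROUND(lef_mean + lef_sd * NORMSINV(RAND()), 0)
    # Below is the same inverse-CDF idea applied to uniform draws.
    u = rng.uniform(size=10)
    lef_draws = np.rint(norm.ppf(u, loc=lef_mean, scale=lef_sd)).astype(int)
    lef_draws = np.clip(lef_draws, 0, None)  # no negative event counts

    print(lef_draws)  # can drift past the 1-8 bounds: an occasional 0, 9, or 10
    ```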

    Regarding your loss event comment – “common sense tells us that the first loss event…” – I do not disagree with your statement. But this may not be the case for a lot of risk scenarios – especially in the world of AV. In addition, we have to remember that we are here to give the decision maker the best information we have. We cannot assume that there will be mitigation activity taken just because one loss event occurred. Regardless, in those scenarios where you think a one-time, large loss magnitude event is going to result in response / mitigation activity that leaves the future LEF (after the initial loss event) near 0, I would articulate that to management. For me personally, I would not adjust the loss impact downwards until after I have had a conversation with management.

    Thanks for the comment. Let me know if further clarification is needed. Also, please do share anything you are working on that you think would benefit the community at large.

  4. Patrick Florer says:

    @Chris –

    Thank you for the very helpful clarifications.

    Fascinating stuff, especially trying to get your hands around the tail events in a believable and meaningful way.

    My years of work in medical outcomes taught me the importance of what I would call the “sniff” test – while it’s true that our brains aren’t really wired to think correctly about probability, at the end of the day, every analysis has to pass the test of common sense.

    I have been thinking for some time to try sampling an exponential or Pareto distribution instead of a normal distribution for comparison purposes, but my math skills are a bit rusty. It seems like that might be another way to get at estimates of tail events. Maybe you know how to do this?

    We agree about LEF – some types of events like AV are likely to occur much more frequently than once per year.

    Patrick

  5. Chris Hayes says:

    @Patrick – I would be very cautious about using various distributions. I have often heard comments about matching the distribution to the shape of the curve.

    But this is not always the right approach. For example, the exponential distribution is usually leveraged in scenarios where you need to represent the time between events. In the absence of a large amount of data, this would seem to be problematic for most information security events.

    The beauty of the betaPERT is that the SME estimates can take the form of curves related to other continuous distributions (e.g., lognormal, F, maybe even exponential). There are philosophical types out there who even debate whether the normal distribution is past its prime. They can debate all they want. In the meantime, I believe that how RMI is leveraging the betaPERT in FAIRLite offers the most flexibility from a risk quantification perspective.

    Finally, regarding the sniff test – I absolutely agree. I recently “modeled” some risk issues where the aggregate expected loss magnitude value was very large, ~$12M. There was no way this was close to being accurate. The loss estimates that were provided were nowhere close to being realistic. SNIFF test WIN in this case – not the right risk amount. The ironic part was that a few model iterations later, when the aggregate expected loss magnitude was around a tenth of the original, one of the people who had provided input to the original estimates stated that the more realistic model was too much. You should have seen the look I received when I showed this individual the $12M model with the original estimates.

    Good discussion.

  6. […] The Risk Is Right. « Risktical Ramblings (tags: riskanalysis fair chrishayes) […]

  7. Adam says:

    Hi Chris,

    I think this might all be great, if we had some reasonable way to determine the numbers which go in.

    You say “For example, if I perform 1001 simulations where a value between $1 and $10 is drawn, I would add up the sum of all the simulated values and divide it by 1001.”

    How do you go about determining that?

  8. Chris Hayes says:

    @Adam – Thanks for reading the blog post and taking a few minutes to leave a comment. There are probably a few “reasonable” risk assessment methodologies and various tools one could leverage to feed into a loss model. I am very fond of FAIR and the FAIRLite tool. I do not consider myself just a “user” of the methodology / tool, but someone who understands every aspect of it. Working in the financial services industry – insurance – our information security risk practitioners have to be ready at all times to talk about how they derived either quantitative or qualitative risk values. In addition, both the FAIR methodology and the FAIRLite tool have withstood some pretty serious scrutiny from academia, information security professionals, and Fortune 100 enterprise risk management (ERM) professionals and modelers.

    Subject matter expertise – in regard to risk assessments and loss models, and especially in the absence of large data sets – is a very acceptable method of model or tool input. However, there are prerequisites. There needs to be calibration or estimation training, usually with the goal of making estimates in which the SME is 90% confident. There needs to be reasonable logic in choosing priors (if available). Are the estimates for threat event frequency based on information your organization has, the industry your organization is in, etc.? The reality is that the threat landscape changes from organization to organization.

    I am a strong advocate of FAIR / FAIRLite because, when used right, it usually results in better decision making, consistent use of a simple methodology, and easier conversations with management.

    Finally – pardon the soapbox moment; it is not directed at you – IT folks, and especially information security folks, need to take a step back and really understand that uncertainty and risk go hand in hand. I strongly believe that if those in the information security industry who frown upon and discourage risk quantification would learn more about the risk management profession, we could mature this discipline within our profession very quickly. This is no doubt easier said than done.
