It Is Too Hard Or Impossible…

July 15, 2014

Admitting that you don’t know how to make the sausage will always cast doubt on the quality of the sausage you do produce.

One of my personal risk management aggravations relates to risk management professionals who claim it is too hard or impossible to quantify the frequency or severity of loss. First, there is the irony that we operate in a problem space defined by uncertainty and then make absolute statements that something cannot be done. When I hear this type of statement, I will often challenge the person on the spot – keeping in mind the audience – in an effort to pull that person back from the edge of mental failure. And make no mistake, I struggle with quantification as well – but to what degree I share that with stakeholders or peers is an aspect of professional perception that I intentionally manage. Reflecting on my own experiences and interactions with others, I want to share some quick litmus tests I use when addressing the “it is too hard or impossible” challenges.

1. Problem Scoping. Have I scoped the problem or challenge too broadly? Sometimes we take these super-big, gnarly problem spaces and become fascinated with them without trying to deconstruct the problem into more manageable chunks. Often, once you begin reducing your scope, the variables that drive frequency or severity will emerge.

2. Subject Matter Experts. This is one litmus test that I have to attribute to Jack Jones and the FAIR methodology. Often, we are not the best people to be making subject matter estimates for the variables that factor into the overall risk. The closer you get to the true experts and extract their knowledge for your analysis, the more robust and meaningful your analysis will become. In addition, leveraging subject matter experts fosters collaboration and, in some cases, innovation, as leaders of unrelated value chains realize there is an opportunity to reduce risk across one or more chains.

3. Extremes and Calibration. Once again, I have Jack Jones to thank for this litmus test, and Doug Hubbard as well. Recently, a co-worker declared something was impossible to measure (a workforce-related increase in expense). After his “too hard” declaration, I simply asked: “Will it cost us more than $1BN?” The question stunned my co-worker, which resulted in an “Of course not!” – to which I replied, “It looks like it is greater than zero and less than $1BN; we are making progress!” Here is the point: we can tease out extremes and leverage calibration techniques to narrow down our uncertainty and define ranges for our variables, versus anchoring on a single, discrete value.

4. Am I Trying Hard Enough? This is a no-brainer, but unfortunately I feel too many of us do not try hard enough. A simple phone call, email or even a well-crafted Google query can quickly provide useful information that in turn reduces our uncertainty.

These are just a few “litmus tests” you can use to evaluate whether an estimate or scenario is too hard to quantify. But here is the deal: as risk professionals, we are expected to deal with the tough things so our decision makers don’t have to.


Simple Risk Model (Part 4 of 5): Simulating both Loss Frequency & Loss Magnitude

February 5, 2011

Part 1 – Simulate Loss Frequency Method 1
Part 2 – Simulate Loss Frequency Method 2
Part 3 – Simulate Loss Frequency Method 3

In this post we want to combine the techniques demonstrated in parts two and three into a single simulation. To accomplish this simulation we will:

1.    Define input parameters
2.    Introduce VBA code – via a macro – that consumes the input parameters
3.    Perform functions within the VBA code
4.    Take the output from functions and store them in the spreadsheet
5.    Create a histogram of the simulation output.

Steps 3 and 4 will be performed many times, depending on the number of iterations we want to run in our simulation.

You can download this spreadsheet to use as a reference throughout the post. The spreadsheet should be used in Excel only. The worksheets we are concerned with are:

test – This worksheet contains code that will step through each part of the loss magnitude portion of the simulation. By displaying this information, it allows you to validate that both the code and the calculations are functioning as intended. This tab is also useful for testing code in small iterations. Thus, the number of iterations should be kept fairly low (“test”; B1).

prod – Unlike the “test” tab, this tab does not display the result of each loss magnitude calculation per iteration. This is the tab that you would want to run the full simulation on; thousands of iterations.

Here we go…and referencing the “prod” worksheet…

Input Parameters.
Expected Loss Frequency. It is assumed for this post that you have estimated or derived a most likely or average loss frequency value. Cell B2 contains this value. The value in this cell will be one of the input parameters into a POISSON probability distribution to return an inverse cumulative value (Part 2 of this Series).

Average Loss Magnitude. It is assumed for this post that you have estimated or derived a most likely or average loss magnitude value. Cell B3 contains this value. The value in this cell will be one of the input parameters into a NORMAL probability distribution to return an inverse cumulative value (Part 3 of this Series).

Loss Magnitude Standard Deviation. It is assumed for this post that you have estimated or derived the standard deviation for loss magnitude. Cell B4 contains this value. The value in this cell will be one of the input parameters into a NORMAL probability distribution to return an inverse cumulative value (Part 3 of this Series).

The Simulation.
Clicking the button labeled “Prod” on the “prod” tab executes a macro composed of VBA code. I will let you explore the code on your own – it is fairly intuitive. I also left a few comments in the VBA so I remember what certain sections of the code are doing. There are four columns of simulation output that the macro will generate.

Iter# (B10). This is the iteration number. In cell B1 we set the number of iterations to be 5000. Thus, the VBA will cycle through a section of its code 5000 times.

LEF Random (C10). For each iteration, we will generate a random value between 0 and 1 to be used in generating a loss frequency value. Displaying the random value in the simulation is not necessary, but I prefer to see it so I can informally analyze the random values themselves and gauge the relationship between the random value and the inverse cumulative value in the next cell.

LEF Value (D10). For each iteration, we will use the random value we generated in the adjacent cell (column c), combine it with the Expected Loss Frequency value declared in B2 and input these values as parameters into a POISSON probability distribution that returns an inverse cumulative value. The value returned will be an integer – a whole number. Why a whole number? Because you can’t have half a loss event – just like a woman cannot be half pregnant ( <- one of my favorite analogies). This is a fairly important concept to realize from a loss event modeling perspective.
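Excel ships the POISSON function (probability mass and cumulative forms) but no built-in inverse, so a macro has to invert the cumulative distribution itself. The helper below is a minimal VBA sketch of one way to do that – walk up the Poisson CDF until it reaches the random probability. It is an illustration of the technique, not necessarily how the code in the workbook is written:

Function PoissonInv(p As Double, lambda As Double) As Long
    ' Return the smallest whole number k whose cumulative Poisson probability is >= p
    Dim k As Long
    k = 0
    Do While Application.WorksheetFunction.Poisson(k, lambda, True) < p
        k = k + 1
    Loop
    PoissonInv = k
End Function

For an expected loss frequency well below one, this returns zero for most random values – which is exactly the behavior described above.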

Loss Magnitude (E10). For each iteration, we will consume the value in the adjacent cell (column D) and apply logical rules to it.

a.    If the LEF Value = 0, then the loss magnitude is zero.
b.    If the LEF Value > 0, then for each instance of loss we will:
1.    Generate a random value
2.    Consume the average loss magnitude value in cell B3
3.    Consume the loss magnitude standard deviation in cell B4
4.    Use the values referenced in 1-3 as input parameters into a Normal probability distribution and return an inverse cumulative value. In other words, given a normal distribution with mean $2000 and standard deviation of $1000 – what is the value of that distribution point given a random value between 0 and 1.
5.    We will add all the instances of loss for that iteration and record the sum in column E.

Note: Steps 4 and 5 can be observed on the “test” worksheet by clicking the button labeled “test”.

The code will continue to loop until we have completed the number of iterations we specified in cell B1.
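To make the loop concrete, here is a minimal VBA sketch of the iteration logic just described. It assumes the cell layout outlined above (iterations in B1, expected loss frequency in B2, average loss magnitude in B3, standard deviation in B4, output starting in row 10) and reuses the hypothetical PoissonInv helper sketched earlier; the macro that actually ships in the workbook may be organized differently:

Sub RunProdSimulation()
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("prod")

    Dim iterations As Long, i As Long, j As Long
    Dim lef As Double, avgLoss As Double, sdLoss As Double
    iterations = ws.Range("B1").Value   ' number of iterations (e.g. 5000)
    lef = ws.Range("B2").Value          ' expected loss frequency
    avgLoss = ws.Range("B3").Value      ' average loss magnitude
    sdLoss = ws.Range("B4").Value       ' loss magnitude standard deviation

    Dim rndLEF As Double, rndLM As Double, lefValue As Long, lossSum As Double
    Randomize
    For i = 1 To iterations
        rndLEF = Rnd()                          ' random value between 0 and 1
        lefValue = PoissonInv(rndLEF, lef)      ' whole number of loss events this iteration
        lossSum = 0
        For j = 1 To lefValue
            rndLM = Rnd()
            If rndLM = 0 Then rndLM = 0.000001  ' NORMINV is undefined at probability 0
            ' one Normal inverse cumulative draw per loss event, floored at zero
            lossSum = lossSum + Application.WorksheetFunction.Max( _
                Application.WorksheetFunction.NormInv(rndLM, avgLoss, sdLoss), 0)
        Next j
        ws.Cells(9 + i, 2).Value = i            ' Iter#          (column B)
        ws.Cells(9 + i, 3).Value = rndLEF       ' LEF Random     (column C)
        ws.Cells(9 + i, 4).Value = lefValue     ' LEF Value      (column D)
        ws.Cells(9 + i, 5).Value = lossSum      ' Loss Magnitude (column E)
    Next i
End Sub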

The Results. Now that the simulation is complete we can begin to analyze the output.

# of Iterations With No Loss (B5). This is the number of iterations where the returned inverse cumulative value was zero.

# of Iterations With Loss (B6). This is the number of iterations where the returned inverse cumulative value was greater than zero.

# of Loss Events (B7). This is the sum of loss events across all the iterations. Some iterations had more than one loss event.

Max. # of Loss Events for an iteration (B8). This is the maximum number of loss events for any given iteration.

Next, let’s look at some of the simulation output in the context of loss severity (dollars).

Min. Loss (K6). This is the minimum loss value returned from the simulation. I round the results to the nearest hundred in the worksheet.

Max. Loss (K7). This is the maximum loss value returned from the simulation. I round the results to the nearest hundred in the worksheet.

Median (G5). This is the 50th percentile of the simulation results. In other words, 50% of the simulation results were equal to or less than this value.

Average (G6). This is the average loss value for the simulation: the sum of all the loss magnitude values divided by the number of iterations. This value can quickly be compared to the median to make inferences about the skew of the simulation output.

80th % (G7). This is the 80th percentile of the simulation results. In other words, 80% of the simulation results were equal to or less than this value. In some industries, this is often referred to as the 1-in-5 loss.

90th % (G8). This is the 90th percentile of the simulation results. In other words, 90% of the simulation results were equal to or less than this value. In some industries, this is often referred to as the 1-in-10 loss.

95th % (G9). This is the 95th percentile of the simulation results. In other words, 95% of the simulation results were equal to or less than this value. In some industries, this is often referred to as the 1-in-20 loss.

99th % (G10). This is the 99th percentile of the simulation results. In other words, 99% of the simulation results were equal to or less than this value. In some industries, this is often referred to as the 1-in-100 loss.

Note 2: Generally speaking, the 95th, 99th and greater percentiles are often considered part of the tail of the loss distribution. I consider all the points in cells G5:G10 to be useful. For some loss exposures, the median and average values are more than enough to make informed decisions. For others, the 80th, 90th, 95th, 99th and even larger percentiles are necessary.
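If you want to reproduce these summary statistics yourself, plain Excel functions will do it. The formulas below assume 5,000 iterations with the loss magnitude output in E10:E5009 (the exact ranges in the workbook may differ):

=MIN(E10:E5009)
=MAX(E10:E5009)
=MEDIAN(E10:E5009)
=AVERAGE(E10:E5009)
=PERCENTILE(E10:E5009,0.8)
=PERCENTILE(E10:E5009,0.9)
=PERCENTILE(E10:E5009,0.95)
=PERCENTILE(E10:E5009,0.99)

In Excel 2010 and later, PERCENTILE.INC is the equivalent of PERCENTILE.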

Simulated Loss Magnitude Histogram. A histogram is a graphical representation showing the distribution of data. The histogram in the “prod” worksheet represents the distribution of data for all iterations where the loss was greater than zero.

Wrap Up. What I have presented in this post is a very simple model for a single loss exposure using randomness and probability distributions. Depending on your comfort level with VBA and your creativity, you can easily build out more complex models, whether you want to model hundreds of loss exposures or just a few dependent ones.


Simple Risk Model (Part 3 of 5): Simulate Loss Magnitude

December 22, 2010

Part 1 – Simulate Loss Frequency Method 1
Part 2 – Simulate Loss Frequency Method 2

In parts one and two of this series we looked at two methods for simulating loss frequency. Method one – while useful – has shortcomings, as it primarily requires working with expected loss frequency values less than 1 (once per year). In addition, with method one, it was not possible to identify iterations where loss frequency could be greater than once per year.

Method two overcame these limitations. We leveraged the Poisson probability distribution (a discrete distribution), an expected loss frequency value, and a random value between 0 and 1 to return a loss frequency value (an integer) for any given iteration. Using this method, about 10% of our iterations resulted in loss events, and some of those iterations had multiple loss events. From my perspective, method two is the more useful of the two – especially since it can account for low probability situations where there could be numerous loss events in any simulation iteration.

The purpose of this post is to simulate loss magnitude. Conceptually, we are going to do what we did with loss frequency method two – but our distribution and input parameters will differ. To simulate loss magnitude we need four things:

1.    A continuous probability distribution.
2.    A random value between 0 and 1
3.    An expected or average loss magnitude
4.    A loss magnitude standard deviation

Continuous Probability Distribution. Technically, if you have internal or external loss magnitude data, you would analyze that data and fit it to an appropriate continuous probability distribution. There are dozens of such distributions. Often, though, we have limited data or we need to make good faith (or “educated”) assumptions about the shape of our loss magnitude curve. Loss magnitude curves for IT risk scenarios are often assumed to be normal or lognormal in nature. Normal is often assumed but it has its limitations, since it allows negative values and there is rarely a “perfect” normal loss magnitude curve for IT risk scenarios. However, most of the “normal-like” distributions converge to normal (as data points increase). Thus, for the purposes of demonstration I am going to use the normal distribution.
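For reference, if you would rather assume a lognormal shape, Excel exposes it through LOGINV (LOGNORM.INV in Excel 2010). Keep in mind that its mean and standard deviation parameters describe ln(x), not the dollar values themselves. The example below is purely illustrative – the cell reference B9 anticipates the random values generated in the walkthrough further down, and the parameters simply place the median of the curve at $5000 with an assumed sigma of 0.5:

=ROUND(LOGINV(B9,LN(5000),0.5),-1)

One nice property of the lognormal is that it never goes negative, so the MAX guard used later in this post is not needed.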

Random Value Between 0 and 1. Because we are dealing with uncertainty and a distribution, we will use random values between 0 and 1 in our probability distribution; think Excel function RAND().

Expected or Average Loss Magnitude. Statistics 101 – if you take the sum of N values and divide by N, you get the average. Quantitative risk analysis methodologies like FAIR can facilitate deriving an average loss magnitude estimate. Or maybe you have actual loss magnitude data points. How you derive average loss magnitude is not the focus of this post – just remember that to use the normal distribution you need that average loss magnitude value.

Loss Magnitude Standard Deviation. More Statistics 101. At a high level, standard deviation is a measure of how spread out our data points are relative to the mean. The larger the number, the wider or flatter our distribution (think bell curve) will be; the smaller the number, the narrower the bell curve will be. In the interest of brevity, it is assumed that either you can use existing Excel functions to calculate a standard deviation from your loss magnitude data points, or your risk analysis tool sets will provide this value to you. In some cases you may not have actual data sets to calculate a standard deviation, let alone an average magnitude value – in those cases we have to make our best estimates and document assumptions accordingly.

How do these work together? In layman’s terms – given a normal loss distribution with an average loss magnitude of $5000 and a standard deviation of $1000, what is the loss value (inverse cumulative value) at any point in the distribution, given a random probability value?

You may want to download this Excel spreadsheet to reference for the rest of the post (it should work in Excel 2003, Excel 2007 and Excel 2010; I have not tested it on Office for Mac). Reference tab “magnitude” and make sure you view it in Excel and NOT in Google Apps.


a.    The average loss magnitude amount is $5000 (cell B1; tab “magnitude”)

b.    The loss magnitude standard deviation is $1000 (cell B2; tab “magnitude”)

c.    For purposes of demonstration, we will number some cells to reflect the iteration numbers (A9:A1008; A9=1; A10=A9+1; drag A10 down until you reach 1000 at A1008).

d.    In Excel, we would use the =RAND() function to generate the random values in cells B9:B1008.

e.    Now, in column C beginning in cell C9 – we are going to combine a Normal probability distribution with our average loss ($B$1), standard deviation ($B$2) and the random value(s) in column B to return a loss value. In other words, given a normal distribution with mean $5000 and standard deviation of $1000 – what is the value of that distribution given a random value between 0 and 1, rounded to the nearest 10? You would type the following in C9 and then drag C9 down to C1008:
=ROUND(MAX(NORMINV(B9,$B$1,$B$2),0),-1)

Let’s dissect this formula.

i.    ROUND. I am going to round the output of this formula to the nearest 10; indicated by the -1.
ii.    MAX. Because we are using the normal distribution, some values could be less than zero, which is not applicable for most IT scenarios; so we compare the value generated by the NORMINV function to 0. Whichever is larger is the value that then gets rounded to the nearest 10.
iii.    NORMINV. This is the function built into Excel that returns an inverse cumulative value of a normal distribution given a probability, a mean and a standard deviation.

f.    Once you have values in all the cells – hit F9 a few times.

g.    Cell B3 gives the minimum loss value from cells C9 through C1008. The random value associated with the minimum value is probably less than 0.00xxxx.

h.    Cell B4 gives the maximum loss value from cells C9 through C1008. The random value associated with the maximum value is probably greater than 0.99xxxx.

i.    The histogram shows the count of iterations whose loss magnitude values fall within each loss magnitude bin (one way to produce these bin counts yourself is sketched after this list). If you drew a line around the tops of the columns it would resemble a bell curve. We expect this since we are using the normal distribution.

j.    Press the F9 key; new random values will be generated. Every time you press F9 think of it as a new simulation with 1000 iterations. Press F9 lots of times and you will notice that the histogram changes as well. While individual bin counts will change – the general shape of the histogram does not.

k.    By the way, if you change the average loss magnitude value in cell B1 – the histogram will probably break. But you can change the value in B2 to 500, hit F9 a few times and observe how the bell-curve shape becomes narrower. Or, change B2 to 2000 and you will see a much flatter bell curve.
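If you want to generate bin counts like these yourself, Excel’s FREQUENCY array function is one option. The ranges below are illustrative – they assume the simulated losses sit in C9:C1008 (as above) and that a column of bin upper limits has been placed in, say, E9:E28:

{=FREQUENCY(C9:C1008,E9:E28)}

Select the output range (one cell taller than the bin column), type the formula, and confirm it with Ctrl+Shift+Enter so Excel treats it as an array formula.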

KEY TAKE-AWAY(S)

1.    As we did with simulating loss frequency, we leverage randomness to simulate loss magnitude.

2.    While we typically talk about an average loss magnitude value, losses can range in terms of magnitude. Being able to work within a range of loss values gives us a more complete view of our loss potential.

In part four of the series, we will combine loss frequency and loss magnitude into one simulation. For every iteration, we will randomly derive a loss frequency value (an integer) and a loss magnitude value. We will then calculate an expected loss, which is the product of the loss frequency and loss magnitude values. Perform this cycle thousands or millions of times and you now have an expected loss distribution.


Useful Risk Modeling / Simulation Excel Add-Ons

October 18, 2010

I have had a lot of “modeling” and simulation conversations over the last several months. One common strand of these discussions is around Excel add-ons used to perform simulations and analysis. I want to share with you a few thoughts as well as some of the add-ons I use.

THOUGHTS

1.    Add-ons are great and they can make a lot of aspects of simulation and analysis “easy”. However, I would argue that you still need to understand “the why” and “the how” of the add-on’s features and functionality to ensure you are using it correctly. One of my modeling mentors discourages extensive use of “fancy” add-ons – he prefers plain vanilla Excel. He prefers native Excel because most of the add-ons are not worth the money. He has also suggested that if you are forced to write your own calculations and create your own models / charts, it forces you to learn the trade – especially the probability theory and statistics. There is truth in what he says, but one has to balance the time investment against the learning.

2.    Generally speaking, most of what you need for Monte Carlo simulations is already in Excel. So, if you have the input parameters you need, know the right distribution to reference, and have little to moderate VBA skills – you can begin to conquer the world (see the one-cell example after this list).

3.    Add-on documentation. Any add-on or 3rd party plug-in should have ample documentation. In cases where the add-on includes its own probability distributions – understand why you would use one tool’s probability distribution implementation over any other.
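To illustrate point 2, a single cell of plain Excel can already simulate one year of a simple loss exposure with no add-ons at all. The parameters here are purely illustrative – a 10% chance of a loss event in the year, with magnitude drawn from a normal distribution with mean $5000 and standard deviation $1000, floored at zero and rounded to the nearest 10:

=IF(RAND()<0.1,ROUND(MAX(NORMINV(RAND(),5000,1000),0),-1),0)

Copy that cell down a few thousand rows, summarize with AVERAGE and PERCENTILE, and you have a crude Monte Carlo simulation in native Excel.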

TOOLS

1.    Microsoft Excel. At a minimum you want to use Excel 2007 – just to get around some of the row and column limitations of previous versions of Excel. Do not use versions of Excel prior to Excel 2003 as the random number generator was flawed. With Excel 2010, Microsoft included a new random number generator algorithm called Mersenne Twister. I won’t bore you with the details – just understand that it is a good thing and reduces the need for yet another Excel plug-in or 3rd party tool.

2.    SimTools. This is a recent find for me and it is FREE. The primary reason I like this add-in is for its broad implementation of inverse cumulative-probability functions that Excel 2007 does not have. Another reason I like this add-on is that the SimTool VBA code can be found here; just in case you need to validate calculations that are occurring or if your work is being audited / challenged.

3.    Simulacion. I was originally turned onto this add-on by my modeling mentor and it’s free as well, though there is a shortage of documentation. Some of the useful features are distribution fitting and correlation capabilities. I do not use this add-on very much – but the distribution fitting capability alone makes it worth having.

4.    RiskAmp. I was introduced to RiskAmp via Risk Management Insight (RMI). Some of RMI’s tools leverage (require) the RiskAmp add-on. RiskAmp includes a few continuous probability distributions not included in Excel. One specifically is the betaPERT distribution. RiskAmp’s implementation of betaPERT allows for a fourth parameter called lambda, which determines the height of the distribution. For what it’s worth, Oracle’s Crystal Ball betaPERT implementation does not allow for lambda. So, if betaPERT is a big deal in your analyses / simulations and you don’t need other flash-whiz-bang functionality out of Oracle Crystal Ball – save yourself over $1,000 and purchase a copy of RiskAmp. One more thing on RiskAmp – I cannot prove it yet – but I think there are some memory sharing issues with their code. If you have multiple spreadsheets open that use RiskAmp for simulations, the speed of any one simulation is painfully slow.

5.    Mersenne Twister (MT) pseudorandom number generator. The folks at Structured Data (makers of RiskAmp) were nice enough to make an MT pseudorandom number generator add-on available for FREE. I have used this add-on in both Excel 2003 and Excel 2007. There is no need for it if you are using Excel 2010 – since Microsoft implemented the MT algorithm in its RAND() function.

6.    XLSim. This is a nifty Monte Carlo simulator that has a bunch of goodies in it that make it worth the couple of hundred dollars. There is a lot of flexibility with the chart output and the simulation menus are very intuitive. XLSim also includes the ability to store simulation results via the DIST standard. I will limit discussion of DIST for this post – but there are quite a few vendors that have already accommodated their software to create and/or work with DIST files.

7.    MegaStat. MegaStat came with one of my decision science textbooks while I was taking courses at the Fisher College of Business, The Ohio State University. There is a great correlation matrix generator, the descriptive statistics functionality is great, and there are some other useful functions. While you can definitely find MegaStat for free on the Internet – consider buying a textbook from the creator of MegaStat, Professor J.B. Orris of Butler University. It’s ethically the right thing to do and it’s a small price to pay for a great utility!

8.    XY Chart Labeler. This FREE Excel add-on has nothing to do with statistics or probability distributions. I use this add-on to add labels to data points for some charts. It does not sound like much, but the functionality makes up for a shortcoming in Excel.

That’s it for now. Feel free to leave comments, especially if you have a really useful add-on or tool that you have used. I will mark as SPAM any comments that I feel are purely product pitches – with no substantive, experience-based content.


Heat Map Love

May 6, 2010

First, I would like to welcome Jack Jones *back* to the world of risk blogging. Jack blogged a few weeks ago on the subject of heat maps; “Lipstick on Pigs” and “Lipstick Part II”; prompting a response by Jared Pfost of Third Defense. These are great posts that underscore the need to structure and leverage heat maps in an effective and defensible manner.

The purpose of this post is to share a recent “ah hah” moment involving heat maps and loss distributions. Whether or not you are an advocate of risk quantification or simulation modeling – it is hard to criticize someone for having tools or procedures in place that essentially serve as a “risk sniff test”. I consider reconciling portions of a loss distribution to a heat map a pretty useful sniff test.

QUESTION TO BE ANSWERED: How do I reconcile – or validate – the plotting of a heat map bubble with a loss distribution?

Well, it depends…but let’s establish some context.

•    5-by-5 heat map. The X axis of the heat map represents “Frequency of Loss”; the Y axis represents “Magnitude of Loss”. Each axis is broken into 5 sections.

•    Let’s say we have a heat map whose bubbles represent categories of risk issues (ISO 27002 categories, BASEL II OpRisk categories, etc…).

•    At a minimum, all of these issues have been assessed with some methodology and/or tool (I prefer FAIR) that allow us to associate the frequency and magnitude of loss for each and every issue.

•    We can perform thousands of simulation iterations for each and every issue in the risk repository, perform analysis, determine categories of risk that are contributing the most to various percentiles of the loss distribution, and then associate them with a heat map.

•    For the purpose of this post we are going to make a good faith assumption that our loss distribution resembles a log-normal or normal “like” distribution.

Back in my “Rainbow Risk” post I shared an example of a “rainbow chart”; a 100% stacked bar chart representing the contribution of loss that a category contributes (by percentage) to any given loss distribution percentile. For example, the rainbow chart in that post showed that the Business Continuity Mgmt category of risk issues accounted for about 55% of the risk at the 99th percentile. On a heat map, the most significant IT business continuity issues are probably going to be very low frequency, very high magnitude events. Thus, it is fairly intuitive that very low frequency / very high magnitude loss events would “drive” the tail of a given loss distribution.

In the images below, I have mapped areas on a heat map (image 1) to areas on a distribution (image 2). Specifically, I am trying to illustrate how the frequency and magnitude of any given issue factor into, or are most likely represented in, a loss distribution.

Image 1
Image 2

Area A – Very low frequency, very high magnitude risk issues. These types of events or risk issues drive the tail portion of a loss distribution.

Area B – Very low frequency, moderate or high magnitude risk issues; or low to moderate frequency, very high magnitude loss events. It can be said that these type of issues also drive the tail – but maybe not as much past the 99th percentile like issues associated with Area A.

Area C – Low to Moderate frequency, moderate or high magnitude.  These issues are best represented in the middle of the distribution; generally speaking, around one standard deviation on both sides of the mean.

Area D – Very frequent, moderate or high magnitude. Loss associated with these issues is not as severe as that of Areas A and B, but is typically greater than the mean expected loss.

Area E – Very frequent, very high magnitude. Generally speaking, these issues probably drive the portion of the distribution between 1 and 2.5 standard deviations (to the right of the mean).

Area F – Low or moderate frequency, low or moderate magnitude. These issues best factor into the area of the distribution left of the mean. Loss associated with these issues is less than the mean.

In closing, I will share at least one use case for performing this analysis or validation: key risk heat maps. If all of your issues have frequency and magnitude values as well as some other attributes associated with them, you can:

1.    Perform simulations on all of these issues.
2.    Calculate their contributions to various distribution percentiles
3.    Analyze the results by various attributes (ISO 27002, BASEL II, IT Process, etc…).
4.    Chart derived information (categories of risk) on a heat map
5.    Review for plausibility / accuracy (this should occur all the time)

I welcome any feedback!


Rainbow Risk

April 1, 2010

One of the benefits of quantifying risk for an information risk issue or finding is that it gives us an additional dimension for aggregate risk analysis. Instead of just simple counts of issues categorized by ISO 27002 section – we can now analyze risk (dollar values) in numerous contexts, limited primarily by the data you are collecting and your ability to associate data elements to various frameworks (ISO 27002, BASEL II, COBIT IT Processes, etc…).

Loss simulation and aggregate risk analysis is a big part of my world right now. However, one of the most complex challenges I face is visually representing risk in a meaningful manner to management. So, over the next few posts, I want to share some visualizations of risk and how they could possibly be used for decision making.

The chart above is often referred to as a “rainbow chart”. Technically speaking, though, it is a 100% Stacked Bar Chart. What this chart depicts is the breakdown of risk (percentage of loss and ISO 27002 section) of a simulated aggregate loss distribution, by loss distribution percentile. Say what…?

Let’s break it down…

Warning – I am going to oversimplify some statistical concepts. And yes – all of the percentages and dollar values are from a dataset with fictitious data for the purposes of illustration.

Output from simulations is often analyzed by percentiles. We often zero in on the 50th percentile, sometimes used as a rough indicator of the expected loss for a defined time frame. For illustration purposes let’s associate some dollar values to loss distribution percentiles:

5th percentile: $100K
50th percentile: $1M
80th percentile: $2M
99th percentile: $5M

Given the dollar values above as well as the chart we can infer the following:

5th percentile:
a.    5% of the iterations resulted in simulated loss of roughly $100K or less.
b.    Roughly a third of the simulated loss at the 5th percentile is related to Compliance issues (~33%; $33K).
c.    Another 30% of the simulated loss at the 5th percentile is related to Access Control issues (~30%; $30K).
d.    About 20% of the simulated loss at the 5th percentile is related to Systems Development & Maintenance issues (~20%; $20K).
e.    About 8% of the simulated loss at the 5th percentile is related to Business Continuity Mgmt issues (~8%; $8K).

50th percentile:
a.    50% of the iterations resulted in simulated loss of roughly $1M or less.
b.    Roughly 28% of the simulated loss at the 50th percentile is related to Compliance issues (~28%; $280K).
c.    Another 30% of the simulated loss at the 50th percentile is related to Access Control issues (~30%; $300K).
d.    About 22% of the simulated loss at the 50th percentile is related to Systems Development & Maintenance issues (~22%; $220K).
e.    About 12% of the simulated loss at the 50th percentile is related to Business Continuity Mgmt issues (~12%; $120K).

99th percentile:
a.    Only 1% of the iterations resulted in simulated loss greater than roughly $5M (100% - 99% = 1%).
b.    Roughly 12% of the simulated loss at the 99th percentile is related to Compliance issues (~12%; $600K).
c.    Another 15% of the simulated loss at the 99th percentile is related to Access Control issues (~15%; $750K).
d.    About 12% of the simulated loss at the 99th percentile is related to Systems Development & Maintenance issues (~12%; $600K).
e.    About 55% of the simulated loss at the 99th percentile is related to Business Continuity Mgmt issues (~55%; $2.75M).

PRACTICAL APPLICATION

This type of risk visualization allows for the following:

1.    BETTER VISIBILITY. Gives us visibility to risk by loss severity and by ISO 27002 sections.
2.    INFORMED DECISIONS. Allows for tactical and strategic decision making.
a.    Tactical. Risk around the 50th percentile is loss we would expect in a typical year (the time frame being annual). Thus, we can actively manage (reduce or maintain) expected annual loss by targeting categories of risk at or around this percentile.
b.    Strategic. Risk around the 80th (1-in-5) or 90th (1-in-10) percentiles is greater in severity but less likely to occur. However, when you think of this in terms of years and given the volatility of threats in our profession – we have to be mindful of low frequency / high magnitude loss events. Thus, to address the risk at these percentiles – we can be more strategic in our planning and investments. It could take a few years to mitigate the risk down to an acceptable level – but we can spread those mitigation costs over time.
3.    COST / BENEFIT ANALYSIS. Above I listed risk at the 80th percentile to be about $2M. We can quickly see that 30% of the risk at the 80th percentile is related to Access Control (~30%, $600K). If I want to reduce the access control related risk at this percentile and I estimate it is going to cost me $100K – I can demonstrate a risk reduction both mathematically and by re-running my simulations without the issues that would be mitigated by my investment.
4.    DRILL DOWN. Depending on how your issues are tagged, we could drill down into ISO 27002 sections to see which sub-sections are driving the most risk for its parent section.

FINAL THOUGHT. There is uncertainty with risk. Performing aggregate analysis and simulations on your entire risk repository highlights the variability of loss and can make it easier to explain uncertainty to others outside the information security profession. Take for example the “compliance risk” values above. Annual expected loss associated with Compliance-related issues could be as little as $33K, most likely $280K and possibly $600K or worse. Do not underestimate management’s ability to understand this uncertainty and their appetite to make effective decisions around it.


Working With External Data (Part 2 of X)

February 2, 2010

This is the second post of a series related to working with external data for analysis or modeling purposes. You can read the first post HERE or read the “cliff notes” summary below.

***

Part 1 “Cliff Notes”

1.    Know what information you are hoping to derive from the data.
2.    Methodically narrow down your data for relevant data points. More data does not guarantee better or more accurate information.
3.    Some refinement considerations include:
a.    Time frames. Limit your data set to a span of time commensurate with a minimum level of technology as well as a consistent expectation of regulatory / industry standard requirements.
b.    Good fit. Consider points related to your industry, service offering / value proposition, and loss form categories.
c.    Duplicate records. When working with multiple external data sources, keep an eye out for duplicate event records.
d.    Consistency. Be consistent in how you analyze data points.

NOTE: Collecting and right-sizing external data is useful for comparing your internal loss events to external loss events, understanding what a worst-case loss could look like for your company, and possibly incorporating into a data set to be used for modeling purposes.

***

In this post, I want to focus on right-sizing the data points in your data set commensurate with the size of your company.

1.     Determine the minimum value where right-sizing is not worth the effort. When faced with hundreds or even thousands of data points – there is going to be a “magic number” of records below which the number of records lost does not warrant right-sizing. The more you understand your business processes and partner with the appropriate stakeholders (business partners, marketing, legal, privacy, etc…), the easier this determination should be. They *should* be the subject matter expert(s) on these matters and be leveraged whenever possible.

TIME-OUT. For the uber-privacy / legal folks out there, the loss of even one record is not desirable. But we have to be reasonable and acknowledge that there are thresholds where the “squirm factor” varies.

2.    Understand the type of data lost / compromised. This is really easy to overlook. Some data loss events involve just customer (consumer) data, other events may include just employee data, and some events may include both types of data. Understanding the type of data lost could prove useful in determining which right-sizing method to use.

3.    Right-sizing factors (proportions). This is where things get interesting. It is also where objectivity and consistency have to be demonstrated. Whether we are performing risk assessments, right-sizing data points, or collecting information to draw conclusions from – it is important that we are as objective as possible, reducing subjectivity wherever we can. The key point I want to make here is that, used appropriately and consistently, a right-sizing exercise is more objective in nature than stating “it happened to company X so it could happen to us” without any analysis whatsoever. You may want to make a brief note as to why you chose a certain right-sizing factor in case you need a reminder at a later point. Let’s look at a few right-sizing options (keep in mind that we are building upon what we covered in the first post):

a.    # of Employees. If a data point is from a company in the same industry or with the same value proposition – using the number of employees could be a good right-sizing factor. Some inferences can be made around the number of employees. Is it unreasonable to think that if a company half your size, obligated to protect data just as well as your company, loses 10,000 records, the same event could happen to your company with 20,000 records? (A simple formula sketch follows the note below.) Using the number of employees is also useful when the data point only involves confidential employee data.

b.    Revenue. Revenue could be another right-sizing factor. Maybe the data point is from a different industry where staffing types / levels differ from your industry but the value proposition is related (property & casualty insurance versus health insurance).

c.    Equity. In some cases, equity can be used as a right-sizing factor. The # of employees or revenue may not be appropriate, or the proportions could be unrealistic. Equity could be a third option.

d.    Others. There could be other right-sizing factors depending on your industry or the problem you need to make decisions for. Just make sure that whatever that factor is, it is generally regarded as a sound comparative measurement factor, and document any assumptions. Again, I cannot underscore enough the need for objectivity and consistency.

NOTE: Keep an eye out for data points that have been right-sized but whose calculated value far exceeds the total number of records you have in your organization. Consider another right-sizing factor or cap the value at the maximum number of records you have; and of course, document. Some situations may warrant keeping the larger value (like a risk assessment related to a merger or acquisition where the number of records you are obligated to protect doubles, triples, etc…). What you don’t want is someone calling you out on a data point that is not even realistic for your organization because of a simple oversight (I speak from experience).
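As a simple illustration of the employee-count approach in 3a, combined with the cap just mentioned, the Excel formula below scales an external record count by the ratio of employee counts. The cell references are hypothetical: B2 holds the external company’s records lost, B3 its number of employees, B4 your number of employees, and B5 the maximum number of records your organization holds:

=MIN(ROUND(B2*(B4/B3),0),B5)

The same pattern works for revenue or equity as the right-sizing factor – just swap the ratio.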

4.    Sources of Information. In order to right-size data points – especially in the context of the factors above (employees, revenue, equity) – you have to get information about the companies behind the external data points. I would submit that the information you need is available in most cases – it is just a matter of time and creativity.

a.    Internal information. You need to collect your own internal information for the right-sizing factors before you can right-size against the external data point’s company information. HR may be able to give you the # of employees by year, and hopefully there are numerous internal authoritative resources that have your revenue and equity values (if you are publicly traded – this information is publicly available; though you should still confer / validate with internal sources).

b.    External information. Be creative. Yahoo business, Dun & Bradstreet, company name Google searches combined with “annual report” or “corporate filings”, company websites and Fortune 100, 500 or 1000 lists; these are starting points. Just remember to make sure you are right-sizing in the context of the same year the incident occurred in – and be consistent.

In the next post for this series, we will look at analyzing a right-sized data set to begin collecting information. For example, does the data resemble a statistical model? Does it resemble your internal data points? What if my data set has too few data points?