The AICPCU ‘Associate in Risk Management’ (ARM)

September 14, 2012

A year or so ago I stumbled upon the ARM designation, which is administered by the AICPCU, or ‘the Institutes’ for short. What attracted me to the designation was that it appeared to be a comprehensive approach to performing a risk assessment for scenarios that result in some form of business liability. Unfortunately, I did not start pursuing the designation until July 2012. The base designation consists of passing three tests on the topics of ‘risk assessment’, ‘risk control’ and ‘risk financing’. In addition, there are a few other tests that allow one to extend the designation to include disciplines such as ‘risk management for public entities’ and ‘enterprise risk management’.

I am about two months into my ARM journey and just passed the ARM-54 ‘Risk Assessment’ test. I wanted to share some perspective on the curriculum itself and a few differentiators compared to other ‘risk assessment’ and ‘risk analysis / risk measurement’ frameworks.

1. Proven Approach. Insurance and risk management practices have been around for centuries. Insurance carriers, especially those that write commercial insurance products, are very skilled at identifying and understanding the various loss exposures businesses face. Within the information risk management and operational risk management space, many of the loss exposures we care about and look for are the same ones insurance carriers look for when they assess a business for business risk and hazard risk in order to write a business insurance policy. In other words, the ‘so what’ associated with the bad things both we and insurance carriers care about is essentially a business liability that we want to manage. Our problem space, skills and risk treatment options may be slightly different, but the goal of our efforts is the same: risk management.

2. Comprehensive. The ARM-54 course alone covers an enormous amount of information. The material easily encompasses the high-level learning objectives of six college undergraduate courses I have taken in the last few years:

– Insurance and Risk Management
– Commercial Insurance
– Statistics
– Business Law
– Calculus (Business / Finance Problem Analysis / Calculations)
– Business Finance

The test for ARM-54 was no walk in the park. Even though I passed on the first attempt, I short-changed myself on some of the objectives, which caused a little bit of panic on my part. The questions were well written, and quite a few of them forced you to understand the problem context in order to choose the best answer.

3. ‘Risk Management Value Chain’. The following thoughts are the largest selling points of this designation compared to other IT risk assessment frameworks, IT risk analysis frameworks and IT risk certifications / designations. The ARM curriculum connects the dots between risk assessment activities, risk management decisions and the financial implications of those decisions at various levels of abstraction. This is where existing IT-centric risk assessment / analysis frameworks fall short – they are either too narrow in focus, do not incorporate business context, are not practical to execute or, in some cases, not useful at all in helping someone or a business manage risk.

4. Cost Effective. For between $300 and $500 per ARM course, one can get some amazing reference material and cover the exam fee. Compare that to the cost of six university courses ($6K–$9K) or the cost of one formal risk measurement course (~$1K). I am convinced that any risk management professional can begin applying concepts from the ARM-54 text within hours of being introduced to it. So the cost of the textbooks alone (~$100, give or take) is justified even if you do not take the test(s).

5. Learn How To Fish. Finally, I think it is worth noting that there is nothing proprietary about the objectives and concepts presented in the ARM-54 ‘Risk Assessment’ curriculum. Any statistical probability calculations or mathematical finance problems are exactly that – good ole math and probability calculations. In addition, there is nothing proprietary about the methods or definitions presented as they relate to risk assessments or risk management proper. This is an important selling point to me because there are many information risk management practitioners begging for curricula or training such as the ARM, where they can begin applying what they are learning without being dependent on proprietary tools, proprietary calculations or a paid license to use a proprietary framework.

In closing, the ARM-54 curriculum is a very comprehensive risk management curriculum that establishes business context, introduces proven risk assessment methods, and reinforces sound risk management principles. In my opinion, it is very practical for the information / operational risk management professional – especially those who are new to risk management or looking for an approach without an IT or security bias – regardless of the industry you work in.

So there you have it. I am really psyched about this designation and the benefits I am already realizing in my job as a Sr. Risk Advisor for a Fortune 200 financial services firm. I wish I had pursued this designation two years ago, but I am optimistic that I will make up for lost time and deliver tangible business value very quickly.

Metricon 6 Wrap-Up

August 10, 2011

Metricon 6 was held in San Francisco, CA on August 9th, 2011. A few months ago, I and a few others were asked by the conference chair – Mr. Alex Hutton (@alexhutton) – to assist in the planning and organization of the conference. One of the goals established early on was that this Metricon needed to be different from previous Metricon events. Having attended Metricon 5, I witnessed firsthand the inquisitive and skeptical nature of the conference attendees towards speakers and towards each other. So, one of our goals for Metricon 6 was to change the culture of the conference. In my opinion, we succeeded in doing that by establishing topics that would draw new speakers and strike a happy balance between metrics, security and information risk management.

Following are a few Metricon 6 after-thoughts…

Venue: This was my first non-military trip to San Francisco. I loved the city! The vibe was awesome! The sheer number of people made for great people-watching entertainment and so many countries / cultures were represented everywhere I went. It gave a whole new meaning to America being a melting pot of the world.

Speakers: We had some great speakers at Metricon. Every speaker did well, the audience was engaged, and while questions were limited due to time, the speakers took some tough questions and handled them well.

Full list of speakers and presentations…

Favorite Sessions: Three of the 11 sessions stood out to me:

Jake Kouns – Cyber Insurance. I enjoyed this talk for a few reasons: a. it is an area of interest of mine and b. the talk was easy to understand. I would characterize it as an overview of what cyber insurance is [should be] as well as some of the nuances. Keep in mind it was an overview – commercial insurance policies can be very complex, especially for large organizations. Some organizations do not buy separate “cyber insurance” policies but utilize their existing policies to cover potential claims / liability arising from operational information technology failures or other scenarios. Overall, Jake is offering a unique product, and while I would like to know more details, he appears to be well positioned in the cyber insurance product space.

Allison Miller / Itai Zukerman – Operationalizing Analytics. Alli and Itai went from 0 to 60 in about 5 seconds. They presented some work that brought together data collection, modeling and analysis – in less than 30 minutes. Itai was asked a question about the underlying analytical engine used – and he just nonchalantly replied ‘I wrote it in Java myself’ – like it was no big deal. That was hot.

Richard Lippman – Metrics for Continuous Network Monitoring. Richard gave us a glimpse of a real-time monitoring application; specifically, tracking un-trusted devices on protected subnets. The demo was very impressive and probably gave a few in the room some ‘metrigasms’ (I heard this phrase from @mrmeritology).

People: All the attendees and speakers were cordial and professional. By the end of the day, the sense of community was stronger than what we started with. A few quick shout-outs:

Behind-the-scenes contributors / organizers. The Usenix staff helped us out a lot over the last few months. We also had some help from Richard Baker who performed some site reconnaissance in an effort to determine video recording / video streaming capabilities – thank you sir. There were a few others that helped in selecting conference topics – you know who you are – thank you!

@chort0 and his lovely fiancée Meredith. They pointed some of us to some great establishments around Union Square. Good luck to the two of you as you go on this journey together.

@joshcorman. I had some great discussions with Josh. While we have only known each other for a few months, he has challenged me to think about questions [scenarios] that no one else is addressing.

+Wendy Nather. Consummate professional. Wendy and I have known of each other for a few years but never met in person prior to Metricon 6. We had some great conversation, both professional and personal. She values human relationships, and that is more important in my book than just the social networking aspect.

@alexhutton & @jayjacobs – yep – it rocked. Next… ?

All the attendees. Without attendance, there is no Metricon. The information sharing, hallway collaboration and presentation questions contributed greatly to the event. Thank you!

***

So there you go everyone! It was a great event! Keep your eyes and ears open for information about the next Metricon. Consider reanalyzing your favorite conferences and if you are looking for small, intimate and stimulating conferences – filled with thought leadership and progressive mindsets – give Metricon a chance!


Deconstructing Some HITECH Hype

February 23, 2011

A few days ago I began analyzing some model output and noticed that the amount of loss exposure for ISO 27002 section “Communications and Operations Management” had increased by 600% in a five-week time frame. It only took a few seconds to zero in on an issue that was responsible for the increase.

The issue was related to a gap with a third party that carried some Health Information Technology for Economic and Clinical Health (HITECH) Act fine exposure. The estimated HITECH fines were really LARGE. Large in the sense that the estimates:

a.    Did not pass the sniff test.
b.    Could not be justified based on any documented fines or statutes.
c.    From a simulation perspective, completely skewed the average simulated expected loss value for the scenario itself.

I reached out to better understand the rationale of the practitioner who performed the analysis, and after some discussion we agreed that additional analysis was warranted to accurately represent assumptions and to refine loss magnitude estimates – especially for the potential HITECH fines. About 10 minutes of additional information gathering yielded valuable information.

In a nutshell, the HITECH penalty structure is a tiered system that takes into consideration the nature of the data breach, the fine per violation and maximum amounts of fines for a given year. See below (the tier summary is from link # 2 at the bottom of this post; supported by links # 1 and 3):

Tier A is for violations in which the offender didn’t realize he or she violated the Act and would have handled the matter differently if he or she had. This results in a $100 fine for each violation, and the total imposed for such violations cannot exceed $25,000 for the calendar year.

Tier B is for violations due to reasonable cause, but not “willful neglect.” The result is a $1,000 fine for each violation, and the fines cannot exceed $100,000 for the calendar year.

Tier C is for violations due to willful neglect that the organization ultimately corrected. The result is a $10,000 fine for each violation, and the fines cannot exceed $250,000 for the calendar year.

Tier D is for violations of willful neglect that the organization did not correct. The result is a $50,000 fine for each violation, and the fines cannot exceed $1,500,000 for the calendar year.

Given this information and the nature of the control gap, one can quickly determine the penalty tier as well as estimate fine amounts. The opportunity cost of gathering this additional information was minimal, and the additional analysis will not only result in a more accurate and defensible analysis but also spare the risk practitioner from what would have been certain scrutiny from other IT risk leaders and possibly business partner allegations of IT Risk Management once again “crying wolf”.
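For illustration, here is a minimal Python sketch of the fine estimation logic described above – per-violation fine times violation count, capped at the calendar-year maximum. The tier amounts come from the summary above; the violation count is a made-up example, not a figure from the actual analysis.

HITECH_TIERS = {
    "A": {"per_violation": 100, "annual_cap": 25_000},
    "B": {"per_violation": 1_000, "annual_cap": 100_000},
    "C": {"per_violation": 10_000, "annual_cap": 250_000},
    "D": {"per_violation": 50_000, "annual_cap": 1_500_000},
}

def estimated_fine(tier: str, violations: int) -> int:
    # Per-violation fine times violation count, capped at the calendar-year maximum for the tier.
    t = HITECH_TIERS[tier]
    return min(violations * t["per_violation"], t["annual_cap"])

# Example: 30 hypothetical Tier B violations -> $30,000, well under the $100,000 annual cap.
print(estimated_fine("B", 30))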

Key Take-Away(s)

1.    Perform sniff tests on your analysis; have others review your analysis.
2.    There is probably more information than you realize about the problem space you are dealing with.
3.    Be able to defend assumptions and estimates that you make.
4.    Become the “expert” on the problem space, not a repeater of information that may not be valid to begin with.

Links / References associated with this post:

1.    HIPAA Enforcement Rule ref. HITECH <- lots of legalese
2.    HITECH Summary <- less legalese
3.    HITECH Act; scroll down to section 13410 for fine information <- lots of legalese
4.    Actual instance of a HITECH-related fine
5.    Interesting Record Loss Threshold Observation; Is 500 records the magic number?


Risk Fu Fighting

January 31, 2011

If you are an information risk analyst or perform any type of IT risk analysis, you should really consider joining the Society of Information Risk Analysts mailing list. Over the last several weeks there have been some amazing exchanges of ideas, opinions, and spirited debate over the legitimacy and value of risk analysis. Some of the content is no doubt a precursor to the anxiously awaited “Risk Management Smackdown” at RSA on 2/15. Regardless of my role within SIRA or the upcoming RSA debate, the SIRA mailing list is a great resource for learning more about IT risk analysis and IT risk management in general.

Below are a couple of links:

Society of Information Risk Analysts (SIRA)
Society of Information Risk Analysts – Mailing List
Society of Information Risk Analysts – Mailing List Archives (must be subscribed to the mailing list to view)
RSA Session Catalog (filter on security tag “risk management”)


Simple Risk Model (Part 3 of 5): Simulate Loss Magnitude

December 22, 2010

Part 1 – Simulate Loss Frequency Method 1
Part 2 – Simulate Loss Frequency Method 2

In parts one and two of this series we looked at two methods for simulating loss frequency. Method one – while useful – has shortcomings, as it primarily requires working with expected loss frequency values less than 1 (once per year). In addition, with method one it was not possible to determine iterations where loss frequency could be greater than once per year.

Method two overcame these limitations. We leveraged the Poisson probability distribution (a discrete distribution) as well as an expected loss frequency value and a random value between 0 and 1 to return a loss event count (an integer) for any given iteration. Using this method, about 10% of our iterations resulted in loss events, and some of those iterations had multiple loss events. From my perspective method two is the more useful of the two – especially since it has the potential to account for low-probability situations where there could be numerous loss events for any simulation iteration.
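As a quick refresher on method two, here is a minimal Python sketch of the same idea – a Poisson inverse CDF driven by a random value between 0 and 1. The expected frequency of 0.1 events per year and the 1,000 iterations are example assumptions, not values from the earlier posts.

import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng()
expected_frequency = 0.1                                       # assumed expected loss events per year
u = rng.random(1000)                                           # a random value between 0 and 1 per iteration
event_counts = poisson.ppf(u, expected_frequency).astype(int)  # integer loss event count per iteration

print((event_counts > 0).mean())   # roughly 10% of iterations produce at least one loss event
print(event_counts.max())          # some iterations can produce multiple loss events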

The purpose of this post is to simulate loss magnitude. Conceptually, we are going to do what we did with loss frequency method two – but our distribution and input parameters will differ. To simulate loss magnitude we need four things:

1.    A continuous probability distribution.
2.    A random value between 0 and 1.
3.    An expected or average loss magnitude.
4.    A loss magnitude standard deviation.

Continuous Probability Distribution. Technically, if you have internal or external loss magnitude data, you would analyze that data and fit it to an appropriate continuous probability distribution. There are dozens of such distributions. Often, though, we have limited data and need to make good faith (or “educated”) assumptions about the shape of our loss magnitude curve. Loss magnitude curves for many IT risk scenarios are often assumed to be normal or lognormal in nature. Normal is often assumed, but it has its limitations, since there can be negative values and rarely is there a “perfect” normal loss magnitude curve for IT risk scenarios. However, most “normal-like” distributions converge to normal as the number of data points increases. Thus, for the purposes of demonstration I am going to use the normal distribution.

Random Value Between 0 and 1. Because we are dealing with uncertainty and a distribution, we will use random values between 0 and 1 in our probability distribution; think Excel function RAND().

Expected or Average Loss Magnitude. Statistics 101 – If you take the sum of X values and divide by X you get the average. Quantitative risk analysis methodologies like FAIR can facilitate deriving an average loss magnitude estimate. Or maybe, you have actual loss magnitude data points. How you derive average loss magnitude is not the focus of this post – just remember that to use the normal distribution you need that average loss magnitude value.

Loss Magnitude Standard Deviation. More Statistics 101. At a high level, standard deviation is a statistic or measure of how spread out our data points are relative to the mean. The larger the number, the flatter or wider our distribution (think bell curve) will be; the smaller the number, the narrower the bell curve will be. In the interest of brevity, it is assumed that either you can use existing Excel functions to calculate a standard deviation from your loss magnitude data points, or your risk analysis tool sets will provide this value for you. In some cases you may not have actual data sets from which to calculate a standard deviation, let alone an average magnitude value – in those cases we have to make our best estimates and document assumptions accordingly.
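If you do have raw loss data points and prefer code to a spreadsheet, the average and sample standard deviation are one-liners; the dollar amounts below are made-up placeholders, standing in for Excel's AVERAGE and STDEV functions.

import statistics

loss_data = [3200, 4800, 5100, 6900, 4400, 5600]   # hypothetical historical loss amounts
print(statistics.mean(loss_data))                  # average loss magnitude
print(statistics.stdev(loss_data))                 # sample standard deviation of loss magnitude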

How do these work together? In layman’s terms: given a normal loss distribution with an average loss magnitude of $5000 and a standard deviation of $1000, what is the loss value (inverse cumulative value) at any point in the distribution, given a random probability value?
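To make that question concrete outside of Excel, here is a minimal Python sketch of the inverse cumulative lookup using the $5000 mean and $1000 standard deviation from this example; scipy's norm.ppf plays the role of Excel's NORMINV.

from scipy.stats import norm

mean_loss = 5000
std_dev = 1000

print(norm.ppf(0.50, loc=mean_loss, scale=std_dev))   # 5000.0  -> the average loss
print(norm.ppf(0.95, loc=mean_loss, scale=std_dev))   # ~6644.9 -> a plausible "bad year" loss
print(norm.ppf(0.05, loc=mean_loss, scale=std_dev))   # ~3355.1 -> a better-than-average outcome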

You may want to download this Excel spreadsheet to reference for the rest of the post (it should work in Excel 2003, Excel 2007 and Excel 2010; I have not tested it on Office for Mac). Reference tab “magnitude” and make sure you view it in Excel and NOT in Google Apps.


a.    The average loss magnitude amount is $5000 (cell B1; tab “magnitude”)

b.    The loss magnitude standard deviation is $1000 (cell B2; tab “magnitude”)

c.    For purposes of demonstration, we will number some cells to reflect the number of iterations (A9:A1008; A9=1; A10=A9+1; drag A10 down until you get to 1000).

d.    In Excel, we would use the =RAND() function to generate the random values in cells B9:B1008.

e.    Now, in column C beginning in cell C9, we are going to combine a normal probability distribution with our average loss ($B$1), standard deviation ($B$2) and the random value(s) in column B to return a loss value. In other words, given a normal distribution with mean $5000 and standard deviation $1000, what is the value of that distribution given a random value between 0 and 1 – rounded to the nearest 10? You would type the following in C9 and then drag C9 down to C1008:
=ROUND(MAX(NORMINV(B9,$B$1,$B$2),0),-1)

Let’s dissect this formula.

i.    ROUND. I am going to round the output of this formula to the nearest 10; annotated by the -1.
ii.    MAX. Because we are using the normal distribution, some values could be less than zero, which is not applicable for most IT scenarios, so we compare the value generated by the NORMINV function to 0. Whichever is larger is the value that then gets rounded to the nearest 10.
iii.    NORMINV. This is the function built into Excel that returns an inverse cumulative value of a normal distribution given a probability, a mean and a standard deviation.

f.    Once you have values in all the cells – hit F9 a few times.

g.    Cell B3 gives the minimum loss value from cells C9 through C1008. The random value associated with the minimum value is probably less than 0.00xxxx.

h.    Cell B4 gives the maximum loss value from cells C9 through C1008. The random value associated with the maximum value is probably greater than 0.99xxxx.

i.    The histogram shows the count of iterations whose loss magnitude values fall within each loss magnitude bin. If you drew a line around the tops of the columns, it would resemble a bell curve. We expect this since we are using the normal distribution.

j.    Press the F9 key; new random values will be generated. Every time you press F9 think of it as a new simulation with 1000 iterations. Press F9 lots of times and you will notice that the histogram changes as well. While individual bin counts will change – the general shape of the histogram does not.

k.    By the way, if you change the average loss magnitude value in cell B1, the histogram will probably break. But you can change the value in B2 to 500, hit F9 a few times and observe how the bell-curve shape becomes narrower. Or, change B2 to 2000 and you will see a much flatter bell curve.
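For readers who would rather script this than work the spreadsheet, the following Python sketch mirrors the “magnitude” tab: 1,000 random values pushed through the inverse normal CDF, floored at zero, rounded to the nearest 10 and summarized with a crude text histogram. It is an illustrative translation of the Excel formula above, not the spreadsheet itself.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng()
mean_loss, std_dev, iterations = 5000, 1000, 1000

u = rng.random(iterations)                          # RAND() for each iteration
losses = norm.ppf(u, loc=mean_loss, scale=std_dev)  # NORMINV(RAND(), $B$1, $B$2)
losses = np.maximum(losses, 0)                      # MAX(..., 0): no negative losses
losses = np.round(losses, -1)                       # ROUND(..., -1): nearest 10

print(losses.min(), losses.max())                   # analogous to cells B3 and B4

# Crude text histogram; re-running the script is the equivalent of pressing F9.
counts, edges = np.histogram(losses, bins=10)
for count, left, right in zip(counts, edges[:-1], edges[1:]):
    print(f"{left:7.0f} - {right:7.0f}: {'#' * int(count // 10)}")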

KEY TAKE-AWAY(S)

1.    As we did with simulating loss frequency, we leverage randomness to simulate loss magnitude.

2.    While we typically talk about an average loss magnitude value; losses can range in terms of magnitude. Being able to work within a range of loss values gives us a more complete view of our loss potential.

In part four of the series, we will combine loss frequency and loss magnitude into one simulation. For every iteration, we will randomly derive a loss frequency value (an integer) and a loss magnitude value. We will then calculate an expected loss, which is the product of the loss frequency and loss magnitude values. Perform this cycle thousands or millions of times and you have an expected loss distribution.
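As a rough preview of that combination – a sketch only, using the same example magnitude parameters as above and an assumed expected frequency of 0.5 events per year; part four will walk through the actual spreadsheet – the cycle could look like this:

import numpy as np
from scipy.stats import norm, poisson

rng = np.random.default_rng()
iterations = 10_000
expected_frequency = 0.5           # assumed expected loss events per year
mean_loss, std_dev = 5000, 1000    # example magnitude parameters from this post

freq = poisson.ppf(rng.random(iterations), expected_frequency).astype(int)
mag = np.maximum(norm.ppf(rng.random(iterations), loc=mean_loss, scale=std_dev), 0)
loss = freq * mag                  # per-iteration loss; zero whenever no events occur

print(loss.mean())                 # average simulated annual loss
print(np.percentile(loss, 99))     # a peek at the tail of the expected loss distribution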


More Heat Map Love

May 11, 2010

In my previous post “Heat Map Love” I attempted to illustrate the relationship between plots on a heat map and a loss distribution. In this post I am going to illustrate another method to show the relationship – hopefully in simpler terms.

In the heat map above I have plotted five example risk issues:

I: Application security; cross-site scripting; external facing application(s); actual loss events between 2-10 times a year; low magnitude per event – less than $10,000.

II: Data confidentiality; lost / stolen mobile device or computer; no hard disk encryption; one simulated or actual loss event per year; low to moderate magnitude per event.

III: PCI-DSS Compliance; level 2 Visa merchant; not compliant with numerous PCI-DSS standards; merchant self-reports not being in compliance this year; merchant expects monthly fines of $5,000 for a one-year total of $60,000.

IV: Malware outbreak; large malware outbreak (greater than 10% of your protected endpoints); less than once every ten years; magnitude per event could range between $100,000 and $1,000,000; productivity hit, external consulting, etc.

V: Availability; loss of data center; very low frequency; very large magnitude per event.

Since there is a frequency and magnitude of loss associated with each of these issues, we can conceptually associate these issues with a loss distribution (assuming that our loss distribution is normal-like or lognormal).

Step 1: Hold a piece of paper with the heat map looking like the image below:

Step 2: Flip the paper towards you so the heat map looks like image below (flip vertical):


Step 3: Rotate the paper counter-clockwise 90 degrees; it should look like the image below.


For ease of illustration; let’s overlay a log normal distribution.

What we see is in line with what we discussed in the “Heat Map Love” post:

Risk V – Loss of data center; is driving the tail; very low frequency; very large magnitude.
Risk IV – Malware outbreak; low frequency; but significant or high magnitude.
Risk III – Annual PCI fines from Visa via acquirer / processor; once per year; $60K.
Risk II – Lost or stolen laptop that had confidential information on it; response and notification costs not expected to be significant.
Risk I – Lots of small application security issues; for example cross-site scripting; numerous detected and reported instances per year; low cost per event.

There you have it – a less technical way to perform a sniff test on your heat map plots and / or validate against a loss distribution.

Once you have taught everyone how to perform this paper rotation trick, you can have a paper airplane flying contest.


Heat Map Love

May 6, 2010

First, I would like to welcome Jack Jones *back* to the world of risk blogging. Jack blogged a few weeks ago on the subject of heat maps (“Lipstick on Pigs” and “Lipstick Part II”), prompting a response by Jared Pfost of Third Defense. These are great posts that underscore the need to structure and leverage heat maps in an effective and defensible manner.

The purpose of this post is to share a recent “ah hah” moment involving heat maps and loss distributions. Whether or not you are an advocate of risk quantification or simulation modeling, it is hard to criticize someone for having tools or procedures in place that essentially serve as a “risk sniff test”. I consider reconciling portions of a loss distribution to a heat map a pretty useful sniff test.

QUESTION TO BE ANSWERED: How do I reconcile – or validate – the plotting of a heat map bubble with a loss distribution?

Well, it depends…but let’s establish some context.

•    5-by-5 heat map. The X axis of the heat map represents “Frequency of Loss”; the Y axis represents “Magnitude of Loss”. Each axis is broken into 5 sections.

•    Let’s say we have a heat map whose bubbles represent categories of risk issues (ISO 27002 categories, BASEL II OpRisk categories, etc.).

•    At a minimum, all of these issues have been assessed with some methodology and/or tool (I prefer FAIR) that allows us to associate a frequency and magnitude of loss with each and every issue.

•    We can perform thousands of simulation iterations for each and every issue in the risk repository, perform analysis, determine categories of risk that are contributing the most to various percentiles of the loss distribution, and then associate them with a heat map.

•    For the purpose of this post we are going to make a good faith assumption that our loss distribution resembles a log-normal or normal “like” distribution.

Back in my “Rainbow Risk” post I shared an example of a “rainbow chart”; a 100% stacked bar chart representing the contribution of loss that a category contributes (by percentage) to any given loss distribution percentile. For example, the rainbow chart on that post showed that the Business Continuity Mgmt category of risk issues accounted for about 55% of the risk in the 99th percentile. On a heat map, the most significant IT business continuity issues are probably going to be very low frequency, very high magnitude events. Thus, it is fairly intuitive that very low frequency / very high magnitude loss events would “drive” the tail of a given loss distribution.

In the images below, I have mapped areas on a heat map (image 1) to areas on a distribution (image 2). Specifically, I am trying to illustrate how the frequency and magnitude of any given issue factor into, and are most likely represented in, a loss distribution.

Image 1
Image 2

Area A – Very low frequency, very high magnitude risk issues. These types of events or risk issues drive the tail portion of a loss distribution.

Area B – Very low frequency, moderate or high magnitude risk issues; or low to moderate frequency, very high magnitude loss events. It can be said that these types of issues also drive the tail – but maybe not as far past the 99th percentile as issues associated with Area A.

Area C – Low to Moderate frequency, moderate or high magnitude.  These issues are best represented in the middle of the distribution; generally speaking, around one standard deviation on both sides of the mean.

Area D – Very frequent, moderate or high magnitude. Loss associated with these issues is not as severe as that of Areas A and B, but is typically greater than the mean expected loss.

Area E – Very frequent, very high magnitude. Generally speaking, these issues probably drive the portion of the distribution between 1 and 2.5 standard deviations (to the right of the mean).

Area F – Low or moderate frequency, low or moderate magnitude. These issues best factor into the area of the distribution left of the mean. Loss associated with these issues is less than the mean.

In closing, I will share at least one use case for performing this analysis or validation: key risk heat maps. If all of your issues have frequency and magnitude values as well as some other attributes associated with them, you can (a rough code sketch follows the list):

1.    Perform simulations on all of these issues.
2.    Calculate their contributions to various distribution percentiles.
3.    Analyze the results by various attributes (ISO 27002, BASEL II, IT Process, etc.).
4.    Chart derived information (categories of risk) on a heat map.
5.    Review for plausibility / accuracy (this should occur all the time).
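To make steps 1 and 2 concrete, here is a rough Python sketch that simulates a handful of issues, totals the loss per iteration, and estimates each category's share of loss in the iterations at or above the 99th percentile – the same kind of number the rainbow chart displays. The issue parameters (categories, frequencies, means, standard deviations) are made-up examples, not output from any real risk repository.

import numpy as np
from scipy.stats import norm, poisson

rng = np.random.default_rng()
iterations = 10_000

# (category, expected frequency per year, mean loss, loss std dev) -- hypothetical issues
issues = [
    ("Application Security", 6.0, 5_000, 1_000),
    ("Data Confidentiality", 1.0, 50_000, 20_000),
    ("Business Continuity", 0.02, 750_000, 250_000),
]

losses = {}
total = np.zeros(iterations)
for category, freq, mean, sd in issues:
    events = poisson.ppf(rng.random(iterations), freq).astype(int)
    magnitude = np.maximum(norm.ppf(rng.random(iterations), loc=mean, scale=sd), 0)
    losses[category] = events * magnitude
    total += losses[category]

p99 = np.percentile(total, 99)
tail = total >= p99                 # iterations in the tail of the loss distribution
for category, loss in losses.items():
    share = loss[tail].sum() / total[tail].sum()
    print(f"{category}: {share:.0%} of loss at/above the 99th percentile")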

I welcome any feedback!