Next up in my series on distributions, I'd like to talk about the Uniform distribution. Uniform is quickly becoming my favorite distribution because it is really easy to understand and it helps us avoid a common mistake in risk modeling: we tend to underestimate the likelihood of the extremes. If you remember back to my previous post on the Normal distribution, you will recall that once you get three standard deviations away from the mean you are getting into frequencies that are pretty low. In fact, in a perfect normal distribution it would take 10,000 loss events to find about 13 or 14 losses that were more than three standard deviations above the mean (roughly 27 if you count both tails).
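If you want to check that three-standard-deviation figure yourself, here is a minimal sketch using only Python's standard library. The function name `normal_upper_tail` and the variable names are mine, chosen for illustration; the 10,000 event count comes from the paragraph above.

```python
import math


def normal_upper_tail(k: float) -> float:
    """Probability that a standard normal value exceeds k standard deviations."""
    return 0.5 * math.erfc(k / math.sqrt(2))


events = 10_000

# Expected number of losses above mean + 3 standard deviations (one tail).
one_sided = events * normal_upper_tail(3)

# Expected number of losses more than 3 standard deviations from the mean
# in either direction (both tails).
two_sided = 2 * one_sided

print(f"one tail:   {one_sided:.1f} of {events}")
print(f"both tails: {two_sided:.1f} of {events}")
```

Either way you count it, these are rare events under a normal distribution.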
The Uniform distribution makes a great "safety" distribution. Maybe you're pretty confident that some random variable can be represented by a normal distribution, and if that is the case then use it. That's what it's there for. But what if your random variable is bimodal, meaning that it has two peaks rather than a single most likely value? What if you think that values at the far ends of the distribution occur more often than the Normal distribution allows? The Uniform distribution never lets me underestimate my tails (unless my random variable is a strange U-shaped phenomenon) and it never makes me ignore one mode in favor of another.
WHEN TO USE IT: Use the uniform distribution when you have a good idea about the upper and lower bounds of your variable, but you are uncertain about the shape.
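To see how much difference the choice of shape makes in the tail, here is a rough sketch comparing the two distributions over the same bounds. Everything numeric here is an assumption for illustration: a hypothetical loss bounded between 0 and 100, a normal distribution centered on the midpoint with the bounds placed three standard deviations out, and a threshold of 90 standing in for "an extreme loss."

```python
import math


def uniform_upper_tail(x: float, low: float, high: float) -> float:
    """P(X > x) when X is uniform on [low, high]."""
    if x <= low:
        return 1.0
    if x >= high:
        return 0.0
    return (high - x) / (high - low)


def normal_upper_tail(x: float, mean: float, sd: float) -> float:
    """P(X > x) when X is normal with the given mean and standard deviation."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))


# Hypothetical bounds for a loss estimate (assumed, not from real data).
low, high = 0.0, 100.0

# Assumption: the normal alternative treats the bounds as mean +/- 3 sd.
mean = (low + high) / 2
sd = (high - low) / 6

threshold = 90.0  # assumed cutoff for an "extreme" loss
p_uniform = uniform_upper_tail(threshold, low, high)
p_normal = normal_upper_tail(threshold, mean, sd)

print(f"P(loss > {threshold:.0f}) under uniform: {p_uniform:.4f}")
print(f"P(loss > {threshold:.0f}) under normal:  {p_normal:.4f}")
```

Under these assumptions the uniform distribution gives the top 10% of the range a full 10% of the probability, while the normal gives it less than 1%. That gap is the "safety" margin.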
WHAT MAKES IT COOL: It's easy to understand. You can explain this distribution to even the most statistically challenged of executives. And thanks to the Central Limit Theorem, when you add together several independent uniformly distributed variables the total starts to look like a normal distribution, which will satisfy the people who want nice graphs and tighter estimates of the most likely value.
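The Central Limit Theorem point is easy to try for yourself. The sketch below is a small simulation, not a proof: it adds up 12 independent uniform values between 0 and 1, repeats that many times, and checks how normal the totals look. The number of terms, the number of trials, and the seed are arbitrary choices of mine.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

N_TERMS = 12       # uniform variables added together in each trial (assumed)
N_TRIALS = 20_000  # number of simulated totals (assumed)

# Each total is the sum of N_TERMS independent uniform(0, 1) draws.
totals = [
    sum(random.uniform(0, 1) for _ in range(N_TERMS))
    for _ in range(N_TRIALS)
]

mean = statistics.mean(totals)
sd = statistics.pstdev(totals)

# For a normal distribution, about 68% of values fall within one
# standard deviation of the mean.
within_one_sd = sum(abs(t - mean) <= sd for t in totals) / N_TRIALS

print(f"mean of totals: {mean:.2f} (theory: {N_TERMS / 2})")
print(f"sd of totals:   {sd:.2f} (theory: {(N_TERMS / 12) ** 0.5:.2f})")
print(f"share within one sd: {within_one_sd:.1%}")
```

Each individual input is flat, but the totals pile up around the middle, with roughly two thirds of them landing within one standard deviation of the mean, just as a normal distribution would.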
WHEN TO AVOID IT: You should absolutely avoid this distribution if you have any evidence that the variable you're representing is U-shaped; in other words, that values at the extremes are more likely than values in the middle. Other than that, this is a great distribution if you want to be open to possibilities and are willing to admit that you don't know a whole lot.
EXTRA CREDIT: U-shaped distributions are most often seen where there is cyclic data. For example, if you were modeling the snowfall in a given month and your X axis started with January and ended with December, then you would likely see more snowfall on the two ends and less in the middle. Can you think of any variables in information security that might follow a U-shaped distribution?