Random variables

A random variable is a variable whose value cannot be predicted with certainty. However, not all random variables are free to take any value. For example, a variable may vary randomly between a minimum and a maximum boundary, with a "most likely" value somewhere in between. Share prices are an example of a bounded random variable, in that the price of a share can never be negative. Other variables may be unbounded (at least theoretically), with no upper or lower limit, but with extreme values having only an infinitesimal probability of occurrence.

Irrespective of the actual behaviour of a particular variable, the fact that it may have a range of possible values results in a distribution of possible values, referred to as a probability distribution. The purpose of statistical simulation is to describe the distribution and characteristics of a dependent variable, given the behaviour of an independent variable that determines the value of the dependent variable.
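The idea of simulating a dependent variable from the behaviour of an independent variable can be illustrated with a minimal sketch. The cost model, the figures and the function name `simulate_total_cost` are hypothetical and chosen only for illustration; the independent variable (a quantity) is assumed to vary between a minimum and a maximum with a "most likely" value in between.

```python
import random
import statistics


def simulate_total_cost(n_trials=10_000, seed=42):
    """Return simulated total costs for a hypothetical cost model.

    Assumed model (illustrative only):
        total cost = fixed cost + unit cost * quantity
    where quantity is the uncertain independent variable.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    fixed_cost = 100.0         # assumed value
    unit_cost = 2.5            # assumed value
    totals = []
    for _ in range(n_trials):
        # Independent variable: minimum 80, maximum 150, most likely 100.
        # random.triangular takes (low, high, mode).
        quantity = rng.triangular(80.0, 150.0, 100.0)
        # Dependent variable: determined by the sampled quantity.
        totals.append(fixed_cost + unit_cost * quantity)
    return totals


totals = simulate_total_cost()
print(f"mean  = {statistics.mean(totals):.1f}")
print(f"stdev = {statistics.stdev(totals):.1f}")
print(f"min   = {min(totals):.1f}, max = {max(totals):.1f}")
```

The list of simulated totals is itself an approximation of the probability distribution of the dependent variable; its mean, spread and extremes can be read directly from the sample.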

Probability distributions used in risk analysis fall into three basic categories:

■ Discrete distribution. The variable can take only distinct, countable values, such as the number of pumps in use in a water filtration system.

■ Continuous distribution. The variable can take any value within a defined range. A normal distribution, whose lower and upper bounds are negative and positive infinity, is a continuous, unbounded distribution. A bounded distribution lies between a defined upper and lower boundary.

■ Parametric and non-parametric distributions. An exponential distribution is an example of a parametric distribution, used mostly in models of time-dependent events, such as the lifetime of a device with a constant failure rate. A triangular distribution is an example of a non-parametric distribution and is characterised by a minimum, a most likely and a maximum value.
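The categories above can be sketched by drawing samples from one distribution of each kind with Python's standard `random` module. All values, weights and parameters below are assumptions made for illustration; none comes from the text.

```python
import random
import statistics

rng = random.Random(0)  # seeded so the run is repeatable
n = 20_000

# Discrete: number of pumps in use (values and weights are hypothetical).
pumps = rng.choices([1, 2, 3, 4], weights=[0.1, 0.3, 0.4, 0.2], k=n)

# Continuous and unbounded: normal with assumed mean 100, std deviation 15.
normal = [rng.gauss(100.0, 15.0) for _ in range(n)]

# Parametric: exponential device lifetime with an assumed mean of 5 years.
# expovariate takes the rate, which is 1 / mean.
lifetimes = [rng.expovariate(1 / 5.0) for _ in range(n)]

# Non-parametric and bounded: triangular with assumed minimum 80,
# maximum 150 and most likely value 100. Arguments are (low, high, mode).
costs = [rng.triangular(80.0, 150.0, 100.0) for _ in range(n)]

print("pump counts observed:", sorted(set(pumps)))
print(f"normal      mean = {statistics.mean(normal):.1f}")
print(f"exponential mean = {statistics.mean(lifetimes):.2f}")
print(f"triangular  min = {min(costs):.1f}, max = {max(costs):.1f}")
```

The discrete sample contains only the listed values, the triangular sample never leaves its bounds, and the normal and exponential samples have no fixed upper limit.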

The normal distribution, which is represented by the familiar "bell shaped" curve, is the most commonly used distribution in risk analysis. The normal distribution assumes that the expected or mean value of x has the highest probability of occurrence, with the probability of any deviation distributed symmetrically around the mean. The key statistic in the normal distribution is the standard deviation (σ), which is the measure of dispersion around the mean or expected value. Where a normal distribution is evident, the standard deviation indicates the probability that a random value drawn from the population lies within a given distance of the mean:

+/- 1σ of the mean = approximately 68% of the probability

+/- 2σ of the mean = approximately 95% of the probability

+/- 3σ of the mean = approximately 99.7% of the probability
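These percentages can be checked from the normal distribution itself. For a normal variable, the probability of falling within k standard deviations of the mean equals erf(k / √2), whatever the mean and standard deviation. The sketch below uses only the standard `math` module; the function name `prob_within` is illustrative.

```python
import math


def prob_within(k):
    """Probability that a normal variable lies within k standard
    deviations of its mean: P(|X - mean| <= k * sigma) = erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within +/- {k} sigma: {prob_within(k):.4f}")
```

The results round to 68%, 95% and 99.7%, which is why the three bands are usually quoted in that form.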

A measure related to the standard deviation is the percentile. The nth percentile of a variable is the value for which there is an n% chance that the variable is equal to or lower than that value.
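As a minimal sketch of this definition, the 90th percentile of a normal variable can be computed exactly and compared with an estimate from simulated samples. The mean of 100 and standard deviation of 15 are assumed purely for illustration; `NormalDist` and `quantiles` are part of Python's standard `statistics` module.

```python
from statistics import NormalDist, quantiles

# Hypothetical normal variable: mean 100, standard deviation 15.
dist = NormalDist(mu=100.0, sigma=15.0)

# Exact 90th percentile: the value with a 90% chance of not being exceeded.
p90_exact = dist.inv_cdf(0.90)

# Check against the definition: P(X <= p90) should be 0.90.
check = dist.cdf(p90_exact)

# Estimate of the same percentile from 10,000 simulated values.
samples = dist.samples(10_000, seed=1)
# quantiles(..., n=100) returns the 99 cut points for the 1st to 99th
# percentiles, so index 89 is the 90th percentile.
p90_sample = quantiles(samples, n=100)[89]

print(f"exact  P90 = {p90_exact:.2f}")
print(f"sample P90 = {p90_sample:.2f}")
print(f"P(X <= exact P90) = {check:.2f}")
```

The sample estimate approaches the exact value as the number of simulated values increases.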

While the normal distribution is a useful measure of risk where the variables are random and continuous (for example, securities markets), it is unlikely that many of the costs in the PSC will exhibit the symmetry implied by the normal distribution. Cost variables may nevertheless be continuous within a bounded distribution that approximates a normal distribution, as discussed below.