Confidence interval

Quantitative Methods

A confidence interval (CI) is a measure of the reliability of an estimate. It is a type of interval estimate of a population parameter. It is an observed interval (i.e. it is calculated from the observations), in principle different from sample to sample, that frequently includes the parameter of interest if the experiment is repeated. How frequently the observed interval contains the parameter is determined by the confidence level or confidence coefficient. More specifically, the meaning of the term "confidence level" is that, if confidence intervals are constructed across many separate data analyses of repeated experiments, the proportion of such intervals that contain the true value of the parameter will match the confidence level; this is guaranteed by the reasoning underlying the construction of confidence intervals. Whereas two-sided confidence limits form a confidence interval, their one-sided counterparts are referred to as lower or upper confidence bounds.

Confidence intervals consist of a range of values that act as good estimates of the unknown population parameter. However, in infrequent cases, the interval may fail to cover the value of the parameter. The confidence level is the probability that the interval-construction procedure captures the true population parameter over the distribution of possible samples. It does not describe any single sample. This value is expressed as a percentage, so when we say, "we are 99% confident that the true value of the parameter is in our confidence interval", we mean that 99% of the confidence intervals observed across repeated samples will contain the true value of the parameter. After a sample is taken, the population parameter is either in the interval made or not; it is not a matter of chance. The desired level of confidence is set by the researcher. If a corresponding hypothesis test is performed, the confidence level is the complement of the respective level of significance, i.e. a 95% confidence interval reflects a significance level of 0.05. The confidence interval contains the parameter values that, when tested, should not be rejected with the same sample. Greater levels of variance yield wider confidence intervals, and hence less precise estimates of the parameter. A confidence interval for a difference parameter that does not contain 0 implies that there is a statistically significant difference between the populations.
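The repeated-experiment reading of the confidence level can be checked with a small simulation. The sketch below is illustrative only: it assumes a normal population with a known standard deviation (so a simple z-interval applies), and the function names and parameter values are hypothetical choices rather than anything prescribed above.

```python
import random
from statistics import NormalDist, mean


def z_interval(sample, sigma, level=0.95):
    """Two-sided CI for the mean, assuming a normal population with known sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # critical value for the chosen level
    half_width = z * sigma / len(sample) ** 0.5
    centre = mean(sample)
    return centre - half_width, centre + half_width


def coverage(true_mean=10.0, sigma=2.0, n=30, level=0.95, trials=5000, seed=0):
    """Fraction of intervals, over many repeated samples, that contain true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma, level)
        hits += low <= true_mean <= high  # each interval either covers or does not
    return hits / trials


if __name__ == "__main__":
    for level in (0.50, 0.95, 0.99):
        print(level, coverage(level=level))
```

Each simulated interval either contains the true mean or does not; only the long-run fraction across samples should come out close to the stated confidence level.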

In applied practice, confidence intervals are typically stated at the 95% confidence level. However, when presented graphically, confidence intervals can be shown at several confidence levels, for example 50%, 95% and 99%.

Several factors affect the width of a confidence interval, including sample size, confidence level, and population variability. A larger sample size will normally lead to a better estimate of the population parameter, and hence a narrower interval.
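How these three factors act on the interval can be seen in the simplest textbook case. The sketch below assumes a z-interval for a mean with known population standard deviation, where the width is 2 · z · σ / √n; the helper name ci_width is hypothetical, and other interval types scale differently.

```python
from statistics import NormalDist


def ci_width(sigma, n, level=0.95):
    """Width of a two-sided z-interval for a mean (known-sigma assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return 2 * z * sigma / n ** 0.5


if __name__ == "__main__":
    print(ci_width(sigma=2.0, n=25))               # baseline
    print(ci_width(sigma=2.0, n=100))              # larger sample: narrower
    print(ci_width(sigma=2.0, n=25, level=0.99))   # higher confidence: wider
    print(ci_width(sigma=4.0, n=25))               # more variability: wider
```

Under this assumption, quadrupling the sample size halves the width, while doubling the standard deviation doubles it.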

A confidence interval does not predict that the true value of the parameter has a particular probability of being in the confidence interval given the data actually obtained. Intervals with this property, called credible intervals, exist only in the paradigm of Bayesian statistics, as they require postulation of a prior distribution for the parameter of interest.

Shortfall

Expected shortfall (ES) is a risk measure, a concept used in finance (and more specifically in the field of financial risk measurement) to evaluate the market risk or credit risk of a portfolio. It is an alternative to value at risk that is more sensitive to the shape of the tail of the loss distribution. The "expected shortfall at q% level" is the expected return on the portfolio in the worst q% of the cases.

Expected shortfall is also called conditional value at risk (CVaR), average value at risk (AVaR), and expected tail loss (ETL).

ES evaluates the value (or risk) of an investment in a conservative way, focusing on the less profitable outcomes. For high values of q it ignores the most profitable but unlikely possibilities; for small values of q it focuses on the worst losses. On the other hand, unlike the discounted maximum loss, even for lower values of q expected shortfall does not consider only the single most catastrophic outcome. A value of q often used in practice is 5%.

Expected shortfall is a coherent, and moreover a spectral, measure of financial portfolio risk. It requires a quantile-level q, and is defined to be the expected loss of portfolio value given that a loss is occurring at or below the q-quantile.
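As a rough illustration, expected shortfall can be estimated from a sample of returns by averaging the worst q fraction of outcomes. This is a minimal sketch of one simple empirical estimator, not a production risk model: the function names, the rounding rule for the tail size, and the sample data are all assumptions made for the example.

```python
import math


def tail_size(n, q):
    """Number of observations treated as the worst q fraction (at least one)."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    return max(1, math.ceil(q * n))


def expected_shortfall(returns, q=0.05):
    """Mean return over the worst q fraction of the observed outcomes."""
    ordered = sorted(returns)              # worst outcomes first
    k = tail_size(len(ordered), q)
    return sum(ordered[:k]) / k


def tail_threshold(returns, q=0.05):
    """Best return inside the worst q fraction (a simple VaR-style cutoff)."""
    ordered = sorted(returns)
    return ordered[tail_size(len(ordered), q) - 1]


# Made-up returns, for illustration only.
sample_returns = [-0.10, -0.04, -0.02, 0.00, 0.01, 0.01, 0.02, 0.03, 0.03, 0.05]

if __name__ == "__main__":
    print(expected_shortfall(sample_returns, q=0.2))  # average of the two worst returns
    print(tail_threshold(sample_returns, q=0.2))      # cutoff return for the same tail
```

Because it averages over the whole tail, the estimate is never better than the cutoff return, which is what makes it sensitive to how bad the losses beyond the cutoff are.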

Roy's Safety-First Ratio

Under Roy's safety-first criterion, the optimal portfolio is the one that minimizes the probability that the portfolio's return will fall below a threshold level. In probability notation, if RP is the return on the portfolio, and RL is the threshold (the minimum acceptable return), then the portfolio for which P(RP < RL) is minimized will be the optimal portfolio according to Roy's safety-first criterion. The safety-first ratio helps compute this level by giving the number of standard deviations between the expected return and the minimum acceptable return, with a higher number considered safer.

Formula

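SFRatio = (E(RP) - RL) / σP

where E(RP) is the expected return on the portfolio, RL is the threshold return, and σP is the standard deviation of the portfolio's return.

Example: Roy's Safety First Ratio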

Let's say our minimum threshold is -2%, and we have the following expectations for portfolios A and B:

                          Portfolio A    Portfolio B
Expected Annual Return    8%             12%
Standard Deviation        10%            16%

The SFRatio for portfolio A is (8 - (-2))/10 = 1.0

The SFRatio for portfolio B is (12 - (-2))/16 = 0.875

In other words, the minimum threshold is one standard deviation below the expected return in Portfolio A, and just 0.875 standard deviations below it in Portfolio B, so by safety-first rules we opt for Portfolio A.
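The same comparison can be reproduced in a few lines. The sketch below computes the ratio for both portfolios and, under the additional assumption that returns are normally distributed (not stated in the example), converts each ratio into the shortfall probability P(RP < RL). The function names are hypothetical.

```python
from statistics import NormalDist


def sf_ratio(expected_return, threshold, std_dev):
    """Roy's safety-first ratio: (E(RP) - RL) / sigma_P."""
    return (expected_return - threshold) / std_dev


def shortfall_probability(expected_return, threshold, std_dev):
    """P(RP < RL), assuming normally distributed returns (an added assumption)."""
    return NormalDist().cdf(-sf_ratio(expected_return, threshold, std_dev))


# Figures from the example, in percent: (expected return, standard deviation).
portfolios = {"A": (8.0, 10.0), "B": (12.0, 16.0)}
threshold = -2.0

if __name__ == "__main__":
    for name, (mu, sigma) in portfolios.items():
        print(name, sf_ratio(mu, threshold, sigma),
              shortfall_probability(mu, threshold, sigma))
```

With normal returns, a higher ratio always corresponds to a lower shortfall probability, so ranking portfolios by SFRatio and ranking them by P(RP < RL) give the same choice.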

Log-normal distribution

A log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed. Thus, if the random variable X is log-normally distributed, then Y = log(X) has a normal distribution. Likewise, if Y has a normal distribution, then X = exp(Y) has a log-normal distribution. A random variable which is log-normally distributed takes only positive real values.

The distribution is occasionally referred to as the Galton distribution or Galton's distribution, after Francis Galton. The log-normal distribution also has been associated with other names, such as McAlister, Gibrat and Cobb–Douglas.

A variable might be modeled as log-normal if it can be thought of as the multiplicative product of many independent random variables each of which is positive. (This is justified by considering the central limit theorem in the log-domain.) For example, in finance, the variable could represent the compound return from a sequence of many trades (each expressed as its return + 1); or a long-term discount factor can be derived from the product of short-term discount factors. In wireless communication, the attenuation caused by shadowing or slow fading from random objects is often assumed to be log-normally distributed (see the log-distance path loss model).
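The multiplicative argument can be illustrated with a short simulation of compound returns. This is a sketch under made-up assumptions: each period's return is drawn uniformly between -5% and +6%, the periods are independent, and the function names are hypothetical. The point is only that the product is positive and right-skewed while its logarithm is close to symmetric.

```python
import math
import random
from statistics import mean, stdev


def compound_growth(n_periods, rng):
    """Product of (1 + r) over independent periods; every factor is positive."""
    product = 1.0
    for _ in range(n_periods):
        product *= 1.0 + rng.uniform(-0.05, 0.06)  # illustrative return range
    return product


def skewness(values):
    """Simple sample skewness: average cubed standardized deviation."""
    m, s = mean(values), stdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


rng = random.Random(42)
outcomes = [compound_growth(200, rng) for _ in range(3000)]
logs = [math.log(x) for x in outcomes]  # sum of many small independent terms

if __name__ == "__main__":
    print("skewness of compound growth:", skewness(outcomes))
    print("skewness of its logarithm:  ", skewness(logs))
```

Taking logs turns the product into a sum of independent terms, which is where the central limit theorem applies; the near-zero skewness of the logged values is consistent with an approximately normal shape.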