THE TIME VALUE OF MONEY IN FINANCE
PROFESSOR’S NOTE
The examples we
use in this reading are meant to show how the time value of money appears
throughout finance. Don’t worry if you are not yet familiar with the securities
we describe in this reading. We will see these examples again when we cover
bonds and forward interest rates in Fixed Income, stocks in Equity Investments,
foreign exchange in Economics, and options in Derivatives.
WARM-UP: USING A FINANCIAL CALCULATOR
For the exam, you must be able to use a financial
calculator when working time value of money problems. You simply do not have
the time to solve these problems any other way.
CFA Institute allows only two types of
calculators to be used for the exam: (1) the Texas Instruments®
TI BA II Plus™ (including the BA II Plus Professional™) and (2) the HP® 12C
(including the HP 12C Platinum). This reading is written primarily with the TI
BA II Plus in mind. If you do not already own a calculator, purchase a TI BA II
Plus! However, if you already own the HP 12C and are comfortable with it, by
all means, continue to use it.
Before we begin working with financial
calculators, you should familiarize yourself with your TI BA II Plus by
locating the keys noted below. These are the only keys you need to know to
calculate virtually all of the time value of money problems:
N = number of compounding periods
I/Y = interest rate per compounding period
PV = present value
FV = future value
PMT = annuity payments, or constant periodic cash flow
CPT = compute
The TI BA II Plus comes preloaded from the
factory with the periods per year function (P/Y) set to 12. This automatically
converts the annual interest rate (I/Y) into monthly rates. While appropriate
for many loan-type problems, this feature is not suitable for the vast majority
of the time value of money applications we will be studying. So, before using
our SchweserNotes™, please set your P/Y key to “1” using the following
sequence of keystrokes:
[2nd] [P/Y] “1” [ENTER] [2nd] [QUIT]
As long as you do not change the P/Y
setting, it will remain set at one period per year until the battery from your
calculator is removed (it does not change when you turn the calculator on and
off). If you want to check this setting at any time, press [2nd] [P/Y]. The
display should read P/Y = 1.0. If it does, press [2nd] [QUIT] to get out of the
“programming” mode. If it does not, repeat the procedure previously described
to set the P/Y key. With P/Y set to equal 1, it is now possible to think of I/Y
as the interest rate per compounding period and N as the number of compounding periods under analysis. Thinking of
these keys in this way should help you keep things straight as we work through
time value of money problems.
PROFESSOR’S NOTE
We have provided an online video in the Resource Library on how to use
the TI calculator. You can view it by logging in to your account at www.schweser.com.
MODULE 2.1: DISCOUNTED CASH FLOW VALUATION
Video covering this content is available online.
LOS 2.a: Calculate and interpret the present value (PV) of fixed income and equity instruments based on expected future cash flows.
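In our Rates and Returns reading, we gave
examples of the relationship between present values and future values. We can
simplify that relationship as follows:
\[ FV = PV(1 + r)^t \qquad PV = \frac{FV}{(1 + r)^t} \]
where r is the interest rate per compounding period and t is the number of compounding periods.
If we are using continuous compounding,
this is the relationship:
\[ FV = PV\,e^{rt} \qquad PV = FV\,e^{-rt} \]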
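The two discounting conventions can be compared with a short Python sketch. The inputs (1,000 due in five periods at 6%) are illustrative assumptions, not figures from this reading, and the function names are our own.

```python
import math

def present_value(fv, r, t):
    """Discount one future cash flow at rate r per period for t periods."""
    return fv / (1 + r) ** t

def present_value_continuous(fv, r, t):
    """Discount one future cash flow assuming continuous compounding at rate r."""
    return fv * math.exp(-r * t)

# Illustrative (assumed) inputs: 1,000 received five periods from now, 6% rate.
discrete = present_value(1000, 0.06, 5)
continuous = present_value_continuous(1000, 0.06, 5)
print(round(discrete, 2), round(continuous, 2))
```

For the same stated rate, continuous compounding discounts more heavily, so the continuous PV is slightly lower than the discrete PV.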
Fixed-Income Securities
One of the simplest examples of the time
value of money concept is a pure discount debt instrument, such as a zero-coupon bond. With a pure discount instrument, the investor pays less than
the face value to buy the instrument and receives the face value at maturity.
The price the investor pays depends on the instrument’s yield to maturity (the discount rate applied
to the face value) and the time until maturity. The amount of interest the
investor earns is the difference between the face value and the purchase price.
EXAMPLE: Zero-coupon bond
A zero-coupon bond with a face value of $1,000 will mature 15 years from today. The bond has a yield to
maturity of 4%. Assuming annual compounding, what is the bond’s price?
Answer:
\[ \frac{\$1{,}000}{(1.04)^{15}} = \$555.26 \]
We can infer a bond’s yield from its price
using the same relationship. Rather than solving for r with algebra, we typically use our financial calculators. For
this example, if we were given the price of $555.26, the face value of $1,000,
and annual compounding over 15 years, we would enter the following:
N = 15; PV = −555.26; PMT = 0; FV = 1,000
Then, to get the yield, CPT I/Y = 4.00.
PROFESSOR’S NOTE
Remember to enter cash outflows as
negative values and cash inflows as positive values. From the investor’s point
of view, the purchase price (PV) is an outflow, and the return of the face
value at maturity (FV) is an inflow.
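As a cross-check on the calculator keystrokes, the zero-coupon relationship can be solved directly in Python. This is a minimal sketch with function names of our own choosing; it assumes annual compounding, as in the example.

```python
def zero_coupon_price(face, ytm, years):
    """Price of a zero-coupon bond with annual compounding."""
    return face / (1 + ytm) ** years

def zero_coupon_yield(price, face, years):
    """Annual yield to maturity implied by a zero-coupon bond's price."""
    return (face / price) ** (1 / years) - 1

# The 15-year, $1,000 face value bond from the example above.
print(round(zero_coupon_price(1000, 0.04, 15), 2))           # 555.26
print(round(zero_coupon_yield(555.26, 1000, 15) * 100, 2))   # 4.0
```

Unlike the calculator, the algebraic version does not need a sign convention because it works with the ratio of face value to price.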
In some circumstances, interest rates can
be negative. A zero-coupon bond with a negative yield would be priced at a premium, which means its price is
greater than its face value.
EXAMPLE: Zero-coupon bond with a negative yield
If the bond in the previous example has a yield to maturity of −0.5%, what is its price, assuming annual
compounding?
Answer:
\[ \frac{\$1{,}000}{(1 - 0.005)^{15}} = \frac{\$1{,}000}{(0.995)^{15}} = \$1{,}078.09 \]
A fixed-coupon bond is only slightly more complex.
With a coupon bond, the investor receives a cash interest payment each period
in addition to the face value at maturity. The bond’s coupon rate is a
percentage of the face value and determines the amount of the interest
payments. For example, a 3% annual coupon, $1,000 bond pays 3% of $1,000, or
$30, each year.
The coupon rate and the yield to maturity are two different things. We only use the coupon
rate to determine the coupon payment (PMT). The yield to maturity (I/Y) is the
discount rate implied by the bond’s price.
EXAMPLE: Price of an annual coupon bond
Consider a 10-year, $1,000 par value, 10% coupon, annual-pay bond. What is the value of this bond if its
yield to maturity is 8%?
Answer:
The coupon payments will be 10% × $1,000 = $100 at the end of each year. The $1,000 par value will be
paid at the end of Year 10, along with the last coupon payment.
The value of this bond with a discount rate (yield to maturity) of 8% is:
\[ \frac{100}{1.08} + \frac{100}{1.08^2} + \cdots + \frac{100}{1.08^{10}} + \frac{1{,}000}{1.08^{10}} = 1{,}134.20 \]
The calculator solution is:
N = 10; PMT = 100; FV = 1,000; I/Y = 8; CPT PV = −1,134.20
The bond’s value is $1,134.20.
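The same valuation can be written as a sum of discounted cash flows. The sketch below assumes annual coupons paid at the end of each year; the function name is ours.

```python
def bond_price(face, coupon_rate, ytm, years):
    """PV of annual coupons plus the face value, discounted at the yield to maturity."""
    coupon = coupon_rate * face
    pv_coupons = sum(coupon / (1 + ytm) ** t for t in range(1, years + 1))
    pv_face = face / (1 + ytm) ** years
    return pv_coupons + pv_face

# The 10-year, 10% coupon bond from the example, discounted at 8%.
print(round(bond_price(1000, 0.10, 0.08, 10), 2))  # 1134.2
```

Setting the yield equal to the coupon rate returns the face value, which is a quick sanity check on the function.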
PROFESSOR’S NOTE
In this reading, where we want to illustrate time value of money concepts, we use only
annual coupon payments and compounding periods. In the Fixed Income topic area,
we will also perform these calculations for semiannual-pay bonds.
Some bonds exist that have no maturity date. We refer to these as perpetual bonds
or perpetuities. We cannot speak
meaningfully of the future value of a perpetuity, but its present value simplifies
mathematically to the following:
\[ PV\ \text{of a perpetuity} = \frac{\text{payment}}{r} \]
where r is the discount rate.
An amortizing bond is one that pays a level amount
each period, including its maturity period. The difference between an
amortizing bond and a fixed-coupon bond is that for an amortizing bond, each
payment includes some portion of the principal. With a fixed-coupon bond, the entire
principal is paid to the investor on the maturity date.
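Amortizing bonds are an example of an annuity instrument. For an annuity, the
payment each period is calculated as follows:
\[ A = \frac{r \times PV}{1 - (1 + r)^{-t}} \]
where A is the periodic payment, r is the interest rate per period, t is the number of periods, and PV is the present value (principal).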
We can also determine an annuity payment
using a financial calculator.
EXAMPLE: Computing a loan payment
Suppose you are considering applying for a $2,000 loan that will be repaid with equal end-of-year
payments over the next 13 years. If the annual interest rate for the loan is
6%, how much are your payments?
Answer:
The size of the end-of-year loan payment can be determined by inputting values
for the three known variables and computing PMT. Note that FV = 0 because the
loan will be fully paid off after the last payment:
N = 13; I/Y = 6; PV = −2,000; FV = 0; CPT PMT = $225.92
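The annuity payment formula can be checked against the calculator result with a few lines of Python. This is a sketch under the example's assumptions (level end-of-period payments); the helper that walks the balance down to zero is our own addition to show that each payment includes some principal.

```python
def annuity_payment(pv, r, n):
    """Level end-of-period payment that fully repays pv over n periods at rate r."""
    return r * pv / (1 - (1 + r) ** -n)

def remaining_balance(pv, r, n):
    """Loan balance after all n payments; should be (approximately) zero."""
    payment = annuity_payment(pv, r, n)
    balance = pv
    for _ in range(n):
        interest = balance * r          # interest portion of this payment
        balance -= payment - interest   # the rest of the payment repays principal
    return balance

# The $2,000, 13-year, 6% loan from the example.
print(round(annuity_payment(2000, 0.06, 13), 2))  # 225.92
```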
Equity Securities
As with fixed-income
securities, we value equity securities such as common and
preferred stock as the present value of their future cash flows. The key
differences are that equity securities do not mature, and their cash flows may
change over time.
Preferred stock pays a fixed dividend that is
stated as a percentage of its par value (similar to the face value of
a bond). As with bonds, we must distinguish between the stated percentage that
determines the cash flows and the discount rate we apply to the cash flows. We
say that equity investors have a required return that will induce them to
own an equity share. This required return is the discount rate we use to value
equity securities.
Because we can consider a preferred
stock’s fixed stream of dividends to be infinite, we can use the perpetuity
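formula to determine its value:
\[ \text{preferred stock value} = \frac{D_p}{k_P} \]
where Dp is the annual preferred dividend and kP is the required return on the preferred stock.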
EXAMPLE: Preferred stock valuation
A company’s
$100 par preferred stock pays a $5.00 annual dividend and has a required return
of 8%. Calculate the value of the preferred stock.
Answer:
Value of the
preferred stock: Dp/kP = $5.00/0.08 = $62.50
Common stock
is a residual claim to a company’s assets after it satisfies all other claims.
Common stock typically does not promise a fixed dividend payment. Instead, the
company’s management decides whether and when to pay common dividends.
Because the future cash flows are
uncertain, we must use models to estimate the value of common stock. Here, we
will look at three approaches analysts use frequently, which we call dividend discount models (DDMs).
We will return to these examples in the Equity Investments topic area and
explain when each model is appropriate.
1. Assume a constant future dividend.
Under this assumption, we can value a common stock the same way we value a
preferred stock, using the perpetuity formula.
2. Assume a constant growth rate of dividends.
With this assumption, we can apply the constant growth DDM, also known as the Gordon growth model. In this model, we state the
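value of a common share as follows:
\[ V_0 = \frac{D_1}{k_e - g_c} \]
where V0 is the value of a share today, D1 is the dividend expected next period, ke is the required return on common equity, and gc is the constant growth rate of dividends.
In this model, V0 represents
the PV of all the dividends in future periods, beginning with D1.
Note that ke must be greater than gc or the math will not work.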
EXAMPLE: Gordon growth model valuation
Calculate the
value of a stock that is expected to pay a $1.62 dividend next year, if
dividends are expected to grow at 8% forever and the required return on equity
is 12%.
Answer:
\[ V_0 = \frac{\$1.62}{0.12 - 0.08} = \$40.50 \]
3. Assume a changing growth rate of dividends.
This can be done in many ways. The example we will use here (and the one that
is required for the Level I CFA exam) is known as a multistage DDM.
Essentially, we assume a pattern of dividends in the short term, such as a
period of high growth, followed by a constant growth rate of dividends in the
long term.
To use a multistage DDM, we discount the
expected dividends in the short term as individual cash flows, then apply the
constant growth DDM to the long term. As we saw in the previous example, the
constant growth DDM gives us a value for an equity share one period before the dividend we use in the numerator.
EXAMPLE: Multistage growth
Consider a stock with dividends that are expected to grow at 15% per year for two years, after
which they are expected to grow at 5% per year, indefinitely. The last
dividend paid was $1.00, and ke = 11%. Calculate the value of this stock using the multistage growth model.
Answer:
Calculate the dividends over the high growth period:
\[ D_1 = 1.00(1.15) = 1.15 \qquad D_2 = 1.15(1.15) = 1.3225 \]
Calculate the first dividend of the constant-growth period:
\[ D_3 = 1.3225(1.05) = 1.3886 \]
Use the constant growth model to get P2, a value for all the (infinite) dividends expected from time = 3 onward:
\[ P_2 = \frac{D_3}{k_e - g_c} = \frac{1.3886}{0.11 - 0.05} = 23.14 \]
Finally, we can sum the present values of dividends 1 and 2
and of P2 to get the present value of all the expected future
dividends during both the high-growth and constant-growth periods:
\[ V_0 = \frac{1.15}{1.11} + \frac{1.3225 + 23.14}{(1.11)^2} = 20.89 \]
PROFESSOR’S NOTE
A key point to notice in this example is
that when we applied the dividend in Period 3 to the constant growth model, it
gave us a value for the stock in Period 2. To get a value for the stock today,
we had to discount this value back by two periods, along with the dividend in
Period 2 that was not included in the constant growth value.
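The constant growth and multistage calculations can be sketched in Python as follows. The function names and the two-stage structure (one high-growth stage followed by constant growth) are our own simplification; intermediate values are not rounded, so results may differ by a cent or so from hand calculations that round each step.

```python
def gordon_value(next_dividend, required_return, growth):
    """Constant growth DDM: value one period before the dividend supplied."""
    if required_return <= growth:
        raise ValueError("required return must exceed the growth rate")
    return next_dividend / (required_return - growth)

def two_stage_value(last_dividend, high_growth, high_years, long_growth, required_return):
    """Discount each high-growth dividend, then add the discounted terminal value."""
    value = 0.0
    dividend = last_dividend
    for t in range(1, high_years + 1):
        dividend *= 1 + high_growth
        value += dividend / (1 + required_return) ** t
    # Terminal value sits at the end of the high-growth stage (P2 in the example).
    terminal = gordon_value(dividend * (1 + long_growth), required_return, long_growth)
    return value + terminal / (1 + required_return) ** high_years

print(round(gordon_value(1.62, 0.12, 0.08), 2))               # 40.5
print(round(two_stage_value(1.00, 0.15, 2, 0.05, 0.11), 2))   # 20.89
```

Note how the terminal value is discounted by the number of high-growth years, not by one more period: the constant growth model already values the stock one period before its first dividend.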
MODULE QUIZ 2.1
1. Terry Corporation preferred stock is expected to pay a $9 annual dividend in
perpetuity. If the required rate of return on an equivalent investment is 11%, one
share of Terry preferred should be worth:
A. $81.82.
B. $99.00.
C. $122.22.
2. Dover Company wants to issue a $10 million face value of 10-year bonds with an annual coupon
rate of 5%. If the investors’ required yield on Dover’s bonds is 6%, the amount
the company will receive when it issues these bonds (ignoring transactions
costs) will be:
A. less than $10 million.
B. equal to $10 million.
C. greater than $10 million.
MODULE 2.2: IMPLIED RETURNS AND CASH FLOW ADDITIVITY
Video covering this content is available online.
LOS 2.b: Calculate and interpret the implied return of fixed-income
instruments and required return and implied growth of equity instruments given the present value (PV) and cash flows.
The examples we have seen so far illustrate the relationships among present
value, future cash flows, and the required rate of return. We can easily
rearrange these relationships and solve for the required rate of return, given
a security’s price and its future cash flows.
EXAMPLE: Yield of an annual coupon bond
Consider the 10-year, $1,000 par value, 10% coupon,
annual-pay bond we examined in an earlier example, when its price was
$1,134.20 at a yield to maturity of 8%. What is its yield to
maturity if its price decreases to $1,085.00?
Answer:
N = 10; PMT = 100; FV = 1,000; PV = −1,085; CPT I/Y = 8.69
The bond’s yield to maturity increased to 8.69%.
Notice that the
relationship between prices and yields is inverse. When the price decreases, the yield to maturity increases. When the price increases, the yield to maturity
decreases. Or, equivalently, when the yield increases, the price decreases. When the yield decreases, the price increases. We will use this concept again
and again when we study bonds in the Fixed Income topic area.
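A coupon bond's yield has no closed-form solution, so a calculator finds it numerically. The sketch below uses bisection, which is our choice for illustration and not necessarily the method a financial calculator uses; it relies only on the inverse price-yield relationship described above. The search bounds and tolerance are assumptions.

```python
def bond_price(face, coupon_rate, ytm, years):
    """PV of annual coupons plus the face value at the given yield."""
    coupon = coupon_rate * face
    return sum(coupon / (1 + ytm) ** t for t in range(1, years + 1)) + face / (1 + ytm) ** years

def bond_yield(price, face, coupon_rate, years, low=-0.5, high=1.0, tol=1e-10):
    """Find the yield that reproduces the price, by bisection."""
    while high - low > tol:
        mid = (low + high) / 2
        if bond_price(face, coupon_rate, mid, years) > price:
            low = mid    # model price too high, so the yield guess is too low
        else:
            high = mid   # model price too low (or equal), so the yield guess is too high
    return (low + high) / 2

print(round(bond_yield(1085.00, 1000, 0.10, 10) * 100, 2))  # 8.69
```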
In our examples for equity share values,
we assumed the investor’s required rate of return. In practice, the required
rate of return on equity is not directly observable. Instead, we use share
prices that we can observe in the market to derive implied required rates of
return on equity, given our assumptions about their future cash flows.
For example, if we assume a constant rate
of dividend growth, we can rearrange the constant growth DDM to solve for the
required rate of return:
\[ k_e = \frac{D_1}{V_0} + g_c \]
That is, the required rate of return on
equity is the ratio of the expected dividend to the current price (which we
refer to as a share’s dividend yield) plus the assumed constant
growth rate.
We can also rearrange the model to solve
for a stock’s implied growth rate,
given a required rate of return:
\[ g_c = k_e - \frac{D_1}{V_0} \]
That is, the implied growth rate is the
required rate of return minus the dividend yield.
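Both rearrangements are one-line calculations. The sketch below reuses the figures from the Gordon growth example ($1.62 expected dividend, 8% growth, 12% required return) together with the $40.50 value those inputs imply; the function names are ours.

```python
def implied_required_return(next_dividend, price, growth):
    """Dividend yield plus the assumed constant growth rate."""
    return next_dividend / price + growth

def implied_growth(next_dividend, price, required_return):
    """Required return minus the dividend yield."""
    return required_return - next_dividend / price

# A $40.50 share expected to pay $1.62 next year has a 4% dividend yield.
print(round(implied_required_return(1.62, 40.50, 0.08), 4))  # 0.12
print(round(implied_growth(1.62, 40.50, 0.12), 4))           # 0.08
```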
LOS 2.c: Explain the cash flow additivity principle, its importance for the no arbitrage condition, and its use in calculating implied forward interest rates,
forward exchange rates, and option values.
The cash flow additivity principle refers to the fact that the
PV of any stream of cash flows equals the sum of the PVs of the cash flows. If
we have two series of cash flows, the sum of the PVs of the two series is the
same as the PVs of the two series taken together, adding cash flows that will
be paid at the same point in time. We can also divide up a series of cash flows
any way we like, and the PV of the “pieces” will equal the PV of the original
series.
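For example, a series of year-end cash flows of 100, 100, 400, and 100 has the same PV as a 4-year annuity of 100 per year plus a single payment of 300 at the end of Year 3. This is a simple
example of replication. In effect,
we created the equivalent of the given series of uneven cash flows by combining
a 4-year annuity of 100 with a 3-year zero-coupon bond of 300.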
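Additivity is easy to verify numerically. In the sketch below, the uneven series (100, 100, 400, 100) is the one implied by combining a 4-year annuity of 100 with a payment of 300 at the end of Year 3; the 10% discount rate is an assumption chosen only for illustration.

```python
def pv(cash_flows, r):
    """PV of end-of-year cash flows; cash_flows[0] is paid at the end of Year 1."""
    return sum(cf / (1 + r) ** t for t, cf in enumerate(cash_flows, start=1))

rate = 0.10                           # assumed discount rate
uneven = [100, 100, 400, 100]         # the series being replicated
annuity = [100, 100, 100, 100]        # 4-year annuity of 100
zero_coupon = [0, 0, 300, 0]          # single payment of 300 at the end of Year 3

print(round(pv(uneven, rate), 2))
print(round(pv(annuity, rate) + pv(zero_coupon, rate), 2))
```

The two printed values match for any discount rate, which is the point of the principle: the equality does not depend on the rate chosen.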
We rely on the cash flow additivity
principle in many of the pricing models we see in the Level I CFA curriculum.
It is the basis for the no-arbitrage
principle, or “law of one price,”
which says that if two sets of future cash flows are identical under all
conditions, they will have the same price today (or if they don’t, investors
will quickly buy the lower-priced one and sell the higher-priced one, which
will drive their prices together).
Three examples of valuation based on the
no-arbitrage condition are forward interest rates, forward exchange rates, and
option pricing using a binomial model. We will explain each of these examples
in greater detail when we address the related concepts in the Fixed Income,
Economics, and Derivatives topic areas. For now, just focus on how they apply
the principle that equivalent future cash flows must have the same present
value.
Forward Interest Rates
A forward interest rate is the interest rate for a
loan to be made at some future date. The notation used must identify both the
length of the loan and when in the future the money will be borrowed. Thus, 1y1y is the rate for a 1-year loan to be
made one year from now; 2y1y is the
rate for a 1-year loan to be made two years from now; 3y2y is the 2-year forward rate three years from now; and so on.
By contrast, a spot interest rate is an interest rate for a
loan to be made today. We will use the notation S1 for a 1-year rate today, S2 for a 2-year rate today, and so on.
The way the cash flow
additivity principle applies here is that, for example, borrowing for three
years at the 3-year spot rate, or borrowing for one-year periods in three
successive years, should have the same cost today. This relation is illustrated
as follows:
\[ (1 + S_3)^3 = (1 + S_1)(1 + 1y1y)(1 + 2y1y) \]
In fact, any combination of spot and
forward interest rates that cover the same time period should have the same
cost. Using this idea, we can derive implied forward rates from spot rates
that are observable in the fixed-income markets.
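To illustrate how forward rates fall out of spot rates, the sketch below backs out 1y1y and 2y1y from a set of spot rates. The spot rates (2%, 3%, and 4%) are hypothetical values chosen for illustration, and the function name is ours.

```python
def implied_forward_rate(spot_short, n_short, spot_long, n_long):
    """Annualized forward rate for the period from year n_short to year n_long."""
    growth = (1 + spot_long) ** n_long / (1 + spot_short) ** n_short
    return growth ** (1 / (n_long - n_short)) - 1

# Hypothetical spot rates: S1 = 2%, S2 = 3%, S3 = 4%.
s1, s2, s3 = 0.02, 0.03, 0.04
f_1y1y = implied_forward_rate(s1, 1, s2, 2)
f_2y1y = implied_forward_rate(s2, 2, s3, 3)

# Rolling three 1-year loans costs the same as one 3-year loan at S3.
print(round((1 + s1) * (1 + f_1y1y) * (1 + f_2y1y), 6), round((1 + s3) ** 3, 6))
```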
Forward Currency Exchange Rates
An exchange rate
is the price of one country’s currency in terms of another country’s currency.
For example, an exchange rate of 1.416 USD/EUR means that one euro (EUR) is
worth 1.416 U.S. dollars (USD). The Level I CFA curriculum refers to the
currency in the numerator (USD, in this example) as the price currency and the one in
the denominator (EUR in this example) as the base currency.
Like interest rates, exchange rates can be
quoted as spot rates for currency exchanges to be made today, or as forward
rates for currency exchanges to be made at a future date.
The percentage
difference between forward and spot exchange rates is approximately the
difference between the two countries’ interest rates. This is because there is
an arbitrage trade with a riskless profit to be made when this relation does
not hold.
The possible
arbitrage is as follows: borrow Currency A at Interest Rate A, convert it to
Currency B at the spot rate and invest it to earn Interest Rate B, and sell the
proceeds from this investment forward at the forward rate to turn it back into
Currency A. If the forward rate does not correctly reflect the difference
between interest rates, such an arbitrage could generate a profit to the extent
that the return from investing Currency B and converting it back to Currency A
with a forward contract is greater than the cost of borrowing Currency A for
the period.
For spot and forward rates expressed as
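price currency/base currency, the no-arbitrage relation is as follows:
\[ \frac{\text{forward}}{\text{spot}} = \frac{1 + \text{interest rate}_{\text{price currency}}}{1 + \text{interest rate}_{\text{base currency}}} \]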
This formula can be rearranged as
necessary to solve for specific values of the relevant terms.
EXAMPLE: Calculating the arbitrage-free forward exchange rate
Consider two currencies, the ABE and the DUB. The spot ABE/DUB exchange rate is
4.5671, the 1-year riskless ABE rate is 5%, and the 1-year riskless DUB rate
is 3%. What is the 1-year forward exchange rate that will prevent arbitrage
profits?
Answer:
Rearranging our formula, we have:
\[ \text{forward} = \text{spot} \times \frac{1 + \text{interest rate}_{ABE}}{1 + \text{interest rate}_{DUB}} \]
and we can calculate the forward rate as:
\[ \text{forward} = 4.5671 \times \frac{1.05}{1.03} = 4.6558\ \text{ABE/DUB} \]
As you can see, the forward rate is greater than the spot
rate by 4.6558 / 4.5671 − 1 = 1.94%. This is approximately equal to the
interest rate differential of 5% − 3% = 2%.
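The following sketch reproduces the forward rate and then checks the no-arbitrage logic: one DUB converted to ABE, invested at the ABE rate, and sold forward should end up worth exactly what one DUB invested at the DUB rate would be. The function name is ours, and the sketch covers a single period only.

```python
def no_arbitrage_forward(spot, rate_price_ccy, rate_base_ccy):
    """One-period forward rate, quoted as price currency per unit of base currency."""
    return spot * (1 + rate_price_ccy) / (1 + rate_base_ccy)

spot = 4.5671                        # ABE per DUB
forward = no_arbitrage_forward(spot, 0.05, 0.03)
print(round(forward, 4))             # 4.6558

# Round trip: 1 DUB -> ABE at spot -> invest at 5% -> back to DUB at the forward rate.
dub_via_abe = 1 * spot * 1.05 / forward
print(round(dub_via_abe, 4))         # 1.03, the same as investing 1 DUB at 3%
```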
Option Pricing Model
An option is the right, but not the
obligation, to buy or sell an asset on a future date for a specified price. The
right to buy an asset is a call option, and the right to sell an asset
is a put option.
Valuing options
is different from valuing other securities because the owner can let an option
expire unexercised. A call option owner will let the option expire if the
underlying asset can be bought in the market for less than the price specified in
the option. A put option owner will let the option expire if the underlying
asset can be sold in the market for more than the price specified in the
option. In these cases, we say an option is out of the money. If an option is in the money on its expiration date, the
owner has the right to buy the asset for less, or sell the asset for more, than
its market price and, therefore, will exercise the option.
An approach to valuing options that we
will use in the Derivatives topic area is a binomial model. A
binomial model is based on the idea that, over the next period, some value will
change to one of two possible values. To construct a one-period binomial model
for pricing an option, we need the following:
A value for the underlying asset at the beginning of the period
An exercise price for the option; the exercise price can be different from the value of the underlying, and we assume the option expires one period from now
Returns that will result from an up-move and a down-move in the value of the underlying over one period
The risk-free rate over the period
As an example, we can model a call option
with an exercise price of $55 on a stock that is currently valued (S0)
at $50. Let us assume that in one period, the stock’s value will either
increase (\(S_1^u\)) to $60 or decrease (\(S_1^d\)) to $42. We state the return from an
up-move (Ru) as $60 / $50 = 1.20, and the return from a down-move (Rd)
as $42 / $50 = 0.84.
Figure 2.1: One-Period Binomial Tree
The call option
will be in the money after an up-move or out of the money after a down-move.
Its value at expiration after an up-move, \(c_1^u\), is $60 − $55 = $5. Its value
after a down-move, \(c_1^d\), is zero.
Now, we can use no-arbitrage pricing to determine the initial value of the call
option (c0). We do this by creating a portfolio of the option and
the underlying stock, such that the portfolio will have the same value
following either an up-move (\(V_1^u\)) or a down-move (\(V_1^d\)) in the stock. For our example,
we would write the call option (that is, we grant someone else the option to
buy the stock from us) and buy a number of shares of the stock that we will
denote as h. We must solve for the h that results in \(V_1^u = V_1^d\):
The initial value of our portfolio, V0,
is hS0 − c0 (we subtract c0 because we are
short the call option).
The portfolio value after an up-move is \(V_1^u = hS_1^u - c_1^u\).
The portfolio value after a down-move is \(V_1^d = hS_1^d - c_1^d\).
In our example, setting \(V_1^u = V_1^d\) and solving for h, we get the following:
\[ h(\$60) - \$5 = h(\$42) - \$0 \quad\Longrightarrow\quad h = \frac{\$5}{\$60 - \$42} = 0.278 \]
This result, the number of shares of the
underlying we would buy for each call option we would write, is known as the
hedge ratio for this option.
With \(V_1^u = V_1^d\), the value of the portfolio
after one period is known with certainty. This means we can say that either \(V_1^u\) or \(V_1^d\) must equal V0 compounded at the
risk-free rate for one period. In this example,
\(V_1^d\) = 0.278($42) = $11.68, or \(V_1^u\) = 0.278($60) − $5 = $11.68. Let us assume the
risk-free rate over one period is 3%. Then, V0 = $11.68 / 1.03 =
$11.34.
Now, we can solve for the value of the
call option, c0. Recall that V0 = hS0 − c0,
so c0 = hS0 − V0. Here, c0 =
0.278($50) − $11.34 = $2.56.
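The steps above can be collected into one short function. This is a sketch of the one-period hedged-portfolio approach described in the text, with names of our own choosing. Because it does not round the hedge ratio to 0.278, the intermediate portfolio value comes out near $11.33 instead of $11.34, but the call value still rounds to $2.56.

```python
def one_period_call_value(s0, exercise, s_up, s_down, risk_free):
    """Return (hedge ratio, initial portfolio value, call value) for a one-period model."""
    c_up = max(s_up - exercise, 0.0)        # call payoff after an up-move
    c_down = max(s_down - exercise, 0.0)    # call payoff after a down-move
    h = (c_up - c_down) / (s_up - s_down)   # shares held per call written
    v1 = h * s_down - c_down                # equals h * s_up - c_up by construction
    v0 = v1 / (1 + risk_free)               # riskless portfolio discounted one period
    return h, v0, h * s0 - v0               # c0 = h * S0 - V0

h, v0, c0 = one_period_call_value(50, 55, 60, 42, 0.03)
print(round(h, 3), round(c0, 2))  # 0.278 2.56
```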
MODULE QUIZ 2.2
1. For an equity share with a constant growth rate of dividends, we can estimate its:
A. value as the next dividend discounted at the required rate of return.
B. growth rate as the sum of its required rate of return and its dividend yield.
C. required return as the sum of its constant growth rate and its dividend yield.
2. An investment of €5 million today is expected to produce a one-time payoff of €7 million
three years from today. The annual return on this investment, assuming annual
compounding, is closest to:
A. 12%.
B. 13%.
C. 14%.
KEY CONCEPTS
LOS 2.a
The value of a fixed-income instrument or
an equity security is the present value of its future cash flows, discounted at
the investor’s required rate of return:
\[ PV = \sum_{t=1}^{N} \frac{CF_t}{(1 + r)^t} \]
The PV of a perpetual bond or a preferred
stock = payment / r, where r = required rate of return.
The PV of a common stock with a constant
growth rate of dividends is:
\[ V_0 = \frac{D_1}{k_e - g_c} \]
LOS 2.b
By rearranging the present value
relationship, we can calculate a security’s required rate of return based on
its price and its future cash flows. The relationship between prices and
required rates of return is inverse.
For an equity share with a constant rate
of dividend growth, we can estimate the required rate of return as the dividend
yield plus the assumed constant growth rate, or we can estimate the implied
growth rate as the required rate of return minus the
dividend yield.
LOS 2.c
Using the cash flow additivity principle,
we can divide up a series of cash flows any way we like, and the present value
of the pieces will equal the present value of the original series. This
principle is the basis for the no-arbitrage condition, under which two sets of
future cash flows that are identical must have the same present value.
ANSWER KEY FOR MODULE QUIZZES
Module Quiz 2.1
1. A 9 / 0.11 = $81.82 (LOS 2.a)
2. A Because the required yield is greater than the coupon rate,
the present value of the bonds is less than their face value: N = 10; I/Y = 6;
PMT = 0.05 × $10,000,000 = $500,000; FV = $10,000,000; and CPT PV =
−$9,263,991. (LOS 2.a)
Module Quiz 2.2
1. C Using the constant growth dividend discount model, we can
estimate the required rate of return as \(k_e = D_1 / V_0 + g_c\). The estimated value of a share
is all of its future dividends
discounted at the required rate of return, which simplifies to \(V_0 = D_1 / (k_e - g_c)\) if we assume a constant growth rate. We can
estimate the constant
growth rate as the required rate of return minus the dividend yield. (LOS 2.b)
2. A \((7 / 5)^{1/3} - 1 = 0.1187\), or approximately 12%. (LOS 2.b)
READING 3
STATISTICAL MEASURES OF ASSET RETURNS
MODULE 3.1: CENTRAL TENDENCY AND DISPERSION
Video covering this content is available online.
LOS 3.a: Calculate, interpret, and evaluate measures of central tendency and location to address an investment problem.
Measures of Central Tendency
Measures of central tendency identify the center, or
average, of a dataset. This central point can then be used to represent the
typical, or expected, value in the dataset.
The arithmetic mean is the
sum of the observation values divided by the number of observations. It is the
most widely used measure of central tendency. An example of an arithmetic mean
is a sample mean, which is the sum of all the values in a sample of a
population, ΣX, divided by the number of observations in the sample, n. It is used to make inferences about the population mean.
The sample mean is expressed as follows:
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]
The median is the
midpoint of a dataset, where the data are arranged in ascending or descending
order. Half of the observations lie above the median, and half are below. To
determine the median, arrange the data from the highest to lowest value, or
lowest to highest value, and find the middle observation.
The median is important because the arithmetic mean can be
affected by outliers, which are
extremely large or small values. When this occurs, the median is a better
measure of central tendency than the mean because it is not affected by extreme
values that may actually be the result of errors in the data.
EXAMPLE: The median using an odd number of observations
What is the
median return for five portfolio managers with a 10-year annualized total
returns record of 30%, 15%, 25%, 21%, and 23%?
Answer:
First, arrange
the returns in descending order:
30%, 25%, 23%,
21%, 15%
Then, select
the observation that has an equal number of observations above and below it—the
one in the middle. For the given dataset, the third observation, 23%, is the
median value.
EXAMPLE: The median using an even number of observations
Suppose we add
a sixth manager to the previous example with a return of 28%. What is the
median return?
Answer:
Arranging the
returns in descending order gives us this:
30%, 28%, 25%,
23%, 21%, 15%
With an even
number of observations, there is no single middle value. The median value, in
this case, is the arithmetic mean of the two middle observations, 25% and 23%.
Thus, the median return for the six managers is 24% = 0.5(25 + 23).
The mode is the
value that occurs most frequently in a dataset. A dataset may have more than
one mode, or even no mode. When a distribution has one value that appears most
frequently, it is said to be unimodal.
When a dataset has two or three values that occur most frequently, it is said
to be bimodal or trimodal, respectively.
EXAMPLE: The mode
What is the mode of the following dataset?
Dataset: [30%, 28%, 25%, 23%, 28%, 15%, 5%]
Answer:
The mode is 28% because it is the value appearing most frequently.
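The three examples above can be reproduced with Python's standard `statistics` module. The variable names are ours; the data are the returns (in percent) from the examples.

```python
import statistics

five_managers = [30, 15, 25, 21, 23]
six_managers = five_managers + [28]
mode_dataset = [30, 28, 25, 23, 28, 15, 5]

print(statistics.mean(five_managers))      # 22.8
print(statistics.median(five_managers))    # 23
print(statistics.median(six_managers))     # 24.0
print(statistics.multimode(mode_dataset))  # [28]
```

`multimode` returns a list, so a bimodal or trimodal dataset would simply produce two or three entries.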
For continuous data, such as investment returns, we
typically do not identify a single outcome as the mode. Instead, we divide the
relevant range of outcomes into intervals, and we identify the modal interval
as the one into which the largest number of observations fall.
Methods for Dealing With Outliers
In some cases, a researcher may decide that outliers should
be excluded from a measure of central tendency. One technique for doing so is to
use a trimmed mean. A trimmed mean excludes a stated percentage of
the most extreme observations. A 1% trimmed mean, for example, would discard
the lowest 0.5% and the highest 0.5% of the observations.
Another technique is to use a winsorized mean. Instead
of discarding the highest and lowest observations, we substitute a value for
them. To calculate a 90% winsorized mean, for example, we would determine the
5th and 95th percentile of the observations, substitute the 5th percentile for
any values lower than that, substitute the 95th percentile for any values
higher than that, and then calculate the mean of the revised dataset.
Percentiles are measures of location, which we will address next.
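The sketch below shows the mechanics of both techniques on a small hypothetical dataset (the returns are invented for illustration). As a simplification, it trims or replaces a fixed count k of observations in each tail instead of computing percentile cutoffs; with 10 observations, k = 1 roughly corresponds to cutoffs at the 10th and 90th percentiles.

```python
def trimmed_mean(values, k):
    """Mean after discarding the k lowest and k highest observations."""
    if 2 * k >= len(values):
        raise ValueError("k removes every observation")
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

def winsorized_mean(values, k):
    """Mean after replacing the k lowest and k highest observations
    with the nearest remaining value on each side."""
    if 2 * k >= len(values):
        raise ValueError("k replaces every observation")
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    clipped = [min(max(v, low), high) for v in ordered]
    return sum(clipped) / len(clipped)

# Hypothetical returns (%) with one extreme value in each tail.
returns = [-40, 2, 4, 5, 6, 7, 8, 9, 10, 60]
print(sum(returns) / len(returns))   # 7.1
print(trimmed_mean(returns, 1))      # 6.375
print(winsorized_mean(returns, 1))   # 6.3
```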
Measures of Location
Quantile is the
general term for a value at or below which a stated proportion of the data in a
distribution lies. Examples of quantiles include the following:
Quartile. The distribution is divided into quarters.
Quintile. The distribution is divided into fifths.
Decile. The distribution is divided into tenths.
Percentile. The distribution is divided into hundredths (percentages).
Note that any quantile may be expressed as a percentile. For
example, the third quartile partitions the distribution at a value such that
three-fourths, or 75%, of the observations fall below that value. Thus, the
third quartile is the 75th percentile. The difference between the third
quartile and the first quartile (25th percentile) is known as the interquartile range.
To visualize a dataset based on quantiles, we can create a box and whisker plot,
as shown in Figure 3.1. In a box and whisker plot, the box represents the
central portion of the data, such as the interquartile range. The vertical line
represents the entire range. In Figure 3.1, we can see that the largest
observation is farther away from the center than is the smallest observation.
This suggests that the data might include one or more outliers on the high
side.
Figure 3.1: Box and Whisker Plot
LOS 3.b: Calculate, interpret, and evaluate measures of dispersion to address an investment problem.
Dispersion is defined
as the variability around the central tendency.
The common theme in finance and investments is the tradeoff between reward and
variability, where the central tendency is the measure of the reward and
dispersion is a measure of risk.
The range is a relatively simple measure of variability, but when used with other
measures, it provides useful information. The range is the distance between the
largest and the smallest value in the dataset:
range = maximum value − minimum value
EXAMPLE: The range
What is the
range for the 5-year annualized total returns for five investment managers if
the managers’ individual returns were 30%, 12%, 25%, 20%, and 23%?
Answer: range = 30 − 12 = 18%
The mean absolute deviation (MAD)
is the average of the absolute values of the deviations of individual
observations from the arithmetic mean:
\[ MAD = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n} \]
The computation of the MAD uses the absolute values of each
deviation from the mean because the sum of the actual deviations from the
arithmetic mean is zero.
The sample variance, s², is the measure of
dispersion that applies when we are evaluating a sample of n observations from a population. The sample variance is calculated
using the following formula:
\[ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]
The denominator for s²
is n − 1, one less than the sample size n.
Based on the mathematical theory behind statistical procedures, the use of the
entire number of sample observations, n,
instead of n − 1 as the divisor in the computation of s², will systematically underestimate the population variance, particularly for small sample
sizes. This systematic underestimation causes the sample variance to be a biased estimator of the population
variance. Using n − 1 instead of n in
the denominator, however, improves the statistical properties of s² as an estimator of the
population variance.
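For the five manager returns in the range example (30%, 12%, 25%, 20%, and 23%, with a mean of 22%), the squared deviations from the mean sum to 178, so s² = 178 / 4 = 44.5(%²). Thus, the sample variance of 44.5(%²) can be interpreted to be an unbiased estimator of the population variance. Note that 44.5 “percent squared” is 0.00445, and you will get this value if you put the percentage returns in decimal form [e.g., (0.30 − 0.22)²].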
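The effect of the divisor can be seen with the five manager returns from the range example. This sketch uses the standard library's `statistics.variance` (divisor n − 1) and `statistics.pvariance` (divisor n); the variable names are illustrative.

```python
import statistics

# Five manager returns in percent (from the range example).
returns_pct = [30.0, 12.0, 25.0, 20.0, 23.0]

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = statistics.variance(returns_pct)

# Dividing by n instead gives a smaller number for the same data.
variance_divided_by_n = statistics.pvariance(returns_pct)

# The same returns in decimal form give the variance in decimal units.
returns_decimal = [r / 100 for r in returns_pct]
sample_variance_decimal = statistics.variance(returns_decimal)

print(f"s^2 with n - 1: {sample_variance:.1f} (%^2)")
print(f"variance with n: {variance_divided_by_n:.1f} (%^2)")
print(f"s^2 in decimal form: {sample_variance_decimal:.5f}")
```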
A major problem with using variance is the difficulty of interpreting it. The computed variance, unlike the mean, is in terms of squared units of measurement. How does one interpret squared percentages, squared dollars, or squared yen? This problem is mitigated through the use of the standard deviation. The units of standard deviation are the same as the units of the data (e.g., percentage return, dollars, euros). The sample standard deviation is the square root of the sample variance. The sample standard deviation, s, is calculated as follows:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
EXAMPLE: Sample standard deviation
Compute the sample standard deviation based on the result of the preceding example.
Answer: Because the sample variance for the preceding example was computed to be 44.5(%²), this is the sample standard deviation:
$$s = \sqrt{44.5(\%^2)} = 6.67\%$$
This means that on average, an individual return from the sample will deviate ±6.67% from the mean return of 22%. The sample standard deviation can be interpreted as an estimator of the population standard deviation.
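A short sketch of the square-root step, again using the five manager returns (30%, 12%, 25%, 20%, and 23%). It shows that taking the root brings the measure back to the units of the data.

```python
import math
import statistics

# Five manager returns in percent.
returns_pct = [30.0, 12.0, 25.0, 20.0, 23.0]

# Variance is in percent squared; its square root is back in percent.
sample_variance = statistics.variance(returns_pct)
sample_std_dev = math.sqrt(sample_variance)

# statistics.stdev performs the same two steps in one call.
std_dev_direct = statistics.stdev(returns_pct)

print(f"s^2 = {sample_variance:.1f} (%^2), s = {sample_std_dev:.2f}%")
```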
A direct comparison between two or more measures of dispersion may be difficult. For instance, suppose you are comparing the annual returns distribution for retail stocks with a mean of 8% and an annual returns distribution for a real estate portfolio with a mean of 16%. A direct comparison between the dispersion of the two distributions is not meaningful because of the relatively large difference in their means. To make a meaningful comparison, a relative measure of dispersion must be used. Relative dispersion is the amount of variability in a distribution around a reference point or benchmark. Relative dispersion is commonly measured with the coefficient of variation (CV), which is computed as follows:
$$\text{CV} = \frac{s_X}{\bar{X}} = \frac{\text{standard deviation of } X}{\text{average value of } X}$$
CV measures the amount of dispersion in a distribution
relative to the distribution’s mean. This is useful because it enables us to
compare dispersion across different sets of data. In an investments setting,
the CV is used to measure the risk (variability) per unit of expected return
(mean). A lower CV is better.
EXAMPLE: Coefficient of variation
You have just been presented with a report that indicates that the mean monthly return on T-bills is 0.25% with a standard deviation of 0.36%, and the mean monthly return for the S&P 500 is 1.09% with a standard deviation of 7.30%. Your manager has asked you to compute the CV for these two investments and to interpret your results.
Answer:
$$\text{CV}_{\text{T-bills}} = \frac{0.36}{0.25} = 1.44$$
$$\text{CV}_{\text{S\&P 500}} = \frac{7.30}{1.09} = 6.70$$
These results indicate that there is less dispersion (risk) per unit of monthly return for T-bills than for the S&P 500 (1.44 vs. 6.70).
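The CV comparison in this example can be reproduced with a one-line helper. The function name `coefficient_of_variation` is an illustrative choice, not something defined in the reading.

```python
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """Dispersion per unit of mean return (standard deviation / mean)."""
    return std_dev / mean


# Monthly figures in percent, taken from the example.
cv_tbills = coefficient_of_variation(std_dev=0.36, mean=0.25)
cv_sp500 = coefficient_of_variation(std_dev=7.30, mean=1.09)

# A lower CV means less variability per unit of return.
lower_risk_per_unit = "T-bills" if cv_tbills < cv_sp500 else "S&P 500"
print(f"CV T-bills = {cv_tbills:.2f}, CV S&P 500 = {cv_sp500:.2f}")
print(f"Less risk per unit of return: {lower_risk_per_unit}")
```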
PROFESSOR’S NOTE
To remember the
formula for CV, remember that the CV is a measure of variation, so standard
deviation goes in the numerator. CV is variation per unit of return.
When we use variance or standard deviation as risk measures,
we calculate risk based on outcomes both above and below the mean. In some
situations, it may be more appropriate to consider only outcomes less than the
mean (or some other specific value) in calculating a risk measure. In this
case, we are measuring downside risk.
One measure of downside risk is target downside deviation, which is also known as target semideviation. Calculating target downside deviation is similar to calculating standard deviation, but in this case, we choose a target value against which to measure each outcome and only include deviations from the target value in our calculation if the outcomes are below that target.
The formula for target downside deviation is stated as follows:
$$s_{\text{target}} = \sqrt{\frac{\sum_{\text{all } X_i < B} (X_i - B)^2}{n - 1}}$$
where B is the target.
Note that the denominator remains the sample size n minus one, even though we are not
using all of the observations in the numerator.
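EXAMPLE: Target downside deviation
Calculate the target downside deviation based on the data in the preceding examples, for a target return equal to the mean (22%), and for a target return of 24%.
Answer: With a target of 22%, only the 12% and 20% returns fall below the target, so the sum of squared deviations is (−10)² + (−2)² = 104, and the target downside deviation is √(104 / 4) = 5.10%. With a target of 24%, the 12%, 20%, and 23% returns fall below the target, so the sum is (−12)² + (−4)² + (−1)² = 161, and the target downside deviation is √(161 / 4) = 6.34%.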
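A minimal sketch of the target downside deviation calculation for the five manager returns (30%, 12%, 25%, 20%, and 23%). The helper name `target_downside_deviation` is illustrative. Note that the divisor stays at n − 1 even though only below-target observations enter the sum.

```python
import math


def target_downside_deviation(observations, target):
    """Square root of the sum of squared below-target deviations over n - 1."""
    squared_shortfalls = [(x - target) ** 2 for x in observations if x < target]
    return math.sqrt(sum(squared_shortfalls) / (len(observations) - 1))


# Five manager returns in percent.
returns_pct = [30.0, 12.0, 25.0, 20.0, 23.0]

downside_at_mean = target_downside_deviation(returns_pct, target=22.0)
downside_at_24 = target_downside_deviation(returns_pct, target=24.0)

print(f"target 22%: {downside_at_mean:.2f}%")
print(f"target 24%: {downside_at_24:.2f}%")
```

Raising the target brings more observations below it, so the measure increases.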
MODULE QUIZ 3.1
1. A dataset has 100 observations. Which of the following measures of central tendency will be calculated using a denominator of 100?
A. The winsorized mean, but not the trimmed mean.
B. Both the trimmed mean and the winsorized mean.
C. Neither the trimmed mean nor the winsorized mean.
2. XYZ Corp. Annual Stock Returns: 22%, 5%, −7%, 11%, 2%, 11%
What is the sample standard deviation?
A. 9.8%.
B. 72.4%.
C. 96.3%.
3. XYZ Corp. Annual Stock Returns: 22%, 5%, −7%, 11%, 2%, 11%
Assume an investor has a target return of 11% for XYZ stock. What is the stock’s target downside deviation?
A. 9.39%.
B. 12.10%.
C. 14.80%.
MODULE 3.2: SKEWNESS, KURTOSIS, AND CORRELATION
LOS 3.c: Interpret and evaluate measures of skewness and kurtosis
to address an investment problem.
A distribution is symmetrical if it is shaped identically on
both sides of its mean.
Distributional
symmetry implies that intervals of losses and gains will exhibit the same
frequency. For example, a symmetrical distribution with a mean return of zero
will have losses in the −6% to −4% interval as frequently as it will have gains
in the +4% to +6% interval. The extent to which a returns distribution is
symmetrical is important because the degree of symmetry tells analysts if
deviations from the mean are more likely to be positive or negative.
Skewness, or skew, refers to the extent
to which a distribution is not symmetrical. Nonsymmetrical distributions may be
either positively or negatively skewed and result from the occurrence of
outliers in the dataset. Outliers
are observations extraordinarily far from the mean, either above or below:
A positively skewed distribution is
characterized by outliers greater than the mean (in the upper region, or right
tail). A positively skewed distribution is said to be skewed right because of
its relatively long upper (right) tail.
A negatively skewed distribution has a disproportionately large number of outliers less than the mean that fall within its lower (left) tail. A negatively skewed distribution is said to be skewed left because of its long lower tail.
Skewness affects the location of the mean, median, and mode
of a distribution:
For
a symmetrical distribution, the mean, median, and mode are equal.
For a positively skewed, unimodal distribution, the mode is
less than the median, which is less than the mean. The mean is affected by
outliers; in a positively skewed distribution, there are large, positive
outliers, which will tend to pull the mean upward, or more positive. An example
of a positively skewed distribution is that of housing prices. Suppose you live
in a neighborhood with 100 homes; 99 of them sell for $100,000, and one sells
for $1,000,000. The median and the mode will be $100,000, but the mean will be
$109,000. Hence, the mean has been pulled upward (to the right) by the
existence of one home (outlier) in the neighborhood.
For a negatively skewed, unimodal
distribution, the mean is less than the median, which is less than the mode. In
this case, there are large, negative outliers that tend to pull the mean
downward (to the left).
PROFESSOR’S NOTE
The key to remembering how measures of central
tendency are affected by skewed data is to recognize that skew affects the mean
more than the median and mode, and the mean is pulled in the direction of the
skew. The relative location of the mean, median, and mode for different
distribution shapes is shown in Figure 3.2. Note that the median is between the
other two measures for positively or negatively skewed distributions.
Figure 3.2: Effect of Skewness on
Mean, Median, and Mode
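Sample skewness is equal to the sum of the cubed deviations from the mean divided by the cubed standard deviation and by the number of observations. Sample skewness for large samples is approximated as follows:
$$\text{sample skewness} \approx \frac{1}{n} \cdot \frac{\sum_{i=1}^{n} (X_i - \bar{X})^3}{s^3}$$
where s is the sample standard deviation.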
PROFESSOR’S NOTE
The LOS requires us to “interpret and evaluate” measures of
skewness and kurtosis, but not to calculate them.
Note that the denominator is always positive, but that the
numerator can be positive or negative depending on whether observations above
the mean or observations below the mean tend to be farther from the mean, on
average. When a distribution is right skewed, sample skewness is positive
because the deviations above the mean are larger, on average. A left-skewed
distribution has a negative sample skewness.
Dividing by standard deviation cubed standardizes the statistic and allows interpretation of the skewness measure. If relative skewness is equal to zero, the data are not skewed. Positive levels of relative skewness imply a positively skewed distribution, whereas negative values of relative skewness imply a negatively skewed distribution. Values of sample skewness in excess of 0.5 in absolute value are considered significant.
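The sign of sample skewness can be illustrated with the housing-price example from this module (99 homes at $100,000 and one at $1,000,000). The sketch below applies the large-sample approximation described in the text: cubed deviations divided by n times the cubed sample standard deviation. The function name is illustrative.

```python
import math
import statistics


def sample_skewness(data):
    """Large-sample approximation: sum of cubed deviations / (n * s^3)."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return sum((x - mean) ** 3 for x in data) / (n * s ** 3)


# Housing example: one large positive outlier pulls the mean above the median.
home_prices = [100_000] * 99 + [1_000_000]
mean_price = statistics.mean(home_prices)
median_price = statistics.median(home_prices)
skew_right = sample_skewness(home_prices)

# Mirroring the data flips the sign; symmetric data has zero skewness.
skew_left = sample_skewness([-p for p in home_prices])
skew_symmetric = sample_skewness([1, 2, 3, 4, 5])

print(f"mean = {mean_price}, median = {median_price}")
print(f"skewness: right {skew_right:.2f}, left {skew_left:.2f}")
```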
Kurtosis is a measure of the degree to which a distribution is more or less peaked than a normal distribution. Leptokurtic describes a distribution that is more peaked than a normal distribution, whereas platykurtic refers to a distribution that is less peaked, or flatter than a normal one. A distribution is mesokurtic if it has the same kurtosis as a normal distribution.
As indicated in Figure 3.3, a leptokurtic return
distribution will have more returns clustered around the mean and more returns
with large deviations from the mean (fatter tails). Relative to a normal
distribution, a leptokurtic distribution will have a greater percentage of
small deviations from the mean and a greater percentage of extremely large
deviations from the mean. This means that there is a relatively greater
probability of an observed value being either close to the mean or far from the
mean. Regarding an investment returns distribution, a greater likelihood of a
large deviation from the mean return is often perceived as an increase in risk.
Figure 3.3: Kurtosis
A distribution is said to exhibit excess kurtosis if it has
either more or less kurtosis than the normal distribution. The computed
kurtosis for all normal distributions is three. Statisticians, however,
sometimes report excess kurtosis, which is defined as kurtosis minus three.
Thus, a normal distribution has excess kurtosis equal to zero, a leptokurtic
distribution has excess kurtosis greater than zero, and platykurtic
distributions will have excess kurtosis less than zero.
Kurtosis is critical in a risk management setting. Most
research about the distribution of securities returns has shown that returns
are not normally distributed. Actual securities returns tend to exhibit both
skewness and kurtosis. Skewness and kurtosis are critical concepts for risk
management because when securities returns are modeled using an assumed normal
distribution, the predictions from the models will not take into account the
potential for extremely large, negative outcomes. In fact, most risk managers
put very little emphasis on the mean and standard deviation of a distribution
and focus more on the distribution of returns in the tails of the
distribution—that is where the risk is. In general, greater excess kurtosis and
more negative skew in returns distributions indicate increased risk.
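Sample kurtosis for large samples is approximated using deviations raised to the fourth power:
$$\text{sample kurtosis} \approx \frac{1}{n} \cdot \frac{\sum_{i=1}^{n} (X_i - \bar{X})^4}{s^4}$$
where s is the sample standard deviation.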
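A small sketch of the kurtosis approximation and of excess kurtosis (kurtosis minus three). Both datasets are hypothetical and chosen only to show the sign of the result: one is flat, and one has most values at the mean plus two extreme observations.

```python
import math


def sample_kurtosis(data):
    """Large-sample approximation: sum of fourth-power deviations / (n * s^4)."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return sum((x - mean) ** 4 for x in data) / (n * s ** 4)


def excess_kurtosis(data):
    """Kurtosis relative to the normal distribution's value of three."""
    return sample_kurtosis(data) - 3.0


# Hypothetical flat data: evenly spread, no peak and thin tails.
flat_data = list(range(1, 101))

# Hypothetical fat-tailed data: clustered at zero with two extreme values.
fat_tailed_data = [0.0] * 98 + [-10.0, 10.0]

excess_flat = excess_kurtosis(flat_data)
excess_fat = excess_kurtosis(fat_tailed_data)

print(f"flat data excess kurtosis: {excess_flat:.2f} (platykurtic)")
print(f"fat-tailed excess kurtosis: {excess_fat:.2f} (leptokurtic)")
```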
LOS 3.d: Interpret correlation between two variables to address an investment problem.
Scatter plots are a method for displaying
the relationship between two variables. With one variable on the vertical axis
and the other on the horizontal axis, their paired observations can each be
plotted as a single point. For example, in Panel A of Figure 3.4, the point
farthest to the upper right shows that when one of the variables (on the
horizontal axis) equaled 9.2, the other variable (on the vertical axis) equaled
8.5.
The scatter plot in Panel A is typical of two variables that have no clear relationship. Panel B shows two variables that have a strong linear relationship (that is, a high correlation coefficient).
A key advantage of creating scatter plots is that they can reveal nonlinear relationships, which are not described by the correlation coefficient. Panel C illustrates such a relationship. Although the correlation coefficient for these two variables is close to zero, their scatter plot shows clearly that they are related in a predictable way.
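The point about nonlinear relationships can be checked numerically. In this sketch the data are hypothetical: y is an exact function of x (y = x²), yet the correlation is zero because the relationship is not linear. The `correlation` helper is written out by hand for illustration.

```python
import math


def correlation(xs, ys):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (std_x * std_y)


# Hypothetical data, symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys_quadratic = [x ** 2 for x in xs]    # perfectly predictable, but nonlinear
ys_linear = [2 * x + 1 for x in xs]    # perfectly linear

r_quadratic = correlation(xs, ys_quadratic)
r_linear = correlation(xs, ys_linear)

print(f"correlation with y = x^2:    {r_quadratic:.4f}")
print(f"correlation with y = 2x + 1: {r_linear:.4f}")
```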
Figure 3.4: Scatter Plots
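Covariance is a measure of how two variables move together. The calculation of the sample covariance is based on the following formula:
$$s_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
where X̄ and Ȳ are the sample means of X and Y, and n is the sample size.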
In practice, the covariance is difficult to interpret. The value of covariance depends on the units of the variables. The covariance of daily price changes of two securities priced in yen will be much greater than their covariance if the securities are priced in dollars. Like the variance, the units of covariance are the square of the units used for the data.
Additionally, covariance alone does not tell us the relative strength of the relationship between two variables. Knowing that the covariance of X and Y is 0.8756 tells us only that they tend to move together because the covariance is positive. A standardized measure of the linear relationship between two variables is called the correlation coefficient, or simply correlation.
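The correlation between two variables, X and Y, is calculated as follows:
$$\rho_{XY} = \frac{s_{XY}}{s_X \, s_Y}$$
where s_XY is the sample covariance of X and Y, and s_X and s_Y are their sample standard deviations.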
The properties of the correlation of two random variables, X and Y, are summarized here:
Correlation measures the strength of the
linear relationship between two random variables.
Correlation
has no units.
The correlation ranges from −1 to +1. That is, −1 ≤ ρXY
≤ +1.
If ρXY = 1.0, the random variables have perfect
positive correlation. This means that a movement in one random variable results
in a proportional positive movement in the other relative to its mean.
If ρXY = −1.0, the random
variables have perfect negative correlation. This means that a movement in one
random variable results in an exact opposite proportional movement in the other
relative to its mean.
If ρXY = 0, there is no linear
relationship between the variables, indicating that prediction of Y cannot be made on the basis of X using linear methods.
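To see why correlation is easier to interpret than covariance, the sketch below restates the same price changes in a second currency. The price changes and the exchange rate of 150 yen per dollar are hypothetical values chosen only for illustration.

```python
import math


def covariance(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance standardized by the two sample standard deviations."""
    std_x = math.sqrt(covariance(xs, xs))  # covariance of a series with itself is its variance
    std_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)


# Hypothetical daily price changes for two securities, in dollars.
changes_a_usd = [0.5, -0.2, 0.3, -0.4, 0.1]
changes_b_usd = [0.4, -0.1, 0.2, -0.5, 0.3]

# Same changes restated in yen at an assumed rate of 150 yen per dollar.
yen_per_usd = 150.0
changes_a_jpy = [c * yen_per_usd for c in changes_a_usd]
changes_b_jpy = [c * yen_per_usd for c in changes_b_usd]

cov_usd = covariance(changes_a_usd, changes_b_usd)
cov_jpy = covariance(changes_a_jpy, changes_b_jpy)
corr_usd = correlation(changes_a_usd, changes_b_usd)
corr_jpy = correlation(changes_a_jpy, changes_b_jpy)

print(f"covariance:  {cov_usd:.4f} (USD^2) vs. {cov_jpy:.1f} (JPY^2)")
print(f"correlation: {corr_usd:.4f} vs. {corr_jpy:.4f}")
```

The covariance grows by the square of the exchange rate, while the correlation has no units and is unchanged.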
EXAMPLE: Correlation
The variance of returns on Stock A is 0.0028, the variance of returns on Stock B is 0.0124, and their covariance of returns is 0.0058. Calculate and interpret the correlation of the returns for Stocks A and B.
Answer: First, it is necessary to convert the variances to standard deviations:
$$s_A = \sqrt{0.0028} = 0.0529 \qquad s_B = \sqrt{0.0124} = 0.1114$$
Now, the correlation between the returns of Stock A and Stock B can be computed as follows:
$$\rho_{AB} = \frac{0.0058}{0.0529 \times 0.1114} = 0.9842$$
The fact that this value is close to +1 indicates that the linear relationship is not only positive, but also is very strong.
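The two steps in this example (variances to standard deviations, then covariance divided by their product) can be sketched as follows. Only the three inputs come from the example; the variable names are illustrative.

```python
import math

# Inputs from the example.
variance_a = 0.0028
variance_b = 0.0124
covariance_ab = 0.0058

# Step 1: convert variances to standard deviations.
std_a = math.sqrt(variance_a)
std_b = math.sqrt(variance_b)

# Step 2: standardize the covariance.
correlation_ab = covariance_ab / (std_a * std_b)

print(f"std A = {std_a:.4f}, std B = {std_b:.4f}")
print(f"correlation = {correlation_ab:.3f}")
```

Small differences in the last decimal place can arise depending on whether the standard deviations are rounded before dividing.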
Care should be taken when drawing conclusions based on correlation. Causation is not implied just from significant correlation. Even if it were, which variable is causing change in the other is not revealed by correlation. It is more prudent to say that two variables exhibit positive (or negative) association, suggesting that the nature of any causal relationship is to be separately investigated or based on theory that can be subject to additional tests.
One question that can be investigated is the role of outliers (extreme values) in the correlation of two variables. If removing the outliers significantly reduces the calculated correlation, further inquiry is necessary into whether the outliers provide information or are caused by noise (randomness) in the data used.
Spurious correlation refers to correlation that is either the result of chance or present due to changes in both variables over time that are caused by their association with a third variable. For example, we can find instances where two variables that are both related to the inflation rate exhibit significant correlation, but for which causation in either direction is not present.
In his book Spurious Correlations,1 Tyler Vigen presents the following examples. The correlation between the age of each year’s Miss America and the number of films Nicolas Cage appeared in that year is 87%. This seems a bit random. The correlation between the U.S. spending on science, space, and technology and suicides by hanging, strangulation, and suffocation over the 1999–2009 period is 99.87%. Impressive correlation, but both variables increased in an approximately linear fashion over the period.
MODULE QUIZ 3.2
1. Which of the following is most accurate regarding a distribution of returns that has a mean greater than its median?
A. It is positively skewed.
B. It is a symmetric distribution.
C. It has positive excess kurtosis.
2. A distribution of returns that has a greater percentage of small deviations from the mean and a greater percentage of extremely large deviations from the mean compared with a normal distribution:
A. is positively skewed.
B. has positive excess kurtosis.
C. has negative excess kurtosis.
3. The correlation between two variables is +0.25. The most appropriate way to interpret this value is to say:
A. a scatter plot of the two variables is likely to show a strong linear relationship.
B. when one variable is above its mean, the other variable tends to be above its mean as well.
C. a change in one of the variables usually causes the other variable to change in the same direction.
KEY CONCEPTS
LOS 3.a
The arithmetic mean is the average of observations. The sample mean is the arithmetic mean of a sample:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
The median is the midpoint of a dataset when the data are
arranged from largest to smallest.
The mode of a dataset is the value that occurs most
frequently. The modal interval is a measure of mode for continuous data.
A trimmed mean omits outliers, and a winsorized mean
replaces outliers with given values, reducing the effect of outliers on the
mean in both cases.
Quantile is the general term for a value at or below which
lies a stated proportion of the data in a distribution. Examples of quantiles
include the following:
Quartile. The distribution is divided
into quarters.
Quintile. The distribution is divided into fifths.
Decile. The
distribution is divided into tenths.
Percentile. The
distribution is divided into hundredths (percentages).
LOS 3.b
The range is the difference between the largest and smallest
values in a dataset.
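Mean absolute deviation (MAD) is the average of the absolute values of the deviations from the arithmetic mean:
$$\text{MAD} = \frac{\sum_{i=1}^{n} \left| X_i - \bar{X} \right|}{n}$$
Variance is defined as the mean of the squared deviations from the arithmetic mean. For a sample, the sum of squared deviations is divided by n − 1:
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Standard deviation is the positive square root of the variance, and it is frequently used as a quantitative measure of risk.
The coefficient of variation (CV) for sample data, CV = s / X̄, is the ratio of the standard deviation of the sample to its mean.
Target downside deviation or semideviation is a measure of downside risk:
$$s_{\text{target}} = \sqrt{\frac{\sum_{\text{all } X_i < B} (X_i - B)^2}{n - 1}}$$
where B is the target.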
LOS 3.c
Skewness describes the degree to which a distribution is not
symmetric about its mean. A right-skewed distribution has positive skewness. A
left-skewed distribution has negative skewness.
For a positively skewed, unimodal distribution, the mean is
greater than the median, which is greater than the mode. For a negatively
skewed, unimodal distribution, the mean is less than the median, which is less
than the mode.
Kurtosis measures the peakedness of a distribution and the
probability of extreme outcomes (thickness of tails):
Excess kurtosis is measured relative to a
normal distribution, which has a kurtosis of 3.
Positive values of excess kurtosis indicate a
distribution that is leptokurtic (fat tails, more peaked), so the probability
of extreme outcomes is greater than for a normal distribution.
Negative values of excess kurtosis indicate a
platykurtic distribution (thin tails, less peaked).
LOS 3.d
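Correlation is a standardized measure of association between two random variables. It ranges in value from −1 to +1 and is equal to the covariance of the two variables divided by the product of their standard deviations:
$$\rho_{XY} = \frac{s_{XY}}{s_X \, s_Y}$$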
Scatter plots are useful for revealing nonlinear
relationships that are not measured by correlation.
Correlation does not imply that changes in one variable
cause changes in the other. Spurious correlation may result by chance, or from
the relationships of two variables to a third variable.
ANSWER KEY FOR MODULE QUIZZES
Module Quiz 3.1
1. A The winsorized mean substitutes a value for some of the
largest and smallest observations. The trimmed mean removes some of the largest
and smallest observations. (LOS 3.a)
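2. A The sample mean is [22% + 5% + −7% + 11% + 2% + 11%] / 6 = 7.3%. The sample standard deviation is the square root of the sample variance:
$$s = \sqrt{\frac{(22 - 7.3)^2 + (5 - 7.3)^2 + (-7 - 7.3)^2 + (11 - 7.3)^2 + (2 - 7.3)^2 + (11 - 7.3)^2}{6 - 1}} = \sqrt{96.3(\%^2)} = 9.8\%$$
(LOS 3.b)
3. A Here are deviations from the target return. Only the returns below the 11% target are included: 5% − 11% = −6%, −7% − 11% = −18%, and 2% − 11% = −9%.
$$s_{\text{target}} = \sqrt{\frac{(-6)^2 + (-18)^2 + (-9)^2}{6 - 1}} = \sqrt{\frac{441}{5}} = 9.39\%$$
(LOS 3.b)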
Module Quiz 3.2
1. A A distribution with a mean greater than its median is
positively skewed, or skewed to the right. The skew pulls the mean. Kurtosis
deals with the overall shape of a distribution, not its skewness. (LOS 3.c)
2. B A distribution that has a greater percentage of small
deviations from the mean and a greater percentage of extremely large deviations
from the mean will be leptokurtic and will exhibit excess kurtosis (positive).
The distribution will be more peaked and have fatter tails than a normal
distribution. (LOS 3.c)
3. B A correlation of +0.25 indicates a positive linear
relationship between the variables—one tends to be above its mean when the
other is above its mean. The value 0.25 indicates that the linear relationship
is not particularly strong. Correlation does not imply causation. (LOS 3.d)
1 “Spurious Correlations,” Tyler Vigen, www.tylervigen.com
READING 4
PROBABILITY TREES AND CONDITIONAL EXPECTATIONS
MODULE 4.1: PROBABILITY MODELS, EXPECTED VALUES, AND BAYES’ FORMULA
LOS 4.a: Calculate expected values, variances, and standard
deviations and demonstrate their application to investment problems.
The expected value of a random variable is the weighted average of the possible outcomes for the variable. The mathematical representation for the expected value of random variable X, which can take on any of the values from x1 to xn, is:
$$E(X) = \sum_{i=1}^{n} P(x_i)\, x_i = P(x_1) x_1 + P(x_2) x_2 + \dots + P(x_n) x_n$$
EXAMPLE: Expected earnings per share
The probability distribution of earnings per share (EPS) for Ron’s Stores is given in the following figure. Calculate the expected EPS.
EPS Probability Distribution
Answer: The expected EPS is simply a weighted average of each possible EPS, where the weights are the probabilities of each possible outcome:
Variance
and standard deviation measure the dispersion of a random variable
around its expected value, sometimes referred to as the volatility of a random variable. Variance (from a probability model) can be calculated as the
probability-weighted sum of the squared deviations from the mean (or expected
value). The standard deviation is the positive square root of the variance. The
following example illustrates the calculations for a probability model of
possible returns.
Note that in a previous reading, we estimated the standard
deviation of a distribution from sample data, rather than from a probability
model of returns. For the sample standard deviation, we divided the sum of the
squared deviations from the mean by n − 1, where n was the size of the sample. Here, we have no “n” because we have
no observations; a probability model is forward-looking. We use the probability
weights instead, as they describe the entire distribution of outcomes.
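A minimal sketch of the probability-model calculations described above. The three return scenarios and their probabilities are hypothetical values chosen for illustration; they are not taken from the reading's example.

```python
import math

# Hypothetical probability model: (probability, return in percent).
scenarios = [
    (0.20, -5.0),
    (0.50, 5.0),
    (0.30, 15.0),
]

# Probabilities must describe the entire distribution of outcomes.
total_probability = sum(p for p, _ in scenarios)

# Expected value: probability-weighted average of the outcomes.
expected_return = sum(p * r for p, r in scenarios)

# Variance: probability-weighted sum of squared deviations from the expected value.
# There is no n - 1 divisor because the weights are probabilities, not sample counts.
variance = sum(p * (r - expected_return) ** 2 for p, r in scenarios)
std_dev = math.sqrt(variance)

print(f"E(R) = {expected_return:.1f}%")
print(f"variance = {variance:.1f} (%^2), standard deviation = {std_dev:.1f}%")
```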
LOS 4.b: Formulate an investment problem as a probability tree and explain the use of conditional expectations in investment application.
You may wonder
where the returns and probabilities used in calculating expected values come
from. A general framework, called a probability tree, is used to show the
probabilities of various outcomes. In Figure 4.1, we have shown estimates of
EPS for four different events: (1) a good economy and relatively good results
at the company, (2) a good economy and relatively poor results at the company,
(3) a poor economy and relatively good results at the company, and (4) a poor
economy and relatively poor results at the company. Using the rules of
probability, we can calculate the probabilities of each of the four EPS
outcomes shown in the boxes on the right-hand side of the
probability tree.
The expected EPS of $1.51 is simply calculated as follows:
Note that the probabilities of the four possible outcomes
sum to 1.
Figure 4.1: A Probability Tree
Expected values or expected returns can be calculated using
conditional probabilities. As the name implies, conditional expected values are contingent on the outcome
of some other event. An analyst would use a conditional expected value to
revise his expectations when new information arrives.
Consider the effect a tariff on steel imports might have on
the returns of a domestic steel producer’s stock in the previous example. The
stock’s conditional expected return, given that the government imposes the
tariff, will be higher than the conditional expected return if the tariff is
not imposed.
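The probability tree and conditional expected values can be sketched together. The branch probabilities and EPS values below are assumptions: they are illustrative numbers chosen so that the unconditional expected EPS equals the $1.51 quoted in the text, and they are not read from Figure 4.1.

```python
# Hypothetical tree: economy -> (P(economy), {results: (P(results | economy), EPS)}).
tree = {
    "good economy": (0.60, {"good results": (0.30, 1.80), "poor results": (0.70, 1.70)}),
    "poor economy": (0.40, {"good results": (0.60, 1.30), "poor results": (0.40, 1.00)}),
}

joint_probabilities = {}
conditional_expected_eps = {}
for economy, (p_economy, branches) in tree.items():
    # Conditional expected EPS, given the state of the economy.
    conditional_expected_eps[economy] = sum(p * eps for p, eps in branches.values())
    for results, (p_results, _eps) in branches.items():
        # Joint probability = P(economy) * P(results | economy).
        joint_probabilities[(economy, results)] = p_economy * p_results

# Unconditional expected EPS: weight each conditional expectation by P(economy).
expected_eps = sum(
    p_economy * conditional_expected_eps[economy]
    for economy, (p_economy, _branches) in tree.items()
)

for economy, value in conditional_expected_eps.items():
    print(f"E(EPS | {economy}) = ${value:.2f}")
print(f"E(EPS) = ${expected_eps:.2f}")
```

An analyst who learns which state of the economy has occurred would replace the unconditional figure with the matching conditional one.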
LOS 4.c: Calculate and interpret an updated probability in an investment setting using Bayes’ formula.
Bayes’ formula is used to update a given set of prior probabilities for a given event in response to the arrival of new information. The rule for updating the prior probability of an event A, given new information B, is as follows:
$$P(A \mid B) = \frac{P(B \mid A)}{P(B)} \times P(A)$$
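We can derive Bayes’ formula using the multiplication rule and noting that P(AB) = P(BA):
$$P(B \mid A) \times P(A) = P(BA) \quad \text{and} \quad P(A \mid B) \times P(B) = P(AB)$$
Because P(BA) = P(AB), we have P(A | B) × P(B) = P(B | A) × P(A), so
$$P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)}$$
and this equals P(AB) / P(B), the joint probability of A and B divided by the unconditional probability of B.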
The following example illustrates the use of Bayes’ formula. Note that A is outperform and A^C is underperform, P(BA) is (outperform + gains), P(A^C B) is (underperform + gains), and the unconditional probability P(B) is P(AB) + P(A^C B), by the total probability rule.
We sum the probability of stock gains in both states (outperform and underperform) to get 42% + 8% = 50%. Given that the stock has gains and using Bayes’ formula, the probability that the economy has outperformed is 42% / 50% = 84%.
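The updating step can be sketched as a small function. The joint probabilities of 42% and 8% come from the example. The inputs used to produce them (a 60% prior of outperformance, and gain probabilities of 70% and 20% in the two states) are assumptions chosen to be consistent with those joint probabilities. The function name is illustrative.

```python
def bayes_posterior(prior, p_info_given_event, p_info_given_not_event):
    """Updated probability of the event after observing the new information."""
    joint_event = prior * p_info_given_event                   # P(AB)
    joint_not_event = (1.0 - prior) * p_info_given_not_event   # P(A^C B)
    p_info = joint_event + joint_not_event                     # total probability rule
    return joint_event / p_info


# Assumed inputs consistent with the example's joint probabilities:
# 0.60 * 0.70 = 42% (outperform + gains), 0.40 * 0.20 = 8% (underperform + gains).
p_outperform_given_gains = bayes_posterior(
    prior=0.60,
    p_info_given_event=0.70,
    p_info_given_not_event=0.20,
)

print(f"P(outperform | gains) = {p_outperform_given_gains:.0%}")
```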
MODULE QUIZ 4.1
1. Given the conditional probabilities in the following table and the unconditional probabilities P(Y = 1) = 0.3 and P(Y = 2) = 0.7, what is the expected value of X?
A. 5.0.
B. 5.3.
C. 5.7.
2. An analyst believes that Davies Company has a 40% probability of earning more than $2 per share. She estimates that the probability that Davies Company’s credit rating will be upgraded is 70% if its earnings per share (EPS) are greater than $2, and 20% if its EPS are $2 or less. Given the information that Davies Company’s credit rating has been upgraded, what is the updated probability that its EPS are greater than $2?
A. 50%.
B. 60%.
C. 70%.
KEY CONCEPTS
LOS 4.a
The expected value of a random variable is the weighted average of its possible outcomes:
$$E(X) = \sum_{i=1}^{n} P(x_i)\, x_i = P(x_1) x_1 + P(x_2) x_2 + \dots + P(x_n) x_n$$
Variance can be calculated as the probability-weighted sum
of the squared deviations from the mean or expected value. The standard
deviation is the positive square root of the variance.
LOS 4.b
A probability tree shows the
probabilities of two events and the conditional probabilities of two subsequent
events:
Conditional expected values depend on the outcome of some other event. Forecasts of expected values for a stock’s return, earnings, and dividends can be refined, using conditional expected values, when new information arrives that affects the expected outcome.
LOS 4.c
Bayes’ formula for updating probabilities based on the occurrence of an event O is as follows:
$$P(\text{Event} \mid O) = \frac{P(O \mid \text{Event})}{P(O)} \times P(\text{Event})$$
Equivalently, based on the following tree diagram,
ANSWER KEY FOR MODULE QUIZZES
Module Quiz 4.1
1. B (LOS 4.a)
2. C This is an application of Bayes’ formula. As the following tree diagram shows, the updated probability that EPS are greater than $2 is
$$\frac{0.40 \times 0.70}{(0.40 \times 0.70) + (0.60 \times 0.20)} = \frac{0.28}{0.40} = 70\%$$