I started over this time generating 2,520 random returns [...] to represent ten years worth of data. I then calculated the compound return that a $1 investment would have earned had it been invested in this asset class for the whole 10 years [(1*(1+ Day 1 Return)*(1+ Day 2 Return) and son on]. Let’s say the results of this calculation was 2.158925, which is the dollar return over the 10 years, including the original investment. To get the annual return I did 2.158925^(1/10)-1 which gave me 8% annual return, which is far away from the 15% I was hoping for. However with all these calculations already in Excel cells and the RAND() function in place, every time I pressed F9 (recalculate) all the random numbers were recalculated, and within a few tries, I had a data series with a an annual return of 15.1% and an SD of 29.8%
I think there is a more reliable and direct way to ensure that random daily returns have an annual geometric mean of 15% and a
geometric sd of 30%.
There are many details to explain, if you are interested. But since I do not know that you are still monitoring this discussion, initially I will just provide an outline of the approach. Feel free to ask for details.
Note: I return to your original objective of generating 40 years of normally-distributed of daily returns. Thus, using 252 trade days per year, there are 10,080 daily returns.
First, we generate random
annual returns that are normally-distributed with a geometric mean of exactly 15% and a geometric sd of exactly 30%.
Second, for each year, we generate
daily returns that are normally-distributed with a geometric mean that compounds to the annual return for that year.
Download the file "houman norm dist2.xls" (
click here) [1].
Since the two-step process might be tedious to repeat for several assets or asset classes, we might implement the details in a macro. But for now, I use Excel formulas to demonstrate the concept.
For the first step, we might have the following design in the 'annual' worksheet:
Rich (BB code):
Formulas:
A2: =RAND()
B2: =NORMINV(A2,$G$2,$G$3)
C2: =$G$5*STANDARDIZE(B2,$G$6,$G$7) + $G$4
E2: =EXP(C2)-1
G4: =LN(1+G2)
G5: =LN(1+G3)
G6: =AVERAGE(B2:B41)
G7: =STDEVP(B2:B41)
G8: { =PRODUCT(1+E2:E41) - 1 }
G9: =(1+G8)^(1/40)-1
G10: { =EXP(STDEVP(LN(1+E2:E41))) - 1 }
H9: =G9-G2-0
H10: =G10-G3-0
Copy A2:C2 down through A41:C41
Copy E2 down through E41
"-0" in the formulas in H9:H10 thwart a dubious heuristic implemented in Excel whereby Excel
sometimes replaces actual arithmetic results with exactly zero if the result is otherwise
"close enough" to zero.
Although I show =RAND() in column A, I actually prefer to generate the constant RAND results in a macro so they do not change unintentionally.
Note that the formulas in G8 and G10 are
array-entered. Do not type the curly braces. Excel will display them when we
press ctrl+shift+Enter instead of just Enter.
The normal distribution is a distribution of data around a specified
arithmetic mean with a specified
arithmetic sd. The (natural) log of the geometric mean and sd of a compounded series of returns over time is the arithmetic mean and sd of the (natural) log of the returns.
So in G4 and G5, we calculate the "log mean" and "log sd" of the target geometric mean and sd (G2 and G3). And in column B, we generate a random sample from the normal distribution of log returns based on G4 and G5.
Since that is a random sample, its mean and sd usually does not exactly match the target log mean and log sd.
So we calculate the original log mean and log sd in G6 and G7. And in column C, we map the original random sample into an exact random sample by the fact that the original z-score = (B2-$G$6)/$G$7 = (C2-$G$4)/$G$5 = the exact z-score.
Finally, in column E, we convert the exact random sample of log returns into the exact random sample of annual returns.
The formulas in G8:G10 demonstrate that the random sample in column E has a compounded total return (G8) with a geometric mean of 15% (G9) and geometric sd of 30% (G10).
The infinitesimal differences in H9:H10 demonstrate that G9:G10 truly are about the same as target parameters in G2:G3.
I suspect you actually want an arithmetic sd of 30%. But that would mean the sd of the log returns is 30%, not the sd of the annual returns. That is a detail that we can fix easily, if you wish.
For the second step, we might have the following design in the 'daily' worksheet:
Rich (BB code):
Formulas:
A2: =RAND()
B2: =NORMSINV(A2)
C2: =1 + INT((ROWS($B$2:B2)-1)/252)
D2: =INDEX($G$4:$AT$4,C2)
* STANDARDIZE(B2,INDEX($G$7:$AT$7,C2),INDEX($G$8:$AT$8,C2))
+ INDEX($G$5:$AT$5,C2)
E2: =EXP(D2)-1
G1:AT1: 1 through 40
G2: =INDEX(annual!$E$2:$E$41,G1)
G3: =(1+G2)^(1/252) - 1
G4: =annual!$G$3 / SQRT(252)
G5: =LN(1+G3)
G6: =LN(1+G4)
G7: =AVERAGE(OFFSET($B$2,252*(G1-1),0,252))
G8: =STDEVP(OFFSET($B$2,252*(G1-1),0,252))
G9: { =PRODUCT(1+OFFSET($E$2,252*(G1-1),0,252)) - 1 }
G10: { =PRODUCT(1+E2:E10081) - 1 }
G11: =(1+G10)^(1/40) - 1
G12: { =(EXP(STDEVP(LN(1+E2:E10081))) - 1) * SQRT(252) }
Copy A2:E2 down through A10081:E10081
Copy G2:G9 across through AT2:AT9
Again, although I show =RAND() in column A, I actually prefer to generate the constant RAND results in a macro so they do not change unintentionally.
And the formulas in G9, G10 and G12 are
array-entered. Do not type the curly braces. Excel will display them when we
press ctrl+shift+Enter instead of just Enter.
In G3, the target mean is the daily geometic mean of the target annual return in G2.
In G4, the target sd is intended to be the estimated daily geometric sd, based on the target annual geometric sd. I use your estimate, namely =annual!$G$3/SQRT(252). But I think the square-root-of-time rule applies to the arithmetic sd of the log returns. So the formula should be =EXP(LN(1+annual!$G$3)/SQRT(252)-1.
However, mathematically, we can actually use any daily geometric sd, notwithstanding the often-cited square-root-of-time rule.
In column B, we generate a trial random sample from the normal distribution. We do it differently than the method in the 'annual' worksheet. But the effect is equivalent.
G5:G8 are calculated and used in a manner similar to G4:G7 in the 'annual' worksheet, which is explained above.
And the exact distribution of log returns in column D and the actual daily returns in column E are calculcated in a manner similar to columns C and E in the 'annual' worksheet, which is also explained above.
The formulas in G9:AT9 demonstrate that each year's random sample of daily returns does indeed have a compounded total return that is the same as the random annual return.
And the formulas in G10:G11 demonstrate that the 40-year random sample of daily returns has a compounded total return with an annual geometric mean of 15%.
Note that the annual geometric sd of the compounded total return (G12) is not exactly 30%. I believe that is to be expected. In fact, if we choose a random target geometric sd for each year, G12 will be very different.
The annual geometric sd should be calculated from the random annual returns, not from the random daily returns.
-----
[1] https://app.box.com/s/pgkut63i4894ck92nwwq0gnvwv0ww39m