The Optimization Paradox
By Curtis Faith, an Original Turtle; partner in Galt Capital,
LLP; managing member of Acceleration Capital, LLC; and founder
of Turtle Trading Software
When discussing hypothetical trading results with people, I've
noticed something that gets little attention but can be quite
misleading, something I call the Optimization Paradox.
The Optimization Paradox:
Proper optimization results in a system that is more likely to
perform well in the future, but is less likely to perform as
well as the simulation indicates. So optimization improves the
performance of the system while decreasing the predictive
accuracy of the historical results.
I believe that an incomplete understanding of this paradox and
its causes has led many to shy away from optimizing systems out
of a fear of over-optimizing or curve-fitting a system.
However, I contend that proper optimization is always
desirable.
NOTE: A complete discussion of the process of Proper
Optimization is beyond the scope of this article.
What is a Parameter?
A well-known commercial system is touted by its developer as a
better system because it has only "one parameter".
While the developer may have optimized only one parameter, I
believe there are, in fact, many parameters. Many constant
values used in the system, like 2, or 2%, or 5, etc. are
actually parameters that have not been optimized (or perhaps
the optimization has been hidden from the purchasers).
The system really has more like five or six parameters, even if
the developer does not make it clear, or even believe himself,
that these are parameters.
Consider a simple moving average crossover system:
Using a simple 80 day moving average, buy at the next open if
the price closes over the moving average, sell if it closes
below the moving average. The entry stop is 2 ATR below the
moving average for longs, 2 ATR above for shorts. Bet size is
2% of total equity.
How many parameters are there in this system? Many people would
answer one parameter, the number of days in the moving average.
I'd answer differently. First, there is nothing magical about
the price crossing the moving average. Just because we have
decided that the exact price of the moving average is the
threshold to buy does not mean that one couldn't choose other
prices related to the moving average, say 1/2 ATR higher, or
1 ATR higher, etc.
Second, in the stop of 2 ATR, the value 2 is a parameter. So is
the 2 in the bet size of 2%. There is nothing magical about the
2; one could just as easily use 1% or 1.5%. The 2 is just one
value of many that could have been used. Each of these
placeholders for values is a parameter.
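One way to make the hidden parameters visible is to write the
rules down with every constant named. The following is only a
sketch: the class and field names are hypothetical and do not
come from any trading package. The default values are the ones
given in the rule description above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CrossoverParams:
    """Every constant in the crossover rules, named explicitly.

    All names here are hypothetical, chosen for illustration.
    """
    ma_days: int = 80              # length of the simple moving average
    entry_offset_atr: float = 0.0  # entry threshold: distance from the average, in ATR
    stop_atr: float = 2.0          # stop distance from the average, in ATR
    bet_size_pct: float = 2.0      # percent of total equity per bet


def parameter_names(params):
    """Return the name of every value that could be varied in a test."""
    return [f.name for f in fields(params)]


params = CrossoverParams()
print(len(parameter_names(params)), parameter_names(params))
```

Counted this way, the "one parameter" system described above
has at least four values that could be varied.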
The Benefits of Optimization
Optimization is always beneficial when done correctly and
accompanied by a mature understanding of its implications. The
basic reason is that it is always better to understand the
performance characteristics of a parameter than to be ignorant
of them.
Optimization is simply the process of discovering the impact on
the results of varying a particular parameter across different
values; then using that information to make an informed
decision about which specific parameter value to use in actual
trading.
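In code, this process is nothing more than a loop over
candidate values. The sketch below assumes a scoring function
that returns one performance number per parameter value. Here
that function, toy_score, is a synthetic stand-in with a single
peak, not a real backtest, and both function names are
hypothetical.

```python
def sweep(evaluate, values):
    """Evaluate every candidate value and return (value, score) pairs."""
    return [(v, evaluate(v)) for v in values]


def toy_score(value):
    # Synthetic stand-in for a real backtest: one peak at 0.5.
    # It is NOT derived from any market data.
    return 4.0 - 10.0 * (value - 0.5) ** 2


candidates = [round(0.1 * i, 1) for i in range(11)]  # 0.0, 0.1, ..., 1.0
results = sweep(toy_score, candidates)
best_value, best_score = max(results, key=lambda pair: pair[1])

for value, score in results:
    print(f"{value:.1f}  {score:.2f}")
print("best value:", best_value)
```

The full list of results is the useful output, not just the
best value, because it shows how sensitive the system is to
the parameter.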
Using parameter values in actual trading that result from
proper optimization should increase the likelihood of getting
good results in actual trading in the future.
A specific example will help. Consider the rules to the
Original Turtle System, which I and others have made available
free on the site: www.originalturtles.org.
The "Unit Add in N" Parameter
One of the rules is that positions should be put on piecemeal;
each part was called a unit. The second part of a position was
added 1/2 ATR (Average True Range, which was known as N to the
Turtles) after the initial position, and subsequent parts each
1/2 ATR later. In the Turtle System, the 1/2 in the position
addition rule is a parameter.
Now, consider a test of this parameter, known as "Unit Add in
N", made using the VeriTrader Demo. The following is a graph
of the values of two measures of system performance, MAR and
Sharpe Ratio, as the value of "Unit Add in N" varies from 0.0
ATR (meaning all the units are put on at once) to 1.0 ATR:
Notice how the results for the 0.5 ATR value are the peak for
this test. In fact, the results for 0.6 ATR are significantly
worse: the MAR ratio drops from over 4 to approximately 2.8.
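For readers unfamiliar with the two measures, the sketch below
uses their common textbook definitions: MAR as compound annual
growth rate divided by maximum drawdown, and Sharpe Ratio as
mean excess return divided by the standard deviation of
returns, annualized. This is an assumption about the
definitions, not a description of how VeriTrader computes them.
The equity series is made up for illustration, and the function
names are hypothetical.

```python
import math


def cagr(equity, periods_per_year):
    """Compound annual growth rate of an equity curve."""
    years = (len(equity) - 1) / periods_per_year
    return (equity[-1] / equity[0]) ** (1.0 / years) - 1.0


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def mar_ratio(equity, periods_per_year):
    """CAGR divided by maximum drawdown (undefined if there is no drawdown)."""
    return cagr(equity, periods_per_year) / max_drawdown(equity)


def sharpe_ratio(equity, periods_per_year, risk_free_per_period=0.0):
    """Annualized mean excess return over sample standard deviation."""
    returns = [b / a - 1.0 for a, b in zip(equity, equity[1:])]
    excess = [r - risk_free_per_period for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


# Made-up yearly equity values, for illustration only.
equity = [100.0, 120.0, 90.0, 150.0]
print("max drawdown:", max_drawdown(equity))
print("MAR:", round(mar_ratio(equity, periods_per_year=1), 3))
print("Sharpe:", round(sharpe_ratio(equity, periods_per_year=1), 3))
```

Both measures reward return per unit of risk, which is why a
parameter value can raise one while lowering the other.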
Unoptimized = Arbitrary
Now, returning to our premise that optimization is beneficial,
suppose we had not considered optimizing "Unit Add in N" and
had started with a value of 1.0, a nice round number. We would
have been leaving a lot of money on the table, and would have
subjected our trading to much greater drawdowns than a 0.5
value would have provided. Not optimizing is simply leaving
things to chance through ignorance.
Having done the optimization, we now have a much greater
understanding of the performance ramifications of the "Unit Add
in N" parameter and how the results are sensitive to this
parameter. We know that it is important not to wait too long,
that if the markets move 0.5 ATR, we should act immediately and
add another unit. We know that waiting even a little bit longer
will likely result in decreased performance.
Avoiding this optimization research because we were afraid of
over-optimizing or curve-fitting would have deprived us of a
good deal of useful knowledge, knowledge that could materially
improve our trading results.
The Flip Side: Decreased Predictive Accuracy
Now consider a few more parameters (again we'll use results
from VeriTrader Demo so the reader can experiment directly with
these concepts):
The "Stop in N" Parameter
The Stop for the Turtle System was expressed in ATR from the
entry point. The following is a graph of the MAR for various
values of the "Stop in N" parameter:
Notice how a value of 2.0 for the stop ATR shows the highest
MAR at slightly more than 3.0.
The "Max Directional Units" Parameter
The Turtle System had a limit to the number of units which
could be put on in a given direction, long or short, called in
the VeriTrader Demo the "Max Directional Units". The following
graph shows the MAR and Sharpe Ratio for various values of this
parameter:
The "Max Directional Units" value of 10 units is significantly
better than any other value, with the highest MAR and Sharpe
Ratio. Notice the steep drop-off between 10 units and 11 units.
The "Entry Failsafe Breakout" Parameter
One of the most important, but least understood rules of the
Turtle System involved an early entry when the last trade had
been a loser. In the event that the last breakout was a
profitable breakout, there was a late breakout called the
"Failsafe Breakout" in the Original Turtle System rules. This
breakout was to be taken irrespective of the profitability of
the last breakout (see the Original Turtle Rules for more
details).
This parameter exhibits a broader region of high result values,
with the highest value corresponding to an "Entry Failsafe
Breakout" of 65. The best value, it might be argued, is
actually 60, since it sits in the center of the region of
higher results even though it does not represent the highest
value (65 has a better MAR; 55 has both a better MAR and a
better Sharpe Ratio).
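One simple way to express "the center of the region of higher
results" is to score each candidate by the average of itself
and its neighbors rather than by its own result alone. The
sketch below is only one possible approach, with hypothetical
function names. The scores are made-up numbers chosen to show
the effect; they are not read from the VeriTrader graph.

```python
def neighborhood_score(scores, index, radius=1):
    """Average score of a candidate and its neighbors within `radius`."""
    lo = max(0, index - radius)
    hi = min(len(scores), index + radius + 1)
    window = scores[lo:hi]
    return sum(window) / len(window)


def pick_plateau(values, scores, radius=1):
    """Pick the value whose neighborhood has the best average score."""
    best = max(range(len(values)),
               key=lambda i: neighborhood_score(scores, i, radius))
    return values[best]


# Hypothetical results for five candidate breakout lengths.
values = [50, 55, 60, 65, 70]
scores = [1.0, 2.6, 2.5, 2.8, 1.2]

raw_peak = values[scores.index(max(scores))]
print("highest single result:", raw_peak)
print("center of the best region:", pick_plateau(values, scores))
```

With these numbers the highest single result is at 65, but its
neighbor at 70 is poor, so the neighborhood average favors 60.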
The Basis of Predictive Value
A historical test has predictive value to the extent that it
shows performance which a trader is likely to encounter in the
future. The more the future is like the past, the more future
trading results will be similar to historical simulation
results.
The main problem with using historical testing as a means of
system analysis is that the future will never be exactly like
the past. To the extent that a system captures its profits from
the effects of unchanging human behavioral characteristics that
reflect themselves in the market, the past offers a reasonable
approximation of the future, but never an exact one.
The historical results of a test which is run using all
optimized parameters represent a very specific set of trades:
those trades that would have resulted had the system been
traded with the very best parameters. The corresponding
simulation results represent a best-case view of the past. One
should expect to get these results in actual trading only if
the future corresponds exactly to the past.
It won't!
Now consider the above parameter graphs. Each of these graphs
has a shape like the top of a mountain with a peak value. One
might represent a given parameter with the following graph:
Two Example Parameters
If the value at point A represents a typical non-optimized
parameter value, and the value at point B represents an
optimized parameter, I argue that B represents a better
parameter value to trade but one where the future actual
trading results will likely be worse than those indicated by
historical tests. Parameter A is a worse parameter to trade but
one with better predictive value because, if the system is
traded at that value, future actual trading results are just as
likely to be better as to be worse than those indicated by the
historical tests using value A for the parameter.
Why is this?
To make this clearer, let's assume that the future will vary
such that it is likely to shift the graph slightly to the left
or the right; we don't know which. The following graph shows A
and B with a band of values to the left and right. These bands,
which we'll call Margins of Error, represent the possible
shifts due to the future being different from the past:
Two Parameters with Margins of Error
In the case of value A, any shift of the graph to the right,
which would cause the value of A to move left on the graph,
will result in worse performance than point A; any shift of the
graph to the left will result in better performance. So A
represents a decent predictor irrespective of how the future
changes, since it is just as likely to be under-predicting the
future as over-predicting it.
The same is not the case with value B. In all cases, any shift
at all, either to the left or the right, will result in worse
performance. This means that a test run with a value of B is
very likely to be over-predicting future results.
When this effect is compounded across many different
parameters, the effect of a drift in the future will also be
compounded, meaning that with many optimized parameters it
becomes more and more unlikely that the future will be as
bright as the predictions of the testing using those optimized
parameters.
Important Note: This does not mean that we should instead use
value A in our trading. Even in the event of a sizeable shift,
the values around the B point are still higher than those
around the A point.
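The argument about A and B can be checked with a few lines of
arithmetic. The sketch below assumes a mountain-shaped curve
with straight slopes and a symmetric set of equally likely
shifts. The curve, the shift sizes, and the positions of A and
B are all hypothetical choices made for illustration.

```python
def performance(x, peak=0.0):
    # Hypothetical mountain-shaped curve: straight slopes meeting at a peak.
    return 4.0 - 2.0 * abs(x - peak)


def average_after_drift(x, shifts):
    # Average result if the whole curve shifts by each amount, equally likely.
    return sum(performance(x, peak=s) for s in shifts) / len(shifts)


SHIFTS = [-0.5, -0.25, 0.0, 0.25, 0.5]  # assumed symmetric margin of error
A, B = -1.0, 0.0                        # A on the left slope, B at the peak

report = {}
for name, x in (("A", A), ("B", B)):
    tested = performance(x)                    # what the historical test shows
    realized = average_after_drift(x, SHIFTS)  # average over possible futures
    report[name] = (tested, realized)
    print(f"{name}: tested {tested:.2f}, average after drift {realized:.2f}")
```

Under these assumptions A tests at 2.00 and averages 2.00 after
drift, while B tests at 4.00 and averages 3.40. The test at B
over-predicts, yet B is still the better value to trade.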
Now returning to the parameter "Unit Add in N":
Note how the results drop off steeply to the right of the 0.5
ATR value. In the event of drift, a 0.5 ATR value is a somewhat
risky choice for trading: if the future is slightly different
and the optimal value shifts slightly lower, there might be a
significant drop-off in actual results, corresponding to the
drop shown here between 0.5 ATR and 0.6 ATR.
The mitigating factor in this particular case is the fact that
the 0.5 value is the original value given by Richard Dennis.
It was optimal 20 years ago, and it has held up extremely well
over many years. In fact, I can't recall a single test of the
Turtle System of the hundreds or thousands that I have made
over the years with many different markets, including stocks,
where a value other than 0.5 had the best results.
Factors that Affect Drift
Several factors affect the drift (width of the Margin of Error)
in historical test results for parameter values over time, and
hence the predictive value of optimized parameter tests:
Number of Markets - Tests run with more markets will display
less drift than those run with fewer markets. Tests optimized
over the portfolio will have much less drift than those where
the optimization has been done on a market specific basis.
Amount of Data - Tests run over longer periods will have less
drift than those run over shorter periods.
Market Conditions of Test - Tests run over different types of
markets will drift less than those run over one specific type
of market; e.g. compare stock system tests run only over the
last years of the bull market with tests run over the last 20
years.
For example, the VeriTrader Demo results are tested over a
relatively small number of markets (15) and over a fairly short
time period (less than six years). For this reason the
parameters are likely to drift significantly in the future,
making it very unlikely that one would be able to achieve the
results indicated using the optimized parameter values. Running
the same tests and optimization process over more markets and
over a much longer period will generate results that are much
better predictors of potential future results.
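A toy simulation suggests one reason why more data narrows the
Margin of Error. It assumes that each measured result is a
fixed underlying effect plus independent random noise, and that
averaging over n observations shrinks the noise by the square
root of n. Real markets need not behave this way, and the
curve, noise level, and function names are all hypothetical.

```python
import random
import statistics

GRID = [round(0.1 * i, 1) for i in range(11)]  # candidate values 0.0 .. 1.0
TRUE_PEAK = 0.5                                # hypothetical true optimum


def true_score(x):
    # Hypothetical underlying effect of the parameter (not market data).
    return 4.0 - 2.0 * abs(x - TRUE_PEAK)


def measured_optimum(n_observations, rng):
    # Averaging n noisy observations shrinks the noise by sqrt(n).
    noise_sd = 1.0 / n_observations ** 0.5
    measured = {x: true_score(x) + rng.gauss(0.0, noise_sd) for x in GRID}
    return max(measured, key=measured.get)


def optimum_spread(n_observations, trials=500, seed=1):
    # Standard deviation of the measured optimum over repeated tests.
    rng = random.Random(seed)
    optima = [measured_optimum(n_observations, rng) for _ in range(trials)]
    return statistics.pstdev(optima)


short_history = optimum_spread(n_observations=4)
long_history = optimum_spread(n_observations=400)
print(f"spread of measured optimum, short history: {short_history:.3f}")
print(f"spread of measured optimum, long history:  {long_history:.3f}")
```

In this toy model the optimum found from the short history
wanders across the grid, while the one found from the long
history stays close to the true peak.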
Conclusion
The Optimization Paradox has been the source of much confusion.
It is also the source of much deception and scamming. Many
unscrupulous system vendors have exploited the very high
returns and incredible results made possible through
optimization, especially over shorter periods of time using
market-specific optimization.
However, just because optimization can result in tests that
overstate likely future results, this does not mean that
optimization should not be done. Indeed, optimization is
critical to the building of robust trading systems.
In a future article, I will discuss how to improve the
predictive value of optimized tests to compensate for the
Optimization Paradox.