One question we are often asked in technical support is "how can a model that does so well during the optimization period fall apart in the out-of-sample period?" The answer is that any time you use optimization, you can "overfit" your model to the data in the optimization period. There is a very good tip called "Steve Ward's tips on preventing over-optimization" on www.ward.net in the Tips and Techniques section that addresses this problem with a variety of different solutions.
In this article we'll concentrate on the paper trading data set, which appears on the Dates tab of both the Prediction and Trading Strategy wizards in NeuroShell Trader. (If paper trading is not visible on the Dates tab of the Trading Strategy wizard, click on the Options button and choose Optimization Range Specification in the Dates Interface.) Click on the paper trading check box and the paper trading period appears in orange on the timeline.
This chart shows results for the optimization period in white/gray, paper trading in orange, and out-of-sample data in green.
When you choose paper trading, the model's parameters are still optimized on the gray optimization data set, but each new optimal solution found by the genetic algorithm (GA) is also applied to the paper trading set. If that solution gets better results on the paper trading set than previous optimal solutions did, it is saved as the 'best model'. Optimal solutions that underperform on the paper trading set are still used by the GA as it continues searching for an optimal solution on the optimization data set, but they are not saved as the 'best model'. The final model selected by the optimization is the last saved 'best model'.
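The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not NeuroShell Trader's actual code: the function name and the two scoring callbacks are hypothetical, and a real GA works with whole populations of candidates rather than a simple sequence.

```python
def select_best_model(candidates, score_optimization, score_paper):
    """Return the candidate saved as the 'best model'.

    candidates: parameter sets in the order the optimizer proposes them.
    score_optimization / score_paper: hypothetical callbacks that return
    a result (higher is better) on the optimization and paper trading sets.
    """
    best_opt = float("-inf")    # best result so far on the optimization set
    best_paper = float("-inf")  # best paper trading result among saved models
    best_model = None

    for params in candidates:
        opt = score_optimization(params)
        if opt <= best_opt:
            continue  # not a new optimal solution, so it is never tested on paper
        best_opt = opt  # the optimizer keeps this solution either way

        paper = score_paper(params)
        if paper > best_paper:
            # Only solutions that also improve on paper trading are saved.
            best_paper = paper
            best_model = params

    return best_model
```

For example, if four candidates score 10, 20, 15 and 30 on the optimization set and 5, 8, 100 and 6 on the paper trading set, the second candidate is returned. The third never becomes a new optimum, and the fourth improves the optimization result but does worse on paper trading.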
How Much Data Do You Include in Paper Trading?
When you choose paper trading, the default setting splits the data loaded in the chart so that the oldest half of the data becomes the optimization set, and remaining data becomes the paper trading set.
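As a rough illustration of the default split, here is a minimal Python sketch. The function name is hypothetical, and the handling of an odd number of bars is an assumption; the article does not say which half receives the extra bar.

```python
def default_split(bars):
    """Split chart data (oldest bar first) into two halves.

    Returns (optimization_set, paper_trading_set). With an odd number of
    bars, the extra bar goes to the paper trading set (an assumption).
    """
    mid = len(bars) // 2
    return bars[:mid], bars[mid:]
```

With 1,000 bars loaded in the chart, this gives 500 bars for optimization and the most recent 500 for paper trading.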
Note that there is a third option to create an out-of-sample data set called "Trading", but we'll talk about that in a minute.
But back to paper trading. Is using half of your data for paper trading the best practice for building your model?
It All Depends
First take a look at your data, no matter what type of bars you are using. Do the range and direction of the optimization period match those of the paper trading period? Are there similar peaks and valleys in both data sets? If so, deciding where to break the data doesn't matter that much. If the data in the two periods doesn't match, you may want to look for a shorter paper trading period that matches the majority of the data in your optimization period. This choice has the added advantage of training a model that should more closely match current market conditions.
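If you export the closing prices for each period, a quick numeric check can back up the visual comparison. The sketch below is only one possible way to do it: the function names are hypothetical, and comparing low, high and net percentage change is our own simplification of "range and direction", not a feature of NeuroShell Trader.

```python
def summarize(closes):
    """Return the range and net direction of a list of closing prices."""
    first, last = closes[0], closes[-1]
    return {
        "low": min(closes),
        "high": max(closes),
        # Positive means the period ended higher than it started.
        "net_change_pct": (last - first) * 100.0 / first,
    }


def same_direction(opt_closes, paper_closes):
    """True if both periods moved the same way from start to finish."""
    a = summarize(opt_closes)["net_change_pct"]
    b = summarize(paper_closes)["net_change_pct"]
    return (a > 0) == (b > 0)
```

If the two summaries differ sharply, for example a rising optimization period followed by a falling paper trading period, that is a hint to try a different break point.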
Another option is to choose a paper trading period that reflects market conditions you want to be able to identify. To use this option, you have to enter the start and end dates for the paper trading period. To enter the start date, select "Specify Date" from the drop down box in the paper trading section.
To specify an end date rather than using the end of the chart data, turn on the option for "Start trading before last chart date" in the Dates tab. (If you don't see this option listed, go to the Trader Tools Menu, Options, and select the Advanced tab. Under Date Interface Settings, click on the check box to "Allow real trading to begin before last chart date".) The rest of the data in your chart (displayed in green) will not be used to build the model, but you can use it to gauge real world performance.
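The three-way division that results from specifying both dates can be pictured with a short Python sketch. The function name, the (date, close) bar layout, and the choice to put a bar that falls exactly on a boundary date into the later set are all assumptions made for illustration.

```python
from datetime import date


def split_by_dates(bars, paper_start, trading_start):
    """Divide bars into optimization, paper trading and trading sets.

    bars: list of (date, close) tuples, oldest first.
    paper_start: first date of the paper trading period (orange).
    trading_start: first date of the out-of-sample trading period (green).
    """
    optimization = [b for b in bars if b[0] < paper_start]
    paper = [b for b in bars if paper_start <= b[0] < trading_start]
    trading = [b for b in bars if b[0] >= trading_start]
    return optimization, paper, trading
```

Only the first two sets play a part in building the model; the third is held back to judge how the finished model behaves on data it has never seen.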
If your chart is based on intraday data, you may want to skip having a paper trading data set. The theory behind this choice is that there is enough diversity in intraday price movement to cover all market conditions. You might want to watch the model for a day or two, and then trade it for a few days. Then reoptimize the model on all of the data up to the present, watch it for a day or two, and trade again. Repeat as needed.
The exception to the rule for intraday data is when you want to trade only certain hours in a trading day, such as London market hours that overlap with the US. You're creating a more specialized model that might provide better results by using paper trading.
Next newsletter: Using the Power User batch processing and walk forward features to decide the size of the paper trading set.