The trading simulator is the part of the trading platform that takes the programmed rules of a trading system and computes the simulated paper trades over a time interval. It exists because of the speed, efficiency, and accuracy with which it can perform thousands of computations.
Forex traders usually get MT4 for free when opening a trading account, and MT4 includes what it calls a “strategy tester” which is accessible by clicking a magnifying glass button or Ctrl-R.
All trading simulators generate outputs containing a considerable amount of information about the performance of the system on a particular market. Data about gross profit, net profit, maximum drawdown, percent winners, mean and max profit and loss, mean reward to risk, return on account, etc. should be present in a general report.
In well-designed simulators, the output is presented as a multi-page report, with text and graphics depicting relevant information about the system.
An example of a summary report of the MT4 Strategy Tester is shown in Fig. 1.
MetaTrader’s summary report is of average quality, but since it’s widely used, let’s discuss its main components. The report shows the minimum needed to judge whether our strategy is worthwhile or junk. The main parameters are listed below.
- Initial deposit: The initial amount of paper money in the account. It only matters as the initial reference for testing.
- Total net profit: The difference between “Gross profit” and “Gross loss”.
- Gross profit: The sum of all profits on profitable trades.
- Gross loss: The sum of all losses on unprofitable trades.
- Profit factor: The ratio of the gross profit to the gross loss. A fundamental metric: a value above 1 means the system made more than it lost.
- Expected payoff: The monetary expectancy of the system. The average monetary value of a trade. One of the most important values. It should be greater than zero for a positive expectancy.
- Absolute drawdown: The largest drop of the balance below the initial deposit amount.
- Maximal drawdown: The largest drawdown taken from a local maximum of the balance. It’s essential, because an otherwise good strategy might be useless if it has several 50%+ drawdowns. You need to set your mark at the maximum level you’re able to tolerate.
Even if the system’s results are below your mark, it is advisable to run Monte Carlo permutations (preferably more than 10,000) to study the probability of those drawdowns closely. Be aware, too, that this value changes with position size: it increases as you increase your position and, consequently, your risk.
- Relative drawdown: The maximum percent loss relative to the account balance at the local maximum. The same as above, but expressed as a percentage.
- Total trades: Total number of trades. It is important to get at least 100 trades to get a statistically good approximation. This value is of interest, also, when comparing systems. When multiplied by the expected payoff it should return the total net profit value.
- Short positions (won %): The number of short positions and the percentage of them that were profitable. It shows how good the system is on the short side. You should watch for any asymmetry compared to long positions and analyze why it might occur.
- Long positions (won %): The number of long positions and the percentage of them that were profitable.
- Profit trades (% of total): The total number and percentage of profitable trades. Technically, a high value here is not that important as long as there is a good profit factor, but psychologically it matters. For example, many trend-following systems have no more than 35% winners. Therefore, you should decide whether you’re able to accept just one winner out of three or you’re more comfortable with higher values at the expense of lower reward-to-risk ratios on trades.
The percentage of profitable trades allows us to compute the probability of a winning streak and, when multiplied by the average profit, the expected monetary size of that streak.
The probability of a winning streak of length n is %Profit raised to the power of n.
Probability of an n-winning streak:
$$PW_n = (\%\text{Profit})^n$$
Then, the expected profit on a streak of n winners is
$$\text{Expected profit}_S = PW_n \times \text{Average profit trade}$$
The figure below shows the probability curve of a run-up streak of length n in a system with 58% profitable trades.
- Loss trades (% of total): The number and percentage of unprofitable trades. This value is computed as 1 − %Profit. Besides its psychological effect on traders, it allows us to directly compute the probability of a losing streak in the same way as for winning streaks.
Probability of an n-losing streak:
$$PL_n = (\%\text{Loss})^n$$
We should remember that
$$\%\text{Loss} = 1 - \%\text{Profit}$$
Therefore, the probability of a loss on a system with 45% winners is 55%.
And the expected loss on that streak will be:
$$\text{Expected loss}_S = PL_n \times \text{Average loss trade}$$
Below, the distribution for a system with 48% losers.
Thus, using these two metrics, %Profit and %Loss, we can build a distribution curve of run-ups and drawdowns and plot it as an alternative to a full Monte Carlo simulation, gaining a more in-depth insight into what to expect from the system in terms of run-ups and drawdowns.
Above, the distributions of run-ups and drawdowns of the same system, normalized to the risk taken, called R, which matches the average loss.
These figures were produced by a simple algorithmic computation using Python and the matplotlib plotting library.
Let’s say that, in this case, our average loss is 500 €, and that we are using a system in which, according to Fig. 2d, there is a 2.5% likelihood of hitting a 5R drawdown. Therefore, if we have a 10,000 € account, there is a 2.5% chance that it will reach a 2,500 € loss (5 × 500 €), a 25% drawdown. If we increase our risk to 1,000 € per trade, the same 5R drawdown becomes 5,000 €, a 50% drawdown. Conversely, if we don’t like a 25% maximum drawdown, we should reduce the size of our trades to set the risk according to the maximum drawdown we are willing to accept.
As we see, this kind of analysis is much richer than a mere maximum drawdown measure, because we know the actual probability of a drawdown of a given length and its monetary size, and it is linked to the maximum allowed risk.
- Largest profit trade: The largest profitable trade. We need to analyze large trades, evaluate whether they are accidental outliers, and assess their contribution to the profit curve. If, for example, our profit is due to a small number of random outliers and the rest of the profitable trades barely clear the break-even mark, we should be cautious about the future profitability of the system.
- Largest loss trade: The largest losing trade. The largest losing trades give us hints about the positioning of our stop-loss levels. If we get sporadic but large losses, we need to check whether we have bad historical data or suffer from gaps or spikes that need to be corrected.
- Average profit trade: The result of dividing the gross profit by the number of profitable trades. An important metric to be used in conjunction with the statistical method described above.
- Average loss trade: The result of dividing the gross loss by the number of losing trades. The average loss describes our mean risk per trade, and we must be aware of the implications of that figure: we need to be prepared for more than 5 consecutive losses and the associated risk, as previously discussed.
- Maximum consecutive wins (profit expressed as money): The longest streak of profitable trades and its total profit.
- Maximum consecutive losses (loss expressed as money): The longest streak of unprofitable trades and its total loss. As with the previous point, these values are better analyzed using the statistical method described above.
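The streak and drawdown arithmetic described above can be sketched in a few lines of Python. This is a minimal sketch, not the code used for the article's figures: the function names `streak_probability` and `drawdown_in_money` are hypothetical, and the model assumes that trades are independent and that the win rate stays constant.

```python
def streak_probability(p: float, n: int) -> float:
    """Probability that n consecutive trades all have the outcome whose
    single-trade probability is p (assumes independent trades)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return p ** n


def drawdown_in_money(r_multiple: float, risk_per_trade: float, account: float):
    """Convert a drawdown expressed in R multiples into money and into a
    fraction of the account."""
    loss = r_multiple * risk_per_trade
    return loss, loss / account


if __name__ == "__main__":
    pct_profit = 0.58              # example win rate taken from the text
    pct_loss = 1.0 - pct_profit    # %Loss = 1 - %Profit
    for n in range(1, 8):
        print(n,
              round(streak_probability(pct_profit, n), 4),
              round(streak_probability(pct_loss, n), 4))

    # The worked example from the text: a 5R drawdown with R = 500 EUR
    # on a 10,000 EUR account.
    print(drawdown_in_money(5, 500, 10_000))   # (2500, 0.25)
```

Plotting the two probability columns against n gives curves of the kind the figures describe; the independence assumption is what makes the simple power formula valid.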
Other parameters that would have been handy, but are not included:
- Standard deviation of the expected payoff: This value is only computable by exporting the results list and performing the computation on a spreadsheet or Python notebook.
- Win/loss ratio: The ratio of the mean winner to the mean loser, easily computed since those two values are shown.
- Sharpe ratio or similar quality metric: Computed from Expected payoff and its Standard deviation.
- T-statistic: It may help determine whether the system has some statistical validity or is close to a zero-mean random system. It’s also important to know whether the system delivers positively or negatively skewed results.
These statistics can be obtained by saving the strategy report and using Excel to compute them, or using Python. In both cases, we need to build a small script that takes a bit of effort the first time but will reward us with a continuous stream of subtle details that no simulator shows.
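As an illustration of such a script, here is a minimal sketch using only the Python standard library. It assumes the per-trade results have already been extracted into a plain list of numbers (losses negative). The function name `extra_stats` and the dictionary keys are hypothetical, and the "Sharpe-like" figure is simply the per-trade mean divided by its standard deviation, with no risk-free rate or annualization.

```python
import math
import statistics


def extra_stats(profits):
    """Compute statistics the summary report lacks.

    profits: per-trade results in money, losses as negative numbers.
    """
    n = len(profits)
    if n < 3:
        raise ValueError("need at least three trades")

    mean = statistics.fmean(profits)      # expected payoff
    std = statistics.stdev(profits)       # sample standard deviation

    wins = [p for p in profits if p > 0]
    losses = [-p for p in profits if p < 0]
    if wins and losses:
        win_loss = statistics.fmean(wins) / statistics.fmean(losses)
    else:
        win_loss = float("nan")

    # Per-trade quality ratio and t-statistic of the mean against zero.
    sharpe_like = mean / std if std > 0 else float("nan")
    t_stat = sharpe_like * math.sqrt(n)

    # Population skewness: third central moment over the cubed std. deviation.
    pstd = statistics.pstdev(profits)
    if pstd > 0:
        skew = sum((p - mean) ** 3 for p in profits) / (n * pstd ** 3)
    else:
        skew = float("nan")

    return {
        "expected_payoff": mean,
        "std_dev": std,
        "win_loss_ratio": win_loss,
        "sharpe_like": sharpe_like,
        "t_statistic": t_stat,
        "skewness": skew,
    }


if __name__ == "__main__":
    # Made-up trade results, for illustration only.
    print(extra_stats([120.0, -80.0, 45.0, -60.0, 200.0, -75.0, 30.0]))
```

A t-statistic close to zero suggests the results are hard to tell apart from a zero-mean random system; the threshold to use depends on the number of trades and the confidence level you want.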
The optimization procedure in the MT4 strategy tester uses a complete back-test approach. There is no out-of-sample testing, so, to avoid false expectations, it is advisable to optimize using only a chunk of the available historical data and then run the optimized system on another chunk of the data series to get out-of-sample results.
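The idea of keeping one chunk for optimization and another for validation can be sketched as a simple chronological split. This is only an illustration: the function name `split_in_out_of_sample` and the 70/30 default are assumptions, and in MT4 itself the same effect is obtained by restricting the date range of the test.

```python
def split_in_out_of_sample(bars, in_sample_fraction=0.7):
    """Split a chronologically ordered sequence of bars (or dates) into an
    in-sample part for optimization and an out-of-sample part for validation.

    The order is preserved and nothing is shuffled, so the out-of-sample
    chunk always lies after the in-sample chunk in time.
    """
    if not 0.0 < in_sample_fraction < 1.0:
        raise ValueError("in_sample_fraction must be between 0 and 1")
    cut = int(len(bars) * in_sample_fraction)
    return bars[:cut], bars[cut:]


if __name__ == "__main__":
    years = list(range(2008, 2018))    # hypothetical yearly chunks
    in_sample, out_of_sample = split_in_out_of_sample(years, 0.5)
    print("optimize on:", in_sample)
    print("validate on:", out_of_sample)
```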
This tab shows the equity balance graph of a back-test (in this case of a free EA, Headstrong Free, after optimizing it from 2011 to 2013). Below, the behavior of the optimized EA on out-of-sample data: close to a random system.
Trade by trade reports
The “results” tab shows a trade by trade report organized by date and time. That report presents the time, type, size, price, profit, and balance of every operation. Of course, open actions don’t show a profit.
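Given the profit column of such a trade by trade report, the main figures of the summary report can be recomputed. The sketch below follows the definitions given earlier in this article. The function name `summary_metrics` is hypothetical, drawdowns are measured on the closed-trade balance, and MT4's own figures may differ (for example, if it tracks equity instead of balance).

```python
def summary_metrics(profits, initial_deposit):
    """Recompute summary-report figures from per-trade profits.

    profits: closed-trade results in money, losses as negative numbers.
    """
    if not profits:
        raise ValueError("no trades")

    gross_profit = sum(p for p in profits if p > 0)
    gross_loss = -sum(p for p in profits if p < 0)    # as a positive number
    net_profit = gross_profit - gross_loss
    profit_factor = gross_profit / gross_loss if gross_loss > 0 else float("inf")
    expected_payoff = net_profit / len(profits)

    balance = peak = lowest = initial_deposit
    max_dd = 0.0        # largest drop from a local maximum, in money
    max_dd_pct = 0.0    # largest drop as a fraction of the local maximum
    for p in profits:
        balance += p
        peak = max(peak, balance)
        lowest = min(lowest, balance)
        dd = peak - balance
        max_dd = max(max_dd, dd)
        max_dd_pct = max(max_dd_pct, dd / peak)

    return {
        "net_profit": net_profit,
        "gross_profit": gross_profit,
        "gross_loss": gross_loss,
        "profit_factor": profit_factor,
        "expected_payoff": expected_payoff,
        "absolute_drawdown": initial_deposit - lowest,
        "maximal_drawdown": max_dd,
        "relative_drawdown": max_dd_pct,
        "total_trades": len(profits),
    }


if __name__ == "__main__":
    # Made-up trade results, for illustration only.
    report = summary_metrics([200, -100, -150, 300, -50], initial_deposit=1000)
    for name, value in report.items():
        print(f"{name}: {value}")
```

As a sanity check, the number of trades multiplied by the expected payoff should reproduce the total net profit.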
By right-clicking on any part of the report, you can save it on file for future use or further analysis using Excel or Python.
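Once the trade list is available in Python, the Monte Carlo permutation of drawdowns mentioned earlier can be sketched as follows. This is a minimal illustration under stated assumptions: the trade results are already loaded into a list, the trades are treated as independent (so their order can be shuffled), and the function names are hypothetical.

```python
import random


def max_drawdown(profits):
    """Largest peak-to-valley drop of the cumulative profit curve."""
    balance = peak = worst = 0.0
    for p in profits:
        balance += p
        peak = max(peak, balance)
        worst = max(worst, peak - balance)
    return worst


def monte_carlo_drawdowns(profits, runs=10_000, seed=None):
    """Shuffle the trade order `runs` times and record each maximum drawdown."""
    rng = random.Random(seed)
    shuffled = list(profits)
    results = []
    for _ in range(runs):
        rng.shuffle(shuffled)
        results.append(max_drawdown(shuffled))
    return results


def probability_of_exceeding(drawdowns, threshold):
    """Fraction of simulated runs whose maximum drawdown reached the threshold."""
    return sum(d >= threshold for d in drawdowns) / len(drawdowns)


if __name__ == "__main__":
    # Made-up trade results, for illustration only.
    trades = [100, -50, 100, -50, -50, 200, -50, 150, -50, 100]
    dds = monte_carlo_drawdowns(trades, runs=10_000, seed=42)
    print("P(max drawdown >= 150):", probability_of_exceeding(dds, 150))
```

Because only the order of the trades changes, the final net profit is the same in every run; what varies is the path, and therefore the drawdown.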
One major issue with data testing is that if we test on all the data available, we “burn” it. That means that any subsequent retest will be a bit more curve-fit. The issue is that, on the way from a fair system to a great one, we introduce a small change here and there and, when back-testing each variation, we select the best performers and discard the others. As the number of back-tests on the same data increases, a hidden curve fit emerges.
Theoretically, one should perform a test using one data set and, after a change in a parameter, make a new test using another set, but, in practice, we end up testing all our strategy variants on a single dataset. That’s the reason we need to be cautious.
So, what’s the best way to use our data so it won’t burn too much? Kevin J. Davey says he uses just a portion of all the data he has, large enough to get statistically valid results. He also says he makes a random selection of the portion to be used. That way he uses new data after a change in parameters and makes sure his historical database is used minimally.
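One possible way to pick a random portion of the history is to draw a random contiguous window of bars for each new test. The sketch below is an assumption about how this could be done, not a description of Davey's actual procedure; the function name `random_window` and its parameters are hypothetical.

```python
import random


def random_window(n_bars, window, rng=None):
    """Return (start, end) indices of a randomly placed contiguous window.

    n_bars: number of bars in the full history
    window: number of bars to use in this test
    rng:    optional random.Random instance, for reproducibility
    """
    if not 0 < window <= n_bars:
        raise ValueError("window must be between 1 and n_bars")
    rng = rng or random.Random()
    start = rng.randrange(n_bars - window + 1)
    return start, start + window


if __name__ == "__main__":
    # Hypothetical sizes: test on 2,500 bars out of a 10,000-bar history.
    start, end = random_window(10_000, 2_500)
    print(f"test on bars {start}..{end - 1}")
```

The window length should still be large enough to produce a statistically meaningful number of trades.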
In the next issue, we’ll deal with entry evaluation. Let’s remember that we are still in the “limited testing” stage. The idea of entry evaluation is to assess the validity of an idea as an entry signal.
Building Winning Algorithmic Trading Systems, Kevin J. Davey
Encyclopedia of Trading Strategies, Jeffrey Owen Katz and Donna L. McCormick
Graphs were made using custom Python 3 software and the MT4 trading platform.