If you build trading rules on historical Indian market data, you want them to hold up in live trading too. Overfitting happens when a model looks perfect on past data but fails in real time because it learned noise rather than market structure. This short guide covers simple, practical steps traders and quants in India can take to reduce that risk and make backtests more realistic.
Start with a clear idea and keep it simple
Complex models with many parameters are more likely to fit random patterns. Begin with a clear trading thesis: why should the rule work on Nifty or a particular stock? Prefer fewer parameters and choices that you can defend logically. A simple moving average crossover with reasonable lookbacks usually generalises better than ten tuned indicators.
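A simple crossover fits in a few lines. This is a sketch, not a recommendation; the 20/50 lookbacks are illustrative defaults:

```python
import numpy as np

def sma_crossover_signals(prices, fast=20, slow=50):
    """Long/flat positions from a fast/slow simple-moving-average crossover."""
    prices = np.asarray(prices, dtype=float)
    if fast >= slow:
        raise ValueError("fast lookback must be shorter than slow")

    def sma(x, n):
        # Rolling mean via cumulative sums; value k is the window ending at bar n-1+k.
        c = np.cumsum(np.insert(x, 0, 0.0))
        return (c[n:] - c[:-n]) / n

    fast_ma = sma(prices, fast)[slow - fast:]  # align to the slow window
    slow_ma = sma(prices, slow)
    pos = np.zeros(len(prices), dtype=int)
    # The signal observed at bar t is traded at bar t+1, avoiding look-ahead bias.
    pos[slow:] = (fast_ma[:-1] > slow_ma[:-1]).astype(int)
    return pos
```

Two parameters and one comparison: there is very little here that can silently memorise noise.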
Use realistic costs
Indian trading has brokerage, STT, GST, stamp duty and slippage. Always include these in your simulations. For example, if your backtest ignores a ₹20 brokerage per executed order and a ₹1–3 slippage per share, your apparent edge can vanish. Model the full round-trip cost and conservative slippage for small-cap stocks where liquidity is thin.
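A rough round-trip cost helper for a delivery trade might look like the sketch below. Every rate in it (flat ₹20 brokerage, 0.1% STT per leg, 18% GST on brokerage, 0.015% buy-side stamp duty, ₹1 slippage per share) is an illustrative assumption; verify against your broker's current schedule before relying on it:

```python
def round_trip_cost(buy_value, sell_value, qty,
                    brokerage_per_order=20.0,  # assumed flat discount-broker fee
                    stt_rate=0.001,            # illustrative delivery STT, both legs
                    gst_rate=0.18,             # GST applied to brokerage
                    stamp_rate=0.00015,        # illustrative buy-side stamp duty
                    slippage_per_share=1.0):
    """Approximate total round-trip cost (in rupees) for one delivery trade."""
    brokerage = 2 * brokerage_per_order            # one order each way
    stt = stt_rate * (buy_value + sell_value)      # charged on both legs
    gst = gst_rate * brokerage
    stamp = stamp_rate * buy_value                 # buy side only
    slippage = slippage_per_share * qty * 2        # paid on entry and exit
    return brokerage + stt + gst + stamp + slippage
```

Subtract this from each simulated trade's gross P&L; many apparent edges do not survive it.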
Hold out data and walk-forward
Never evaluate a strategy only on the data used to build it. Split your dataset into:
- in-sample (for parameter selection),
- out-of-sample (for validation).
Walk-forward analysis takes this further: optimise on a rolling in-sample window, evaluate on the next unseen window, then roll forward and repeat, so every test window is genuinely out-of-sample.
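The rolling split can be sketched with a small generator over index windows:

```python
def walk_forward_splits(n, train_size, test_size):
    """Yield (train_idx, test_idx) windows that roll forward through time.

    Each test window starts where its train window ends, with no overlap,
    so parameters chosen in-sample are always judged on later, unseen data.
    """
    start = 0
    while start + train_size + test_size <= n:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll forward by one test window
```

Stitching together the test-window results gives a single out-of-sample equity curve.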
Avoid data snooping
If you test hundreds of indicator combinations and keep only the best performers without correcting for multiple testing, you will pick lucky results. Limit the number of hypotheses, or apply statistical corrections like the Bonferroni method or a bootstrap to estimate how often a result could happen by luck.
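A quick bootstrap-style experiment makes the danger concrete: even pure coin-flip strategies produce an impressive "best" backtest when you test enough of them. The coin-flip trade model below is a deliberate toy:

```python
import random

def best_of_n_by_luck(n_strategies, n_trades, trials=2000, seed=42):
    """Average P&L per trade of the *best* of n_strategies pure-noise
    strategies, where every trade is an independent +1/-1 coin flip.

    A clearly positive result shows how good 'the best backtest' can look
    even when every candidate strategy is random: the data-snooping trap.
    """
    rng = random.Random(seed)
    best_means = []
    for _ in range(trials):
        best = max(
            sum(rng.choice((1, -1)) for _ in range(n_trades)) / n_trades
            for _ in range(n_strategies)
        )
        best_means.append(best)
    return sum(best_means) / trials
```

With one candidate the expected edge is zero; with a hundred candidates the winner looks strongly profitable despite being noise, which is exactly what an uncorrected search over indicator combinations does.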
Stress test and use Monte Carlo
Markets change. Test strategy performance under different scenarios: worse slippage, higher transaction costs, reduced fill rates, or sudden volatility spikes like the 2008 or 2020 moves. Monte Carlo resampling of trade sequences helps you understand the distribution of outcomes and the chance of long drawdowns.
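Monte Carlo resampling of a trade list can be sketched as follows: resample the historical trades with replacement and collect the maximum drawdown of each simulated path, so you can read off, say, the 95th-percentile drawdown instead of trusting the single realised one:

```python
import random

def max_drawdown(pnl):
    """Worst peak-to-trough fall of the cumulative P&L curve."""
    peak = equity = worst = 0.0
    for p in pnl:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst

def drawdown_distribution(trades, n_paths=1000, seed=7):
    """Resample trades with replacement and return sorted max drawdowns,
    one per simulated path."""
    rng = random.Random(seed)
    dds = []
    for _ in range(n_paths):
        path = [rng.choice(trades) for _ in trades]
        dds.append(max_drawdown(path))
    return sorted(dds)
```

A high percentile of this distribution is a more honest drawdown estimate than the one lucky ordering history happened to deliver.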
Use cross-validation for parameters
Instead of a single set of optimised parameters, check how sensitive performance is to small changes. If a slight tweak collapses profitability, the rule is fragile. Prefer parameter regions where performance stays stable across nearby values.
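One way to check this, sketched below with a hypothetical `score(fast, slow)` backtest metric: keep only the parameter pairs whose score stays close to the grid's best. The relative tolerance assumes a positive metric such as a Sharpe ratio:

```python
def stable_region(score, fast_grid, slow_grid, tolerance=0.2):
    """Return (fast, slow) pairs scoring within `tolerance` (relative)
    of the best score on the grid.

    `score(fast, slow)` is your backtest metric, assumed positive at its
    best. A healthy strategy has a broad stable region; a single sharp
    peak is a classic overfitting signature.
    """
    scores = {(f, s): score(f, s)
              for f in fast_grid for s in slow_grid if f < s}
    best = max(scores.values())
    return [k for k, v in scores.items() if v >= best * (1 - tolerance)]
```

If the region collapses to one cell, the "optimal" parameters are likely fitted to noise.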
Keep a trade-level realistic simulation
Model realistic order execution: partial fills, order queues for illiquid stocks, spread, and time-in-force. Paper trading on live NSE / BSE markets for a few months provides feedback before committing capital.
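Even a crude partial-fill model is better than assuming every order fills in full. The sketch below caps the fill at a fraction of each bar's traded volume; the 10% participation cap is an illustrative assumption, and for thin small-caps even that may be optimistic:

```python
def simulate_fill(order_qty, bar_volume, participation=0.1):
    """Return (filled_qty, unfilled_qty), capping the fill at a fraction
    of the bar's traded volume to mimic limited liquidity."""
    filled = min(order_qty, int(bar_volume * participation))
    return filled, order_qty - filled
```

Carrying the unfilled remainder to later bars (or cancelling it) then becomes an explicit modelling decision instead of a hidden optimistic default.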
Look for economic rationale
Ask: does the edge exploit data quirks or a real market friction? A valid explanation could be behavioural biases, structural rules, tax effects, or liquidity cycles in Indian markets. Strategies without a plausible reason tend to fail when market microstructure changes.
Use out-of-time events and regime tests
Test your strategy across different regimes: high vs low volatility, pre- and post-GST periods, or windows around major reforms. A robust approach maintains reasonable performance across regimes, not just during one lucky stretch.
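A minimal regime split, assuming you already have a per-period volatility proxy (for example, a rolling stdev of index returns) aligned with your strategy returns, could look like:

```python
import statistics

def performance_by_regime(returns, vols, threshold):
    """Mean strategy return in high- vs low-volatility regimes.

    `vols` is a volatility proxy aligned one-to-one with `returns`;
    `threshold` (e.g. the median of `vols`) separates the regimes.
    """
    hi = [r for r, v in zip(returns, vols) if v > threshold]
    lo = [r for r, v in zip(returns, vols) if v <= threshold]
    return (statistics.mean(hi) if hi else 0.0,
            statistics.mean(lo) if lo else 0.0)
```

If all of the edge lives in one regime, you are betting on that regime persisting, and you should know it.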
Regularisation and ensemble approaches
Techniques like shrinkage, L1/L2 penalties, or limiting model complexity help prevent overfitting in statistical and machine learning models. Ensemble methods—combining several weak, diverse models—often generalise better than a single heavily tuned model.
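As an example of shrinkage, ridge (L2-penalised) regression has a closed form; the sketch below assumes a plain NumPy feature matrix and a return target:

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """L2-regularised least squares: solves (X'X + alpha*I) w = X'y.

    The penalty shrinks coefficients toward zero, trading a little
    in-sample fit for out-of-sample stability; alpha=0 recovers OLS.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)
```

Larger `alpha` means smaller coefficients and less freedom to chase noise; in practice `alpha` itself should be chosen on in-sample data only.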
Document and stick to rules
Keep a research log: hypotheses, parameter choices, tests run, and results. Documented decisions reduce accidental curve-fitting by later tweaking parameters to chase past returns. If you change rules, re-run out-of-sample tests.
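An append-only JSON-lines file is one lightweight way to keep such a log; `log_experiment` below is a hypothetical helper, not a standard tool:

```python
import datetime
import json

def log_experiment(path, hypothesis, params, metrics):
    """Append one research decision as a JSON line: what was tried,
    with which parameters, and what came out.

    An append-only log makes it harder to quietly re-tune parameters
    later to chase past returns.
    """
    entry = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "hypothesis": hypothesis,
        "params": params,
        "metrics": metrics,
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")
```

Because entries are only ever appended, the log also records how many hypotheses you tested, which is exactly the number you need for the multiple-testing corrections above.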
Monitor live, and be ready to adapt
Once live, monitor performance against expectations. If the strategy deviates meaningfully, pause and investigate rather than keep trading until losses grow. Maintain conservative position sizing and risk limits, especially in the initial live period.
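A simple drift check, assuming you recorded the backtest's expected mean and standard deviation of per-trade P&L, might flag the strategy for review like this; the 2-standard-error limit is an illustrative choice, not a statistical guarantee:

```python
def should_pause(live_pnl, expected_mean, expected_std, z_limit=2.0):
    """True when realised mean P&L per trade falls more than `z_limit`
    standard errors below the backtest's expected mean."""
    n = len(live_pnl)
    if n == 0:
        return False
    live_mean = sum(live_pnl) / n
    stderr = expected_std / n ** 0.5
    return live_mean < expected_mean - z_limit * stderr
```

Deciding the pause rule before going live removes the temptation to rationalise losses as "temporary" while they compound.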
A disciplined workflow — simple hypotheses, realistic costs, strict out-of-sample testing, stress testing and clear documentation — will dramatically reduce the chance that your backtest is just a mirage. Keep learning from live experience, and treat backtesting as part of an ongoing validation cycle rather than a one-time proof.
Small, repeatable checks beat flashy in-sample returns. Focus on realism: costs, fills, regime variation, and a clear rationale. That’s how you move from past fit to future edge.