Reinforcement Learning for Trading

Performance functions:

  • Profit or Wealth
  • Sharpe ratio
  • Differential Sharpe ratio

The differential Sharpe ratio is a new value function for risk-adjusted return that enables online learning.

Consider performance functions for systems that trade a single-asset portfolio (price series z_t).

See Moody et al. (1998) for a detailed discussion of multiple-asset portfolios.

Trader's actions:

  • take long
  • take neutral
  • take short

- these correspond to positions F_t \in {-1, 0, 1} (short, neutral, long) of constant magnitude

F_t is established at the end of time interval t and is re-assessed at the end of period t+1.

Return R_t is realized at the end of the time interval (t-1, t]; it is the profit or loss resulting from the position F_{t-1} held during that interval.

Additive Profit:

fig1

- where r_t = z_t - z_{t-1}.

Wealth W_T = W_0 + P_T.
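The additive profit and wealth definitions can be sketched in a few lines of NumPy. This is a minimal illustration, not the exact formula from the figure: it assumes a zero risk-free rate, and the names `additive_profit`, `mu` (a fixed trade size) and `delta` (a transaction cost per unit of position change) are hypothetical parameters introduced here. The per-period return is taken as R_t = mu * (F_{t-1} * r_t - delta * |F_t - F_{t-1}|).

```python
import numpy as np

def additive_profit(z, F, mu=1.0, delta=0.0):
    """Cumulative additive profit P_0..P_T (sketch, zero risk-free rate assumed).

    z     : prices z_0..z_T (length T+1)
    F     : positions F_0..F_T, each in {-1, 0, 1} (length T+1)
    mu    : hypothetical fixed trade size (shares or contracts)
    delta : hypothetical transaction cost per unit of position change
    """
    z = np.asarray(z, dtype=float)
    F = np.asarray(F, dtype=float)
    r = np.diff(z)                       # r_t = z_t - z_{t-1}, for t = 1..T
    cost = delta * np.abs(np.diff(F))    # delta * |F_t - F_{t-1}|
    R = mu * (F[:-1] * r - cost)         # R_t comes from the earlier position F_{t-1}
    return np.concatenate(([0.0], np.cumsum(R)))  # P_0 = 0

z = [100.0, 101.0, 99.0, 102.0]   # toy price series
F = [1, -1, 1, 0]                 # long, short, long, neutral
P = additive_profit(z, F)
W = 1000.0 + P                    # W_T = W_0 + P_T with W_0 = 1000
```

Note the one-step lag: the price change over (t-1, t] is multiplied by `F[:-1]`, the position chosen before that change was observed.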

Multiplicative profits are appropriate when a fixed fraction of accumulated wealth \nu > 0 is invested in each long or short trade. The wealth at time T is:

fig2

- where r_t = (z_t/z_{t-1} - 1).
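Multiplicative wealth compounds period by period instead of summing. The sketch below is a simplified illustration under stated assumptions rather than the figure's exact formula: it assumes a zero risk-free rate and a per-period growth factor of (1 + nu * F_{t-1} * r_t) * (1 - delta * |F_t - F_{t-1}|). The function name and the `delta` transaction-cost parameter are hypothetical.

```python
import numpy as np

def multiplicative_wealth(z, F, W0=1.0, nu=1.0, delta=0.0):
    """Wealth path W_0..W_T under multiplicative profits (sketch).

    nu    : fraction of wealth invested in each trade (nu > 0)
    delta : hypothetical proportional transaction cost
    """
    z = np.asarray(z, dtype=float)
    F = np.asarray(F, dtype=float)
    r = z[1:] / z[:-1] - 1.0                       # r_t = z_t / z_{t-1} - 1
    growth = (1.0 + nu * F[:-1] * r) * (1.0 - delta * np.abs(np.diff(F)))
    return W0 * np.concatenate(([1.0], np.cumprod(growth)))

z = [100.0, 110.0, 99.0]   # up 10%, then down 10%
F = [1, -1, 0]             # long for the rise, short for the fall
W_mult = multiplicative_wealth(z, F, W0=1000.0)
```

In this toy run both periods gain 10%, so wealth compounds from 1000 to 1100 to 1210.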

Rather than maximizing profits alone, Modern Portfolio Theory suggests maximizing risk-adjusted return.

The measure used here is the Sharpe ratio:

fig3
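As a rough illustration, the Sharpe ratio of a series of trading returns is the average return divided by its standard deviation. The sketch below assumes the risk-free rate is ignored and uses the population standard deviation (`ddof=0`); both are assumptions made here, and `sharpe_ratio` is a hypothetical helper name.

```python
import numpy as np

def sharpe_ratio(R):
    """Sharpe ratio S_T = mean(R_t) / std(R_t) over t = 1..T (sketch)."""
    R = np.asarray(R, dtype=float)
    sd = R.std()                 # population standard deviation (ddof=0)
    if sd == 0.0:
        return float("nan")      # undefined when returns do not vary
    return R.mean() / sd

S = sharpe_ratio([1.0, 2.0, 3.0])   # mean 2, std sqrt(2/3)
```

Because the mean and standard deviation are taken over the whole series, this form has to be recomputed from scratch as new returns arrive, which is what motivates an online variant.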

Differential Sharpe ratio for online optimization of trading system performance:

fig4

fig5

fig6
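A minimal online sketch of the differential Sharpe ratio follows. It assumes the form published by Moody and Saffell: exponential moving estimates A_t and B_t of the first and second moments of R_t with adaptation rate eta, and D_t = (B_{t-1} * dA_t - 0.5 * A_{t-1} * dB_t) / (B_{t-1} - A_{t-1}^2)^{3/2}, where dA_t = R_t - A_{t-1} and dB_t = R_t^2 - B_{t-1}. The class name, the initial values, and the choice to return 0.0 when the variance estimate is not positive are assumptions made for illustration.

```python
class DifferentialSharpe:
    """Online differential Sharpe ratio (sketch under the assumptions above)."""

    def __init__(self, eta=0.01, A0=0.0, B0=0.0):
        self.eta = eta   # adaptation rate of the moving averages
        self.A = A0      # moving estimate of E[R_t]
        self.B = B0      # moving estimate of E[R_t^2]

    def update(self, R):
        """Return D_t for the new return R, then update A and B."""
        dA = R - self.A
        dB = R * R - self.B
        var = self.B - self.A ** 2          # variance estimate from t-1
        if var > 0.0:
            D = (self.B * dA - 0.5 * self.A * dB) / var ** 1.5
        else:
            D = 0.0                         # assumption: undefined case mapped to 0
        self.A += self.eta * dA             # A_t = A_{t-1} + eta * dA_t
        self.B += self.eta * dB             # B_t = B_{t-1} + eta * dB_t
        return D

dsr = DifferentialSharpe(eta=0.1, A0=0.0, B0=1.0)
D1 = dsr.update(0.5)
```

Each update uses only the previous estimates and the newest return, so D_t can serve as a per-step reward signal without storing the return history.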