Some early code from work on a big RL trading project.
Implements a Direct Reinforcement Learning trader based on Moody et al. (2001).
- Generates a time series from a random walk with an autoregressive component.
- Optimises reward (profit) by batch-training an RNN policy.
- Pure NumPy implementation.
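
A minimal sketch of the pieces listed above, not the project's actual code. It assumes an AR(1) trend term inside the random walk, a single-unit recurrent `tanh` policy, and a proportional transaction cost; the function names (`generate_series`, `positions`, `trade_returns`) and all parameter values are hypothetical. It only evaluates the profit that a batch trainer would maximise and leaves out the gradient step.

```python
import numpy as np

def generate_series(n_steps, alpha=0.9, k=3.0, seed=0):
    """Random walk whose drift follows an AR(1) process (assumed form)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_steps)   # noise on the log-price
    nu = rng.standard_normal(n_steps)    # noise on the trend
    p = np.zeros(n_steps)                # log-price
    beta = np.zeros(n_steps)             # autoregressive trend
    for t in range(1, n_steps):
        p[t] = p[t - 1] + beta[t - 1] + k * eps[t]
        beta[t] = alpha * beta[t - 1] + nu[t]
    # Rescale by the observed range so the exponentiated prices stay bounded.
    return np.exp(p / (p.max() - p.min()))

def positions(r, theta, window):
    """Recurrent policy: F_t = tanh(theta . [1, r_{t-window+1..t}, F_{t-1}])."""
    F = np.zeros(len(r))
    for t in range(window, len(r)):
        x = np.concatenate(([1.0], r[t - window + 1:t + 1], [F[t - 1]]))
        F[t] = np.tanh(theta @ x)
    return F

def trade_returns(r, F, delta=0.001):
    """Per-step return: previous position times price change, minus trading cost."""
    return F[:-1] * r[1:] - delta * np.abs(np.diff(F))

window = 8                                   # hypothetical look-back length
prices = generate_series(500)
r = np.diff(prices)                          # price changes fed to the policy
theta = np.random.default_rng(1).normal(scale=0.1, size=window + 2)
F = positions(r, theta, window)              # positions in [-1, 1]
R = trade_returns(r, F)
profit = R.sum()                             # the quantity to maximise over theta
```

The previous position `F[t - 1]` is fed back as an input, which is what makes the policy recurrent and lets it weigh the cost of changing position.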