Summary - DRL doesn't work, yet.


This post is a summary of the extremely insightful blog post - Deep Reinforcement Learning Doesn’t Work Yet - which has been making the rounds on Twitter since it came out a couple of weeks ago.


Training times are mind-numbingly large.

Classical control-theory methods obliterate DRL on many tasks in terms of final performance.

RL requires reward functions, which are (1) difficult to design and (2) hard to get working.
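
To make points (1) and (2) concrete, here is a minimal, entirely hypothetical Python sketch (not from the original post): a 1-D "reach the goal" task where a dense, hand-shaped proximity reward looks sensible, yet a policy that hovers near the goal without ever finishing earns far more of it than a policy that actually solves the task. The task, the `GOAL` constant, and both reward functions are invented for illustration.

```python
# Toy, hypothetical 1-D "reach the goal" task sketching why hand-designed
# rewards are hard to get right: a dense proximity bonus paid at every step
# rewards hovering near the goal more than actually reaching it.

GOAL = 10.0

def dense_reward(x: float) -> float:
    """Tempting shaped reward: larger the closer the agent is to the goal."""
    return 1.0 / (1.0 + abs(GOAL - x))

def sparse_reward(x: float) -> float:
    """The reward we actually care about: +1 only when the goal is reached."""
    return 1.0 if abs(GOAL - x) < 0.5 else 0.0

# Honest policy: walk straight to the goal; the episode ends after 11 steps.
honest = [float(x) for x in range(11)]
# Reward-hacking policy: hover one unit from the goal for 200 steps, never finish.
gaming = [9.0] * 200

for name, traj in [("honest", honest), ("gaming", gaming)]:
    dense = sum(dense_reward(x) for x in traj)
    sparse = sum(sparse_reward(x) for x in traj)
    print(f"{name}: dense reward = {dense:6.1f}, sparse reward = {sparse:4.1f}")
```

Running this, the hovering policy collects roughly 100 units of dense reward versus about 3 for the honest one, while earning zero of the reward that actually matters; this kind of reward hacking is one reason reward design is so hard to get right.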

When DRL works, it has generally overfit.

Okay, suppose it works. But now it’s unstable and irreproducible.

Where do we actually see DRL in real life?

Okay… So will DRL ever work?

Concluding remarks

Takeaway: DRL has a long way to go before it becomes a plug-and-play technology. So while one could be annoyed with the current state of affairs in DRL (wherein everything is seemingly in shambles), one should also believe in where it could go. As Andrew Ng says,

[there is] a lot of short-term pessimism, balanced by even more long-term optimism.

Go RL!


Written on February 25, 2018