Harvard Learning Seminar - Shared screen with speaker view
Lucas B Janson
Related to that question about pre-training to learn about objects, etc.: To what extent is it relevant that this game was (presumably painstakingly) designed for humans? So in terms of the human "learning a model for the game", is it relevant that the game was probably specifically designed to be easily modeled by a human brain? In that sense it seems to me like this is really an extremely adversarial challenge for an RL algorithm, since it is likely much harder for the RL algorithm to learn the model, even if it may be doing it.
Lucas B Janson
(it's hard to hear the question asked in the room over Zoom)
Blake Bordelon
https://arxiv.org/pdf/1712.01815.pdf — 44 million games for AlphaZero
Neeraj Sharma
I guess humans also play multiple times in their thoughts and dreams. I don't see how this can be quantified for a fair comparison.
Dongrui Deng
This question may be naive: What about on-policy RL, e.g., policy iteration, SARSA, etc.?
Yicong Jiang
Thank you for the great talk! I wonder whether the method will be short-sighted. For instance, in chess, sacrificing the queen can lead to a great reward 10 steps later. How can one build theory around this?
Aslan Satary Dizaji
Great talk! In your opinion, what is a good trade-off between biological plausibility and cognitive plausibility in deep neural networks?
Boaz Barak
Hi all, once we are done with questions from people in the room we'll take questions from zoom