
Harvard Learning Seminar - Zoom chat
Lucas B Janson
43:42
Related to that question about pre-training to learn about objects, etc.: to what extent is it relevant that this game was (presumably painstakingly) designed for humans? In terms of the human "learning a model for the game", is it relevant that the game was probably specifically designed to be easily modeled by a human brain? In that sense this seems like a really extremely adversarial challenge for an RL algorithm, since the model is likely much harder for the RL algorithm to learn, even if it may be learning one?
Lucas B Janson
48:53
(it's hard to hear the question asked in the room over Zoom)
Blake Bordelon
53:01
https://arxiv.org/pdf/1712.01815.pdf (44 million games for AlphaZero)
Neeraj Sharma
58:33
I guess humans also play games multiple times in their thoughts and dreams. I don't see how this can be quantified for a fair comparison.
Dongrui Deng
01:12:36
This question may be naive: what about on-policy RL, e.g., policy iteration, SARSA, etc.?
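For context on the question, here is a minimal sketch of the tabular SARSA update, the canonical on-policy method mentioned above; the states, actions, and hyperparameter values are illustrative assumptions, not from the talk.

    from collections import defaultdict

    def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
        # On-policy: the target uses the action a_next that the current
        # (e.g., epsilon-greedy) policy actually chose in s_next, rather
        # than the max over actions as in off-policy Q-learning.
        td_target = r + gamma * Q[(s_next, a_next)]
        Q[(s, a)] += alpha * (td_target - Q[(s, a)])

    # Hypothetical transition: integers stand in for states and actions.
    Q = defaultdict(float)
    sarsa_update(Q, s=0, a=1, r=1.0, s_next=2, a_next=0)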
Yicong Jiang
01:16:26
Thank you for the great talk! I wonder whether the method will be short-sighted. For instance, in chess, sacrificing the queen can lead to a great reward 10 moves later. How can one build a theory around this?
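As a minimal sketch (with made-up reward numbers, not from the talk) of why a discounted objective is not inherently myopic: with gamma near 1, a large reward 10 moves after a queen sacrifice still outweighs the immediate material loss, so the theoretical difficulty is credit assignment rather than the objective itself.

    def discounted_return(rewards, gamma=0.99):
        # Work backwards: G_t = r_t + gamma * G_{t+1}.
        G = 0.0
        for r in reversed(rewards):
            G = r + gamma * G
        return G

    # Made-up numbers: lose the queen now (-9), decisive payoff 10 plies later (+100).
    rewards = [-9.0] + [0.0] * 9 + [100.0]
    print(discounted_return(rewards))  # about 81.4 > 0: the sacrifice is credited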
Aslan Satary Dizaji
01:20:00
Great talk! In your opinion, what is a good trade-off between biological plausibility and cognitive plausibility in deep neural networks?
Boaz Barak
01:23:33
Hi all, once we are done with questions from people in the room, we'll take questions from Zoom.