Weiyu Li
32:25
Question about the notation: the max objective is rho(1), so what is rho(i), with the index (i), in the Output line?
Paul
44:34
Are there classes of problems where earlier layers don't converge faster than later layers (or types of inputs on which this does worse)?
Preetum Nakkiran
48:10
Q: Do you know if early layers also stop moving in transfer learning? (E.g., when fine-tuning a network instead of training from scratch.)
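One way to probe this question empirically is to snapshot the pretrained weights, fine-tune, and then measure how far each layer's parameters moved. Below is a minimal PyTorch sketch, not from the talk; the helper name layer_drift and the relative-L2 metric are illustrative assumptions.

```python
import torch

@torch.no_grad()
def layer_drift(model_before, model_after):
    """Relative L2 change of each parameter tensor between two snapshots
    of the same architecture (e.g. before and after fine-tuning)."""
    drift = {}
    for (name, p0), (_, p1) in zip(model_before.named_parameters(),
                                   model_after.named_parameters()):
        # relative change so layers of different sizes are comparable
        drift[name] = ((p1 - p0).norm() / (p0.norm() + 1e-12)).item()
    return drift

# Usage sketch: snapshot, fine-tune, compare.
# import copy
# frozen_copy = copy.deepcopy(pretrained_model)
# ... fine-tune pretrained_model on the downstream task ...
# for name, d in sorted(layer_drift(frozen_copy, pretrained_model).items()):
#     print(f"{name:40s} {d:.4f}")  # small d in early layers would mean they stopped moving
```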
Preetum Nakkiran
01:27:03
I have a quick question.
Preetum Nakkiran
01:27:32
Can I be unmuted?
Mark Kong
02:09:01
I have a question about something from much earlier: as a network learns, you mentioned that layers converge one at a time according to SVCCA. When a layer gets close to its final state according to SVCCA, have its parameters also usually mostly stopped moving?
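This question contrasts two notions of convergence: representational (SVCCA similarity to the final state saturating) and parametric (the weights themselves no longer moving). For reference, here is a rough numpy sketch of the SVCCA similarity along the lines of the Raghu et al. (2017) recipe: per-neuron centering, SVD reduction to the top directions, then CCA. The function name, the 0.99 variance threshold, and the whitening-based CCA step are illustrative assumptions.

```python
import numpy as np

def svcca_similarity(acts1, acts2, var_threshold=0.99, eps=1e-8):
    """SVCCA similarity of two activation matrices of shape
    (num_neurons, num_datapoints): SVD-reduce each, then average
    the canonical correlations between the reduced subspaces."""
    def svd_reduce(acts):
        acts = acts - acts.mean(axis=1, keepdims=True)  # center each neuron
        _, s, vt = np.linalg.svd(acts, full_matrices=False)
        # keep enough singular directions to explain var_threshold of variance
        k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), var_threshold) + 1
        return np.diag(s[:k]) @ vt[:k]  # reduced, shape (k, num_datapoints)

    def whiten(z):
        cov = z @ z.T / z.shape[1]
        vals, vecs = np.linalg.eigh(cov)
        inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
        return inv_sqrt @ z

    xw, yw = whiten(svd_reduce(acts1)), whiten(svd_reduce(acts2))
    # canonical correlations are the singular values of the whitened cross-covariance
    rho = np.linalg.svd(xw @ yw.T / xw.shape[1], compute_uv=False)
    return float(np.mean(np.clip(rho, 0.0, 1.0)))
```

Tracking this similarity between a layer's activations at each checkpoint and at the end of training, alongside the parameter distance to the final weights, would directly answer whether the two notions of "stopped moving" coincide.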