Sergey Plis
43:13
I wonder if this is related to https://arxiv.org/pdf/1912.07559.pdf. It's hard to formulate as a concise question, but this paper covers research on interpolating parameters while keeping the loss constant.
Sergey Plis
45:54
I understand, just sharing :-/
Sergey Plis
46:00
sure
Ben Edelman
48:52
Does this difference in importance between layers hold for MLPs too, or just conv nets?
Rahim Entezari
49:43
What if we rewind back to early iterations, not the zeroth?
Ben Edelman
01:01:50
If lower layers are in charge of general features, why does this imply they shouldn't be critical? Something about the general features being more robust than the specialized features?
Cathy Wu
01:02:01
I have the same question
Ben Edelman
01:02:14
(I just asked a question above)
Ben Edelman
01:07:43
Can you speak about the relations to Frankle et al.'s work on linear mode connectivity and the lottery ticket hypothesis? (which I believe is about basins w.r.t. SGD noise)
Ben Edelman
01:08:06
https://arxiv.org/abs/1912.05671
Rahim Entezari
01:08:22
Can LT make use of these findings to decide where to rewind back to?