
22:34
Does g depend on the inputs? If not, it can be folded into the weight matrix

23:28
yes

23:29
thanks

39:07
But the singular vectors of the matrices do change?

39:18
So the decoupled basis does change?

39:56
thanks

45:19
Was there a pattern observed where certain 1D network gradients correlate with the difficulty of the examples?

47:09
Yep, I meant difficult to classify

48:06
Like images with blur or glare are difficult to classify

48:44
Maybe the examples we see now will help

59:21
Naive question - in a class-incremental setting, when a new class arrives for a particular task, will it create a new pathway or just fine-tune the existing pathway for that task?

01:00:51
Interesting, thanks :)

01:04:53
Follow-up question - have the learning dynamics on these pathway structures been investigated in a curriculum learning setting, compared to passing shuffled data samples during training?

01:05:00
In the top plot, since the rates are comparable, why is the XOR path completely suppressed?

01:34:08
Great talk, can you comment on how/if this extends to recurrent networks?

01:46:50
Great talk!

01:46:51
thanks a lot

01:46:54
Thank you

01:46:55
:)

01:46:59
Thank you.