
Behnam theory ML talk - chat log
Brenda Ng
51:23
Would it be more insightful to compare how quickly the deep and narrow networks reach the early-stopping state, assuming both networks' early-stopping states give comparable accuracy? This is in reference to the comparison of 400 epochs vs. 4000 epochs.
Kameron Harris
54:07
Hi Brenda, that was kind of what I was thinking too! I guess if all nets were trained with vanilla SGD that would be a good thing to check :-)
Yamini’s iPhone
01:08:51
do we know what the best performance we can get from an ensemble of underparameterized networks for CIFAR-10?
Zhilong Fang
01:12:17
When comparing and training the different networks, which parameter do you hold fixed across all networks: computational time, number of epochs, or something else? Why do you think that parameter is a reasonable choice for the comparison?
Brenda Ng
01:15:33
Have you considered running neural architecture search to see what “optimal” CNNs they come up with and how they compare against your models?
Manos Theodosis
01:26:23
So for this you train the network first and then you’re pruning connections? Or do you do principled learning, learning sparse weights constructively?
Rahim Entezari
01:27:08
@Manos: 2nd I guess
Manos Theodosis
01:27:40
@Rahim: yeah, this is what it seems like!
Kameron Harris
01:28:39
I think it can be seen as the proximal-gradient update of a modified L1 penalty... usually beta would be related to the step size
Kameron Harris
01:28:56
(soft-thresholding)
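A minimal sketch of the soft-thresholding operator Kameron mentions, i.e. the proximal operator of an L1 penalty, where a single proximal-gradient step shrinks weights toward zero and prunes the small ones exactly. The function name and the coupling of beta to the step size and penalty strength are illustrative assumptions, not details from the talk:

```python
import numpy as np

def soft_threshold(w, beta):
    """Proximal operator of beta * ||w||_1: shrink each weight toward
    zero by beta, and set anything with magnitude below beta to 0."""
    return np.sign(w) * np.maximum(np.abs(w) - beta, 0.0)

# One proximal-gradient step on a loss with an L1 penalty would look like
#   w <- soft_threshold(w - step * grad, step * l1_strength)
# (beta = step * l1_strength, matching "beta would be related to the step size").
w = np.array([0.5, -0.05, 1.2, 0.01])
print(soft_threshold(w, 0.1))  # [ 0.4 -0.   1.1  0. ]
```

Weights smaller than beta in magnitude come out exactly zero, which is what makes this read as pruning rather than mere shrinkage.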
Manos Theodosis
01:29:52
Yeah, I just asked because there’s a series of results on network pruning after training (a paper from this year’s ICML using tropical polynomial division comes to mind)
brian
01:39:38
Behnam, you and the text you cited on the MDL slide said that the description language is arbitrary. Could you use a language that assigns a smaller description length to models with more parameters (e.g., use 1/(32*#params)), and if so, does that hurt the applicability of MDL as an explanation here?
Manos Theodosis
02:10:50
Thank you!
brian
02:10:50
thanks!