I wonder how one reconciles this line of argument with simply getting an upper bound on sample complexity by, say, bounding the VC dimension by counting the number of composed functions possible under the given structure? Does measuring error via continuous approximation, as captured by the epsilon requirement, make all the difference?
J. Antonio LB
I am pretty new to DL, so some of my questions may be naïve, but I am wondering why the IID assumption is made in so many papers. In many examples it is more natural to have realizations of a process with a nonvanishing correlation function. For instance, in the heat equation example you mentioned, the parameters in the inverse problem are usually drawn from a Gaussian process. So what does generalization mean in this scenario? Besides that, how much of the theory can be generalized to operators? (That is closer to my own interest.)