#######################################################################
#                                                                     #
#                            Questions for                            #
#  "Data-Dependent Initialization of Convolutional Neural Networks"  #
# by Philipp Kraehenbuehl, Carl Doersch, Jeff Donahue, Trevor Darrell #
#                       via arXiv:1511.06856v1                        #
#                                                                     #
#  Answers due **June 8, 2016 at 10am** to mayern@cs.uni-freiburg.de  #
#                                                                     #
#######################################################################

1. The authors sample some input images for their data-dependent weight
   initialization procedure. Can you think of circumstances in which
   this might go wrong? Hint: think about what assumptions are made
   about the data pool. (1-2 sentences)

2. Prove that the claim in the first paragraph on page 2 is correct,
   i.e. that changing the weights and biases in the specified way does
   not change the function computed by a Convolution-ReLU-Convolution
   network. Use X_(i+1) := weights_i * X_i + biases_i as the function
   of convolutional layer i, and X_(i+1) := max(0, X_(i+1)) for a ReLU
   after layer i (i.e. you may ignore the ReLU's slope factor).
   (2-3 lines of math; no need to be formally perfect)

3. Does (2.) work if we use a sigmoid function instead of the ReLU?
   What if we use the square function (x^2)? (1 sentence)
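
Optional: as a numerical sanity check for Question 2 (a check, not the
proof the question asks for), the Python sketch below models each
convolution as a dense matrix product -- a simplifying assumption,
since a convolution is a linear map, so the argument carries over
unchanged. It also assumes the transformation "specified" in the paper
is the usual positive rescaling: multiply weights_i and biases_i by a
constant c > 0 and divide weights_(i+1) by c.

    import numpy as np

    rng = np.random.default_rng(0)

    def net(W1, b1, W2, b2, x):
        # Convolution-ReLU-Convolution with dense matrices standing in
        # for the (linear) convolutions:
        #   X_2 = max(0, W1 @ x + b1);  output = W2 @ X_2 + b2
        return W2 @ np.maximum(0.0, W1 @ x + b1) + b2

    # Random small network and input (shapes are arbitrary choices)
    W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)
    W2, b2 = rng.normal(size=(3, 5)), rng.normal(size=3)
    x = rng.normal(size=4)

    c = 3.7  # any positive constant; positivity is essential
    print(np.allclose(net(W1, b1, W2, b2, x),
                      net(c * W1, c * b1, W2 / c, b2, x)))  # True

The check hinges on max(0, c*z) = c * max(0, z) for c > 0; trying a
few negative values of c, or swapping in a nonlinearity of your choice,
shows where this identity breaks and makes a useful warm-up for
Question 3.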