1. The coefficients omega are computed from the dot product of the gradients of the train batches with the gradient of the val batch. Explain why.
2. Adding an L1 regularization on omega should not, in principle, affect performance, yet experimentally it leads to a substantial improvement. Explain why.
3. The authors paid close attention to the sample composition. How did they do it, and why?
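
To make the quantity in question 1 concrete, here is a minimal sketch of one coefficient per train batch, computed as the dot product of that batch's gradient with the val-batch gradient. Everything in it is an assumption made for illustration and is not taken from the paper: the linear least-squares model, the synthetic data, and the hypothetical helper names `mse_grad`, `batch_coefficients`, and `make_batch`. The third train batch has its labels flipped on purpose, so its gradient points against the val gradient.

```python
import numpy as np

def mse_grad(w, X, y):
    """Gradient of 0.5 * mean((X @ w - y) ** 2) with respect to w."""
    residual = X @ w - y
    return X.T @ residual / len(y)

def batch_coefficients(w, train_batches, val_batch):
    """One coefficient per train batch: <grad_train_i, grad_val>."""
    g_val = mse_grad(w, *val_batch)
    return np.array([mse_grad(w, X, y) @ g_val for X, y in train_batches])

# Synthetic setup (illustrative assumption, not the paper's data).
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
w = np.zeros(3)  # current model parameters

def make_batch(n, flip=False):
    """Draw a batch from the linear model; flip=True negates the labels."""
    X = rng.normal(size=(n, 3))
    y = X @ w_true
    return X, (-y if flip else y)

train_batches = [make_batch(32), make_batch(32), make_batch(32, flip=True)]
val_batch = make_batch(64)

omega = batch_coefficients(w, train_batches, val_batch)
print(omega)  # positive for the clean batches, negative for the flipped one
```

To first order, a small gradient step on train batch i changes the val loss by minus the learning rate times this dot product, so a positive coefficient marks a batch whose update is expected to lower the val loss and a negative one marks a batch expected to raise it.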