1. Cohen, G., Afshar, S., Tapson, J., van Schaik, A.: EMNIST: an extension of MNIST to handwritten letters (2017). arXiv:1702.05373
2. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: learning augmentation strategies from data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 113-123 (2019)
3. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout (2017). arXiv:1708.04552
4. Gastaldi, X.: Shake-shake regularization (2017). arXiv:1705.07485
5. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778 (2016)
6. He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: European Conference on Computer Vision, pp. 630-645. Springer, Cham (2016)
7. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132-7141 (2018)
8. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700-4708 (2017)
9. Huang, G., Sun, Y., Liu, Z., Sedra, D., Weinberger, K.Q.: Deep networks with stochastic depth. In: European Conference on Computer Vision, pp. 646-661. Springer, Cham (2016)
10. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift (2015). arXiv:1502.03167
11. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Technical Report TR-2009, University of Toronto, Toronto (2009)
12. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097-1105 (2012)
13. Krueger, D., Maharaj, T., Kramár, J., Pezeshki, M., Ballas, N., Ke, N.R., Goyal, A., Bengio, Y., Courville, A., Pal, C.: Zoneout: regularizing RNNs by randomly preserving hidden activations (2016). arXiv:1606.01305
14. Lee, J., Xiao, L., Schoenholz, S., Bahri, Y., Novak, R., Sohl-Dickstein, J., Pennington, J.: Wide neural networks of any depth evolve as linear models under gradient descent. In: Advances in Neural Information Processing Systems, pp. 8570-8581 (2019)
15. Li, X., Chen, S., Hu, X., Yang, J.: Understanding the disharmony between dropout and batch normalization by variance shift. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2682-2690 (2019)
16. Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning, Granada, Spain (2011)
17. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Li, F.F.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211-252 (2015)
18. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2014). arXiv:1409.1556
19. Singh, S., Hoiem, D., Forsyth, D.: Swapout: learning an ensemble of deep architectures. In: Advances in Neural Information Processing Systems, pp. 28-36 (2016)
20. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929-1958 (2014)
21. Sutskever, I., Martens, J., Dahl, G., Hinton, G.: On the importance of initialization and momentum in deep learning. In: International Conference on Machine Learning, pp. 1139-1147 (2013)
22. Wager, S., Wang, S., Liang, P.S.: Dropout training as adaptive regularization. In: Advances in Neural Information Processing Systems, pp. 351-359 (2013)
23. Wan, L., Zeiler, M., Zhang, S., LeCun, Y., Fergus, R.: Regularization of neural networks using DropConnect. In: International Conference on Machine Learning, pp. 1058-1066 (2013)
24. Xie, L., Wang, J., Wei, Z., Wang, M., Tian, Q.: DisturbLabel: regularizing CNN on the loss layer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4753-4762 (2016)
25. Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1492-1500 (2017)
26. Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network (2015). arXiv:1505.00853
27. Yamada, Y., Iwamura, M., Akiba, T., Kise, K.: ShakeDrop regularization for deep residual learning (2018). arXiv:1802.02375
28. Zagoruyko, S., Komodakis, N.: Wide residual networks (2016). arXiv:1605.07146
29. Zeiler, M.D., Fergus, R.: Stochastic pooling for regularization of deep convolutional neural networks (2013). arXiv:1301.3557
30. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: European Conference on Computer Vision, pp. 818-833. Springer, Cham (2014)
31. Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: Mixup: beyond empirical risk minimization (2017). arXiv:1710.09412