Communications on Applied Mathematics and Computation ›› 2021, Vol. 3 ›› Issue (2): 337-356. doi: 10.1007/s42967-020-00084-4

• ORIGINAL PAPER •

A Non-intrusive Correction Algorithm for Classification Problems with Corrupted Data

Jun Hou, Tong Qin, Kailiang Wu, Dongbin Xiu   

  1. Department of Mathematics, The Ohio State University, Columbus, OH 43210, USA
  • Received: 2020-02-10 Revised: 2020-06-02 Online: 2021-06-20 Published: 2021-05-26
  • Contact: Dongbin Xiu, Jun Hou, Tong Qin, Kailiang Wu E-mail: xiu.16@osu.edu; hou.345@osu.edu; qin.428@osu.edu; wu.3423@osu.edu

Abstract: A novel correction algorithm is proposed for multi-class classification problems with corrupted training data. The algorithm is non-intrusive, in the sense that it post-processes a trained classification model by adding a correction procedure to the model prediction. The correction procedure can be coupled with any approximators, such as logistic regression, neural networks of various architectures, etc. When the training dataset is sufficiently large, we theoretically prove (in the limiting case) and numerically show that the corrected models deliver correct classification results as if there is no corruption in the training data. For datasets of finite size, the corrected models produce significantly better recovery results, compared to the models without the correction algorithm. All of the theoretical findings in the paper are verified by our numerical examples.
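
The abstract does not spell out the correction procedure, so the following is only an illustrative sketch of what a non-intrusive, post-processing correction can look like, not the authors' algorithm. It assumes a hypothetical setting in which labels are flipped according to a known, invertible class-transition matrix `T` that does not depend on the input. Under that assumption, a model trained to convergence on the corrupted labels approximates `q(x) = p(x) T`, where `p(x)` is the clean class posterior, so the clean prediction may be recovered by solving a small linear system on the model output. The names `correct_predictions`, `T`, and `rho` are introduced here for illustration.

```python
import numpy as np


def correct_predictions(q, T):
    """Post-process predictions of a model trained on corrupted labels.

    q : (n, K) array of class probabilities predicted by the trained model.
    T : (K, K) array, T[i, j] = P(observed label j | true label i).
        ASSUMPTION: T is known, invertible, and independent of the input.
    Returns an (n, K) array estimating the clean class probabilities.
    """
    # Under the assumption q = p @ T, solve T^T p^T = q^T for p.
    p = np.linalg.solve(T.T, q.T).T
    # Finite-sample predictions can produce small negative entries;
    # clip them and renormalize each row to sum to one.
    p = np.clip(p, 0.0, None)
    p /= p.sum(axis=1, keepdims=True)
    return p


# Hypothetical 3-class example: a fraction rho of labels is reassigned
# uniformly at random to one of the other classes.
K = 3
rho = 0.4
T = (1 - rho) * np.eye(K) + rho / (K - 1) * (np.ones((K, K)) - np.eye(K))

# Clean posteriors at two inputs, and what an ideal model trained on
# corrupted labels would predict instead.
p_true = np.array([[0.9, 0.1, 0.0],
                   [0.2, 0.3, 0.5]])
q = p_true @ T

p_rec = correct_predictions(q, T)
```

The trained model itself is never modified or retrained in this sketch; only its output is transformed, which is the sense in which such a correction is non-intrusive. In practice `T` would have to be estimated, and the recovery is exact only in the idealized limit where the model output matches `q` exactly.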

Key words: Data corruption, Deep neural network, Cross-entropy, Label corruption, Robust loss
