
Table of Contents

    20 June 2021, Volume 3 Issue 2
    PREFACE
    CAMC Focused Section on Tensor Computation
    Delin Chu, Michael Ng, Liqun Qi, Qing-Wen Wang
    2021, 3(2):  199-199.  doi:10.1007/s42967-020-00113-2
    ORIGINAL PAPER
    T-Jordan Canonical Form and T-Drazin Inverse Based on the T-Product
    Yun Miao, Liqun Qi, Yimin Wei
    2021, 3(2):  201-220.  doi:10.1007/s42967-019-00055-4
    In this paper, we investigate tensor similarity and propose the T-Jordan canonical form and its properties. The concepts of the T-minimal polynomial and the T-characteristic polynomial are proposed. As a special case, we present properties that hold when two tensors commute under the tensor T-product. We prove that the Cayley-Hamilton theorem also holds in the tensor case. Then, we focus on tensor decompositions: the T-polar, T-LU, T-QR and T-Schur decompositions of tensors are obtained. When an F-square tensor is not invertible with respect to the T-product, we study the T-group inverse and the T-Drazin inverse, which can be viewed as extensions of the matrix cases. The expressions of the T-group and T-Drazin inverses are given by the T-Jordan canonical form. The polynomial form of the T-Drazin inverse is also proposed. In the last part, we give the T-core-nilpotent decomposition and show that the T-index and the T-Drazin inverse can be obtained by a limit process.
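    For readers unfamiliar with the T-product underlying these decompositions, the following minimal NumPy sketch computes it for third-order tensors in the usual way (FFT along the third mode, frontal-slice products in the Fourier domain, inverse FFT). The function name and shapes are illustrative assumptions, not code from the paper.

```python
import numpy as np

def t_product(A, B):
    """T-product of third-order tensors A (n1 x n2 x n3) and B (n2 x m x n3):
    FFT along the third mode, slice-wise matrix products, inverse FFT."""
    n3 = A.shape[2]
    assert B.shape[0] == A.shape[1] and B.shape[2] == n3
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    C_hat = np.empty((A.shape[0], B.shape[1], n3), dtype=complex)
    for k in range(n3):
        C_hat[:, :, k] = A_hat[:, :, k] @ B_hat[:, :, k]
    return np.real(np.fft.ifft(C_hat, axis=2))

# The T-identity tensor (identity in the first frontal slice, zeros elsewhere)
# acts as the identity under the T-product.
n, n3 = 4, 5
I = np.zeros((n, n, n3)); I[:, :, 0] = np.eye(n)
A = np.random.randn(n, n, n3)
assert np.allclose(t_product(A, I), A)
```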
    Parallel Active Subspace Decomposition for Tensor Robust Principal Component Analysis
    Michael K. Ng, Xue-Zhong Wang
    2021, 3(2):  221-241.  doi:10.1007/s42967-020-00063-9
    Tensor robust principal component analysis has received a substantial amount of attention in various fields. Most existing methods, normally relying on tensor nuclear norm minimization, incur an expensive computational cost due to multiple singular value decompositions at each iteration. To overcome this drawback, we propose a scalable and efficient method, named parallel active subspace decomposition, which divides the unfolding along each mode of the tensor into a columnwise orthonormal matrix (the active subspace) and another small-size matrix in parallel. Such a transformation leads to a nonconvex optimization problem in which the scale of the nuclear norm minimization is generally much smaller than that in the original problem. We solve the optimization problem by an alternating direction method of multipliers and show that the iterates converge within the given stopping criterion and that the convergent solution is close to the global optimum within the prescribed bound. Experimental results demonstrate that the performance of the proposed model is better than that of state-of-the-art methods.
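    As a hedged illustration of the core step described above, the sketch below splits a mode unfolding into a column-orthonormal "active subspace" times a small factor via a truncated SVD. This is only one natural way to realize such a split; it is not the authors' full ADMM algorithm, and the function and variable names are assumptions.

```python
import numpy as np

def active_subspace_split(unfolding, r):
    """Split a mode unfolding X (m x n) as X ~= U @ S with U column-orthonormal.

    In the paper's setting such factors are updated inside an ADMM loop so that
    the nuclear-norm minimization only acts on the small factor S."""
    U_full, sing, Vt = np.linalg.svd(unfolding, full_matrices=False)
    U = U_full[:, :r]                      # active subspace (m x r)
    S = np.diag(sing[:r]) @ Vt[:r, :]      # small factor (r x n)
    return U, S

X = np.random.randn(200, 300)
U, S = active_subspace_split(X, r=20)
print(np.allclose(U.T @ U, np.eye(20)), np.linalg.norm(X - U @ S))
```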
    The Spectral Radii of Intersecting Uniform Hypergraphs
    Peng-Li Zhang, Xiao-Dong Zhang
    2021, 3(2):  243-256.  doi:10.1007/s42967-020-00073-7
    The celebrated Erdős-Ko-Rado theorem states that given n ≥ 2k, every intersecting k-uniform hypergraph G on n vertices has at most $\binom{n-1}{k-1}$ edges. This paper establishes spectral versions of the Erdős-Ko-Rado theorem: let G be an intersecting k-uniform hypergraph on n vertices with n ≥ 2k. Then, sharp upper bounds for the spectral radii of $\mathcal{A}_\alpha(G)$ and $\mathcal{Q}^*(G)$ are presented, where $\mathcal{A}_\alpha(G) = \alpha\mathcal{D}(G) + (1 - \alpha)\mathcal{A}(G)$ is a convex linear combination of the degree diagonal tensor $\mathcal{D}(G)$ and the adjacency tensor $\mathcal{A}(G)$ for 0 ≤ α < 1, and $\mathcal{Q}^*(G)$ is the incidence $\mathcal{Q}$-tensor. Furthermore, when n > 2k, the extremal hypergraphs that attain the sharp upper bounds are characterized. The proof mainly relies on the Perron-Frobenius theorem for nonnegative tensors and a property of the maximizing connected hypergraphs.
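    For concreteness, $\mathcal{D}(G)$ and $\mathcal{A}(G)$ above are the degree diagonal tensor and the adjacency tensor of a k-uniform hypergraph; a common convention for these (assumed here, and possibly differing from the authors' normalization) is
    \[
    \mathcal{A}(G)_{i_1 i_2 \cdots i_k} =
    \begin{cases}
    \dfrac{1}{(k-1)!}, & \{i_1,\dots,i_k\} \in E(G),\\
    0, & \text{otherwise},
    \end{cases}
    \qquad
    \mathcal{D}(G)_{i\,i\cdots i} = d_i,
    \qquad
    \mathcal{A}_\alpha(G) = \alpha\,\mathcal{D}(G) + (1-\alpha)\,\mathcal{A}(G),
    \]
    where $d_i$ is the degree of vertex i; the incidence $\mathcal{Q}$-tensor $\mathcal{Q}^*(G)$ follows the authors' definition and is not reproduced here.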
    MERACLE: Constructive Layer-Wise Conversion of a Tensor Train into a MERA
    Kim Batselier, Andrzej Cichocki, Ngai Wong
    2021, 3(2):  257-279.  doi:10.1007/s42967-020-00090-6
    In this article, two new algorithms are presented that convert a given data tensor train into either a Tucker decomposition with orthogonal matrix factors or a multi-scale entanglement renormalization ansatz (MERA). The Tucker core tensor is never explicitly computed but is stored as a tensor train instead, resulting in algorithms that are efficient in both computation and storage. Both the multilinear Tucker ranks and the MERA ranks are automatically determined by the algorithm for a given upper bound on the relative approximation error. In addition, an iterative algorithm with low computational complexity based on solving an orthogonal Procrustes problem is proposed for the first time to retrieve optimal rank-lowering disentangler tensors, which are a crucial component in the construction of a low-rank MERA. Numerical experiments demonstrate the effectiveness of the proposed algorithms together with the potential storage benefit of a low-rank MERA over a tensor train.
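    To make the Tucker target concrete, the toy sketch below builds a small tensor train, contracts it to a full tensor, and computes orthogonal Tucker factors by HOSVD. The point of the paper's algorithms is precisely to obtain such factors without ever forming the full tensor or the Tucker core explicitly, so this illustrates only the target format, not the authors' layer-wise conversion.

```python
import numpy as np

# Toy tensor train: cores G1 (1 x n x r), G2 (r x n x r), G3 (r x n x 1).
n, r = 6, 3
G1 = np.random.randn(1, n, r)
G2 = np.random.randn(r, n, r)
G3 = np.random.randn(r, n, 1)

# Contract the train into the full tensor (feasible only for tiny examples).
T = np.einsum('aib,bjc,ckd->ijk', G1, G2, G3)

# HOSVD-style orthogonal Tucker factors: left singular vectors of each unfolding.
factors = []
for mode in range(3):
    unfolding = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
    U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
    factors.append(U)

# Tucker core obtained by projecting T onto the orthogonal factors.
core = np.einsum('ijk,ia,jb,kc->abc', T, *factors)
print(core.shape)
```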
    On the Complexity of Finding Tensor Ranks
    Mohsen Aliabadi, Shmuel Friedland
    2021, 3(2):  281-289.  doi:10.1007/s42967-020-00103-4
    The purpose of this note is to give a linear algebra algorithm to determine whether the rank of a given tensor over a field $\mathbb{F}$ is at most k over the algebraic closure of $\mathbb{F}$, where k is a given positive integer. We estimate the arithmetic complexity of our algorithm.
    ORIGINAL PAPER
    Drop-Activation: Implicit Parameter Reduction and Harmonious Regularization
    Senwei Liang, Yuehaw Khoo, Haizhao Yang
    2021, 3(2):  293-311.  doi:10.1007/s42967-020-00085-3
    Overfitting frequently occurs in deep learning. In this paper, we propose a novel regularization method called drop-activation to reduce overfitting and improve generalization. The key idea is to drop nonlinear activation functions by setting them to be identity functions randomly during training time. During testing, we use a deterministic network with a new activation function to encode the average effect of dropping activations randomly. Our theoretical analyses support the regularization effect of drop-activation as implicit parameter reduction and verify its capability to be used together with batch normalization (Ioffe and Szegedy in Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv:1502.03167, 2015). The experimental results on CIFAR10, CIFAR100, SVHN, EMNIST, and ImageNet show that drop-activation generally improves the performance of popular neural network architectures for the image classification task. Furthermore, as a regularizer, drop-activation can be used in harmony with standard training and regularization techniques such as batch normalization and AutoAugment (Cubuk et al. in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 113-123, 2019). The code is available at https://github.com/LeungSamWai/Drop-Activation.
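    A minimal PyTorch-style sketch of the idea as described in the abstract: during training each nonlinearity is kept with some retain probability and replaced by the identity otherwise, and at test time the deterministic average is used. The choice of ReLU, the retain probability, and the class name are assumptions for illustration; consult the linked repository for the authors' implementation.

```python
import torch
import torch.nn as nn

class DropActivation(nn.Module):
    """Randomly drop the nonlinearity during training; average it at test time.

    With retain probability p each unit applies ReLU, otherwise the identity.
    At test time the deterministic p * relu(x) + (1 - p) * x encodes the
    average effect of the random dropping (sketch only; details assumed)."""
    def __init__(self, p: float = 0.95):
        super().__init__()
        self.p = p

    def forward(self, x):
        if self.training:
            mask = (torch.rand_like(x) < self.p).to(x.dtype)
            return mask * torch.relu(x) + (1.0 - mask) * x
        return self.p * torch.relu(x) + (1.0 - self.p) * x

# Used as a drop-in replacement for an activation layer, e.g.:
layer = nn.Sequential(nn.Linear(128, 128), DropActivation(p=0.95))
```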
    Learning the Mapping $x \mapsto \sum_{i=1}^{d} x_i^2$: the Cost of Finding the Needle in a Haystack
    Jiefu Zhang, Leonardo Zepeda-Núñez, Yuan Yao, Lin Lin
    2021, 3(2):  313-335.  doi:10.1007/s42967-020-00078-2
    The task of using machine learning to approximate the mapping $x \mapsto \sum_{i=1}^{d} x_i^2$ with $x_i \in [-1, 1]$ seems to be a trivial one. Given the knowledge of the separable structure of the function, one can design a sparse network to represent the function very accurately, or even exactly. When such structural information is not available, and we may only use a dense neural network, the optimization procedure to find the sparse network embedded in the dense network is similar to finding a needle in a haystack, using a given number of samples of the function. We demonstrate that the cost (measured by sample complexity) of finding the needle is directly related to the Barron norm of the function. While only a small number of samples are needed to train a sparse network, the dense network trained with the same number of samples exhibits a large test loss and a large generalization gap. To control the size of the generalization gap, we find that the use of explicit regularization becomes increasingly important as d increases. The numerically observed sample complexity with explicit regularization scales as $\mathcal{O}(d^{2.5})$, which is in fact better than the theoretically predicted sample complexity that scales as $\mathcal{O}(d^{4})$. Without explicit regularization (i.e., with only implicit regularization), the numerically observed sample complexity is significantly higher and is close to $\mathcal{O}(d^{4.5})$.
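    The separable structure mentioned above is easy to write down: with a squaring nonlinearity, a one-hidden-layer network with d units (one per coordinate) represents the target exactly. The sketch below illustrates that "needle"; the squaring activation is an assumption made for brevity, while the paper's analysis concerns more general networks and Barron norms.

```python
import numpy as np

d = 10
x = np.random.uniform(-1.0, 1.0, size=(1000, d))   # samples with x_i in [-1, 1]

# Target mapping: x -> sum_i x_i^2.
y = np.sum(x ** 2, axis=1)

# A "sparse network" that represents it exactly: hidden weights W = I_d
# (each unit sees one coordinate), squaring activation, output weights all ones.
W = np.eye(d)
hidden = (x @ W) ** 2
y_net = hidden @ np.ones(d)

assert np.allclose(y, y_net)
```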
    A Non-intrusive Correction Algorithm for Classification Problems with Corrupted Data
    Jun Hou, Tong Qin, Kailiang Wu, Dongbin Xiu
    2021, 3(2):  337-356.  doi:10.1007/s42967-020-00084-4
    A novel correction algorithm is proposed for multi-class classification problems with corrupted training data. The algorithm is non-intrusive, in the sense that it post-processes a trained classification model by adding a correction procedure to the model prediction. The correction procedure can be coupled with any approximator, such as logistic regression, neural networks of various architectures, etc. When the training dataset is sufficiently large, we theoretically prove (in the limiting case) and numerically show that the corrected models deliver correct classification results as if there were no corruption in the training data. For datasets of finite size, the corrected models produce significantly better recovery results compared to the models without the correction algorithm. All of the theoretical findings in the paper are verified by our numerical examples.
    Discovering Phase Field Models from Image Data with the Pseudo-Spectral Physics Informed Neural Networks
    Jia Zhao
    2021, 3(2):  357-369.  doi:10.1007/s42967-020-00105-2
    In this paper, we introduce a new deep learning framework for discovering phase-field models from existing image data. The new framework embraces the approximation power of physics informed neural networks (PINNs) and the computational efficiency of pseudo-spectral methods, and we name it the pseudo-spectral PINN, or SPINN. Unlike the baseline PINN, the pseudo-spectral PINN has several advantages. First of all, it requires less training data: a minimum of two temporal snapshots with uniform spatial resolution would be adequate. Secondly, it is computationally efficient, as the pseudo-spectral method is used for spatial discretization. Thirdly, it requires fewer trainable parameters compared with the baseline PINN, which significantly simplifies the training process and potentially assures fewer local minima or saddle points. We illustrate the effectiveness of the pseudo-spectral PINN through several numerical examples. The newly proposed pseudo-spectral PINN is rather general, and it can be readily applied to discover other PDE-based models from image data.
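    A minimal sketch of the pseudo-spectral ingredient: spatial derivatives of the snapshots are taken by FFT, so the residual of a candidate phase-field model can be evaluated from as few as two snapshots. The Allen-Cahn form, the grid, and the parameter values below are assumptions made purely for illustration; in SPINN such a residual drives the training of a neural network rather than being merely evaluated.

```python
import numpy as np

def spectral_laplacian(phi, L=2 * np.pi):
    """Laplacian of a periodic 2D field via FFT (the pseudo-spectral step)."""
    n = phi.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)      # angular wavenumbers
    kx, ky = np.meshgrid(k, k, indexing='ij')
    return np.real(np.fft.ifft2(-(kx**2 + ky**2) * np.fft.fft2(phi)))

def allen_cahn_residual(phi0, phi1, dt, eps):
    """Residual of an assumed Allen-Cahn model phi_t = eps^2 * Lap(phi) - (phi^3 - phi),
    evaluated from two snapshots phi0, phi1 taken dt apart."""
    phi_t = (phi1 - phi0) / dt
    phi_mid = 0.5 * (phi0 + phi1)
    return phi_t - (eps**2 * spectral_laplacian(phi_mid) - (phi_mid**3 - phi_mid))

# Two snapshots of a smooth periodic field (stand-ins for image data).
n = 64
xs = np.linspace(0, 2 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(xs, xs, indexing='ij')
phi0 = np.tanh(np.cos(X) * np.cos(Y))
phi1 = phi0 + 1e-3 * np.random.randn(n, n)
print(np.abs(allen_cahn_residual(phi0, phi1, dt=1e-3, eps=0.1)).mean())
```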