# Alex Gittens' Homepage

I graduated in June 2013, so this page is no longer maintained. In particular, my contact information has changed and some of the linked code and papers have been updated. Please redirect to agittens.net for more up-to-date information.

## Generalities

I graduated from the Applied and Computational Mathematics Department (ACM) at CalTech in Spring 2013 with a PhD in applied mathematics. My thesis is (not so originally) entitled "Topics in Randomized Numerical Linear Algebra." In it, I provide guarantees on the quality of two classes of randomized low-rank matrix approximation algorithms, analyzed the errors of randomized matrix sparsification schemes, and extended the matrix Laplace transform framework of Ahlswede--Winter to produce eigenvalue tail bounds for all eigenvalues of a sum of appropriate random matrices. My PhD advisor was Joel Tropp. I received my undergraduate degrees (EE and Math) from the University of Houston.

## Teaching

• In Fall 2012, I taught the ACM11 course on Mathematica and Matlab; that winter, I TAed ACM113, the introductory course on convex analysis and optimization. In Fall 2011, I TAed an introductory graduate course on probability. I TAed ACM105, an introductory graduate course on functional analysis, in Fall 2010. I taught the ACM11 course on Mathematica and Matlab in Fall 2009, after co-teaching it with Stephen Becker in Fall 2008. Prior to 2008, I TAed ACM101, an introduction to complex analysis, ODEs, PDEs, and perturbation analysis; ACM104, our graduate introduction to linear algebra; and ACM105.

## Research

I am interested in computational and statistical applications of probability and functional analysis.

## Publications

• Introduction to the minimax Laplace transform technique (arXiv link). The idea is straightforward: by combining the Courant-Fischer characterization of eigenvalues and the Laplace transform machinery, we bound the deviations, in both directions, of each of the eigenvalues of a sum of independent Hermitian random matrices. We derive eigenvalue Bennett/Bernstein and Chernoff inequalities using this machinery, and we also consider two standard problems: quantifying the effect of column sampling on the spectrum of a matrix with orthogonal rows, and covariance estimation. In the former, we show that (a quantity like) the coherence of the matrix controls the deviation from the expected singular values. In the latter, we produce the first results on relative error approximation of the eigenvalues of the covariance matrix. Specifically, we show that for a Gaussian vector, it suffices to take $$\varepsilon^{-2} p \log p$$ samples to determine the eigenvalues of the covariance matrix to within a factor of $$1 \pm \varepsilon.$$
• Error Bounds for Random Matrix Approximation Schemes (arXiv link). An investigation of the error incurred when you use random sparsification schemes to sparsify a matrix. The norms investigated are the spectral norm, due to its ubiquity, and the $$\infty\rightarrow 1$$ and $$\infty\rightarrow 2$$ operator norms, due to their relevance to graph theoretical algorithms. We provide simple, asymptotically optimal bounds on the approximation errors due to schemes which sparsify the entries of matrices independently in such a manner that the approximants average to the original matrices. Our results easily recover guarantees similar or better than several found earlier in the literature using scheme-specific analyses.
• The Masked Sample Covariance Estimator (arXiv link). In many modern statistical problems, there are far fewer observations than there are variables (the `large p, small n' regime). This means that a crucial operation, that of estimating the covariance matrix from samples, is not well-posed; for one thing, the sample covariance matrix is not even full rank, much less a good approximation to the actual covariance matrix. There are several approaches to dealing with this issue, all of which come down to assuming that your actual covariance matrix is within a certain class, and then choosing your estimator so that it is in that class and also not too far from your sample covariance matrix. In this paper, we consider a particular model for covariance matrices, namely that they are sparse, and analyze the error of estimators formed by zeroing the entries of the sample covariance matrix in certain locations that we somehow know apriori are zero. Intuitively, this reduces the complexity of the estimation problem because we are now estimating many fewer variables. We sharpen results due to Levina and Vershynin in the gaussian case, and extend their results to cover the case of subgaussian distributions.
• The spectral norm error of the naive Nystrom extension (arXiv link). Many applications in machine learning and image processing require either multiplicating against large kernels or calculating the eigenvectors of such kernels. One class of techniques for speeding up these operations is known as Nystrom extensions: these involve sampling the columns of a positive matrix to form a low-rank approximation which can be used for either fast approximate multiplication or the fast approximation of the dominant eigenvectors. The simplest Nystrom technique is the naive scheme where columns are uniformly sampled. In this work, I established the first relative error spectral norm bound for this scheme (around the time this went up on arXiv, Chiu and Demanet arXiv link released a paper on a related but more general low-rank approximation scheme and provided bounds that reduce to mine in a special case).
• Improved matrix algorithms via the Subsampled Randomized Hadamard Transform (arXiv link). Many randomized linear algebra algorithms rely upon fast dimension reductional methods. A popular choice is the Subsampled Randomized Hadamard Transform. In this preprint, we address the efficacy in the Frobenius and spectral norms, of an SRHT-based low-rank matrix approximation technique introduced by Wolfe, Liberty, Rohklin, and Tygert. We establish a slightly better Frobenius norm error bound than currently available, and a much sharper spectral norm bound (in the presence of reasonable decay of the singular values). Along the way, we produce several results on matrix operations with SRHTs (such as approximate matrix multiplication) that may be of independent interest. Our approach builds upon Tropp's in "Improved analysis of the subsampled randomized Hadamard Transform."
• Revisiting the Nystrom Method for Improved Large-Scale Machine Learning (arXiv link, MATLAB Code). We reconsider randomized algorithms for the low-rank approximation of symmetric positive semi-definite (SPSD) matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods based on leverage scores. We complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projections methods. These bounds are qualitatively superior to existing bounds---e.g., improved additive-error bounds for the spectral and Frobenius norm errors and relative-error bounds for the trace norm error.
• Topics in Randomized Numerical Linear Algebra (PhD Thesis). Still in editing.

## Slides

• Slides from a talk I gave during the 2013 ILAS meeting on randomized low-rank matrix approximations (specifically, fast projection-based approximations to general matrices and SPSD sketches of positive matrices), appropriately entitled Randomized low-rank approximations, in theory and practice.
• Slides from a talk about my research in using randomness as a tool in algorithms for linear algebra, entitled Random Methods in Linear Algebra. I gave this talk during the first ever ACM Tea hour!
• Slides from a talk tangentially about my research (in that the $$\gamma_2$$ norm on matrices is related to the matrix sign decomposition problem via Grothendieck's inequality), entitled A complexity measure on matrices. I gave this talk during another ACM Tea hour.

Last updated June 4, 2013