ML p(r)ior | Efficient Frequent Directions Algorithm for Sparse Matrices

Efficient Frequent Directions Algorithm for Sparse Matrices

2016-02-01
1602.00412 | cs.DS
This paper describes Sparse Frequent Directions, a variant of Frequent Directions for sketching sparse matrices. It resembles the original algorithm in many ways: both receive the rows of an input matrix $A^{n \times d}$ one by one in the streaming setting and compute a small sketch $B \in R^{\ell \times d}$. Both share the same strong (provably optimal) asymptotic guarantees with respect to the space-accuracy tradeoff in the streaming setting. However, unlike Frequent Directions which runs in $O(nd\ell)$ time regardless of the sparsity of the input matrix $A$, Sparse Frequent Directions runs in $\tilde{O} (nnz(A)\ell + n\ell^2)$ time. Our analysis loosens the dependence on computing the Singular Value Decomposition (SVD) as a black box within the Frequent Directions algorithm. Our bounds require recent results on the properties of fast approximate SVD computations. Finally, we empirically demonstrate that these asymptotic improvements are practical and significant on real and synthetic data.

Highlights - Most important sentences from the article

Related Articles

2015-05-28
1505.07570 | cs.MS

Matrix operations such as matrix inversion, eigenvalue decomposition, singular value decomposition a… show more

Highlights - Most important sentences from the article

2014-10-14
1410.3886 | cs.DS

In this work, we propose a new randomized algorithm for computing a low-rank approximation to a give… show more

Highlights - Most important sentences from the article

2016-10-21
1610.06656 | stat.ML

In this paper we present a new algorithm for computing a low rank approximation of the product $A^TB… show more Highlights - Most important sentences from the article 2018-10-16 1810.06825 | cs.LG Principal component analysis (PCA) is widely used for dimension reduction and embedding of real data… show more Highlights - Most important sentences from the article 2019-05-11 1905.04452 | physics.data-an Most of the real world complex networks such as the Internet, World Wide Web and collaboration netwo… show more Highlights - Most important sentences from the article 2017-10-08 1710.02812 | cs.NA Singular value decomposition (SVD) is a widely used technique for dimensionality reduction and compu… show more Highlights - Most important sentences from the article 2019-05-24 1905.10415 | quant-ph We study the practical performance of quantum-inspired algorithms for recommendation systems and lin… show more Highlights - Most important sentences from the article 2019-01-10 1901.03254 | cs.DS Semidefinite programming (SDP) is a central topic in mathematical optimization with extensive studie… show more Highlights - Most important sentences from the article 2019-05-13 1905.05107 | cs.NA Proper Orthogonal Decomposition (POD), also known as Principal component analysis (PCA), is a dimens… show more Highlights - Most important sentences from the article 2018-05-10 1805.03765 | cs.DS We initiate the study of numerical linear algebra in the sliding window model, where only the most r… show more Highlights - Most important sentences from the article 2016-09-19 1609.05885 | cs.DS A central problem in data streams is to characterize which functions of an underlying frequency vect… show more Highlights - Most important sentences from the article 2018-04-09 1804.03065 | cs.LG We consider the problem of finding anomalies in high-dimensional data using popular PCA based anomal… show more Highlights - Most important sentences from the article 2018-09-19 1809.06986 | cs.DS Let$\mathbf{P}=\{ p_1, p_2, \ldots p_n \}$and$\mathbf{Q} = \{ q_1, q_2 \ldots q_m \}$be two poin… show more Highlights - Most important sentences from the article 2017-11-01 1711.00439 | cs.NA This paper addresses matrix approximation problems for matrices that are large, sparse and/or that a… show more Highlights - Most important sentences from the article 2018-03-07 1803.02661 | math.NA Principal component regression (PCR) is a useful method for regularizing linear regression. Although… show more Highlights - Most important sentences from the article 2017-06-09 1706.02803 | cs.LG Kernel$k\$-means clustering can correctly identify and extract a far more varied collection of clust… show more

Highlights - Most important sentences from the article