Yining Wang


PhD student

Machine Learning Department,
School of Computer Science,
Carnegie Mellon University
Pittsburgh, PA, USA

Email: yiningwa at cs dot cmu dot edu
or ynwang dot yining at gmail dot com

Office: GHC 8021

I am a second-year PhD student in the Machine Learning Department at Carnegie Mellon University. My advisor is Aarti Singh. Before coming to CMU, I was an undergraduate student in the Yao Class at Tsinghua University.

I am generally interested in statistical machine learning. Some topics that interest me include matrix completion and approximation, column subset selection and subspace clustering. I also want to know whether these tasks can be solved when only partial observations are available, and how we can improve on existing approaches by adaptively sampling/sensing in a feedback-driven manner.

Previously I have worked on Bayesian nonparametric modeling and the method of moments (spectral methods). I'm still interested in these topics. If you have common interests, feel free to stop by my office and we can have a chat.

Download CV 


Carnegie Mellon University

PhD student in Machine Learning, School of Computer Science
Advisor: Aarti Singh

2014 - present

Tsinghua University

B. Eng. in Computer Science
Undergraduate thesis: Spectral Methods in Supervised Topic Modeling
Thesis advisor: Jun Zhu

2010 - 2014


Symantec Research Labs

Research Intern.
Supervisors: Petros Efstathopoulos and Kevin Roundy.

June 2015 - Aug 2015

Design and implementation of Project Harbinger, a system for enterprise-level malicious attack prediction based on collaborative filtering.

Tsinghua University

RA at State Key Laboratory of Intelligent Technology and Systems.
Advisor: Jun Zhu

Aug 2013 - Jul 2014

Research topics include small-variance asymptotic analysis for Bayesian nonparametric models and spectral learning for latent variable models.

Massachusetts Institute of Technology

Undergraduate exchange program at Department of EECS

Jan 2013 - May 2013

Courses: Inference and Information, Nonlinear Programming and Automatic Speech Recognition
Research advisors: Jingjing Liu and Cynthia Rudin
Research topics: semantic role labeling in spoken dialogue systems and discrete optimization for learning to rank applications.

Microsoft Research Asia

Research intern at Technical Strategies group
Supervisors: Eric Chang and Junichi Tsujii

Oct 2011 - Jan 2013

Development of natural language processing applications in the medical informatics domain.

Selected publications and preprints

A full publication list is available in the curriculum vitae.


Computationally Feasible Near-Optimal Subset Selection for Linear Regression under Measurement Constraints

Yining Wang, Adams Wei Yu and Aarti Singh

Last updated: July, 2016.

We present computationally feasible near-optimal subset selection methods for linear regression. Both estimation and prediction problems are considered and interpretable selection algorithms are discussed under random design settings.

abstract pdf arXiv

Graph Connectivity in Noisy Sparse Subspace Clustering

Yining Wang, Yu-Xiang Wang and Aarti Singh

Last updated: April, 2016. To appear at International Conference on Artificial Intelligence and Statistics (AISTATS), 2016.

This paper studies graph connectivity of sparse subspace clustering when input data are corrupted by noise and proposes a solution that works under deterministic eigenvalue conditions.

abstract pdf arXiv

An Improved Gap-Dependency Analysis of the Noisy Power Method

Maria-Florina Balcan*, Simon Du*, Yining Wang* and Adams Wei Yu* (*alphabetic order)

Last updated: February, 2016. To appear in Conference on Learning Theory (COLT), 2016.

We improve upon Hardt and Price's analysis of the noisy power method by relating the noise magnitude to an "enlarged" spectral gap quantity. Potential applications to streaming private PCA are also discussed.

abstract pdf arXiv


Fast and Guaranteed Tensor Decomposition via Sketching

Yining Wang, Hsiao-Yu Tung, Alex Smola and Anima Anandkumar

Last updated: June, 2015. An abridged version is to appear at Advances in Neural Information Processing Systems (NIPS), 2015, and was selected for spotlight presentation.

In this paper, we demonstrated an accelerated tensor CP decomposition algorithm based on tensor sketching and applied it to learning latent topics of a collection of documents.

abstract pdf arXiv poster slides code

Differentially Private Subspace Clustering

Yining Wang, Yu-Xiang Wang and Aarti Singh

Advances in Neural Information Processing Systems (NIPS), 2015. Camera-ready forthcoming.

We proposed two private subspace clustering algorithms: one is based on the sample-aggregate framework and has formal utility guarantees; the other is based on the exponential framework and works well in practice.

abstract pdf poster

Provably Correct Active Sampling Algorithms for Matrix Column Subset Selection with Missing Data

Yining Wang and Aarti Singh

Last updated: May, 2015.

  • A short version appeared at the International Conference on Artificial Intelligence and Statistics (AISTATS), 2015.
  • A follow-up note, presented at the Annual Allerton Conference on Communication, Control and Computing (Allerton), 2015, with slides, investigates the empirical performance of the proposed methods on fully observed inputs.

Is column subset selection possible when most of the data are unavailable? Based on the idea of active sampling, we provide in this paper three provably correct algorithms that excel in different aspects.

abstract pdf arXiv AISTATS poster

A Deterministic Analysis of Noisy Sparse Subspace Clustering for Dimensionality-reduced Data

Yining Wang, Yu-Xiang Wang and Aarti Singh

International Conference on Machine Learning (ICML), 2015.

In this paper we report deterministic success conditions of sparse subspace clustering when data are compressed in various ways.

abstract pdf poster slides

DP-space: Bayesian Nonparametric Subspace Clustering with Small-variance Asymptotics

Yining Wang and Jun Zhu

International Conference on Machine Learning (ICML), 2015.

We propose a Bayesian nonparametric model for subspace clustering that can learn both the number of subspaces and the intrinsic dimension of each subspace automatically from training data.

abstract pdf poster slides code


Noise-adaptive Margin-based Active Learning and Lower Bounds under Tsybakov Noise Condition

Yining Wang and Aarti Singh

Last updated: November, 2015. An abridged version is to appear at the AAAI Conference on Artificial Intelligence (AAAI), 2016.

We analyzed a noise-adaptive version of the margin-based active learning algorithm under Tsybakov noise conditions. We also showed that assuming the data distribution is uniform does not make the corresponding active learning problem easier.

abstract pdf arXiv slides

Spectral Methods for Supervised Topic Models

Yining Wang and Jun Zhu

Advances in Neural Information Processing Systems (NIPS), 2014.

We give spectral methods that provably recover parameters in a supervised topic model.

abstract pdf appendix NIPS poster code

Small Variance Asymptotics for Dirichlet Process Mixtures of SVMs

Yining Wang and Jun Zhu

AAAI Conference on Artificial Intelligence (AAAI), 2014. (Oral presentation)

By applying small-variance asymptotic analysis, we derive a deterministic learning procedure for the infinite SVM model that is much faster than existing variational inference and Gibbs sampling approaches.

abstract pdf poster slides code


A Theoretical Analysis of Normalized Discounted Cumulative Gain (NDCG) Ranking Measures

Yining Wang, Liwei Wang, Yuanzhi Li, Di He, Wei Chen and Tie-yan Liu

Conference on Learning Theory (COLT), 2013.

In this paper we proved the distinguishing power of the standard NDCG loss function, as well as of its variants, under a pointwise ranking model.

abstract pdf COLT poster arXiv