The problem of correlation screening arises in many disciplines, including gene expression analysis, finance, and security, where the number of variables can range from a few hundred to hundreds of thousands. In these settings the number p of variables is much larger than the number n of samples, making the sample covariance matrix singular.
The objective of sparse correlation screening is detection: we wish to find a set of variables that have high correlations or high partial correlations under a user-specified false positive constraint. This is in contrast to the well-known problem of covariance selection, which is a problem of estimation: it attempts to find a good sparse approximation to the sample covariance or the inverse covariance.
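As a rough illustration of the detection objective, the following minimal sketch thresholds the sample correlation matrix and reports the variables whose largest absolute correlation with any other variable exceeds a cutoff. It is not the algorithm presented in the talk: the function name screen_correlations, the cutoff rho, and the synthetic data are assumptions made for illustration only, and the choice of rho to meet a false positive constraint is not shown.

```python
import numpy as np

def screen_correlations(X, rho):
    """Return indices of variables whose largest absolute sample
    correlation with any other variable exceeds rho.

    X is an n-by-p data matrix (n samples, p variables).
    rho is a hypothetical user-chosen threshold in (0, 1).
    """
    R = np.corrcoef(X, rowvar=False)   # p-by-p sample correlation matrix
    np.fill_diagonal(R, 0.0)           # ignore each variable's self-correlation
    max_abs = np.abs(R).max(axis=1)    # strongest correlation per variable
    return np.flatnonzero(max_abs > rho)

# Synthetic example with p much larger than n (illustrative values).
rng = np.random.default_rng(0)
n, p = 20, 200
X = rng.standard_normal((n, p))
X[:, 1] = X[:, 0] + 0.01 * rng.standard_normal(n)  # plant one correlated pair

hits = screen_correlations(X, rho=0.95)

# With p > n the sample covariance has rank at most n - 1, so it is singular.
rank = np.linalg.matrix_rank(np.cov(X, rowvar=False))
print("screened variables:", hits)
print("rank of sample covariance:", rank, "of", p)
```

Forming the full p-by-p correlation matrix costs memory quadratic in p, so this brute-force version only suits moderate p; it is meant to show what is being detected, not how to scale it.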
In this talk, we will review several applications of sparse correlation screening, present scalable screening algorithms, develop mathematical theory for predicting error rates and phase transitions, and illustrate the theory and algorithms on bioinformatics problems.
For more information:
http://www.eecs.umich.edu/~hero/