Algorithms for Testing Properties of Distributions

Objective

In many computational settings the input data is most naturally viewed as samples from a distribution, and it is often crucial to determine whether the underlying distribution satisfies a given property. Examples of such properties include whether two distributions are close or far in statistical distance, whether a joint distribution is independent, and whether a distribution has high entropy. For most such properties, standard statistical techniques that first approximate the distribution lead to algorithms whose number of samples is nearly linear in the domain size. Until very recently, distributions over large domains, for which linear sample complexity can be daunting, received surprisingly little attention. New interest in these questions now comes from many directions, including data mining, research in the natural sciences, and networking algorithms. Recent results show that, for large domains, one can be significantly more efficient than the standard techniques. We propose a research program that will lead to an understanding of the sample, time and space complexity required to identify various natural properties of a probability distribution. We will focus on determining which properties can be identified with a number of samples that is sublinear in the domain size, and on understanding the aspects of algorithm design that are specific to these constraints. The questions to be considered include the complexity of testing previously unstudied properties, the complexity of approximating the distance to having a property, improved algorithms for important subclasses of distributions, new models of distribution testing, and the relationship between computational complexity and sample complexity.
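
As an illustration of what a sublinear-sample test can look like, the following is a minimal sketch of a collision-based uniformity check. It is not taken from the project itself; the function names `collision_rate` and `looks_uniform`, the midpoint threshold, and the sample sizes are assumptions chosen for the example. The idea is that the uniform distribution over n elements has collision probability exactly 1/n, while any distribution at statistical (total variation) distance at least epsilon from uniform has collision probability at least (1 + 4·epsilon²)/n, so counting repeated values among far fewer than n samples can already separate the two cases.

```python
import random
from collections import Counter


def collision_rate(samples):
    """Fraction of unordered sample pairs that hold the same value."""
    m = len(samples)
    if m < 2:
        raise ValueError("need at least two samples")
    counts = Counter(samples)
    # A value seen c times contributes c*(c-1)/2 colliding pairs.
    colliding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return colliding_pairs / (m * (m - 1) // 2)


def looks_uniform(samples, domain_size, epsilon):
    """Accept if the empirical collision rate is below a midpoint threshold.

    Uniform has collision probability 1/n; a distribution that is
    epsilon-far from uniform in total variation distance has collision
    probability at least (1 + 4*epsilon**2)/n. The threshold used here,
    (1 + 2*epsilon**2)/n, is an illustrative choice halfway between them.
    """
    threshold = (1 + 2 * epsilon ** 2) / domain_size
    return collision_rate(samples) <= threshold


def demo(seed=0, domain_size=10_000, num_samples=4_000, epsilon=0.5):
    """Compare uniform samples with samples confined to half the domain."""
    rng = random.Random(seed)
    uniform = [rng.randrange(domain_size) for _ in range(num_samples)]
    # Uniform over half the domain is at total variation distance 0.5.
    skewed = [rng.randrange(domain_size // 2) for _ in range(num_samples)]
    return (
        looks_uniform(uniform, domain_size, epsilon),
        looks_uniform(skewed, domain_size, epsilon),
    )


if __name__ == "__main__":
    print(demo())
```

In this sketch the number of samples is well below the domain size, in contrast to approaches that first estimate the whole distribution. How many samples are actually needed for a given error probability depends on epsilon and is not derived here.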

Field of science

  • /natural sciences
  • /natural sciences/computer and information sciences/data science/data mining

Call for proposal

FP7-PEOPLE-IRG-2008

Funding Scheme

MC-IRG - International Re-integration Grants (IRG)

Coordinator

TEL AVIV UNIVERSITY
Address
Ramat Aviv
69978 Tel Aviv
Israel
Activity type
Higher or Secondary Education Establishments
EU contribution
€ 100 000
Administrative Contact
Lea Pais (Ms.)