Concentration and Tail Bounds for Missing Mass
Date Issued
01-07-2019
Author(s)
Chandra, Prafulla
Indian Institute of Technology, Madras
Abstract
The missing mass of a sequence is the total probability of the elements that do not appear in the sequence. Estimating missing mass is an important ingredient in many practical applications in language modeling and ecology. Exponential tail bounds are known for missing mass, and improving them gives better confidence in estimation. In this work, we improve upon the best-known left and right tail bounds for missing mass. For the left tail, our proof method is arguably simpler than prior methods and provides a better bound for small sample sizes. For the right tail, we provide a new method for bounding the moment generating function of a generalized version of missing mass, which results in a noticeable improvement in the tail bound.
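To make the definition concrete, the following minimal sketch computes the missing mass of a sample when the underlying distribution is known. It illustrates the definition only and is not the paper's bounding method. The function name `missing_mass` and the toy distribution are assumptions introduced here.

```python
from typing import Dict, Hashable, Iterable


def missing_mass(distribution: Dict[Hashable, float],
                 sample: Iterable[Hashable]) -> float:
    """Total probability of the elements that do not appear in the sample.

    `distribution` maps each element to its probability (assumed known,
    which is the illustrative setting, not the estimation setting).
    """
    seen = set(sample)
    return sum(p for x, p in distribution.items() if x not in seen)


# Hypothetical toy distribution over three symbols.
dist = {"a": 0.5, "b": 0.3, "c": 0.2}

# "c" never appears in the sample, so the missing mass is its probability.
print(missing_mass(dist, ["a", "a", "b"]))  # -> 0.2
```

In practice the distribution is unknown and the missing mass must be estimated from the sample, which is where tail bounds of the kind described in the abstract matter.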
Volume
2019-July