Lower bounds for identifying subset members with subset queries

9411219 | math.CO
An instance of a group testing problem is a set of objects $\mathcal{O}$ and an unknown subset $P$ of $\mathcal{O}$. The task is to determine $P$ by using queries of the type ``does $P$ intersect $Q$'', where $Q$ is a subset of $\mathcal{O}$. This problem occurs in areas such as fault detection, multiaccess communications, optimal search, blood testing and chromosome mapping. Consider the two-stage algorithm for solving a group testing problem. In the first stage, a predetermined set of queries is asked in parallel, and in the second stage, $P$ is determined by testing individual objects. Let $n=|\mathcal{O}|$. Suppose that $P$ is generated by independently adding each $x\in \mathcal{O}$ to $P$ with probability $p/n$. Let $q_1$ ($q_2$) be the number of queries asked in the first (second) stage of this algorithm. We show that if $q_1=o(\log(n)\log(n)/\log\log(n))$, then $\mathbb{E}(q_2) = n^{1-o(1)}$, while there exist algorithms with $q_1 = O(\log(n)\log(n)/\log\log(n))$ and $\mathbb{E}(q_2) = o(1)$. The proof involves a relaxation technique which can be used with arbitrary distributions. The best previously known bound is $q_1+\mathbb{E}(q_2) = \Omega(p\log(n))$. For general group testing algorithms, our results imply that if the average number of queries over the course of $n^\gamma$ ($\gamma>0$) independent experiments is $O(n^{1-\epsilon})$, then with high probability $\Omega(\log(n)\log(n)/\log\log(n))$ non-singleton subsets are queried. This settles a conjecture of Bill Bruno and David Torney and has important consequences for the use of group testing in screening DNA libraries and other applications where it is more cost-effective to use non-adaptive algorithms and/or too expensive to prepare a subset $Q$ for its first test.
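
To make the two-stage setup concrete, the following is a minimal simulation sketch, not the construction analysed in the paper. It assumes a hypothetical first-stage design of uniformly random pools of fixed size (the function names and the `pool_size` parameter are illustrative), and it counts $q_2$ as the number of objects whose status the first-stage answers leave undetermined under a simple decoding rule.

```python
import random


def sample_positives(n, p, rng):
    """Add each object 0..n-1 to P independently with probability p/n."""
    return {x for x in range(n) if rng.random() < p / n}


def random_pools(n, q1, pool_size, rng):
    """Assumed first-stage design: q1 uniformly random pools of a fixed size."""
    return [set(rng.sample(range(n), pool_size)) for _ in range(q1)]


def two_stage(n, positives, pools):
    """Run both stages and return (recovered P, q2)."""
    # Stage 1: all pool queries "does P intersect Q?" are answered in parallel.
    answers = [bool(positives & q) for q in pools]

    # Any object that appears in a negative pool cannot belong to P.
    cleared = set()
    for q, hit in zip(pools, answers):
        if not hit:
            cleared |= q

    # An object is forced into P if some positive pool has it as the only
    # member that has not been cleared.
    forced = set()
    for q, hit in zip(pools, answers):
        remaining = q - cleared
        if hit and len(remaining) == 1:
            forced |= remaining

    # Stage 2: test every still-undetermined object individually.
    unresolved = [x for x in range(n) if x not in cleared and x not in forced]
    recovered = forced | {x for x in unresolved if x in positives}
    return recovered, len(unresolved)


if __name__ == "__main__":
    rng = random.Random(0)
    n, p = 1000, 5
    P = sample_positives(n, p, rng)
    for q1 in (0, 20, 80, 320):
        pools = random_pools(n, q1, n // (2 * p), rng)
        recovered, q2 = two_stage(n, P, pools)
        print(f"q1={q1:4d}  q2={q2:4d}  correct={recovered == P}")
```

The recovered set always equals $P$, so the only quantity that varies with the first-stage design is $q_2$. Random pools are merely a convenient choice for experimentation; the abstract's upper bound refers to the existence of suitable designs, not to this particular one.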
