Large Deviations for Statistical Sequence Matching

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We revisit the problem of statistical sequence matching between two databases of sequences initiated by Unnikrishnan (TIT 2015) and derive achievable theoretical performance guarantees for a generalized likelihood ratio test (GLRT) in the large deviations regime, when the number of matched pairs of sequences between the two databases is unknown. In this case, the task is to accurately estimate the number of matched pairs and identify the matched pairs of sequences among all possible matches between the sequences in the two databases. We generalize the GLRT by Unnikrishnan and explicitly characterize the tradeoff among the exponential decay rates for the probabilities of mismatch, false reject and false alarm. When one of the two databases contains a single sequence, the problem of statistical sequence matching specializes to the problem of multiple classification introduced by Gutman (TIT 1989). For this special case, our result strengthens the previous results of Gutman (TIT 1989) and Zhou, Tan and Motani (Information and Inference 2020) by allowing the testing sequence to be generated from a distribution that is different from the generating distributions of all training sequences.

Original language: English
Title of host publication: 2024 IEEE International Symposium on Information Theory, ISIT 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1275-1280
Number of pages: 6
ISBN (Electronic): 9798350382846
DOIs
State: Published - 2024
Event: 2024 IEEE International Symposium on Information Theory, ISIT 2024 - Athens, Greece
Duration: 7 Jul 2024 to 12 Jul 2024

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings
ISSN (Print): 2157-8095

Conference

Conference: 2024 IEEE International Symposium on Information Theory, ISIT 2024
Country/Territory: Greece
City: Athens
Period: 7/07/24 to 12/07/24

Keywords

  • False alarm
  • False reject
  • Finite length analysis
  • Misclassification
  • Second-order asymptotics
