Abstract

Peak detection is a central problem in genomic data analysis, with algorithms for this task being unsupervised and mostly effective for a single data type and pattern only. In response to this, “PeakSeg” was proposed, which is a constrained maximum Poisson likelihood segmentation model (more information on which can be found here).

The package PeakSegDP provides the baseline that computes an approximate solution (subject to a constraint on the number of segments) to the up-down (changes must alternate following an up-down trend) constrained problem which satisfies the PeakSeg constraints via the constrained dynamic programming algorithm (cDPA).

Time Complexity

cDPA is known to be a quadratic time algorithm, and the implementation for the same in PeakSegDP::cDPA is a low-level interface to the C solver, which retains the same computational complexity.

Reference

Source enacting as reference for the above information includes the github repository and this pdf.

Testing

We can verify the quadratic time complexity for this algorithm by a run with our time complexity analysis functions asymptoticTimings and asymptoticTimeComplexityClass:

library(testComplexity)
data.vec <- rpois(N, 1)
df <- asymptoticTimings(PeakSegDP::cDPA(data.vec, maxSegments = 3L), data.sizes = 10^seq(1, 4, by = 0.5))
asymptoticTimeComplexityClass(df)
#> [1] "quadratic"

Created on 2020-08-17 by the reprex package (v0.3.0)