Here’s a list of complexity classes which in turn are lists of functions with corresponding complexity, to be used as test cases passed onto testComplexity functions:
# packages: # functions:
library(PeakSegDP) # cDPA()
library(PeakSegDisk) # PeakSegFPOP_vec()
library(PeakSegOptimal) # PeakSegPDPA(), PeakSegFPOP()
library(fpop) # fpop()
library(gfpop) # gfpop()
library(opart) # opart_gaussian()
library(changepoint) # cpt.mean()
expected.funs <- list(
#-------------------------------------------------------------------------------------------------------------
# Complexity class A : Linear
linear = list(
# A.1: The substring function. (retrieve/replace substring in a string)
# Case: Expected linear time complexity, but found quadratic for this case: (bug)
# reference case: https://stat.ethz.ch/pipermail/r-devel/2019-February/077318.html
substring = function(N){
substring(paste(rep("A", N), collapse = ""), 1:N, 1:N)
},
# A.2: The gregexpr function. (global regexpr, giving all matches in a string)
# Case: Similar to substring (in terms of complexity) and strsplit(), expected linear but is quadratic
# reference case: https://stat.ethz.ch/pipermail/r-devel/2017-January/073577.html
gregexpr = function(N){
gregexpr("a", paste(collapse = "", rep("ab", N)), perl = TRUE)
},
# Note: Both substring and gregexpr were quadratic in R-3.5.2 but are linear starting in R-3.6.0.
# reference: https://github.com/tdhock/namedCapture-article#6-mar-2019
# A.3: The cpt.mean function using PELT method:
changepoint = function(N){
changepoint::cpt.mean(rnorm(N), method = "PELT") # additional parameters: penalty, pen.value etc.
}
),
#-------------------------------------------------------------------------------------------------------------
# Complexity class B : Loglinear
loglinear = list(
# B.1: The fpop function.
# reference : https://arxiv.org/pdf/1810.00117.pdf
fpop = function(N){
fpop::Fpop(rnorm(N), 1)
},
# B.2: The gfpop function. (generalized fpop)
# reference: https://arxiv.org/abs/1810.00117
gfpop = function(N){
gfpop(data = dataGenerator(as.integer(N), c(0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 1), c(0, 0.5, 1, 1.5, 2, 2.5, 3), sigma = 1), mygraph = graph(penalty = 2*log(as.integer(N)), type = "isotonic"), type = "mean")
},
# B.3: The PeakSegPDPA function.
PeakSegPDPA = function(N){
PeakSegOptimal::PeakSegPDPA(rpois(N, 1),rep(1, length(rpois(N, 1))), 3L) # max.segments >= 2.
},
# B.4: The PeakSegFPOP function.
# Reference: https://github.com/tdhock/PeakSegFPOP
PeakSegFPOP = function(N){
PeakSegOptimal::PeakSegFPOP(rpois(N, 1), rep(1, length(rpois(N, 1))), 3L)
},
# B.5: The PeakSegDisk::PeakSegFPOP_df function. (Fpop algo for writing data frame to disk)
PeakSegDisk = function(N){
PeakSegDisk::PeakSegFPOP_vec(rpois(N,10), 10)
}
),
#-------------------------------------------------------------------------------------------------------------
# Complexity class C : Quadratic
quadratic = list(
# C.1: The opart function
opart = function(N){
opart::opart_gaussian(rnorm(N), 1)
},
# C.2: The changepoint::cpt.mean function using SegNeigh method:
# Default penalty won't work so am using asymptotic penalty (i.e. 0 < penalty <= 1)
# It still explicity says "SegNeigh is computationally slow, use PELT instead".
changepoint = function(N){
changepoint::cpt.mean(rnorm(N), penalty = "Asymptotic", pen.value = 0.01, method = "SegNeigh")
},
# C.3: The cDPA function.
PeakSegDP = function(N){
PeakSegDP::cDPA(rpois(N, 1), rep(1, length(rpois(N, 1))), 3L)
},
# C.4: The PeakSegDisk::PeakSegFPOP_df function.
PeakSegDisk = function(N){
PeakSegDisk::PeakSegFPOP_vec(1:N, 10)
}
)
)
It’s currently composed of subsequent lists for the three major complexity classes we are concerned with, i.e. namely linear, log-linear and quadratic complexities.
Anirban | 05/20/2020 |