Title: | Weighted Segmentation using Functional Pruning and Optimal Partitioning |
---|---|
Description: | Weighted-L2 FPOP Maidstone et al. (2017) <doi:10.1007/s11222-016-9636-3> and pDPA/FPSN Rigaill (2010) <arXiv:1004.0887> algorithms for detecting multiple changepoints in the mean of a vector. Also includes a few model selection functions using Lebarbier (2005) <doi:10.1016/j.sigpro.2004.11.012> and the 'capushe' package. |
Authors: | Guillem Rigaill [aut, cre] |
Maintainer: | Guillem Rigaill <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.1 |
Built: | 2024-11-18 05:38:52 UTC |
Source: | https://github.com/cran/fpopw |
compress data and return a weighted profile
compress.data(x)
x | a numerical vector |
a list with the compressed profile x and associated repeat vector vrep
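A minimal sketch of what such a compression might look like, assuming it collapses runs of identical consecutive values (the text above does not spell out the rule). The helper name `compress_sketch` is hypothetical and relies only on base R's `rle`; it is not the package's implementation.

```r
# Hypothetical helper: collapse runs of equal consecutive values.
# Assumption: "compressed profile" means one value per run, with the
# run lengths kept as the repeat vector.
compress_sketch <- function(x) {
  runs <- rle(x)
  list(x = runs$values, vrep = runs$lengths)
}

cp <- compress_sketch(c(1, 1, 2, 2, 2, 3))
cp$x                 # one value per run
cp$vrep              # how many times each value was repeated
rep(cp$x, cp$vrep)   # recovers the original vector
```

The repeat vector can then serve as the weight vector of a weighted segmentation, since a run of identical points contributes to the L2-loss like a single point with that weight.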
Function to run the Fpop algorithm (Maidstone et al. 2016). It uses functional pruning and optimal partitioning. It optimizes the L2-loss for a penalty lambda per change.
Fpop(x, lambda, mini = min(x), maxi = max(x))
x | a numerical vector to segment |
lambda | the penalty per changepoint (see Maidstone et al. 2016) |
mini | minimum mean segment value to consider in the optimisation. |
maxi | maximum mean segment value to consider in the optimisation. |
Returns a list with a vector t.est containing the positions of the changepoints, the number of changes K, and the cost J.est.
x <- c(rnorm(100), rnorm(10^3)+2, rnorm(1000)+1)
est.sd <- sdDiff(x) ## rough estimate of std-deviation
res <- Fpop(x=x, lambda=2*est.sd^2*log(length(x)))
smt <- getSMT(res)
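To make the criterion concrete, here is a sketch of optimal partitioning solved by a plain quadratic-time dynamic program, without the functional pruning that Fpop relies on. The function name `op_sketch` and its output conventions (segment ends including n in `t.est`, `K` counting segments, one penalty per change) are assumptions made for illustration, not the package's behaviour.

```r
# Hypothetical helper: exact optimal partitioning for the L2-loss, O(n^2).
op_sketch <- function(x, lambda) {
  n  <- length(x)
  s1 <- c(0, cumsum(x))      # prefix sums
  s2 <- c(0, cumsum(x^2))    # prefix sums of squares
  # L2 cost of fitting a single mean to x[(i + 1):j]
  seg_cost <- function(i, j) {
    (s2[j + 1] - s2[i + 1]) - (s1[j + 1] - s1[i + 1])^2 / (j - i)
  }
  best <- numeric(n + 1)     # best[t + 1]: optimal penalized cost of x[1:t]
  best[1] <- -lambda         # the first segment carries no penalty
  last <- integer(n)         # last[t]: end of the previous segment (0 if none)
  for (t in seq_len(n)) {
    cand <- best[1:t] + seg_cost(0:(t - 1), t) + lambda
    k <- which.min(cand)
    best[t + 1] <- cand[k]
    last[t] <- k - 1L
  }
  # Backtrack from n to recover the segment ends
  tau <- integer(0)
  t <- n
  while (t > 0) {
    tau <- c(t, tau)
    t <- last[t]
  }
  list(t.est = tau, K = length(tau), J.est = best[n + 1])
}

res_sketch <- op_sketch(c(rep(0, 5), rep(10, 5)), lambda = 1)
res_sketch$t.est   # segment ends, the last one being n
```

Functional pruning reaches the same optimum while discarding candidate last changes that can never win, which is what makes the approach practical on long profiles.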
Function to run the Fpop algorithm (Maidstone et al. 2016) with weights. It uses functional pruning and optimal partitioning. It optimizes the weighted L2-loss ($\sum_i w_i (x_i - \mu)^2$) for a penalty lambda per change.
Fpop_w(x, w, lambda, mini = min(x), maxi = max(x))
x | a numerical vector to segment. |
w | a numerical vector of weights (values should be larger than 0). |
lambda | the penalty per changepoint (see Maidstone et al. 2016). |
mini | minimum mean segment value to consider in the optimisation. |
maxi | maximum mean segment value to consider in the optimisation. |
Returns a list with a vector t.est containing the positions of the changepoints, the number of changes K, and the cost J.est.
x <- c(rnorm(100), rnorm(10^3)+2, rnorm(1000)+1)
est.sd <- sdDiff(x) ## rough estimate of std-deviation
res <- Fpop_w(x=x, w=rep(1, length(x)), lambda=2*est.sd^2*log(length(x)))
smt <- getSMT(res)
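As an illustration of the weighted L2-loss on a single segment, the sketch below computes the weighted mean and the resulting cost in base R. The helper `weighted_seg_cost` is hypothetical; it only shows that weighting a value by its multiplicity gives the same cost as repeating it.

```r
# Hypothetical helper: weighted mean and weighted L2 cost of one segment.
weighted_seg_cost <- function(x, w) {
  mu <- sum(w * x) / sum(w)
  list(mean = mu, cost = sum(w * (x - mu)^2))
}

# Four points with unit weights ...
full <- weighted_seg_cost(c(1, 1, 1, 3), rep(1, 4))
# ... and the same data with the repeated value carried as a weight of 3
comp <- weighted_seg_cost(c(1, 3), c(3, 1))
c(full$cost, comp$cost)   # both costs agree
```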
The fpopw package provides wrappers to four functional pruning functions to solve the optimal partitioning and segment neighborhood problems with the L2-loss: Fpop, Fpop_w, Fpsn, Fpsn_w.
The fpopw functions are Fpop, Fpop_w, Fpsn, Fpsn_w, and Fpsn_w_nomemory.
Function to run the pDPA algorithm (Rigaill 2010 and 2015). It uses functional pruning and segment neighborhood. It optimizes the L2-loss for 1 to Kmax changes.
Fpsn(x, Kmax, mini = min(x), maxi = max(x))
x | a numerical vector to segment |
Kmax | max number of segments (segmentations in 1 to Kmax segments are recovered). |
mini | minimum mean segment value to consider in the optimisation |
maxi | maximum mean segment value to consider in the optimisation |
Returns a list with a matrix t.est containing the changepoints of the segmentations in 1 to Kmax changes, and the costs J.est in 1 to Kmax changes.
x <- c(rnorm(100), rnorm(10^3)+2, rnorm(1000)+1)
res <- Fpsn(x=x, Kmax=100)
select.res <- select_Fpsn(res, method="givenVariance")
smt <- getSMT(res, select.res)
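The segment neighborhood problem can be sketched with a plain dynamic program that, for each k, extends the best segmentation in k - 1 segments by one more segment. This version has no pruning and returns only the costs; the name `sn_sketch` and the convention that entry k is the cost in k segments are assumptions for illustration.

```r
# Hypothetical helper: segment neighborhood costs for the L2-loss, O(Kmax n^2).
sn_sketch <- function(x, Kmax) {
  n  <- length(x)
  s1 <- c(0, cumsum(x))
  s2 <- c(0, cumsum(x^2))
  # L2 cost of fitting a single mean to x[(i + 1):j]
  seg_cost <- function(i, j) {
    (s2[j + 1] - s2[i + 1]) - (s1[j + 1] - s1[i + 1])^2 / (j - i)
  }
  J <- matrix(Inf, Kmax, n)          # J[k, t]: best cost of x[1:t] in k segments
  J[1, ] <- seg_cost(0, seq_len(n))
  for (k in seq_len(Kmax)[-1]) {
    for (t in k:n) {
      s <- (k - 1):(t - 1)           # candidate ends of the first k - 1 segments
      J[k, t] <- min(J[k - 1, s] + seg_cost(s, t))
    }
  }
  J[, n]
}

J_sketch <- sn_sketch(c(0, 0, 0, 10, 10, 10), Kmax = 3)
J_sketch   # non-increasing in the number of segments
```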
Function to run the weighted pDPA algorithm (Rigaill 2010 and 2015). It uses functional pruning and segment neighborhood. It optimizes the weighted L2-loss ($\sum_i w_i (x_i - \mu)^2$) for 1 to Kmax changes.
Fpsn_w(x, w, Kmax, mini = min(x), maxi = max(x))
x | a numerical vector to segment |
w | a numerical vector of weights (values should be larger than 0). |
Kmax | max number of segments (segmentations in 1 to Kmax segments are recovered). |
mini | minimum mean segment value to consider in the optimisation |
maxi | maximum mean segment value to consider in the optimisation |
Returns a list with a matrix t.est containing the changepoints of the segmentations in 1 to Kmax changes, and the costs J.est in 1 to Kmax changes.
x <- c(rnorm(100), rnorm(10^3)+2, rnorm(1000)+1)
res <- Fpsn_w(x=x, w=rep(1, length(x)), Kmax=100)
select.res <- select_Fpsn(res, method="givenVariance")
smt <- getSMT(res, select.res)
Function to run the weighted pDPA algorithm (Rigaill 2010 and 2015) without storing the set of last changes. It only returns the costs in 1 to Kmax changes. It uses functional pruning and segment neighborhood. It optimizes the weighted L2-loss ($\sum_i w_i (x_i - \mu)^2$) for 1 to Kmax changes.
Fpsn_w_nomemory(x, w, Kmax, mini = min(x), maxi = max(x))
x | a numerical vector to segment |
w | a numerical vector of weights (values should be larger than 0). |
Kmax | max number of segments (segmentations in 1 to Kmax segments are recovered). |
mini | minimum mean segment value to consider in the optimisation |
maxi | maximum mean segment value to consider in the optimisation |
return a list with the costs J.est in 1 to Kmax changes.
res <- Fpsn_w_nomemory(x=rnorm(10^4), w=rep(1, 10^4), Kmax=100)
Function returning changes in a smoothed profile
get.change(smt)
smt | smoothed profile |
a vector of changes including n
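One way such changes could be read off a piecewise-constant profile, as a sketch: a change is the last index of each run of equal values, and n closes the final segment. The helper name `change_sketch` is hypothetical and the exact comparison used by the package is not stated here.

```r
# Hypothetical helper: last index of every constant run, n included.
change_sketch <- function(smt) {
  c(which(diff(smt) != 0), length(smt))
}

change_sketch(c(1, 1, 1, 4, 4, 2))   # runs end at positions 3, 5 and 6
```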
A function to get the segment sums of a vector given some changes including n
getSegSums_(x, tau)
x | data |
tau | changes (including n) |
a vector of the sums
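Segment sums of this kind can be obtained from cumulative sums evaluated at the changes. The sketch below assumes `tau` is increasing and ends with n; `seg_sums_sketch` is a hypothetical name, not the package's code.

```r
# Hypothetical helper: sums of x over the segments ending at tau.
seg_sums_sketch <- function(x, tau) {
  diff(c(0, cumsum(x)[tau]))
}

seg_sums_sketch(1:6, tau = c(2, 6))   # sum of x[1:2], then sum of x[3:6]
```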
A function to get the smoothed profile from the output of Fpop, Fpop_w, Fpsn and Fpsn_w
getSMT(res, K = NULL)
res | output of Fpop, Fpop_w, Fpsn or Fpsn_w |
K | the number of changes (only if Fpsn or Fpsn_w) |
a vector of the smoothed profile
A function to get the smoothed profile from the data, weights and changepoints
getSMT_(x, weights = NULL, tauHat)
x | data |
weights | weights |
tauHat | changes (including n) |
a vector of the smoothed profile
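A smoothed profile of this kind replaces every point by the weighted mean of its segment. The following base R sketch shows the idea; `smt_sketch` is a hypothetical name, and treating missing weights as all ones is an assumption.

```r
# Hypothetical helper: weighted segment means repeated along each segment.
smt_sketch <- function(x, weights = NULL, tauHat) {
  if (is.null(weights)) weights <- rep(1, length(x))   # assumed default
  len <- diff(c(0, tauHat))                  # segment lengths
  seg <- rep(seq_along(tauHat), len)         # segment label of each point
  mu  <- tapply(weights * x, seg, sum) / tapply(weights, seg, sum)
  as.numeric(mu[seg])
}

smt_sketch(c(1, 3, 10, 10, 10), tauHat = c(2, 5))
```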
Function to recover the changes for a given selected K after Fpsn_w_nomemory
getTau_nomemory(res_fpsn, K_selected)
res_fpsn | output of the function Fpsn_w_nomemory |
K_selected | K obtained using select_Fpsn |
return a set of changes
x <- c(rnorm(100), rnorm(10^3)+2, rnorm(1000)+1)
res <- Fpsn_w_nomemory(x=x, w=rep(1, length(x)), Kmax=100)
select.res <- select_Fpsn(res, method="givenVariance")
tau <- getTau_nomemory(res, select.res)
smt <- getSMT_(res$signal, res$weights, tau)
Function used internally by Fpop and Fpop_w to do the backtracking and recover the best set of changes from 1 to i
retour_op(path, i)
path | vector of length n containing the best last changes for any j in [1, n] |
i | the last position to consider to start the backtracking. |
set of optimal changes up to i.
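Backtracking of this kind can be sketched as a loop that repeatedly jumps to the stored last change. The sketch assumes `path[j]` holds the end of the segment preceding the one that ends at j, with 0 meaning no earlier segment; both this encoding and the name `backtrack_sketch` are assumptions.

```r
# Hypothetical helper: follow the last-change pointers back from i.
backtrack_sketch <- function(path, i) {
  tau <- integer(0)
  while (i > 0) {
    tau <- c(i, tau)   # prepend so that changes come out in increasing order
    i <- path[i]
  }
  tau
}

path <- c(0, 0, 0, 3, 3, 3, 6, 6)
backtrack_sketch(path, 8)   # 8 -> 6 -> 3 -> 0, so changes 3, 6, 8
```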
Function used internally by Fpsn and Fpsn_w to do the backtracking and recover the best set of segmentations in 1 to K changes from 1 to n.
retour_sn(path)
path | matrix of size (K x n) containing the last optimal changes up to j in k segments, with j in [1, n] and k in [1, K] |
a matrix of size (K x K) containing the best segmentations in 1 to K segments.
Model selection function taken from S3IB.
saut(Lv, pen, Kseq, n, seuil = sqrt(n)/log(n), biggest = TRUE)
Lv | likelihood |
pen | penalty |
Kseq | number of changes |
n | number of datapoints |
seuil | threshold |
biggest | heuristic (biggest jump or slope) |
a selected number of changes
Function to estimate the standard deviation
sdDiff(x, method = "MAD")
x | signal |
method | method used to estimate the variance: MAD or HALL |
return a numeric value
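A common difference-based estimator of this kind takes the MAD of the first differences and rescales by the square root of 2, which makes it largely insensitive to a few jumps in the mean. Whether sdDiff uses exactly this formula is an assumption; `sd_diff_sketch` is a hypothetical name and relies only on `stats::mad`.

```r
# Hypothetical helper: robust standard deviation from first differences.
# diff(x) has variance 2 * sigma^2 away from the changes, hence sqrt(2).
sd_diff_sketch <- function(x) {
  mad(diff(x)) / sqrt(2)
}

set.seed(1)
x_jump <- c(rnorm(5000, sd = 2), rnorm(5000, mean = 5, sd = 2))
sd_diff_sketch(x_jump)   # close to the true value of 2
sd(x_jump)               # inflated by the jump in the mean
```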
Function to select the number of changepoints after Fpsn or Fpsn_w using the penalty of Lebarbier 2005 given an estimator of the variance
select_Fpsn(res_fpsn, method = "givenVariance", sigma = sdDiff(res_fpsn$signal))
res_fpsn | output of Fpsn or Fpsn_w containing the costs in J.est and the segmented signal |
method | one of (1) "givenVariance" = using the penalty of Lebarbier 2005 given an estimator of the variance, (2) "biggest.S3IB" = biggest=TRUE in saut taken from S3IB, (3) "notbiggest.S3IB" = biggest=FALSE in saut taken from S3IB. |
sigma | variance used for the selection. If NULL, MAD on the unweighted data is used. |
return an integer: selected number of changes
x <- c(rnorm(100), rnorm(10^3)+2, rnorm(1000)+1)
res <- Fpsn_w(x=x, w=rep(1, length(x)), Kmax=100)
select.res <- select_Fpsn(res, method="givenVariance")
smt <- getSMT(res, select.res)
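As a sketch of penalized selection with a known variance, the code below adds a penalty of the form sigma^2 K (2 log(n / K) + 5) to the cost in K segments and keeps the minimizer. The constants 2 and 5 are the calibration usually quoted from Lebarbier (2005); that select_Fpsn uses exactly this expression is an assumption, and `select_sketch` is a hypothetical name.

```r
# Hypothetical helper: choose the number of segments by penalized cost.
# J[k] is assumed to be the L2 cost of the best segmentation in k segments.
select_sketch <- function(J, n, sigma) {
  K <- seq_along(J)
  crit <- J + sigma^2 * K * (2 * log(n / K) + 5)
  which.min(crit)
}

J_toy <- c(500, 120, 100, 95, 93)   # made-up costs, for illustration only
select_sketch(J_toy, n = 100, sigma = 1)
```

The cost always decreases as segments are added, so the penalty is what stops the selection once the gain from an extra segment becomes small relative to the noise level.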
Function to decompress a compressed smoothed profile (a call to rep)
uncompress.smt(smt.CP, vec.rep)
smt.CP | smoothed and compressed profile |
vec.rep | weights to use for decompression |
a vector to replicate duplicated datapoints
return a vector to uncompress a profile, segmentation or smt
uncompress.vec(vec.rep)
vec.rep | integer vector with the number of times each point should be repeated |
return a vector to uncompress a profile, segmentation or smt
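A plausible reading of such a decompression vector is an index vector in which position j of the compressed data appears vec.rep[j] times. The sketch below builds it with `rep` and applies it to a compressed profile; the variable names are hypothetical and the exact form returned by the package is an assumption.

```r
# Hypothetical illustration: index vector that undoes a compression.
vrep   <- c(2L, 3L, 1L)                 # repeat counts of the compressed points
idx    <- rep(seq_along(vrep), vrep)    # 1 1 2 2 2 3
smt_cp <- c(0.5, 2.0, 4.0)              # compressed smoothed profile

smt_cp[idx]                             # same result as rep(smt_cp, vrep)
```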