% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/moonboot.R
\name{estimate.m}
\alias{estimate.m}
\title{Estimating a Subsample Size m}
\usage{
estimate.m(
  data,
  statistic,
  tau = NULL,
  R = 1000,
  replace = FALSE,
  min.m = 3,
  method = "bickel",
  params = NULL,
  ...
)
}
\arguments{
\item{data}{The data to be bootstrapped.}

\item{statistic}{A function returning the statistic of interest. It must take two arguments. The first argument passed will be the original data, the second
will be a vector of indices. Any further arguments can be passed through the \code{...} argument.}

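\item{tau}{The convergence rate, given as a function of the sample size \code{n}. If \code{NULL}, it is estimated using \code{estimate.tau}, except for method 'sherman', which does not use it.}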

\item{R}{The number of bootstrap replicates. Must be a positive integer.}

\item{replace}{Whether sampling should be done with replacement. Setting this to \code{TRUE} requires a sufficiently smooth estimator.}

\item{min.m}{Minimum subsample size to be tried. Should be the smallest size for which the statistic makes sense.}

\item{method}{The method to be used, one of \code{c("goetze", "bickel", "politis", "sherman")}.}

\item{params}{Additional parameters to be passed to the internal functions, see details for more information.}

\item{...}{Additional parameters to be passed to the statistic.}
}
\value{
Subsample size \code{m} chosen by the selected method.
}
\description{
Estimates \code{m} using the selected \code{method}.
Additional parameters can be passed to the underlying methods using \code{params}.
Parameters can also be passed to the statistic using \code{...}.
}
\details{
The methods take different parameters. This wrapper function therefore has the \code{params} argument, which can be used to
pass method-specific arguments to the underlying methods. The specific parameters are described below.
Most of the provided methods need \code{tau}. If it is not provided, it is estimated using
\code{estimate.tau}. Note that method 'sherman' uses an alternative approach that does not rely on the scaling factor, so
\code{tau} is not computed when 'sherman' is selected. Any non-\code{NULL} value of \code{tau} is ignored when
the method 'sherman' is selected.

Possible methods are:

\describe{\item{goetze:}{
The method of Goetze and Rackauskas is based on minimizing the distance between the
CDFs of the bootstrap distributions obtained for different subsample sizes 'm'.
The 'Kolmogorov distance' is used as the distance measure.
The distance is minimized over pairs of subsample sizes 'm' and 'm/2'.
As this involves trying out all combinations of 'm' and 'm/2', the method has a running time of order Rn^2.
To reduce the runtime in practical use, \code{params} can be used to pass a \code{goetze.interval}, which is a
list of the smallest and largest value for m to try.}
\item{bickel:}{
This method works similarly to the previous one. The difference is that the subsample sizes to be
compared are consecutive subsample sizes generated by \code{q^j*n} for \code{j = seq(2,n)} and a chosen \code{q} value between
zero and one.
The parameter \code{q} can be set using \code{params}. The default value is \code{q=0.75}, as suggested in the corresponding paper.}
\item{politis:}{
This method is also known as the 'minimum volatility method'. It is based on the idea that there
should be some range of subsample sizes within which the choice of 'm' has little effect on the estimated confidence points.
The algorithm starts by smoothing the endpoints of the intervals and then calculates their standard deviation.
The \code{h.ci} parameter is the number of neighbors used for smoothing.
The \code{h.sigma} parameter is the number of neighbors used in the standard deviation calculation.
Both parameters can be set using \code{params}.
Note that \code{h.*} neighbors from each side are used.
To use five elements for smoothing, \code{h.ci} should therefore be set to 2.}
\item{sherman:}{
This method is based on a 'double-bootstrap' approach.
It tries to estimate the coverage error of different subsampling sizes and chooses the subsampling
size with the lowest one.
As estimating the coverage error is computationally very intensive, it is not practical to try all m values.
Therefore, the \code{gamma} parameter can be used to control which \code{m} values are tried. The values
are calculated as \code{ms = n^gamma}. The default is a sequence of 15 values between 0.3 and 0.9.
This parameter can be set using \code{params}.}}
}
\examples{
data <- runif(1000)
estimate.max <- function(data, indices) {return(max(data[indices]))}
tau <- function(n){n} # convergence rate (usually sqrt(n), but n for max)
chosen.m <- estimate.m(data, estimate.max, tau=tau, R = 1000, method = "bickel")
print(chosen.m)
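
# A hedged sketch of passing a method-specific argument through 'params'.
# It reuses 'data', 'estimate.max' and 'tau' defined above.
# Assumption: 'params' is a named list whose element names match the
# parameter names given in the details (here 'q' for method "bickel").
# The value 0.8 is an arbitrary illustration, not a recommendation.
m.q <- estimate.m(data, estimate.max, tau = tau, R = 1000,
                  method = "bickel", params = list(q = 0.8))
print(m.q)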

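# Illustration only (base R, not the package internals): candidate
# subsample sizes implied by the 'sherman' description, m = n^gamma, with
# gamma taken as 15 values between 0.3 and 0.9.
# Assumptions: equal spacing of gamma and rounding up to integers are
# choices made for this sketch; 'n.obs', 'gamma.grid' and 'ms.grid' are
# hypothetical names.
n.obs <- 1000
gamma.grid <- seq(0.3, 0.9, length.out = 15)
ms.grid <- unique(ceiling(n.obs^gamma.grid))
print(ms.grid)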

}
\references{
Götze F. and Rackauskas A. (2001) Adaptive choice of bootstrap sample sizes.
\emph{Lecture Notes-Monograph Series}, 36(State of the Art in Probability and Statistics):286-309.

Bickel P.J. and Sakov A. (2008) On the choice of m in the m out of n bootstrap and confidence bounds for extrema.
\emph{Statistica Sinica}, 18(3):967-985.

Politis D.N. et al. (1999)
\emph{Subsampling}, Springer, New York.

Sherman M. and Carlstein E. (2004) Confidence intervals based on estimators with unknown rates of convergence.
\emph{Computational Statistics & Data Analysis}, 46(1):123-136.
}
\seealso{
\code{\link{mboot}}, \code{\link{estimate.tau}}
}
