% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/shash.R
\name{cnorm.shash}
\alias{cnorm.shash}
\title{Fit a Sinh-Arcsinh (shash) Regression Model for Continuous Norming}
\usage{
cnorm.shash(
  age,
  score,
  weights = NULL,
  mu_degree = 3,
  sigma_degree = 2,
  epsilon_degree = 2,
  delta_degree = 1,
  delta = 1,
  control = NULL,
  scale = "T",
  plot = TRUE
)
}
\arguments{
\item{age}{A numeric vector of predictor values (typically age, but can be any continuous predictor).}

\item{score}{A numeric vector of response values (raw test scores). Must be the same length as age.
The value range is unrestricted and may include zeros and negative values.}

\item{weights}{An optional numeric vector of weights for each observation.
Useful for incorporating sampling weights. If NULL (default), all observations are weighted equally.}

\item{mu_degree}{Integer specifying the degree of the polynomial for modeling the location parameter mu(age).
Default is 3. Higher degrees allow more flexible modeling of how the central tendency changes with age,
but may lead to overfitting with small samples. Common choices:
\itemize{
  \item 1: Linear change with age
  \item 2: Quadratic change (allows one turning point)
  \item 3: Cubic change (allows two turning points and one inflection point, suitable for most developmental curves)
  \item 4+: Higher-order changes (use cautiously, mainly for large samples)
}}

\item{sigma_degree}{Integer specifying the degree of the polynomial for modeling the scale parameter sigma(age).
Default is 2. This controls how the variability (spread) of scores changes with age.
Lower degrees are often sufficient as variability typically changes more smoothly than location.}

\item{epsilon_degree}{Integer specifying the degree of the polynomial for modeling the skewness parameter epsilon(age).
Default is 2. This controls how the asymmetry of the distribution changes with age.}

\item{delta_degree}{Integer specifying the degree of the polynomial for modeling the tail weight parameter delta(age).
Default is 1. If estimation is numerically unstable, the tail weight can be fixed instead: set 'delta_degree' to NULL
and specify a value for 'delta'. Recommendation: Keep delta_degree low to avoid overfitting.}

\item{delta}{Fixed tail weight parameter (must be > 0). Default is 1. It is only used if 'delta_degree'
is set to NULL, in which case it controls the heaviness of the distribution tails and is held constant
across all ages. Common values:
\itemize{
  \item delta = 1: Normal-like tail behavior (baseline)
  \item delta > 1: Heavier tails, higher kurtosis (more extreme scores than normal distribution)
  \item delta < 1: Lighter tails, lower kurtosis (fewer extreme scores than normal distribution)
}}

\item{control}{An optional list of control parameters passed to the \code{optim} function for
maximum likelihood estimation. If NULL, sensible defaults are chosen automatically based on
the model complexity. Common parameters to adjust:
\itemize{
  \item \code{factr}: Controls precision of optimization (default: 1e-8)
  \item \code{maxit}: Maximum number of iterations (default: n_parameters * 200)
  \item \code{lmm}: Memory limit for L-BFGS-B (default: min(n_parameters, 20))
}
Increase \code{maxit} or decrease \code{factr} if optimization fails to converge.}

\item{scale}{Character string or numeric vector specifying the type of norm scale for output.
This affects the scaling of derived norm scores but does not influence model fitting:
\itemize{
  \item "T": T-scores (mean = 50, SD = 10) - default
  \item "IQ": IQ-like scores (mean = 100, SD = 15)
  \item "z": z-scores (mean = 0, SD = 1)
  \item c(M, SD): Custom scale with specified mean M and standard deviation SD
}}

\item{plot}{Logical indicating whether to automatically display a diagnostic plot of the fitted model.
Default is TRUE.}
}
\value{
An object of class "cnormShash" containing the fitted model results. This is a list with components:
  \item{mu_est}{Numeric vector of estimated coefficients for the location parameter mu(age).
    The first coefficient is the intercept, subsequent coefficients correspond to polynomial terms.}
  \item{sigma_est}{Numeric vector of estimated coefficients for the scale parameter log(sigma(age)).
    Note: These are coefficients for log(sigma) to ensure sigma > 0.}
  \item{epsilon_est}{Numeric vector of estimated coefficients for the skewness parameter epsilon(age).}
  \item{delta}{The fixed tail weight parameter value used in fitting (relevant if 'delta_degree' is NULL).}
  \item{delta_est}{Numeric vector of estimated coefficients for the tail weight parameter delta(age)
     (only available if 'delta_degree' is not NULL).}
  \item{se}{Numeric vector of standard errors for all estimated coefficients (if Hessian computation succeeds).}
  \item{mu_degree, sigma_degree, epsilon_degree}{The polynomial degrees used for each parameter.}
  \item{result}{Complete output from the \code{optim} function, including convergence information,
    log-likelihood value, and other optimization details.}
}
\description{
This function fits a Sinh-Arcsinh (shash; Jones & Pewsey, 2009) regression model for continuous norm
score modeling, where the distribution parameters vary smoothly as polynomial functions of age or other
predictors. The shash distribution is well-suited for psychometric data as it can flexibly model
skewness and tail weight independently, making it ideal for handling floor effects, ceiling effects,
and varying degrees of individual differences across age groups. In a simulation study (Lenhard et
al., 2019), the shash model outperformed other parametric approaches from the Box-Cox family of
functions. In contrast to Box-Cox, Sinh-Arcsinh can model distributions that include zero and
negative values.
}
\details{
This implementation uses the Jones & Pewsey (2009) parameterization of the Sinh-Arcsinh distribution.
Parameters are estimated using maximum likelihood via the L-BFGS-B algorithm. If optimization
fails, try reducing model complexity by lowering the polynomial degrees or fixing the delta parameter.

\subsection{The Sinh-Arcsinh Distribution}{
The shash distribution is defined by the transformation:
\deqn{X = \mu + \sigma \cdot \sinh\left(\frac{\text{arcsinh}(Y) - \epsilon}{\delta}\right)}
where Y is a standard normal variable, Y ~ N(0,1).

This transformation provides:
\itemize{
  \item mu: Location parameter (similar to mean)
  \item sigma: Scale parameter (similar to standard deviation)
  \item epsilon: Skewness parameter (epsilon = 0 for symmetry)
  \item delta: Tail weight parameter (delta = 1 for normal-like tails)
}
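
A minimal sketch of the transformation exactly as written above, in plain R. The helper name
\code{shash_transform} is hypothetical and not part of the package; the fitting code may
organize this computation differently.
\preformatted{
# Hypothetical helper: apply the formula above to standard normal draws y.
shash_transform <- function(y, mu = 0, sigma = 1, epsilon = 0, delta = 1) {
  stopifnot(sigma > 0, delta > 0)
  mu + sigma * sinh((asinh(y) - epsilon) / delta)
}

set.seed(42)
y <- rnorm(1000)

# With epsilon = 0 and delta = 1 the transformation reduces to mu + sigma * y,
# because sinh(asinh(y)) equals y.
x <- shash_transform(y, mu = 50, sigma = 10)
all.equal(x, 50 + 10 * y)
}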
}

\subsection{Model Selection}{
Choose polynomial degrees based on:
\itemize{
  \item Sample size (higher degrees need more data)
  \item Theoretical expectations about developmental trajectories
  \item Model comparison criteria (AIC, BIC)
  \item Visual inspection of fitted curves
}

For most applications, mu_degree = 3, sigma_degree = 2, epsilon_degree = 2, delta_degree = 1 provides
a good balance of flexibility and parsimony.
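
As a generic illustration of likelihood-based comparison, the sketch below fits a simple normal
model with \code{optim} and L-BFGS-B and derives AIC and BIC from the minimized negative
log-likelihood. It does not call \code{cnorm.shash}; the objective \code{nll} and the simulated
data are assumptions made for this example only.
\preformatted{
set.seed(1)
y <- rnorm(200, mean = 50, sd = 10)

# Negative log-likelihood; par[2] is log(sd), which keeps sd positive.
nll <- function(par) {
  -sum(dnorm(y, mean = par[1], sd = exp(par[2]), log = TRUE))
}

fit <- optim(
  par = c(mean(y), log(sd(y))),
  fn = nll,
  method = "L-BFGS-B",
  control = list(maxit = 400, lmm = 5)
)

k <- length(fit$par)                       # number of estimated parameters
aic <- 2 * k + 2 * fit$value               # fit$value is the minimized nll
bic <- log(length(y)) * k + 2 * fit$value
c(AIC = aic, BIC = bic)
}
Lower values indicate a better trade-off between fit and complexity; such criteria are only
comparable between models fitted to the same data.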
}
}
\note{
\itemize{
  \item The function requires the input data to have sufficient variability. Very small datasets
    or datasets with little age spread may cause convergence problems.
  \item Polynomial models can exhibit edge effects at the boundaries of the age range.
    Predictions outside the observed age range should be made cautiously.
  \item If convergence fails, try: (1) reducing polynomial degrees, (2) adjusting the delta parameter,
    (3) providing custom control parameters, or (4) checking for data quality issues.
  \item By default, the tail weight parameter delta varies linearly with age (delta_degree = 1). If tail
    behavior changes substantially with age, consider setting the delta_degree parameter to 2; if estimation
    becomes unstable, set delta_degree to NULL to fix delta across ages.
}
}
\examples{
\dontrun{
# Basic usage with default settings
model <- cnorm.shash(age = children$age, score = children$raw_score)

# Custom polynomial degrees for complex developmental pattern
model_complex <- cnorm.shash(
  age = adolescents$age,
  score = adolescents$vocabulary_score,
  mu_degree = 4,         # Complex mean trajectory
  sigma_degree = 3,      # Changing variability pattern
  epsilon_degree = 2,    # Skewness shifts
  delta_degree = NULL,   # set to NULL to activate fixed delta
  delta = 1.3            # Slightly heavy tails
)

# With sampling weights
model_weighted <- cnorm.shash(
  age = survey$age,
  score = survey$score,
  weights = survey$sample_weight
)

# Custom optimization control for difficult convergence
model_robust <- cnorm.shash(
  age = mixed$age,
  score = mixed$score,
  control = list(factr = 1e-6, maxit = 2000),
  delta_degree = NULL,   # fix the tail weight so that delta is used
  delta = 1.5
)

# Compare model fit
compare(model, model_complex)
}
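
# Illustration of the norm scales listed under the scale argument. This does
# not call the package: a z-score maps to another scale as M + SD * z.
# The helper name z_to_scale is hypothetical and only serves this example.
z_to_scale <- function(z, scale = "T") {
  if (is.character(scale)) {
    ms <- switch(scale,
      "T" = c(50, 10),
      "IQ" = c(100, 15),
      "z" = c(0, 1),
      stop("unknown scale")
    )
  } else {
    ms <- scale  # custom c(M, SD)
  }
  ms[1] + ms[2] * z
}

z <- c(-1, 0, 1.5)
z_to_scale(z, "T")        # 40 50 65
z_to_scale(z, "IQ")       # 85 100 122.5
z_to_scale(z, c(10, 3))   # 7 10 14.5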

}
\references{
Jones, M. C., & Pewsey, A. (2009). Sinh-arcsinh distributions. \emph{Biometrika}, 96(4), 761-780.

Lenhard, A., Lenhard, W., & Gary, S. (2019). Continuous norming of psychometric tests: A
simulation study of parametric and semi-parametric approaches. \emph{PLoS ONE}, 14(9), e0222279.
https://doi.org/10.1371/journal.pone.0222279
}
\seealso{
\code{\link{plot}} for plotting fitted models,
\code{\link{predict}} for generating predictions,
\code{\link{cnorm.betabinomial2}} for a discrete beta-binomial alternative
}
\author{
Wolfgang Lenhard
}
