ASCA_decompose • gASCA

This function performs the full ASCA decomposition of a data matrix.

ASCA_decompose(
  d,
  x,
  f,
  glm_par = vector(mode = "list", length = 0),
  res_type = "response"
)

Arguments

d: a data.frame or matrix holding the experimental design
x: a data.frame or matrix of numeric values to be decomposed
f: a string holding the formula of the decomposition, e.g. "time + treatment + time:treatment"
glm_par: a list of parameters to be passed to the glm call (default: an empty list)
res_type: the type of GLM residuals to return (default: "response")

Value

A list with the full outcome of the decomposition, containing the following elements:

decomposition: a list holding the results of the decomposition
mu: a vector with the constant terms of the univariate models
residuals: a matrix holding the model residuals. Their type is stored in the res_type element.
prediction: the matrix with the predicted values in the linear predictor space
pseudoR2: a parameter to assess the goodness of fit for the model on each variable.
glm_par: a list with the parameters used for modelling
res_type: the type of residuals
varimp: the importance of the individual variables in the decomposition terms
terms_L2: the L2 norm of the individual terms, back-transformed to the response space
d: a data.frame with the design
x: a data.frame with the initial data
f: the string defining the decomposition
combined: a vector holding the combined terms ()
invlink: the inverse of the link function used in the glm fitting

Details

The ASCA decomposition of a data matrix is performed by using Generalized Linear Models (GLMs) to estimate univariate expected values. The use of GLMs allows the method to be extended to non-normal data and unbalanced designs. This function performs only the decomposition, without the SVD, which has to be performed by ASCA_svd. The level of fit for each variable is assessed by calculating the pseudoR2, which is defined as:

$1 - \frac{\text{residual deviance}}{\text{null deviance}}$
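
As a rough illustration of this definition, the sketch below fits one Poisson GLM per variable with base R's glm and computes the pseudoR2 from the deviance and null.deviance components of each fit. The simulated design and counts objects are hypothetical stand-ins, and the code is not taken from the gASCA sources; it only mirrors the formula above.

```r
# Hypothetical toy data: one two-level factor and two count variables
set.seed(1)
design <- data.frame(treatment = factor(rep(c("a", "b"), each = 10)))
counts <- data.frame(
  v1 = rpois(20, lambda = ifelse(design$treatment == "a", 2, 8)),  # treatment effect
  v2 = rpois(20, lambda = 5)                                       # no treatment effect
)

# Fit one univariate GLM per variable and compute
# 1 - residual deviance / null deviance for each fit
pseudo_r2 <- sapply(counts, function(y) {
  fit <- glm(y ~ treatment, family = poisson(), data = design)
  1 - fit$deviance / fit$null.deviance
})

pseudo_r2
```

Values close to 1 indicate that the design explains most of the deviance of a variable, while values close to 0 indicate that the model does little better than an intercept-only fit.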

The varimp element stores a measure of the importance of each variable $c$ for each term, calculated as the norm of the corresponding column of that term. The L2 norm of each decomposition term is also calculated and stored in terms_L2.
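
The two quantities can be sketched on a small matrix. The example below assumes that "norm" means the Euclidean norm of a column and that the term-level L2 norm is the Frobenius norm of the whole term matrix; the term matrix itself is a made-up example, and the package may apply additional transformations (such as the back-transformation to the response space mentioned for terms_L2).

```r
# Hypothetical term matrix: 4 observations (rows) x 3 variables (columns)
term <- matrix(
  c(3, 0, 1,
    4, 0, 1,
    0, 2, 1,
    0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("v1", "v2", "v3"))
)

# Per-variable importance: Euclidean norm of each column
var_norm <- sqrt(colSums(term^2))

# Norm of the whole term: square root of the sum of all squared entries
# (equivalent to base R's norm(term, type = "F"))
term_norm <- sqrt(sum(term^2))

var_norm
term_norm
```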

Note that for count data with a large fraction of zeroes, the variable importance and the term L2 norms cannot be considered reliable measures of importance: with a log link, very low expected values correspond to very large negative values in the linear predictor space.
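
The effect of the log link can be seen with a few numbers. This minimal sketch uses only the linkfun and linkinv functions of base R's poisson() family object; the expected values in mu are arbitrary illustrative choices.

```r
fam <- poisson()                 # Poisson family, log link by default
mu  <- c(10, 1, 1e-3, 1e-6)      # expected counts, from moderate to near zero

# Map the expected values to the linear predictor space
eta <- fam$linkfun(mu)
round(eta, 2)                    # about 2.30, 0.00, -6.91, -13.82

# The inverse link maps the linear predictor back to the response space
all.equal(fam$linkinv(eta), mu)
```

An expected count of 1e-6 is practically indistinguishable from 1e-3 in the response space, yet the two differ by almost 7 units in the linear predictor space, which is why norms computed there can be dominated by near-zero variables.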

Examples

## load the package and the example data
library(gASCA)
data("synth_count_data")

## perform the ASCA decomposition using Poisson GLMs
dec_test <- ASCA_decompose(
  d = synth_count_data$design,
  x = synth_count_data$counts,
  f = "time + treatment + time:treatment",
  glm_par = list(family = poisson())
)