Skip to contents

Overview

The CMTFtoolbox package provides R users with two data fusion methods that have previously been presented in the MATLAB sphere.

  • cmtf_opt: Coupled Matrix and Tensor Factorization (CMTF) as described in Acar et al., 2011.
  • acmtf_opt: Advanced Coupled Matrix and Tensor Factorization (ACMTF) as described in Acar et al., 2013 and Acar et al., 2014.
  • acmtfr_opt: ACMTF-regression, currently in development.

Both of these methods were implemented using the all-at-once optimization approaches as described in the papers above. This implementation was achieved using the S4 Tensor object from rTensor and the various conjugate gradient approaches from mize. Other features of the package include:

  • ncrossreg: Jack-knife approach for determining the correct number of components for ACMTF-R.
  • npred: Prediction of Y for a new sample using an existing ACMTF-R model.
  • reinflateFac: reinflates all data blocks based on a CMTF, ACMTF, or ACMTF-R model for inspection and residual calculation.
  • Jakobsen2025: A three-block example dataset containing a subject-linked longitudinal microbiome infant gut microbiome, mother milk microbiome and mother milk metabolomics dataset. More information can be found in Poulsen et al., 2022.
  • The option of running the optimization algorithm as a line search (default) or in L-BFGS. See the documentation for more details.

Installation

You can install the development version of CMTFtoolbox from GitHub with:

# install.packages("devtools")
devtools::install_github("GRvanderPloeg/CMTFtoolbox")

Citation

Please use the following citation when using this package:

  • van der Ploeg, G. R., White, F. T. G., Westerhuis, J., Heintz-Buschart, A., & Smilde, A. (2024). ACMTF-R: multi-way data integration of biological variation of interest (manuscript in preparation).

Usage

library(CMTFtoolbox)

set.seed(123)
numComponents = 3
I = 108
J = 100
K = 10
L = 100
A = array(rnorm(I*numComponents), c(I, numComponents))  # shared subject mode
B = array(rnorm(J*numComponents), c(J, numComponents))  # distinct feature mode of X1
C = array(rnorm(K*numComponents), c(K, numComponents))  # distinct condition mode of X1
D = array(rnorm(L*numComponents), c(L, numComponents))  # distinct feature mode of X2
Y = matrix(A[,1])
lambdas = array(c(1, 1, 1, 0, 0, 1), c(2,3))

df1 = array(0L, c(I, J, K))
df2 = array(0L, c(I, L))
for(i in 1:numComponents){
  df1 = df1 + lambdas[1,i] * reinflateTensor(A[,i], B[,i], C[,i])
  df2 = df2 + lambdas[2,i] * reinflateMatrix(A[,i], D[,i])
}
datasets = list(df1, df2)
modes = list(c(1,2,3), c(1,4))
Z = setupCMTFdata(datasets, modes, normalize=TRUE)

cmtf_model = cmtf_opt(Z, 3)
acmtf_model = acmtf_opt(Z, 3)
acmtfr_model = acmtfr_opt(Z, Y, 3)