Overview
The CMTFtoolbox
package provides R users with two data fusion methods that have previously been presented in the MATLAB sphere.
-
cmtf_opt
: Coupled Matrix and Tensor Factorization (CMTF) (doi:10.48550/arXiv.1105.3422). -
acmtf_opt
: Advanced Coupled Matrix and Tensor Factorization (ACMTF) (doi:10.1186/1471-2105-15-239). -
acmtfr_opt
: ACMTF-regression (ACMTF-R) as described in van der Ploeg et al., 2025 (see citation below).
Both of these methods were implemented using the all-at-once optimization approaches as described in the papers above. This implementation was achieved using the S4 Tensor object from rTensor
and the various conjugate gradient approaches from mize
. Other features of the package include:
-
ACMTF_modelSelection
: Combined random initialization and cross-validation approach for determining the correct number of components in ACMTF. -
ACMTFR_modelSelection
: Combined random initialization and cross-validation approach for determining the correct number of components in ACMTF-R. -
npred
: Prediction of Y for a new sample using an existing ACMTF-R model. -
Georgiou2025
: An example dataset containing a tensor of inflammatory mediator data and a matrix of tooth microbiome data in a cohort of apical periodontitis patients (doi:10.1111/iej.13854 and doi:10.1111/iej.13912).
Installation
The CMTFtoolbox
package can be installed from CRAN using:
install.packages("CMTFtoolbox")
Development version
You can install the development version of CMTFtoolbox
from GitHub with:
# install.packages("devtools")
devtools::install_github("GRvanderPloeg/CMTFtoolbox")
Citation
Please use the following citation when using this package:
- van der Ploeg, G. R., White, F. T. G., Jakobsen, R. R., Westerhuis, J., Heintz-Buschart, A., & Smilde, A. (2024). ACMTF-R: supervised multi-omics data integration uncovering shared and distinct outcome-associated variation. bioRxiv. 2025-07
Usage
library(CMTFtoolbox)
set.seed(123)
numComponents = 3
I = 108
J = 100
K = 10
L = 100
A = array(rnorm(I*numComponents), c(I, numComponents)) # shared subject mode
B = array(rnorm(J*numComponents), c(J, numComponents)) # distinct feature mode of X1
C = array(rnorm(K*numComponents), c(K, numComponents)) # distinct condition mode of X1
D = array(rnorm(L*numComponents), c(L, numComponents)) # distinct feature mode of X2
Y = matrix(A[,1])
lambdas = array(c(1, 1, 1, 0, 0, 1), c(2,3))
df1 = array(0L, c(I, J, K))
df2 = array(0L, c(I, L))
for(i in 1:numComponents){
df1 = df1 + lambdas[1,i] * reinflateTensor(A[,i], B[,i], C[,i])
df2 = df2 + lambdas[2,i] * reinflateMatrix(A[,i], D[,i])
}
datasets = list(df1, df2)
modes = list(c(1,2,3), c(1,4))
Z = setupCMTFdata(datasets, modes, normalize=TRUE)
cmtf_model = cmtf_opt(Z, 3)
acmtf_model = acmtf_opt(Z, 3)
acmtfr_model = acmtfr_opt(Z, Y, 3)