Compute the centered log ratio (CLR) transform of a count matrix.
Arguments
- counts
count data with samples as rows and variables as columns
- pseudocount
value added to each count before taking the logarithm, to avoid log(0) when a count is zero
Details
The CLR of a vector x of counts in D categories is defined as clr(x) = log(x) - mean(log(x)). For details see van den Boogaart and Tolosana-Delgado (2013).
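As a minimal sketch of this definition, the base R snippet below computes the CLR of a single count vector by hand. The helper name `clr_vec` is hypothetical and is not the package function; adding the pseudocount before taking logs is an assumption based on the argument description above.

```r
# Hypothetical helper, for illustration only (not the package implementation)
clr_vec <- function(x, pseudocount = 0) {
  lx <- log(x + pseudocount)  # log of the (shifted) counts
  lx - mean(lx)               # center by the mean log over the D categories
}

x <- c(10, 20, 30, 40)
z <- clr_vec(x)

# CLR values sum to zero (up to floating point error)
sum(z)

# CLR is unchanged when all counts are multiplied by a constant
all.equal(z, clr_vec(3 * x))

# a zero count gives log(0) = -Inf, so no entry of the result is finite
clr_vec(c(0, 20, 30, 50))

# a pseudocount (0.5 is an arbitrary choice here) keeps every value finite
clr_vec(c(0, 20, 30, 50), pseudocount = 0.5)
```

Centering on the log scale is what makes the result depend only on the relative proportions of the categories, not on the total count.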
References
Van den Boogaart KG, Tolosana-Delgado R (2013). Analyzing compositional data with R, volume 122. Springer. https://link.springer.com/book/10.1007/978-3-642-36809-7.
Examples
# set probability of each category (rmultinom normalizes these internally to sum to 1)
prob <- c(0.1, 0.2, 0.3, 0.5)
# number of total counts
countsTotal <- 300
# number of samples
n_samples <- 100
# simulate info for each sample
info <- data.frame(Age = rgamma(n_samples, 50, 1))
rownames(info) <- paste0("sample_", 1:n_samples)
# simulate counts from multinomial
# note: size is set to n_samples (100) here, not countsTotal
counts <- t(rmultinom(n_samples, size = n_samples, prob = prob))
colnames(counts) <- paste0("cat_", 1:length(prob))
rownames(counts) <- paste0("sample_", 1:n_samples)
# centered log ratio
clr(counts)[1:4, ]
#>               cat_1       cat_2      cat_3     cat_4
#> sample_1 -0.8080212 -0.31847293 0.32507731 0.8014168
#> sample_2 -0.9426299 -0.01464310 0.26720806 0.6900649
#> sample_3 -0.9427424  0.15586990 0.07419186 0.7126806
#> sample_4 -1.2667081  0.04896869 0.26722226 0.9505171
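The printed values can be related back to the definition in Details. The sketch below uses only base R and compares a hand-computed CLR against the first printed row. Both the raw counts (9, 15, 29, 47) and the pseudocount of 0.5 are assumptions: they are not stated in this page and were back-calculated for illustration.

```r
# Assumed inputs: back-calculated for illustration, not taken from the source
x <- c(cat_1 = 9, cat_2 = 15, cat_3 = 29, cat_4 = 47)
pseudocount <- 0.5

# apply the definition from Details by hand
lx <- log(x + pseudocount)
z <- lx - mean(lx)

# first row of the printed output (sample_1)
printed <- c(cat_1 = -0.8080212, cat_2 = -0.31847293,
             cat_3 = 0.32507731, cat_4 = 0.8014168)

# largest absolute difference between hand-computed and printed values
max(abs(z - printed))
```

A difference near zero here would suggest that the transform adds the pseudocount to each count before taking logs, under the assumptions stated above.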