MLE for Dirichlet Multinomial
Arguments
- counts
matrix with rows as samples and columns as categories
- ...
additional arguments passed to
optim()
Value
list storing alpha
parameter estimates, logLik, and details about convergence
alpha
estimated \(alpha\) parameters
overdispersion
Overdispersion value \(1 + \rho^2(n-1)\) compared to multinomial
logLik
value of function
scale
scaling of \(\alpha\) parameters computed in a second optimization step
evals
number of function evaluations in step 1
convergence
convergence details from step 1
Details
Maximize Dirichlet Multinomial (DMN) log-likelihood with optim()
using log likelihood function and its gradient. This method uses a second round of optimization to estimate the scale of \(\alpha\) parameters, which is necessary for accurate estimation of overdispersion metric.
The covariance between counts in each category from DMN distributed data is \(n(diag(p) - pp^T) (1 + \rho^2(n-1))\) for \(n\) total counts, and vector of proportions \(p\), where \(\rho^2 = 1 / (a_0 + 1)\) and \(a_0 = \sum_i \alpha_i\). The count data is overdispersed by a factor of \(1 + \rho^2(n-1)\) compared to a multinomial (MN) distribution. As \(a_0\) increases, the DMN converges to the MN.
See https://en.wikipedia.org/wiki/Dirichlet-multinomial_distribution#Matrix_notation
See also
Other functions also estimate DMN parameters. MGLM::MGLMfit()
and dirmult::dirmult()
give good parameter estimates but are slower. Rfast::dirimultinom.mle()
often fails to converge
Examples
library(HMP)
#> Loading required package: dirmult
#>
#> Attaching package: ‘HMP’
#> The following object is masked from ‘package:dirmult’:
#>
#> weirMoM
set.seed(1)
n_samples <- 1000
n_counts <- 5000
alpha <- c(500, 1000, 2000)
# Dirichlet.multinomial
counts <- Dirichlet.multinomial(rep(n_counts, n_samples), alpha)
fit <- dmn.mle(counts)
fit
#> $alpha
#> Taxa 1 Taxa 2 Taxa 3
#> 506.3946 1015.2996 2027.6421
#>
#> $overdispersion
#> [1] 2.408036
#>
#> $logLik
#> [1] -4777957
#>
#> $scale
#> [1] 0.7098813
#>
#> $evals
#> function gradient
#> 2 2
#>
#> $convergence
#> [1] 0
#>
# overdispersion: true value
a0 <- sum(alpha)
rhoSq <- 1 / (a0 + 1)
1 + rhoSq * (n_counts - 1)
#> [1] 2.427878
# multinomial, so overdispersion is 1
counts <- t(rmultinom(n_samples, n_counts, prob = alpha / sum(alpha)))
dmn.mle(counts)
#> $alpha
#> [1] 2165173253 4324433613 8651164924
#>
#> $overdispersion
#> [1] 1
#>
#> $logLik
#> [1] -4779160
#>
#> $scale
#> [1] 70011.04
#>
#> $evals
#> function gradient
#> 33 33
#>
#> $convergence
#> [1] 0
#>
#