A key step in the production of chocolate is the fermentation of cocoa beans. This importance relies on its role in the development of chocolate’s flavor and aroma. Unlike other food fermentation processes, this specific fermentation is well known because of its lack of control and multiple ways in which it is performed. Here, a quantitative model of cocoa bean fermentation is constructed on previously available data regarding microbiological and metabolites dynamics. The model is formulated as a system of coupled ordinary differential equations (ODEs) with two different types of state variables: (1) Metabolite concentrations of glucose (Glc), fructose (Fru), ethanol (EtOH), lactic acid (LA) and acetic acid (Ac), and (2) population sizes of yeast (Y), lactic acid bacteria (LAB) and acetic acid bacteria (AAB). In total, the model comprehends 25 unknown parameters that were estimated using the Markov chain Monte Carlo No-U-Turn sampler in Rstan. Thereafter, we demonstrate that the model can quantitatively describe existing fermentation series and that the estimated parameters can be used to extract and interpret differences in environmental conditions between two independent fermentation trials [1].
The cocoa bean fermentation process occurs mostly in the pulp of raw cocoa beans right after opening of cocoa pods. It is called a spontaneous fermentation given that there is little or not control on its conduction. Opposed to other fermentation processes, the microorganisms responsible for driving it come into first contact with the raw material (cocoa beans) by a pletora of ways, including their exposure with the farmers’ hands, with tools used for the harvest and with the surface of fermenting devices. With respect to the latter, there are several methods that vary depending on the region where cocoa trees grow (i.e., heaps, wooden boxes and plastic containers) [2,3]. Despite these disparity of methodologies and inoculation means, it has been widely described in successful fermentations, the occurrence of three overlapping phasis. Each of these are governed by a different microbial group at a time in the same sequential order: (1) an early stage dominated by yeasts (Y), (2) a mid phase dominated by lactic acid bacteria (LAB) and (3) a final phase dominated by acetic acid bacteria (AAB).
Figure 1. (a) Metabolite and (b) microbial dynamics during cocoa bean fermentation [1,2].
The experimental data in this study were reported by Papalexandratou et al.[3] and Camu et al. [4] from two spontaneous fermentations trials conducted in boxes in Brazil and one by the heap method in Ghana, respectively. Papalexandratou et al. trials were done by means of the wooden boxes fermentation method under slightly different conditions. In one instance, the trial known as ‘box 1’, was placed under a metal roof that provided protection from weather conditions. In the other, the trial known as ‘box 2’, was placed within a fermentary room.
By the qualitative description of the process, it is expected that the problem will become one of a coupled nature given the dependencies of all its dynamics on microbial growth. The basic interactions of the state variables can be summarized into the conversion of substrates (glucose (Glc) and fructose (Fru)) into ethanol (EtOH) by Y, Glc into lactic acid (LA) by LAB and EtOH into acetic acid (Ac) by AAB. These relationships can be defined by the use of equations that must take into account the substrate and product interactions. In the first case, several approximations have been developed, among them, the classical Monod and Contois models. For the later, we considered either the lack or precense of interactions between fermentation products and microbial growth by including linear and non-linear relationships upon the death rates responsible for the decrease on their population. Among these baseline decisions, four candidate models were assessed to fit the previously described data sets as shown in Table 1.
Model | Multiple substrate | Product toxicity | Population size effect for LA consumption |
---|---|---|---|
M1 | no | no | no |
M2 | yes | no | no |
M3 | yes | yes | no |
M4 | yes | yes | yes |
Where Model 1 considers as substrate a sum of Glc and Fru and the Population size effect for LA consumption refers to the use of the Contois equation for the consumption of LA by AAB.
For a more detailed description of the conceptualization of the model, check the supplementary material accompanying the original manuscript
Take for example model 4. Its mathematical formulation it is as follows.
Figure 2. Network diagram of the cocoa bean fermentation process. Growth and death rates are expressed as \(v_i\) [1].
The growth and death rates (\(v_i\)) of microbial populations are driving the wires of the network. For the definition of the these rates, we used Monod or Contois equations for the former, and second- and third-order reaction terms for the latter.
Thus, growth rates are defined as:
\[\begin{align} v_1 & = \frac{\mu_\text{max}^{\text{Y}_\text{Glc}}\,\left[\text{Glc}\right]}{\left[\text{Glc}\right]+K_{\text{Glc}}^{\text{Y}}}\,\left[\text{Y}\right]\\[4pt] v_2 &= \frac{\mu_\text{max}^{\text{Y}_\text{Fru}}\,\left[\text{Fru}\right]}{\left[\text{Fru}\right]+K_{\text{Fru}}^{\text{Y}}}\,\left[\text{Y}\right]\\[4pt] v_3 &= \frac{\mu_\text{max}^\text{LAB}\,\left[\text{Glc}\right]}{\left[\text{Glc}\right]+K_{\text{Glc}}^{\text{LAB}}}\,\left[\text{LAB}\right]\\[4pt] v_4 &= \frac{\mu_\text{max}^{\text{AAB}_\text{EtOH}}\,\left[\text{EtOH}\right]}{\left[\text{EtOH}\right]+K_{\text{EtOH}}^{\text{AAB}}}\,\left[\text{AAB}\right]\\[4pt] v_5 &= \frac{\mu_\text{max}^{\text{AAB}_\text{LA}}\,\left[\text{LA}\right]}{\left[\text{LA}\right]+K_{\text{LA}}^{\text{AAB}}\,\left[\text{AAB}\right]}\left[\text{AAB}\right] \end{align}\]
And the death rates as:
\[\begin{align} v_6 &= k_{\text{Y}}\,\left[\text{Y}\right]\,\left[\text{EtOH}\right]\\ v_7 &= k_{\text{LAB}}\,\left[\text{LAB}\right]\,\left[\text{LA}\right]\\ v_8 &= k_{\text{AAB}}\,\left[\text{AAB}\right]\,\left[\text{Ac}\right]^2 \end{align}\]
So far, a sub-total of 13 unknown parameters are considered in these equations. Five maximum specific growth rates \(\mu_\text{max}\) of the form \(\mu_{\text{max}}^{i_n}\), where \(i\) can be either Y, LAB and AAB, and \(n\) refers whether \(\mu\) corresponds to the maximum specific growth of Y on either Glc or Fru, or AAB on either EtOH or LA. Five substrate saturation constants for the growth of Y, LAB and AAB of the form \(K_{m}^{j}\), where \(j\) can be either Y or LAB and \(m\) can be either Glc, Fru, EtOH and LA. Three constant mortality rates of the form \(k_i\), where \(i\) can be either Y, LAB or AAB.
By putting all these rates together, the addition of 11 more parameters is necessary for accounting the comsumption of substrates (i.e., Glc and Fru) and production of metabolites (i.e., EtOH, LA and Ac). Hence, by the formulation of the following system of ODEs, 11 yield coefficients of the form \(Y_{\text{i}|{j}}^{k}\) where \(i\) can be either Glc, Fru, EtOH, LA or Ac, \(j\) can be either Y, LAB or AAB and \(k\) can be either Glc, Fru, EtOH or LA.
\[\begin{align} \frac{d\left[\text{Glc}\right]}{dt} &= -Y_{\text{Glc}|\text{Y}}\,v_1-Y_{\text{Glc}|\text{LAB}}\,v_3 \\[4pt] \frac{d\left[\text{Fru}\right]}{dt} &= -Y_{\text{Fru}|\text{Y}}\,v_2 \\[4pt] \frac{d\left[\text{EtOH}\right]}{dt} &= Y^{\text{Glc}}_{\text{EtOH}|\text{Y}}\,v_1 + Y^{\text{Fru}}_{\text{EtOH}|\text{Y}}\,v_2 - Y_{\text{EtOH}|\text{AAB}}\,v_4 \\[4pt] \frac{d\left[\text{LA}\right]}{dt} &= Y_{\text{LA}|\text{LAB}}\,v_3 - Y_{\text{LA}|\text{AAB}}\,v_5 \\[4pt] \frac{d\left[\text{Ac}\right]}{dt} &= Y_{\text{Ac}|\text{LAB}}\,v_3 + Y^{\text{EtOH}}_{\text{Ac}|\text{AAB}}\,v_4 + Y^{\text{LA}}_{\text{Ac}|\text{AAB}}\,v_5 \label{2.5} \\[4pt] \frac{d\left[\text{Y}\right]}{dt} &= v_1+v_2-v_6 \\[2pt] \frac{d\left[\text{LAB}\right]}{dt} &= v_3-v_7 \\[2pt] \frac{d\left[\text{AAB}\right]}{dt} &= v_4+v_5-v_8 \end{align}\]
A complete interpretation of these parameters is the following:
Parameter | Units | Interpretation |
---|---|---|
\(\mu_\text{max}^{\text{Y}_\text{Glc}}\) | \(\text{mg Y }\text{h}^{-1}\) | Maximum specific growth rate of Y on Glc |
\(\mu_\text{max}^{\text{Y}_\text{Fru}}\) | \(\text{mg Y }\text{h}^{-1}\) | Maximum specific growth rate of Y on Fru |
\(\mu_\text{max}^\text{LAB}\) | \(\text{mg LAB }\text{h}^{-1}\) | Maximum specific growth rate of LAB on Glc |
\(\mu_\text{max}^{\text{AAB}_\text{EtOH}}\) | \(\text{mg AAB }\text{h}^{-1}\) | Maximum specific growth rate of AAB on EtOH |
\(\mu_\text{max}^{\text{AAB}_\text{LA}}\) | \(\text{mg AAB }\text{h}^{-1}\) | Maximum specific growth rate of AAB on LA |
\(K_{\text{Glc}}^{\text{Y}}\) | \(\text{mg Glc g pulp}^{-1}\) | Substrate saturation constant of Y growth on Glc |
\(K_{\text{Fru}}^{\text{Y}}\) | \(\text{mg Fru g pulp}^{-1}\) | Substrate saturation constant of Y growth on Fru |
\(K_{\text{Glc}}^{\text{LAB}}\) | \(\text{mg Glc g pulp}^{-1}\) | Substrate saturation constant of LAB growth on Glc |
\(K_{\text{EtOH}}^{\text{AAB}}\) | \(\text{mg EtOH g pulp}^{-1}\) | Substrate saturation constant of AAB growth on EtOH |
\(K_{\text{LA}}^{\text{AAB}}\) | \(\text{mg LA g pulp}^{-1}\) | Substrate saturation constant of AAB growth on LA |
\(k_{\text{Y}}\) | \(\text{mg Y mg EtOH}^{-1}\text{. h}^{-1}\) | Mortality rate constant of Y |
\(k_{\text{LAB}}\) | \(\text{mg LAB mg LA}^{-1}\text{. h}^{-1}\) | Mortality rate constant of LAB |
\(k_{\text{AAB}}\) | \(\text{mg AAB mg Ac}^{-2}\text{. h}^{-1}\) | Mortality rate constant of AAB |
\(Y_{\text{Glc}|\text{Y}}\) | \(\text{mg Glc mg Y}^{-1}\) | Y-to-Glc yield coefficient |
\(Y_{\text{Glc}|\text{LAB}}\) | \(\text{mg Glc mg LAB}^{-1}\) | LAB-to-Glc yield coefficient |
\(Y_{\text{Fru}|\text{Y}}\) | \(\text{mg Fru mg Y}^{-1}\) | Y-to-Fru yield coefficient |
\(Y^{\text{Glc}}_{\text{EtOH}|\text{Y}}\) | \(\text{mg EtOH mg Y}^{-1}\) | Y-to-EtOH from Glc yield coefficient |
\(Y^{\text{Fru}}_{\text{EtOH}|\text{Y}}\) | \(\text{mg EtOH mg Y}^{-1}\) | Y-to-EtOH from Fru yield coefficient |
\(Y_{\text{EtOH}|\text{AAB}}\) | \(\text{mg EtOH mg AAB}^{-1}\) | AAB-to-EtOH yield coefficient |
\(Y_{\text{LA}|\text{LAB}}\) | \(\text{mg LA mg LAB}^{-1}\) | LAB-to-LA yield coefficient |
\(Y_{\text{LA}|\text{AAB}}\) | \(\text{mg LA mg AAB}^{-1}\) | AAB-to-LA yield coefficient |
\(Y_{\text{Ac}|\text{LAB}}\) | \(\text{mg Ac mg LAB}^{-1}\) | LAB-to-Ac yield coefficient |
\(Y^{\text{EtOH}}_{\text{Ac}|\text{AAB}}\) | \(\text{mg Ac mg AAB}^{-1}\) | AAB-to-Ac from EtOH yield coefficient |
\(Y^{\text{LA}}_{\text{Ac}|\text{AAB}}\) | \(\text{mg Ac mg AAB}^{-1}\) | AAB-to-Ac from LA yield coefficient |
The aforementioned parameters can be considered elements of a vector \(\theta\) of the form \([\theta_1, \theta_2,\dots,\theta_k]\). By assuming that \(\theta\) has originated the data \(\mathcal{D}\), the parameter estimation problem can be solved by inferring the posterior probability of \(\theta\) given \(\mathcal{D}\), \(P(\theta|\mathcal{D})\). This \(P(\theta|\mathcal{D})\) can be expressed as the product over all series and each data point within time series as:
\[\begin{equation} P\left(\theta\mid \mathcal{D}\right) \propto \prod_{i=1}^{T} \prod_{j=1}^{N} P\left(\mathcal{D}_{i,j}\mid\theta \right)\,P\left(\theta \right), \end{equation}\]
where \(\mathcal{D}_{i,j}\) is the data point from time series \(i\) measured at time \(j\).
By assuming that each observation \(\mathcal{D}_{i,j}\) is drawn from a sampling distribution of the form:
\[\begin{equation} \mathcal{D}_{i,j} \sim \mathcal{N}\left(f\left(x_{i,j},\theta\right),\sigma\right), \end{equation}\]
Our \(P(\theta|\mathcal{D})\) takes the following form:
\[\begin{equation} P\left(\theta\mid \mathcal{D}\right) \propto \prod_{i=1}^{T} \prod_{j=1}^{N} \mathcal{N}\left(f\left(x_{i,j},\theta\right),\sigma\right)\,P\left(\theta \right), \end{equation}\]
where an extra parameter \(\sigma\) is needed for accounting the total standard deviation of the model.
Two problems arise when handling these data. On the one hand, the units in which microbial growth is measured constitutes the counting of Colony Forming Units (CFU). CFUs cannot be easily tossed in model with concentration measurements of metabolites (usually in milligrams per gram of pulp). For tackling this inconvenient, we make use of conversion factors from CFU to milligrams. These conversion factors were 15 picograms per CFU (\(pg\,CFU^{-1}\)), 1.25 \(pg\,CFU^{-1}\) and 0.28 \(pg\,CFU^{-1}\) for Y, LAB and AAB respectively (for a complete explanation of these conversion factors, see [1]).
On the other hand, after the conversion of microbial counts into mass units, the time series corresponding to Y, LAB and AAB showed accute differences in orders of magnitude with respect to the units of the metabolites. Such differences, as it is well known, are likely to lead to computational problems in any optimization scheme. With the aim of overcoming such issues, a variable scaling was performed. This scaling consisted on dividing each time series by its corresponding maximum. In other words, all time series are constrained to account with maximum values equal to unity.
From this point, the parameter estimation scheme is done over the scaled time series and once the posterior distribution of the parameters were obtained, these were subject to a re-scaling process to its real units through conversion factors derived mathematically from the ODE system (to see in further detail the re-scaling process, see [1]).
By the scaling of the time series, we assumed that the differences of the parameters on their real scales that can be away in orders of magnitude, are then regularized. This means that priors for each single parameter can lie between 0 and 1. Thus, an independent normal distrubution with mean 0.5 and standard deviation equal to 0.3 can be impossed to each \(k\) element of \(\theta\). In the case of the standard deviation of the model \(\sigma\), a heavy tailed distribution such a Cauchy with scale parameters of 0 and 1 will allow the sampling of extreme values that will cover possible outlying observations of the original data.
\[\begin{equation} \begin{aligned} \theta_k &\sim \mathcal{N}(0.5,\,0.3), && \theta_k>0 \\ \sigma &\sim \mathcal{C}(0,\,1), && \sigma>0. \end{aligned} \end{equation}\]
The aforementioned system of ODEs translated into Stan language is solved as an initial value problem, where the initial concentration of the 8 state variables are provided as they were reported in the original work of Papalexandratou et al. for boxes 1 and 2. In both cases, the ODEs were specified and solved by the mechanism rk45. The datasets were fitted by running 4 parallel Markov Chains of 3000 iterations each, 1000 of which were used as warm-up.
functions {
real[] cbf(real t,
real[] x,
real[] theta,
real[] y_r,
int[] y_i) {
real dxdt[8];
dxdt[1] = - theta[1]*theta[12]*x[1]*x[6]/(theta[17]+x[1]) - theta[2]*theta[14]*x[1]*x[7]/(theta[19]+x[1]);
dxdt[2] = - theta[3]*theta[13]*x[2]*x[6]/(theta[18]+x[2]);
dxdt[3] = theta[4]*theta[12]*x[1]*x[6]/(theta[17]+x[1]) + theta[5]*theta[13]*x[2]*x[6]/(theta[18]+x[2]) - theta[6]*theta[15]*x[3]*x[8]/(theta[20]+x[3]);
dxdt[4] = theta[7]*theta[14]*x[1]*x[7]/(theta[19]+x[1]) - theta[8]*theta[16]*x[4]*x[8]/(theta[21]*x[8]+x[4]);
dxdt[5] = theta[9]*theta[14]*x[1]*x[7]/(theta[19]+x[1]) + theta[10]*theta[15]*x[3]*x[8]/(theta[20]+x[3]) +theta[11]*theta[16]*x[4]*x[8]/(theta[21]*x[8]+x[4]);
dxdt[6] = theta[12]*x[1]*x[6]/(theta[17]+x[1]) + theta[13]*x[2]*x[6]/(theta[18]+x[2]) - theta[22]*x[6]*x[3];
dxdt[7] = theta[14]*x[1]*x[7]/(theta[19]+x[1]) - theta[23]*x[7]*x[4];
dxdt[8] = theta[15]*x[3]*x[8]/(theta[20]+x[3]) + theta[16]*x[4]*x[8]/(theta[21]*x[8]+x[4]) - theta[24]*x[8]*x[5]^2;
return dxdt;
}
}
data {
int<lower=1> T;
real<lower=0> x[T,8];
real t0;
real ts[T];
real x0[8];
real scl[8];
}
transformed data {
real y_r[0];
int y_i[0];
real<lower=0> x0_1[8];
real<lower=0> xn[T,8];
for (t in 1:T)
for (n in 1:8)
xn[t,n]=x[t,n]/scl[n];
for (n in 1:8)
x0_1[n] = x0[n]/scl[n];
}
parameters {
real <lower=0> sigma;
real <lower=0> mu1;
real <lower=0> mu2;
real <lower=0> mu3;
real <lower=0> mu4;
real <lower=0> mu5;
real <lower=0> ks1;
real <lower=0> ks2;
real <lower=0> ks3;
real <lower=0> ks4;
real <lower=0> ks5;
real <lower=0> k1;
real <lower=0> k2;
real <lower=0> k3;
real <lower=0> yc1;
real <lower=0> yc2;
real <lower=0> yc3;
real <lower=0> yc4;
real <lower=0> yc5;
real <lower=0> yc6;
real <lower=0> yc7;
real <lower=0> yc8;
real <lower=0> yc9;
real <lower=0> yc10;
real <lower=0> yc11;
}
transformed parameters {
real x_hat[T,8];
{
real theta[24];
theta[1] = yc1;
theta[2] = yc2;
theta[3] = yc3;
theta[4] = yc4;
theta[5] = yc5;
theta[6] = yc6;
theta[7] = yc7;
theta[8] = yc8;
theta[9] = yc9;
theta[10] = yc10;
theta[11] = yc11;
theta[12] = mu1;
theta[13] = mu2;
theta[14] = mu3;
theta[15] = mu4;
theta[16] = mu5;
theta[17] = ks1;
theta[18] = ks2;
theta[19] = ks3;
theta[20] = ks4;
theta[21] = ks5;
theta[22] = k1;
theta[23] = k2;
theta[24] = k3;
x_hat = integrate_ode_rk45(cbf, x0_1, t0, ts, theta, y_r, y_i,1.0E-6, 1.0E-6, 1.0E8);
}
}
model{
mu1~normal(0.5,0.3);
mu2~normal(0.5,0.3);
mu3~normal(0.5,0.3);
mu4~normal(0.5,0.3);
mu5~normal(0.5,0.3);
ks1~normal(0.5,0.3);
ks2~normal(0.5,0.3);
ks3~normal(0.5,0.3);
ks4~normal(0.5,0.3);
ks5~normal(0.5,0.3);
k1~normal(0.5,0.3);
k2~normal(0.5,0.3);
k3~normal(0.5,0.3);
yc1~normal(0.5,0.3);
yc2~normal(0.5,0.3);
yc3~normal(0.5,0.3);
yc4~normal(0.5,0.3);
yc5~normal(0.5,0.3);
yc6~normal(0.5,0.3);
yc7~normal(0.5,0.3);
yc8~normal(0.5,0.3);
yc9~normal(0.5,0.3);
yc10~normal(0.5,0.3);
yc11~normal(0.5,0.3);
sigma~cauchy(0,1);
for (t in 1:T)
xn[t]~normal(x_hat[t],sigma);
}
generated quantities{
real log_lik[T,8];
for (t in 1:T)
for (n in 1:8)
log_lik[t,n]=normal_lpdf(xn[t,n]|x_hat[t,n],sigma);
}
Both, Stan model and running scripts can be found in model/
.
The results for Camu et al., box 1 and box 2 of Papalexandratou et al. can be found in output/
. The R scripts for re-producing the following results can be found in scripts/
.
By conduction Leave-one-out cross-validation and widely applicable information criterion for the four models on all the data sets, the results suggest that models 3 and 4 have similar performances, at leas for the data corresponding to Camu and Box 2.
Figure 3. LOO and WAIC for the models (a) Camu, (b) box 1 and (c) box 2
However, we decided to take model 4 as the most plausible one given that the maximum growth rate of AAB on EtOH for model 3 showed a bimodal distribution.
Figure 4. Posterior distribution of the maximum especific growth rate of AAB on EtOH for model 3
For Camu, box 1 and box 2, the traceplots did not show evidence of problems in the mixing of the chains.
Figure 5. Traceplots of the scaled 24 parameters of the model for Camu (excluding sigma).
Figure 6. Traceplots of the scaled 24 parameters of the model for Box 1 (excluding sigma).
Figure 7. Traceplots of the scaled 24 parameters of the model for Box 2 (excluding sigma).
This is confirmed by the obtained \(\hat{R}\) for Camu:
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat valid
## mu1 0.248 0.002 0.094 0.098 0.179 0.237 0.304 0.464 3678 1.001 1
## mu2 0.363 0.002 0.107 0.178 0.286 0.358 0.433 0.593 4299 1.001 1
## mu3 0.358 0.001 0.068 0.237 0.311 0.354 0.402 0.501 4696 1.001 1
## mu4 0.378 0.002 0.095 0.200 0.317 0.376 0.437 0.570 2152 1.001 1
## mu5 0.009 0.001 0.016 0.000 0.002 0.005 0.010 0.048 603 1.001 1
## ks1 0.677 0.004 0.265 0.181 0.490 0.667 0.859 1.209 5366 1.001 1
## ks2 0.620 0.003 0.256 0.141 0.441 0.611 0.788 1.145 6310 1.000 1
## ks3 0.731 0.003 0.239 0.298 0.560 0.721 0.885 1.226 4829 1.001 1
## ks4 0.724 0.003 0.250 0.258 0.547 0.716 0.887 1.234 6795 1.000 1
## ks5 0.567 0.004 0.265 0.091 0.373 0.558 0.744 1.115 5476 1.001 1
## k1 0.743 0.002 0.111 0.551 0.665 0.734 0.814 0.978 5023 1.000 1
## k2 0.047 0.000 0.012 0.028 0.039 0.046 0.054 0.076 3945 1.001 1
## k3 0.257 0.001 0.053 0.163 0.220 0.254 0.290 0.373 5779 1.000 1
## yc1 0.626 0.003 0.208 0.283 0.470 0.605 0.759 1.079 5558 1.000 1
## yc2 0.317 0.002 0.119 0.125 0.229 0.301 0.386 0.595 3891 1.000 1
## yc3 0.677 0.003 0.185 0.388 0.541 0.653 0.788 1.103 4758 1.001 1
## yc4 0.325 0.002 0.192 0.026 0.178 0.306 0.452 0.748 5892 1.001 1
## yc5 0.248 0.002 0.141 0.021 0.144 0.236 0.336 0.560 5062 1.001 1
## yc6 0.116 0.002 0.071 0.040 0.076 0.103 0.137 0.266 969 1.001 1
## yc7 0.696 0.002 0.102 0.513 0.626 0.691 0.761 0.911 4196 1.000 1
## yc8 0.432 0.004 0.263 0.023 0.225 0.408 0.612 0.997 4814 1.001 1
## yc9 0.513 0.001 0.086 0.359 0.454 0.508 0.567 0.695 4584 1.000 1
## yc10 0.035 0.001 0.040 0.001 0.014 0.027 0.045 0.108 1023 1.001 1
## yc11 0.449 0.004 0.266 0.028 0.242 0.429 0.624 1.026 4805 1.001 1
## sigma 0.149 0.000 0.010 0.131 0.142 0.149 0.156 0.171 6549 1.000 1
## Q5 Q50 Q95 MCSE_Q2.5 MCSE_Q25 MCSE_Q50 MCSE_Q75 MCSE_Q97.5
## mu1 0.114 0.237 0.425 0.002 0.001 0.002 0.003 0.004
## mu2 0.201 0.358 0.549 0.003 0.002 0.002 0.002 0.005
## mu3 0.253 0.354 0.474 0.002 0.001 0.001 0.001 0.003
## mu4 0.234 0.376 0.535 0.013 0.002 0.001 0.002 0.003
## mu5 0.000 0.005 0.029 0.000 0.000 0.000 0.000 0.009
## ks1 0.256 0.667 1.121 0.010 0.006 0.005 0.004 0.008
## ks2 0.207 0.611 1.058 0.012 0.005 0.004 0.004 0.009
## ks3 0.357 0.721 1.143 0.007 0.005 0.004 0.004 0.007
## ks4 0.324 0.716 1.143 0.006 0.005 0.004 0.004 0.009
## ks5 0.145 0.558 1.014 0.008 0.006 0.005 0.004 0.010
## k1 0.576 0.734 0.935 0.003 0.002 0.001 0.002 0.005
## k2 0.031 0.046 0.070 0.000 0.000 0.000 0.000 0.001
## k3 0.176 0.254 0.349 0.002 0.001 0.001 0.001 0.002
## yc1 0.324 0.605 0.995 0.005 0.003 0.004 0.003 0.006
## yc2 0.150 0.301 0.531 0.002 0.001 0.002 0.002 0.008
## yc3 0.420 0.653 1.015 0.003 0.003 0.003 0.004 0.011
## yc4 0.048 0.306 0.670 0.002 0.003 0.003 0.004 0.009
## yc5 0.039 0.236 0.499 0.003 0.003 0.003 0.003 0.009
## yc6 0.047 0.103 0.214 0.001 0.001 0.001 0.001 0.020
## yc7 0.537 0.691 0.875 0.003 0.002 0.002 0.002 0.005
## yc8 0.046 0.408 0.896 0.003 0.005 0.005 0.005 0.011
## yc9 0.382 0.508 0.662 0.003 0.002 0.001 0.002 0.003
## yc10 0.003 0.027 0.086 0.000 0.000 0.000 0.001 0.007
## yc11 0.056 0.429 0.919 0.002 0.006 0.005 0.004 0.009
## sigma 0.133 0.149 0.167 0.000 0.000 0.000 0.000 0.001
## MCSE_SD Bulk_ESS Tail_ESS
## mu1 0.001 3530 4066
## mu2 0.001 4080 4193
## mu3 0.001 4463 4105
## mu4 0.001 2645 1083
## mu5 0.001 2304 1060
## ks1 0.003 4925 2708
## ks2 0.002 5750 2817
## ks3 0.002 4568 3805
## ks4 0.002 6264 3280
## ks5 0.003 4624 2149
## k1 0.001 5143 5705
## k2 0.000 4024 4848
## k3 0.000 5719 5351
## yc1 0.002 5237 4326
## yc2 0.001 3870 4930
## yc3 0.002 4750 5733
## yc4 0.002 5428 3831
## yc5 0.001 4193 2329
## yc6 0.002 3069 1388
## yc7 0.001 4141 5242
## yc8 0.003 3610 1696
## yc9 0.001 4499 5143
## yc10 0.001 2729 1897
## yc11 0.003 3471 2125
## sigma 0.000 6649 5429
Box 1:
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat valid
## mu1 0.063 0.000 0.025 0.023 0.046 0.060 0.077 0.120 4254 1.000 1
## mu2 0.083 0.000 0.030 0.039 0.062 0.079 0.100 0.152 4568 1.000 1
## mu3 0.417 0.001 0.096 0.248 0.348 0.411 0.479 0.626 4883 1.000 1
## mu4 0.149 0.001 0.051 0.058 0.112 0.146 0.181 0.257 4931 1.000 1
## mu5 0.025 0.000 0.017 0.002 0.012 0.022 0.035 0.065 4094 1.000 1
## ks1 0.623 0.004 0.274 0.114 0.426 0.617 0.812 1.162 5698 1.000 1
## ks2 0.502 0.004 0.283 0.038 0.282 0.482 0.694 1.096 4907 1.000 1
## ks3 0.576 0.004 0.251 0.130 0.392 0.564 0.744 1.098 4578 1.001 1
## ks4 0.626 0.003 0.274 0.129 0.430 0.615 0.810 1.193 7072 1.001 1
## ks5 0.552 0.003 0.271 0.071 0.350 0.542 0.738 1.117 6348 1.001 1
## k1 0.311 0.001 0.071 0.200 0.260 0.302 0.350 0.477 4715 1.000 1
## k2 0.310 0.001 0.088 0.158 0.248 0.303 0.364 0.499 3637 1.001 1
## k3 0.061 0.000 0.028 0.022 0.041 0.056 0.075 0.130 4027 1.001 1
## yc1 0.928 0.003 0.223 0.513 0.773 0.921 1.074 1.380 6045 1.000 1
## yc2 0.136 0.002 0.092 0.017 0.073 0.117 0.178 0.367 2869 1.001 1
## yc3 1.042 0.002 0.199 0.684 0.903 1.032 1.171 1.451 6337 1.000 1
## yc4 0.419 0.003 0.209 0.054 0.267 0.407 0.558 0.859 6228 1.000 1
## yc5 0.396 0.003 0.173 0.075 0.273 0.395 0.514 0.736 4465 1.000 1
## yc6 0.541 0.003 0.218 0.172 0.385 0.525 0.678 1.007 5565 1.001 1
## yc7 0.214 0.001 0.051 0.138 0.178 0.206 0.242 0.338 3830 1.001 1
## yc8 0.506 0.003 0.247 0.085 0.325 0.485 0.666 1.048 6324 1.001 1
## yc9 0.103 0.000 0.034 0.048 0.079 0.099 0.122 0.183 5144 1.000 1
## yc10 0.412 0.003 0.193 0.077 0.274 0.398 0.539 0.822 3899 1.001 1
## yc11 0.478 0.003 0.248 0.058 0.292 0.461 0.643 0.998 6122 1.000 1
## sigma 0.167 0.000 0.014 0.144 0.158 0.167 0.176 0.196 6203 1.000 1
## Q5 Q50 Q95 MCSE_Q2.5 MCSE_Q25 MCSE_Q50 MCSE_Q75 MCSE_Q97.5
## mu1 0.029 0.060 0.109 0.001 0.000 0.000 0.000 0.001
## mu2 0.043 0.079 0.138 0.000 0.000 0.001 0.001 0.002
## mu3 0.270 0.411 0.587 0.002 0.002 0.001 0.002 0.003
## mu4 0.071 0.146 0.237 0.001 0.001 0.001 0.001 0.002
## mu5 0.004 0.022 0.058 0.000 0.000 0.000 0.000 0.001
## ks1 0.180 0.617 1.075 0.009 0.006 0.005 0.004 0.012
## ks2 0.077 0.482 1.000 0.006 0.008 0.006 0.004 0.013
## ks3 0.184 0.564 1.017 0.010 0.006 0.005 0.005 0.007
## ks4 0.194 0.615 1.098 0.011 0.005 0.005 0.004 0.010
## ks5 0.127 0.542 1.015 0.009 0.007 0.005 0.004 0.011
## k1 0.214 0.302 0.445 0.001 0.001 0.001 0.001 0.005
## k2 0.179 0.303 0.463 0.002 0.002 0.001 0.002 0.005
## k3 0.026 0.056 0.114 0.000 0.000 0.000 0.001 0.002
## yc1 0.575 0.921 1.306 0.009 0.004 0.003 0.004 0.006
## yc2 0.027 0.117 0.310 0.001 0.001 0.002 0.002 0.009
## yc3 0.731 1.032 1.385 0.006 0.003 0.003 0.003 0.006
## yc4 0.095 0.407 0.786 0.006 0.003 0.003 0.005 0.008
## yc5 0.114 0.395 0.687 0.008 0.003 0.003 0.003 0.005
## yc6 0.214 0.525 0.929 0.004 0.004 0.003 0.003 0.010
## yc7 0.146 0.206 0.309 0.001 0.001 0.001 0.001 0.003
## yc8 0.131 0.485 0.950 0.006 0.004 0.005 0.004 0.008
## yc9 0.055 0.099 0.163 0.001 0.001 0.000 0.001 0.002
## yc10 0.117 0.398 0.753 0.009 0.004 0.003 0.004 0.006
## yc11 0.094 0.461 0.919 0.005 0.005 0.005 0.005 0.007
## sigma 0.147 0.167 0.191 0.000 0.000 0.000 0.000 0.001
## MCSE_SD Bulk_ESS Tail_ESS
## mu1 0.000 3708 3104
## mu2 0.000 4045 4078
## mu3 0.001 4415 3091
## mu4 0.001 4547 4417
## mu5 0.000 4049 4292
## ks1 0.003 5004 2310
## ks2 0.003 3926 2351
## ks3 0.003 4143 2578
## ks4 0.002 6142 2867
## ks5 0.002 5374 2558
## k1 0.001 5108 5391
## k2 0.001 3710 4256
## k3 0.000 4325 5382
## yc1 0.002 5898 3931
## yc2 0.001 3294 3595
## yc3 0.002 6164 5292
## yc4 0.002 5516 3248
## yc5 0.002 4120 2888
## yc6 0.002 5039 4222
## yc7 0.001 3998 4746
## yc8 0.002 5382 3182
## yc9 0.000 5302 5313
## yc10 0.002 3439 2123
## yc11 0.002 5208 3077
## sigma 0.000 6383 5661
And for Box 2:
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat valid
## mu1 0.366 0.002 0.131 0.157 0.269 0.351 0.446 0.664 4125 1.001 1
## mu2 0.573 0.002 0.151 0.293 0.468 0.568 0.671 0.884 5038 1.000 1
## mu3 0.499 0.002 0.151 0.233 0.388 0.490 0.601 0.814 5123 1.001 1
## mu4 0.168 0.001 0.053 0.078 0.129 0.164 0.202 0.278 4447 1.000 1
## mu5 0.021 0.000 0.016 0.001 0.009 0.018 0.030 0.059 4273 1.001 1
## ks1 0.698 0.003 0.251 0.230 0.523 0.691 0.865 1.218 5731 1.000 1
## ks2 0.614 0.003 0.256 0.126 0.437 0.610 0.787 1.130 6131 1.000 1
## ks3 0.450 0.003 0.247 0.048 0.261 0.432 0.614 0.983 5292 1.001 1
## ks4 0.574 0.004 0.287 0.056 0.364 0.561 0.768 1.164 5280 1.001 1
## ks5 0.525 0.003 0.273 0.053 0.324 0.512 0.710 1.094 6441 1.000 1
## k1 0.652 0.003 0.176 0.333 0.519 0.648 0.775 1.003 4486 1.000 1
## k2 0.283 0.002 0.104 0.106 0.208 0.273 0.348 0.510 3170 1.002 1
## k3 0.127 0.001 0.052 0.053 0.091 0.119 0.153 0.253 5670 1.000 1
## yc1 0.682 0.002 0.201 0.352 0.533 0.662 0.811 1.124 6828 1.000 1
## yc2 0.055 0.001 0.056 0.001 0.016 0.038 0.075 0.208 3922 1.001 1
## yc3 0.837 0.003 0.229 0.442 0.666 0.821 0.986 1.316 7063 1.001 1
## yc4 0.291 0.002 0.155 0.031 0.176 0.281 0.392 0.628 5086 1.000 1
## yc5 0.201 0.002 0.135 0.010 0.092 0.181 0.285 0.511 5601 1.000 1
## yc6 0.650 0.003 0.208 0.295 0.499 0.636 0.778 1.106 6858 1.000 1
## yc7 0.363 0.002 0.131 0.186 0.267 0.337 0.431 0.689 3386 1.001 1
## yc8 0.415 0.003 0.248 0.035 0.222 0.388 0.576 0.963 6229 1.000 1
## yc9 0.108 0.001 0.069 0.012 0.059 0.096 0.141 0.282 4111 1.001 1
## yc10 0.473 0.003 0.192 0.112 0.344 0.466 0.596 0.874 4627 1.000 1
## yc11 0.554 0.003 0.260 0.086 0.367 0.541 0.730 1.081 6431 1.000 1
## sigma 0.167 0.000 0.013 0.144 0.158 0.166 0.176 0.195 6835 1.001 1
## Q5 Q50 Q95 MCSE_Q2.5 MCSE_Q25 MCSE_Q50 MCSE_Q75 MCSE_Q97.5
## mu1 0.180 0.351 0.606 0.001 0.002 0.003 0.003 0.007
## mu2 0.332 0.568 0.833 0.005 0.003 0.003 0.003 0.007
## mu3 0.263 0.490 0.757 0.005 0.003 0.003 0.003 0.006
## mu4 0.089 0.164 0.261 0.001 0.001 0.001 0.001 0.002
## mu5 0.002 0.018 0.052 0.000 0.000 0.000 0.000 0.001
## ks1 0.301 0.691 1.124 0.013 0.005 0.004 0.004 0.012
## ks2 0.194 0.610 1.045 0.009 0.006 0.003 0.004 0.010
## ks3 0.084 0.432 0.889 0.005 0.005 0.004 0.005 0.006
## ks4 0.118 0.561 1.074 0.008 0.007 0.005 0.005 0.009
## ks5 0.099 0.512 0.991 0.007 0.007 0.005 0.005 0.010
## k1 0.372 0.648 0.949 0.003 0.003 0.004 0.003 0.005
## k2 0.127 0.273 0.466 0.002 0.002 0.002 0.002 0.006
## k3 0.060 0.119 0.225 0.001 0.001 0.001 0.001 0.003
## yc1 0.388 0.662 1.044 0.004 0.003 0.003 0.003 0.007
## yc2 0.003 0.038 0.166 0.000 0.001 0.001 0.001 0.005
## yc3 0.491 0.821 1.236 0.004 0.004 0.004 0.003 0.007
## yc4 0.058 0.281 0.570 0.004 0.003 0.002 0.003 0.007
## yc5 0.020 0.181 0.451 0.001 0.002 0.002 0.003 0.006
## yc6 0.340 0.636 1.029 0.006 0.004 0.003 0.004 0.008
## yc7 0.203 0.337 0.611 0.002 0.002 0.002 0.003 0.011
## yc8 0.063 0.388 0.865 0.004 0.005 0.004 0.004 0.008
## yc9 0.021 0.096 0.240 0.001 0.001 0.001 0.002 0.005
## yc10 0.165 0.466 0.797 0.010 0.004 0.003 0.003 0.007
## yc11 0.140 0.541 1.000 0.008 0.005 0.004 0.004 0.010
## sigma 0.147 0.166 0.190 0.000 0.000 0.000 0.000 0.001
## MCSE_SD Bulk_ESS Tail_ESS
## mu1 0.001 3989 4877
## mu2 0.002 4934 5002
## mu3 0.001 4631 3335
## mu4 0.001 4085 3940
## mu5 0.000 3716 3126
## ks1 0.002 5178 2831
## ks2 0.002 5504 2626
## ks3 0.002 4436 2939
## ks4 0.003 4466 2468
## ks5 0.002 5130 2316
## k1 0.002 4469 4740
## k2 0.001 3312 4391
## k3 0.001 5891 5217
## yc1 0.002 6660 5821
## yc2 0.001 3893 4222
## yc3 0.002 6895 6028
## yc4 0.002 4552 3173
## yc5 0.001 5264 3324
## yc6 0.002 6518 4945
## yc7 0.002 3527 4882
## yc8 0.002 5004 3470
## yc9 0.001 3486 2586
## yc10 0.002 4256 2431
## yc11 0.002 5632 3017
## sigma 0.000 7171 5189
By looking at the predictions, the model performs remarkably well for all data sets:
Figure 8. Predictions of the model for Camu (a) Glc, (b) Fru, (c) EtOH, (d) LA, (e) Ac, (f) Y, (g) LAB and (h) AAB. Solid red lines represent the median of the posterior distribution at each time point. Dashed red lines represent the 95% credible interval. Black dots represent the real data reported in [4].
Figure 9. Predictions of the model for box 1 (a) Glc, (b) Fru, (c) EtOH, (d) LA, (e) Ac, (f) Y, (g) LAB and (h) AAB. Solid red lines represent the median of the posterior distribution at each time point. Dashed red lines represent the 95% credible interval. Black dots represent the real data reported in [3].
Figure 10. Predictions of the model for box 2 (a) Glc, (b) Fru, (c) EtOH, (d) LA, (e) Ac, (f) Y, (g) LAB and (h) AAB. Solid red lines represent the median of the posterior distribution at each time point. Dashed red lines represent the 95% credible interval. Black dots represent the real data reported in [3].
At looking at the obtained posterior distributions from box 1 and 2 it is possible to have hints on how subtle differences in how both were conducted might affect kinetic parameters.
Figure 11. Violin plots of the re-scaled posterior distributions of the 24 parameters showing the differences between boxes 1 and 2.
We hypothesize that three main differences between these two fermentations played a role in the disparity of the parameters: (1) initial concentration of microorganisms, (2) concentration ratios of initial substrates and (3) evaporation rates.
The biggest difference between the two trials is the initial concentration of Y. This were equal to 0.18 and 0.005 \(mg\,g^{-1}\) for box 1 and 2 respectively. This could be the cause of the differing maximum growth rates estimated for Y between the trials. Similar effect can be seen on parameters directly related to the population of Y such as, the yield coefficient of Y growing on Glc (\(Y_{\text{Glc}|Y}\)) where box 1 showed a higher value than box 2 as a consequence of its higher initial inoculum of Y.
It is worth of attention to take into account that the ratio of the initial substrates Glc and Fru differed between both trials. This ratio (\(\frac{Glc}{Fru}\)) were equal to 1.11 and 0.64 for boxes 1 and 2 respectively. Such a dissimilarity would be the explanation of the estimates obtained for the yield coefficient of the growth of LAB on Glc (\(Y_{\text{Glc}|\text{LAB}}\)) suggesting a resource-type competition between Y and LAB.
Finally, regarding the yield coefficient of growth of AAB on EtOH (\(Y_{\text{EtOH}|\text{AAB}}\)), the higher value obtained in box 1 with respect to box 2 might be explained for a possibly greater evaporation rate of EtOH in box 1 (remember that box 1 was conducted under a metal roof instead as in a fermentary room as box 2). The evaporation rate might explain the difference of other parameters too, such as the saturation constant for the growth of AAB on LA (\(K_{\text{LA}}^{\text{AAB}}\)) and the yield coefficient of consumption of LA by AAB (\(Y_{\text{LA}|\text{AAB}}\)).
The aforementioned model represents the first fully working kinetic model for cocoa bean fermentation capable of predict notably well fermentation time series. It represents also an interesting method to see into the mechanisms of the process that might serve for its optimization in the future.
The authors would like to express their acknowledgement to Barry Callebaut for funding this research as part of the COMETA project, driven by Jacobs University Bremen. Finally, but not the least, MM would like to give a deep thank you to all the Stan Development team as well as the active community of the discussion group of Stan for their help.
[1] Moreno-Zambrano, M., Grimbs, S., Ullrich M. S. and Hütt, M-T. (2018). A mathematical model of cocoa bean fermantation. Royal Society Open Science, 5(10), 180 964.doi: 10.1098/rsos.180964
[2] De Vuyst, L. and Weckx, S. (2016). The cocoa bean fermentation process: from ecosystem analysis to starter culture development. Journal of Applied Microbiology. 121, 5-17.
[3] Papalexandratou, Z., Vrancken, G., Bruyne, K. D., Vandamme, P. and De Vuyst, L. (2011). Spontaneous organic cocoa bean fermentations in Brazil are characterized by a restricted species diversity of lactic acid bacteria and acetic acid bacteria. Food Microbiology. 28, 1326-1338.
[4] Camu, N., De Winter, T., Verbrugghe, K., Cleenwerck, I., Vandamme, P., Takrama, J. S., Vancanneyt, M. and De Vuyst, L. (2007). Dynamics and biodiversity of populations of lactic acid bacteria and acetic acid bacteria involved in spontaneous heap fermentation of cocoa beans in Ghana. Applied and Environmental Microbiology. 74:6, 1809-1824.