|
| 1 | +.. _covariancesection: |
| 2 | + |
1 | 3 | Covariance Matrix Estimation |
2 | 4 | ============================ |
3 | 5 |
|
4 | | -The uncertainty in the estimated parameters is quantified using the covariance matrix. |
5 | | -The diagonal of the covariance matrix contains the variance of the estimated parameters. |
6 | | -Assuming Gaussian independent and identically distributed measurement errors, the |
7 | | -covariance matrix of the estimated parameters can be computed using the following |
8 | | -methods which have been implemented in parmest. |
| 6 | +Parameter estimation (see the :ref:`driversection` section) determines unknown model parameters
| 7 | +from experimental data. The accuracy of the estimated parameters is quantified by the covariance
| 8 | +matrix. The diagonal of this covariance matrix contains the variance of the estimated parameters,
| 9 | +which is used to calculate their uncertainty. Assuming Gaussian, independent, and identically
| 10 | +distributed measurement errors, the covariance matrix of the estimated parameters can be computed
| 11 | +using the following methods, which have been implemented in parmest.
9 | 12 |
|
10 | 13 | 1. Reduced Hessian Method |
11 | 14 |
|
12 | | - When the objective function is the sum of squared errors (SSE): |
13 | | - :math:`\text{SSE} = \sum_{i = 1}^n \left(y_{i} - \hat{y}_{i}\right)^2`, |
14 | | - the covariance matrix is: |
| 15 | + When the objective function is the sum of squared errors (SSE) for homogeneous data (i.e., all measured variables share the same measurement error variance), defined as
| 16 | + :math:`\text{SSE} = \sum_{i = 1}^{n} \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i}; |
| 17 | + \boldsymbol{\theta})\right)^\text{T} \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i}; |
| 18 | + \boldsymbol{\theta})\right)`, the covariance matrix is: |
15 | 19 |
|
16 | 20 | .. math:: |
17 | | - V_{\boldsymbol{\theta}} = 2 \sigma^2 \left(\frac{\partial^2 \text{SSE}} |
18 | | - {\partial \boldsymbol{\theta} \partial \boldsymbol{\theta}}\right)^{-1}_{\boldsymbol{\theta} |
19 | | - = \boldsymbol{\theta}^*} |
20 | | -
|
21 | | - When the objective function is the weighted SSE (WSSE): |
22 | | - :math:`\text{WSSE} = \frac{1}{2} \left(\mathbf{y} - f(\mathbf{x};\boldsymbol{\theta})\right)^\text{T} |
23 | | - \mathbf{W} \left(\mathbf{y} - f(\mathbf{x};\boldsymbol{\theta})\right)`, |
| 21 | + \boldsymbol{V}_{\boldsymbol{\theta}} = 2 \sigma^2 \left(\frac{\partial^2 \text{SSE}} |
| 22 | + {\partial \boldsymbol{\theta}^2}\right)^{-1}_{\boldsymbol{\theta} |
| 23 | + = \hat{\boldsymbol{\theta}}} |
| 24 | +
|
| 25 | + Similarly, when the objective function is the weighted SSE (WSSE) for heterogeneous data (i.e., the measurement error variance differs across measured variables), defined as
| 26 | + :math:`\text{WSSE} = \frac{1}{2} \sum_{i = 1}^{n} \left(\boldsymbol{y}_{i} - |
| 27 | + \boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})\right)^\text{T} \boldsymbol{\Sigma}_{\boldsymbol{y}}^{-1} |
| 28 | + \left(\boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i};\boldsymbol{\theta})\right)`, |
24 | 29 | the covariance matrix is: |
25 | 30 |
|
26 | 31 | .. math:: |
27 | | - V_{\boldsymbol{\theta}} = \left(\frac{\partial^2 \text{WSSE}} |
28 | | - {\partial \boldsymbol{\theta} \partial \boldsymbol{\theta}}\right)^{-1}_{\boldsymbol{\theta} |
29 | | - = \boldsymbol{\theta}^*} |
30 | | -
|
31 | | - Where :math:`V_{\boldsymbol{\theta}}` is the covariance matrix of the estimated |
32 | | - parameters, :math:`y` are the observed measured variables, :math:`\hat{y}` are the |
33 | | - predicted measured variables, :math:`n` is the number of data points, |
34 | | - :math:`\boldsymbol{\theta}` are the unknown parameters, :math:`\boldsymbol{\theta^*}` |
35 | | - are the estimates of the unknown parameters, :math:`\mathbf{x}` are the decision |
36 | | - variables, and :math:`\mathbf{W}` is a diagonal matrix containing the inverse of the |
37 | | - variance of the measurement error, :math:`\sigma^2`. When the standard |
38 | | - deviation of the measurement error is not supplied by the user, parmest |
39 | | - approximates the variance of the measurement error as |
40 | | - :math:`\sigma^2 = \frac{1}{n-l} \sum e_i^2` where :math:`l` is the number of |
41 | | - fitted parameters, and :math:`e_i` is the residual for experiment :math:`i`. |
| 32 | + \boldsymbol{V}_{\boldsymbol{\theta}} = \left(\frac{\partial^2 \text{WSSE}} |
| 33 | + {\partial \boldsymbol{\theta}^2}\right)^{-1}_{\boldsymbol{\theta} |
| 34 | + = \hat{\boldsymbol{\theta}}} |
| 35 | +
|
| 36 | + Where :math:`\boldsymbol{V}_{\boldsymbol{\theta}}` is the covariance matrix of the estimated
| 37 | + parameters :math:`\hat{\boldsymbol{\theta}} \in \mathbb{R}^p`, :math:`p` is the number of estimated
| 38 | + parameters, :math:`\boldsymbol{y}_{i} \in \mathbb{R}^m` are observations of the measured variables,
| 39 | + :math:`\boldsymbol{f}` is the model function, :math:`\boldsymbol{x}_{i} \in \mathbb{R}^{q}` are the
| 40 | + input variables, :math:`n` is the number of experiments, :math:`\boldsymbol{\Sigma}_{\boldsymbol{y}}`
| 41 | + is the measurement error covariance matrix, and :math:`\sigma^2` is the variance of the measurement
| 42 | + error. When the standard deviation of the measurement error is not supplied by the user, parmest
| 43 | + approximates the variance as :math:`\hat{\sigma}^2 = \frac{1}{n-p} \sum_{i=1}^{n}
| 44 | + \boldsymbol{\varepsilon}_{i}^{\text{T}} \boldsymbol{\varepsilon}_{i}`, where :math:`\boldsymbol{\varepsilon}_{i} =
| 45 | + \boldsymbol{y}_{i} - \boldsymbol{f}(\boldsymbol{x}_{i}; \hat{\boldsymbol{\theta}}) \in \mathbb{R}^m` is the residual vector for experiment :math:`i`.
42 | 46 |
|
43 | 47 | In parmest, this method computes the inverse of the Hessian by scaling the |
44 | | - objective function (SSE or WSSE) with a constant probability factor. |
| 48 | + objective function (SSE or WSSE) with a constant factor of :math:`\frac{1}{n}`.
45 | 49 |
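| | +   As a worked illustration of the covariance formula above (a minimal sketch, not the parmest
| | +   implementation, which obtains the Hessian from the solver), the covariance of a least-squares
| | +   fit can be recovered by numerically differentiating the SSE and inverting the Hessian. The toy
| | +   model and all variable names below are hypothetical:
| | +
| | +   .. code-block:: python
| | +
| | +      import numpy as np
| | +
| | +      # toy model: y = theta_0 * exp(theta_1 * x), with n = 20 single-output experiments
| | +      x = np.linspace(0.0, 1.0, 20)
| | +      rng = np.random.default_rng(0)
| | +      y_obs = 2.0 * np.exp(-1.0 * x) + rng.normal(0.0, 0.05, x.size)
| | +      theta_hat = np.array([2.0, -1.0])   # assume these estimates came from theta_est
| | +
| | +      def sse(theta):
| | +          residuals = y_obs - theta[0] * np.exp(theta[1] * x)
| | +          return residuals @ residuals
| | +
| | +      def hessian(fun, theta, h=1e-5):
| | +          # central finite-difference Hessian of a scalar function
| | +          p = len(theta)
| | +          H = np.zeros((p, p))
| | +          for j in range(p):
| | +              for k in range(p):
| | +                  ej, ek = h * np.eye(p)[j], h * np.eye(p)[k]
| | +                  H[j, k] = (fun(theta + ej + ek) - fun(theta + ej - ek)
| | +                             - fun(theta - ej + ek) + fun(theta - ej - ek)) / (4.0 * h**2)
| | +          return H
| | +
| | +      # sigma^2 estimated from the residuals when not supplied by the user
| | +      n, p = x.size, theta_hat.size
| | +      sigma2 = sse(theta_hat) / (n - p)
| | +
| | +      # V_theta = 2 * sigma^2 * (d^2 SSE / d theta^2)^{-1} at theta_hat;
| | +      # the square roots of its diagonal are the parameter uncertainties
| | +      V_theta = 2.0 * sigma2 * np.linalg.inv(hessian(sse, theta_hat))
| | +      std_errors = np.sqrt(np.diag(V_theta))
| | +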
|
46 | 50 | 2. Finite Difference Method |
47 | 51 |
|
48 | | - In this method, the covariance matrix, :math:`V_{\boldsymbol{\theta}}`, is |
49 | | - calculated by applying the Gauss-Newton approximation to the Hessian, |
50 | | - :math:`\frac{\partial^2 \text{SSE}}{\partial \boldsymbol{\theta} \partial \boldsymbol{\theta}}` |
| 52 | + In this method, the covariance matrix, :math:`\boldsymbol{V}_{\boldsymbol{\theta}}`, is |
| 53 | + computed by applying the Gauss-Newton approximation to the Hessian,
| 54 | + :math:`\frac{\partial^2 \text{SSE}}{\partial \boldsymbol{\theta}^2}` |
51 | 55 | or |
52 | | - :math:`\frac{\partial^2 \text{WSSE}}{\partial \boldsymbol{\theta} \partial \boldsymbol{\theta}}`, |
53 | | - leading to: |
| 56 | + :math:`\frac{\partial^2 \text{WSSE}}{\partial \boldsymbol{\theta}^2}`,
| 57 | + which results in:
| 58 | + |
| 59 | + .. math:: |
| 60 | + \boldsymbol{V}_{\boldsymbol{\theta}} = \left(\sum_{i = 1}^n \boldsymbol{G}_{i}^{\text{T}} |
| 61 | + \boldsymbol{\Sigma}_{\boldsymbol{y}}^{-1} \boldsymbol{G}_{i} \right)^{-1} |
| 62 | +
|
| 63 | + where |
54 | 64 |
|
55 | 65 | .. math:: |
56 | | - V_{\boldsymbol{\theta}} = \left(\sum_{i = 1}^n \mathbf{G}_{i}^{\mathrm{T}} \mathbf{W} |
57 | | - \mathbf{G}_{i} \right)^{-1} |
| 66 | + \boldsymbol{G}_{i} = \frac{\partial \boldsymbol{f}(\boldsymbol{x}_i;\boldsymbol{\theta})} |
| 67 | + {\partial \boldsymbol{\theta}} |
58 | 68 |
|
59 | | - This method uses central finite difference to compute the Jacobian matrix, |
60 | | - :math:`\mathbf{G}_{i}`, for experiment :math:`i`, which is the sensitivity of |
61 | | - the measured variables with respect to the parameters, :math:`\boldsymbol{\theta}`. |
62 | | - :math:`\mathbf{W}` is a diagonal matrix containing the inverse of the variance |
63 | | - of the measurement errors, :math:`\sigma^2`. |
| 69 | + This method uses central finite differences to compute the Jacobian matrix, :math:`\boldsymbol{G}_{i}`,
| 70 | + for experiment :math:`i`, perturbing each parameter about the estimate :math:`\hat{\boldsymbol{\theta}}`:
| 71 | + |
| 72 | + .. math:: |
| 73 | + \boldsymbol{G}_{i}[:,\,k] \approx \frac{\boldsymbol{f}(\boldsymbol{x}_i;\theta_k + \Delta \theta_k) |
| 74 | + \vert_{\hat{\boldsymbol{\theta}}} - \boldsymbol{f}(\boldsymbol{x}_i;\theta_k - \Delta \theta_k) |
| 75 | + \vert_{\hat{\boldsymbol{\theta}}}}{2 \Delta \theta_k}, \quad k = 1, \dots, p
64 | 77 |
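| | +   The two equations above combine into a short numerical sketch (illustrative only; parmest
| | +   performs these steps internally). Here ``model``, ``x_data``, and ``Sigma_y`` are hypothetical
| | +   stand-ins for the user's model function, experiment inputs, and measurement error covariance:
| | +
| | +   .. code-block:: python
| | +
| | +      import numpy as np
| | +
| | +      def gauss_newton_covariance(model, x_data, theta_hat, Sigma_y, h=1e-6):
| | +          """V_theta = ( sum_i G_i^T Sigma_y^{-1} G_i )^{-1}."""
| | +          p, m = theta_hat.size, Sigma_y.shape[0]
| | +          Sigma_inv = np.linalg.inv(Sigma_y)
| | +          info = np.zeros((p, p))              # accumulated information matrix
| | +          for x_i in x_data:                   # one Jacobian per experiment
| | +              G_i = np.zeros((m, p))
| | +              for k in range(p):               # central difference in theta_k
| | +                  step = np.zeros(p)
| | +                  step[k] = h * max(abs(theta_hat[k]), 1.0)
| | +                  G_i[:, k] = (model(x_i, theta_hat + step)
| | +                               - model(x_i, theta_hat - step)) / (2.0 * step[k])
| | +              info += G_i.T @ Sigma_inv @ G_i
| | +          return np.linalg.inv(info)
| | +
| | +      # usage with a toy single-output model y = theta_0 * exp(theta_1 * x)
| | +      model = lambda x_i, th: np.array([th[0] * np.exp(th[1] * x_i)])
| | +      V_theta = gauss_newton_covariance(model, np.linspace(0.0, 1.0, 20),
| | +                                        np.array([2.0, -1.0]), np.array([[0.05**2]]))
| | +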
|
65 | 78 | 3. Automatic Differentiation Method |
66 | 79 |
|
67 | 80 | Similar to the finite difference method, the covariance matrix is calculated as: |
68 | 81 |
|
69 | 82 | .. math:: |
70 | | - V_{\boldsymbol{\theta}} = \left( \sum_{i = 1}^n \mathbf{G}_{\text{kaug},\, i}^{\mathrm{T}} |
71 | | - \mathbf{W} \mathbf{G}_{\text{kaug},\, i} \right)^{-1} |
| 83 | + \boldsymbol{V}_{\boldsymbol{\theta}} = \left( \sum_{i = 1}^n \boldsymbol{G}_{\text{kaug},\, i}^{\text{T}} |
| 84 | + \boldsymbol{\Sigma}_{\boldsymbol{y}}^{-1} \boldsymbol{G}_{\text{kaug},\, i} \right)^{-1} |
| 85 | +
|
| 86 | + However, this method uses implicit differentiation of the model optimality (Karush–Kuhn–Tucker, KKT)
| 87 | + conditions to compute the Jacobian matrix, :math:`\boldsymbol{G}_{\text{kaug},\, i}`, for experiment :math:`i`:
| 88 | + |
| 89 | + .. math:: |
| 90 | + \boldsymbol{G}_{\text{kaug},\,i} = \frac{\partial \boldsymbol{f}(\boldsymbol{x}_i;\boldsymbol{\theta})}
| 91 | + {\partial \boldsymbol{\theta}} + \frac{\partial \boldsymbol{f}(\boldsymbol{x}_i;\boldsymbol{\theta})}
| 92 | + {\partial \boldsymbol{x}_i}\frac{\partial \boldsymbol{x}_i}{\partial \boldsymbol{\theta}}
72 | 93 |
|
73 | | - However, this method uses the model optimality (KKT) condition to compute the |
74 | | - Jacobian matrix, :math:`\mathbf{G}_{\text{kaug},\, i}`, for experiment :math:`i`. |
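| | +   The :math:`\frac{\partial \boldsymbol{x}_i}{\partial \boldsymbol{\theta}}` term captures the
| | +   implicit dependence of the model variables on the parameters and follows from the implicit
| | +   function theorem applied to the model equations. Below is a minimal hand-worked sketch of this
| | +   idea (parmest delegates the actual computation to a sensitivity tool, ``k_aug``, indicated by
| | +   the subscript; the scalar model here is hypothetical):
| | +
| | +   .. code-block:: python
| | +
| | +      import numpy as np
| | +
| | +      # hypothetical scalar model: x solves g(x, theta) = x - theta * exp(-x) = 0,
| | +      # and the measured variable is f(x) = x**2 (no explicit theta dependence)
| | +      theta = 0.5
| | +
| | +      x = 0.0
| | +      for _ in range(100):                  # fixed-point solve of g(x, theta) = 0
| | +          x = theta * np.exp(-x)
| | +
| | +      dg_dx = 1.0 + theta * np.exp(-x)      # partial g / partial x
| | +      dg_dtheta = -np.exp(-x)               # partial g / partial theta
| | +      # implicit function theorem: dx/dtheta = -(dg/dx)^(-1) * (dg/dtheta)
| | +      dx_dtheta = -dg_dtheta / dg_dx
| | +
| | +      df_dx = 2.0 * x                       # partial f / partial x
| | +      df_dtheta = 0.0                       # no direct theta term in f
| | +      # total sensitivity, matching G_kaug = df/dtheta + (df/dx) (dx/dtheta)
| | +      G = df_dtheta + df_dx * dx_dtheta
| | +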
| 94 | +The covariance matrix calculation is only supported with the built-in objective functions "SSE" or "SSE_weighted". |
75 | 95 |
|
76 | | -The covariance matrix calculation is only supported with the built-in objective |
77 | | -functions "SSE" or "SSE_weighted". |
| 96 | +In parmest, the covariance matrix can be computed after creating the
| 97 | +:class:`~pyomo.contrib.parmest.experiment.Experiment` objects,
| 98 | +defining the :class:`~pyomo.contrib.parmest.parmest.Estimator` object,
| 99 | +and estimating the model parameters using :meth:`~pyomo.contrib.parmest.parmest.Estimator.theta_est`
| 100 | +(these steps are described in the :ref:`driversection` section).
78 | 101 |
|
79 | | -In parmest, the covariance matrix can be calculated after defining the |
80 | | -:class:`~pyomo.contrib.parmest.parmest.Estimator` object and estimating the unknown |
81 | | -parameters using :class:`~pyomo.contrib.parmest.parmest.Estimator.theta_est`. To |
82 | | -estimate the covariance matrix, with the default method being "finite_difference", call |
83 | | -the :class:`~pyomo.contrib.parmest.parmest.Estimator.cov_est` function, e.g., |
| 102 | +To estimate the covariance matrix, call the :meth:`~pyomo.contrib.parmest.parmest.Estimator.cov_est`
| 103 | +method, which uses "finite_difference" as its default method, as follows:
84 | 104 |
|
85 | 105 | .. testsetup:: * |
86 | 106 | :skipif: not __import__('pyomo.contrib.parmest.parmest').contrib.parmest.parmest.parmest_available |
|