Minor corrections made while editing the slides

2022-03-06 22:54:12 +01:00 · 2022-03-06 22:54:12 +01:00 · 9517fb8458
commit 9517fb8458
parent 52f4ea14a2
1 changed file with 8 additions and 8 deletions
--- a/03_tikhonov.Rmd
+++ b/03_tikhonov.Rmd
@@ -25,17 +25,17 @@ With fewer observations than basis functions ($n<p$), the system $\mathbf
 \end{array} \right)
 =
 \left( \begin{array}{c}
-1 \\ 0.99 
+1 \\ 0.99
 \end{array} \right)
 \]
-Its solution is $\boldsymbol\beta^T = (1001,-1000)$. However, the approximate solution $\boldsymbol\beta^T = (0.5,0.5)$ seems preferable. Indeed, the optimal solution is unlikely to fit new observations well (for example, the observation $(1,2)$ would be mapped to the label -999).
+Its solution is $\boldsymbol\beta^T = (1001,-1000)$. However, the approximate solution $\boldsymbol\beta^T = (0.5,0.5)$ seems preferable. Indeed, the optimal solution is unlikely to fit new observations well (for example, the observation $(1,2)$ would be mapped to the label $-999$).

 # Adding regularity constraints

 Thus, when we must choose among several solutions, it can be effective to express a preference for those whose coefficients (or parameters) have small values. This amounts, for example, to minimizing $|\beta_1|+|\beta_2|+\dots$ (also written $\|\boldsymbol\beta\|_1$, the "1-norm") or $\beta_1^2+\beta_2^2+\dots$ (also written $\|\boldsymbol\beta\|_2^2$, the squared "2-norm"). In the latter case, we solve a new minimization problem:

 \begin{align*}
-\min_{\boldsymbol\beta} \|\mathbf{X}\boldsymbol\beta-\mathbf{y}\|^2_2 + \lambda \|\boldsymbol\beta|^2_2 \\
+\min_{\boldsymbol\beta} \|\mathbf{X}\boldsymbol\beta-\mathbf{y}\|^2_2 + \lambda \|\boldsymbol\beta\|^2_2 \\
 \text{with} \quad 0 \leq \lambda
 \end{align*}
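
As a minimal sketch of the penalized problem above, the following R snippet solves the ridge normal equations $(\mathbf{X}^T\mathbf{X} + \lambda \mathbf{I})\boldsymbol\beta = \mathbf{X}^T\mathbf{y}$ on a small nearly singular system. The matrix `X` is an assumption: this hunk does not show it, so a hypothetical matrix was chosen whose unpenalized solution is $(1001,-1000)$, as stated in the text. The helper name `ridge_solve` and the value of `lambda` are also hypothetical.

```r
# Hypothetical nearly singular system (assumption: the original matrix is
# not visible in this hunk). Its exact solution is (1001, -1000).
X <- matrix(c(1, 1,
              1, 1.00001), nrow = 2, byrow = TRUE)
y <- c(1, 0.99)

# Ridge estimate: solve (X'X + lambda * I) beta = X'y.
ridge_solve <- function(X, y, lambda) {
  p <- ncol(X)
  drop(solve(crossprod(X) + lambda * diag(p), crossprod(X, y)))
}

beta_exact <- solve(X, y)              # unpenalized solution, huge coefficients
beta_ridge <- ridge_solve(X, y, 1e-3)  # small penalty, small coefficients

print(beta_exact)
print(beta_ridge)

# Predicted labels for the new observation (1, 2) under each solution.
print(sum(c(1, 2) * beta_exact))
print(sum(c(1, 2) * beta_ridge))
```

Under these assumptions, even a small penalty should pull the coefficients close to $(0.5,0.5)$, so the prediction for $(1,2)$ stays on the same scale as the observed labels.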

@@ -62,7 +62,7 @@ We can also center or standardize the target. Suppose we cho
 \begin{align*}
 & \mathbf{\hat{y_i}} - \mathbf{\bar{y}} = \sum_{j=1}^{p} \hat{\beta_j} \left( \frac{x_{ij}-\bar{x_j}}{\sigma_j} \right) \\
 = \{& \text{arithmetic} \} \\
- & \mathbf{\hat{y_i}} = \left( \mathbf{\bar{y}} - \sum_{j=1}^{p} \hat{\beta_j} \frac{\bar{x_j}}{\sigma_j} \right) + 
+ & \mathbf{\hat{y_i}} = \left( \mathbf{\bar{y}} - \sum_{j=1}^{p} \hat{\beta_j} \frac{\bar{x_j}}{\sigma_j} \right) +
   \sum_{j=1}^{p} \frac{\hat{\beta_j}}{\sigma_j} x_{ij} \\
 \end{align*}
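
The identity above can be checked numerically. The R sketch below fits a ridge model on standardized predictors and a centered target, then maps the coefficients back to the original scale: slopes $\hat{\beta_j}/\sigma_j$ and intercept $\bar{y} - \sum_j \hat{\beta_j}\bar{x_j}/\sigma_j$. The simulated data, the value of `lambda`, and all variable names are assumptions made for illustration, and $\sigma_j$ is taken here to be the sample standard deviation returned by `sd`.

```r
set.seed(1)

# Hypothetical data: two predictors on very different scales.
n <- 20
X <- cbind(x1 = rnorm(n, 5, 2), x2 = rnorm(n, -3, 0.5))
y <- 1 + 2 * X[, 1] - X[, 2] + rnorm(n, sd = 0.1)

# Standardize the predictors and center the target.
x_bar <- colMeans(X)
sigma <- apply(X, 2, sd)
Z <- scale(X, center = x_bar, scale = sigma)
y_bar <- mean(y)

# Ridge fit on the standardized scale (lambda is an arbitrary choice).
lambda <- 0.1
beta_hat <- drop(solve(crossprod(Z) + lambda * diag(ncol(Z)),
                       crossprod(Z, y - y_bar)))

# Coefficients expressed on the original scale.
slopes <- beta_hat / sigma
intercept <- y_bar - sum(beta_hat * x_bar / sigma)

# Both parameterizations give the same predictions.
pred_std <- y_bar + drop(Z %*% beta_hat)
pred_raw <- intercept + drop(X %*% slopes)
print(max(abs(pred_std - pred_raw)))
```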

@@ -80,15 +80,15 @@ data = gendat(10,0.2)

 par(mfrow=c(1,3))
 coef <- ridge(0, data, 7)
-plt(data,f,main=expression(paste(plain("Degré = "), 7, plain(", "), lambda, 
+plt(data,f,main=expression(paste(plain("Degré = "), 7, plain(", "), lambda,
                                 plain(" = 0"))))
 pltpoly(coef)
 coef <- ridge(1E-4, data, 7)
-plt(data,f,main=expression(paste(plain("Degré = "), 7, plain(", "), lambda, 
+plt(data,f,main=expression(paste(plain("Degré = "), 7, plain(", "), lambda,
                                 plain(" = 1E-4"))))
 pltpoly(coef)
 coef <- ridge(1, data, 7)
-plt(data,f,main=expression(paste(plain("Degré = "), 7, plain(", "), lambda, 
+plt(data,f,main=expression(paste(plain("Degré = "), 7, plain(", "), lambda,
                                 plain(" = 1"))))
 pltpoly(coef)
 ```
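
The definitions of `gendat`, `ridge`, `plt` and `pltpoly` are not part of this diff. As a rough sketch of what a polynomial ridge fit of this kind might look like, the R code below uses hypothetical stand-ins (`gendat_sketch`, `poly_design`, `ridge_sketch`) and a sine target that is only an assumption. It also assumes that every coefficient, intercept included, is penalized, which may differ from the file's actual implementation.

```r
set.seed(42)

# Hypothetical data generator: n noisy samples of a smooth target on [0, 1].
gendat_sketch <- function(n, noise_sd) {
  x <- sort(runif(n))
  list(x = x, y = sin(2 * pi * x) + rnorm(n, sd = noise_sd))
}

# Polynomial design matrix with columns x^0, x^1, ..., x^degree.
poly_design <- function(x, degree) outer(x, 0:degree, "^")

# Hypothetical ridge fit: solve (X'X + lambda * I) beta = X'y.
# Assumption: all coefficients, intercept included, are penalized.
ridge_sketch <- function(lambda, data, degree) {
  X <- poly_design(data$x, degree)
  drop(solve(crossprod(X) + lambda * diag(degree + 1),
             crossprod(X, data$y)))
}

data_sketch <- gendat_sketch(10, 0.2)
coef_small <- ridge_sketch(1e-4, data_sketch, 7)
coef_large <- ridge_sketch(1, data_sketch, 7)

# Euclidean norm of the coefficients for each value of lambda.
print(sqrt(sum(coef_small^2)))
print(sqrt(sum(coef_large^2)))
```

The case $\lambda = 0$ is left out of the sketch on purpose: without a penalty, the normal equations of a degree 7 polynomial on 10 points can be too ill-conditioned for a direct `solve`.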
@@ -114,7 +114,7 @@ Take the example of a linear model: $F(\mathbf{X}) = \mathbf{W}^T\mathbf{
 & |F(\mathbf{X}) - F(\mathbf{X^*})| \\
 = \{& F(\mathbf{X}) = \mathbf{W}^T\mathbf{X} + b \} \\
 & |\mathbf{W}^T\mathbf{X} - \mathbf{W}^T\mathbf{X^*}| \\
-= \phantom{\{}& \\
+= \{& \text{Linear algebra} \} \\
 & |\mathbf{W}^T(\mathbf{X}-\mathbf{X^*})| \\
 = \{& \mathbf{X^*} = \mathbf{X} + \mathbf{\epsilon} \} \\
 & |\mathbf{W}^T \mathbf{\epsilon}| \\