
Introduction to turbulence/Statistical analysis/Estimation from a finite number of realizations


Estimators for averaged quantities

Since there can never be an infinite number of realizations from which ensemble averages (and probability densities) can be computed, it is essential to ask: How many realizations are enough? The answer to this question must be sought by looking at the statistical properties of estimators based on a finite number of realizations. There are two questions which must be answered. The first one is:

  • Is the expected value (or mean value) of the estimator equal to the true ensemble mean? Or in other words, is the estimator unbiased?

The second question is:

  • Does the difference between the value of the estimator and that of the true mean decrease as the number of realizations increases? Or in other words, does the estimator converge in a statistical sense (or converge in probability)? Figure 2.9 illustrates the problems which can arise.

Bias and convergence of estimators

A procedure for answering these questions will be illustrated by considering a simple estimator for the mean, the arithmetic mean considered above, X_{N}. For N independent realizations x_{n}, n=1,2,...,N, where N is finite, X_{N} is given by:

X_{N}=\frac{1}{N}\sum^{N}_{n=1} x_{n}
(2.45)
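
As a concrete illustration, here is a minimal numerical sketch (hypothetical; it assumes NumPy and a Gaussian random variable with true mean X = 10 and standard deviation sigma_x = 2, values chosen arbitrarily for the example) that computes X_{N} from a finite sample:

import numpy as np

rng = np.random.default_rng(seed=1)

# assumed example: a Gaussian random variable with true ensemble
# mean X = 10 and standard deviation sigma_x = 2 (arbitrary choices)
X_true, sigma_x = 10.0, 2.0

N = 100                                  # finite number of realizations
x = rng.normal(X_true, sigma_x, size=N)  # realizations x_n, n = 1, ..., N

X_N = x.sum() / N                        # arithmetic-mean estimator, equation 2.45
print(f"X_N = {X_N:.4f}  (true mean X = {X_true})")

Each run gives a different X_N, since the estimator is itself a random variable; this is exactly the behavior analyzed below.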

Now, as we observed in our simple coin-flipping experiment, since the x_{n} are random, so must be the value of the estimator X_{N}. For the estimator to be unbiased, the mean value of X_{N} must be the true ensemble mean, X, i.e.:

\left\langle X_{N} \right\rangle = X
(2.46)

It is easy to see that since the operations of averaging and adding commute,

\begin{matrix}
\left\langle X_{N} \right\rangle & = & \left\langle \frac{1}{N} \sum^{N}_{n=1} x_{n} \right\rangle \\
& = & \frac{1}{N} \sum^{N}_{n=1} \left\langle x_{n} \right\rangle \\
& = & \frac{1}{N} NX = X \\
\end{matrix}
(2.47)

(Note that the expected value of each x_{n} is just X, since the x_{n} are assumed identically distributed.) Thus X_{N} is, in fact, an unbiased estimator for the mean.
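
This unbiasedness can be checked numerically. The sketch below (hypothetical, continuing the assumed Gaussian example above) averages X_{N} over many independent repetitions of the N-sample experiment; the average recovers X even for very small N:

import numpy as np

rng = np.random.default_rng(seed=2)
X_true, sigma_x, N = 10.0, 2.0, 5   # deliberately tiny N

# 100000 independent repetitions of the N-sample experiment;
# each row is one experiment, each row mean is one value of X_N
X_N = rng.normal(X_true, sigma_x, size=(100_000, N)).mean(axis=1)

# the average of X_N over repetitions is close to X even for tiny N: no bias
print(f"<X_N> = {X_N.mean():.4f}  (true mean X = {X_true})")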

The question of convergence of the estimator can be addressed by defining the square of the variability of the estimator, say \epsilon^{2}_{X_{N}}, to be:

\epsilon^{2}_{X_{N}}\equiv \frac{var \left\{ X_{N} \right\} }{X^{2}} = \frac{\left\langle \left( X_{N}- X \right)^{2} \right\rangle }{X^{2}}
(2.48)

Now we want to examine what happens to \epsilon_{X_{N}} as the number of realizations increases. For the estimator to converge it is clear that \epsilon_{X_{N}} should decrease as the number of samples increases. Obviously, we need to examine the variance of X_{N} first. It is given by:

\begin{matrix}
var \left\{ X_{N} \right\} & = & \left\langle \left( X_{N} - X \right)^{2} \right\rangle \\
& = & \left\langle \left[ \frac{1}{N} \sum^{N}_{n=1} \left( x_{n} - X \right) \right]^{2} \right\rangle \\
\end{matrix}
(2.49)

since \left\langle X_{N} \right\rangle = X from equation 2.46. Using the fact that the operations of averaging and summation commute, the squared summation can be expanded as follows:

\begin{matrix}
\left\langle \left[ \frac{1}{N} \sum^{N}_{n=1} \left( x_{n} - X \right) \right]^{2} \right\rangle & = & \frac{1}{N^{2}} \sum^{N}_{n=1} \sum^{N}_{m=1} \left\langle \left( x_{n} - X \right) \left( x_{m} - X \right) \right\rangle \\
& = & \frac{1}{N^{2}} \sum^{N}_{n=1} \left\langle \left( x_{n} - X \right)^{2} \right\rangle \\
& = & \frac{1}{N} var \left\{ x \right\} \\
\end{matrix}
(2.50)

where the next to last step follows from the fact that the x_{n} are assumed to be statistically independent samples (and hence uncorrelated), and the last step from the definition of the variance. It follows immediately by substitution into equation 2.49 that the square of the variability of the estimator, X_{N}, is given by:

\begin{matrix}
\epsilon^{2}_{X_{N}} & = & \frac{1}{N}\frac{var\left\{x\right\}}{X^{2}} \\
& = & \frac{1}{N} \left[ \frac{\sigma_{x}}{X} \right]^{2} \\
\end{matrix}
(2.51)

Thus the variability of the estimator depends inversely on the square root of the number of independent realizations, N, and linearly on the relative fluctuation level of the random variable itself, \sigma_{x}/X.
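
As a final check, the sketch below (hypothetical, same assumed Gaussian example as above) estimates \epsilon_{X_{N}} directly from repeated experiments and compares it with the prediction N^{-1/2}(\sigma_{x}/X) of equation 2.51:

import numpy as np

rng = np.random.default_rng(seed=3)
X_true, sigma_x, reps = 10.0, 2.0, 10_000

for N in (10, 100, 1000):
    # reps independent experiments, each yielding one value of X_N
    X_N = rng.normal(X_true, sigma_x, size=(reps, N)).mean(axis=1)
    eps = np.sqrt(np.mean((X_N - X_true) ** 2)) / X_true  # measured variability
    pred = (sigma_x / X_true) / np.sqrt(N)                # prediction from equation 2.51
    print(f"N = {N:5d}: eps = {eps:.4f}, predicted = {pred:.4f}")

The measured variability falls by a factor of about sqrt(10) for each tenfold increase in N, which is the 1/N dependence of \epsilon^{2}_{X_{N}} derived above; it relies on the cross terms in equation 2.50 vanishing for independent samples.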
