# Introduction to turbulence/Statistical analysis/Ensemble average


## Latest revision as of 12:29, 2 July 2011


## The mean or ensemble average

The concept of an ensemble average is based upon the existence of independent statistical events. For example, consider a number of individuals who are simultaneously flipping unbiased coins. If a value of one is assigned to a head and a value of zero to a tail, then the arithmetic average of the numbers generated is defined as

$X_{N}=\frac{1}{N} \sum^{N}_{n=1} x_{n}$

where our $n$th flip is denoted as $x_{n}$ and $N$ is the total number of flips.

Now if all the coins are the same, it doesn't really matter whether we flip one coin $N$ times, or $N$ coins a single time. The key is that they must all be independent events - meaning the probability of achieving a head or tail in a given flip must be completely independent of what happens in all the other flips. Obviously we can't just flip one coin and count it $N$ times; these clearly would not be independent events.

**Exercise:** Carry out an experiment where you flip a coin 100 times in groups of 10 flips each. Compare the values you get for $X_{10}$ for each of the 10 groups, and note how they differ from the value of $X_{100}$.

Unless you had a very unusual experimental result, you probably noticed that the value of the $X_{10}$'s was also a random variable and differed from ensemble to ensemble. Also, the greater the number of flips in the ensemble, the closer you got to $X_{N}=1/2$. Obviously, the bigger $N$, the less fluctuation there is in $X_{N}$.
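The exercise above is easy to simulate. The sketch below is illustrative only (the random seed and the large-ensemble size are arbitrary choices): it flips a fair coin in ten groups of 10 and then in one large ensemble, showing that the group averages $X_{10}$ scatter about $1/2$ while $X_{N}$ for large $N$ settles close to it.

```python
import random

random.seed(0)  # arbitrary seed, chosen only for reproducibility

def flip_average(n):
    """Arithmetic average X_N of n independent fair-coin flips (head=1, tail=0)."""
    return sum(random.randint(0, 1) for _ in range(n)) / n

# Ten groups of 10 flips: the X_10 values scatter noticeably about 1/2...
group_means = [flip_average(10) for _ in range(10)]

# ...while one large ensemble settles much closer to 1/2.
big_mean = flip_average(100_000)

print(group_means)
print(big_mean)
```

Note that each $X_{10}$ can only take the values $0, 0.1, \ldots, 1$, so individual groups routinely miss $1/2$, while the fluctuation of $X_{N}$ shrinks as $N$ grows.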

Now imagine that we are trying to establish the nature of a random variable $x$. The $n$th realization of $x$ is denoted as $x_{n}$. The ensemble average of $x$ is denoted as $X$ (or $\left\langle x \right\rangle$ ), and is defined as

$X = \left\langle x \right\rangle \equiv \lim_{N \rightarrow \infty} \frac{1}{N} \sum^{N}_{n=1} x_{n}$

Obviously it is impossible to obtain the ensemble average experimentally, since we can never achieve an infinite number of independent realizations. The most we can ever obtain is the arithmetic mean for the number of realizations we have. For this reason the arithmetic mean can also be referred to as the estimator for the true mean, the ensemble average.

Even though the true mean (or ensemble average) is unobtainable, nonetheless, the idea is still very useful. Most importantly, we can almost always be sure the ensemble average exists, even if we can only estimate what it really is. The fact of its existence, however, does not always mean that it is easy to obtain in practice. All the theoretical deductions in this course will use this ensemble average. Obviously this will mean we have to account for these "statistical differences" between true means and estimates when comparing our theoretical results to actual measurements or computations.

In general, the $x_{n}$ could be realizations of any random variable, and the $X$ defined by the ensemble average above is its ensemble average. The quantity $X$ is sometimes referred to as the expected value of the random variable $x$, or even simply its mean.

For example, the velocity vector at a given point in space and time, $\left( \vec{x},t \right)$, in a given turbulent flow can be considered to be a random variable, say $u_{i} \left( \vec{x},t \right)$. If there were a large number of identical experiments so that the $u^{\left( n \right)}_{i} \left( \vec{x},t \right)$ in each of them were identically distributed, then the ensemble average of $u^{\left( n \right)}_{i} \left( \vec{x},t \right)$ would be given by

$\left\langle u_{i} \left( \vec{x} , t \right) \right\rangle = U_{i} \left( \vec{x} , t \right) \equiv \lim_{N \rightarrow \infty} \frac{1}{N} \sum^{N}_{n=1} u^{ \left( n \right) }_{i} \left( \vec{x} , t \right)$

Note that this ensemble average, $U_{i} \left( \vec{x},t \right)$, will, in general, vary with the independent variables $\vec{x}$ and $t$. It will be seen later that, under certain conditions, the ensemble average is the same as the average which would be generated by averaging in time. Even when a time average is not meaningful, however, the ensemble average can still be defined; e.g., as in non-stationary or periodic flow. Only ensemble averages will be used in the development of the turbulence equations here unless otherwise stated.
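As a concrete, entirely synthetic illustration of averaging across realizations at fixed $t$, the sketch below treats each "experiment" as a known mean signal plus random fluctuations; the signal, the noise amplitude, and the ensemble size are all hypothetical choices, not anything from a real flow.

```python
import math
import random

random.seed(1)  # arbitrary seed, for reproducibility only

def realization(times):
    """One synthetic 'experiment': a deterministic mean signal sin(t)
    plus independent Gaussian fluctuations (both choices are illustrative)."""
    return [math.sin(t) + random.gauss(0.0, 0.5) for t in times]

times = [0.1 * k for k in range(50)]
N = 5000  # number of independent realizations in the ensemble

# Average across realizations at each fixed instant t: an ensemble
# average, well defined even though the signal is non-stationary.
ensemble = [realization(times) for _ in range(N)]
U = [sum(r[k] for r in ensemble) / N for k in range(len(times))]

# The estimator U(t) should track the true mean sin(t) at every instant.
max_err = max(abs(U[k] - math.sin(times[k])) for k in range(len(times)))
print(max_err)
```

A time average over any single realization would mix the time-varying mean signal into the result; averaging across realizations at each fixed $t$ does not, which is why the ensemble average remains meaningful for non-stationary flows.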

## Fluctuations about the mean

It is often important to know how a random variable is distributed about the mean. For example, figure 2.1 illustrates portions of two random functions of time which have identical means, but are obviously members of different ensembles since the amplitudes of their fluctuations are not distributed the same. It is possible to distinguish between them by examining the statistical properties of the fluctuations about the mean (or simply the fluctuations) defined by:

$x' = x - X$

It is easy to see that the average of the fluctuation is zero, i.e.,

$\left\langle x' \right\rangle = 0$

On the other hand, the ensemble average of the square of the fluctuation is not zero. In fact, it is such an important statistical measure that we give it a special name, the variance, and represent it symbolically by either $var \left[ x \right]$ or $\left\langle \left( x' \right) ^{2} \right\rangle$. The variance is defined as

$var \left[ x \right] \equiv \left\langle \left( x' \right) ^{2} \right\rangle = \left\langle \left[ x - X \right]^{2} \right\rangle$
$= \lim_{N\rightarrow \infty} \frac{1}{N} \sum^{N}_{n=1} \left[ x_{n} - X \right]^{2}$

Note that the variance, like the ensemble average itself, can never really be measured, since it would require an infinite number of members of the ensemble.

It is straightforward to show from the definition of the ensemble average that the variance can be written as

$var \left[ x \right] = \left\langle x^{2} \right\rangle - X^{2}$

Thus the variance is the second moment minus the square of the first moment (or mean). In this naming convention, the ensemble mean is the first moment.

The variance can also be referred to as the second central moment of $x$. The word "central" implies that the mean has been subtracted off before squaring and averaging. The reasons for this will become clear below. If two random variables are identically distributed, then they must have the same mean and variance.

The variance is closely related to another statistical quantity called the standard deviation or root mean square (rms) value of the random variable $x$, which is denoted by the symbol $\sigma_{x}$. Thus,

$\sigma_{x} \equiv \left( var \left[ x \right] \right)^{1/2}$

or

$\sigma^{2}_{x} = var \left[ x \right]$
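Both the identity $var \left[ x \right] = \left\langle x^{2} \right\rangle - X^{2}$ and the relation between $\sigma_{x}$ and the variance can be checked numerically with the arithmetic estimators. A minimal sketch, using an arbitrary synthetic sample (uniform on $[0,1]$, for which the true mean is $1/2$ and the true variance $1/12$):

```python
import random

random.seed(2)  # arbitrary seed, for reproducibility only

# A synthetic sample, large enough that the arithmetic estimators sit
# close to the true ensemble values (uniform on [0, 1]: mean 1/2, variance 1/12).
xs = [random.random() for _ in range(200_000)]
N = len(xs)

X = sum(xs) / N                            # estimator of the mean (first moment)
second_moment = sum(x * x for x in xs) / N

# Two routes to the variance: directly from the fluctuations x' = x - X,
# and via the identity var[x] = <x^2> - X^2.
var_direct = sum((x - X) ** 2 for x in xs) / N
var_identity = second_moment - X ** 2

sigma_x = var_direct ** 0.5                # standard deviation (rms of x')

print(X, var_direct, var_identity, sigma_x)
```

The two routes to the variance agree to rounding error because the identity is algebraic; only the agreement with the true values $1/2$ and $1/12$ depends on the sample size.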

## Higher moments

Figure 2.2 illustrates two random functions of time which have the same mean and also the same variance, but clearly they are still quite different. It is useful, therefore, to define higher moments of the distribution to assist in distinguishing these differences.

The $m$-th moment of the random variable is defined as

$\left\langle x^{m} \right\rangle = \lim_{N \rightarrow \infty} \frac{1}{N} \sum^{N}_{n=1} x^{m}_{n}$

It is usually more convenient to work with the central moments defined by:

$\left\langle \left( x' \right)^{m} \right\rangle = \left\langle \left( x-X \right)^{m} \right\rangle = \lim_{N \rightarrow \infty} \frac{1}{N} \sum^{N}_{n=1} \left[x_{n} - X \right]^{m}$

The central moments give direct information on the distribution of the values of the random variable about the mean. It is easy to see that the variance is the second central moment (i.e., $m=2$ ).
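The point of figure 2.2 can be made concrete with two samples engineered (hypothetically) to share the same mean and variance while differing in a higher central moment; the symmetric sample has a vanishing third central moment, the skewed one does not.

```python
def central_moment(xs, m):
    """m-th central moment <(x - X)^m>, using the arithmetic mean
    of the sample as the estimator of the ensemble average X."""
    n = len(xs)
    X = sum(xs) / n
    return sum((x - X) ** m for x in xs) / n

# Two samples constructed to have the same mean (0) and variance (1):
# one symmetric about its mean, one skewed toward large positive values.
symmetric = [-1.0] * 5000 + [1.0] * 5000
skewed = [-0.5] * 8000 + [2.0] * 2000

# First and second central moments agree; the third does not.
for xs in (symmetric, skewed):
    print([central_moment(xs, m) for m in (1, 2, 3)])
```

Here the first central moment is zero for both (as it must be), the second central moment (the variance) is 1 for both, and only the third central moment, 0 versus 1.5, tells the two samples apart.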

## Credits

This text was based on "Lectures in Turbulence for the 21st Century" by Professor William K. George, Professor of Turbulence, Chalmers University of Technology, Gothenburg, Sweden.