OpenFOAM benchmarks on various hardware

Old   March 28, 2024, 22:10
Default
  #761
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 316
Rep Power: 12
wkernkamp is on a distinguished road
Interesting that the multiples of 24 show up in the results.

Old   March 29, 2024, 06:39
Default
  #762
Member
 
Philipp Wiedemer
Join Date: Dec 2016
Location: Munich, Germany
Posts: 42
Rep Power: 9
MangoNrFive is on a distinguished road
My hypothesis is that if we benchmarked every core count, we would see a saw-tooth pattern in the speedup.

My explanation: at 168 cores, all CCDs are nicely balanced with 7 cores each. If we add just one more core, the workload per core drops only slightly (to 168/169 of what it was), but one CCD now runs 8 cores instead of 7, so the load on that CCD increases by a factor of (8/7 * 168/169). In the simulation, that CCD becomes the weakest link. If we add yet another core, we get another tiny per-core speedup (per-core work is now 168/170 of the 168-core value) and we now have two "bad" CCDs, but the second "bad" CCD does not make things worse, because only the weakest link counts. So 168 cores is best, 169 is a lot worse, 170 a little better than 169 but still much worse than 168, and so on until 192, where the CCDs are balanced again and we get the best performance.
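
To make the arithmetic concrete, here is a minimal sketch of that "weakest CCD" model. It assumes 24 CCDs in total and cores spread as evenly as possible across them; the numbers are purely illustrative, not measurements.

Code:
import math

N_CCDS = 24  # assumption: total number of CCDs in the machine

def relative_time(n_cores, n_ccds=N_CCDS):
    """Runtime proxy: work share of the most-loaded CCD.

    Cores are assumed to be spread as evenly as possible, so the busiest
    CCD holds ceil(n_cores / n_ccds) cores, each doing 1/n_cores of the work.
    """
    return math.ceil(n_cores / n_ccds) / n_cores

for n in range(168, 193):
    t = relative_time(n)
    print(f"{n:3d} cores: busiest-CCD share {t:.5f}, "
          f"throughput vs 168 cores {relative_time(168) / t:.3f}")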

Old   March 29, 2024, 07:35
Default
  #763
Super Moderator
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,399
Rep Power: 46
flotus1 has a spectacular aura about
True in theory, when scattering threads across CCDs. To actually see it, though, quite a bit of effort would be needed to reduce run-to-run variance.
Additionally, at these core counts, the quality of the domain decomposition can have a larger impact than adding or removing a few cores.

Old   March 30, 2024, 02:50
Default
  #764
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 316
Rep Power: 12
wkernkamp is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
True in theory, when scattering threads across CCDs. To actually see it, though, quite a bit of effort would be needed to reduce run-to-run variance.
Additionally, at these core counts, the quality of the domain decomposition can have a larger impact than adding or removing a few cores.
I am not sure that is true. His first run already showed a recognizable pattern at every multiple of 24 cores.

Old   April 3, 2024, 05:12
Default
  #765
New Member
 
Alexander Kazantcev
Join Date: Sep 2019
Posts: 23
Rep Power: 6
AlexKaz is on a distinguished road
I think the main difference in the new Epyc results comes down to the huge L3 cache alone. Looking at /proc/<pid>/status, I saw that the snappyHexMesh and simpleFoam binaries map about 400 MB of code including libraries. It is not surprising that we sometimes see a big speedup when all of that code fits in cache. Also, as far as I can see, the core solver code itself is quite small, only a few MB.
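
For anyone who wants to repeat that check, here is a small sketch that pulls the relevant Vm* fields out of /proc/<pid>/status; the PID is whatever your running solver reports (e.g. from pidof simpleFoam), and VmExe/VmLib are the sizes of the executable and the mapped library code in kB.

Code:
import sys

def vm_fields(pid):
    """Return the VmExe/VmLib/VmRSS/VmSize entries (values in kB) for a PID."""
    wanted = {"VmExe", "VmLib", "VmRSS", "VmSize"}
    fields = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            key, _, value = line.partition(":")
            if key in wanted:
                fields[key] = value.strip()
    return fields

if __name__ == "__main__":
    pid = int(sys.argv[1])  # e.g. the PID printed by `pidof simpleFoam`
    for key, value in sorted(vm_fields(pid).items()):
        print(f"{key:7s} {value}")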

Old   April 3, 2024, 07:31
Default
  #766
New Member
 
Daniel
Join Date: Jun 2010
Posts: 12
Rep Power: 15
DVSoares is on a distinguished road
Quote:
Originally Posted by AlexKaz View Post
I think the main difference in the new Epyc results comes down to the huge L3 cache alone. Looking at /proc/<pid>/status, I saw that the snappyHexMesh and simpleFoam binaries map about 400 MB of code including libraries. It is not surprising that we sometimes see a big speedup when all of that code fits in cache. Also, as far as I can see, the core solver code itself is quite small, only a few MB.
Fully agree, especially considering how much faster cache memory is than system RAM. Intel has a not-too-technical article on that: https://www.intel.com/content/www/us...-nutshell.html
I'm confident that anyone could find more rigorous papers in computer science journals covering this topic, if needed.

Old   April 16, 2024, 16:26
Default
  #767
New Member
 
Marius
Join Date: Sep 2022
Posts: 19
Rep Power: 3
Counterdoc is on a distinguished road
Apple MacBook Pro with M1 Max and 32 GB RAM, running the natively compiled OpenFOAM v2312.



# cores | Wall time (s)
-----------------------
      8 |  85.57
      6 | 102.25
      4 | 135.12
      2 | 240.02
      1 | 433.18
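
For the scaling picture, a quick sketch that turns the wall times above into speedup and parallel efficiency (the times are copied from the table):

Code:
# cores -> solver wall time in seconds, from the table above
times = {1: 433.18, 2: 240.02, 4: 135.12, 6: 102.25, 8: 85.57}

t1 = times[1]
for n in sorted(times):
    speedup = t1 / times[n]
    efficiency = speedup / n
    print(f"{n} cores: speedup {speedup:.2f}, parallel efficiency {efficiency:.0%}")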

Old   April 17, 2024, 14:30
Default
  #768
New Member
 
DS
Join Date: Jan 2022
Posts: 13
Rep Power: 4
Crowdion is on a distinguished road
Lenovo ThinkStation P520c, Xeon W-2275 (HT off), 4 x 32GB DDR4-2666
OpenFOAM v2312 (precompiled), Ubuntu 23.10.1, Motorbike_bench_template.tar.gz (default settings)

# cores | Meshing wall time (real) | Solver wall time (s)
---------------------------------------------------------
      1 | 10m5s  | 790
      2 | 7m13s  | 412
      4 | 4m10s  | 205
      6 | 2m57s  | 153
      8 | 2m26s  | 134
     12 | 2m2s   | 118
     14 | 2m15s  | 116

Old   April 18, 2024, 03:38
Default CPU frequency vs. L3 cache
  #769
New Member
 
Jamie
Join Date: Apr 2024
Posts: 1
Rep Power: 0
fishladderguy is on a distinguished road
Hi, I am quite new to CFD (2D guy, water simulations). I have read this thread and learned a lot about hardware, thanks everyone. I am going to build a server, something like 2x Epyc (used) and 16 x 16GB RAM. The question is: should I prefer CPU frequency or L3 cache if the other specs are about the same? For example Epyc 7532 (2.4 GHz / 256 MB) vs. Epyc 7542 (2.9 GHz / 128 MB). Or is there any notable difference? Thanks

Old   April 19, 2024, 18:48
Default
  #770
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 316
Rep Power: 12
wkernkamp is on a distinguished road
Quote:
Originally Posted by fishladderguy View Post
Hi, I am quite new to CFD (2D guy, water simulations). I have read this thread and learned a lot about hardware, thanks everyone. I am going to build a server, something like 2x Epyc (used) and 16 x 16GB RAM. The question is: should I prefer CPU frequency or L3 cache if the other specs are about the same? For example Epyc 7532 (2.4 GHz / 256 MB) vs. Epyc 7542 (2.9 GHz / 128 MB). Or is there any notable difference? Thanks
I recommend you go for cache, because it helps memory performance, which is critical for CFD. In addition, more cache usually means more chiplets, and each chiplet contributes its own Infinity Fabric links to the memory system. Flotus is the expert on the Epycs; check with him whether I am right to prefer the Epyc 7532.

Old   April 20, 2024, 08:55
Default
  #771
New Member
 
DS
Join Date: Jan 2022
Posts: 13
Rep Power: 4
Crowdion is on a distinguished road
HPE DL360 Gen9, 2 x E5-2643 v4 (HT off), 16 x 16GB DDR4-2400 (operating at 2133 MHz; measured bandwidth with Intel MLC is 105 GB/s)
OpenFOAM v2312 (precompiled), Ubuntu 23.10.1, Motorbike_bench_template.tar.gz (default settings)

# cores | Meshing wall time (real) | Solver wall time (s)
---------------------------------------------------------
      1 | 13m14s | 1000
      2 | 8m17s  | 479
      4 | 4m40s  | 219
      6 | 3m26s  | 156
      8 | 2m50s  | 122
     12 | 2m34s  | 94

P.S.

I tested how the number of populated RAM slots affects the measured RAM bandwidth and got some pretty odd results. With 2 modules per memory channel (16 DDR4-2400 modules in total), the RAM clocks down slightly to 2133 MHz and the measured throughput is 105 GB/s. With 1 module per channel (8 modules in total), the RAM runs at 2400 MHz, yet the measured throughput is 103 GB/s. In other words, as the RAM frequency goes up, the throughput goes down, which is quite strange behavior.


Old   April 20, 2024, 16:00
Default
  #772
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 316
Rep Power: 12
wkernkamp is on a distinguished road
Quote:
Originally Posted by Crowdion View Post
HPE DL360 Gen9, 2 x E5-2643 v4 (HT off), 16 x 16GB DDR4-2400 (operating at 2133 MHz; measured bandwidth with Intel MLC is 105 GB/s)


I tested how the number of populated RAM slots affects the measured RAM bandwidth and got some pretty odd results. With 2 modules per memory channel (16 DDR4-2400 modules in total), the RAM clocks down slightly to 2133 MHz and the measured throughput is 105 GB/s. With 1 module per channel (8 modules in total), the RAM runs at 2400 MHz, yet the measured throughput is 103 GB/s. In other words, as the RAM frequency goes up, the throughput goes down, which is quite strange behavior.
The available bandwidth for your 8-channel system is:
DDR4-2133: 8 channels x 2133 MT/s x 8 bytes = 136.5 GB/s
DDR4-2400: 8 channels x 2400 MT/s x 8 bytes = 153.6 GB/s

Neither of your measurements reaches that limit. On a dual-socket system, the measurement can be affected by one CPU reading from memory attached to the other socket through the interconnect, which has its own bandwidth limit and, of course, added latency.

The nice thing about the OpenFOAM motorbike benchmark is that, on a properly set up system, performance is proportional to memory bandwidth. Does the benchmark show a difference between the 2133 and 2400 memory speeds?

I don't remember whether the HP DL360 lets you force DDR4-2400 when both slots per channel are occupied; most server systems have that option. I have found that on these Xeon v1 through v4 systems, the fastest configuration is two DIMMs per channel of dual-rank memory. All DIMMs must be identical to keep the system symmetric; asymmetric memory configurations incur large penalties, while one DIMM per channel, or single-rank memory in one or more channels, incurs only small penalties.
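
For reference, the rule of thumb behind those two numbers, as a tiny sketch (channels x transfer rate x 8 bytes per transfer):

Code:
def peak_bandwidth_gbs(channels, mt_per_s):
    """Theoretical peak memory bandwidth in GB/s, 8 bytes per transfer."""
    return channels * mt_per_s * 8 / 1000

# Dual E5-2643 v4: 2 sockets x 4 channels = 8 channels total
for speed in (2133, 2400):
    print(f"DDR4-{speed}: {peak_bandwidth_gbs(8, speed):.1f} GB/s")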

Old   April 20, 2024, 16:07
Default
  #773
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 316
Rep Power: 12
wkernkamp is on a distinguished road
Quote:
Originally Posted by Crowdion View Post
HPE DL360 Gen9, 2 x E5-2643 v4 (HT off), 16 x 16GB DDR4-2400 (operating at 2133 MHz; measured bandwidth with Intel MLC is 105 GB/s)
OpenFOAM v2312 (precompiled), Ubuntu 23.10.1, Motorbike_bench_template.tar.gz (default settings)

# cores | Meshing wall time (real) | Solver wall time (s)
---------------------------------------------------------
      1 | 13m14s | 1000
      2 | 8m17s  | 479
      4 | 4m40s  | 219
      6 | 3m26s  | 156
      8 | 2m50s  | 122
     12 | 2m34s  | 94
My fastest Xeon v4 system is in the thread here: OpenFOAM benchmarks on various hardware

It has this performance for comparison with your run:
Flow Calculation:
1 924.05
2 483.68
4 214.54
8 113.42
12 85.05

Your results are already looking good! Note that that run used two 16-core CPUs, each with a proportionally larger cache, which helps memory access. So I don't think you will be able to reach 85.05 s on your system with the E5-2643 v4 CPUs.

Old   April 20, 2024, 19:37
Default
  #774
New Member
 
DS
Join Date: Jan 2022
Posts: 13
Rep Power: 4
Crowdion is on a distinguished road
Quote:
Originally Posted by wkernkamp View Post
The available bandwidth for your 8-channel system is:
DDR4-2133: 8 channels x 2133 MT/s x 8 bytes = 136.5 GB/s
DDR4-2400: 8 channels x 2400 MT/s x 8 bytes = 153.6 GB/s

Neither of your measurements reaches that limit. On a dual-socket system, the measurement can be affected by one CPU reading from memory attached to the other socket through the interconnect, which has its own bandwidth limit and, of course, added latency.

The nice thing about the OpenFOAM motorbike benchmark is that, on a properly set up system, performance is proportional to memory bandwidth. Does the benchmark show a difference between the 2133 and 2400 memory speeds?

I don't remember whether the HP DL360 lets you force DDR4-2400 when both slots per channel are occupied; most server systems have that option. I have found that on these Xeon v1 through v4 systems, the fastest configuration is two DIMMs per channel of dual-rank memory. All DIMMs must be identical to keep the system symmetric; asymmetric memory configurations incur large penalties, while one DIMM per channel, or single-rank memory in one or more channels, incurs only small penalties.
Yes, I was surprised that the measured peak bandwidth (BW) of my system is considerably lower than the theoretical peak values you mention. In fact, 105 GB/s for a dual-CPU config corresponds to DDR3-era rates.
HP states that 2 DIMMs per channel run at 2133 MHz, and 1 DIMM per channel at 2400 MHz.



My DL360 Gen9 has HP-certified HPE 809082-091 single-rank RAM installed.

I removed 8 DIMMs to get a 1 DIMM/channel configuration, reran the benchmark, and got the following results:

HPE DL360 Gen9, 2 x E5-2643 v4 (HT off), 8 x 16GB DDR4-2400 (operating at 2400 MHz)
OpenFOAM v2312 (precompiled), Ubuntu 23.10.1, Motorbike_bench_template.tar.gz (default settings)

# cores | 8 x 16GB @ 2400 MHz         | 16 x 16GB @ 2133 MHz
        | Meshing (real) | Solver (s) | Meshing (real) | Solver (s)
--------------------------------------------------------------------
      1 | 12m2s          | 902        | 13m14s         | 1000
      2 | 8m10s          | 471        | 8m17s          | 479
      4 | 4m35s          | 221        | 4m40s          | 219
      6 | 3m32s          | 160        | 3m26s          | 156
      8 | 2m45s          | 130        | 2m50s          | 122
     12 | 2m34s          | 115        | 2m34s          | 94


Single-core performance is better with the 8 x 16GB config, whereas multi-core performance is better with the 16 x 16GB config. Hmm, a very strange situation.

I found a Reddit thread reporting measured BW values for different systems. It reports a peak of ~138 GB/s for 2 x E5-2683 v4 (Supermicro X10DRG-OT+-CPU, 8 x 32GB DDR4-2400 Samsung RAM), which is much closer to the theoretical peak of 153.6 GB/s than what I measured.

Old   April 20, 2024, 20:32
Default
  #775
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 316
Rep Power: 12
wkernkamp is on a distinguished road
Quote:
Originally Posted by Crowdion View Post
Yes, I was surprised that the measured peak bandwidth (BW) of my system is considerably lower than the theoretical peak values you mention.

HPE DL360 Gen9, 2 x E5-2643 v4 (HT off), 8 x 16GB DDR4-2400 (operating at 2400 MHz)
OpenFOAM v2312 (precompiled), Ubuntu 23.10.1, Motorbike_bench_template.tar.gz (default settings)

# cores | 8 x 16GB @ 2400 MHz         | 16 x 16GB @ 2133 MHz
        | Meshing (real) | Solver (s) | Meshing (real) | Solver (s)
--------------------------------------------------------------------
      1 | 12m2s          | 902        | 13m14s         | 1000
      2 | 8m10s          | 471        | 8m17s          | 479
      4 | 4m35s          | 221        | 4m40s          | 219
      6 | 3m32s          | 160        | 3m26s          | 156
      8 | 2m45s          | 130        | 2m50s          | 122
     12 | 2m34s          | 115        | 2m34s          | 94

Single-core performance is better with the 8 x 16GB config, whereas multi-core performance is better with the 16 x 16GB config. Hmm, a very strange situation.

I found a Reddit thread reporting measured BW values for different systems. It reports a peak of ~138 GB/s for 2 x E5-2683 v4 (Supermicro X10DRG-OT+-CPU, 8 x 32GB DDR4-2400 Samsung RAM), which is much closer to the theoretical peak of 153.6 GB/s than what I measured.
This is a fun puzzle!

I think you need at least two ranks per channel to reach maximum throughput, because the memory controller alternates between the ranks for better performance; it is called rank interleaving, I think.

Your best option is to go into the BIOS and see whether you can force 2400 MHz for the two-DIMMs-per-channel config. You may already have that setting, because the 94 s result is pretty decent. I just checked, and the E5-2643 v4 has an unusually large 20 MB cache, more than the usual 2.5 MB per core, which is one reason it is doing so well.

If this is a machine you are going to use for a CFD project, you should look into getting a higher-core-count processor; they are cheap. Note that a BIOS upgrade is sometimes needed before the newer, higher-core-count CPUs work; I have run into that problem. I have a pair of E5-2683 v4, the lower-clocked 16-core CPU (versus the E5-2697A v4 for which I showed the result). The 2683 will do the benchmark in 64 seconds, so not much slower. I also have the 18-core E5-2686 v4; I don't remember its performance, probably around 62 seconds. The Gigabyte motherboard has better memory performance with the same CPU: per the manual, it can run two DIMMs per channel at 2400 MHz.

Just looked on eBay and the E5-2683 v4 is on offer for $25.

Old   April 21, 2024, 05:04
Default
  #776
New Member
 
Marius
Join Date: Sep 2022
Posts: 19
Rep Power: 3
Counterdoc is on a distinguished road
I am currently looking for a second-hand server system. I found some offers on eBay for the Intel E7-8880 in different versions.


4x Intel Xeon E7-8880 v4 - 22 cores @ 2.2 GHz base / 3.3 GHz turbo - DDR4 1866 MHz - 55 MB cache

4x Intel Xeon E7-8880 v3 - 18 cores @ 2.2 GHz base / 3.1 GHz turbo - DDR4 1866 MHz - 45 MB cache

8x Intel Xeon E7-8880 v2 - 15 cores @ 2.5 GHz base / 3.1 GHz turbo - DDR3 1600 MHz - 37.5 MB cache


Of course the v2 is the cheapest. It would also have the most cores: v2 = 120 cores, v3 = 72 cores, v4 = 88 cores.

I think v3 makes no sense, as the prices are similar to v4 systems. But how would v2 compare with v4 when there is such a big difference in core count? Could DDR3 or the cache be the bottleneck?

Old   April 21, 2024, 14:01
Default
  #777
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 316
Rep Power: 12
wkernkamp is on a distinguished road
Quote:
Originally Posted by Counterdoc View Post
I am currently looking for a second-hand server system. I found some offers on eBay for the Intel E7-8880 in different versions.

4x Intel Xeon E7-8880 v4 - 22 cores @ 2.2 GHz base / 3.3 GHz turbo - DDR4 1866 MHz - 55 MB cache

4x Intel Xeon E7-8880 v3 - 18 cores @ 2.2 GHz base / 3.1 GHz turbo - DDR4 1866 MHz - 45 MB cache

8x Intel Xeon E7-8880 v2 - 15 cores @ 2.5 GHz base / 3.1 GHz turbo - DDR3 1600 MHz - 37.5 MB cache

Of course the v2 is the cheapest. It would also have the most cores: v2 = 120 cores, v3 = 72 cores, v4 = 88 cores.

I think v3 makes no sense, as the prices are similar to v4 systems. But how would v2 compare with v4 when there is such a big difference in core count? Could DDR3 or the cache be the bottleneck?

The number of cores matters less than the total memory bandwidth. If you search "intel ark E7-8880" you will find that all three processors list the same 85 GB/s maximum bandwidth. This means the eight-processor v2 system will have almost twice the performance of the other two, which have only four processors, provided the memory is configured correctly. The E7 v2, v3 and v4 use a special "Jordan Creek" memory buffer that allows two DDR-1333 DIMMs to act as one DDR-2666 unit. The bandwidth is then 2666 x 4 channels x 8 bytes / 1000, or about 85 GB/s, which is the speed limit for each of these CPUs. The higher DDR4 speeds that the v3 and v4 support are only useful when there is just one DIMM per channel. You need eight DIMMs per processor to reach full bandwidth; these DIMMs are not expensive if you need to buy more.
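
To put the socket-count argument in numbers, here is a small sketch that assumes each E7-8880 tops out at the same roughly 85 GB/s through the Jordan Creek buffers (illustrative arithmetic only):

Code:
def socket_bandwidth_gbs(channels=4, mt_per_s=2666):
    """Per-socket peak: 4 channels behaving like DDR-2666, 8 bytes per transfer."""
    return channels * mt_per_s * 8 / 1000

systems = {"8x E7-8880 v2": 8, "4x E7-8880 v3": 4, "4x E7-8880 v4": 4}
for name, sockets in systems.items():
    print(f"{name}: ~{sockets * socket_bandwidth_gbs():.0f} GB/s total")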



The Xeon CPUs get progressively more power-efficient with each generation, so if power consumption is a concern, you should go for the v4 system.


Once you have bought your system, run the benchmark and post your result here. It is a good check that everything is working correctly.

Old   April 23, 2024, 14:16
Default
  #778
New Member
 
Marius
Join Date: Sep 2022
Posts: 19
Rep Power: 3
Counterdoc is on a distinguished road
Quote:
Originally Posted by wkernkamp View Post
The number of cores matters less than the total memory bandwidth. If you search "intel ark E7-8880" you will find that all three processors list the same 85 GB/s maximum bandwidth. This means the eight-processor v2 system will have almost twice the performance of the other two, which have only four processors, provided the memory is configured correctly. The E7 v2, v3 and v4 use a special "Jordan Creek" memory buffer that allows two DDR-1333 DIMMs to act as one DDR-2666 unit. The bandwidth is then 2666 x 4 channels x 8 bytes / 1000, or about 85 GB/s, which is the speed limit for each of these CPUs. The higher DDR4 speeds that the v3 and v4 support are only useful when there is just one DIMM per channel. You need eight DIMMs per processor to reach full bandwidth; these DIMMs are not expensive if you need to buy more.



The Xeon CPUs get progressively more power-efficient with each generation, so if power consumption is a concern, you should go for the v4 system.


Once you have bought your system, run the benchmark and post your result here. It is a good check that everything is working correctly.



Thanks a lot for the explanation and recommendation! What about the SSDs? I don't want to have a bottleneck there either.

Old   April 23, 2024, 16:56
Default
  #779
Senior Member
 
Will Kernkamp
Join Date: Jun 2014
Posts: 316
Rep Power: 12
wkernkamp is on a distinguished road
Quote:
Originally Posted by Counterdoc View Post
Thanks a lot for the explanation and recommendation! What about the SSDs? I don't want to have a bottleneck there either.
You did not ask about SSDs before. In general, disk storage is not a big factor, because the data gets cached in RAM, so repeated reads come from RAM instead of disk. These old servers also typically come with some kind of SAS card. Those cards can be used in RAID configurations with high-speed reads and writes, or better, you can simply attach SSDs to them; that is more economical, because SSDs use almost no power when idle and have far faster random access.

What I tend to do is set up redundant ZFS with the HDDs (cheap storage) and SSDs as cache disk and log device. For that purpose your old 128GB SSDs are perfect. The problem with old server HDDs is that they are often near end of life; with ZFS you can correct disk errors when they occur. Mostly the old server SAS HDDs are fine, but an occasional failure won't hurt if you run ZFS. (Hardware RAID is not as good for that.)

If you don't need a lot of storage, you can just use one or two 1TB SSDs. Two mirrored disks read twice as fast and obviously give you redundancy.

If you are looking to use NVMe drives, you might be better off with a v3 or v4 system, because their BIOS usually allows booting from NVMe and PCIe splitting. My Quanta Grid server has an NVMe slot on the motherboard. There are cheap PCIe x16 cards that let you run, say, 4 NVMe SSDs off a PCIe x16 slot split into four x4 chunks. There are also specialized cards that handle multiple NVMe and M.2 SATA drives on a slot that has not been split.

The BIOS on v2 systems can sometimes be modified to allow booting from NVMe; I have done this successfully on a Supermicro server, and that BIOS now also allows PCIe splitting.
