[OpenFOAM.org] Trouble Compiling OpenFOAM-dev using Intel Compiler 15 for use on Xeon Phi

#1 | August 18, 2015, 14:32
foamer123 (A), New Member
Hi All,

As the title states, I've been having some trouble getting OpenFOAM-dev to run on the Xeon Phi's MIC architecture. I have successfully compiled and used OpenFOAM-dev with Intel Compiler 15 on the host machine (CentOS7 on x86_64) before. But altering the setup for use on the Phi ("-mmic" options, etc.) has proven difficult. The issues below surface when compiling with either "icpc" or "mpiicpc"; the choice doesn't seem to make a difference.

Given the setup I'll lay out below based on the attached files, I have had a mostly successful compilation. The "src" directory compiles with no errors. The only issue in compiling the "applications/utilities" directory is an issue with "setSet". The "applications/solvers" directory successfully compiles the solvers that are of use to me (there are some errors with some of the "reacting foams", "solid foams", etc. but these don't matter to me).

As expected, after compilation the host machine cannot execute the binaries while the Xeon Phi can. Unfortunately, when executing any command (checkMesh, blockMesh, icoFoam, etc.) the only output produced is "Segmentation Fault". There is no other output, so I have had some issues troubleshooting the cause. Hopefully, somebody on this forum can provide some insight.

I've attached all the files in my setup that I believe are of interest. "compileSourceMe" is sourced before compilation. "micSourceMe" is sourced to set up the environment on the Phi after compilation. The long list of linked libraries in "wmake/rules/linux64Icc/c++" under "LINKEXE" appears to be necessary. Even though the linker already searches places like "FOAM_LIBBIN", I would get "cannot find library" errors unless I explicitly listed the libraries like this (this was not necessary when compiling for use on the host machine).

I also used this website (http://machls.cc.oita-u.ac.jp/kenkyu...-0-on-xeon-phi) as a guide for setting up the Third Party applications. This was written for Intel Compiler 14 on CentOS6.5 so it is slightly out of date and I had to make some tweaks. I'm not sure of the necessity of the "CGAL/boost" workaround anymore but this shouldn't be an issue because CGAL is optional, correct? I did not receive any errors relating to boost libraries using this method.

One aside, the current attempts have been with 32bit labels as I have had some issues with scotch trying to compile with 64bit labels (I will most likely need 64bit labels in the future but, one step at a time, right?). Not sure if this is relevant to my current issue or not but figured it was worth mentioning.

Thanks for the help. I put all my various "wmake rules" files into a single text file separated with headers due to the 5 file upload limit. I can upload my Allwmake logs in a separate post as well if needed. If there's anything else I can provide, just let me know.
Attached Files
File Type: txt compileSourceMe.txt (295 Bytes, 8 views)
File Type: txt micSourceMe.txt (307 Bytes, 5 views)
File Type: txt etc-bashrc.txt (7.4 KB, 2 views)
File Type: txt etc-config-settings.sh.txt (18.0 KB, 2 views)
File Type: txt wmake-rules-linux64Icc.txt (3.5 KB, 3 views)

#2 | August 18, 2015, 14:39
foamer123 (A), New Member
And here are the Allwmake logs.
Attached Files
File Type: txt applications-solversAllwmakeLog.txt (50.1 KB, 2 views)
File Type: txt applications-utilitiesAllwmakeLog.txt (72.5 KB, 3 views)
File Type: txt src-srcAllwmakeLog.txt (19.7 KB, 3 views)

#3 | August 18, 2015, 15:06
wyldckat (Bruno Santos), Retired Super Moderator
Greetings foamer123 and welcome to the forum!

I don't have access to a Xeon Phi, so I can't help with that part.
Nonetheless, I do understand the reason for the errors in the logs you attached to the second post.
  • In the "solvers" folder, the problem seems to be that you added the dependency "-lreactingTwoPhaseSystem" to "applications/solvers/lagrangian/reactingParcelFoam/simpleReactingParcelFoam".
  • In the "utilities" folder, the problem is this:
    Code:
    catastrophic error: cannot open source file "readline/readline.h"
    You need to install the respective library in your system, e.g.:
    Code:
    yum install readline-devel ncurses-devel
    • Of course the problem is that you probably do have this installed in your system, but not for the Xeon Phi. In which case, edit the Allwmake for setSet, location given by this command:
      Code:
      echo $FOAM_UTILITIES/mesh/manipulation/setSet/Allwmake
      and comment out this block:
      Code:
      if [ -f /usr/include/readline/readline.h ]
      then
          echo "Found <readline/readline.h>  --  enabling readline support."
          export COMP_FLAGS="-DHAS_READLINE"
          export LINK_FLAGS="-lreadline"
      fi
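
As an alternative to commenting the block out, here is a minimal sketch of a check that looks under a target root instead of the host's /usr/include. The names "TARGET_SYSROOT" and "has_target_readline" are hypothetical (nothing in OpenFOAM defines them), and where the Phi's headers actually live on a given system is an assumption you would have to fill in yourself.

```shell
# Hypothetical helper: succeeds when readline's header exists under the
# given root directory. With an empty root it checks the same host path
# as the original block (/usr/include/readline/readline.h).
has_target_readline() {
    sysroot="${1:-}"
    [ -f "$sysroot/usr/include/readline/readline.h" ]
}

# TARGET_SYSROOT is an assumed variable pointing at the cross-compilation root.
if has_target_readline "${TARGET_SYSROOT:-}"
then
    echo "Found <readline/readline.h>  --  enabling readline support."
    export COMP_FLAGS="-DHAS_READLINE"
    export LINK_FLAGS="-lreadline"
else
    echo "No readline header for the target  --  building setSet without it."
fi
```

The point of the design is only that the test and the compiler should look at the same tree; if the header is found on the host but the build targets the Phi, the flags get enabled for a library the target does not have.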
Best regards,
Bruno

#4 | August 18, 2015, 16:54
foamer123 (A), New Member
Bruno,

Thanks for the help.
  • You were correct on the solver issue. I'm not sure why I added that dependency in there. The solvers compiled fine without it.
  • I believe you are correct that the setSet issue had to do with the Phi's libraries (or lack thereof), as this was not a problem when compiling for the host. Following your suggestion fixed this as well.

OpenFOAM now compiles without throwing any visible errors. Unfortunately, the segmentation fault issue continues, albeit now with more commands to fail on. You would not happen to know what could cause this on some other architecture you are more familiar with, would you? "Segmentation Fault" is the only output produced (not even the standard OpenFOAM header is written to stdout).

Thanks again!
-A

#5 | August 18, 2015, 17:13
foamer123 (A), New Member
Some additional information for anyone who stumbles upon this thread. The "dmesg" command when run on the Xeon Phi shows that the "segmentation fault" output is created by a "general protection" error, see below:

Quote:
[1467354.007001] icoFoam[12333] general protection ip:4181d7 sp:7fff4c46fb80 error:0 in icoFoam[400000+c8000]
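
For anyone reading such a line: the trailing "icoFoam[400000+c8000]" is the kernel's way of printing the start address and size of the mapping that contains the faulting instruction. A small sketch of the arithmetic follows; the function names are made up for illustration.

```shell
# Offset of the faulting instruction from the start of the mapping.
# Arguments: instruction pointer and mapping base, both in hex without
# a 0x prefix, exactly as dmesg prints them.
fault_offset() {
    printf '0x%x\n' $(( 0x$1 - 0x$2 ))
}

# Succeeds when the instruction pointer lies inside the mapping.
# Third argument: mapping size in hex without a 0x prefix.
offset_in_mapping() {
    [ $(( 0x$1 - 0x$2 )) -ge 0 ] && [ $(( 0x$1 - 0x$2 )) -lt $(( 0x$3 )) ]
}

# Values taken from the dmesg line quoted above.
fault_offset 4181d7 400000    # prints 0x181d7
```

With a build that carries debug symbols, an address like this could in principle be handed to a tool such as addr2line to find the source line, assuming a binutils build that understands the Phi's binaries.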
From Intel's website:

Quote:
Alignment: All memory-based operations must be on properly aligned addresses. Each source-of- memory operand must have an address that is aligned to the number of bytes accessed by the operand. Otherwise a #GP (General Protection) fault will occur. The alignment requirement is dictated by the number of data elements and the type of the data element. For example, if a vector operation needs to access 16 elements of 4-byte (32 bit) single-precision floats, the referenced data elements must be 16x4=64 [number of elements x size of (float)] byte aligned. The Intel® Xeon Phi™ coprocessor memory alignment rules for vector operations are shown in Table 5
Table 5 can be found here: https://software.intel.com/en-us/art...roarchitecture
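
The arithmetic in the quote can be sketched as follows. This only illustrates the rule (elements x element size = required alignment); the function names are hypothetical and the addresses are arbitrary examples, not values from my crash.

```shell
# Required alignment in bytes = number of elements x size of one element.
required_alignment() {
    echo $(( $1 * $2 ))
}

# Succeeds when the address is a multiple of the alignment.
# Addresses may be given in hex with a 0x prefix.
is_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}

align=$(required_alignment 16 4)
echo "16 floats of 4 bytes need ${align}-byte alignment"

# Arbitrary example addresses, chosen only to show both outcomes.
for addr in 0x1000 0x1040 0x1044
do
    if is_aligned "$addr" "$align"
    then
        echo "$addr is ${align}-byte aligned"
    else
        echo "$addr is NOT ${align}-byte aligned"
    fi
done
```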

-A

#6 | August 18, 2015, 19:17
wyldckat (Bruno Santos), Retired Super Moderator
Mmm... I re-read your first post and took a look at the environment you load after compiling... my guess is that there might be a library that is being loaded from the main system.

Try running:
Code:
ldd $(which icoFoam)
This will give you a list of the libraries the application is linked to.
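
To make that list easier to scan, a rough sketch of two filters over the ldd output is given below. The function names are hypothetical, and which directory prefix counts as "expected" for the Phi is an assumption you have to supply yourself.

```shell
# Print resolved library paths (read from ldd output on stdin) that do
# NOT start with the expected prefix given as the first argument.
libs_outside() {
    awk -v p="$1" '$2 == "=>" && $3 ~ /^\// && index($3, p) != 1 { print $3 }'
}

# Print the names of libraries that ldd could not resolve at all
# (lines of the form "libfoo.so => not found").
libs_missing() {
    awk '$2 == "=>" && $3 == "not" { print $1 }'
}

# Usage sketch; /path/to/phi/libs is a placeholder, not a real location:
#   ldd "$(which icoFoam)" | libs_outside /path/to/phi/libs
#   ldd "$(which icoFoam)" | libs_missing
```

Anything the first filter prints is worth a second look, since it is resolved from somewhere other than the tree you intended.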

Beyond this, my guess would be for you to take a step back to OpenFOAM 2.3.0 and try out the instructions you mentioned, to try and see if some of the changes you made are being done properly. Hopefully this will help you isolate the origin of the problem.
As for adapting to ICC 15... ah, found it, I believe this commit has most of the changes needed: https://github.com/OpenFOAM/OpenFOAM...1957a2c7fb7f08

#7 | August 19, 2015, 14:34
foamer123 (A), New Member
Bruno,

I followed your suggestion and
Code:
ldd $(which icoFoam)
showed that some libraries were being pulled from /lib64 which I am not sure was correct. I have redirected these to Xeon Phi specific locations (specifically various directories under /usr/linux-k1om-4.7) but it still does not solve the issue. I've attached a log from the ldd command on the modified library paths with "k1om" being the Xeon Phi's architecture designation and "mic" locations being specific to the Phi as well.

I've also attached a log from running
Code:
strace icoFoam
From looking at this you can see that the last location accessed before the segmentation fault was
Code:
etc/cellModels
This was also the only "open" command which returned a value other than 3 (in this case 4). Technically, that value is acceptable and not an error but could still be meaningful I suppose. This result was the same regardless of library path settings.
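
A strace log is long enough that the final call is easy to overlook, so here is a small sketch for pulling out the last system call recorded before the SIGSEGV marker, plus a count of the "No such file or directory" results. The function names are hypothetical, and the sketch assumes the log was saved to a file first (for example with strace's "-o" option).

```shell
# Print the last system call line that appears before the first line
# mentioning SIGSEGV. Reads a saved strace log on stdin.
last_call_before_segv() {
    awk '/SIGSEGV/ { print last; exit }
         /^[a-z_0-9]+\(/ { last = $0 }'
}

# Count the calls that failed with ENOENT (No such file or directory).
count_enoent() {
    grep -c 'ENOENT'
}

# Usage sketch; strace.log is a placeholder file name:
#   strace -o strace.log icoFoam
#   last_call_before_segv < strace.log
#   count_enoent < strace.log
```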

If there is nothing new this information tells you then I believe you are correct that my next step would be to step back to v2.3.0 and follow the commit changes to make it compatible with the v15 compiler.

If that is the case, it may take me some time, but I will be sure to update the thread for anyone who finds it if I am successful. Thank you again for the help Bruno, you've been very informative.

-A
Attached Files
File Type: txt lddLog.txt (7.9 KB, 1 views)
File Type: zip straceLog.zip (8.5 KB, 3 views)

#8 | August 19, 2015, 15:49
wyldckat (Bruno Santos), Retired Super Moderator
You're welcome for the information, but it's too bad the Xeon Phi is so freaking expensive, otherwise I would have probably already written build instructions for that thingamabob... I gotta find out how to sign up for "free/borrowed stuff for open source development"... assuming I can then find the time to use it.

Anyway, I can't find anything suspicious on the ldd output.
However, on the strace output there are way too many "No such file or directory" for my taste.
Including something that's worrying me: it's looking for the folder "/root/.OpenFOAM", which implies you're trying to run the solver as root, which might be why the system is blocking you out from using the Phi as root... since that's a risky step which could lead to a crash or critical damage to files needed for the main system to work... but this is just a guess.

The "cellModels" file is a reference file for knowing how cells are structured, e.g. how points are ordered/related in a hexahedral cell. Weird thing is that this is the only file that doesn't give the message "No such file or directory"... oh, OK, now I get it: the files that weren't read were those that the library loader was looking for from the "LD_LIBRARY_PATH".

Either way, it was able to load the main "controlDict" and "cellModels" files, but it then ended up crashing before being able to open any more files.

I ran strace on my side (normal machine, not a Phi) and these few lines come after the point where yours crashed:
Code:
open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=3439, ...}) = 0
fstat(4, {st_mode=S_IFREG|0644, st_size=3439, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdd136bd000
read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\v\0\0\0\v\0\0\0\0"..., 4096) = 3439
lseek(4, -2175, SEEK_CUR)               = 1264
read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\v\0\0\0\v\0\0\0\0"..., 4096) = 2175
close(4)                                = 0
munmap(0x7fdd136bd000, 4096)            = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3439, ...}) = 0
uname({sys="Linux", node="myMachineName", ...}) = 0
fstat(1, {st_mode=S_IFREG|0664, st_size=55181, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fdd136bd000
This is for gathering data for the header and then it prints out said header, namely the one the solver usually outputs to let us know it's working.

Therefore, either the crash occurs somewhere in between the two points or it's when it tries to load the local time settings... although this is probably not done directly by OpenFOAM, but probably the time function it calls will need to open this file.

Looking for "Xeon Phi localtime" with Google does give some hits on some weird requirement for making the Phi work as if it's an independent machine... not sure if this is only for some specific use scenarios.

It might be necessary for you to build the Debug build of OpenFOAM, for properly getting down to the bottom of the problem (in case everything works fine with 2.3.0), but my guess would be this: have you tried building and using some other software that is known to work with the Xeon Phi?

#9 | August 20, 2015, 10:55
foamer123 (A), New Member
Bruno,

This is actually my first experience using a Xeon Phi, so it's a bit of a trial by fire.

I do not think the issue is with the /etc/localtime file. It looks like the commands all fail when calling the times function in the previous step (I was mistaken when I said the last call was to open ../OpenFOAM-dev/etc/cellModels. I had missed that times call right at the very end of my strace).

Just to double check, I configured the zoneinfo data referenced by localtime (which, admittedly, was not set up properly on the Phi). But it is now configured and produces the same time zone info as the host machine.

I believe the times command is used to monitor system time devoted to processes (e.g. used by OpenFOAM to determine ExecutionTime and ClockTime when running simulations). Unfortunately, this appears to be a bash built-in which means I cannot strace it to see what it could be accessing on the Phi that could be causing issues. Although, just running times by itself outside of OpenFOAM produces output and does not fail.

It looks like I will have to try compiling v2.3.0 with only those Intel 15 commit changes from the dev implemented and see if this is still an issue or not. The only reason I was using the dev version was because it had been modified for Intel Compiler 15 compatibility. So if a modified 2.3.0 will compile I wouldn't need to go back to the dev version.

Would the debug version of OpenFOAM still be useful if the problem looks to be with a bash built-in? I have not used it before but know the compile option I would need to set before building it.

#10 | August 20, 2015, 14:03
wyldckat (Bruno Santos), Retired Super Moderator
Quote:
Originally Posted by foamer123 View Post
I believe the times command is used to monitor system time devoted to processes (e.g. used by OpenFOAM to determine ExecutionTime and ClockTime when running simulations).
Uhm... no, the "times" in strace refers to something else, not the times command that is built into bash.
The "times" line probably refers to this: https://github.com/OpenFOAM/OpenFOAM...argList.C#L534
Code:
    if (initialise)
    {
        string dateString = clock::date();
        string timeString = clock::clockTime();

        // Print the banner once only for parallel runs
        if (Pstream::master() && bannerEnabled)
        {
            IOobject::writeBanner(Info, true)
                << "Build  : " << Foam::FOAMbuild << nl
                << "Exec   : " << argListStr_.c_str() << nl
                << "Date   : " << dateString.c_str() << nl
                << "Time   : " << timeString.c_str() << nl
                << "Host   : " << hostName() << nl
                << "PID    : " << pid() << endl;
        }
If you check the source code for the "clock" class that OpenFOAM uses here: https://github.com/OpenFOAM/OpenFOAM.../clock/clock.C
Code:
Foam::string Foam::clock::date()
{
    std::ostringstream osBuffer;

    time_t t = getTime();
    struct tm *timeStruct = localtime(&t);

    osBuffer
        << monthNames[timeStruct->tm_mon]
        << ' ' << std::setw(2) << std::setfill('0') << timeStruct->tm_mday
        << ' ' << std::setw(4) << timeStruct->tm_year + 1900;

    return osBuffer.str();
}
My guess is that the "getTime" line here is what gives the "times" in strace and that it's the "localtime" call that crashes in the Phi.

If you comment out most of the code within this method "date()" and run Allwmake again, the crash will likely occur later on after this call and you should be able to see the famous descriptive header for OpenFOAM applications when you run it.

A few strategic "cout" calls might also help isolate the point where it breaks, e.g.:
Code:
std::cout << "got here 00" << std::endl;


Tags
icpc, intel compiler, openfoam-dev, segmentation fault, xeon phi

