Problem running fluent with InfiniBand

Post #1 by blackpuma, March 20, 2009, 16:48
Good evening,

I hope someone can help me. I have a new small cluster, my first one with InfiniBand. I am trying to run Fluent over InfiniBand, but it always fails with:

fluent_mpi.6.3.26: Rank 0:10: MPI_Init: dlopen failed: libmtl_common.so: cannot open shared object file: No such file or directory
fluent_mpi.6.3.26: Rank 0:10: MPI_Init: vapi_resolve_entrypoints() failed
fluent_mpi.6.3.26: Rank 0:10: MPI_Init: Can't initialize RDMA device
fluent_mpi.6.3.26: Rank 0:10: MPI_Init: MPI BUG: Cannot initialize RDMA protocol

I can start fluent over Ethernet without any problem.

Where can I get the missing file, libmtl_common.so?

The OS is CentOS 5.2.

Goodbye,
Blackpuma
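
The first error line says the dynamic loader could not open libmtl_common.so. A minimal sketch for reproducing that check outside Fluent is shown here; the helper name `can_dlopen` is hypothetical, and the result is only meaningful if the script runs on a compute node with the same library search path (for example `LD_LIBRARY_PATH`) as the MPI ranks, and if the Python interpreter has the same word size (32-bit or 64-bit) as the library.

```python
import ctypes


def can_dlopen(library_name):
    """Return (True, "") if the dynamic loader can open the library,
    otherwise (False, <loader error message>)."""
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    # Library name taken from the MPI_Init error message in this post.
    ok, reason = can_dlopen("libmtl_common.so")
    print("loadable" if ok else f"not loadable: {reason}")
```

If the library is reported as not loadable, the loader message usually states whether the file is missing or merely not on the search path.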

Post #2 by shainer (Gilad shainer), March 22, 2009, 03:26
Have you started the subnet manager first, to bring the IB network up?

Post #3 by blackpuma, March 22, 2009, 07:13
OpenSM is running on the head node, and an ibping worked. Do I have to install this program on every node?

Can someone tell me where I can get the file libmtl_common.so? Which package contains it?

Post #4 by shainer (Gilad shainer), March 23, 2009, 13:13
You can send an email to hpc@mellanox.com, and they will be able to help you. This is the address of the HPC Advisory Council help desk (free .. :-) )

Post #5 by Chinmay, August 3, 2009, 14:22
I have the same problem, with the following error:

Host spawning Node 0 on machine "cl1n004" (unix).
/home/cfd/FLUENT/Fluent.Inc/fluent6.3.26/bin/fluent -r6.3.26 3ddp -node -alnx86 -t16 -pib -mpi=hp -cnf=parallel -mport 10.0.1.4:10.0.1.4:38940:0
Starting /home/cfd/FLUENT/Fluent.Inc/fluent6.3.26/multiport/mpi/lnx86/hp/bin/mpirun -prot -vapi -e MPI_HASIC_VAPI=1 -e MPI_USE_MALLOPT_SBRK_PROTECTION=1 -e MPI_USE_MALLOPT_AVOID_MMAP=1 -f /tmp/fluent-appfile.25401
fluent_mpi.6.3.26: Rank 0:0: MPI_Init: ERROR: The total amount of memory that may be pinned (3355443 bytes), is insufficient to support even minimal rdma network transfers. This value was derived by taking 20% of physical memory (134217728 bytes) and dividing by the number of local ranks (8). A minimum of 14688256 bytes must be able to be pinned. These values can be changed by setting the environment variables MPI_PIN_PERCENTAGE and MPI_PHYSICAL_MEMORY (Mbytes).
fluent_mpi.6.3.26: Rank 0:0: MPI_Init: Error intializing pin/unpin structures
fluent_mpi.6.3.26: Rank 0:0: MPI_Init: MPI BUG: Cannot initialize RDMA protocol
MPI Application rank 0 exited before MPI_Init() with status 1
fluent_mpi.6.3.26: Rank 0:8: MPI_Init: ERROR: The total amount of memory that may be pinned (3355443 bytes), is insufficient to support even minimal rdma network transfers. This value was derived by taking 20% of physical memory (134217728 bytes) and dividing by the number of local ranks (8). A minimum of 14688256 bytes must be able to be pinned. These values can be changed by setting the environment variables MPI_PIN_PERCENTAGE and MPI_PHYSICAL_MEMORY (Mbytes).
fluent_mpi.6.3.26: Rank 0:8: MPI_Init: Error intializing pin/unpin structures
fluent_mpi.6.3.26: Rank 0:8: MPI_Init: MPI BUG: Cannot initialize RDMA protocol
MPI Application rank 8 exited before MPI_Init() with status 1
fluent_mpi.6.3.26: Rank 0:2: MPI_Init: ERROR: The total amount of memory that may be pinned (3355443 bytes), is insufficient to support even minimal rdma network transfers. This value was derived by taking 20% of physical memory (134217728 bytes) and dividing by the number of local ranks (8). A minimum of 14688256 bytes must be able to be pinned. These values can be changed by setting the environment variables MPI_PIN_PERCENTAGE and MPI_PHYSICAL_MEMORY (Mbytes).
fluent_mpi.6.3.26: Rank 0:2: MPI_Init: Error intializing pin/unpin structures
fluent_mpi.6.3.26: Rank 0:2: MPI_Init: MPI BUG: Cannot initialize RDMA protocol
MPI Application rank 1 killed before MPI_Init() with signal 15
MPI Application rank 2 exited before MPI_Init() with status 1
MPI Application rank 4 killed before MPI_Init() with signal 15
MPI Application rank 6 killed before MPI_Init() with signal 15
MPI Application rank 3 killed before MPI_Init() with signal 15
MPI Application rank 5 killed before MPI_Init() with signal 15
MPI Application rank 7 killed before MPI_Init() with signal 15
fluent_mpi.6.3.26: Rank 0:14: MPI_Init: ERROR: The total amount of memory that may be pinned (3355443 bytes), is insufficient to support even minimal rdma network transfers. This value was derived by taking 20% of physical memory (134217728 bytes) and dividing by the number of local ranks (8). A minimum of 14688256 bytes must be able to be pinned. These values can be changed by setting the environment variables MPI_PIN_PERCENTAGE and MPI_PHYSICAL_MEMORY (Mbytes).
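
The arithmetic in the error message can be checked directly. The sketch below only replays the formula quoted in the log (percentage of physical memory, divided by the number of local ranks); the function names are hypothetical, and integer rounding is an assumption that happens to reproduce the logged value. Note that 134217728 bytes is exactly 128 MiB, which seems low for a node hosting 8 ranks, so the detected memory size itself may be worth checking.

```python
def pinnable_per_rank(physical_memory_bytes, pin_percentage, local_ranks):
    """Pinnable bytes per rank, following the formula quoted in the error text."""
    return physical_memory_bytes * pin_percentage // 100 // local_ranks


def min_physical_memory(min_pin_bytes, pin_percentage, local_ranks):
    """Smallest physical memory (bytes) for which each rank reaches min_pin_bytes."""
    needed = min_pin_bytes * local_ranks * 100
    return -(-needed // pin_percentage)  # ceiling division on integers


if __name__ == "__main__":
    detected = 134217728   # "physical memory" reported in the log
    minimum = 14688256     # minimum pinnable bytes per rank from the log
    per_rank = pinnable_per_rank(detected, 20, 8)
    print("pinnable per rank:", per_rank)  # 3355443, matching the log
    print("enough:", per_rank >= minimum)
    print("physical memory needed at 20%:", min_physical_memory(minimum, 20, 8))
```

Under these assumptions, the same helpers show what a given MPI_PIN_PERCENTAGE or MPI_PHYSICAL_MEMORY setting (the latter is given in Mbytes, per the message) would yield per rank.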

Post #6 by blackpuma, August 4, 2009, 01:27
Good morning Chinmay!

Are you starting Fluent over InfiniBand or Ethernet?

Try setting the hard and soft memlock limits to unlimited. To do that, add these two lines to the file /etc/security/limits.conf:

Code:
.
.
.
*               soft    memlock          unlimited
*               hard    memlock          unlimited
.
.
.
Do this on all nodes.
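
To confirm that the limits are actually in effect, the memlock limit of a running process can be read back. This is a minimal sketch using Python's standard `resource` module (Unix only); the helper names are hypothetical. Because limits.conf is applied when a login session is opened, the check is most informative when run in a session started the same way the MPI ranks are started on each node.

```python
import resource


def format_limit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK pair of the current process."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


if __name__ == "__main__":
    soft, hard = memlock_limits()
    print("memlock soft:", format_limit(soft))
    print("memlock hard:", format_limit(hard))
```

If either value is not reported as unlimited on a compute node, the limits.conf entries are not reaching the processes started there.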

Post #7 by Chinmay, August 4, 2009, 12:50
Hi,
Thanks for your help.
I am trying to start Fluent over InfiniBand.
The hard and soft limits are already set to unlimited.

Post #8 by blackpuma, August 4, 2009, 13:51
Are all InfiniBand devices active?

Try ibstat:

Code:
CA 'mlx4_0'
    CA type: MT25418
    Number of ports: 2
    Firmware version: 2.5.0
    Hardware version: a0
    Node GUID: 0x001e0bffff8446a4
    System image GUID: 0x001e0bffff8446a7
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 20
        Base lid: 5
        LMC: 0
        SM lid: 1
        Capability mask: 0x02510868
        Port GUID: 0x001e0bffff8446a5
    Port 2:
        State: Down
        Physical state: Polling
        Rate: 10
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x02510868
        Port GUID: 0x001e0bffff8446a6
If not:

Is opensm, the subnet manager, installed and running?
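
On a cluster with many nodes, reading ibstat output by eye gets tedious. The following is a rough sketch of a parser that extracts the State of each port from text in the format shown above; the function name `port_states` is hypothetical, the sample is abbreviated from the listing in this post, and other ibstat versions may format their output differently.

```python
import re

# Abbreviated from the ibstat listing above.
SAMPLE = """\
CA 'mlx4_0'
    CA type: MT25418
    Number of ports: 2
    Port 1:
        State: Active
        Physical state: LinkUp
        Port GUID: 0x001e0bffff8446a5
    Port 2:
        State: Down
        Physical state: Polling
        Port GUID: 0x001e0bffff8446a6
"""


def port_states(ibstat_text):
    """Map (ca_name, port_number) -> State string from ibstat-style text."""
    states = {}
    ca = None
    port = None
    for raw in ibstat_text.splitlines():
        line = raw.strip()
        m = re.match(r"CA '(.+)'$", line)
        if m:
            ca, port = m.group(1), None
            continue
        m = re.match(r"Port (\d+):$", line)
        if m:
            port = int(m.group(1))
            continue
        # Case-sensitive match, so "Physical state:" lines are skipped.
        m = re.match(r"State:\s*(\S+)", line)
        if m and ca is not None and port is not None:
            states[(ca, port)] = m.group(1)
    return states


if __name__ == "__main__":
    for (ca, port), state in sorted(port_states(SAMPLE).items()):
        print(f"{ca} port {port}: {state}")
```

A port that is cabled to the fabric but not Active is the case where the subnet manager question becomes relevant.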

Post #9 by Chinmay, August 8, 2009, 06:53
Output of ibstat:

CA 'mthca0'
    CA type: MT25204
    Number of ports: 1
    Firmware version: 1.2.0
    Hardware version: a0
    Node GUID: 0x0008f1040397e9f0
    System image GUID: 0x0008f1040397e9f3
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 10
        Base lid: 2
        LMC: 0
        SM lid: 1
        Capability mask: 0x02510a68
        Port GUID: 0x0008f1040397e9f1

I can run Fluent over Ethernet but not over InfiniBand.

Post #10 by Chinmay, August 8, 2009, 06:58
Initially HP-MPI was not installed. I have now installed it (ver. 2.3.1) on the master node and two other nodes, but I still cannot run Fluent over InfiniBand.

Post #11 by Stone, August 28, 2011, 01:16
Quote:
Originally Posted by Chinmay
I have the same problem, with the following error: [error log identical to post #5]

Hi,
Have you solved your problem? I ran into the same problem recently. If you solved it, could you help me out? I would appreciate it.

Post #12 by mali28, September 20, 2012, 05:35
Quote:
Originally Posted by Stone
Hi,
Have you solved your problem? I ran into the same problem recently. If you solved it, could you help me out? I would appreciate it.

See the solution below:
http://www.eureka.im/1717.html
