How to make a code parallel?

Bruno Machado · May 19, 2016, 06:35

Hi everyone,

I've read the manual but could not gather enough information to understand how to turn a serial code into a parallel one. My UDF (1 header + 3 source files) has several functions, source terms, one DEFINE_ADJUST, one DEFINE_EXECUTE_AT_END and one DEFINE_INIT, but I do not know how to tackle this problem. What has to be made parallel? The macros? The functions?

Can anyone give me a quick explanation so I can work on it and come up with something?

Regards,
B

`e` · May 20, 2016, 01:30

It depends on the macro and what you're trying to do. For example, DEFINE_EXECUTE_AT_END is called by all compute nodes, whereas DEFINE_SOURCE is called cell by cell, on whichever compute node owns the cell, and its return value goes back to the Fluent solver. The UDF manual has detailed information on parallelising UDFs.

Is there anything you're having trouble with specifically?

Bruno Machado · May 24, 2016, 07:03

Quote:

Originally Posted by `e`

It depends on the macro and what you're trying to do. For example, DEFINE_EXECUTE_AT_END is called by all compute nodes, whereas DEFINE_SOURCE is called cell by cell, on whichever compute node owns the cell, and its return value goes back to the Fluent solver. The UDF manual has detailed information on parallelising UDFs.

Is there anything you're having trouble with specifically?

Sorry for the late reply, I have been extremely busy lately. And thank you for your comment.

In my code I have the following macros: DEFINE_INIT, DEFINE_ADJUST, DEFINE_SOURCE, DEFINE_DIFFUSIVITY, DEFINE_EXECUTE_AT_END, DEFINE_PROPERTY, DEFINE_ON_DEMAND. I haven't started converting the code to parallel yet; I am still gathering information.

I'll start with my DEFINE_ADJUST. This is the structure of my code.

Code:

DEFINE_ADJUST(set_reaction_sources, domain)
{
.
.
.
	ct = Lookup_Thread(domain, anode_catalyst_layer_id);
	begin_c_loop(c,ct)
	{
		/* functions are used to compute the values stored in UDM, as shown below */
		C_SPECIE_H2_SRC(c,ct) = - anode_h2_mass_source;
		C_SPECIE_O2_SRC(c,ct) = 0.0;
		C_SPECIE_H2O_SRC(c,ct) = - h2o_vap_to_liq_source;
		C_MASS_SRC(c,ct) = C_SPECIE_H2_SRC(c,ct) + C_SPECIE_O2_SRC(c,ct) + C_SPECIE_H2O_SRC(c,ct);

		/* Energy source */
		ND_SET(Js[0], Js[1], Js[2], C_UDMI(c,ct,SOLID_PHASE_CURRENT_DENSITY_X), C_UDMI(c,ct,SOLID_PHASE_CURRENT_DENSITY_Y), C_UDMI(c,ct,SOLID_PHASE_CURRENT_DENSITY_Z));
		ND_SET(Jm[0], Jm[1], Jm[2], C_UDMI(c,ct,IONIC_PHASE_CURRENT_DENSITY_X), C_UDMI(c,ct,IONIC_PHASE_CURRENT_DENSITY_Y), C_UDMI(c,ct,IONIC_PHASE_CURRENT_DENSITY_Z));
	}
	end_c_loop(c,ct)
}

This cell loop is repeated for many other threads. So, what is the starting point? I am not sure how to start.

I just printed the manual again and I will read it carefully to see if I can extract more information.

Thank you.

`e` · May 24, 2016, 08:01

The DEFINE_ADJUST macro is called on each compute node (and host) at the beginning of each iteration. Your code should work fine in parallel; begin_c_loop loops over all cells on the current compute node and therefore doesn't need adjusting. However, if for example your energy source was a function of the volume of the complete domain (across compute nodes/partitions) then you'd need to communicate between nodes to sum the total volume.
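To make that last case concrete, here is a minimal plain-C sketch of the pattern, not a UDF: each partition sums its own cells, then the partial sums are combined into one total. The three partitions, their cell volumes and the names `partition_volume`, `global_sum` and `total_volume_demo` are all made up for illustration. In an actual UDF the local sum would come from a cell loop over C_VOLUME, and the combination from a global reduction macro called by every compute node (PRF_GRSUM1 in the manual's parallel section, if I recall the name correctly).

Code:

```c
#include <stddef.h>

/* Work done independently on one compute node: add up the volumes of the
   cells in that node's partition (the role of a cell loop in a UDF). */
static double partition_volume(const double *cell_volume, size_t n_cells)
{
    double sum = 0.0;
    size_t i;
    for (i = 0; i < n_cells; ++i)
        sum += cell_volume[i];
    return sum;
}

/* Stand-in for the global reduction: combine one partial sum per node
   into the total that every node would hold afterwards. */
static double global_sum(const double *partial, size_t n_nodes)
{
    double total = 0.0;
    size_t n;
    for (n = 0; n < n_nodes; ++n)
        total += partial[n];
    return total;
}

/* Hypothetical 3-node run with made-up cell volumes. */
static double total_volume_demo(void)
{
    static const double node0[] = {1.0, 2.0};
    static const double node1[] = {0.5};
    static const double node2[] = {4.0, 0.25, 0.25};
    double partial[3];

    partial[0] = partition_volume(node0, 2);   /* node 0 sees only its own cells */
    partial[1] = partition_volume(node1, 1);
    partial[2] = partition_volume(node2, 3);

    return global_sum(partial, 3);             /* the step that needs communication */
}
```

Without the second step each node would only know its own partial volume, which is exactly the situation where a serial UDF gives wrong answers in parallel.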

Bruno Machado · May 24, 2016, 09:10

Quote:

Originally Posted by `e`

The DEFINE_ADJUST macro is called on each compute node (and host) at the beginning of each iteration. Your code should work fine in parallel; begin_c_loop loops over all cells on the current compute node and therefore doesn't need adjusting. However, if for example your energy source was a function of the volume of the complete domain (across compute nodes/partitions) then you'd need to communicate between nodes to sum the total volume.

My energy source terms take into account ohmic heat generation, reversible and irreversible heat generation and phase change, so they do not fall under the case you described.

My source terms simply return the UDM values computed in DEFINE_ADJUST, like this:

Code:

DEFINE_SOURCE(mass_src, c, ct, dS, eqn)
{
	dS[eqn] = 0.0;
	return C_MASS_SRC(c,ct);
}

So which are the macros I should focus on?

By the way, I have many rpvars; I am not sure whether this affects anything.

`e` · May 24, 2016, 09:38

UDM is stored for each cell on its respective compute node (no special attention required). A good place to start with parallelising your code is to simply run it in parallel mode and spot the errors (compare results with the serial solver). Generally, the macros should work fine in parallel (unless you need to pass data between nodes, which isn't done by default).

Bruno Machado · May 24, 2016, 09:54

Quote:

Originally Posted by `e`

UDM is stored for each cell on its respective compute node (no special attention required). A good place to start with parallelising your code is to simply run it in parallel mode and spot the errors (compare results with the serial solver). Generally, the macros should work fine in parallel (unless you need to pass data between nodes, which isn't done by default).

I've done that, and obtained this:

1: unable to find rpvar 'fuel-cell/membrane-layer-id'
2: unable to find rpvar 'fuel-cell/membrane-layer-id'
3: unable to find rpvar 'fuel-cell/output-voltage'
0: unable to find rpvar 'fuel-cell/anode-flow-channel-id'
0: unable to find rpvar 'fuel-cell/anode-catalyst-layer-id'
0: unable to find rpvar 'fuel-cell/anode-diffusion-layer-id'
0: unable to find rpvar 'fuel-cell/anode-bipolar-plate-id1'
0: unable to find rpvar 'fuel-cell/anode-bipolar-plate-id2'
1: unable to find rpvar 'fuel-cell/anode-flow-channel-id'
1: unable to find rpvar 'fuel-cell/anode-catalyst-layer-id'
1: unable to find rpvar 'fuel-cell/anode-diffusion-layer-id'
1: unable to find rpvar 'fuel-cell/anode-bipolar-plate-id1'
1: unable to find rpvar 'fuel-cell/anode-bipolar-plate-id2'
2: unable to find rpvar 'fuel-cell/anode-flow-channel-id'
2: unable to find rpvar 'fuel-cell/anode-catalyst-layer-id'
2: unable to find rpvar 'fuel-cell/anode-diffusion-layer-id'
2: unable to find rpvar 'fuel-cell/anode-bipolar-plate-id1'
2: unable to find rpvar 'fuel-cell/anode-bipolar-plate-id2'
3: unable to find rpvar 'fuel-cell/anode-flow-channel-id'
3: unable to find rpvar 'fuel-cell/anode-catalyst-layer-id'
3: unable to find rpvar 'fuel-cell/anode-diffusion-layer-id'
3: unable to find rpvar 'fuel-cell/anode-bipolar-plate-id1'
3: unable to find rpvar 'fuel-cell/anode-bipolar-plate-id2'
MPI Application rank 0 exited before MPI_Finalize() with status 2
0: unable to find rpvar 'fuel-cell/cathode-bipolar-plate-id2'
0: unable to find rpvar 'fuel-cell/membrane-layer-id

I defined my zone IDs and several constants and parameters as rpvars.

`e` · May 24, 2016, 19:21

Sounds like the rpvars are stored on the host and need to be passed across to the compute nodes. Have a read of the parallel UDF example: "Global Summation of Pressure on a Face Zone and its Area Average Computation" in the UDF manual. They've used host_to_node_int_1() to pass an integer (surface thread ID) across to the compute nodes.

Alternatively, you could specify these constants and parameters as preprocessor directives (at the top of your UDF):

Code:

#define membrane_layer_id 5 /* macro names cannot contain hyphens, so use underscores */

This approach is simpler (and marginally quicker computationally, though you won't notice the difference), but you'll need to recompile after each change of IDs.

Bruno Machado · May 25, 2016, 07:59

Quote:

Originally Posted by `e`

Sounds like the rpvars are stored on the host and need to be passed across to the compute nodes. Have a read of the parallel UDF example: "Global Summation of Pressure on a Face Zone and its Area Average Computation" in the UDF manual. They've used host_to_node_int_1() to pass an integer (surface thread ID) across to the compute nodes.

Alternatively, you could specify these constants and parameters as preprocessor directives (at the top of your UDF):

Code:

#define membrane_layer_id 5 /* macro names cannot contain hyphens, so use underscores */

This approach is simpler (and marginally quicker computationally, though you won't notice the difference), but you'll need to recompile after each change of IDs.

Hi `e`,

I prefer to use the rpvars to define the IDs because I built an interface menu in Fluent, so they are easy to change in case someone else builds a mesh.

I tried using host_to_node yesterday and couldn't make it work, but today I solved it by adding the following to DEFINE_INIT, DEFINE_ADJUST and DEFINE_EXECUTE_AT_END:

Code:

#if !RP_NODE /* SERIAL or HOST */
	/* Read in zone ID numbers */
	anode_flow_channel_id = RP_Get_Integer("fuel-cell/anode-flow-channel-id");
	anode_catalyst_layer_id = RP_Get_Integer("fuel-cell/anode-catalyst-layer-id");
	anode_diffusion_layer_id = RP_Get_Integer("fuel-cell/anode-diffusion-layer-id");
	anode_bipolar_plate_id1 = RP_Get_Integer("fuel-cell/anode-bipolar-plate-id1");
	anode_bipolar_plate_id2 = RP_Get_Integer("fuel-cell/anode-bipolar-plate-id2");
	cathode_flow_channel_id = RP_Get_Integer("fuel-cell/cathode-flow-channel-id");
	cathode_catalyst_layer_id = RP_Get_Integer("fuel-cell/cathode-catalyst-layer-id");
	cathode_diffusion_layer_id = RP_Get_Integer("fuel-cell/cathode-diffusion-layer-id");
	cathode_bipolar_plate_id1 = RP_Get_Integer("fuel-cell/cathode-bipolar-plate-id1");
	cathode_bipolar_plate_id2 = RP_Get_Integer("fuel-cell/cathode-bipolar-plate-id2");
	membrane_layer_id = RP_Get_Integer("fuel-cell/membrane-layer-id");
#endif /* !RP_NODE */

host_to_node_int_1(anode_catalyst_layer_id);
host_to_node_int_1(anode_diffusion_layer_id);
host_to_node_int_1(anode_bipolar_plate_id1);
host_to_node_int_1(anode_bipolar_plate_id2);
host_to_node_int_1(anode_flow_channel_id);

host_to_node_int_1(cathode_catalyst_layer_id);
host_to_node_int_1(cathode_diffusion_layer_id);
host_to_node_int_1(cathode_bipolar_plate_id1);
host_to_node_int_1(cathode_bipolar_plate_id2);
host_to_node_int_1(cathode_flow_channel_id);

host_to_node_int_1(membrane_layer_id);

#if RP_NODE
	/* node-only work (thread lookups, cell loops) goes here */
#endif /* RP_NODE */
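One possible tidy-up of the block above, as a sketch only: keep the IDs in a single array so they can be read in a loop and sent in one transfer instead of eleven. The section marked as stand-ins exists only so the snippet builds outside Fluent; in a UDF it would be replaced by the udf.h include. The array form `host_to_node_int(array, count)` is how I remember it from the message-passing section of the UDF manual, so check it against your version. The enum, the `zone_rpvar` and `zone_id` tables, `read_zone_ids` and the fake IDs are hypothetical.

Code:

```c
/* --- Stand-ins so this sketch builds outside Fluent. In a real UDF,
   drop this section and #include "udf.h" instead. --- */
#define RP_NODE 0                        /* pretend to be the host process */
static int next_fake_id = 2;             /* hypothetical zone IDs 2, 3, 4, ... */
static int RP_Get_Integer(const char *name) { (void)name; return next_fake_id++; }
static void host_to_node_int(int *values, int n) { (void)values; (void)n; }
/* --- end of stand-ins --- */

enum zone {
    ANODE_FLOW_CHANNEL, ANODE_CATALYST_LAYER, ANODE_DIFFUSION_LAYER,
    ANODE_BIPOLAR_PLATE_1, ANODE_BIPOLAR_PLATE_2,
    CATHODE_FLOW_CHANNEL, CATHODE_CATALYST_LAYER, CATHODE_DIFFUSION_LAYER,
    CATHODE_BIPOLAR_PLATE_1, CATHODE_BIPOLAR_PLATE_2,
    MEMBRANE_LAYER,
    N_ZONES
};

/* rpvar names taken from the thread, in the same order as the enum. */
static const char *const zone_rpvar[N_ZONES] = {
    "fuel-cell/anode-flow-channel-id",
    "fuel-cell/anode-catalyst-layer-id",
    "fuel-cell/anode-diffusion-layer-id",
    "fuel-cell/anode-bipolar-plate-id1",
    "fuel-cell/anode-bipolar-plate-id2",
    "fuel-cell/cathode-flow-channel-id",
    "fuel-cell/cathode-catalyst-layer-id",
    "fuel-cell/cathode-diffusion-layer-id",
    "fuel-cell/cathode-bipolar-plate-id1",
    "fuel-cell/cathode-bipolar-plate-id2",
    "fuel-cell/membrane-layer-id"
};

static int zone_id[N_ZONES];

static void read_zone_ids(void)
{
#if !RP_NODE   /* serial or host: only these can see the rpvars */
    int i;
    for (i = 0; i < N_ZONES; ++i)
        zone_id[i] = RP_Get_Integer(zone_rpvar[i]);
#endif
    /* Reached by the host and the nodes alike: one transfer for the whole table. */
    host_to_node_int(zone_id, N_ZONES);
}
```

A lookup then reads `Lookup_Thread(domain, zone_id[ANODE_CATALYST_LAYER])`, and adding a zone means adding one enum entry and one string rather than touching three macros.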

Now I am facing a problem with a boundary condition. I have 4 files (3 source + 1 header). The header holds the definitions of my rpvars, and another file holds the boundary conditions:

Code:

/*HEADER FILE*/
#define T0 RP_Get_Real("fuel-cell/operating-temp")
#define V_out RP_Get_Real("fuel-cell/output-voltage")

Code:

/*FUNCTIONS FILE*/
double OVER_POTENTIAL() /* Total voltage loss, based on membrane water/liquid water product */
{
	double V_open;
	
	V_open = 1.229 - 0.846e-3*(T0 - 298.0) + UNIVERSAL_GAS_CONSTANT*T0/(2.0*FARADAY_CONSTANT)*(log(P_a - ANODE_RELATIVE_HUMIDITY()*0.4669) + 0.5*log((P_c - CATHODE_RELATIVE_HUMIDITY()*0.4669)*0.21));
	
	return (V_open - V_out);
}

Code:

/*BOUNDARIES FILE*/

DEFINE_PROFILE(anode_over_potential,t,i) /* boundary condition, total over-potential on the anode surface */
{
  face_t f;

  begin_f_loop(f,t)
    {
      F_PROFILE(f,t,i) = OVER_POTENTIAL();
    }
  end_f_loop(f,t)
}

and I am getting this error:

0: unable to find rpvar 'fuel-cell/output-voltage'
3: unable to find rpvar 'fuel-cell/output-voltage'

Any suggestions?

Thanks for your support so far.

`e` · May 25, 2016, 18:00

You'll need to pass the output voltage rpvar from the host to the compute nodes (using a similar host_to_node function for real numbers):

Code:

host_to_node_real_1();

This line of code needs to be executed by both the host and the compute nodes, so you could put the call in the DEFINE_PROFILE macro, outside the face loop. If you placed it inside the face loop instead, the host would never reach it: only the compute nodes that own faces on that thread execute the loop body.
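Here is a minimal sketch of that placement, under some loud assumptions. The stand-in section (RP_NODE, RP_Get_Real, host_to_node_real_1) only exists so the snippet builds outside Fluent; `fill_profile`, `over_potential` and the 0.75 V value are hypothetical. `fill_profile` plays the role of the DEFINE_PROFILE body, the `for` loop plays the role of begin_f_loop, and the open-circuit voltage is passed in rather than computed.

Code:

```c
/* --- Stand-ins so this sketch builds outside Fluent. In a real UDF,
   drop this section, #include "udf.h" and use Fluent's real type. --- */
#define RP_NODE 0                                  /* pretend to be the host */
static double RP_Get_Real(const char *name) { (void)name; return 0.75; }
#define host_to_node_real_1(x) ((void)&(x))        /* needs a variable, not an expression */
/* --- end of stand-ins --- */

/* Same final step as OVER_POTENTIAL in the thread, with both voltages passed in. */
static double over_potential(double v_open, double v_out)
{
    return v_open - v_out;
}

/* Plays the role of the DEFINE_PROFILE body. */
static void fill_profile(double *profile, int n_faces, double v_open)
{
    double v_out = 0.0;
    int f;

#if !RP_NODE   /* serial or host: read the rpvar here, never on a node */
    v_out = RP_Get_Real("fuel-cell/output-voltage");
#endif
    host_to_node_real_1(v_out);   /* once per call, before the loop */

    /* Face loop: only ever uses the local copy of the voltage. */
    for (f = 0; f < n_faces; ++f)
        profile[f] = over_potential(v_open, v_out);
}
```

The point of passing `v_out` into `over_potential` is that nothing inside the loop touches an rpvar or a transfer call, so the host and the nodes execute the same number of transfers no matter how many faces each node owns.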

Bruno Machado · May 26, 2016, 09:05

Quote:

Originally Posted by `e`

You'll need to pass the output voltage rpvar from the host to the compute nodes (using a similar host_to_node function for real numbers):

Code:

host_to_node_real_1();

This line of code needs to be executed by both the host and the compute nodes, so you could put the call in the DEFINE_PROFILE macro, outside the face loop. If you placed it inside the face loop instead, the host would never reach it: only the compute nodes that own faces on that thread execute the loop body.

I added the line as you suggested and Fluent returned this to me:

..\..\src\fuel_cell_bcs.c(69) : error C2102: '&' requires l-value
..\..\src\fuel_cell_bcs.c(69) : warning C4133: 'function' : incompatible types - from 'char [26]' to 'double *'
..\..\src\fuel_cell_bcs.c(69) : warning C4047: 'function' : 'char *' differs in levels of indirection from 'int'
..\..\src\fuel_cell_bcs.c(69) : warning C4024: 'mphost_to_node_double_1' : different types for formal and actual parameter 2
..\..\src\fuel_cell_bcs.c(69) : error C2198: 'mphost_to_node_double_1' : too few arguments for call

I also tried double instead of real and got the same errors. Any suggestions?

`e` · May 26, 2016, 19:08

What is your UDF code? Are you trying to pass along a string or the actual voltage value? Only the host has access to the rpvars; you should be passing the real voltage value to the compute nodes.

Bruno Machado · May 27, 2016, 08:43

Quote:

Originally Posted by `e`

What is your UDF code? Are you trying to pass along a string or the actual voltage value? Only the host has access to the rpvars; you should be passing the real voltage value to the compute nodes.

Code:

/*HEADER FILE*/
#define T0 RP_Get_Real("fuel-cell/operating-temp")
#define V_out RP_Get_Real("fuel-cell/output-voltage")

Code:

/*FUNCTIONS FILE*/
double OVER_POTENTIAL() /* Total voltage loss, based on membrane water/liquid water product */
{
	double V_open;
	
	V_open = 1.229 - 0.846e-3*(T0 - 298.0) + UNIVERSAL_GAS_CONSTANT*T0/(2.0*FARADAY_CONSTANT)*(log(P_a - ANODE_RELATIVE_HUMIDITY()*0.4669) + 0.5*log((P_c - CATHODE_RELATIVE_HUMIDITY()*0.4669)*0.21));
	
	return (V_open - V_out);
}

Code:

/*BOUNDARIES FILE*/

DEFINE_PROFILE(anode_over_potential,t,i) /* boundary condition, total over-potential on the anode surface */
{
  face_t f;

  begin_f_loop(f,t)
    {
      F_PROFILE(f,t,i) = OVER_POTENTIAL();
    }
  end_f_loop(f,t)
}

This is pretty much the structure of my code regarding the V_out term. I also have a Scheme file for a menu in Fluent, where I define several rpvars (V_out included). Some of the other rpvars are used in the profiles without error, even though they are defined in the same way as V_out.

In the meantime, while I think of a solution, I have removed the V_out rpvar from the code and defined it as a constant, so I can run the code and check whether my parallel modifications give the same results as the serial run.

If you have any suggestions for implementing the V_out term properly, I look forward to hearing them.

Thanks for the help so far.

`e` · May 27, 2016, 08:47

The #define preprocessor directive simply replaces T0 with the string of text RP_Get_Real("fuel-cell/operating-temp") throughout your source code.
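A small plain-C illustration of what that means in practice, with nothing Fluent-specific in it: `get_real` is a hypothetical stand-in for RP_Get_Real that counts its calls, and `V_OUT` has the same shape as the header macros. Because the macro is pasted in as a function call, it is not a variable and has no address, which would explain the `'&' requires l-value` error if host_to_node_real_1 takes the address of its argument, as the compiler output suggests.

Code:

```c
/* Hypothetical stand-in for RP_Get_Real: returns a fixed value and counts
   how often it is called. */
static int lookups = 0;

static double get_real(const char *name)
{
    (void)name;
    ++lookups;
    return 0.75;
}

/* Same shape as the header in the thread: V_OUT is not a variable, it is
   text that the preprocessor pastes in wherever V_OUT appears. */
#define V_OUT get_real("fuel-cell/output-voltage")

/* Macro form: the loop body expands to a get_real() call, so there is
   one lookup per pass. */
static double sum_with_macro(int n)
{
    double sum = 0.0;
    int i;
    for (i = 0; i < n; ++i)
        sum += V_OUT;
    return sum;
}

/* Variable form: one lookup, stored in a real variable that has an
   address and could be handed to a transfer routine. */
static double sum_with_variable(int n)
{
    const double v_out = get_real("fuel-cell/output-voltage");
    double sum = 0.0;
    int i;
    for (i = 0; i < n; ++i)
        sum += v_out;
    return sum;
}
```

Calling `sum_with_macro(5)` performs five lookups, while `sum_with_variable(5)` performs one. In a UDF the macro form also means the rpvar lookup happens wherever the macro is used, including on compute nodes inside a face loop.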

Quote:

Originally Posted by `e`

Only the host has access to the rpvars; you should be passing the real voltage value to the compute nodes.

Bruno Machado · May 27, 2016, 09:00

Quote:

Originally Posted by `e`

The #define preprocessor directive simply replaces T0 with the string of text RP_Get_Real("fuel-cell/operating-temp") throughout your source code.

Yes, I got it, but it is odd that T0 works fine whereas V_out does not. They are defined in exactly the same way.

Either way, I'll try something else and return later with a possible solution.

`e` · May 27, 2016, 09:18

It'd be strange if T0 was working while V_out wasn't as they appear to be constructed in the same way. As mentioned above, one solution could be passing the voltage value to the compute nodes from the host:

Code:

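double OVER_POTENTIAL() /* Total voltage loss, based on membrane water/liquid water product */
{
	double V_open;
	real V_out; /* local variable: drop the V_out #define from the header first, or this line will not compile */

#if !RP_NODE /* serial or host: only these can read the rpvar */
	V_out = RP_Get_Real("fuel-cell/output-voltage");
#endif

	/* Must be reached by the host and by every compute node, so call this function outside any face or cell loop */
	host_to_node_real_1(V_out);
	
	V_open = 1.229 - 0.846e-3*(T0 - 298.0) + UNIVERSAL_GAS_CONSTANT*T0/(2.0*FARADAY_CONSTANT)*(log(P_a - ANODE_RELATIVE_HUMIDITY()*0.4669) + 0.5*log((P_c - CATHODE_RELATIVE_HUMIDITY()*0.4669)*0.21));
	
	return (V_open - V_out);
}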
