ERROR: Safe Malloc has detected a null pointer
Hello everyone:
I'm getting the following error: "ERROR: Safe Malloc has detected a null pointer, pointer_size is 56327176 Please ensure that the system has enough memory to run this program." But the system has enough memory, so what is the problem? Thanks for reading; I would appreciate any response.
Quote:
I would still expect this to be a memory issue. Can you post more from the end of the log file? Have you checked your memory_usage.out file? It may or may not tell us what is going on with memory around the time of your crash. How much memory do you have? Is anything occurring around the time of the crash (e.g., embedding or AMR turning on, valves or ports opening, combustion)? Do you have anything else running in the background? Thanks! - John
Quote:
Thanks for your reply.
1. The messages from the end of the log file are shown at the end of this post.
2. I have checked the mem_usage.out file, but I didn't find any useful information.
3. I don't know the exact amount of memory, but it is enough to support the subsequent calculations.
4. The embedding mode is PERMANENT; nothing else occurred.
5. The calculations were run on a large cluster, so many other nodes are computing different tasks.

Messages from the end of the log file:
JOB ABORT invoked by rank 0: ERROR: Safe Malloc has detected a null pointer, pointer_size is 56327176
Please ensure that the system has enough memory to run this program.
... ... ...(some messages like before)
[cli_3]: [cli_10]: [cli_17]: [cli_23]: [cli_26]: [cli_0]: [cli_2]: [cli_4]: [cli_5]: [cli_6]: [cli_7]: [cli_8]: [cli_9]: [cli_11]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 11
[cli_12]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 12
... ... ...(some messages like before)
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 13
yhrun: error: cn395: tasks 0-6,10-14,17-18,25,27: Exited with exit code 1
yhrun: error: cn395: tasks 8-9,15-16,19-24,26: Exited with exit code 1
yhrun: error: cn395: task 7: Exited with exit code 1
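For a sense of scale, the pointer_size in the error message can be converted to a more readable unit. This is only a sketch: the thread does not say what unit pointer_size is reported in, so the code assumes bytes, and the helper name bytes_to_mib is made up for illustration.

```python
def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 ** 2)


if __name__ == "__main__":
    # Value copied from the error message; the unit (bytes) is an assumption.
    requested = 56327176
    print(f"Failed allocation request: {bytes_to_mib(requested):.1f} MiB")
```

If the value really is in bytes, the request that failed was modest, which would suggest looking at total memory use on the node rather than at this single allocation.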
Quote:
Good morning - Can you tell me more about your simulation? What version of CONVERGE are you using? What are you simulating? Are you using AMR as well as embedding? What is the cell count? When does this error occur: upon startup, near an event, when a file is being written, etc.? In addition, I'm just referring to the nodes you are running on when asking about background processes. If there is something using significant additional memory, that could be a problem. Also, the memory usage file should give information on the total memory being used and the amount used for each processor just before the crash in the last line. Can you check for that and see if anything stands out? High total memory or a spike in memory usage on one of the ranks, possibly? How many cores are you using? Finally, it would be good to know how much memory is available on each node. Thanks, - John
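The two checks suggested above (memory available on a node, and a spike on one rank) could be sketched roughly as follows. This is an illustration, not a CONVERGE tool: it assumes a Linux node where /proc/meminfo exists, it does not parse the memory usage file because its layout is not shown in this thread, and the per-rank numbers, the function names, and the "2x the median" threshold are all made up for the example.

```python
from statistics import median


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value in kiB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def flag_memory_spikes(per_rank_mb: list[float], factor: float = 2.0) -> list[int]:
    """Return ranks using more than `factor` times the median per-rank memory.

    The threshold is an arbitrary illustrative choice, not a CONVERGE rule.
    Raises statistics.StatisticsError if the list is empty.
    """
    typical = median(per_rank_mb)
    return [rank for rank, mb in enumerate(per_rank_mb) if mb > factor * typical]


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            info = parse_meminfo(fh.read())
        print("MemTotal (kiB):", info.get("MemTotal"))
        print("MemAvailable (kiB):", info.get("MemAvailable"))
    except OSError:
        print("/proc/meminfo not available (probably not a Linux system)")

    # Hypothetical per-rank values in MB, typed in by hand from the last
    # line of the memory usage file; these numbers are invented.
    print("Suspicious ranks:", flag_memory_spikes([410.0, 395.0, 402.0, 1650.0]))
```

Running the first part on the compute node itself (not the login node) matters on a shared cluster, since other jobs on the same node reduce what is actually available.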
Quote:
Sorry for the late reply. The problem has been solved; it was mainly due to poor mesh quality caused by a certain part of the cylinder. A "Segmentation fault" error then occurred, which was resolved by changing the "Tolerance" to a suitable value. Thank you for your previous help. -LTL