I am going back over the trail of all your recommendations and redoing what I did before to make sure I haven't messed anything up.

I just checked in VTune the program you sent with the correct first-touch initialization. I systematically increased OMP_NUM_THREADS (in VTune) from 4 through 128, then jumped to 180, for both that program and a previous version that did not have the correct first-touch initialization, both running 10 steps. All I am looking at is the elapsed time, the CPU time trend, and which is the top hotspot.

The result is that the elapsed time drops slowly from 6.5 seconds to about 4.5 seconds at 64 threads, then increases to 5 seconds at 128, and at 180 threads I hit the fork barrier. Up to 128 threads, one of the parallel loops was the hot spot.

Now, to implement the above environment variables:
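As one illustration of driving a run through an environment variable, the thread-count sweep described above could be scripted outside VTune roughly as follows. This is a minimal sketch, not the procedure actually used: `sweep_threads` is a hypothetical helper, the program to run is whatever command you pass in, and the doubling sequence between 4 and 128 is an assumption (the post only says "4 through 128", then 180). It records wall-clock time only, not CPU time or hotspots.

```python
import os
import subprocess
import sys
import time


def sweep_threads(cmd, thread_counts):
    """Run cmd once per thread count; return {threads: elapsed_seconds}.

    Hypothetical helper: cmd is an argument list such as ["./my_program"].
    """
    results = {}
    for n in thread_counts:
        # Copy the current environment and override only OMP_NUM_THREADS.
        env = dict(os.environ, OMP_NUM_THREADS=str(n))
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True)
        results[n] = time.perf_counter() - start
    return results


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: python sweep.py <program> [args...]")
    # Assumed sequence: doubling from 4 to 128, then the jump to 180.
    counts = [4, 8, 16, 32, 64, 128, 180]
    for n, seconds in sweep_threads(sys.argv[1:], counts).items():
        print(f"{n:4d} threads: {seconds:.2f} s")
```

A single timing per thread count is noisy, so repeating each run a few times and comparing the minimum or median would give a steadier trend.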