[Omp] First Touch initialization

Francisco Jesús Martínez Serrano franjesus at gmail.com
Wed Mar 8 03:06:13 PST 2006


How does first-touch initialization work?

According to
http://www.ncsa.uiuc.edu/UserInfo/Consulting/Tips/Memory_Placement.html,
initializing shared arrays in parallel at the very beginning of the program
distributes the contents of each array according to the access pattern.
Hence, on NUMA machines, access will be much faster, since it is node-local.
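
A minimal sketch of the technique described on that page, written in C
rather than Fortran. The helper names alloc_first_touch and scaled_sum are
hypothetical, and the sketch assumes the operating system uses a first-touch
placement policy. The point is only that the initialization loop and the
later compute loop use the same static schedule, so each thread works on the
part of the array it wrote first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n doubles and initialize them in parallel.
 * Assumption: under a first-touch policy, the thread that first writes a
 * piece of the array determines which node's memory backs that piece. */
double *alloc_first_touch(size_t n)
{
    double *a = malloc(n * sizeof *a);
    long i;

    if (a == NULL)
        return NULL;

    /* schedule(static) gives each thread one contiguous chunk of indices */
    #pragma omp parallel for schedule(static)
    for (i = 0; i < (long)n; i++)
        a[i] = 0.0;

    return a;
}

/* Hypothetical compute step: add value to every element and return the sum.
 * It uses the same static schedule as the initialization, so each thread
 * revisits the chunk it touched first. */
double scaled_sum(double *a, size_t n, double value)
{
    double sum = 0.0;
    long i;

    assert(a != NULL);

    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (i = 0; i < (long)n; i++) {
        a[i] += value;
        sum += a[i];
    }
    return sum;
}
```

Compiled without OpenMP support, the pragmas are ignored and the code runs
serially with the same result, so the sketch only shows the access pattern,
not any speedup.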

We have tried it and it does work (Intel Fortran compiler v9 on a 4-way
Opteron), but I don't understand why.

From my C experience, one can do pointer arithmetic (as in
elem1000 = *(mat+1000);) because an array is allocated as a single
contiguous block. However, according to the kernel's memory map, each node
owns a contiguous block of addresses, like this:

0·····node0·····1GB|·····node1·····2GB|·····node2·····3GB|·····node3·····4GB

I guess I'm missing something fundamental about NUMA here. Otherwise
an array couldn't be partitioned among nodes and still be accessible by
simple pointer arithmetic (which I believe is what Fortran does).
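
A minimal sketch of the premise above, assuming a POSIX system. The helper
names elem_at and pages_spanned are hypothetical. It shows the pointer
arithmetic on a contiguous block and counts how many virtual-memory pages
that block overlaps. Whether placement happens per page rather than per
array is an assumption here, not something the page cited above states.

```c
#define _POSIX_C_SOURCE 200112L  /* for sysconf() under strict C modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Plain pointer arithmetic on a contiguous block: same as mat[i] */
double elem_at(const double *mat, size_t i)
{
    return *(mat + i);
}

/* Number of virtual-memory pages overlapped by the nbytes-long range
 * starting at p. The range is contiguous in the process's address space,
 * yet it can cover many separate pages. */
size_t pages_spanned(const void *p, size_t nbytes)
{
    long page_size = sysconf(_SC_PAGESIZE);
    uintptr_t first, last;

    assert(page_size > 0);
    if (nbytes == 0)
        return 0;

    first = (uintptr_t)p / (uintptr_t)page_size;
    last = ((uintptr_t)p + nbytes - 1) / (uintptr_t)page_size;
    return (size_t)(last - first + 1);
}
```

For example, pages_spanned(mat, n * sizeof *mat) on a large array returns a
count well above one, while elem_at(mat, 1000) still works on the array as a
single block.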

Any clues?