FACETS/GYRO Topology mappings on Intrepid

As described in the talk given at the August 2009 meeting (see slides paratools-facets-Aug2009.ppt), we analyzed the communication patterns of the constProfile/shortScaling runs of FACETS (using GYRO). We found that the default mapping did not scale well.

First, we tried some alternate template mappings and then a random one. After seeing the random mapping do better than the default, we surmised that by analyzing the communication patterns we could construct a better mapping.
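As an illustration of the idea, the sketch below builds template-ordered and random rank-to-coordinate mappings and scores them with a simple hop-weighted traffic metric. It is not the procedure used for the results on this page. It assumes the usual Blue Gene/P conventions: a map file has one "x y z t" line per MPI rank, and in a template name such as TXYZ the first letter varies fastest. The function names, the partition dimensions, and the ring-shaped traffic pattern are all hypothetical.

```python
import itertools
import random


def ordered_mapping(dims, order="TXYZ"):
    """Return one (x, y, z, t) tuple per rank for a template ordering.

    dims is (X, Y, Z, T). The first letter of `order` varies fastest
    (assumed Blue Gene/P convention).
    """
    sizes = dict(zip("XYZT", dims))
    # itertools.product varies its LAST argument fastest, so reverse the order.
    axes = order[::-1]
    coords = []
    for combo in itertools.product(*(range(sizes[a]) for a in axes)):
        point = dict(zip(axes, combo))
        coords.append((point["X"], point["Y"], point["Z"], point["T"]))
    return coords


def random_mapping(dims, seed=0):
    """Return a random permutation of all (x, y, z, t) slots."""
    coords = ordered_mapping(dims, "XYZT")
    random.Random(seed).shuffle(coords)
    return coords


def torus_hops(a, b, dims):
    """Shortest hop count between two nodes on a 3D torus (t is ignored)."""
    hops = 0
    for i in range(3):
        d = abs(a[i] - b[i])
        hops += min(d, dims[i] - d)  # wrap-around link may be shorter
    return hops


def hop_bytes(mapping, traffic, dims):
    """Sum of bytes * hops over all (src_rank, dst_rank) pairs in traffic."""
    return sum(
        nbytes * torus_hops(mapping[src], mapping[dst], dims)
        for (src, dst), nbytes in traffic.items()
    )


def format_mapfile(mapping):
    """Render a mapping as map-file text: line i holds rank i's coordinates."""
    return "\n".join("%d %d %d %d" % c for c in mapping) + "\n"


if __name__ == "__main__":
    dims = (8, 8, 8, 4)  # hypothetical partition: 512 nodes x 4 cores
    nranks = dims[0] * dims[1] * dims[2] * dims[3]
    # Hypothetical traffic: each rank sends 1 MB to the next rank (a ring).
    traffic = {(r, (r + 1) % nranks): 1 << 20 for r in range(nranks)}
    for name, mapping in [
        ("TXYZ", ordered_mapping(dims, "TXYZ")),
        ("XYZT", ordered_mapping(dims, "XYZT")),
        ("random", random_mapping(dims, seed=1)),
    ]:
        print(name, hop_bytes(mapping, traffic, dims))
```

With a measured communication matrix in place of the ring pattern, the same metric can be used to compare candidate mappings before spending machine time on them. On Blue Gene/P a map file is normally selected through the BG_MAPPING environment variable or the -mapfile option of mpirun; check the Intrepid documentation for the exact job-submission syntax.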

Indeed, the tracking work paid off; the results are shown below. Relative to the default mapping, the new mapping file shaves 21% off the runtime in the 32,768-process case.


Figure: Performance of different topology mappings on Intrepid for FACETS/GYRO.


Below is a table showing the timings:

Default mapping (TXYZ)
Processes    Run 1 (s)    Run 2 (s)
      512         1460         1460
     1024         1378         1379
     2048         1379         1379
     4096         1381         1381
     8192         1378         1379
    16384         1472         1475
    32768         1682         1679

XYZT mapping
Processes    Run 1 (s)    Run 2 (s)
      512         1486         1485
     1024         1381         1383
     2048         1321         1316
     4096         1357         1362
     8192         1362         1368
    16384         1394         1404
    32768         1524

Random mapping
Processes    Run 1 (s)    Run 2 (s)
      512         1436         1438
     1024         1444         1401
     2048         1310         1314
     4096         1354         1359
     8192         1484         1486
    16384         1479         1445
    32768         1418         1388

"opt" mapping
Processes    Run 1 (s)
      512         1322
     1024         1328
     2048         1328
     4096         1330
     8192         1331
    16384         1334
    32768         1334
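
The 21% figure can be checked directly from the 32,768-process timings above. The short sketch below does the arithmetic; the helper name is only for illustration.

```python
def percent_reduction(baseline_s, improved_s):
    """Runtime reduction of improved_s relative to baseline_s, in percent."""
    return 100.0 * (baseline_s - improved_s) / baseline_s


# 32,768-process timings taken from the tables above (seconds).
default_runs = (1682, 1679)  # default mapping (TXYZ), runs 1 and 2
opt_run = 1334               # "opt" mapping, run 1

for run, baseline in enumerate(default_runs, start=1):
    # Both default runs give a reduction that rounds to 21%.
    print("vs default run %d: %.1f%%" % (run, percent_reduction(baseline, opt_run)))
```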

To use the new mapping, see the attached tarball.

Attachments