NONMEM on a SGI R8000

NONMEM Topic 33

Keywords: SGI, Power Challenge


Topic started by: Lutz Harnisch (harnisch@pollux.zedat.fu-berlin.de) - 7 Dec 94

We would like to run NONMEM on a SGI Power Challenge machine with MIPS R8000 Processors and need some hints for compiling. What are the machine constants in BLKDAT? Is it advisable to run NONMEM in double or single precision mode, since this machine has a true 64 bit architecture? Are there any compiler flags to generate optimized code for this machine?

Response from: Ferdie Rombout (romboutf@oios13.oss.akzonobel.nl) - 8 Dec 94

We were considering buying the R8000 Challenge from SGI. Before buying, we ran a benchmark to see how its performance compared with our R4400 150 MHz. It turned out that the R8000 did not deliver the performance we (and SGI) expected. SGI then had someone look at the sources, who found that the code is not floating-point bound; its speed depends more on integer performance. We then looked at the R4400 200 MHz and found that our run speeds improved two-fold. This was explained by the bigger secondary cache (R4400 150 MHz: 1 MB; R4400 200 MHz: 4 MB) and by the higher SPECint of this machine compared with the R8000 and the R4400 150 MHz.

Besides this, on SGI machines it can be important to adjust the size of the buffers. A profiler shows that NONMEM spends most of its time doing freads, a large part of them on NONMEM FILE10. Just adjusting the NONMEM buffers improved our run speeds by almost a factor of two: freads dropped from more than 35% to less than 1% of the profile. This probably depends on the secondary cache, since performance got worse on an R3000 (no 1 MB secondary cache). We have ordered the R4400 200 MHz with two processors; our next steps will be optimizing the NONMEM source and parallel processing.
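The buffer effect described above can be illustrated in general terms. The sketch below is not NONMEM code and says nothing about which NONMEM buffer constants to change; it only shows, using Python's standard io.BufferedReader as a stand-in for the Fortran I/O layer, that reading the same data in small records triggers far fewer low-level reads when the buffer is larger. The names CountingRaw and count_raw_reads, and the sizes used, are hypothetical and chosen for illustration only.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory raw stream that counts how often the buffer layer calls it.

    Hypothetical helper for illustration; it stands in for a scratch file.
    """

    def __init__(self, data):
        self._data = data
        self._pos = 0
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Each call here corresponds to one low-level read request.
        self.calls += 1
        n = min(len(b), len(self._data) - self._pos)
        b[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n


def count_raw_reads(total_bytes, record_size, buffer_size):
    """Read total_bytes in record_size chunks; return the low-level read count."""
    raw = CountingRaw(bytes(total_bytes))
    reader = io.BufferedReader(raw, buffer_size=buffer_size)
    while reader.read(record_size):
        pass
    return raw.calls


if __name__ == "__main__":
    # Assumed sizes: 256 KiB of data read as 64-byte records.
    small = count_raw_reads(256 * 1024, 64, 512)
    large = count_raw_reads(256 * 1024, 64, 64 * 1024)
    print("low-level reads with 512-byte buffer:", small)
    print("low-level reads with 64 KiB buffer:", large)
```

Whether a larger buffer actually helps on a given machine depends on the memory hierarchy, as the R3000 observation above suggests, so any change is worth confirming with a profiler.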

Response from: Ian Dillon (dillon@mable.indianapolis.sgi.com) - 8 Dec 94

I've CC'd everyone in your email, to expedite this request.

> We would like to run NONMEM on a SGI Power Challenge machine with MIPS ...
> this machine?

Here are the most aggressive optimization options I've used to benchmark NONMEM on a single R8000 Power Challenge (from my makefile):

    BIN = nonmem
    FFLAG1 = -c -O3 -GCM:aggressive_speculation=ON:array_speculation=ON
    FFLAG2 = -Wk,-o=0,-so=0,-r=3 -OPT:roundoff=3:IEEE_arithmetic=3:fast_sqrt=on:fast_exp=on
    LFLAGS = -lfastm
    F77 = f77

You can play around with the KAP (-Wk,-o=0,-so=0,-r=3) options if need be. Also, if you want to turn on parallelization, tack on the -PFA switches.

End of Topic - 30 May 95