From: "Rombout, Dr. Ferdinand" 
Subject: [NMusers] NONMEM runtimes
Date: Mon, 29 Apr 2002 14:36:14 +0200
 We investigated the runtimes of a standardized run on several computer 
 systems and operating systems.
 We found that the runtime was linearly related to the declared 
 SPECint2000 of the processor (not to other measures such as MHz or 
 SPECfp) and almost independent of operating system and compiler, 
 provided the most aggressive optimization was used. We tested Silicon 
 Graphics under Irix, and Intel/AMD with W2K, XP and Linux. The 
 Intel/AMD processors tested ranged from a PII at 400 MHz to a PIV at 
 2 GHz, and the SGI was a 250 MHz R10K. However, we had an important 
 and unsolved problem under Linux: above a processor speed of around 
 1 GHz we saw no further improvement in runtime, although we varied the 
 processor type (AMD vs Intel), Linux distribution, compiler brand, 
 memory type (DDR versus Rambus), and NONMEM buffer size.
 So we saw a 42% decrease in runtime when switching from a 1 GHz PIII 
 to a 2 GHz PIV, which is close to the increase in SPECint of 46%. 
 Under Linux our runtime stayed about the same.
 Any suggestions for an explanation?
 Ferdie Rombout
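
One way to put the two percentages above on the same scale is sketched below. It assumes runtime is inversely proportional to the SPECint2000 score, which is an assumption of this sketch and not something the message states; the function names are made up for illustration. Under that assumption a 46% higher score corresponds to a runtime decrease of about 31.5%, and a 42% runtime decrease corresponds to a speedup of about 72%.

```python
def runtime_decrease_from_speedup(speedup_pct):
    """Fractional runtime decrease implied by a benchmark score that is
    speedup_pct percent higher, ASSUMING runtime is proportional to 1/score."""
    return 1.0 - 1.0 / (1.0 + speedup_pct / 100.0)


def speedup_from_runtime_decrease(decrease_pct):
    """Fractional speedup implied by a runtime that is decrease_pct percent
    shorter, under the same inverse-proportionality assumption."""
    return 1.0 / (1.0 - decrease_pct / 100.0) - 1.0


if __name__ == "__main__":
    # Figures quoted in the message: +46% SPECint, -42% runtime.
    print(f"{runtime_decrease_from_speedup(46):.3f}")
    print(f"{speedup_from_runtime_decrease(42):.3f}")
```

A different assumed relationship between score and runtime would give different numbers, so this is only a quick consistency check.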

From: "Joern Loetsch" 
Subject: Re: [NMusers] NONMEM runtimes
Date: Mon, 29 Apr 2002 
Any chance of getting the results? :-))
Is Intel or AMD better? Is Windows or Unix better?

Jörn Lötsch, MD
pharmazentrum frankfurt, Dept. of Clinical Pharmacology
Johann Wolfgang Goethe-University Hospital
Theodor-Stern-Kai 7
D-60590 Frankfurt am Main


From: Nick Holford
Subject: Re: [NMusers] NONMEM runtimes
Date: Mon, 29 Apr 2002


It seems you have good evidence that the bottleneck is not CPU or RAM so it
would seem likely that it is the disk I/O. There are a variety of parameters
in NSIZES that control buffers for the data that might be changed to enhance
performance. There were some suggestions made for SGI Iris systems (see
NONMEM repository) that may help.
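
A rough way to check whether a run is waiting on something other than the CPU (such as disk I/O) is to compare the CPU time it consumes with its wall-clock time. The sketch below does this for an arbitrary child process on a Unix-like system; the helper name `cpu_vs_wall` and the stand-in workload are illustrative assumptions, not part of NONMEM.

```python
import resource  # Unix-only module; not available on Windows
import subprocess
import sys
import time


def cpu_vs_wall(cmd):
    """Run cmd as a child process and return (wall_seconds, cpu_seconds)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User + system CPU time charged to the child that just finished.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu


if __name__ == "__main__":
    # Stand-in workload; substitute the actual command used to start the run.
    wall, cpu = cpu_vs_wall([sys.executable, "-c", "sum(range(10**6))"])
    print(f"wall {wall:.2f} s, cpu {cpu:.2f} s, cpu/wall {cpu / wall:.2f}")
```

A cpu/wall ratio close to 1 suggests the run is CPU-bound; a ratio well below 1 suggests it spends much of its time waiting, for example on disk.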

Nick Holford, Divn Pharmacology & Clinical Pharmacology University of
Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand tel:+64(9)373-7599x6730 fax:373-7556


From: "Rombout, Dr. Ferdinand" 
Subject: Re: [NMusers] NONMEM runtimes
Date: Mon, 29 Apr 2002 14:36:14 +0200

I know about the I/O, since I was the person who did all this with SGI. We
first set the buffers so that we would have mainly disk I/O and less in
memory, since we initially thought the problem was the Rambus. Since the
problem also occurred with DDR RAM, we changed the buffers to the optimal
setting so that the run would involve (almost) no disk I/O. This did not
solve the problem on Linux. On Windows the runtime decreased by 2 minutes,
from 14 to 12 minutes, when the buffers were set


From: Dirk Zeumer
Subject: Re: [NMusers] NONMEM runtimes
Date: Mon, 29 Apr 2002 15:17:46 +0200

Dear all,

I don't think that disk I/O is the main problem here. As far as I understand,
the hardware used for the (final) tests with Linux and Windows was (nearly?)
the same, so it seems like a compiler problem, especially given the mention
of "most aggressive optimization".

All newer Windows compilers are able to optimize the executable for newer
CPUs (such as the PIV or Athlon XP) which use the "Streaming SIMD
Extensions 2 (SSE2)". This optimization results in a significant decrease in
runtime (especially compared to the SSE version for a PIII).

Correct me if I'm wrong: I assume that on Linux the range of "standard"
f77 / g77 compilers was used for the tests, with no support for SSE2
(sometimes not even SSE support).

Therefore I would suggest doing some tests with the Intel Fortran Compiler
6.0, which is also available for Linux. The options available in this version
will (surprisingly) also work on AMD CPUs.

By the way: it would be really nice to see some data of the test


Dirk Zeumer


From: José Javier Zarate 
Subject: Re: AW: [NMusers] NONMEM runtimes
Date: Mon, 29 Apr 2002 23:17:54 +0200

Dear all

I think it is possible that you haven't activated 32-bit access in your
Linux installation (if I recall correctly, some Linux distributions
don't activate it by default). Also check whether you have DMA
activated (it also improves disk performance).

SSE and SSE2 are SIMD instruction sets that are only related to FPU work,
and the improvements described in the initial message of this thread
appear to be related mainly to INT performance. Anyway, Intel compilers
increase the performance of most applications (even on AMD processors, if
you don't apply P4-specific optimizations).

I tried the SIMD options on DEC Alpha microprocessors with Compaq (DEC)
compilers without much success. By changing some parameters I
managed to improve performance by reducing disk usage.

Take into account that Linux appears not to be very P4-friendly (I don't
know the reasons).

I think that this is a problem related to Linux behaviour and not
compiler choice.

I suspect that NONMEM would need to be substantially rewritten in order
to get the benefits of using modern compilers and CPUs.

Intel compilers haven't been listed as recommended compilers for NONMEM.
Other programs tested with V5.xx (e.g. POVRAY) produced erroneous results
when some optimizations were applied. You should be careful with your
results if you apply aggressive optimizations.

Good luck

JJ Zarate
Departamento de Compras
Clínica Universitaria de Navarra