Subject: Nonmem seems to hang itself/Powers - floating-point exception error
Date: 15 Oct 1997 09:28:26 -0400
Recently we ran into a problem in which NONMEM appeared to stop making progress: it froze after a certain iteration. Running NONMEM under a 'debugger' (prof) showed that the cause was a floating-point exception (FPE). The fix suggested earlier in this discussion group, namely
IF (BASE.EQ.0) Q=1
did not solve it.
We then used a simple trick on our Silicon Graphics Origin/O2: add -lfpe to the compilation options and set the environment variable TRAP_FPE, for example
set env TRAP_FPE "ALL=COUNT; UNDERFLOW=0; OVERFLOW=ABORT;DEVZERO=ABORT;INVALID=0"
or to any other setting one can defend.
I do not know whether this problem also occurs on other systems, but it is worth trying to find a similar solution there. In our case the main causes were underflow and invalid-operation FPEs. Since the IF statement above should have taken care of this, it may be worth keeping in mind when something strange happens in an IF statement. Could it be internal rounding?
From: firstname.lastname@example.org.EDU (ABoeckmann)
Date: 15 Oct 1997 15:31:13 -0400
Thanks for your very interesting tip on controlling system actions after FPE. I did not quite understand your email. You say you used
set env TRAP_FPE "ALL=COUNT; UNDERFLOW=0;
Did the computer then stop immediately with the FPE, avoiding the infinite loop (which can occur when the objective function is set to NaN), or did it continue to an apparently successful termination?
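The infinite loop mentioned above follows from how NaN behaves in comparisons: every ordered comparison involving NaN is false, so a convergence test of the form |new - old| < tol can never succeed once the objective function is NaN. The C sketch below illustrates this under the assumption of IEEE 754 arithmetic. The names `converged`, `iterate` and `fit_status` are hypothetical and are not NONMEM routines; this is not how NONMEM itself is implemented.

```c
#include <math.h>    /* NAN */
#include <stddef.h>  /* size_t */

enum fit_status { FIT_CONVERGED, FIT_NAN, FIT_NOT_CONVERGED };

/* Typical convergence test. Any ordered comparison with NaN is false,
   so this returns 0 on every call once either value is NaN. */
int converged(double prev, double cur, double tol)
{
    double diff = cur - prev;
    if (diff < 0.0)
        diff = -diff;
    return diff < tol;
}

/* Hypothetical driver: obj[] holds the objective function value at
   successive iterations. Without the NaN check, a NaN objective would
   simply never converge; with it, the caller is told immediately. */
enum fit_status iterate(const double *obj, size_t n, double tol)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (obj[i] != obj[i])          /* true only for NaN */
            return FIT_NAN;
        if (i > 0 && converged(obj[i - 1], obj[i], tol))
            return FIT_CONVERGED;
    }
    return FIT_NOT_CONVERGED;          /* ran out of iterations */
}
```

An optimizer that loops "until converged" with no iteration limit and no NaN check would spin forever in the second case, which matches the apparent hang.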
Are you saying that 0**POWER was the source of the FPE, and that the little fix in your email didn't help? If so, maybe your code was trying to compute X**POWER with X < -1, and this could be caused by some modelling error. I don't think internal rounding had anything to do with the situation.
The above "set env" is unique to the SGI platform, but I can guess what it is saying. Everything looks ok, except that I am worried about INVALID=0.
It seems as if this might set 0**POWER to 0, but it might also set (neg)**POWER to 0, or LOG(0) to 0, or SQRT(neg) to 0, which is not a good idea. I think INVALID=ABORT is a better choice, if this means what I think it means ("stop immediately if an invalid operation is attempted.")
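The distinction drawn here, between a power whose result really is 0 and an invalid operation that should not be silently replaced by 0, can be made explicit before the power is evaluated. The following C sketch is only an illustration: `pow_class`, `is_whole` and `classify_power` are hypothetical names, and treating 0**0 and 0**(negative) as invalid is a choice made for this sketch rather than something stated in the thread.

```c
enum pow_class {
    POW_OK,       /* well defined, compute normally */
    POW_ZERO,     /* 0**positive: the result really is 0 */
    POW_INVALID   /* no usable real result: stop instead of returning 0 */
};

/* Returns 1 if p has no fractional part. Doubles with magnitude of
   2^53 or more are always whole numbers, so the cast is skipped. */
int is_whole(double p)
{
    if (p >= 9007199254740992.0 || p <= -9007199254740992.0)
        return 1;
    return (double)(long long)p == p;
}

/* Classify base**power without evaluating it. */
enum pow_class classify_power(double base, double power)
{
    if (base != base || power != power)   /* NaN input */
        return POW_INVALID;
    if (base > 0.0)
        return POW_OK;
    if (base == 0.0)
        return power > 0.0 ? POW_ZERO : POW_INVALID;
    /* base < 0: only whole-number powers have a real result */
    return is_whole(power) ? POW_OK : POW_INVALID;
}
```

Only the `POW_ZERO` case corresponds to a situation where setting the result to 0 is harmless; the `POW_INVALID` cases are the ones a setting like INVALID=0 might hide.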
I'd really like to see your control stream and try running it here to find out exactly where the FPE occurs, and whether it is appropriate to set the result to 0.
Even though some FPE's are benign, it would be best if the computers always stopped when they occur, as do the INTEL systems. In general, it is better to fix the data and/or abbreviated code to avoid FPE's (other than underflow, which is normal in NONMEM and should always result in 0).
From: Erik Olofsen +31 71 5263344 / 5152719
Subject: Nonmem seems to hang itself
Date: 17 Oct 1997 09:04:01 -0400
Dear NONMEM users,
I've been running NONMEM under Linux, and I have had the same problem as Dr. Rombout, which was that NONMEM did not abort after a floating point exception. On the WWW I found information provided by Dr. Giannetti on how to solve it. (see http://www.adl.dmt.csiro.au/mail_lists/html/linux-gcc/msg00090.html)
By adding the following verbatim code to $PK or to a user-written PRED, the masks for floating point errors can be set as desired:
"IF (ICALL.EQ.1) THEN
" WRITE(*,*) 'CALLING SETFPUCW'
" CALL SETFPUCW
"ENDIF
NONMEM then needs to be linked with the following C routine. The advantage of implementing this in C is that one can use the FPU constants as defined in the header file.
__setfpucw(_FPU_DEFAULT & ~(_FPU_MASK_IM | _FPU_MASK_ZM | _FPU_MASK_OM));
This causes a core dump in case of an invalid operation, a divide by zero or overflow.
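For readers who want a complete routine to link against, here is a minimal sketch of what such a C wrapper might look like. It rests on several assumptions that are not in the original message: that the Fortran compiler lowercases external names and appends an underscore (so `CALL SETFPUCW` resolves to `setfpucw_`), that the platform is x86 Linux with glibc's `<fpu_control.h>`, and that the `_FPU_SETCW` macro from that header is used in place of `__setfpucw`, which may not be available on later glibc versions.

```c
#include <fpu_control.h>   /* glibc, x86: fpu_control_t, _FPU_* macros */

/* Callable from Fortran as CALL SETFPUCW, assuming the compiler
   lowercases external names and appends an underscore. */
void setfpucw_(void)
{
    /* Clear the mask bits for invalid operation, divide by zero and
       overflow so that these exceptions raise SIGFPE instead of
       quietly producing NaN or Inf. Underflow stays masked. */
    fpu_control_t cw = _FPU_DEFAULT
                       & ~(_FPU_MASK_IM | _FPU_MASK_ZM | _FPU_MASK_OM);
    _FPU_SETCW(cw);
}
```

Note that this sets the x87 control word only. On x86-64, where double-precision arithmetic normally uses SSE, the glibc extension `feenableexcept` from `<fenv.h>` is probably the more appropriate tool.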
Department of Anesthesiology
Leiden University Medical Center