From: Alan Xiao <Alan.Xiao@cognigencorp.com>

Subject: Help: Non-positive semi-definite message

Date: Wed, 15 Aug 2001 09:54:13 -0400

Can anyone systematically address the possible causes of, and possible solutions to, the message "Nonpositive semidefinite R matrix"?

I have a 5-compartment model to simultaneously fit one parent drug (2 compartments) and two SERIAL metabolites (1 and 2 compartments) over a database of 600 patients and >7000 concentration records. The conversion between the two metabolites is reversible.

With a final model including 29 covariate terms on 5 different clearances (intact elimination and conversion) (40 THETAs + 5 ETAs + 3 EPSs), I always got the above message. I tried changing initial guesses, merging different covariates (where applicable), using an OMEGA block, etc. - the message was always there, just like your shadow under the sun.

Note that, with 23 covariates in the model (38 THETAs) during forward selection, there was no problem going through the $COV step. After that, I dropped the $COV step from the model for further forward selection and backward elimination (of covariates), just because it cost too much time (>30 hours per run with $COV, on average, on a high-speed UNIX system).

Also note that, no matter how diverse the initial guesses are, the final estimates are the same once minimization is successful (still with the above message). (Of course, I only tried a few sets of initial guesses out of the numerous possible combinations. This is based on one of the explanations that this message might mean the minimization is not at a true minimum.)

In addition, there are some patients in the dataset with missing covariate values, which were imputed with population MEANs. There are also some obvious outliers which cannot be appropriately expressed by any covariate functions.

I would appreciate it if anyone could share his/her experience of how to handle this (I've already searched the discussions on this topic previously posted to the list).

***** Alan Xiao, Ph.D ***************

***** PK/PD Scientist ***************

***** Cognigen Corporation **********

***** Tel: 716-633-3463 ext 265 ******

From: "bvatul" <bvatul@ufl.edu>

Subject: Re: Non-positive semi-definite message

Date: Wed, 15 Aug 2001 12:36:27 -0400

I had a similar error message when I was analysing a data set with a three-compartment model for the drug and a similar model for the metabolite with the $COV step. The problem could be solved (thanks to Dr Peter Bonate) by viewing the scatter plots of various parameters and minimising them in the model if they are correlated, and by changing NSIG from the default of 3 to 5.

One more suggestion was to use MATRIX=S in the $COV step to get around this message, although I am not sure how it works.

I used these methods and I could work around the problem. It is still not clear to me how a change of NSIG affects the search for the global minimum. Perhaps somebody could comment on this.

Date: Wed, 15 Aug 2001 11:55:53 -0700 (PDT)

This is in reply to the following question from Alan Xiao.

> Can anyone systematically address the possible causes of, and possible
> solutions to, the message "Nonpositive semidefinite R matrix"?

I will not endeavor to systematically address the possible causes of and solutions to the message. These can be many, and they are very specific to the user's problem. We can make this one general remark regarding its meaning: your point estimate is neither a local nor a global minimum. It is a saddle point, and therefore it may not be a suitable point estimate.

You say you have tried different initial estimates and other perturbations of your model, and still the message arises. This makes me think that there is something fundamentally wrong with your setup. But it is not easy, and it may be impossible, to diagnose this by means of interaction over the network.

Date: Wed, 15 Aug 2001 17:23:45 -0400

From: Alan Xiao <Alan.Xiao@cognigencorp.com>

Subject: [Fwd: Re: Non-positive semi-definite message]

-------- Original Message --------

Subject: Re: Non-positive semi-definite message

Date: Wed, 15 Aug 2001 16:52:12 -0400

From: Alan Xiao <Alan.Xiao@cognigencorp.com>

I just tried the MATRIX=S option in $COV; it took only 20 minutes (I used $MSFI), compared to about 15-20 hours without this option (also using $MSFI). And it passed the $COV step, and the results are consistent with those of the models from the forward selection process (all of the base models except the last few used $COV, but without this option).

Now, any suggestions about the exact explanation of this phenomenon?

(Hi, Atul, I'll buy you a beer next time we meet.)

> Your third choice (MATRIX=S) is really the first I have heard about using
> it. Does this mean we go around matrix R (or, more appropriately, just
> inverse R)? The NONMEM manual just mentions that "MATRIX=S requests that
> the inverse S matrix be used." In this case, does the statistical inference
> change? (Since R and S are two different matrices - Hessian and
> cross-product gradient.) Has anyone explored the differences - in parameter
> estimates, statistical inferences, etc. - between the options MATRIX=R and
> MATRIX=S on a good model (one which does not produce the "non-positive ..."
> message)? Does this message only happen with matrix R (or, strictly,
> inverse R)? Or is it also possible with matrix S (inverse S)?

> I did try SIGDIG = 4, and the first round of results showed "Rounding
> Errors". The second round of results (after changing initial guesses) is
> not out yet (it takes more than 30 hours). I did not go as far as
> SIGDIG = 5. Here, I'm wondering whether anyone has tried even further, such
> as SIGDIG = 6 or 7 (if applicable, for any reason)?

> For the first choice - minimizing (the number of) covariates - it's really
> helpful during the forward selection step (if using $COV), since it can
> pick up redundant (or correlated) covariates depending on the magnitude of
> the p values. However, if it's after the backward elimination... Couldn't
> the backward elimination process remove the redundant (correlated)
> covariates from the model? Or is the magnitude of the p value simply not
> low enough (to remove redundant or correlated covariates)? If so, how low
> is low enough (generally, in industry)? - a silly question, huh? - I use
> p = 0.001 right now. If not, does this mean we cannot (completely) rely on
> the backward elimination step based on the magnitude of the change in the
> objective function values (chi-square distribution assumption)?

> Sorry for a lot of questions.


From: "Piotrovskij, Vladimir [JanBe]" <VPIOTROV@janbe.jnj.com>

Subject: RE: Help: Non-positive semi-definite message

Date: Thu, 16 Aug 2001 09:59:45 +0200

Don't you think your model is a bit over-parameterized? Often $COV fails

when you have too many random effects not supported by the data. Did you try

dropping one of your ETAs? Note: using MATRIX = S may be misleading.

--------------------------------------------------------------------

Global Clinical Pharmacokinetics and Clinical Pharmacology

From: "Olofsen, E." <E.Olofsen@lumc.nl>

Subject: Re: Non-positive semi-definite message

Date: Thu, 16 Aug 2001 15:33:49 +0200 (CEST)

> One more suggestion was to use MATRIX=S in COV step to get around this

> message although I am not sure how it works.

See the User's Guide part II, page 21, on the assumption of the

distribution of random effects.

> I used these methods and I could work around the problem. It is still not
> clear to me how a change of NSIG affects the search for the global minimum.
> Perhaps somebody could comment on this.

I wanted to understand this, and the origin of the `rounding errors' message as well, so I had a look at the code. This is what I think happens.

First, NSIG determines the small but finite difference h in the (scaled) parameters used at function evaluations to calculate the gradient and to update the Hessian (see Help Guide). These terms will be more accurate when NSIG is higher (due to the nonlinearity of the objective function), but too small an h may cause numerical instability.

Second, it is used to check whether the required precision has been achieved, in order to terminate the minimization (successfully). Third, it is checked whether the precision is still high enough to calculate a new solution. When this is not possible, the Hessian is reset (see Help Guide). If it is still not possible, the minimization terminates with the `rounding errors' message. This may therefore be caused by a low NSIG rather than by numerical instability as above. Note that NSIG is also used in

It may be that there is a discrepancy between the accuracy of the numerical derivatives and the precision of the parameter estimates when they are highly correlated, the former needing a higher NSIG than the latter.
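The trade-off described above is easy to demonstrate: for a forward difference, truncation error shrinks with h while round-off error grows as h shrinks. A toy sketch, with exp standing in for the objective function (NONMEM's internals differ):

```python
import math

def fwd_diff_error(h):
    # forward-difference estimate of d/dx exp(x) at x = 0 (true value: 1.0)
    est = (math.exp(h) - math.exp(0.0)) / h
    return abs(est - 1.0)

# moderate h: truncation error ~ h/2 dominates
# very small h: floating-point round-off dominates and the estimate degrades
for h in (1e-2, 1e-8, 1e-15):
    print(h, fwd_diff_error(h))
```

The error is smallest at an intermediate h, which is why both too low and too high an NSIG can cause trouble.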

From: Alan Xiao <Alan.Xiao@cognigencorp.com>

Subject: Re: Help: Non-positive semi-definite message

Date: Thu, 16 Aug 2001 09:35:50 -0400

Yep. I found that the absolute magnitudes of the SEM (= STD ERR/MEAN) are systematically smaller with the option than without it, although the relative magnitudes are consistent (meaning that if a parameter has a higher SEM without the option, it also has a higher SEM with the option).

About the over-parameterization: it's the first problem I tried to fight. But first I focused on covariates.

About the number of ETAs: I tried to see whether I could increase it, rather than decrease it, because the number of ETAs and their positions were tested during the construction of the structural model, and it always did well during the addition of the first 23 covariates to the model. But I did find that in the structural model, eta5 and eta6 did not show correlation, whereas after the addition of covariates to their parameters they are correlated. Then I tried $OMEGA BLOCK.

Now, one question: if we drop an ETA from a parameter with covariates (the addition of covariates significantly decreases the eta value, from 0.4** down to 0.04**), does this mean we push the interindividual variability to come basically from the covariates? Because dropping an ETA is equivalent to fixing it at zero, although we can claim that the data might not be enough to support it, since the ETA is then not estimated, which is different from being estimated as zero. Has anyone had the experience that, after the addition of covariates to a parameter, the estimate of the ETA on that parameter goes to zero? - I know, theoretically it's a little bizarre, but bizarre things sometimes happen.


From: peter.bonate@quintiles.com

Date: Thu, 16 Aug 2001 08:52:12 -0500

Vladimir responded that using MATRIX=S in a $COV statement may be misleading. How so? The default in NONMEM is something like R^(-1)*S*R^(-1). This is the heteroscedasticity-consistent ("sandwich") estimator. Many books on nonlinear regression report that inversion of S alone is also valid and should be approximately the same as the other estimator. Why not so with nonlinear mixed effects models?
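For intuition only (a toy one-parameter problem, not what NONMEM actually computes): for a correctly specified model, the Hessian-based R, the score cross-product S, and the sandwich R^(-1)*S*R^(-1) all estimate the same information quantity, so S^(-1) and the sandwich should agree approximately. A plain-Python sketch for estimating a normal mean with known variance:

```python
import random, statistics

random.seed(0)
sigma2 = 4.0                        # known residual variance of the toy model
n = 50_000
x = [random.gauss(10.0, sigma2 ** 0.5) for _ in range(n)]
mu_hat = statistics.fmean(x)        # maximum likelihood estimate of the mean

# R: negative Hessian of the total log-likelihood (a scalar for this model)
R = n / sigma2
# S: sum of squared per-observation scores, evaluated at mu_hat
S = sum(((xi - mu_hat) / sigma2) ** 2 for xi in x)

sandwich = S / R ** 2   # scalar version of R^(-1) * S * R^(-1)
s_only = 1.0 / S
r_only = 1.0 / R
print(sandwich, s_only, r_only)   # all close to sigma2/n for a correct model
```

When the model is misspecified the three estimators diverge, which is one rationale for preferring the sandwich as a default.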

On to another question. I know this is minutiae, but I am giving a talk on random number generators in simulation at an AAPS meeting in September, and I am pretty certain that someone will ask this: what is the random number generator used by NONMEM for performing simulations? Does someone have a reference?

From: "Piotrovskij, Vladimir [JanBe]" <VPIOTROV@janbe.jnj.com>

Subject: RE: Help: Non-positive semi-definite message

Date: Thu, 16 Aug 2001 15:58:41 +0200

> Now, one question, if we drop an eta from a parameter with covariates (the
> addition of covariates significantly decreases the eta value, from 0.4**
> down to 0.04**), does this mean we push the interindividual variability to
> come basically from the covariates?

I think you are talking about OMEGAs, not ETAs (the latter are random variables, while OMEGAs are the corresponding variances). Your thinking about the role of covariates in the reduction of OMEGAs is correct. If no covariates are included in the model, the interindividual variability is a pure random effect. Adding covariates reduces the random component of the interindividual variability. One can imagine the situation where all (or almost all) interindividual variability is explained by the fixed effects of covariates, and there is no need for the random effect anymore.

From: Alan Xiao <Alan.Xiao@cognigencorp.com>

Subject: Re: Help: Non-positive semi-definite message

Date: Thu, 16 Aug 2001 10:24:29 -0400

That means the random-effect distribution of the parameter perfectly matches the random-effect distribution of the covariates? - when the ETA goes to zero after the addition of covariates. Isn't there an assumption that the random-effect distribution is normal or lognormal? If so, does this mean the above possibility can never happen to parameters with only dichotomous covariates? - since dichotomous covariates have quite a different distribution type. Or is it also possible? - since the dichotomous covariates could be twisted into a normal/lognormal distribution (I guess at least in the structural model they are partly, if not completely, done this way, since their effects are components of the random-effect distribution of the parameter - am I right?)

From: "Piotrovskij, Vladimir [JanBe]" <VPIOTROV@janbe.jnj.com>

Subject: RE: Help: Non-positive semi-definite message

Date: Thu, 16 Aug 2001 16:43:06 +0200

Covariates do not have any random effects, and their distributions do not play any role. They are just independent variables, like time, and are assumed to be known without error.

From: "Piotrovskij, Vladimir [JanBe]" <VPIOTROV@janbe.jnj.com>

Subject: RE: S-matrix and RNGs

Date: Thu, 16 Aug 2001 16:48:28 +0200

It may be so, and may be not. The question is how good is "approximately".

At least there should be reasons for selecting R^(-1)*S*R as the default in

From: Alan Xiao <Alan.Xiao@cognigencorp.com>

Subject: Re: Help: Non-positive semi-definite message

Date: Thu, 16 Aug 2001 11:07:42 -0400

I think I mixed up the covariate's random-effect distribution with the covariate's own distribution. Aren't their values measured, with some kind of random error, just like concentration records? - where residual variability is (partly) defined. OK, we just ignore that by assumption - after all, we cannot cover everything. However, covariates have their own distributions. If a parameter can be expressed as:

P = theta0 + theta1*cov1 + theta2*cov2 + eta

var(P) = theta1**2*var(cov1) + theta2**2*var(cov2) + var(eta)

- assuming cov1, cov2 and eta are mutually independent and there is no random effect on the thetas.

Now, if cov1, cov2 and eta have different distribution types - say, cov1=AGE, cov2=GENDER, and we already assumed eta has a normal distribution - are they "normalized" to the normal distribution?
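The variance identity above can be checked by simulation; only the independence of cov1, cov2 and eta matters - the covariates' distribution types play no role, and nothing gets "normalized" (although the distribution of P is then generally not normal). A sketch with made-up thetas, a uniform AGE and a 0/1 GENDER:

```python
import random

random.seed(1)

def var(z):
    # sample variance with the usual n-1 divisor
    m = sum(z) / len(z)
    return sum((zi - m) ** 2 for zi in z) / (len(z) - 1)

theta0, theta1, theta2 = 10.0, 0.05, 2.0             # made-up fixed effects
n = 100_000
age = [random.uniform(20, 80) for _ in range(n)]     # continuous covariate
sex = [random.randint(0, 1) for _ in range(n)]       # dichotomous covariate
eta = [random.gauss(0.0, 0.5) for _ in range(n)]     # normal random effect

p = [theta0 + theta1 * a + theta2 * s + e for a, s, e in zip(age, sex, eta)]

lhs = var(p)
rhs = theta1 ** 2 * var(age) + theta2 ** 2 * var(sex) + var(eta)
print(lhs, rhs)   # agree to sampling error, despite the mixed distribution types
```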

From: LSheiner <lewis@c255.ucsf.edu>

Subject: Re: Help: Non-positive semi-definite message

Date: Thu, 16 Aug 2001 09:57:44 -0700

Beware an inter-individual random effect variance estimate near zero:

It is almost never what it appears!

When a random-effect variance is estimated to be zero and the model used has a diagonal OMEGA, it usually means that the model is over-parameterized in random effects and the data are insufficient to estimate the number of distinct components of variance that are present. I personally haven't seen this suddenly happen only upon addition of a covariate, but it could.

A helpful diagnostic is to run the model with a full OMEGA (i.e., if the number of ETAs is 4, then use $OMEGA BLOCK(4)). On output, convert OMEGA to correlation form (divide the off-diagonal values by the square root of the product of the two corresponding diagonal elements ... thus the corr value associated with OMEGA(2,3) is OMEGA(2,3)/(OMEGA(2,2)*OMEGA(3,3))**.5). If one of the off-diagonal terms in the correlation matrix, in the same row/col as the variance that went to zero on the diagonal-OMEGA run, is nearly 1 (or -1), then over-parameterization in random effects is probably the reason for the zero estimate, not that the covariate has removed all variability ....
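The conversion described above is mechanical; a small helper makes it explicit (the OMEGA values below are made up for illustration):

```python
def cov_to_corr(omega):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    n = len(omega)
    sd = [omega[i][i] ** 0.5 for i in range(n)]
    return [[omega[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

# made-up 3x3 OMEGA estimate from a hypothetical BLOCK(3) run
omega = [[0.090, 0.020, 0.088],
         [0.020, 0.040, 0.019],
         [0.088, 0.019, 0.100]]
corr = cov_to_corr(omega)
print(corr[0][2])   # close to 1: over-parameterized random effects likely
```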

_/ _/ _/_/ _/_/_/ _/_/_/ Lewis B Sheiner, MD (lewis@c255.ucsf.edu)

_/ _/ _/ _/_ _/_/ Professor: Lab. Med., Bioph. Sci., Med.

_/ _/ _/ _/ _/ Box 0626, UCSF, SF, CA, 94143-0626

_/_/ _/_/ _/_/_/ _/ 415-476-1965 (v), 415-476-2796 (fax)

Date: Thu, 16 Aug 2001 11:09:05 -0700 (PDT)

Regarding some discussion ensuing from Alan Xiao's question about

the message he is getting that the R matrix is non-positive definite:

Discussion about how to get around this message, by using other options on the

$COV record to obviate the need to compute the R matrix, is

off the mark. After obtaining a final parameter estimate that appears

to be reasonable, it is a good idea to compute the R matrix. (This is

easily done if a MSF was used when the parameter estimate was computed.)

With the R matrix, and only with the R matrix, can one check that the estimate

is not a saddle point. If, as in Alan's case, the estimate is a saddle point,

then it probably is not a suitable estimate, and standard error

estimates (no matter how they may be computed) and the like are irrelevant.

By default, if the R matrix turns out to be positive definite, and the S

matrix to be nonsingular, then NONMEM computes the estimate of the

variance-covariance matrix of the estimator by using both the R and S

matrices (viz. as (Rinv)S(Rinv)).

From: Michael J Fossler <Michael.J.Fossler@dupontpharma.com>

Date: 16 Aug 2001 14:49:11 -0400

Could someone explain in detail how to go about computing the R matrix and how to interpret the results?

*******************************************************************

Drug Metabolism and Pharmacokinetics, DuPont Pharmaceuticals

michael.j.fossler@dupontpharma.com

************************************************************************

From: Alan Xiao <Alan.Xiao@cognigencorp.com>

Date: Thu, 16 Aug 2001 15:07:18 -0400

Thanks for your comments. About the saddle point, how could it be formed? Can the (linear) addition of some extra covariates into the model change the surface around a point from a cone shape (or whatever) into a saddle-shaped surface? Or is this just from the structure of the model and not related to covariates at all (the "fundamental" problems you mentioned in the last email)? If yes, then how to explain that there was no problem for the structural model, and for all models with up to 23 covariates, to go through the $COV step by default? - You wouldn't suggest that we can still get the right R on a saddle-shaped surface, right?

On the other hand, if a saddle-shaped surface is so big that it covers the whole range of your data (wild?), what could you do?


From: harrold@sage.che.pitt.edu

Date: Thu, 16 Aug 2001 13:26:11 -0400 (EDT)

Sometime in August Alan Xiao assaulted keyboard and produced...

|Thanks for your comments. About the saddle point, how could it be formed?

in one dimension a saddle point occurs where your first derivative is zero and your second derivative is also zero. think back to calculus and points of inflection. in multiple dimensions this occurs when the gradient is zero and the hessian is indefinite, i.e. it has both positive and negative eigenvalues, so it is neither positive nor negative definite. these are of course the derivatives of the function you are trying to minimize/maximize.

in minimization terms the first order necessary condition requires that the gradient be zero, and the second order sufficient condition requires that the hessian be positive definite at this point.
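These sign conditions are easy to check numerically from the eigenvalues of the Hessian (which is what NONMEM's R matrix estimates). A plain-Python sketch with toy 2x2 Hessians, not the NONMEM objective:

```python
import math

def eig2(a, b, c):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    m = (a + c) / 2.0
    d = math.hypot((a - c) / 2.0, b)
    return m - d, m + d

# f(x, y) = x**2 - y**2: gradient is zero at the origin, and the
# Hessian [[2, 0], [0, -2]] has eigenvalues of both signs -> saddle point
print(eig2(2.0, 0.0, -2.0))

# g(x, y) = x**2 + y**2: Hessian [[2, 0], [0, 2]] positive definite -> minimum
print(eig2(2.0, 0.0, 2.0))
```

At a saddle the Hessian cannot be positive semidefinite, which is exactly what the $COV message reports about R.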

|Can (linear) addition of some extra covariates into the model change the surface

|around a point from a cone shape or whatever into a saddle shape surface? Or

|is this just from the structure of the model and not related to covariates at all

|(as you mentioned "fundamental" problems in the last email)? If yes, then how to

|explain that it's no problem for the structural model and all models with up to

|23 covariates to go through $COV step by default? - You won't suggest that we can

|still get the right R on a saddle shape surface, right?

|On the other hand, if a saddle shape surface is so big that it covers the whole

|range of you data (wild?), what could you do?

then i believe that your minimum would lie somewhere along the boundary of the parameter space.

From: Alan Xiao <Alan.Xiao@cognigencorp.com>

Date: Thu, 16 Aug 2001 15:52:44 -0400

(Following your second comment.) Then this means that even if you get a perfect R and everything else is also perfect, your final model is still questionable, since it might simply be because "your minimum would lie somewhere along the boundary" of the parameter space.

And also, does this sentence of yours imply that the iterative values of (some) parameter estimates search along just one direction? Or do we have other methods to handle this?

Sorry, I'm just trying to clarify the questions and find a solution to the problem.


From: "KOWALSKI, KENNETH G. [R&D/1825]" <kenneth.g.kowalski@pharmacia.com>

Subject: RE: Help: Non-positive semi-definite message

Date: Mon, 20 Aug 2001 13:06:18 -0500

I agree with Vladimir. In looking at your ETAs you might want to calculate the correlations (NONMEM only reports the covariance matrix for OMEGA). The non-positive semi-definite message from the $COV step often arises when OMEGA is over-parameterized. When this happens I often find that the ETAs for two or more of the parameters (e.g., CL and V) have their correlation estimated to be 1.0. In this setting I usually reduce the dimensionality of OMEGA by using a shared ETA for the two parameters, with a different scale parameter to account for the different variances. For example,

CL=THETA(1)*EXP(ETA(1))

V=THETA(2)*EXP(THETA(3)*ETA(1))

Here Var(logV)=(THETA(3)^2)*Var(logCL) and Corr(logV, logCL)=1.0.
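A quick simulation confirms the algebra of this shared-ETA parameterization (the thetas and omega below are arbitrary): logV is an exact linear function of logCL, so their correlation is exactly 1 and Var(logV)/Var(logCL) equals THETA(3)**2.

```python
import random, math

random.seed(2)
th1, th2, th3 = 5.0, 40.0, 0.6     # made-up THETA(1), THETA(2), THETA(3)
omega = 0.3                         # SD of the single shared ETA
eta1 = [random.gauss(0.0, omega) for _ in range(10_000)]

log_cl = [math.log(th1) + e for e in eta1]         # CL = THETA(1)*EXP(ETA(1))
log_v = [math.log(th2) + th3 * e for e in eta1]    # V = THETA(2)*EXP(THETA(3)*ETA(1))

def var(z):
    m = sum(z) / len(z)
    return sum((zi - m) ** 2 for zi in z) / (len(z) - 1)

print(var(log_v) / var(log_cl))   # equals THETA(3)**2 = 0.36 up to rounding
```

A negative THETA(3) works the same way but gives a correlation of -1 instead of +1.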

From: Alan Xiao <Alan.Xiao@cognigencorp.com>

Subject: Re: Help: Non-positive semi-definite message

Date: Mon, 20 Aug 2001 17:37:55 -0400

I noticed that you and Diane and others talked about this online some months ago.

Yes, this is kind of a trick: you reduce a parameter ETA but increase a parameter THETA - the total parameter numbers are the same (of course, they have different meanings) - and you get the same (?) estimates of the other parameters, and the problem is gone.

How about some kind of correlation between ETAs that is not equal (or close) to 1 (-1)? Does THETA(3) in your example have to be positive?

And what's the difference between using this method and $OMEGA BLOCK?

If any of you have already tested these kinds of questions, and their inferences are general, and you are willing to share them with me (and other NMusers) here, that would be great - and I would really, really appreciate it.

My time frame does not allow me to test (or even think about) every aspect of every question.

From: Nick Holford <n.holford@auckland.ac.nz>

Subject: Re: Help: Non-positive semi-definite message

Date: Tue, 21 Aug 2001 10:14:15 +1200

Like Alan, I have some questions about your suggestion. I have difficulty accepting the interpretation of the model when you fix the correlation at 1. This means that if CL goes up 50% in a patient, then V will go up exactly 50% in the same patient. I have no prior expectation that this would ever happen.

In the case we are discussing, where the data do not allow the covariance between CL and V to be estimated adequately, I feel more comfortable with the assumption that the covariance is 0.

A third, middle-ground possibility would be to include a third ETA to define the covariance as a random but FIXED parameter, e.g. a correlation of, say, 0.3. I am not sure exactly how to do this, but perhaps this might do it:

V=THETA(2)*EXP(ETA(3)*ETA(1))

Nick Holford, Divn Pharmacology & Clinical Pharmacology

University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand

email:n.holford@auckland.ac.nz tel:+64(9)373-7599x6730 fax:373-7556

http://www.phm.auckland.ac.nz/Staff/NHolford/nholford.htm

From: "KOWALSKI, KENNETH G. [R&D/1825]" <kenneth.g.kowalski@pharmacia.com>

Subject: RE: Help: Non-positive semi-definite message

Date: Mon, 20 Aug 2001 18:01:05 -0500

The trick I suggested when a correlation goes to unity does reduce the

dimensionality by a parameter. In the unrestricted case you are estimating

a variance component for both etas (e.g., V and CL) as well as the

covariance between the etas (the off-diagonal element in OMEGA). Thus, in

the case of two etas you are estimating 3 parameters. In the example I gave

where an eta is shared, then you only have 2 parameters being estimated as

the correlation is restricted to unity.

If you are not using a full OMEGA to begin with, you should look at Lew Sheiner's recent message on this topic regarding diagonal OMEGAs with variance components being estimated as zero. He comments on the problem of having more ETAs than are supported by the data, and suggests that you look at a full OMEGA as a diagnostic and see whether any of the off-diagonal elements have correlations approaching unity. I do this as a matter of good practice regardless of whether a diagonal OMEGA gives me problems or not. I'm pretty sure I went through an example illustrating this when I visited Cognigen a while back.

Blocking of OMEGA is also a useful way to reduce the dimensionality by restricting certain covariances to zero. Basically, all ETAs within a block are assumed to be correlated, and ETAs between blocks are uncorrelated. If, for example, you have a one-compartment model with tlag, ka, V, and CL, a full OMEGA would be specified as BLOCK(4) on the $OMEGA statement. If this leads to an over-parameterized OMEGA that suggests that tlag and ka are uncorrelated with V and CL, it might be reasonable to reduce the dimensionality by assuming tlag and ka are in one block and V and CL are in a second block, with code something like

$OMEGA BLOCK(2) ... ; tlag, ka

$OMEGA BLOCK(2) ... ; V, CL

Note that in the full BLOCK(4) OMEGA there are 10 parameters to be estimated (4 diagonal elements and 6 off-diagonal elements). In the above example, where the random effects for tlag and ka are uncorrelated with those for V and CL, there are only 6 parameters to be estimated.
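The parameter counting generalizes: an n-by-n block contributes n(n+1)/2 OMEGA parameters. A one-line helper:

```python
def n_omega_params(block_sizes):
    """Number of variance/covariance parameters in a block-diagonal OMEGA."""
    return sum(n * (n + 1) // 2 for n in block_sizes)

print(n_omega_params([4]))           # full BLOCK(4): 10 parameters
print(n_omega_params([2, 2]))        # two BLOCK(2)s: 6 parameters
print(n_omega_params([1, 1, 1, 1]))  # diagonal OMEGA: 4 parameters
```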

Furthermore, if the BLOCK(4) OMEGA yields off-diagonal correlations near unity, then you can use the trick I suggested with the shared ETAs to further reduce the dimensionality. Another way to reduce the dimensionality of OMEGA is the use of band-diagonal matrices; Diane Mould discussed this on the NONMEM network a couple of months ago. I think blocking, shared ETAs and banding give us quite a bit of flexibility in reducing the dimensionality of OMEGA to combat the problems with over-parameterized OMEGAs, which can lead to those nasty non-positive semi-definite messages.

From: "KOWALSKI, KENNETH G. [R&D/1825]" <kenneth.g.kowalski@pharmacia.com>

Subject: RE: Help: Non-positive semi-definite message

Date: Mon, 20 Aug 2001 18:40:18 -0500

I share your concern; however, you might be surprised how often this occurs. We need to run the full OMEGA model more often as a diagnostic and calculate these correlations... I wish NONMEM would automatically report these correlations rather than just the variances and covariances. I think this issue is the same as when we observe a particular variance component being estimated as zero. We should not infer that the variability in that particular variable is negligible, just that the data/design do not provide sufficient information to estimate it. When a full OMEGA has an off-diagonal element going to unity, we shouldn't automatically assume that the ETAs are perfectly correlated, just that they are strongly correlated and the data/design don't provide sufficient information to estimate the correlation as different from unity.

I gave a talk on this at the CDDS workshop on modeling and simulation best practices a couple of years ago. In that talk I showed how, if we ignore these strong correlations and resort to a diagonal OMEGA simply because we can get the COV step to run without the "non-positive semi-definite" problems, we can end up simulating unrealistic data. If OMEGA is over-parameterized, we need to understand why and try to reduce the dimensionality in the most parsimonious way... diagonal OMEGAs usually are not the most parsimonious.

From: "KOWALSKI, KENNETH G. [R&D/1825]" <kenneth.g.kowalski@pharmacia.com>

Subject: RE: Help: Non-positive semi-definite message

Date: Tue, 21 Aug 2001 09:55:07 -0500

> I recall your talk, which was very helpful at pointing to solutions to the
> problem. The importance of knowing something about covariance when doing
> simulations needs to be continuously emphasized. While using a covariance
> of 1 may not be as harmful as a covariance of 0, it still seems that we
> would be better off with a middle position which uses some covariance
> between 0 and 1.

> Do you have any specific comments on the method I suggested?

> V=THETA(2)*EXP(ETA(3)*ETA(1))

In the setting where NONMEM wants to estimate the correlation for an off-diagonal element of OMEGA at 1, what value do you propose fixing the covariance to? If you fix the covariance at various values (and estimate everything else), I believe you will find that the minimum OFV will correspond to the estimate of the covariance which leads to the correlation being set to 1.0. Thus, any fixed value that results in the correlation being constrained to less than 1.0 should have a higher OFV and hence will be less parsimonious. I don't know how one arbitrarily sets the correlation to something between 0 and 1 when the data/design do not provide sufficient information to estimate it as different from 1. If you set it arbitrarily close to 1, like 0.95 or 0.99, so as not to increase the OFV too much, then you have a "near perfect correlation" and your concern would still exist, at least approximately.

From: Nick Holford <n.holford@auckland.ac.nz>

Subject: Re: Help: Non-positive semi-definite message

Date: Wed, 22 Aug 2001 07:16:41 +1200

"KOWALSKI, KENNETH G. [R&D/1825]" wrote:

> In the setting when NONMEM wants to estimate the correlation for an
> off-diagonal element of omega to 1 what value do you propose fixing the
> covariance value to?

That is *exactly* what I am proposing. The analogous situation would be fixing KA to a reasonable value when you don't have enough data in the absorption phase to estimate it (Janet Wade did some simulations on this in JPB). If you pick a reasonable value then the other estimates will be reasonable.

> If you fix the covariance at various values (and
> estimate everything else) I believe you will find that the minimum OFV will
> correspond to the estimate of the covariance which leads to the correlation
> being set to 1.0. Thus, any fixed value that results in the correlation
> being constrained to less than 1.0 should have a higher OFV and hence will
> be less parsimonious.

I would accept that fixing the covariance would constrain the OFV, but sometimes you should not believe the OFV as the only criterion of a reasonable fit.

>I don't know how one arbitrarily sets the correlation

> to be something between 0 and 1 when the data/design do not provide

> sufficient information to estimate it different from 1. If you set it

> arbitrarily close to 1 like 0.95 or 0.99 so as not to increase the OFV too

> much then you have a "near perfect correlation" and your concern would still

> exist at least approximately.

I was thinking of a value of around 0.3 which I have seen for CL and V. This seems more reasonable than 0 or 1.


From: "KOWALSKI, KENNETH G. [R&D/1825]" <kenneth.g.kowalski@pharmacia.com>

Subject: RE: Help: Non-positive semi-definite message

Date: Tue, 21 Aug 2001 17:27:50 -0500

> I was thinking of a value of around 0.3 which I have seen for CL and V.

> This seems more reasonable than 0 or 1.

I disagree. If you arbitrarily fix the correlation to 0.3 when NONMEM wants

to estimate it to be 1.0, I think you will still run into simulations of

individual data that can be very unrealistic, more so than restricting the

correlation to 1.0. To me the real proof of what is more reasonable is to

do a simulation (posterior predictive check). I'll bet you a six-pack that

1.0 will be more reasonable than 0.3.
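To make the stakes of the wager concrete, here is a back-of-envelope calculation of my own (illustrative CVs, not estimates from anyone's data): for log-normal CL and V, the variance of log(CL/V), which governs the between-subject spread of elimination half-life in simulated individuals, changes almost tenfold between the two candidate correlations.

```python
# Var(log CL - log V) = w_cl**2 + w_v**2 - 2*corr*w_cl*w_v for log-normal
# CL and V with log-scale SDs w_cl, w_v (roughly the CVs when small).
# The values 0.3 and 0.2 are illustrative, not from the thread.

def var_log_ratio(w_cl, w_v, corr):
    return w_cl ** 2 + w_v ** 2 - 2 * corr * w_cl * w_v

print(round(var_log_ratio(0.3, 0.2, 1.0), 4))  # 0.01  (corr fixed at 1.0)
print(round(var_log_ratio(0.3, 0.2, 0.3), 4))  # 0.094 (corr fixed at 0.3)
```

So a simulation with corr = 1 produces individuals whose half-lives vary far less than one with corr = 0.3; a posterior predictive check on a half-life-like statistic is exactly where the two assumptions would part ways.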

From: Nick Holford <n.holford@auckland.ac.nz>

Subject: Re: Help: Non-positive semi-definite message

Date: Wed, 22 Aug 2001 11:10:43 +1200

If you use PPC and rely only on the data set at hand then I would not take on the bet. But if your PPC included an informative prior about the value of the covariance then I would be happy to bet you a six pack :-)

Let me know when you have the results and I'll provide a six pack of your favourite if you can demonstrate that the PPC with an uninformative prior is better than one with a prior covariance of 0.3 and, let's say, a CV of 50% for the precision of this prior.

Nick Holford, Divn Pharmacology & Clinical Pharmacology

University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand

email:n.holford@auckland.ac.nz tel:+64(9)373-7599x6730 fax:373-7556

http://www.phm.auckland.ac.nz/Staff/NHolford/nholford.htm

From: Mats Karlsson <Mats.Karlsson@farmbio.uu.se>

Subject: Re: NMUSERS : Non-positive semi-definite message

Date: Wed, 22 Aug 2001 10:25:18 +0200

Neither code will work. Multiplying one ETA with another will give zero

gradients (unless you make some changes in the source code). At the evaluation

of the impact of one ETA on the function, the other etas are set to zero. You can

reparametrize the whole model so that you estimate CORR and CVs. Then it is

easy to fix CORR to a prior. The code is ugly and not general, though. One example:

COVA =CORR*SQRT(CVCL*CVCL*CVV*CVV)

IF(COVA.LT.TCVCL) FCVCL =SQRT(CVCL*CVCL-COVA)

IF(COVA.LT.TCVV) FCVV =SQRT(CVV *CVV -COVA)

IF(COVA.GE.TCVCL) FCVCL =SQRT(-CVCL*CVCL+COVA)

IF(COVA.GE.TCVV) FCVV =SQRT(-CVV *CVV +COVA)

ETCL =EXP(ETA(1)*FCVCL)*EXP(ETA(6)*SQRT(COVA))

ETV =EXP(ETA(2)*FCVV)*EXP(ETA(6)*SQRT(COVA))

;----------IOV--------------------

KPCL =EXP(VIS3*ETA(7)*THETA(4)+VIS8*ETA(8)*THETA(4))

KPKA =EXP(VIS3*ETA(9)*THETA(13)+VIS8*ETA(10)*THETA(13))

KPV =EXP(VIS3*ETA(11)*THETA(14)+VIS8*ETA(12)*THETA(14))

KPLAG=(VIS3*ETA(4)*THETA(11)+VIS8*ETA(5)*THETA(11))

TVCL =THETA(1)*(1+THETA(9)*(CLCR-65))

ALAG1=EXP(PHI+KPLAG+ETLAG)/(1+EXP(PHI+KPLAG+ETLAG))

$TAB ID TAD IPRED IWRES ONEHEADER NOPRINT FILE=sdtab64

$TAB VID ETCL ETV ONEHEADER NOPRINT FILE=patab64

$TAB VID AGE WT CLCR ONEHEADER NOPRINT FILE=cotab64

$TAB VID SEX ACE DIG DIU COMP VISI ONEHEADER NOPRINT FILE=catab64

$THETA (0,27.5) (0,1.565) (0,2.1) (.254) (0,.23)

$THETA (0,.22) (0,.24) (0,1.45) (0,.008,.033) (0,.14) (0,5) (0,.86,1) (0,1)

$OMEGA 1 FIX 1 FIX 1 FIX 1 FIX 1 FIX 1 FIX 1 FIX 1 FIX 1 FIX 1 FIX

$EST MAXEVALS=9990 PRINT=2 POSTHOC MSFO=msfb64

Div. of Pharmacokinetics and Drug Therapy

Dept. of Pharmaceutical Biosciences
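For readers who want to check the shared-ETA trick above, here is a loose Python translation (mine; the names and CV values are invented, and only the COVA < CVCL**2, COVA < CVV**2 branch is shown). Drawing three independent standard normals and combining them as in the control stream reproduces the target covariance CORR*CVCL*CVV.

```python
import math
import random

def correlated_etas(cv_cl, cv_v, corr, rng):
    """Shared-ETA construction: z1, z2 are parameter-specific draws
    (ETA(1), ETA(2)), z3 is the shared draw (ETA(6)); only the
    COVA < CV**2 branch of the control stream is implemented."""
    cova = corr * cv_cl * cv_v            # target covariance (COVA)
    f_cl = math.sqrt(cv_cl ** 2 - cova)   # FCVCL
    f_v = math.sqrt(cv_v ** 2 - cova)     # FCVV
    z1, z2, z3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    return f_cl * z1 + math.sqrt(cova) * z3, f_v * z2 + math.sqrt(cova) * z3

rng = random.Random(0)
draws = [correlated_etas(0.3, 0.2, 0.3, rng) for _ in range(200_000)]
n = len(draws)
m_cl = sum(d[0] for d in draws) / n
m_v = sum(d[1] for d in draws) / n
cov = sum((d[0] - m_cl) * (d[1] - m_v) for d in draws) / n
print(round(cov, 3))  # empirical covariance, close to 0.3*0.3*0.2 = 0.018
```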

From: "KOWALSKI, KENNETH G. [R&D/1825]" <kenneth.g.kowalski@pharmacia.com>

Subject: RE: Help: Non-positive semi-definite message

Date: Wed, 22 Aug 2001 11:02:36 -0500

In the discussion that Nick and I were having I was assuming that the only

information available was based on the data at hand, for which ML wants to

estimate a singular cov matrix. When I used the term "arbitrary" for

setting the correlation to 0.3 I was assuming there was no other information

available. That is, there is no prior knowledge that the true correlation

is 0.3. Certainly, if other data is available, perhaps from a design that

allows better estimation of the correlation, and an informative prior can be

developed that says the correlation is 0.3, I wouldn't take the bet either.

My point is that in the absence of an informative prior if ML wants to

estimate a singular cov matrix it seems doubtful that the true correlation

would be 0.3. A correlation of 0.3 is fairly weak and I would expect that

for most data/designs if the true correlation was 0.3 you wouldn't run into

a singular cov matrix due to ML trying to estimate the correlation as 1.0.

It seems to me there must be sufficient information in the data to make ML

want to drive the correlation to 1. So, I guess I'm saying that if you put

a mild prior on the correlation centered about 0.3 it's still likely to move

the correlation considerably away from 0.3. Perhaps I'm wrong and a poor

design could still drive ML to estimate the correlation as 1 when the true

value is 0.3...I don't know. It just seems more likely that the true

correlation would be considerably higher...perhaps 0.8 or 0.9.

In any event, the six-pack is on me.

From: LSheiner [mailto:lewis@c255.ucsf.edu]

Sent: Tuesday, August 21, 2001 7:44 PM

To: KOWALSKI, KENNETH G. [R&D/1825]; Nick Holford

Subject: Re: Help: Non-positive semi-definite message

If it will make you more likely to take Ken up on his wager,

I'll cover 2 of the cans in the six-pack against him ...

The reason I suspect that he might

be wrong is that ML really really wants a singular

cov matrix; sometimes beyond "reason". So if your prior knowledge

says param A and param B really are no more correlated in

real life than .3, I'm guessing the simulated data will look more

that way than if we let ML have its way.

You made the suggestion (but then it wasn't discussed further) of

putting a prior on OMEGA that puts some prior weight on corr=.3.

If you did this, and then you saw that without the prior the corr goes to 1

but with a mild prior it stays near .3, I'd cover the

whole six-pack ... Bayes is ALWAYS sensible; ML is only sensible when ...

_/ _/ _/_/ _/_/_/ _/_/_/ Lewis B Sheiner, MD (lewis@c255.ucsf.edu)

_/ _/ _/ _/_ _/_/ Professor: Lab. Med., Bioph. Sci., Med.

_/ _/ _/ _/ _/ Box 0626, UCSF, SF, CA, 94143-0626

_/_/ _/_/ _/_/_/ _/ 415-476-1965 (v), 415-476-2796 (fax)

Subject: Re: Help: Non-positive semi-definite message

Date: Wed, 22 Aug 2001 11:48:45 -0500

I think Ken is handling this discussion quite well, but I wanted to say in

general I agree with his points.

It seems to me that we have to consider the following. Do we have any

previous experience with this compound that would suggest that we could

fix the correlation to some value? If you do then you may be able to fix

it. Let me warn, however, that this covariance is partly determined by

which fixed and random effects you include in your model. One can

imagine the situation where there is a covariate that is highly

correlated with both parameters, and this is what is pushing the

correlation to 1. Suppose, for instance, that your estimate of the

correlation from previous information comes from a model that contains

this covariate, but your new model for your new data set does not. You

can see then that by fixing the correlation to the previous value, you may

be endangering the adequacy of your resulting model.
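This warning can be demonstrated with a small simulation (a sketch with invented numbers, not anyone's data): let a single standardized covariate drive both log CL and log V, with small independent residual etas. A model that omits the covariate sees random effects that are almost perfectly correlated, even though the residual etas themselves are independent.

```python
import math
import random

rng = random.Random(1)
n = 100_000
etas = []
for _ in range(n):
    w = rng.gauss(0, 1)                    # standardized covariate (e.g. weight)
    eta_cl = 0.5 * w + rng.gauss(0, 0.1)   # covariate effect + independent noise
    eta_v = 0.4 * w + rng.gauss(0, 0.1)
    etas.append((eta_cl, eta_v))

# Empirical correlation of the apparent random effects when the
# covariate is left out of the fixed effects:
m_cl = sum(e[0] for e in etas) / n
m_v = sum(e[1] for e in etas) / n
cov = sum((e[0] - m_cl) * (e[1] - m_v) for e in etas) / n
sd_cl = math.sqrt(sum((e[0] - m_cl) ** 2 for e in etas) / n)
sd_v = math.sqrt(sum((e[1] - m_v) ** 2 for e in etas) / n)
corr = cov / (sd_cl * sd_v)
print(round(corr, 2))  # near 0.95, despite independent residual etas
```

Including w as a covariate on both parameters would remove most of this apparent correlation, which is why a correlation fixed from a model *with* the covariate should not be carried into a model *without* it.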

What this discussion really amounts to is how much of a frequentist or

Bayesian are you? A frequentist would believe that your model should be

based only on your data. A Bayesian typically believes that you should

incorporate prior information into your model. Fixing a value in your

model to some previously determined value is a Bayesian idea. But, it is

not optimal in the Bayesian sense. A true Bayesian also believes that

there is uncertainty in prior estimates and thus you need to account for

this uncertainty in your prior beliefs. Doing this takes the model

outside of the scope of NONMEM (or at least I think that it does) and is

why you do not see it very much. It also often makes scientists uneasy,

especially when your prior information swamps the data, and leaves you

with a model that more reflects prior beliefs than the data.

Now, I do not claim to be either a Bayesian or frequentist, but a

pragmatist. The first question is will I have a poor model if I assume

that the correlation is equal to 1, when the maximum likelihood estimate

is that it should be 1? I really cannot imagine that the important

characteristic of the model, the estimation of fixed effects, will be

greatly harmed with this assumption. If I arbitrarily set the correlation

to 0.3, will the model be harmed? Quite possibly, especially if you want

to base your inference on the data. This, though, is based on my belief

that maximum likelihood is telling us all of the pertinent

information about the particular data set. The fact that the correlation

is going to 1 is the consequence of inadequate data to accurately estimate

this particular parameter. But, it does not necessarily mean that the

estimates that you get for your other parameters are being inadequately estimated.

Thus, where are we? No where, really. This just points out that

sometimes statistics, estimation, modeling, or whatever you want to call

it, requires both skill and knowledge. Many approach NONMEM as a black

box. Most of the time this is OK. But, annoyingly, there are some

instances where this is not OK. At this point statistical knowledge and

statistical philosophical belief are integral to the final product. This

also points out why good design is so important. Good design reduces the

chance that we are stuck with these annoying problems. Good design

anticipates analytical problems that could develop, and alleviates these

problems by getting data at the right places. Alas, even good design

cannot alleviate the possibility of poor data.

So, what is the solution? Know as much statistics as possible. Know as

much science as possible. Know where your weak spots are and work with

others that can fill in those weak spots. In this way, we will all be better off.

From: John Lukas <johnl@compulink.gr>

Subject: NMUSERS : Non-positive semi-definite message

Date: Wed, 22 Aug 2001 10:52:58 -0700

At 12:13 PM 8/21/01 +1200, Nick Holford wrote:

.......... (responding to Dr. Kowalski)....

> The importance of knowing something about covariance when doing

> simulations needs to be continuously emphasized. While using a covariance

> of 1 may not be as harmful as a covariance of 0 it still seems that we

> would be better off with a middle position which uses some covariance

> Do you have any specific comments on the method I suggested?

> V=THETA(2)*EXP(ETA(3)*ETA(1))

V=THETA(2)*EXP(SQRT(ETA(3)*ETA(2)/ETA(1)))

may be more true to the units, e.g. if you use an additive structure. Of