From: James <J.G.Wright@ncl.ac.uk>
Subject: Negative objective functions
Date: Thu, 29 Oct 1998 22:45:35 +0000

Dear NONMEM users,

How come it is possible to get negative objective function values in NONMEM?

James Wright

****

From: "Rik Schoemaker" <RS@chdr.nl>
Subject: Re: Negative objective functions
Date: Fri, 30 Oct 1998 08:45:30 +0100

Dear James,

The minimum value of the objective function is equal to twice the logarithm of the likelihood. If the likelihood is smaller than 1, the log will be negative. The likelihood itself is influenced by the scaling of the measurements; changing for instance from concentrations in ng/mL to ng/L will change the value of the likelihood.
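Rik's scaling point is easy to demonstrate numerically. A minimal sketch with hypothetical data: rescaling the DV by a factor s (here 1000, for ng/mL to ng/L) shifts the normal -2 log-likelihood by exactly 2*N*log(s), which can flip its sign.

```python
import math

def minus_two_loglik(y, mu, sigma):
    """-2 log-likelihood of i.i.d. normal observations (2*pi term included)."""
    n = len(y)
    ss = sum((yi - mu) ** 2 for yi in y) / sigma ** 2
    return n * math.log(2 * math.pi * sigma ** 2) + ss

# Hypothetical concentrations in ng/mL, residual SD 0.1 ng/mL
y_ml = [0.9, 1.1, 1.0, 0.95]
ofv_ml = minus_two_loglik(y_ml, 1.0, 0.1)        # negative

# Same data expressed in ng/L: observations, prediction, and SD all scale by 1000
y_l = [v * 1000 for v in y_ml]
ofv_l = minus_two_loglik(y_l, 1000.0, 100.0)     # positive

# The two objective functions differ by exactly 2*N*log(1000)
print(ofv_ml, ofv_l, ofv_l - ofv_ml, 2 * len(y_ml) * math.log(1000))
```

The sum-of-squares term is invariant under the rescaling; only the log-variance term moves, so the difference is a pure additive constant.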

Regards,

Rik Schoemaker
CHDR, Leiden, Netherlands

****

From: "Stephen Duffull" <sduffull@fs1.pa.man.ac.uk>
Subject: Re: Negative objective functions
Date: Fri, 30 Oct 1998 09:00:05 -0000

To comment on negative objective functions with NONMEM.

A negative objective function from NONMEM indicates that the likelihood is greater than 1.0, so minus twice its natural log is a negative number.

I have experienced this before when using maximum likelihood, but not with NONMEM when using a normal likelihood. Typically, if the distribution of the data given the parameter values conforms to a N(0,1) distribution, the likelihood at the mode will be approximately 0.4 (a NONMEM objective function of about 1.8). If the distribution is very peaked around the mode, then I guess it must be possible to get likelihoods at the mode of > 1.0.
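Steve's figure of 0.4 is the standard normal density at its mode, 1/sqrt(2*pi) ≈ 0.3989, and a sufficiently peaked density does exceed 1 at the mode. A small sketch:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# N(0,1): density at the mode is about 0.4, so -2*ln(L) is about 1.84
mode_density = normal_pdf(0.0)
print(mode_density, -2 * math.log(mode_density))

# A sharply peaked density (small sigma) exceeds 1 at the mode,
# making -2*ln(L) negative there
peaked = normal_pdf(0.0, sigma=0.1)
print(peaked, -2 * math.log(peaked))
```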

The only time I have seen this with NONMEM is when I inadvertently specified a binary objective function (using NONMEM IV) when analysing categorical data using a proportional odds model. When I used the correct likelihood the negative objective functions did not recur, and needless to say the model fitted the data somewhat better!

Regards
Steve
=============================
Stephen Duffull
School of Pharmacy
University of Manchester
M13 9PL, Manchester, UK
Ph +44 161 275 2355
Fax +44 161 275 2396
Email sduffull@fs1.pa.man.ac.uk

****

From: "Rik Schoemaker" <RS@chdr.nl>
Subject: Re: Negative objective functions
Date: Fri, 30 Oct 1998 10:28:29 +0100

Oops,

Steve is right of course: it is MINUS twice the log likelihood, and therefore the likelihood must be greater than 1 for the objective function to be negative. I've run into negative values very often, and it seems to me they have no meaning on their own.

Rik Schoemaker

****

From: James <J.G.Wright@ncl.ac.uk>
Subject: Re: Negative objective functions
Date: Fri, 30 Oct 1998 10:55:20 +0000

>Date: Fri, 30 Oct 1998 10:54:19 +0000
>To: LSheiner <lewis@c255.ucsf.edu>
>From: James <J.G.Wright@ncl.ac.uk>
>Subject: Re: Negative objective functions
>
>Dear Professor Sheiner,
>
>This agrees with my understanding of the situation. However, my
>understanding of likelihood is that it is a probability, and therefore must
>be between 0 and 1.
>
>James
>
>
>At 08:43 PM 10/29/98 -0800, you wrote:
>>The objective function is not a sum of squares, it is
>>-2 times the log of the likelihood. The likelihood, in
>>simple normal problems is a sum of squares. If that sum
>>is >1 then -2 log likelihood will be negative.
>>
>>The likelihood in NONMEM is usually more complicated
>>than a simple sum of squares, but it may still be
>>>1 and hence the obj fn be negative. The absolute
>>value of the obj fn is meaningless, as a likelihood
>>is only defined up to an arbitrary proportionality
>>constant. Only differences
>>between obj functions of nested models are meaningful.
>>
>>LBS.

****

Subject: RE: Negative objective functions
Date: Fri, 30 Oct 1998 12:23:36 +0100

The default objective function in NONMEM is that of ELS, and it can be positive or negative depending on DV units, number of measurements, etc. In NONMEM V, it is possible to select an objective function equal to the likelihood or to -2*log(likelihood) by including the LIKELIHOOD or -2LOGLIKELIHOOD option, respectively, in $ESTIMATION.

----------------------------------------------------------
Clinical Pharmacokinetics
Janssen Research Foundation
2340 Beerse, Belgium
e-mail: vpiotrov@janbe.jnj.com

****

From: LSheiner <lewis@c255.ucsf.edu>
Subject: Re: Negative objective functions
Date: Fri, 30 Oct 1998 09:05:07 -0800

As I wrote in response to the original message (but didn't copy nmusers), the likelihood (L) is proportional to the probability of the data, but only proportional. Its scale is not fixed. Thus it can easily be greater than 1 and hence -2 log(L) < 0. A typical likelihood, in simple Normal problems, is the sum of squares. This is typically N*(variance of residuals), where N is the number of observations. If N is large enough, or if the residual variance in the squared observation units of the dependent variable is >1, L will be >1 and -2log(L) <0.

LBS.

****

From: KENNETH.G.KOWALSKI@monsanto.com
Subject: Re[2]: Negative objective functions
Date: 30 Oct 1998 13:50:38 -0600

Lewis,

Just to be clear, the ELS objective function differs from -2log(L) by a constant (C) that depends only on the number of observations, that is, ELS = -2log(L) + C where C = -Nlog(2pi). Thus, even if L<1, as long as -2Log(L)<Nlog(2pi) the ELS will be <0. Right?
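Ken's identity can be checked numerically with hypothetical data: dropping the 2*pi term from the full normal -2 log-likelihood gives an ELS-style objective, and the two differ by exactly N*log(2*pi).

```python
import math

y = [1.2, 0.8, 1.1, 0.9, 1.05]   # hypothetical observations
mu, var = 1.0, 0.04               # model prediction and residual variance
n = len(y)

# Full -2*log(L) for i.i.d. normal data
full = n * math.log(2 * math.pi * var) + sum((yi - mu) ** 2 for yi in y) / var

# ELS-style objective: the same expression without the 2*pi term
els = n * math.log(var) + sum((yi - mu) ** 2 for yi in y) / var

# ELS = -2*log(L) - N*log(2*pi), so ELS < 0 whenever -2*log(L) < N*log(2*pi)
print(full - els, n * math.log(2 * math.pi))
```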

Ken

****

From: LSheiner <lewis@c255.ucsf.edu>
Subject: Re: Negative objective functions
Date: Fri, 30 Oct 1998 14:28:56 -0800

Right. As the ELS obj fn has been defined, the "scale" I mentioned in my note is indeed as Ken gives it. I was simply making the point that a likelihood (of which the ELS obj fn is an example) is equal to a probability only up to a proportionality constant which may be a function of the data (as for ELS: N is a data-determined quantity). Likelihoods are defined for ML estimation, and an extremum of a function does not change when the scale of the function changes.
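Sheiner's last point, that an extremum is unaffected by an additive constant, can be illustrated with a toy sketch (hypothetical data): shifting -2*log(L) by any constant leaves the minimizing parameter value unchanged.

```python
import math

y = [0.8, 1.2, 1.0, 1.1]  # hypothetical observations, known unit variance

def neg2ll(mu, const=0.0):
    """-2*log(L) for i.i.d. N(mu, 1) data, up to an arbitrary additive constant."""
    return sum((yi - mu) ** 2 for yi in y) + const

# Grid search for the minimizing mu, with and without the constant
grid = [i / 1000 for i in range(500, 1500)]
best_plain = min(grid, key=lambda m: neg2ll(m))
best_shift = min(grid, key=lambda m: neg2ll(m, const=-4 * math.log(2 * math.pi)))

# Same minimizer either way: the sample mean
print(best_plain, best_shift, sum(y) / len(y))
```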

****

From: James <J.G.Wright@ncl.ac.uk>
Subject: Re: Negative objective functions
Date: Sat, 31 Oct 1998 16:23:47 +0000

Dear Professor Sheiner,

My definition of likelihood as a probability is taken from Kendall's Advanced Theory of Statistics. However, I appreciate that this distinction may be blurred or even contradicted in other texts. I agree that it makes no difference to maximum likelihood estimation to include an additive constant in the objective function. However, if this constant is data-dependent (i.e. not constant as the number of observations changes), how does this affect the asymptotic properties of NONMEM estimates? Indeed, is it even sensible to talk about asymptotics when your estimator may not be consistent? Is this constant necessary for any reason?

My interest stems from my attempts to produce leverage diagnostics for individuals in populations analysed using NONMEM.

James

At 10:17 AM 10/30/98 -0800, you wrote:
>PROPORTIONAL to prob, not = prob.
>
>James wrote:
>>
>> Dear Professor Sheiner,
>>
>> This agrees with my understanding of the situation. However, my
>> understanding of likelihood is that it is a probability, and therefore must
>> be between 0 and 1.
>>
>> James
>>
>> At 08:43 PM 10/29/98 -0800, you wrote:
>> >The objective function is not a sum of squares, it is
>> >-2 times the log of the likelihood. The likelihood, in
>> >simple normal problems is a sum of squares. If that sum
>> >is >1 then -2 log likelihood will be negative.
>> >
>> >The likelihood in NONMEM is usually more complicated
>> >than a simple sum of squares, but it may still be
>> >>1 and hence the obj fn be negative. The absolute
>> >value of the obj fn is meaningless, as a likelihood
>> >is only defined up to an arbitrary proportionality
>> >constant. Only differences
>> >between obj functions of nested models are meaningful.
>> >
>> >LBS.
>> >
>> >

****

From: James <J.G.Wright@ncl.ac.uk>
Subject: Re: Negative objective functions
Date: Sun, 01 Nov 1998 14:18:33 +0000

I am inclined to agree that this isn't important; although your objective function will tend to minus infinity as your sample size tends to infinity, this is only a technical nuisance. My query is essentially about why this constant is there. If you let your likelihood have the property of being the probability of the data given the model (i.e. drop the constant), it then provides some indication of how likely the data are to have arisen under your model (and makes it possible to calculate "individual likelihoods" more easily and to compare subsets of the population for compatibility with the model). It had occurred to me that this number might be some kind of compensatory factor for the oddity that adding an observation always makes the data less likely to have arisen from your model; however, none of the replies to the list have indicated that this is the case.

James

At 11:54 AM 10/31/98 -0800, you wrote:
>There is nothing about a likelihood having an
>arbitrary proportionality constant that affects consistency or
>any of the other properties of ML estimates that can be
>proved.
>
>LBS.