From: David Foster david.foster@adelaide.edu.au
Subject: [NMusers] LLR test, AIC, BIC
Date: 10/21/2003 8:06 PM

Hello all,

I'm not so new to PK and popPK (but I am a recent NONMEM user), so hopefully
this question isn't too silly.  I have read a thread posted some time ago
on this topic, but it raised a few questions for me.

AIC= obj. fun + 2*NPAR
BIC= obj. fun + NPAR* log(NOBS)

As I understand it, log(NOBS) is the natural log, and NOBS is the number of
data points. Is this the number of concentration-time points or the number of subjects?

To examine the impact of including a covariate (no eta, just a new theta),
I understand the use of the LLR test just fine.  But what about, for
example, going from a 1-compartment to a 2-compartment model, where there are
new thetas and matching etas?  So my question is: does the "number of parameters"
include thetas, etas and sigmas?  This also has an impact on the AIC and BIC
calculations, as different results are obtained if one only counts thetas...

In addition, it seems quite possible that a drop in OBJ that gives a significant
"yes" to a 2-compartment oral over a 1-compartment oral model in the LLR test
(a 9.5-unit drop is needed counting thetas and etas; 6 units counting thetas only)
will often be supported by the AIC, but not by the BIC, where a much larger drop
is needed, especially if the number of observations is large.  This is to be
expected, but how do people rationalize this in their model selection?  The use
of diagnostic plots and other statistics (MPE, RMSPE, etc.) is no doubt important here.
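To put rough numbers on this (a hypothetical comparison; the extra-parameter count of 4 and NOBS = 200 are made up purely for illustration):

```python
import math

# Hypothetical numbers: going from 1-cmt to 2-cmt adds 2 thetas + 2 etas = 4
# parameters; NOBS = 200 is an invented data set size.
d_npar = 4
nobs = 200

# LLR test: the chi-square critical value at alpha = 0.05 is 9.49 for 4 df
# (thetas + etas) and 5.99 for 2 df (thetas only), i.e. the 9.5 and 6 above.
llr_cutoff = 9.49

# The AIC prefers the bigger model when the OBJ drop exceeds 2 * extra NPAR.
aic_cutoff = 2 * d_npar                # = 8

# The BIC needs a drop of extra NPAR * ln(NOBS), which grows with the data set.
bic_cutoff = d_npar * math.log(nobs)   # ~21.2 here

print(aic_cutoff, llr_cutoff, round(bic_cutoff, 1))
```

So with these invented numbers a 9.5-unit drop passes the LLR test and the AIC, but falls well short of the BIC threshold.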

I have also seen a correction to the AIC:

AICc = obj. fun + 2*NPAR + (2*NPAR*(NPAR+1))/(NOBS-NPAR-1)

Does anyone use this, and does it also apply to the BIC?
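For what it's worth, here is how I would code the three criteria (a Python sketch; whether NPAR should count thetas only or thetas, etas and sigmas is exactly what I am asking):

```python
import math

def aic(obj, npar):
    # obj is the NONMEM objective function value; npar the number of
    # estimated parameters (however one decides to count them).
    return obj + 2 * npar

def bic(obj, npar, nobs):
    # log here is the natural log, as I understand it.
    return obj + npar * math.log(nobs)

def aicc(obj, npar, nobs):
    # Small-sample correction to the AIC.
    return aic(obj, npar) + (2 * npar * (npar + 1)) / (nobs - npar - 1)
```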

I do understand that one must look at a range of other factors/plots/statistics
(of which I am well aware), but the LLR and AIC/BIC may be useful, and I would
appreciate the group's input.

Regards,

David

(that's the "other David Foster", Nick)

--
David Foster, PhD
NHMRC Research Officer
Department of Clinical and Experimental Pharmacology
Faculty of Health Sciences
Tel: +61 08 8303 5985
Fax: +61 08 8224 0685

_______________________________________________________

From: Nick Holford n.holford@auckland.ac.nz
Subject: Re: [NMusers] LLR test, AIC, BIC
Date: 10/21/2003 8:31 PM

David (the other),

My advice is not to waste your time with the AIC, LLR etc. if you are
using NONMEM. If you want to know the true null distribution for
an objective function change, then you should be prepared to estimate
it using the randomization test.

In your example this means fitting the original data with a one
compartment model and a two compartment model and recording the
delta OBJorg. Then use the one compartment parameter estimates to
simulate say 1000 data sets (the randomization part). Fit each of
these data sets to a one compartment model and a two compartment
model. Look at the distribution of the 1000 delta OBJ values to
find the probability that you would have observed delta OBJorg
under the null hypothesis. This is an estimate of the true P value
for falsely rejecting the null (the test part).
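In miniature, the algorithm looks like this. The toy Python below substitutes an intercept-only vs intercept-plus-slope least-squares pair for the one and two compartment NONMEM fits; everything in it (the models, the data, the replicate count) is illustrative, not a NONMEM workflow:

```python
import math, random

def neg2ll(rss, n):
    # -2 log likelihood for a normal model with MLE variance, up to a constant
    return n * math.log(rss / n)

def fit_null(x, y):
    # "one compartment" stand-in: intercept-only model, returns RSS
    m = sum(y) / len(y)
    return sum((yi - m) ** 2 for yi in y)

def fit_alt(x, y):
    # "two compartment" stand-in: intercept + slope by least squares, returns RSS
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

def delta_obj(x, y):
    n = len(y)
    return neg2ll(fit_null(x, y), n) - neg2ll(fit_alt(x, y), n)

random.seed(1)
n, reps = 50, 1000
x = [i / n for i in range(n)]

# 1. "Original" data generated under the null (no slope), and its delta OBJ.
y_orig = [random.gauss(0.0, 1.0) for _ in range(n)]
d_orig = delta_obj(x, y_orig)

# 2. Simulate many data sets from the fitted null model; fit both models to each.
null_dist = sorted(delta_obj(x, [random.gauss(0.0, 1.0) for _ in range(n)])
                   for _ in range(reps))

# 3. Empirical P value: how often a simulated delta OBJ beats the original one.
p = sum(d >= d_orig for d in null_dist) / reps
print(round(p, 3))
```

With NONMEM, steps 1 to 3 are the same, except that each "fit" is an estimation run and delta OBJ is the difference in objective function values.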

Whether the time spent doing the randomization test is a better waste of
time than worrying about the AIC, LLR etc. is up to you.

Nick

Nick Holford, Dept Pharmacology & Clinical Pharmacology
University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand
email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556
http://www.health.auckland.ac.nz/pharmacology/staff/nholford/
_______________________________________________________

From: Paul Hutson prhutson@pharmacy.wisc.edu
Subject: Re: [NMusers] LLR test, AIC, BIC
Date: 10/21/2003 9:36 PM

Nick:
You always make it sound easy.  Do you do these 1- and 2-compartment fits,
then 1000 simulations followed by fits to the 1- and 2-compartment models,
followed by statistical tests, all in one batch program?  Even if not, are
there sites in the NONMEM archive that have sample control streams for those
such as me who struggle with these?
Paul
_______________________________________________________

From: Leonid Gibiansky lgibiansky@emmes.com
Subject: Re: [NMusers] LLR test, AIC, BIC
Date: 10/22/2003 10:06 AM

Nick,

This randomization method might be the most appropriate for the problem, but
even with the most advanced hardware/software combination you will not be able
to apply this procedure to each model-comparison step of the modeling process
(the number of these steps can easily run into the dozens for the base model and
the hundreds for the covariate model, with each step running anywhere from a few
minutes to a few hours). Therefore, there should be a way (you may call it a quick
and dirty way) to decide whether to accept the model or make it more complicated.
Then the question is not whether to use some quick criteria based on the objective
function or to use the randomization test, but rather which of the criteria
(LLR, AIC, BIC) to use and how to compute them correctly. Correctly here means
"with the highest probability that the crude criteria will correctly approximate
the true distribution". It would be interesting to compare the crude approach with
the randomization approach to extract some recommendations on when and how to use
the crude approach.

Leonid
_______________________________________________________

From: "Hutmacher, Matt" mhutmach@amgen.com
Subject: Re: [NMusers] LLR test, AIC, BIC
Date: 10/22/2003 3:12 PM

Hello all,

I certainly agree that what Nick has proposed below is one way to proceed.
However, I would like to comment on his suggestion.

The algorithm that Nick proposes below, in my mind, is not technically a
randomization test.  In my experience, the randomization test randomly
permutes the "actual data" to establish the distribution of the null
hypothesis test.  What Nick is doing is "simulating data" to assess what the
LRT would look like if the data were generated by the 1 compartment model
that he describes.

Since the data are simulated from a model that was based on a fit to the
observed data, I would argue that what is being simulated is not the true
null distribution but an approximation to it.  Thus the cut-off value that
you select is a prediction.  While I believe that this is one way to
proceed, the method is not as absolute (in my mind) as Nick suggests.  The
assessment of type 1 error of 5% is a prediction not a truth.

Perhaps a simpler way to proceed here is to look at a few diagnostics.
Check the model's condition number to see that the model is not
over-parameterized.  Check that the theta estimates don't indicate the
2-cmt model is trying to collapse into a 1-cmt model.  Look at the WRES and
IWRES plots versus time to make sure the 1-cmt model adequately
characterizes the peak concentrations and the tail (no time trends).  If
you are still unsure, simulate from the 2-cmt and 1-cmt models and see which
better reproduces the data (a small posterior predictive check).
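For a 2x2 block of the correlation matrix of the estimates, the condition-number check can even be done in closed form (an illustrative sketch; the numbers, and the commonly quoted "greater than 1000" rule of thumb for over-parameterization, are only indicative):

```python
import math

def cond_2x2(a, b, c):
    # Condition number (ratio of largest to smallest eigenvalue) of the
    # symmetric 2x2 matrix [[a, b], [b, c]], e.g. a correlation submatrix
    # of two parameter estimates.
    mean = (a + c) / 2
    half = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return (mean + half) / (mean - half)

print(round(cond_2x2(1.0, 0.2, 1.0), 2))    # mild correlation: 1.5
print(round(cond_2x2(1.0, 0.999, 1.0), 0))  # near-collinear: roughly 2000
```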

Matt
_______________________________________________________

From: Nick Holford n.holford@auckland.ac.nz
Subject: Re: [NMusers] LLR test, AIC, BIC
Date: 10/22/2003 5:21 PM

Matt,

Thanks for your comments. I am aware of the 'true' randomization test method
that you refer to (e.g. see http://wfn.sourceforge.net/wfnrt.htm). I agree
that a method based on randomization of the actual data is what Fisher
originally proposed, e.g. the 'Fisher exact test'. However, I do not know
of a way to perform this kind of randomization to investigate the performance
of the LLR in distinguishing a one- vs two-compartment model.

The parametric bootstrap method I described to create an empirical null
distribution has been called by one group of authors the 'simulation
hypothesis test' (Gisleskog PO, Karlsson MO, Beal SL. Use of Prior Information
to Stabilize a Population Data Analysis. Journal of Pharmacokinetics & Biopharmaceutics
2003;29(5/6):473-505).

A randomization test, based strictly on the original data, is used to estimate the
probability of rejecting the null for *that specific set of data*. It is not a
generalized result. The 'simulation hypothesis test' more closely resembles the
asymptotic flavour of conventional statistical testing because it is obtained by
considering a large sample from the proposed distribution of typical data generated
from the null model.

I prefer not to use the term 'simulation hypothesis test' because it focusses
attention on hypothesis testing. The procedure can be seen in a broader context:
it is an algorithm for generating the null distribution of a (test) statistic.
This null distribution has several uses other than strictly doing a hypothesis
test with some arbitrary alpha criterion, e.g. it can be used to estimate the
true probability of the data arising under the null, to create a table of
lookup values for doing hypothesis testing, or to teach and learn about the
shape of distributions that are widely assumed to have certain shapes (but
these assumptions may be wrong). The NONMEM community has been exposed over
the last couple of years to the problems of assuming the chi-square distribution
for the null distribution of the LLR (especially with FO but also with FOCE).
If you need to make important modelling decisions using hypothesis testing with
the LLR, then I would encourage you to verify by experiment what null
distribution is required for your decision.

The other diagnostics you mention are of course valuable, and I would typically
rely more on a visual examination of the time course of observed and predicted
concentrations to make a decision on an individual data set. However, some
tasks, e.g. using clinical trial simulation to examine the power of designs,
require an automatable, objective decision criterion. I have been using the
randomization test to get better critical values for rejecting the null when
doing clinical trial simulation. This has had a major impact on the estimates
of power: critical values for LLR changes are often much larger than expected,
even using FOCE.

Nick
--
Nick Holford, Dept Pharmacology & Clinical Pharmacology
University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand
email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556
http://www.health.auckland.ac.nz/pharmacology/staff/nholford/
_______________________________________________________

From: Nick Holford n.holford@auckland.ac.nz
Subject: Re: [NMusers] LLR test, AIC, BIC
Date: 10/22/2003 7:39 PM

Leonid,

The simulation form of the randomization test can be made as
crude as you wish to meet your own correctness criterion,
"with the highest probability that crude criteria will correctly
approximate the true distribution". The crudity (or precision) is
determined simply by the number of replications.

I am not advocating that the randomization test be used at every stage
of model building. A 10-point change in LLR for one parameter using FOCE,
along with other diagnostic information, is a practical approach while
learning about a drug. But when the focus of the drug development
modelling is to support confirming rather than learning,
then attention should be paid to the assumptions made when doing
hypothesis testing.

BTW I cannot agree with your generalization "even with the most
advanced hardware/software combination you will not
be able to apply this procedure ...". As other recent posts to
nmusers have indicated it is quite *possible* to apply the procedure
to many problems but I would not consider it an effective use of time
and resources for model building.

Nick

--
Nick Holford, Dept Pharmacology & Clinical Pharmacology
University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand
email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556
http://www.health.auckland.ac.nz/pharmacology/staff/nholford/
_______________________________________________________

From: David Foster david.foster@adelaide.edu.au
Subject: Re: [NMusers] LLR test, AIC, BIC
Date: Thursday, October 23, 2003 1:44 AM

Dear all,

I seem to have stimulated a bit of discussion!  However, a few of my
original points have been left out of the ensuing discussion.  I
understand the limitations of the LLR, AIC etc., but I would still like
to be able to calculate them, given that it doesn't really take much
effort (well, sort of...).

Here are my questions:

I have read a thread posted some time ago on this topic, but it raised a
few questions for me.
If:
AIC= obj. fun + 2*NPAR
BIC= obj. fun + NPAR* log(NOBS)

As I understand it, log(NOBS) is the natural log, and NOBS is the number of
data points. Is this the number of concentration-time points or the number
of subjects?

To examine the impact of including a covariate (no eta, just a new
theta), I understand the use of the LLR test etc. just fine.  But what
about, for example, going from a 1-compartment to a 2-compartment model,
where there are new thetas and matching etas?

***So my question is: does the "number of parameters (NPAR)" include
thetas, etas and sigmas?  This also has an impact on the AIC and BIC
calculations, as different results are obtained if one only counts
thetas...***

I have also seen a correction to the AIC:

AICc = obj. fun + 2*NPAR + (2*NPAR*(NPAR+1))/(NOBS-NPAR-1)

Does anyone use this, and does it also apply to the BIC?

Regards,

David

--

David Foster, PhD
NHMRC Research Officer
Department of Clinical and Experimental Pharmacology
Faculty of Health Sciences
Tel: +61 08 8303 5985
Fax: +61 08 8224 0685

CRICOS Provider Number 00123M
_______________________________________________________

From: "Kowalski, Ken" Ken.Kowalski@pfizer.com
Subject: Re: [NMusers] LLR test, AIC, BIC
Date: 10/23/2003 8:34 AM

David,

NPAR is the total number of parameters estimated in the likelihood including
all the parameters in Theta, Omega and Sigma.

NOBS is the total number of observations NOT the total number of subjects.

See Vonesh & Chinchilli, Linear and Nonlinear Models for the Analysis of
Repeated Measures, Marcel Dekker, 1997, p. 262.

Ken
_______________________________________________________

From: "Bonate, Peter" pbonate@ilexonc.com
Subject: Re: [NMusers] LLR test, AIC, BIC
Date: 10/23/2003 8:43 AM

David,
I don't know if you ever got anybody to answer you, but here goes.

The AIC was originally predicated on independent observations, as in
linear regression.  I am not sure it was ever really validated for
repeated-measures data, but people use it all the time.  NOBS is the total
number of observations.

NPAR is the number of estimable parameters: all thetas, etas, and sigmas,
including covariances.

BIC tends to pick the simpler model more often than AIC or AICc.

The AICc was intended for small sample sizes, and I don't believe it really
applies to pop PK models.  NOBS is usually much larger than NPAR, so the
second-order correction term is practically zero.
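A quick check of the magnitudes (the NOBS and NPAR values here are made up for illustration):

```python
def aicc_correction(npar, nobs):
    # The extra term the AICc adds on top of the AIC.
    return (2 * npar * (npar + 1)) / (nobs - npar - 1)

print(aicc_correction(10, 1000))  # ~0.22, negligible next to 2*NPAR = 20
print(aicc_correction(10, 30))    # ~11.6, no longer ignorable
```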

An excellent book on this is Model Selection and Inference by Burnham
and Anderson.  This is a must read.

Hope this helps,

pete bonate
_______________________________________________________

From: Robert L. James rjames@rhoworld.com
Subject: Re: [NMusers] LLR test, AIC, BIC
Date: 10/23/2003 9:00 AM

David,

I regularly use the BIC (along with graphical examination, convergence of
standard errors, and physiological intuition) as a criterion in model
development.  I prefer the BIC because it is more parsimonious than the AIC.
I always try to err on the side of a simpler model.

For NPAR I use all parameters estimated by the model (thetas, etas, sigmas).
For NOBS I use the number of concentration-time points.

Robert

_______________________________________________________
