From: "Piotrovskij, Vladimir [JanBe]" <>
Subject: Validation
Date: Wed, 3 Feb 1999 13:11:10 +0100

Dear all,
It was nice to see such an intensive discussion of an issue that hardly deserves it. I would suggest another topic, to me much more essential, and I believe the exchange of opinions will be even more intensive:


FDA's Guidance for Industry "Population Pharmacokinetics", which is still a draft (as far as I know), says: "The issue of validation of population models remains unresolved".

The list of validation methods includes:
- Prediction errors on concentrations: "This method can be used when only one sample per subject is obtained. ... The method does not take into account the correlation of observations within subjects"
- Standardized prediction errors: "The use of the approach is discouraged" (when and by whom?)
- Validation through parameters: What are the 'true' individual parameters for a subject sampled 2 or 3 times when the model has, say, 5 structural parameters?
- Plot of residuals against covariates: Just a visualizing technique?
- Plot of residuals against covariates and Validation through parameters: Bootstrap is recommended with at least 200 replicates. It will take months for a model of medium complexity with a data set typical for industry (N in the hundreds of subjects) unless you have access to a supercomputer.
- Posterior predictive check: Does anybody have experience with this new technique in the field of population PK-PD?
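
For reference, the first item on the list (prediction errors on concentrations) is commonly summarized by a mean prediction error (bias) and a root mean squared error (precision). A minimal Python sketch, assuming paired observed and model-predicted concentrations are available; the function name and the numbers are illustrative and are not taken from the guidance:

```python
import math

def prediction_error_summary(observed, predicted):
    """Return (mean prediction error, root mean squared error)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    # Prediction error is taken here as predicted minus observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    mpe = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mpe, rmse

# Hypothetical concentrations, one sample per subject.
obs = [10.0, 8.0, 5.0, 2.0]
pred = [11.0, 7.0, 5.0, 3.0]
mpe, rmse = prediction_error_summary(obs, pred)
```

Note that this pools all observations and treats them as independent, which is exactly the limitation quoted above when several samples come from the same subject.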

Any comments?

Vladimir Piotrovsky
Clinical Pharmacokinetics, ext 5463
Janssen Research Foundation
2340 Beerse, Belgium




From: Pascal Girard <>
Subject: Re: Validation
Date: Wed, 03 Feb 1999 17:24:57 +0100

Dear Vladimir,

Just a note on posterior predictive check (PPC).

I used this technique for validating the compliance model. It is described in the recently published paper:

Girard, P., Blaschke, T.F., Kastrissios, H., and Sheiner, L.B. A Markov mixed effect regression model for drug compliance. Stat.Med. 17(20):2313-2334, 1998,

and I recently posted the NONMEM code for the compliance model on the NONMEM repository in Palo Alto, but not the code for the PPC, since it is a complex mixture of Splus, UNIX C-shell and NONMEM.

We found the idea of the PPC in an outstanding (from my point of view) paper:

Belin, T.R. and Rubin, D.B. The analysis of repeated-measures data on schizophrenic reaction times using mixture models. Stat.Med. 14:747-768, 1995.

The appealing idea with PPC is that you validate your model using a statistic that is not sufficient to describe your data but that may directly interest the clinician. For example, with the compliance model, I used the posterior distribution of either the longest drug holiday or the non-therapeutic coverage, which speaks much more to a clinician than telling him that you have 30% interindividual variability in the logit of the marginal probability of not taking the treatment. For a population PK model, you can imagine looking at the distribution of the % of time during which the concentrations are within a therapeutic window, which once again can be more interesting than knowing you have 70%, 30% and 40% interindividual variability on Ka, CL and V and 50% residual variability.
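
To make the PK example concrete, here is a minimal Python sketch of such a statistic for one subject. It assumes a one-compartment model with first-order absorption, a single oral dose, complete bioavailability, and Ka different from CL/V; the function names, the window limits and all parameter values are hypothetical:

```python
import math

def concentration(t, dose, ka, cl, v):
    """One-compartment model, first-order absorption, single oral dose."""
    ke = cl / v  # elimination rate constant; must differ from ka
    return dose * ka / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))

def fraction_time_in_window(dose, ka, cl, v, low, high, t_end=24.0, n=2400):
    """Fraction of equally spaced time points in (0, t_end] with low <= C <= high."""
    inside = 0
    for k in range(1, n + 1):
        c = concentration(t_end * k / n, dose, ka, cl, v)
        if low <= c <= high:
            inside += 1
    return inside / n

# Hypothetical subject: dose 100, Ka 1 /h, CL 5 L/h, V 50 L, window 0.5 to 1.5.
frac = fraction_time_in_window(dose=100.0, ka=1.0, cl=5.0, v=50.0, low=0.5, high=1.5)
```

In a PPC, this fraction would be computed for every subject in each simulated data set and summarized (for instance by its median) to give one value of the statistic per replicate.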

So the idea of PPC is to simulate the posterior distribution of a (non-sufficient) statistic (NSS) and to compare this distribution with the statistic observed on your actual data. If there is no contradiction between the two, you accept your model. You can imagine a model for which one statistic, NSS1, is in agreement with the posterior distribution simulated using the model, while another statistic, NSS2, is in contradiction with it. This may not be a problem if what is really important for the clinician is NSS1.
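
The comparison step can be sketched in a few lines of Python. This is only an illustration: the two-sided rule and the 5% level used to decide that there is a "contradiction" are assumptions of this sketch, not something prescribed by the papers cited above, and the numbers are made up:

```python
def ppc_tail_proportions(observed_stat, simulated_stats):
    """Proportions of simulated statistics at or below / at or above the observed one."""
    n = len(simulated_stats)
    if n == 0:
        raise ValueError("need at least one simulated statistic")
    lower = sum(1 for s in simulated_stats if s <= observed_stat) / n
    upper = sum(1 for s in simulated_stats if s >= observed_stat) / n
    return lower, upper

def contradicts(observed_stat, simulated_stats, alpha=0.05):
    """Flag a contradiction when the observed statistic falls in an extreme tail."""
    lower, upper = ppc_tail_proportions(observed_stat, simulated_stats)
    return min(lower, upper) < alpha / 2

# Hypothetical simulated values of a statistic (e.g. longest drug holiday, in days).
simulated = [float(k) for k in range(1, 101)]
central = contradicts(50.0, simulated)    # observed value in the middle of the distribution
extreme = contradicts(150.0, simulated)   # observed value beyond every simulated value
```

A histogram of the simulated statistic with the observed value marked on it conveys the same information and is usually easier to show to a clinician.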

From a technical point of view, in order to do PPC, you need the posterior distribution of <<all>> parameters, fixed and random, of your population model. And here is the difficulty.

NONMEM does not give you the posterior distribution of <<all>> parameters, fixed and random, because you do not define any prior distribution. For THETAs you can approximate the posterior distribution by supposing it is (multivariate) normal (MVN) with mean equal to THETA, the final estimate, and covariance given by the asymptotic covariance matrix obtained using $COV. But even in this case, when you sample from this distribution you may find negative parameters (e.g. CL or V), because the normal distribution is not constrained to be positive. So you will either have to truncate your resampled parameters or suppose that your parameter is log-normally distributed. The problem is even worse when you want to resample the OMEGA and SIGMA matrices using an MVN with mean OMEGA and SIGMA, the final estimates, and variance given by the asymptotic covariance matrix.
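
The negative-parameter problem is easy to reproduce. The Python sketch below draws (CL, V) pairs from a bivariate normal and counts the non-positive draws, then repeats the sampling on the log scale so that every draw is positive. The estimates and covariance matrix are hypothetical, and the log-scale covariance uses a first-order (delta-method) approximation, which is an assumption of this sketch rather than anything NONMEM reports:

```python
import math
import random

def sample_mvn2(mean, cov, rng):
    """Draw one sample from a bivariate normal via a 2x2 Cholesky factor."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[0][1] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2]

def to_log_scale(theta, cov):
    """Delta-method approximation of the mean and covariance of log(theta)."""
    log_mean = [math.log(t) for t in theta]
    log_cov = [[cov[i][j] / (theta[i] * theta[j]) for j in range(2)] for i in range(2)]
    return log_mean, log_cov

rng = random.Random(1999)
theta = [5.0, 50.0]                 # hypothetical final estimates of CL and V
cov = [[9.0, 30.0], [30.0, 400.0]]  # hypothetical, deliberately imprecise covariance

# Sampling on the natural scale: some draws are not physiologically possible.
raw = [sample_mvn2(theta, cov, rng) for _ in range(2000)]
n_negative = sum(1 for cl, v in raw if cl <= 0 or v <= 0)

# Sampling on the log scale and back-transforming: all draws are positive.
log_mean, log_cov = to_log_scale(theta, cov)
positive = [[math.exp(x) for x in sample_mvn2(log_mean, log_cov, rng)]
            for _ in range(2000)]
```

With precisely estimated parameters the negative draws become rare, but for variance components such as OMEGA and SIGMA the normal approximation is typically much poorer, as noted above.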

Another solution is to use a fully Bayesian method with MCMC algorithms, such as the ones implemented in POPKAN or the less expensive PHARM-BUGS. These programs allow you to define the prior distribution of all parameters, estimate the posterior distributions, and then compute the posterior distribution of any statistic you want.

When you don't have access to this Bayesian technique, or you don't like it, you can simulate the posterior distribution using a parametric bootstrap, as we did for the compliance model. Briefly, let Y be your observed data set; M the final model fitted to Y; TOS=(THETA, OMEGA, SIGMA) the final parameter estimates; and S(Y) a statistic computed on Y. You can approximate the posterior distribution of S by doing the following:

1. Simulate a new set of observations Y* using THETA, etas sampled from MVN(0,OMEGA) and errors sampled from MVN(0,SIGMA)

2. Fit M to Y* and get new estimates TOS*

3. Simulate a new set of observations Y** using THETA*, etas sampled from MVN(0,OMEGA*) and errors sampled from MVN(0,SIGMA*)

4. Compute S(Y**)

Repeat steps 1-4 a great number of times (at least 100). S is computed on Y** and not on Y* (which would be less CPU intensive ...) in order to approximate the posterior distribution of TOS given the data. Notice also that steps 1 and 3 are easily implemented within NONMEM using $SIMUL.
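
The four steps can be sketched in Python on a toy model. Everything specific here is an assumption made for illustration: the model is y_ij = THETA + eta_i + eps_ij with scalar variances OMEGA and SIGMA, a method-of-moments fit stands in for the NONMEM estimation, and the statistic S is the fraction of observations above a hypothetical threshold:

```python
import random
import statistics

def simulate(theta, omega, sigma, n_subj, n_obs, rng):
    """Simulate y_ij = theta + eta_i + eps_ij; omega and sigma are variances."""
    data = []
    for _ in range(n_subj):
        eta = rng.gauss(0.0, omega ** 0.5)
        data.append([theta + eta + rng.gauss(0.0, sigma ** 0.5) for _ in range(n_obs)])
    return data

def fit(data):
    """Method-of-moments stand-in for the model fit (not what NONMEM does)."""
    n_obs = len(data[0])
    means = [statistics.fmean(subj) for subj in data]
    theta = statistics.fmean(means)
    sigma = statistics.fmean([statistics.variance(subj) for subj in data])
    omega = max(statistics.variance(means) - sigma / n_obs, 0.0)
    return theta, omega, sigma

def stat(data, threshold=12.0):
    """S(Y): fraction of observations above a hypothetical threshold."""
    values = [y for subj in data for y in subj]
    return sum(1 for y in values if y > threshold) / len(values)

def ppc_distribution(tos, n_subj, n_obs, n_rep, rng):
    draws = []
    for _ in range(n_rep):
        y_star = simulate(*tos, n_subj, n_obs, rng)        # step 1: Y* from TOS
        tos_star = fit(y_star)                             # step 2: TOS* from Y*
        y_2star = simulate(*tos_star, n_subj, n_obs, rng)  # step 3: Y** from TOS*
        draws.append(stat(y_2star))                        # step 4: S(Y**)
    return draws

rng = random.Random(42)
observed = simulate(10.0, 4.0, 1.0, 50, 4, rng)  # stands in for the real data Y
tos_hat = fit(observed)                          # final estimates TOS
draws = ppc_distribution(tos_hat, 50, 4, 200, rng)
s_obs = stat(observed)
tail = sum(1 for d in draws if d >= s_obs) / len(draws)
```

The refit in step 2 dominates the cost: here it is a few arithmetic operations, whereas with a real population model each replicate requires a full estimation run.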

Concerning CPU, the PPC results presented in the Stat. Med. paper for the compliance model took 1 week of computation using 3 SUN Sparcstations and one UltraSparc (this one produced more than 60% of the iterations). No comments!

So there is room either for improved, less CPU-intensive methodology for PPC, or for CPU improvement (UltraPENTIUM III, 450 GigaHz), and probably for both ...

I'm not sure all this helps industry today. But why not tomorrow ...



Pascal Girard
Service Pharmacologie Clinique
BP 3041,162, avenue Lacassagne
69394 LYON Cedex 03 FRANCE
Tel : +33 (0)4 78 78 57 26
Fax : +33 (0)4 78 78 57 19




From: Mats Karlsson <>
Subject: Re: Validation
Date: Thu, 04 Feb 1999 11:27:07 +0100

I agree with Pascal that proper validation should be performed with the purpose (or future use) of the population model in mind. If not, formal validation doesn't make much sense. My experience of industry is that the intended purpose of the model is seldom clear enough when the validation procedure is decided, and therefore the validation is better viewed as a model diagnostic (often excessively time-consuming compared to the increased/decreased trust in the model obtained). Maybe, as Pascal writes, things will change.

Best regards,




From: "BRUNO, Rene" <Rene.BRUNO@RP-RORER.FR>
Subject: RE: Validation
Date: Thu, 4 Feb 1999 15:03:03 +0100

Hi everybody, I am not sure I agree with Mats that the purpose of the model is seldom clear ... I believe that one of the objectives of pop PK in premarketing trials is drug labeling, and the purpose is the identification of subpopulations at risk because of altered PK. The risk of course depends on both the magnitude of the PK (CL) change and the consequences of such a change on effect (PK/PD). Let's focus on PK. In this context what matters is to validate a predicted decrease of CL in a subpopulation with some (often extreme) covariate value (say in older patients, patients with liver disease ...), and this is what we tried to do in the reference quoted in the draft guidance (ref 44, JPB, 24(2), 153, 1996) under the heading validation through parameters. The approach is based on the assessment of the performance of the (index set) model in predicting CL in the particular subpopulation (taken in the validation set).

I agree with Vladimir that the issue here is the estimation of 'true' individual parameters of the validation set patients based on sparse data. It is my experience that, provided that the design is informative enough in individual patients, the performance of empirical Bayesian estimation is quite robust to the priors. At the extreme, if the 'true' value estimates were strongly biased toward the mean of the priors (which is the risk), then the model would appear invalid even if it is valid, which is conservative. I don't think the model could appear valid if it is not.

Note that this approach requires as many patients in the validation set as in the index set (just to have enough patients in subpopulations with extreme covariate values). An alternative would be to independently rebuild the model from the validation set, and I would bet that you would come to a very similar final model (if it was valid) with similar parameter estimates and similar inferences regarding the subpopulations that matter (i.e. those with a change in CL of sufficient magnitude to be clinically relevant) ... this approach would not rely on the estimation of individual parameters ... unless you are using them for model building.

Drug Metabolism and Pharmacokinetics
Pharmacometry Unit - Box 58
Rhône-Poulenc Rorer Recherche-Developement
20, Av. Raymond Aron
921645 Antony cedex - France