From: "Muralidharan, Bharath" Bharath.Muralidharan@stjude.org
Subject: [NMusers] Coding for missing data values   
Date: Mon, June 28, 2004 11:32 am

Dear NONMEM Users,

 

Let me introduce myself as Kumar. I am a graduate student
in the Department of Biomedical Engineering at the UT Health
Science Center – Memphis. Is there a way in which I can code
for missing data relating to a possible covariate? For example,
I have Technetium clearance for only a few patients
and do not have it for most of the rest of the data set. How do I code for
the fact that the data item is missing in a few individuals?
I assume that NONMEM reads a missing entry as a zero value rather than
as a missing value.

 

Kumar
Graduate Student
Pharmaceutical Sciences Department
St. Jude Children's Research Hospital
332 North Lauderdale Street
Memphis, TN 38105
Danny Thomas Research Center
_______________________________________________________

From: "Bachman, William (MYD)" bachmanw@iconus.com
Subject: RE: [NMusers] Coding for missing data values   
Date: Mon, June 28, 2004 12:20 pm   

There are a number of ways you can do this:
 
1. Simply code separate parameters for those with and
without the covariate.
 
; TECL is set to zero in the data file for subjects with a missing value
IF(TECL.EQ.0) THEN
   CL=THETA(1)                      ; TECL missing
ELSE
   CL=THETA(2)+(TECL-5.4)*THETA(3)  ; 5.4 might be the mean TECL
ENDIF
 
2. Impute the missing covariate. Again, there are a number of ways this can be
done, e.g. the simplest is to use the population mean for the missing subjects, or
devise a more complex imputation scheme, possibly based on the relationship
between the covariate and other available covariates.
 
The second way is probably more prone to inducing bias in the model; the first
way is possibly less explanatory of variance.
 
William J. Bachman, Ph.D. 
Manager, Pharmacometrics Research and Development 
GloboMax® 
The Strategic Pharmaceutical Development Division of ICON plc 
7250 Parkway Drive, Suite 430 
Hanover, MD 21076 
410-782-2212 
bachmanw@iconus.com 
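
A minimal sketch of the simplest form of method 2 (mean imputation), written in the same style as the method 1 code. It assumes, as in method 1, that a missing TECL is coded as 0 in the data file and that 5.4 is the population mean. TECLI, TVCL and the ETA(1) term are names introduced here for illustration only and do not come from the original post.

```
$PK
TECLI=TECL                          ; observed covariate value
IF(TECL.EQ.0) TECLI=5.4             ; assumption: 0 flags missing TECL, 5.4 is the mean
TVCL=THETA(1)+(TECLI-5.4)*THETA(2)  ; centred linear covariate model
CL=TVCL*EXP(ETA(1))                 ; hypothetical between-subject variability term
```

With this coding THETA(1) is the typical CL both for a subject at the mean TECL and for a subject with missing TECL, so one fewer THETA is estimated than in method 1.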
_______________________________________________________

From: Nick Holford n.holford@auckland.ac.nz   
Subject: RE: [NMusers] Coding for missing data values   
Date: Mon, June 28, 2004 7:40 pm

Bill,

In your first method you propose estimating THETA(1) for CL when TECL is missing and
THETA(2) for CL when TECL is equal to the mean TECL.  If TECL is missing then
wouldn't the simplest thing be to assume that TECL is equal to the mean TECL (e.g.
5.4) in which case THETA(2) is the prediction for CL if TECL is missing? This only
requires estimation of one THETA instead of two.

If I understand the second method you are proposing correctly then it shouldn't be
any worse than method 1 and in general will be better. If observed TECL is used as a
DV with DVID.EQ.2 and observed CONC has DVID.EQ.1 then I would suggest the
following:

$THETA 10 ; POPCL
$THETA 5.4 ; POPTCL
$THETA 0.1 ; SLOPE
$OMEGA 0.25 ; PPV for CL
$OMEGA 0.01 ; PPV for POPTCL
$SIGMA 1 ; eps(1)
$SIGMA 0.01 FIX ; eps(2). Use a plausible value for the measurement error of TECL, e.g. SD=0.1

$PK
ITCL=THETA(2)*EXP(ETA(2)) ; individual prediction for TECL
GRPCL=THETA(1)*EXP((ITCL-5.4)*THETA(3)) ; group prediction for CL
CL=GRPCL*EXP(ETA(1)) ; individual CL prediction
...

$ERROR
IF (DVID.EQ.1) THEN
   Y=F+EPS(1) ; observed conc
ENDIF
IF (DVID.EQ.2) THEN
   Y=POPTCL+EPS(2) ; observed TECL
ENDIF

If population parameter variability for TECL [OMEGA(2,2)] is fixed to 0 then this
becomes essentially the same as your method 1 i.e. it uses the mean observed TECL to
centre the TECL covariate. If OMEGA(2,2) is estimated then the value of ITCL will
vary from subject to subject. Depending on how small the variance of EPS(2) is made, ITCL will
be close to the observed value when TECL is not missing. If it is missing then a
plausible value will be imputed that reflects the uncertainty in CL for that
individual given the particular covariate model using TECL. 

If I remember correctly this method for imputing missing covariates with NONMEM was
first proposed by Karlsson M, Jonsson E, Wiltse C, Wade J. Assumption testing in
population pharmacokinetic models: illustrated with an analysis of moxonidine data
from congestive heart failure patients. J Pharmacokinet Biopharm 1998;26(2):207-46.

Note the empirical covariate model for TECL uses EXP() to avoid predicting negative
values of GRPCL. If THETA(3) is 'small' then this model is approximately the same as
a linear function of TECL.

Nick
--
Nick Holford, Dept Pharmacology & Clinical Pharmacology
University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand
email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556
http://www.health.auckland.ac.nz/pharmacology/staff/nholford/
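
The remark that the exponential covariate model is close to a linear one can be made concrete with a first-order expansion. In this sketch the symbols \theta_1 and \theta_3 stand for THETA(1) and THETA(3), and the numeric values in the comments are illustrative assumptions, not estimates from any data set.

```latex
\begin{align*}
\mathrm{GRPCL} &= \theta_1 \exp\bigl((\mathrm{ITCL}-5.4)\,\theta_3\bigr) \\
               &\approx \theta_1 \bigl(1 + (\mathrm{ITCL}-5.4)\,\theta_3\bigr)
               \quad \text{when } \lvert(\mathrm{ITCL}-5.4)\,\theta_3\rvert \ll 1 .
\end{align*}
% Illustrative check with \theta_3 = 0.1 and ITCL = 6.4:
%   exponential factor: \exp(0.1) \approx 1.105
%   linear factor:      1 + 0.1 = 1.100
```

The exponential form stays positive for any ITCL, whereas the linear form turns negative once (ITCL-5.4)*THETA(3) falls below -1.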
_______________________________________________________

From: "Anthe Zandvliet" Apaza@SLZ.NL
Subject: RE: [NMusers] Coding for missing data values
Date: Tue, June 29, 2004


Nick,

Thank you for your suggestion on how to account for missing covariates.
I hope that I haven't misunderstood the code, but I suppose that the
$ERROR block should contain Y=ITCL+EPS(2) rather than Y=POPTCL+EPS(2).
Could you please let me know if I'm wrong? I will definitely try the
code you provided. Thanks again!

Anthe
_______________________________________________________
From: Nick Holford n.holford@auckland.ac.nz
Subject: RE: [NMusers] Coding for missing data values
Date: Tue, June 29, 2004 2:57 pm

Anthe,

Sorry about the mistake. You are correct. Of course, the
prediction for DVID.EQ.2 should be ITCL not POPTCL.

Note that you can make the code for prediction of ITCL fancier
if you want e.g. if you have WT then you could make an
allometric prediction. POPTCL would then be the pop TECL for a 70 kg subject.

POPTCL=THETA(2)
GRPTCL=POPTCL*(WT/70)**0.75
ITCL=GRPTCL*EXP(ETA(2)) ; individual prediction for TECL

Nick
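
Putting the correction together with the earlier control stream fragment, the relevant blocks might look like the sketch below. It assumes the data file has a DVID column (1 for concentrations, 2 for observed TECL) and that subjects without a TECL measurement have no DVID=2 record (or one flagged with MDV=1). POPCL and SLOPE are simply named copies of THETA(1) and THETA(3), taken from the comments on the $THETA records.

```
$PK
POPCL=THETA(1)
POPTCL=THETA(2)                     ; population TECL
SLOPE=THETA(3)
ITCL=POPTCL*EXP(ETA(2))             ; individual prediction for TECL
GRPCL=POPCL*EXP((ITCL-5.4)*SLOPE)   ; group prediction for CL
CL=GRPCL*EXP(ETA(1))                ; individual CL prediction
; ... remaining PK parameters as in the original model

$ERROR
IF (DVID.EQ.1) THEN
   Y=F+EPS(1)                       ; observed conc
ENDIF
IF (DVID.EQ.2) THEN
   Y=ITCL+EPS(2)                    ; observed TECL (ITCL, not POPTCL)
ENDIF
```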
_______________________________________________________


From: "Bachman, William (MYD)" bachmanw@iconus.com   
Subject: RE: [NMusers] Coding for missing data values   
Date: Mon, June 28, 2004 9:33 pm

Nick,

You've misunderstood the code.  In the second instance, the code is CENTERED
on the mean but INDIVIDUALIZED by TECL from the data file.

There is also a difference between assuming the mean and just allowing the
CL for missing TECL to be estimated at whatever value the data will dictate.

Bill
_______________________________________________________

From: Nick Holford n.holford@auckland.ac.nz
Subject: RE: [NMusers] Coding for missing data values   
Date: Mon, June 28, 2004 10:05 pm

Bill,

I don't think I have misunderstood your code. What I don't understand is why you
chose this code. Is this because you do not want to assume that the group clearance
in a subject with TECL missing is the same as the group clearance in a subject with
TECL equal to the mean TECL?

Would you please try to clarify your remarks about assuming the mean etc. by
referring explicitly to the comments I made earlier? Are your remarks related to
Method 1 or Method 2?

Thanks,

Nick
_______________________________________________________

From: "Bachman, William (MYD)"    
Subject: RE: [NMusers] Coding for missing data values   
Date: Tue, June 29, 2004 9:02 am

"Why did I choose this code?"  Actually, the bottom line is that I gave two
methods that are commonly employed: assuming the mean for the missing value
(or other imputation algorithm) or letting the data decide if this is a
valid assumption.  (remember that the original question asked was: how do
you code missing covariates?)  Choose whichever you want.  In actual
practice, I've most often assumed the mean for missing covariates and
frankly it usually has no significant effect on the model which method is
used.  

However, I've decided to be the devil's advocate in the finest Holfordesque
tradition.  There is no reason to assume that the missing values are random
and representative of the population; without additional prior information,
that assumption is not statistically rigorous.  E.g. they may have come from a
pediatric population (less blood drawn, fewer tests, high likelihood of
different parameters), from a site with less rigorous procedures (just
skipped the test, could also have less attention to sampling times resulting
in more variability), from a sicker subset, or any number of other
scenarios.  Assuming the mean for them introduces systematic bias under
these scenarios.  Allowing the parameter to be estimated could
prove/disprove the validity of the assumption.

The other reason for coding the way I did was the interpretation of the
thetas.  In retrospect, this is how I would code it today: 

IF(TECL.EQ.0) THEN
   TVCL=THETA(1) + THETA(2) + (COVn-x.x)*THETA(n) + ...
ELSE
   TVCL=THETA(1)+(TECL-5.4)*THETA(3) + (COVn-x.x)*THETA(n) + ...
ENDIF
CL=TVCL*EXP(ETA(1))

Then, theta(1) is "basically" the population typical value, theta(2) is
the difference in CL between those with and without measured TECL, and
theta(3), theta(n) represent the influence of TECL, COVn ...  At the
conclusion of the modeling exercise, test for significance of all thetas.
If any are not significant, remove them from the model.  (If theta(2) is not
significantly different from zero, the population without measured TECL can be
adequately represented by the mean TECL, so get rid of theta(2).)

Let the data drive the model to the simplest model rather than assuming it
a priori.  If a simpler model is warranted, the data will tell you that and
the prudent modeler will listen to the data.  Also, give 10 analysts a set
of data and you will get 10 differently coded models.

_______________________________________________________

From: Nick Holford 
Subject: RE: [NMusers] Coding for missing data values  
Date: Tue, June 29, 2004 9:42 pm 

Bill,

Thanks for explaining your approach. I agree with your overall strategy (not very
Holfordesque!) if you do not want to use the joint modelling method to describe the
missing covariate. 

Returning to being Holfordesque, I would quibble with the choice of an additive
model for all covariate effects. Unless one is careful these kinds of models can
lead to predictions of negative values which are usually unphysiological. I prefer
to use multiplicative covariate models for empirical covariate effects e.g.

POPCL=THETA(1)
KMISS=THETA(2)
KTECL=THETA(3)
KCOVN=THETA(4)
IF(TECL.EQ.0) THEN
   GRPCL=POPCL*EXP(KMISS)*EXP((COVn-x.x)*KCOVN)*EXP(...) ...
ELSE
   GRPCL=POPCL*EXP((TECL-5.4)*KTECL)*EXP((COVn-x.x)*KCOVN)*EXP(...) ...
ENDIF
CL=GRPCL*EXP(ETA(1))

In this particular case the TECL is probably being used to predict renal function in
which case an additive model would be mechanistically more appropriate. I would then
prefer to write:

PPCLNR=THETA(1) ; constrain this to be non-negative in $THETA
KMISS=THETA(2)
POPCLR=THETA(3)
KCOVN=THETA(4)
TCLSTD=5.4 ; or whatever value is appropriate for a standard renal function

IF(TECL.EQ.0) THEN
   RF=EXP(KMISS) ; KMISS.NE.0 means Renal Function is non-standard when TECL is missing
ELSE
   RF=TECL/TCLSTD ; RF.EQ.1 means this is standard Renal Function
ENDIF
GRPCL=(PPCLNR + RF*POPCLR)*EXP((COVn-x.x)*KCOVN)*EXP(...) ...
CL=GRPCL*EXP(ETA(1))

Nick
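
The non-negativity constraint mentioned in the code comment could be expressed with a lower bound on the $THETA record, as in the sketch below. It assumes the parameters are numbered THETA(1) to THETA(4) in the order PPCLNR, KMISS, POPCLR, KCOVN. All initial estimates are placeholders chosen for illustration, not values from the thread.

```
$THETA (0, 1)   ; THETA(1) PPCLNR: lower bound 0 keeps non-renal CL non-negative
$THETA 0.1      ; THETA(2) KMISS: unbounded, since RF=EXP(KMISS) is always positive
$THETA (0, 5)   ; THETA(3) POPCLR: renal CL at standard renal function
$THETA 0.01     ; THETA(4) KCOVN
```

The form (lower, initial) sets only a lower bound; an upper bound can be added as a third value if needed.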
_______________________________________________________