See also: 99aug222002 "Imputation of missing sex covariate"

From:"Venkatesh Atul Bhattaram" 
Subject:[NMusers] Missing Gender (Categorical values)
Date:Tue, 30 Jul 2002 11:19:59 -0400

Hello All
 
Could somebody share their views on how to analyse data in which "gender" is missing?
I am analysing a data set in which, for one study, the information on "Gender" is missing in
nearly 70% of the data. In an earlier discussion on missing covariates in 2001, Dr Leonid
discussed a way to analyse such data. Are there any new views on this type of data?
 
Could somebody suggest some references in this direction?
 
Thanks in advance for your time.
 
Venkatesh Atul Bhattaram
Post-doctoral Fellow
University of Florida
Gainesville-32610

------------------

From:Nick Holford 
Subject:Re: [NMusers] Missing Gender (Categorical values)
Date:Wed, 31 Jul 2002 07:34:37 +1200

Atul,

You probably are missing data on sex (not gender -- Kim JS, Nafziger AN. Is it sex or is it
gender? Clin Pharmacol Ther 2000;68(1):1-3).

If you are missing sex then I suggest you simulate it. You know from the existing data the
probability of being female (PRFEM) so simply simulate the missing sex values e.g. if you use
NONMEM:

$SIM (20000625 NEW) (12345678 UNIFORM) SUBPROBLEMS=1 ONLYSIMULATION
IF (ICALL.EQ.4.AND.SEX.LT.0) THEN ; assume missing SEX is coded < 0
       CALL RANDOM(2,R)
       IF (R.GT.PRFEM) THEN
          SEX=1 ;male
       ELSE
          SEX=0 ;female
       ENDIF
ENDIF
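
For readers who prepare the data set outside NONMEM, the same single random imputation can be
sketched in Python. This is only an illustration under assumptions: the coding (0 = female,
1 = male, negative = missing) and the function name impute_sex are hypothetical, and PRFEM is
taken to be the observed fraction of females among subjects with known sex.

```python
import random


def impute_sex(sex_values, seed):
    """Fill in missing sex codes by sampling from the observed proportion.

    Assumed coding (not from the original post): 0 = female, 1 = male,
    any negative value = missing.
    """
    known = [s for s in sex_values if s >= 0]
    if not known:
        raise ValueError("no observed sex values to estimate PRFEM from")
    prfem = sum(1 for s in known if s == 0) / len(known)  # P(female)
    rng = random.Random(seed)
    imputed = []
    for s in sex_values:
        if s < 0:
            # Same rule as the NONMEM fragment: R > PRFEM means male
            s = 1 if rng.random() > prfem else 0
        imputed.append(s)
    return imputed, prfem


# Hypothetical data: five known values, five missing
sex = [0, 1, 1, -1, -1, 0, -1, 1, -1, -1]
completed, prfem = impute_sex(sex, seed=20000625)
print(prfem, completed)
```

Observed values are left untouched; only the missing entries are drawn, and a different seed
gives a different completed data set.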

An alternative, more elegant approach, is to treat SEX as another DV. This is a bit trickier as it
requires a LIKELIHOOD model that allows you to estimate continuous and categorical data at the
same time. The missing SEX values are then predicted from the parameter describing the probability
of being female just like you can predict DV values at times when you have no observations.

Nick
Nick Holford, Divn Pharmacology & Clinical Pharmacology
University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand
email:n.holford@auckland.ac.nz tel:+64(9)373-7599x6730 fax:373-7556
http://www.health.auckland.ac.nz/pharmacology/staff/nholford/

------------------

From:Nick Holford 
Subject: [NMusers] Not enough sex!
Date:Wed, 31 Jul 2002 07:57:24 +1200

[assuming the subject line got past your spam filter...]

I forgot to say that just simulating sex once is not enough. Ideally you should simulate sex about
6 times i.e. simulate the missing SEX covariate in 6 different data sets and run your model with
each of these data sets then average the parameter estimates you get across all the 6 runs. This
is called multiple imputation. Why 6 times? This is Rubin's Rule (Don Rubin is Prof Statistics at
Harvard and invented the multiple imputation method). In this particular instance the rule of 6 is
even older -- the Romans would say SEX was enough too.

The joint function model (which I mention below) gets around the need to have 6 data sets so is
probably more time efficient in the long run.
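
A minimal sketch of how the estimates from the separate runs might be combined. Averaging gives
the pooled estimate; the pooled standard error below follows the usual multiple-imputation
combining formula (within-imputation variance plus (1 + 1/m) times between-imputation
variance). The function name pool_estimates and the numbers are hypothetical, not results from
any analysis discussed here.

```python
from statistics import mean, variance


def pool_estimates(estimates, std_errors):
    """Combine one parameter's estimates from m imputed data sets.

    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two runs, with one SE per estimate")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # spread across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Hypothetical sex effect on CL from 6 imputed data sets
est = [0.80, 0.84, 0.78, 0.82, 0.79, 0.83]
se = [0.05] * 6
pooled, pooled_se = pool_estimates(est, se)
print(pooled, pooled_se)
```

The pooled standard error is larger than the standard error from any single run, which reflects
the extra uncertainty due to the missing covariate.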



------------------

From:Alan Xiao 
Subject:Re: [NMusers] Missing Gender (Categorical values)
Date: Tue, 30 Jul 2002 16:28:04 -0400

Nick and Atul, 

We might want to take into account  the effect of SEX on PK/PD parameters if the effect is for sure before
we talk about the random imputation. 
Does anyone have any literature reference on this topic? 

Thanks, 
  

Alan. 

------------------

From:"diane r mould" 
Subject:RE: [NMusers] Missing Gender (Categorical values)
Date: Tue, 30 Jul 2002 16:49:05 -0400

Dear All
 
I would think that you should be able to get some good information for imputation from other
covariate data such as weight, creatinine clearance, age etc.  So even if a sex effect on the PK
or PD of a drug is not well established, one should be able to create a covariate based model that
is not unreasonable for use with multiple imputation.
 
Then you could use the covariate information (including the sex data) as your DV and use the
Likelihood option as Nick suggested.
 
Diane

------------------

From:Leonid Gibiansky 
Subject:Re: [NMusers] Not enough sex!
Date:Tue, 30 Jul 2002 16:53:10 -0400

I would try the following approaches:
1. Create a three-level covariate:
gender = M, F, missing.
Then patients with gender="missing" should have parameter values intermediate
between those for M and F. At the least, this should give a feeling for
whether this covariate is important and what the difference is between the
parameters for M and F.

2. It is likely that one can predict gender based on weight and height
(something else?). A simple tree model in S+ or any other similar software
can do it (build the model on the 30% of the data where gender is available
and predict for the other 70%). One can then fit the model with this
"predicted" gender and compare the OF and fit with the model obtained in (1).

3. An alternative may be to try a mixture model. If, for the 30% of patients
with known gender, the probability of being in one of the two groups
correlates with gender, then one may conclude that the groups are defined by
gender. If, on the other hand, the mixture model does not reveal the
importance of gender (again, comparing model (3) with (1) and (2)), then one
can safely ignore the issue and omit gender. In fact, weight and height
may compensate for the absence of gender.
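
As a rough illustration of approach 2, the sketch below fits a one-split "tree" (a decision
stump) on subjects with known gender and applies it to subjects with missing gender. It is a
stand-in for a real tree model, not the S+ procedure itself; it uses weight only, and the data,
the coding (0 = F, 1 = M) and the function names are all hypothetical.

```python
def fit_stump(weights, sexes):
    """Find the weight cut-off that best separates known F (0) from M (1).

    The rule predicts male when weight > cut-off.
    Returns (cut-off, training accuracy).
    """
    candidates = sorted(set(weights))
    if len(candidates) < 2:
        raise ValueError("need at least two distinct weights")
    best_cut, best_acc = None, -1.0
    for low, high in zip(candidates, candidates[1:]):
        cut = (low + high) / 2  # midpoint between neighbouring weights
        correct = sum((w > cut) == (s == 1) for w, s in zip(weights, sexes))
        acc = correct / len(weights)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut, best_acc


def predict_sex(weight, cutoff):
    """Apply the fitted rule to one subject."""
    return 1 if weight > cutoff else 0


# Hypothetical subjects with known gender (the "30%")
known_wt = [55, 60, 62, 68, 75, 80, 85, 90]
known_sex = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, accuracy = fit_stump(known_wt, known_sex)

# Hypothetical subjects with missing gender (the "70%")
predicted = [predict_sex(w, cutoff) for w in [58, 72, 88]]
print(cutoff, accuracy, predicted)
```

The training accuracy gives a first impression of how well weight alone can stand in for gender;
a poor value would argue against relying on such predictions.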

On the other hand, gender is one of the most easily measured covariates. It
should be possible to recover it if any information about the study is
available.
Leonid

------------------

From:Nick Holford 
Subject:Re: [NMusers] Missing Gender (Categorical values)
Date:Wed, 31 Jul 2002 08:58:57 +1200

Alan,

I thought the idea was to simulate SEX in order to help discover the effect of SEX on other model
parameters. Can you be clearer about what you mean about taking this into account BEFORE doing
the imputation?

The original Rubin paper on imputation is Rubin DB, Schenker N. Multiple imputation in health-care
databases: an overview and some applications. Stat Med 1991;10(4):585-98. 

Joe Schafer (http://www.stat.psu.edu/~jls/) gave an excellent
talk at PAGE in 2001 on the topic and has lots of stuff on his web pages. I am not aware of any
applications of multiple imputation using NONMEM but Lewis Sheiner has remarked that there is little
point because you can always do joint modelling which is a better overall approach.

You can look at Mould DR, Holford NHG, Schellens JHM, Beijnen JH, Hutson PR, Rosing H, et al.
Population Pharmacokinetic and Adverse Event Analysis of Topotecan in Patients with Solid Tumors.
Clinical Pharmacology & Therapeutics 2002;71(5):334-348 for a recent application of the joint
modelling approach for missing covariates. The initial suggestion for using this with NONMEM was
Karlsson M, Jonsson E, Wiltse C, Wade J. Assumption testing in population pharmacokinetic models:
illustrated with an analysis of moxonidine data from congestive heart failure patients. J
Pharmacokinet Biopharm 1998;26(2):207-46.



------------------

From:"Stephen Duffull" 
Subject: RE: [NMusers] Missing Gender (Categorical values)
Date:Wed, 31 Jul 2002 08:55:00 +1000

Hi
 
Just my 2c worth.  I have tried joint function modelling on one data set - where about 50% of the
patients had missing covariates.  The covariates were continuous - rather than categorical.  There weren't any other
covariates to try and get an idea about the missing one in question (which is again different from your example)... so
it was a matter of estimating the missing covariate and parameter values simultaneously from the PK data
(there was a lot of that).  Anyway it seemed to work ok - but we had difficulties due to the large proportion
of missing covariates.  With 70% of your sex data missing this could be problematic also.
 
In addition, we did not find a satisfactory way of assessing the statistical significance of covariate relationships.
When using joint function modelling the objective function is greatly inflated with estimating the covariates, which
means that a simple LRT is not straightforward to perform.
 
Regards

Steve
*****************************************
Stephen Duffull
School of Pharmacy
University of Queensland
Brisbane 4072
Australia
Tel +61 7 3365 8808
Fax +61 7 3365 1688
http://www.uq.edu.au/pharmacy/duffull.htm

------------------

From:"Bachman, William" 
Subject:RE: [NMusers] Missing Gender (Categorical values) - my $0.02
Date: Wed, 31 Jul 2002 08:25:03 -0400

imho, if I had other covariate data such as weight, creatinine clearance,
age etc. (and 70% of the gender data was missing), I would forget about
gender entirely and not go about making up data!

:)
Bill
------------------

From:"Lewis B. Sheiner" 
Subject: Re: [NMusers] Missing Gender (Categorical values) - my $0.02
Date:Wed, 31 Jul 2002 08:58:31 -0700

And just to chime in ...

If you *must* know the effect of sex for some reason, then multiple 
imputation is a way of evaluating what Diane (& Nick) call the 
likelihood option; actually a marginal likelihood --
p(Y|data) = Integral[p(Y|S,data) p(S|data) dS],
where Y is your usual response, and S is sex.  It is important to 
realize that both methods are attempting to find the MLE of the SAME 
underlying likelihood (i.e. model).  They are simply using 
different methods to do so.  The std error of the sex covariate (when 
correctly computed using either method) will of course be larger than if 
there had been no missing data (you can't get something for nothing).
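
Because sex takes only two values, the integral over S in the marginal likelihood reduces to a
two-term sum. The sketch below shows this for a single observation from a subject whose sex is
missing, assuming (purely for illustration) a normal model for Y with a different mean for each
sex. The function names and all numbers are hypothetical.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def marginal_likelihood(y, p_female, mean_f, mean_m, sd):
    """p(Y) = P(F) p(Y|F) + P(M) p(Y|M) when sex is missing."""
    return (p_female * normal_pdf(y, mean_f, sd)
            + (1 - p_female) * normal_pdf(y, mean_m, sd))


def posterior_female(y, p_female, mean_f, mean_m, sd):
    """P(F | Y): what the response itself says about the missing sex."""
    joint_f = p_female * normal_pdf(y, mean_f, sd)
    return joint_f / marginal_likelihood(y, p_female, mean_f, mean_m, sd)


# Hypothetical values: females average 8, males 12, common SD 2
lik = marginal_likelihood(9.0, p_female=0.4, mean_f=8.0, mean_m=12.0, sd=2.0)
post = posterior_female(9.0, p_female=0.4, mean_f=8.0, mean_m=12.0, sd=2.0)
print(lik, post)
```

Multiple imputation approximates the same sum by drawing S repeatedly, which is one way to see
why the two approaches target the same likelihood.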

On the other hand, if there is no reason to need to know the sex 
coefficient per se (e.g. you're just on a hunt for explanatory variables 
& don't care which ones you find), then you can just leave sex out if 
you have little information on it, UNLESS the missingness of sex is 
non-ignorable (that is, the sex covariate is missing selectively in 
individuals whose responses are systematically different from the rest). 
In that case (which should be revealed by Leonid's analysis using a 
separate 'missing' class for those with missing sex data), if the other 
covariates do correlate with sex, then sex should be taken into account 
in the likelihood to avoid bias.  This can be done using either of the 
computational approaches above.

LBS.


    _/  _/  _/_/ _/_/_/ _/_/_/ Lewis B Sheiner, MD (lewis@c255.ucsf.edu)
  _/  _/ _/    _/_    _/_/    Professor: Lab. Med., Biopharmaceut. Sci.
 _/  _/ _/        _/ _/       Box 0626, UCSF, SF, CA, 94143-0626
 _/_/   _/_/ _/_/_/ _/        415-476-1965 (v), 415-476-2796 (fax)
------------------

From: Alan Xiao 
Subject: Re: [NMusers] Missing Gender (Categorical values)
Date:Wed, 31 Jul 2002 12:20:21 -0400

Nick, 

You got me.  I replaced "think about" with "take into account" but forgot to change "before" to "when".
Anyway, my English undoubtedly needs to be improved, and a careful check should have been done
before the email was sent out. 

About the imputation, I am just curious about the effect  of the imputation on the identifiability of the
covariate effect if the effect of the imputed covariate is significant (for example, known from other data).
About this topic, you can ask hundreds of similar questions.

Here is an example: if the effect of SEX on CL is surely significant, you might get different
imputation results and/or parameter estimates when you include SEX in your model, as compared
to when you exclude SEX from your model.  Or, put another way, the results might differ between imputations
using a structural model and using a full model.  (I'm not really sure, but this should be testable by
simulation.)  If this is true, then it is reasonable to think (the other way round) that imputation will
influence the identification of the covariates, or the significance of the covariates in the model, depending
on what kind of imputation method is used. 

Thank you for the information you listed. 
  

Alan. 
------------------

From: "Serge Guzy" 
Subject: RE: [NMusers] Missing Gender (Categorical values)
Date:Wed, 31 Jul 2002 09:41:25 -0700


I think that there is no conceptual difference between filling in missing data and estimating PK
parameters from sparse data. The same strategies could be used for both, and one of them is the
Monte Carlo implementation of the EM algorithm. 

Serge Guzy,PH.D
Head of Pharmacometrics
Xoma
------------------

From:Alan Xiao 
Subject:Re: [NMusers] Missing Gender (Categorical values)
Date:Wed, 31 Jul 2002 14:08:22 -0400

In methodology, I think you are right.  However, in implementation, I'm not sure.  As you know, 
to estimate PK parameters from sparse data, an appropriate model structure is a prerequisite, although
this model structure can be tested/selected using statistical tools.  Similarly, in filling in missing data, an
appropriate model structure (or algorithm, or whatever you call it) is also a prerequisite.  The likelihood method
is just a statistical tool to force the model to fit the data (or converge) in a certain way; it is not the model itself.  
When the above two processes or tasks (filling in the missing data and estimating PK parameters) are not
correlated, things are relatively simple: both model structures (or algorithms) can be adjusted independently
at the same time based on the selected statistical tools.  If they are correlated, how to make sure that both model
structures are correct, or correctly adjusted based on a selected statistical tool, is the question I would like to ask. 
I'm just wondering if there are any reports on this. 

I noticed that in Mould et al's paper on topotecan (J Pharmacokinet Biopharm 1998;26(2):207-46), WEIGHT
was used as a "built-in" covariate in the model.  SEX was not identified as a covariate.  I'm not sure whether SEX
is really not significant or just  because SEX is  highly correlated with WEIGHT (equation 4) and the  WEIGHT
imputation covers the effect of SEX.   Of course, I do not mean SEX must be in the model, either.  I'm just
curious if any testing was performed on this. 
  

Alan. 
------------------

From:Nick Holford 
Subject:Re: [NMusers] Missing Gender (Categorical values)
Date: Thu, 01 Aug 2002 07:52:19 +1200

Alan,

Alan Xiao wrote:

The idea of multiple imputation is to test the effect of SEX under the range of plausible
possibilities for the missing SEX covariate. You cannot get something for nothing so I expect the
power to detect the effect of SEX is lower than if you had the full data set. On the other hand if
you conclude via imputation that SEX has an effect then you are more likely to be correct
(assuming the effect of SEX is the truth) than by assuming an intermediate SEX which is guaranteed
to be the wrong value for the missing covariate.

Your response to Serge Guzy said "I noticed that in Mould et al's paper on topotecan (J
Pharmacokinet Biopharm 1998;26(2):207-46),".  Note there are 2 papers cited here. Diane Mould
wrote about topotecan in 2002, Mats Karlsson wrote about assumption testing in 1998. I leave Diane
and/or Mats to respond to your question...


------------------

From: "diane r mould" 
Subject:RE: [NMusers] Missing Gender (Categorical values)
Date: Wed, 31 Jul 2002 19:24:54 -0400

Dear Alan
 
I would have to agree with Lewis' summary - that if you have some reason to believe that sex is an
important covariate or if there is reason to believe that the missingness of sex is informative
(non ignorable) then you have reason to undertake some form of multiple imputation to attempt to
account for that covariate.  However, if you are just in the process of identification of
covariates, then you would be better off taking less intensive measures than imputation. 
Therefore, I would ask for more input from you on that issue before launching into some
discussion on how to do that and whether the results are reasonable.
 
Is this covariate part of your hunt for covariates?  Is there some reason to think that the
missingness is not ignorable?
 
The work that was published in CPT did not have to estimate sex based on imputation, but we did
have to estimate performance status, which is also a discrete covariate and therefore has some of
the same issues associated with imputation.  Unlike sex, which is correlated with other covariates
such as weight or creatinine clearance, we did not see  correlations for performance status that
would help predict it (other than the response) although we were also not missing as much data as
you seem to be.  Therefore, I expect that your imputation model for sex would probably be more
reliable than ours was for performance status.  So if you do need to use some form of imputation
then I think it would be do-able.
 
Please let me know your thoughts
 
Best Regards
Diane
------------------

From: Alan Xiao 
Subject:Re: [NMusers] Missing Gender (Categorical values)
Date: Sun, 04 Aug 2002 20:58:07 -0400

Dear Diane, 

Sorry for the late reply to your email because I was out for vacation 20 minutes after I sent the last email. 

About the imputation of missing data, I'm not against Lewis' summary at all.  On the contrary, I agree with his summary. 
However, as expressed in the last email, what I'm concerned about is the potential effect of the imputation model (or 
algorithm) on the evaluation of the significance of the imputed covariate and the corresponding correlated covariates (used 
in the imputation model, or joint model in your paper) for the parameters in a PK model, and/or the justification of the 
imputation model and the PK model - this is not about the likelihood method itself.  Here, by justification of the PK 
model, I mean the type of function in the PK model for the covariate effect (when the covariate is one other than 
SEX) rather than the whole PK model.  To make this easier to understand, let's take your paper as an example: 

1). How about if you replace Equation 4 in your paper with other simpler functions, such as WEIGHT as a function of BSA 
and HEIGHT or as a function of AGE, CLCR and SEX.  As you know, BSA is usually calculated from HEIGHT and 
WEIGHT while CLCR is calculated  from AGE, WEIGHT and SEX.  The functions for them are very certain and no 
modeling/simulation is needed at all if your BSA data  was indeed calculated from HEIGHT and WEIGHT  or CLCR was 
indeed calculated from AGE, WEIGHT and SEX, and HEIGHT  and/or SEX data was not missing.  From your Table I, 
AGE was not missing at all. BSA, CLCR and SEX were also available (1 missing in CLCR and 3 in SEX, as compared to 
AGE).   Or were BSA and SEX also partly imputed in Table I?  In other words, did patients with missing WEIGHT 
actually have all other covariate values missing, including AGE, BSA, SEX and CLCR? - I don't think the information 
about this is clear in the paper.  Or were your BSA and CLCR directly measured, so that you did not have a definite 
function for them to simply connect WEIGHT with BSA and HEIGHT or CLCR, AGE and SEX?  If so, can you tell us 
how they were measured? (The note under Table I says that CLCR was calculated from the Cockcroft and Gault formula, 
which is a function of AGE, WEIGHT and SEX - why couldn't you just simply invert the calculation to get WEIGHT 
from CLCR, AGE and SEX?) 

Actually, whether they were measured or calculated does not influence our discussion.  The question is, how are you sure 
your joint model is the best imputation model? Did you try other imputation models? If you include a function, for 
example, WEIGHT**THETA(), to the volume of distribution, or another one such as THETA()*SEX to clearance in the 
PK model (assuming they are significant, thus your PK model and imputation model are correlated) and simultaneously fit 
the PK model and imputation model to the data using likelihood method to control the minimization, would you get the 
same results? or close enough? 

2).  We talked about SEX  previously just because SEX was the missing covariate in the email sent by Atul.  If the missing 
covariate is a continuous covariate, e.g. WEIGHT in your paper, it becomes a little bit more complicated, because the 
potential function could be a power function in additive or multiplication on some parameters of the PK model.  I'm 
afraid this function will also influence the imputation results.   Or, just for testing, how about replacing (WT/70)**0.75 in 
Equation 3  with (WT/70)?  Would the results be the same?  (I am trying to figure  out the conditions  for  a model and 
generalize it). 

3). Back to covariate  effects.  If the missing covariate is not significant to any parameters of a PK/PD model, then 
whatever value you impute does not matter - the imputation is not really important.  However, if the missing covariate 
(e.g. WEIGHT in your paper) is  significant to the PK/PD model, then the type of function in the PK/PD model to express 
the covariate effect will be correlated with the imputation model, as discussed in (1) and (2) above.  Furthermore, if an 
imputation model-predicting covariate (or "joint model-predicting covariates" such as SEX in your paper, right below 
equation 4)  is significant or marginally significant, its significance could be neutralized (I'm not sure it's the right word) 
or largely weakened by the inclusion of the imputed covariate (WEIGHT here) into the model (i.e., both the imputation 
model-predicting covariate, e.g. SEX, and the imputed covariate, e.g. WEIGHT, are significant and correlated).  If you 
have tested that the imputation model-predicting covariate (SEX here) is not significant in the PK/PD model which does 
not include the imputed covariate (WEIGHT here), then we  might be able to ignore the influence of the imputed 
covariate on the identification of the imputation model-predicting covariate (SEX here).   When you say that you "did 
not have to estimate sex based on imputation", can you explain a little bit more in detail? Did you mean that you have 
tested or you knew from other data that  SEX is not significant (whether WEIGHT is significant or not)? or that SEX is 
not significant based on the PK model after imputation? 

4). How strong is this potential influence of the type of the imputation model on the type of the function of the imputed 
covariate  on parameters in a PK model? and  how strong is the potential influence  of the imputed covariate on the 
identification of other  significant covariates on parameters in a PK model? I have no idea. This is why I asked for the 
information if anyone has done this before.  But I think  this should be case dependent. 

5). Do I have a more reliable imputation model for SEX?  No.   I don't think I can develop one without any detailed 
information about the dataset.  Actually, the specific model itself is not the most important.  It is the methodology used to 
develop the model and the interpretation of the model that is the most important.  After all, science is just science, it can 
be questioned and can be defended. 

6). Another minor thing.  In my experience, in a combined dataset (from many studies), when the proportion missing is 
high, the missingness pattern is usually not random (with respect to the combined dataset) - if the covariate is missing for all subjects 
in one or more studies, or even if the missingness is random within one or more studies.  In this case, simple random imputation 
may not be appropriate at all - this could easily be overlooked if you are not the one who merged all the sub-datasets 
together.  I assume this is not the case in your paper (20% missing) and in Atul's data (70% missing). 
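
A quick check of the kind described in (6) is to tabulate the fraction of missing values per
study before choosing an imputation method. The sketch below assumes a hypothetical record
layout of (study, sex) pairs with None for missing; the study labels and data are made up.

```python
from collections import defaultdict


def missing_fraction_by_study(records):
    """records: iterable of (study_id, sex), with sex None when missing.

    Returns {study_id: fraction of subjects with missing sex}.
    """
    counts = defaultdict(lambda: [0, 0])  # study -> [n_missing, n_total]
    for study, sex in records:
        counts[study][1] += 1
        if sex is None:
            counts[study][0] += 1
    return {study: miss / total for study, (miss, total) in counts.items()}


# Hypothetical pooled data set from three studies
records = (
    [("A", s) for s in (0, 1, 1, 0)]
    + [("B", None)] * 4
    + [("C", s) for s in (1, None, 0, 1)]
)
fractions = missing_fraction_by_study(records)
print(fractions)
```

A study in which the fraction is 1.0 has the covariate missing by design, so drawing from the
pooled proportion would silently assume that study resembles the others.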
  

I have to admit that I have not obtained this paper yet:

Nick Holford wrote: 

Karlsson M, Jonsson E, Wiltse C, Wade J. Assumption testing in 
population pharmacokinetic models: illustrated with an analysis of moxonidine data from congestive heart failure 
patients. J Pharmacokinet Biopharm 1998;26(2):207-46. 

If all or some of above  questions/concerns have already been addressed in this paper,  please just simply skip and flag them. 

Thanks. 
  

Best regards, 
  

Alan. 
  
------------------

From:"diane r mould" 
Subject:RE: [NMusers] Missing Gender (Categorical values)
Date:Mon, 5 Aug 2002 21:10:12 -0400

Dear Alan  

Height was not available in the data, if it had been then we could have just back-calculated
weight from the Dubois and Dubois formula without imputing it.  We had BSA for all of the
patients, although we did not have the original values of weight that had been used to calculate
BSA.  We would rather have done that than pay the price in the long run times and other model
qualification aspects involved with using joint functions.  We did test age, sex, creatinine
clearance as covariates for the joint model for the same reasons that you cite.  This joint model
was built in the same fashion that any model would be built.  You should have seen that BSA, sex
and creatinine clearance were included in the model - age and the other covariates did not improve
the fit.  
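
To make the back-calculation concrete, here is a sketch using the commonly quoted form of the
Du Bois and Du Bois formula, BSA (m2) = 0.007184 x weight(kg)^0.425 x height(cm)^0.725, solved
for weight. It assumes BSA was originally computed with exactly this formula; the function
names and example values are hypothetical.

```python
def bsa_dubois(weight_kg, height_cm):
    """Body surface area (m^2) from weight and height."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def weight_from_bsa(bsa_m2, height_cm):
    """Invert the formula for weight; this needs height as well as BSA."""
    return (bsa_m2 / (0.007184 * height_cm ** 0.725)) ** (1 / 0.425)


# Hypothetical patient: 70 kg, 175 cm
bsa = bsa_dubois(70.0, 175.0)
recovered = weight_from_bsa(bsa, 175.0)
print(bsa, recovered)
```

Without height the equation has two unknowns, which is why BSA alone does not determine weight.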

 
We used the observed covariates in the pk part of the model if they were available.  BSA was
available for all patients.  Creatinine clearance and sex were estimated for the few individuals
who were missing them.  There were 2 patients who were missing weight and were also missing sex and
ECOG performance status.  All of the patients who were missing weight did have creatinine
clearance however.  The patient who was missing creatinine clearance had the other covariates
available.  Patients missing ECOG performance status had all other covariates (with the exception
of the two who were missing weight and sex). 

We did not back calculate from creatinine clearance for several reasons - the first was that we
did not have sex for 2 of the patients and we did not have the serum creatinine data for any of
them.  Without serum creatinine, it's hard to back-estimate weight even if you happen to know the
sex, creatinine clearance, and age of a patient.  The second reason was that we were also
missing ECOG performance status from a fairly large percentage of patients.  This latter
covariate historically had been shown to influence the safety of the drug.  Therefore we had to
handle at least 2 missing covariates that were potentially meaningful.  
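
The dependence on serum creatinine can be seen by writing out the standard Cockcroft and Gault
formula, CLCR (mL/min) = (140 - age) x weight (kg) / (72 x Scr (mg/dL)), multiplied by 0.85 for
females, and solving it for weight. This is a sketch with hypothetical function names and
made-up patient values.

```python
def clcr_cockcroft_gault(age, weight_kg, scr_mg_dl, female):
    """Creatinine clearance (mL/min) by the Cockcroft and Gault formula."""
    clcr = (140 - age) * weight_kg / (72 * scr_mg_dl)
    return 0.85 * clcr if female else clcr


def weight_from_clcr(clcr, age, scr_mg_dl, female):
    """Invert the formula for weight; serum creatinine is required."""
    factor = 0.85 if female else 1.0
    return clcr * 72 * scr_mg_dl / ((140 - age) * factor)


# Hypothetical male patient: 60 years, 72 kg, serum creatinine 1.0 mg/dL
clcr = clcr_cockcroft_gault(60, 72.0, 1.0, female=False)

# The same CLCR, age and sex are consistent with very different weights
# depending on the (unknown) serum creatinine
for scr in (0.8, 1.0, 1.2):
    print(scr, weight_from_clcr(clcr, 60, scr, female=False))
```

Knowing CLCR, age and sex therefore leaves weight undetermined unless serum creatinine is also
available.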

Many of the studies used in that work had been conducted a long time ago, and data bases change. 
It's hard to answer your question given the age and number of data bases that these data were
extracted from.  Some of the data was faxed to me on paper because the data bases were not easily
available. 

We felt that this was the 'best imputation model' for the same reasons that a modeler would decide
that his final pop pk model was 'best'.  It was used because the model described the data, the
covariates were physiologically reasonable and because changing that model (ie adding other
factors) did not improve it further.  The manuscript was fairly clear about the fact that we
tested a lot of covariates both for the joint model and for the pk model.  

The second aspect of your question seems to deal with the selection of the covariates for the PK
part of the model. Weight was not added just because we had imputed it.  This covariate was added
because it explained a lot of the inter-individual variability and its inclusion reduced the
objective function. We tested a rather long list of covariates, including sex, age, weight,
creatinine clearance, BSA, etc.  Other covariates and other functions did not do the job as well. 
If it helps, I did fit a much reduced data set (where all the imputed data had been removed) and
came to the same conclusions that were drawn from the larger data set.  In addition, I have
completed a second analysis using a new data set, with no missing data and the functions are
nearly the same - with the same results on IIV. 

You may be confusing regenerating missing data with standard model building practices.  The
covariate model that was ultimately used for weight was an allometric model - which is why the
exponential terms for clearance were fixed.  I did try other models for weight (and I also tried
BSA), but they were not as good as this allometric model.  Changing the PK model does have some
impact on the imputation model but it's not that profound.  We noticed that the estimated weights
did change slightly from the base to the final model but the observed weights do help keep the
individual estimates of weight in line.  Unless the model is grossly misspecified, I don't think
that you are going to do a lot to change the individual estimates if the imputation model is not
perfect.  
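
For reference, the difference between an allometric weight model with the exponent fixed at 0.75
and a simple linear weight model can be sketched as follows. The standard clearance value and
the function names are hypothetical and are not taken from the topotecan analysis.

```python
def cl_allometric(cl_std, weight_kg, exponent=0.75):
    """Clearance scaled allometrically to a 70 kg standard."""
    return cl_std * (weight_kg / 70.0) ** exponent


def cl_linear(cl_std, weight_kg):
    """Clearance scaled in direct proportion to weight."""
    return cl_std * weight_kg / 70.0


CL_STD = 20.0  # hypothetical clearance (L/h) for a 70 kg patient

for wt in (35.0, 70.0, 140.0):
    print(wt, cl_allometric(CL_STD, wt), cl_linear(CL_STD, wt))
```

The two models agree at 70 kg and diverge away from it, so the choice of function matters most
for patients whose (observed or imputed) weights are far from the standard.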

Actually - I think I misspoke - we did impute sex using the joint function for the two subjects
who were missing it.   Sex was not statistically significant as a covariate in the pk model. 

You seem to be saying that one could dismiss a covariate because of imputation.  I don't think so. 
Perhaps you are missing an important point - the imputed value of a covariate that is used in the
pk is the INDIVIDUAL predicted value, not a typical value.  Even a base PK model (with no
covariates) should provide good individual predicted values of a concentration.  Furthermore, in
imputation, one would use the observed values of a covariate when they are available.  Sex, in our
case, was missing only for 2 patients - a very small percentage.  If sex was not significant (and
it was not) then it's not dismissed because we used imputed weight as a covariate in the final
model.  Covariate effects are checked individually too.  Good model building practices should help
prevent the sort of thing that you describe from happening. 

I am not sure that I can answer that.  I would imagine that it would be case dependent but a
simulation study would need to be done to test that, or perhaps some other person could answer
this. 

True enough.  

You seem to be referring to informative missingness, such as missing creatinine clearance
information because all of the patients with low CLCR values dropped out due to high drug levels
leading to adverse events or something of that sort.  Is that right?  This is not the case with
topotecan - it was missing completely at random as far as we could tell.   However, it may be a
problem or an issue with Atul's data - but that would make it even more reasonable to impute, in
order to avoid bias as Lewis suggested earlier. 

Best Regards 

Diane

------------------