```
Subject: [NMusers] Missing mixed continuous and categorical data
Date: Mon, May 2, 2005 6:56 pm

Dear all,

I am a new NONMEM user and have a few questions about developing a data set
for NONMEM. I have 46 eight patients in total. 2 patients are missing the ASA
score (a categorical variable) and 4 patients are missing height (a continuous
variable). How can NONMEM handle missing data when both continuous and
categorical covariates are affected? Can I first do multiple imputation of the
missing data in S-PLUS/R using the MIX library, and then use NONMEM to build
the population PK/PD model?

Many thanks!

Chen chunlin
Faculty of pharmacy
University of Montreal
E-mail:chun.lin.chen@umontreal.ca
_______________________________________________________

From: "Nick Holford" n.holford@auckland.ac.nz
Subject: Re: [NMusers] Missing mixed continuous and categorical data
Date: Tue, May 3, 2005 7:30 am

Hi,

There are several approaches to your problem.

1. The simplest is to impute the missing value with the median of the non-missing
values. I suspect this is the most widely used method.

2. You can use multiple imputation to generate say 6 separate data sets each with
imputed values drawn from either a theoretical or empirical distribution of the
covariates. Then fit each of the 6 data sets and use the mean of the 6 estimates for
each parameter as the final model estimate. The choice of 6 is "Rubin's Rule" --
suggested by Don Rubin (one of the originators of the multiple imputation concept).
You can try larger numbers of imputed data sets until the mean converges, but 6 is
often enough.

3. You can construct a joint model for the covariate distribution and the PKPD
model. This means including all the covariates as DV values and constructing a model
(usually quite simple) to predict each covariate.

The first method is simple but ignores correlation between covariates. The second
and third methods allow you to account for the covariance of covariates. It can be
tricky to know what to do when you have both categorical and continuous variables.
If you have lots of patients for each category e.g. the category is sex and about
half of the sample is male then you can construct two multivariate normal
distributions for the continuous covariates (one for males and one for females). If
you have many categories then you can try treating the categorical covariate as if
it was a continuous value. Stacey Tannenbaum (stacey.tannenbaum@pharma.novartis.com)
and Ivan Matthews (Ivan.Matthews@postgrad.manchester.ac.uk) have both worked on this.

If you use method 3 then you will need to be aware of the isolated eta bug
(see http://www.metrumrg.com/publications/LG_ETAbug_full.pdf for details) and use the 'zeta transform' of ETAs
in order to capture the correlation between covariates when predicting missing
values.

Nick
_______________________________________________________

From:  "Mats Karlsson" mats.karlsson@farmbio.uu.se
Subject: RE: [NMusers] Missing mixed continuous and categorical data
Date: Wed, May 4, 2005 4:10 am

Hi,

We have recently used a mixture model approach for missing categorical
covariates (see code below). If you have category observations in some but
not all individuals, the parameter values for the different categories and
the corresponding mixture components will be the same (i.e. a known EM and a
mixture-model-assigned EM will have the same typical CL value). The observed
(or known from literature) subpopulation frequencies can be used as fixed
parameters. Thus, no additional parameters need to be estimated. This
approach can be extended (I think, but have not tried) to the situation where
both a continuous and a categorical covariate are missing. If they are not
correlated, the extension is trivial: for the continuous covariate you can use
the "data" method (for a discussion on implementation of that method see J
Pharmacokinet Biopharm. 1998 Apr;26(2):207-46) and for the categorical
covariate the method illustrated in the code below. If they are correlated
(and either or both can be missing), the situation is trickier. However, that
can also be done: let's assume that the covariates instead are SEX and WT and
you want to model CL with different WT relations for the two sexes:

TVCL=THETA(1)*WT
IF(SEX.EQ.1) TVCL=THETA(2)*WT

;Let's assume the following:
WT(males)   = 75 (mean) , 10 (SD BSV) , 0 (SD measurement error)
WT(females) = 65 (mean) , 10 (SD BSV) , 0 (SD measurement error)
Males=50% of target population
Categories for missing data are NOMISS, SEXMISS, WTMISS and BOTHMISS.
Everyone has a "1" in one of these categories and "0" in the other three.
For the solution below it is assumed that there are no BOTHMISS individuals.
SEX=0/1 for males/females
Assume $SIGMA and $OMEGA fixed to 1

Then an extended model would be:
IF(NOMISS.EQ.1) THEN
TVCL=THETA(1)*WT
IF(SEX.EQ.1) TVCL=THETA(2)*WT
ENDIF

IF(WTMISS.EQ.1) THEN
MWT=75+ETA(1)*10
FWT=65+ETA(1)*10
TVCL=THETA(1)*MWT
IF(SEX.EQ.1) TVCL=THETA(2)*FWT
ENDIF

IF(SEXMISS.EQ.1) THEN
IF(MIXNUM.EQ.1) TVCL=THETA(1)*WT
IF(MIXNUM.EQ.2) TVCL=THETA(2)*WT
ENDIF
$MIX
NSPOP=2
P(1)=EXP(0.0111+(WT-70)*0.0922)/(1+EXP(0.0111+(WT-70)*0.0922))
P(2)=1-P(1)
;the equation of P(1) comes from a logistic regression based on the
;different WT distributions for males and females

;----
Here is the code for the different genotype (VHAB) influences on CL:
IF(VHAB.EQ.1.OR.VHAB.EQ.-99) CLVHAB = 0
IF(VHAB.EQ.2) CLVHAB=THETA(8)
IF(VHAB.EQ.3) CLVHAB=THETA(9)

IF(MIXNUM.EQ.2) THEN
IF(VHAB.EQ.-99) CLVHAB =THETA(8)
ENDIF

IF(MIXNUM.EQ.3) THEN
IF(VHAB.EQ.-99) CLVHAB =THETA(9)
ENDIF

CL=TVCL*(1+CLVHAB)*EXP(ETACL)

$MIX
NSPOP=3
P(1)=0.253
P(2)=0.552
P(3)=0.195

P(1) to P(3) are the observed frequencies of wild-type, heterozygous and
homozygous variants in the studied population.
;-------
Best regards,
Mats
--
Mats Karlsson, PhD
Professor of Pharmacometrics
Div. of Pharmacokinetics and Drug Therapy
Dept. of Pharmaceutical Biosciences
Faculty of Pharmacy
Uppsala University
Box 591
SE-751 24 Uppsala
Sweden
phone +46 18 471 4105
fax   +46 18 471 4003
mats.karlsson@farmbio.uu.se
_______________________________________________________

```
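
A note on the P(1) expression in the SEX/WT example above: the thread says it comes from a logistic regression based on the male and female WT distributions, but does not show how it was obtained. The Python sketch below is not part of the original posts. It assumes only the values stated in the thread (male WT mean 75, female WT mean 65, common SD 10, 50% males) and applies Bayes' rule to two normal densities with a common SD, which gives a logistic function of WT with intercept log(prior odds) = 0 and slope (75 - 65)/10^2 = 0.1 around the midpoint 70. The posted coefficients (0.0111 and 0.0922) are close to these values, which would be consistent with a regression fitted to simulated data, although that is an inference rather than something the thread states. The function names are hypothetical.

```python
import math


def p_male_given_wt(wt, mean_m=75.0, mean_f=65.0, sd=10.0, prior_m=0.5):
    """Closed-form P(male | WT) for two normal WT distributions with a common SD.

    With equal SDs the log-likelihood ratio is linear in WT, so the posterior
    is a logistic function of WT. Defaults are the values stated in the thread.
    """
    slope = (mean_m - mean_f) / sd ** 2        # 0.1 with the thread's values
    midpoint = (mean_m + mean_f) / 2.0         # 70 with the thread's values
    intercept = math.log(prior_m / (1.0 - prior_m))  # 0 for a 50/50 split
    logit = intercept + slope * (wt - midpoint)
    return 1.0 / (1.0 + math.exp(-logit))


def p_thread(wt):
    """P(1) exactly as written in the $MIX block of the thread."""
    z = 0.0111 + (wt - 70.0) * 0.0922
    return math.exp(z) / (1.0 + math.exp(z))


def normal_pdf(x, mean, sd):
    """Normal density, used only to cross-check the closed form."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def p_male_bayes(wt, mean_m=75.0, mean_f=65.0, sd=10.0, prior_m=0.5):
    """P(male | WT) computed directly from the two densities via Bayes' rule."""
    num = prior_m * normal_pdf(wt, mean_m, sd)
    den = num + (1.0 - prior_m) * normal_pdf(wt, mean_f, sd)
    return num / den


if __name__ == "__main__":
    # Compare the closed form with the expression posted in the thread.
    for wt in (50, 60, 70, 80, 90):
        print(wt, round(p_male_given_wt(wt), 4), round(p_thread(wt), 4))
```

If the assumed means, SD or male fraction differ in a real data set, the same closed form gives the corresponding intercept and slope, which could then be entered as fixed values in $MIX in the same way as in the posted example.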