From: "chunlin chen" quadchen@yahoo.com
Subject: [NMusers] Missing mixed continuous and categorical data
Date: Mon, May 2, 2005 6:56 pm

Dear all,

I am a new NONMEM user and have a few questions about developing the data set for NONMEM. I have 46 eight patients in total. 2 patients are missing the ASA score (a categorical variable) and 4 patients are missing height (a continuous variable). How can NONMEM handle missing data that are a mix of continuous and categorical variables? Can I first do multiple imputation in Splus/R using the MIX library to estimate the missing data, and then use NONMEM to build the population PK/PD model?

Many thanks!

Chen chunlin
Faculty of pharmacy
University of Montreal
E-mail: chun.lin.chen@umontreal.ca
_______________________________________________________

From: "Nick Holford" n.holford@auckland.ac.nz
Subject: Re: [NMusers] Missing mixed continuous and categorical data
Date: Tue, May 3, 2005 7:30 am

Hi,

There are several approaches to your problem.

1. The simplest is to impute the missing value with the median of the non-missing values. I suspect this is the most widely used method.

2. You can use multiple imputation to generate, say, 6 separate data sets, each with imputed values drawn from either a theoretical or an empirical distribution of the covariates. Then fit each of the 6 data sets and use the mean of the 6 estimates for each parameter as the final model estimate. The choice of 6 is "Rubin's Rule" -- suggested by Don Rubin (one of the originators of the multiple imputation concept). You can try larger numbers of imputed data sets until you find the mean converges, but 6 is often enough.

3. You can construct a joint model for the covariate distribution and the PKPD model. This means including all the covariates as DV values and constructing a model (usually quite simple) to predict each covariate.

The first method is simple but ignores correlation between covariates. The second and third methods allow you to account for the covariance of covariates.
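Methods 1 and 2 above can be sketched outside NONMEM before the data set is built. The following is only a minimal illustration in Python, not code from the thread: the height values, the function names and the stand-in "estimate" (the mean height of each completed data set, used in place of a real model fit) are all assumptions made for the example. The draws come from the observed marginal distribution of a single covariate, so this sketch does not capture correlation between covariates.

```python
import random
import statistics


def impute_median(values):
    """Method 1: replace each missing value (None) with the observed median."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def impute_draws(values, n_sets, rng):
    """Method 2: build n_sets completed copies, filling each missing value
    with a random draw from the observed (empirical) values."""
    observed = [v for v in values if v is not None]
    return [
        [rng.choice(observed) if v is None else v for v in values]
        for _ in range(n_sets)
    ]


def pool_estimates(estimates):
    """Pool the per-data-set estimates of one parameter by taking their mean."""
    return statistics.fmean(estimates)


# Hypothetical heights in cm; None marks a missing value.
heights = [172.0, None, 165.0, 180.0, None, 158.0, 169.0]

median_filled = impute_median(heights)

rng = random.Random(2005)  # fixed seed so the example is repeatable
completed_sets = impute_draws(heights, n_sets=6, rng=rng)

# Stand-in for fitting the model to each data set: here the "estimate"
# is simply the mean height of the completed data set.
stand_in_estimates = [statistics.fmean(s) for s in completed_sets]
pooled = pool_estimates(stand_in_estimates)
```

In practice each completed data set would be written out and fitted with NONMEM, and the pooled value would be the mean of the six NONMEM estimates for each parameter.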
It can be tricky to know what to do when you have both categorical and continuous variables. If you have lots of patients in each category, e.g. the category is sex and about half of the sample is male, then you can construct two multivariate normal distributions for the continuous covariates (one for males and one for females). If you have many categories then you can try treating the categorical covariate as if it were a continuous value. Stacey Tannenbaum (stacey.tannenbaum@pharma.novartis.com) and Ivan Matthews (Ivan.Matthews@postgrad.manchester.ac.uk) have both worked on this method and may be able to help you.

If you use method 3 then you will need to be aware of the isolated eta bug (see http://www.metrumrg.com/publications/LG_ETAbug_full.pdf for details) and use the 'zeta transform' of ETAs in order to capture the correlation between covariates when predicting missing values.

Nick
_______________________________________________________

From: "Mats Karlsson" mats.karlsson@farmbio.uu.se
Subject: RE: [NMusers] Missing mixed continuous and categorical data
Date: Wed, May 4, 2005 4:10 am

Hi,

We have recently used a mixture model approach for missing categorical covariates (see code below). If you have category observations in some but not all individuals, the parameter values for the different categories and the corresponding mixture components will be the same (i.e. a known EM and a mixture-model-assigned EM will have the same typical CL value). The observed (or known from literature) subpopulation frequencies can be used as fixed parameters. Thus, no additional parameters need to be estimated.

This approach can be extended (I think, but have not tried) to the situation where both a continuous and a categorical covariate are missing. If they are not correlated, the extension is trivial. For the continuous covariate you can use the "data" method (for a discussion on implementation of that method see J Pharmacokinet Biopharm.
1998 Apr;26(2):207-46) and, for the categorical covariate, the method illustrated in the code below.

If they are correlated (and either or both can be missing), the situation is trickier. However, that too can be done. Let's assume that the covariates are instead SEX and WT and that you want to model CL with different WT relations for the two sexes:

TVCL=THETA(1)*WT
IF(SEX.EQ.1) TVCL=THETA(2)*WT

;Let's assume the following:
WT(males) = 75 (mean), 10 (SD BSV), 0 (SD measurement error)
WT(females) = 65 (mean), 10 (SD BSV), 0 (SD measurement error)
Males=50% of target population
Categories for missing data are NOMISS, SEXMISS, WTMISS and BOTHMISS. Everyone has a "1" in one of these categories and a "0" in the other three. For the solution below it is assumed that there are no BOTHMISS.
SEX=0/1 for males/females
Assume $SIGMA and $OMEGA fixed to 1

Then an extended model would be:

IF(NOMISS.EQ.1) THEN
  TVCL=THETA(1)*WT
  IF(SEX.EQ.1) TVCL=THETA(2)*WT
ENDIF
IF(WTMISS.EQ.1) THEN
  MWT=75+ETA(1)*10
  FWT=65+ETA(1)*10
  TVCL=THETA(1)*MWT
  IF(SEX.EQ.1) TVCL=THETA(2)*FWT
ENDIF
IF(SEXMISS.EQ.1) THEN
  IF(MIXNUM.EQ.1) TVCL=THETA(1)*WT
  IF(MIXNUM.EQ.2) TVCL=THETA(2)*WT
ENDIF

$MIX
NSPOP=2
P(1)=EXP(0.0111+(WT-70)*0.0922)/(1+EXP(0.0111+(WT-70)*0.0922))
P(2)=1-P(1)
ENDIF
;the equation of P(1) comes from a logistic regression based on the
;different WT distributions for males and females

;----
Here is the code for the different genotype (VHAB) influences on CL:

IF(VHAB.EQ.1.OR.VHAB.EQ.-99) CLVHAB = 0
IF(VHAB.EQ.2) CLVHAB=THETA(8)
IF(VHAB.EQ.3) CLVHAB=THETA(9)
IF(MIXNUM.EQ.2) THEN
  IF(VHAB.EQ.-99) CLVHAB =THETA(8)
ENDIF
IF(MIXNUM.EQ.3) THEN
  IF(VHAB.EQ.-99) CLVHAB =THETA(9)
ENDIF
CL=TVCL*(1+CLVHAB)*EXP(ETACL)

$MIX
NSPOP=3
P(1)=0.253
P(2)=0.552
P(3)=0.195

P(1) to P(3) are the observed frequencies of wild type, heterozygous and homozygous variants in the studied population.
;-------

Best regards,
Mats
--
Mats Karlsson, PhD
Professor of Pharmacometrics
Div. of Pharmacokinetics and Drug Therapy
Dept.
of Pharmaceutical Biosciences
Faculty of Pharmacy
Uppsala University
Box 591
SE-751 24 Uppsala
Sweden
phone +46 18 471 4105
fax +46 18 471 4003
mats.karlsson@farmbio.uu.se
_______________________________________________________
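As a check on the P(1) expression in the SEXMISS example from Mats's message, the assumptions stated there (WT normally distributed with mean 75 for males and 65 for females, SD 10 for both, 50% males) imply through Bayes' rule a logistic function of WT with intercept 0 and slope 0.1 around WT = 70. The Python sketch below is an illustration only; the function names are hypothetical. It compares that closed form with the coefficients 0.0111 and 0.0922 posted in the thread.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def p_male_bayes(wt, prior_male=0.5):
    """P(male | WT) from Bayes' rule under the thread's stated assumptions:
    WT ~ N(75, 10) for males, N(65, 10) for females, 50% males."""
    num = prior_male * normal_pdf(wt, 75.0, 10.0)
    den = num + (1.0 - prior_male) * normal_pdf(wt, 65.0, 10.0)
    return num / den


def p_logistic(wt, intercept, slope):
    """Logistic form used for P(1) in the $MIX block, centred at WT = 70."""
    z = intercept + (wt - 70.0) * slope
    return math.exp(z) / (1.0 + math.exp(z))


weights = range(40, 101, 5)

# Closed form implied by the assumptions: intercept 0, slope 0.1.
max_gap_closed_form = max(
    abs(p_male_bayes(w) - p_logistic(w, 0.0, 0.1)) for w in weights
)

# Coefficients as posted in the thread.
max_gap_posted = max(
    abs(p_male_bayes(w) - p_logistic(w, 0.0111, 0.0922)) for w in weights
)
```

The closed form matches the Bayes calculation to rounding error, and the posted coefficients stay within a few percentage points of it over this weight range, which is consistent with P(1) being the probability of the male subpopulation (MIXNUM=1) given the observed WT.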