From: "Charlotte van Kesteren" <>
Subject: logistic regression
Date: Fri, 14 Sep 2001 16:14:58 +0200

Dear NONMEM-users,

We have AUC-values and toxicity data of a total of 143 individuals,
originating from 4 studies with different treatment schedules. The toxicity
data are dichotomous, i.e. the adverse effect either occurs (1) or it does not
(0). We have one data point for each patient.

With logistic regression in NONMEM, we have tried to model the relation
between exposure and the chance of toxicity. Furthermore, we want to
investigate a possible schedule dependency in this relation.

However, we are not sure whether it is appropriate to estimate interindividual
variability with logistic regression with only one observation per individual.
Furthermore, how can we judge goodness of fit with such a data set? Does
anyone have experience with this kind of analysis?

Thank you in advance for your help.
Best regards,
Charlotte van Kesteren





From: Lewis B Sheiner <>
Subject: Re: logistic regression
Date: Fri, 14 Sep 2001 08:41:29 -0700

Can't be done.

Lewis B Sheiner, MD
Professor: Lab. Med., Biophmct. Sci., Med.
Box 0626, UCSF, SF, CA, 94143-0626
415-476-1965 (v), 415-476-2796 (fax)





Subject: Re: logistic regression
Date: Fri, 14 Sep 2001 13:08:52 -0400

I have performed logistic regression with similar types of data. You could
simply perform the analysis in either S-plus or SAS.

BTW, an excellent text on LR is Hosmer and Lemeshow, Applied Logistic
Regression. This text covers many of the questions that you have asked
(none of which are really simple) and is very readable.


Michael J. Fossler
Associate Director
Drug Metabolism and Pharmacokinetics, DuPont Pharmaceuticals
(302) 366-6445
Cell: (302) 584-5495






From: "Piotrovskij, Vladimir [JanBe]" <>
Subject: RE: logistic regression
Date: Mon, 17 Sep 2001 14:05:46 +0200


It is possible to solve some of your problems in NONMEM. However, the best
way is to apply generalized linear regression using one of the statistical
packages.

With NONMEM, try the following control stream:

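$PROB dichotomous response: fixed effect of schedule
$DATA nmd.ssc
; NOTE: the $INPUT, $PRED, $THETA and $EST records and the definition of P
; are missing from the archived post; lines marked "assumed" are a reconstruction.
$INPUT ID AUC SCHD DV ; assumed column names and order
$PRED
; schedule coded as 1,2,3, etc.
SCHD1 = 0
SCHD2 = 0
SCHD3 = 0
SCHD4 = 0
IF (SCHD.EQ.1) SCHD1 = 1
IF (SCHD.EQ.2) SCHD2 = 1
IF (SCHD.EQ.3) SCHD3 = 1
IF (SCHD.EQ.4) SCHD4 = 1
SIGM = THETA(1) ; assumed
E50 = THETA(2)*SCHD1 + THETA(3)*SCHD2 + THETA(4)*SCHD3 + THETA(5)*SCHD4 ; assumed
E50 = E50*EXP(ETA(1)) ; assumed placement of ETA(1); its variance is tiny ($OMEGA .0001)
P = AUC**SIGM/(AUC**SIGM + E50**SIGM) ; assumed sigmoid model
IF (DV.EQ.1) Y=P
IF (DV.EQ.0) Y=1-P
$THETA
(2 5 7); 1 SIGM
(0 30 50); 2 E50 SCHD=1
(20 50 70); 3 E50 SCHD=2
(40 70 90); 4 E50 SCHD=3
(60 100 200); 5 E50 SCHD=4
$OMEGA .0001
$EST METHOD=COND LAPLACE LIKE ; assumed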


I tested it using simulation-fitting. Note that you need a sufficient number
of individuals per schedule to identify all the parameters with sufficient
precision. In my simulation I included 20 individuals per schedule and it
was OK.
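
For readers without NONMEM at hand, the likelihood that the Y=P / Y=1-P lines
encode can be sketched in a few lines of Python. This is only an illustration,
not the code used for the simulation-fitting mentioned above. It assumes the
sigmoid form P = AUC**SIGM/(AUC**SIGM + E50**SIGM), which is not shown in the
post as archived, and the function names and toy data are hypothetical.

```python
import math

def p_tox(auc, sigm, e50):
    """Assumed sigmoid model: AUC**SIGM / (AUC**SIGM + E50**SIGM)."""
    return auc**sigm / (auc**sigm + e50**sigm)

def minus_two_log_lik(records, sigm, e50_by_schedule):
    """-2 log-likelihood for (auc, schedule, dv) records, dv being 0 or 1."""
    total = 0.0
    for auc, schd, dv in records:
        p = p_tox(auc, sigm, e50_by_schedule[schd])
        # Same rule as Y=P for DV=1 and Y=1-P for DV=0 in the control stream
        y = p if dv == 1 else 1.0 - p
        total += -2.0 * math.log(y)
    return total

# Hypothetical toy data (AUC, schedule, DV); the values are made up.
toy = [(20.0, 1, 0), (30.0, 1, 1), (45.0, 2, 0), (80.0, 2, 1)]
ofv = minus_two_log_lik(toy, sigm=5.0, e50_by_schedule={1: 30.0, 2: 50.0})
ofv_bad = minus_two_log_lik(toy, sigm=5.0, e50_by_schedule={1: 5.0, 2: 5.0})
print(ofv, ofv_bad)
```

A schedule effect shows up as a drop in this objective when each schedule is
given its own E50 rather than a single shared value.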

Best regards,

Vladimir Piotrovsky, Ph.D.
Research Fellow
Global Clinical Pharmacokinetics and Clinical Pharmacology (ext. 5463)
Janssen Research Foundation
B-2340 Beerse





From: "James Bailey" <>
Subject: logistic regression
Date: Tue, 18 Sep 2001 16:24:59 -0500

I believe the difficulty with logistic regression for sparse dichotomous
data can be well appreciated by considering the case of binary data (for
example, loss of responsiveness with an intravenous anesthetic) with one
data point per patient. The probability of a positive drug effect is
given by

P = C**gamma/(C**gamma + C50**gamma) (1)

This is equivalent to a model which postulates an underlying continuous
drug effect E given by

E = gamma*ln(C/C50) + epsilon (2)

where epsilon is a random variable with a logistic distribution. It is
further postulated that a positive binary drug effect is observed if

E > 0

The probability of positive binary drug effect is equal to the
probability that epsilon is greater than -gamma*ln(C/C50), and using the
definition of the logistic distribution one can easily derive equation (1).
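
The step from the threshold condition to equation (1) can be written out
explicitly. The sketch below assumes epsilon follows the standard logistic
distribution (location 0, scale 1), with cumulative distribution function F;
the shorthand x is introduced here only for the derivation.

```latex
\begin{aligned}
x &= \gamma \ln\!\left(\frac{C}{C_{50}}\right), \qquad
F(z) = \frac{1}{1 + e^{-z}} \\
P(E > 0) &= P(\varepsilon > -x) = 1 - F(-x)
          = 1 - \frac{1}{1 + e^{x}} = \frac{e^{x}}{1 + e^{x}} \\
         &= \frac{(C/C_{50})^{\gamma}}{1 + (C/C_{50})^{\gamma}}
          = \frac{C^{\gamma}}{C^{\gamma} + C_{50}^{\gamma}}
\end{aligned}
```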

Now consider interpatient variability and assume that

ln(C50) =ln(<C50>) + eta

where <C50> is the "typical value" and eta is normally distributed.
Substituting into (2) gives

E = gamma*ln(C/<C50>) - gamma*eta + epsilon

Because eta is symmetric about zero, -gamma*eta has the same distribution
as gamma*eta. In this case the probability of a positive binary drug effect
is therefore equal to the probability that the random variable
gamma*eta + epsilon is greater than -gamma*ln(C/<C50>).

However, consider the situation where epsilon conforms to a normal
distribution instead of a logistic distribution. Then gamma*eta +
epsilon also has a normal distribution and it is impossible to determine
the relative contributions of eta and epsilon to the overall variance.
In this situation it is impossible to do a complete analysis of binary
data with one data point per patient. This, of course, corresponds to
probit analysis but it makes the difficulty apparent. The normal and
logistic distributions are not that different. Doing a population
analysis of sparse binary data depends on the ability to distinguish
between the two distributions and will be almost impossible.
Furthermore, it rests on the assumption of an underlying logistic
distribution for the intrapatient variability (in epsilon), and there is
little basis for this assumption.
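
The identifiability argument can be checked numerically. The sketch below
assumes epsilon ~ N(0, 1) and eta ~ N(0, omega**2), so that the response
probability depends on gamma and omega only through
gamma/sqrt(1 + gamma**2*omega**2). The two parameter sets are hypothetical
and chosen to share that ratio. The last lines compare the logistic CDF with
a normal CDF rescaled by 1.702, a commonly used scaling constant (an
assumption of this example, not something taken from the argument above).

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_probit(c, c50, gamma, omega):
    """P(E > 0) when epsilon ~ N(0, 1) and eta ~ N(0, omega**2).

    gamma*eta + epsilon is then normal with variance gamma**2*omega**2 + 1.
    """
    sd = math.sqrt(gamma**2 * omega**2 + 1.0)
    return phi(gamma * math.log(c / c50) / sd)

# Two different parameter sets with gamma / sqrt(1 + gamma^2 omega^2) = 2
no_iiv = dict(gamma=2.0, omega=0.0)
with_iiv = dict(gamma=4.0, omega=math.sqrt(3.0) / 4.0)

concs = [0.25, 0.5, 1.0, 2.0, 4.0]
max_gap = max(
    abs(p_probit(c, 1.0, **no_iiv) - p_probit(c, 1.0, **with_iiv))
    for c in concs
)

def logistic_cdf(x):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-x))

# Largest gap between the logistic CDF and the rescaled normal CDF on a grid
grid = [i / 100.0 for i in range(-800, 801)]
max_cdf_diff = max(abs(logistic_cdf(x) - phi(x / 1.702)) for x in grid)
print(max_gap, max_cdf_diff)
```

The two parameter sets give the same probability at every concentration, so
one observation per patient cannot separate them in the probit case, and the
small gap between the two CDFs suggests why the logistic case is hardly
better.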

I and my colleague Wei Lu have done some simulations and our results
indicate that 5 to 10 data points per patient are necessary to
estimate <C50> or gamma with any degree of reliability.

Jim Bailey