From: Justin Wilkins jwilkins@uctgsh1.uct.ac.za Subject: [NMusers] $OMEGA blocks and log-likelihood profiling Date: Mon, May 31, 2004 9:26 am Hi all, When doing a log-likelihood profile of a model containing an $OMEGA BLOCK(n) structure, what is best practice for mapping ETAs that are defined by this kind of parameterization? It's my understanding that only the entire block can be fixed... Justin -- ------------------------------------------- Justin Wilkins ------------------------------------------- Division of Clinical Pharmacology Department of Medicine University of Cape Town ------------------------------------------- K45 Old Main Building Groote Schuur Hospital Observatory 7925 South Africa ------------------------------------------- Email: jwilkins@uctgsh1.uct.ac.za Phone: +27 (0)21 406 6448 Fax: +27 (0)21 448 1989 Web: http://www.uct.ac.za/depts/pha/ _______________________________________________________ From: Nick Holford n.holford@auckland.ac.nz Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Tue, June 1, 2004 6:23 am Justin, My preference is to bootstrap rather than do lots of LLPs. The bootstrap gives you confidence intervals without the chi-square null hypothesis assumption (known to be commonly wrong) required to interpret LLPs. The bootstrap will of course let you compute confidence intervals for all your parameters (and secondary parameters derived from the primary parameters). Nick -- Nick Holford, Dept Pharmacology & Clinical Pharmacology University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556 http://www.health.auckland.ac.nz/pharmacology/staff/nholford/ _______________________________________________________ From: mark.e.sale@gsk.com Subject: RE: [NMusers] $OMEGA blocks and log-likelihood profiling Date: Tue, June 1, 2004 6:55 am Nick, I like your suggestion that bootstrap is a better way to determine CI than LLP. The other advantage is speed, one set of bootstrap samples will give you CI for all parameters, rather than need to do a set for each parameter. But a question (or two). I know you've been an advocate that the covariance step success isn't always necessary to believe the results (and I'm beginning to agree with you). So, I assume you'd be willing to include NONMEM runs that failed covariance step in your bootstrap interval. But, what about other failures? Do you include termination due to rounding error? How do you protect against local minima? And then there are the ones that simply crash. We do this a lot as well (so far we have discarded the runs that fail covariance step), but are concerned that the runs that fail are the very ones that we're most interested in (the ones that define the tails where one or more parameters is poorly estimated), so eve if they are only 2%, they may be critical in defining the CI. Mark _______________________________________________________ From: Leonid Gibiansky lgibiansky@emmes.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Tue, June 1, 2004 10:43 am Justin I'll leave the bootstrap/LLP comparison to others, but if you really want to do LLP for OMEGA, you need to express those in terms of THETA parameters. For diagonal OMEGA matrix it is trivial: $PK CL=THETA(1)*EXP(SQRT(THETA(11))*ETA(1)) V=THETA(2)*EXP(SQRT(THETA(22))*ETA(2)) ..... $OMEGA 1 FIXED 1 FIXED Here you can LLP THETA(11) and THETA(22) that are equivalent to OMEGA11, OMEGA22 For off-diagonal elements it is more tricky. One possibility is to express the model in terms of META1=THETA()*ETA(1) META2=THETA()*ETA(1)+THETA()*ETA(2) META3=THETA()*ETA(1)+THETA()*ETA(2)+THETA()*ETA(3) ...... CL=THETA()*EXP(META1) etc. and profile those THETA-coefficients (this is also the way to check for significance of each of the off-diagonal elements/correlations via LL). OMEGA elements are easily expressed in terms of THETAs. Leonid _______________________________________________________ From: Nick Holford n.holford@auckland.ac.nz Subject: RE: [NMusers] $OMEGA blocks and log-likelihood profiling Date: Tue, June 1, 2004 5:15 pm Mark, You will find some anecdotal evidence about the influence of including all runs or only successful runs for bootstrap CI here: http://www.cognigencorp.com/nonmem/nm/99jul152003.html Some remarks I made then were: "I bootstrapped the original data set using the preferred model and found 28% of bootstrap runs minimized successfully and 7.1% ran the $COV step. " "The mean of the parameters obtained from all bootstrap runs and the mean from those which ran the $COV step were all within 2%." "95% confidence intervals obtained from all the bootstrap runs were very similar to those obtained from minimization successful and $COV successful runs." "I computed the ratio of the mean standard error from the $COV successful runs to the bootstrap standard error obtained from all runs. For THETA:se estimates the $COV SE was on average 3% smaller but for OMEGA:se the $COV SE was 58% larger than the overall bootstrap SE." I would add that because $COV ran on some of the bootstrap samples the failure of NONMEM to run $COV does not mean the model is somehow badly formed (e.g. overparameterized). It must mean that NONMEM fails $COV because of the data because the only difference between runs that ran $COV and those that did not was a random sample of the data. BTW key parts of the results of the bootstrap exercise I described last year are on-line and will appear soon in print in BJCP: http://www.blackwell-synergy.com/links/doi/10.1111/j.1365-2125.2004.02114.x/full Nick -- Nick Holford, Dept Pharmacology & Clinical Pharmacology University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556 http://www.health.auckland.ac.nz/pharmacology/staff/nholford/ _______________________________________________________ From: "Kowalski, Ken" Ken.Kowalski@pfizer.com Subject: RE: [NMusers] $OMEGA blocks and log-likelihood profiling Date: Wed, June 2, 2004 9:16 am Nick, Nmusers, Regardless of whether the $COV step fails, I have a hard time understanding why only 28% of the bootstrap runs minimized successfully if it wasn't due to over-parameterization. Note that over-parameterization does not mean that the model is badly formed. It could indeed be a correctly specified model (e.g., no lack of fit) just that the existing data do not support estimating all of the parameters. Ken _______________________________________________________ From: "Gastonguay, Marc" marc.gastonguay@snet.net Subject: RE: [NMusers] $OMEGA blocks and log-likelihood profiling Date: Wed, June 2, 2004 10:27 am Ken, Nick, Mark, NMusers: In addition to problems with over-parameterization, it's possible that a large fraction of bootstrap runs can fail (e.g. minimization unsuccessful) even when the model is not over-parameterized: particularly when one of the conditional estimation methods is used (e.g. methods other than FO). I've learned from Tom Ludden that NONMEM V can sometimes run into local minima when searching for individual ETA estimates - resulting in convergence difficulties for the population model. I suspect that this may be happening with a fraction of the bootstrap failures under conditional estimation. I've found that placing constraints on the individual ETA estimates through PRED-exit codes can be helpful in dramatically reducing the fraction of failed bootstrap runs. For example (assuming an exponential inter-individual variance model and 3 variance terms in OMEGA): IF(ABS(ETA(1)).GT.10) EXIT 1 102 IF(ABS(ETA(2)).GT.10) EXIT 1 103 IF(ABS(ETA(3)).GT.10) EXIT 1 104 ... (also use NOABORT option in $EST) I hope this is helpful. Marc Gastonguay _______________________________________________________ From: "Kowalski, Ken" Ken.Kowalski@pfizer.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Wed, June 2, 2004 11:19 am Marc, Nmusers, I just have a hard time believing that this is the primary reason when so large a fraction (78%) fail to converge, and for those that do converge the $COV step fails. Certainly there are a number of reasons why we may encounter minimization failures but I would first rule-out over-parameterization before I pursued other explanations when such a large fraction fail. It would be of interest to review the $COV step output for some of the bootstrap runs of the 7% that had successful $COV estimation to see if the diagnostics suggest that the model is over-parameterized for these bootstrap runs. For example, are there extremely large SEs for one or more of the parameters, are there any pairwise correlations near +/-1, etc? If so, then over-parameterization would remain high on my list of potential reasons for the large fraction of minimization failures. Ken _______________________________________________________ From: jeffrey.a.wald@gsk.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Wed, June 2, 2004 3:04 pm I am with Ken on this one. I am not too troubled by the lack of the $COV, but the low fraction of convergence is bothersome. Whatever the cause, I have a hard time reasoning what the bootstrap actually means in this case. The sample of parameter estimates is effectively censored by a presumably non-random filter (vis. NONMEM). Marc's suggestion sounds pragmatic, but I still have to think that something more systematic is occurring with the structural model. If some parameter/s is/are not identifiable in a large fraction of the runs then my feeling is that the 'correct' bootstrap distribution for one or more of the parameters would have a large fraction of its values (up to 72%?) sitting on 1 or 0 or whatever its null value would be. Jeff Jeff Wald, PhD jeffrey.a.wald@gsk.com Clinical Pharmacokinetics/Modeling and Simulation Neurology and GI RTP, NC _______________________________________________________ From: "Gastonguay, Marc" marc.gastonguay@snet.net Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Wed, June 2, 2004 4:56 pm No argument here, either. When the primary problem is over-parameterization, constraining ETA estimates probably won't help much. In other cases, however, constraining individual ETA estimates can be very helpful with convergence in NONMEM. I have one example where constraining ETAs (within broad, realistic bounds) reduced the bootstrap run convergence failure rate from near 50% to less than 1%. Marc _______________________________________________________ From: Nick Holford n.holford@auckland.ac.nz Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Thu, June 3, 2004 6:58 am Hi, Thanks to all of you for your opinions. Ken does not consider over-parameterization is a form of malformed model (but I think this is hairsplitting). However, it is my impression is that it is not the model per se that is causing the failure to converge. Random (non-parametric bootstrap) samples of the original data are able to use the same model to converge and in some cases complete the $COV step. So this means to me that the failure is not a systematic feature of the model. It might however be a systematic problem with the NONMEM code or the compiler/processor combination which may take different execution paths depending on the numbers it has to work with. I looked at one of the bootstrap runs that ran the $COV step. The highest CV% (SEE/estimate*100) was 85% for between occasion variability in V1 but all the rest were less than 28%). The highest estimation error correlation was 0.93 (between renal and non-renal clearance) but not enough to declare overparameterization (See Ken's contribution http://www.cognigencorp.com/nonmem/nm/99may012003.html). It also has eigenvalues ranging from 7.35E-03 to 5.26 suggesting ill-conditioning is not severe (once again see Ken's comments). Do these values invalidate the model's ability to describe and predict the influence of covariates in predicting individual clearance values (the main point of the model building study)? I also tried Marc's suggestion to use EXIT to avoid high etas. Exiting if ETA was >=5 did not let the model complete its initialization phase. Q and V2 caused exits if ETA was >= 5.5 and <10. The runs terminated in the same way as the unconstrained run on the original data "DUE TO PROXIMITY OF LAST ITERATION EST. TO A VALUE AT WHICH THE OBJ. FUNC. IS INFINITE (ERROR=136) AT THE LAST COMPUTED INFINITE VALUE OF THE OBJ. FUNCT.". I only tried this on the original dataset. It took about a month to do the unconstrained bootstrap runs and I have other things to do right now. With regard to Jeff's comment about overparameterized bootstrap runs tending to have many "sitting on its null value" -- this certainly wasn't the case for the parameters defining the covariate effects. But it's not clear to me what he means by the 'correct' bootstrap distribution. It seems to me that the performance of a model should be judged by it's purpose and not by criteria determined by numerical idiosyncrasies. NONMEM V is a bit of a dog when it comes to local minima as Marc reminds us and the conditional methods are clearly less robust (but nevertheless IMHO more believable). So I prefer to base decisions about model performance in how good the fit is by assessment of its merits in the intended application rather than the ability to jump over certain arbitrary hurdles. Nick -- Nick Holford, Dept Pharmacology & Clinical Pharmacology University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556 http://www.health.auckland.ac.nz/pharmacology/staff/nholford/ _______________________________________________________ From: jeffrey.a.wald@gsk.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Thu, June 3, 2004 7:59 am Nick My point referenced what might have been seen in the distributions of parameter estimates if the runs that failed to converge had actually provided some information...assuming that the full model was not identifiable in this set of runs. What you have seen in the parameters defining the covariate effects is not surprising as the runs that potentially show the over-parameterization are the ones that are censored out of the distribution. The 'correct' bootstrap distribution statement refers to the distribution of parameter estimates that makes an appropriate statement about the uncertainty that we have in the joint distribution of parameter values. By potentially ignoring a large fraction of the plausible parameter values - that is assuming a simpler model would have given a plausible answer - we risk assuming that we know more than we really do. Most of this string has been based on the premise that the model is over parameterized. I think that proposing technical solutions detracts from the philosophical point at hand. Even if the model is not over parameterized, the fact that so many runs fail to converge tells me something about the uncertainty in the model, its parameter estimates, or the tools we have to evaluate the model. I do not necessarily know how to quantify that uncertainty, but I have a hard time dismissing it. Jeff _______________________________________________________ From: "Kowalski, Ken"Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Thu, June 3, 2004 12:40 pm Nick, Jeff, Marc, Nmusers, I don't think it is hairsplitting depending on your definition of a "badly formed" or "malformed" model. I tend to equate such statements with a poor fitting model. My point is that good fitting models can suffer from the effects of over-parameterization just as much as poor fitting models. I don't see how you can conclude because 28% of the bootstrap models converged that the failure of the other 72% is not related to some systematic feature of the model (although I wouldn't characterize it as some deficiency in a systematic feature of the model as this might imply that the model is poorly fitting but rather I would characterize it as a possible limitation of the data to support the model). Your latest COV step information regarding the 7% that converged with a successful COV step is encouraging from the standpoint that over-parameterization is not an issue for these bootstrap runs but Jeff makes a good point that the ones that did not converge are censored out from this evaluation. Perhaps the 93% where the estimation and/or COV step failed is still related to problems with over-parameterization. For example, if your dataset is based on a pooled analysis with several studies with varying designs where only small portions of the data from 1 or 2 studies provide information on some key parameter then a basic bootstrap resampling scheme that does not stratify by study and/or these key treatment/design features when performing the resampling from the original dataset may result in bootstrap datasets that under-represent key data necessary to support the model. I'm not saying that this is the issue you are encountering but it certainly is something I would investigate if I was encountering such a large failure rate in my bootstrap runs. I agree with Jeff on his philosophical point that we want to use bootstrapping to characterize the uncertainty in our parameters and having a large fraction fail is somewhat disconcerting about our ability to characterize this uncertainty. I always feel more enlightened when I can identify the root cause for convergence/COV step failures. From my own experience these failures are often related to some aspect of over-parameterization in elements of theta, Omega or Sigma. I'm not always successful in resolving these failures, but thinking through possible limitations of the data to support the model and diagnostic NONMEM runs specifically trying to resolve the convergence/COV step failures are worth the effort IMHO. With regards to NONMEM V's estimation methods being a dog I think you are being too harsh. I'm not fully informed on Marc's or Tom's work with FOCE and problems with convergence to local minima but I'm willing to bet these problems are especially pronounced when fitting models that are somewhat over-parameterized. I draw analogy to Pete Bonate's exercise where he showed that sensitivity to compiler/NONMEM installations was most pronounced when fitting ill-conditioned (over-parameterized) models. I think a lot of things can go wrong with NONMEM when we push the data too hard in supporting the models we fit. Statements that many of you make trivializing the importance of trying to get a successful COV step and just as important, to actually review the COV step output just galvanizes my thinking that the effects of over-parameterization are too often ignored. I agree with you that at the end of the day we need to develop good fitting models that meet the purposes/intended use of the model. However, I disagree that achieving convergence and successful COV steps in the majority of the bootstrap runs is an arbitrary hurdle...it's good science. Ken _______________________________________________________ From: "Mats Karlsson" mats.karlsson@farmbio.uu.se> Subject: RE: [NMusers] $OMEGA blocks and log-likelihood profiling Date: Sat, June 5, 2004 3:27 am Dear Ken, Jeff, Nick and all, It seems that in the latest discussion, Nick's original observation that parameter estimates were the same, regardless of successful COV step or not has been forgotten. Thus, Ken's suggestion below, that somehow the 7% data sets with successful COV step may contain some information that the other 93% data sets would not, does not seems plausible. If that had been the case, at least one parameter would have displayed a different mean and/or considerably more variability in the runs with failed COV-step than in those with successful COV step. Jeff is worried by so many bootstrap runs fail to converge and that therefore the bootstrap distribution of parameter estimates would be censored. However, no censoring appears to have taken place as Nick does report on the distribution of *all* parameter estimates (even non successful terminations such as "Rounding Errors dominating" will provide a set of parameter estimates). As the type of termination (whether successful COV step or not, successful minimization or not) was not of importance for the resulting distribution of parameter estimates, and local minima related to the use of the same initial estimates was not a problem (which I think Nick said it wasn't), I think it much supports Nicks notion that there is no correlation (in this case) between termination status and parameter estimates. On a general note, we know that with poor models (overparametrisation, misspecification) we can see failed convergence or failed covariance steps, whereas for appropriate models, supported by adequate data the opposite is true. We use the convergence and covariance steps of NONMEM to diagnose when we're moving from one type of problem (the appropriate) to another. However, Nick's results strongly suggest that these diagnostics are not appropriate in this case. This should not come as a surprise. We have here a situation where the COV step diagnostic is not perfect. Most of us have had similar experiences before. Nick seems to draw the conclusion that it therefore is useless. I don't agree with this, but still see COV step as a diagnostic with value even if we should be cautious in accepting both positive and negative results from it. Thus, with access to results from a properly carried out bootstrap, I would not care about COV-step failure or not, just as I generally would not pay attention to NONMEM's approximate SEs if I had bootstrap confidence intervals. COV-step is a cheap (in a positive sense) and often informative diagnostic, but yet not as informative as the more expensive bootstrap. Best regards, Mats -- Mats Karlsson, PhD Professor of Pharmacometrics Div. of Pharmacokinetics and Drug Therapy Dept. of Pharmaceutical Biosciences Faculty of Pharmacy Uppsala University Box 591 SE-751 24 Uppsala Sweden phone +46 18 471 4105 fax +46 18 471 4003 mats.karlsson@farmbio.uu.se _______________________________________________________ From: Nick Holford Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Sat, June 5, 2004 4:52 am Mats, thank you for re-emphasising my earlier assertion. So everyone can make their own judgement of the similarity of the bootstrap average and stdev obtained with all runs, minimization successful runs and covariance step successful runs here are the parameters of a 2 compartment model (BSV* are sqrt(OMEGA), CVAG and SDAG are sqrt(SIGMA), R12 and R34 are correlations of random effects of CL and V and V2 and Q) plus some covariate parameters. Param avg-all avg-sxs avg-cov sd-all sd-sxs sd-cov CL 4.70 4.68 4.64 0.12 0.13 0.11 Vss 30.8 31.1 30.6 1.75 1.68 1.68 CLNR 1.12 1.12 1.11 0.24 0.25 0.30 CLR 3.56 3.56 3.53 0.23 0.24 0.27 V1 19.52 19.53 19.53 0.26 0.25 0.22 Q 1.02 1.01 1.00 0.11 0.10 0.09 V2 11.28 11.29 11.03 1.52 1.52 1.66 BSVCL 0.30 0.29 0.29 0.017 0.016 0.019 BSVV1 0.24 0.24 0.24 0.014 0.014 0.013 BSVQ 0.82 0.82 0.84 0.081 0.076 0.076 BSVV2 1.17 1.20 1.17 0.22 0.21 0.21 CVAG 0.16 0.16 0.16 0.006 0.006 0.005 SDAG 0.14 0.14 0.15 0.025 0.024 0.020 R12 0.55 0.55 0.55 0.059 0.064 0.058 R34 0.90 0.91 0.90 0.085 0.082 0.052 KAGECR 111.9 111.8 112.4 5.88 6.39 7.63 FSEXCR 0.82 0.82 0.82 0.033 0.032 0.027 KAGERF 119.0 119.1 118.4 8.05 7.95 7.25 FCPRLO 0.70 0.70 0.71 0.030 0.031 0.031 I don't contend that NONMEM's asymptotic standard errors are useless. But failure of the $COV step is not a reliable sign that the parameters are not trustworthy (see above). Equally I believe that success of the $COV step implies little more than good (numerical) luck. The SE estimates of the $COV step are a cheap and quick way of understanding something about estimation uncertainty but when they disagree with the bootstrap values I prefer the bootstrap. Nick -- Nick Holford, Dept Pharmacology & Clinical Pharmacology University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556 http://www.health.auckland.ac.nz/pharmacology/staff/nholford/ _______________________________________________________ From: "Kowalski, Ken" Ken.Kowalski@pfizer.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Tue, June 8, 2004 5:23 pm Mats, Nick, and all, I have not forgotten Nick's original premise that the parameter estimates were the same regardless of successful convergence or successful COV step. However, my preference is to understand why he has such a high convergence failure rate. In my opinion Nick has two choices, 1) he can try to understand why the failure rate is so high perhaps leading to a more stable model selection (if his model is over-parameterized) that resolves the high failure rate, or 2) he can verify that the empirical marginal distributions of the parameter estimates are the same between the successful and failed convergence runs. I don't believe Nick has provided sufficient information to assess the latter and I will respond directly to his message with what I think would provide compelling evidence that his bootstrap sample distribution is indeed independent of convergence status. I could be convinced with sufficient data that pooling bootstrap estimates from both the successful and failed convergence runs is OK for his particular example but Nick wants to generalize based on this one example to conclude that as long as we are doing bootstrapping we never have to worry about convergence and/or COV step failures and this is where I take exception. Suppose with another example with a high convergence failure rate the bootstrap distributions between the successful and failed runs are different. In this setting the analyst has no choice but to go back and try to figure out why he/she has such a high failure rate. This is where the COV step provides diagnostic information that may be helpful. In Nick's example, I wanted to try and understand why he has such a high convergence failure rate and the 7% that had a successful COV step do have additional information to assess the possible instability of the model at least with respect to these particular bootstrap datasets. It is in this regard that they contain more information than the 93% where the COV step failed. In a previous message Nick provided information from the COV step output that suggests the bootstraps runs for these 7% are indeed stable. That didn't have to be the case as the COV step can be successful and the model can still be unstable. It is for this reason that I agree with Nick that simple success or failure of the COV step alone is a poor indicator of the reliability/stability of the model. If the COV step output for these 7% had diagnostic information to suggest that these particular fits were unstable, then it would have been of interest to postulate alternative, more stable models that resolve this instability and see if that also resolves the high convergence failure rate as well. But in Nick's example the 7% percent appear to be stable which means diagnosing the reason for the high convergence failure rate is going to be more difficult. It could still be related to instability/over-parameterization of the model for the remaining 93% of bootstrap datasets but we would need more information from Nick regarding the design of his dataset and bootstrap sampling scheme to assess this. Ideally, one would be looking at the COV step output throughout the model building process before getting to the bootstrap phase to give one a better shot at not encountering such a high convergence failure rate when performing bootstrapping. In fact I often don't perform the COV step during bootstrapping as it can be impractical especially for models/data with long run-times. On the other hand, I don't often encounter such a high convergence failure rate when I perform bootstrapping either. I believe this is in part due to the concious effort I take to avoid instability in my model building. I agree with Mats that the COV step provides imperfect diagnostic information but that is the case with other diagnostics such as empirical Bayes estimation as well. Imperfect as the COV step information may be, it still provides valuable diagnostic information. That is not to say I would use the COV step output to make formal inference via confidence intervals because I generally don't, but I do think we should be reviewing the COV step output routinely to help guide model development. Ken _______________________________________________________ From: "Kowalski, Ken" Ken.Kowalski@pfizer.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Tue, June 8, 2004 5:27 pm Nick and all, I can be convinced of your assertion that it is OK to pool bootstrap results for all the runs including those with failed convergence but I would need additional information. The table below summarizing the means and SDs don't provide sufficient information. If you are planning to use these bootstrap results to report out confidence intervals then it is important to compare the tails of the distributions with and without the failed convergence runs. At a minimum you should calculate bootstrap CIs with and without the failed runs to see how they compare. However, since only 28% of your runs had successful convergence, you might be faced with very poor precision to estimate the tail percentiles for the bootstrap CIs based on the successful convergence runs alone. In which case you might consider performing quantile-quantile (Q-Q) plots comparing the order statistics between the empirical distributions of the bootstrap estimates for the failed versus successful runs. If these Q-Q plots are fairly concordant between say the 10th and 90th percentiles (80% of the distribution) then I would be inclined to believe that the empirical distribution of the bootstrap estimates is independent of convergence status and that any breakdown in the tails is probably due to poor precision. In this setting I would then conclude it is OK to pool both the successful and failed convergence results in reporting bootstrap CIs. Assuming such information supports your assertion, I would be careful not to over-generalize these results to suggest that one can always pool failed convergence results. I believe every bootstrap simulation where you encounter a high convergence failure rate has to be dealt with on a case-by-case basis and a similar exercise as above would have to be performed if you wanted to pool these results. Moreover, if for a particular example, the empirical distributions are different between the failed and successful runs then the analyst has to go back to the drawing board to try and diagnose the reason for the high convergence failure rate. Here is where we seem to be in disagreement regarding the value of the COV step. Equating success of the COV step with good luck suggests that success or failure of the COV step is a purely random event outside of our control...I strongly disagree. I agree with you that success or failure of the COV step alone provides insufficient information regarding the reliability of the estimates, and in general it is not a good idea to perform formal inference with CIs based on the COV step std errors, but we appear to disagree on the value of the COV step output as a diagnostic tool to help assess instability which may be the reason for the high convergence failure rate. I agree with Mats that the COV step is an imperfect diagnostic and certainly we can get COV step and convergence failures that are unrelated to instability (e.g., over-parameterization) but that doesn't mean it has no value as you seem to suggest. Ken _______________________________________________________ From: Leonid Gibiansky lgibiansky@emmes.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Tue, June 8, 2004 5:45 pm I think we need to discuss what is "failed" run. The run may fail for variety of reasons with different error messages. The most "safe" reason is insufficient accuracy: failure to achieve required precision. I am sure everyone observed this phenomenon when the run stops at the precision just below the requested one. If you request 3 digits you get 2.9, if you request 5 you get 4.5. I believe, these runs should be treated as "converged", not failed. Indeed, what is the difference if the run attained numerical precision of 3 significant digits with SIGDIGITS=3 (what is called good convergence) or with SIGDIGITS=5 (which would be treated as failure)? I would guess that failed runs in Nick example are of this variety. On the other hand, if the run fails with undefined OF or undefined precision, I would be hesitant to accept it even in the bootstrap setting. Leonid _______________________________________________________ From: "Kowalski, Ken" Ken.Kowalski@pfizer.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Wed, June 9, 2004 8:55 am Leonid, I don't think Nick has provided sufficient detail to know how insufficient the accuracy might be. Moreover, even for the 28% which did achieve sufficient sigdigit accuracy, only one quarter (7%) of these had a successful COV step so I still think the problem could be due to a very flat likelihood response surface where difficulty in achieving sufficient accuracy can be problematic (i.e., the problem of over-parameterization where there is an infinite number of solutions for the parameters that result in essentially the same OFV). In any event, if indeed the majority of runs are failing with rounding error messages where the minimum sigdigits for the run is 2.9 then I would expect the bootstrap distribution for these failed runs would be no different from the successful runs. The whole point to doing the bootstrapping is to construct these distributions. So regardless of the reason for the failure, the proof is in the estimates obtained to generate these empirical distributions. I'm just concerned that there is the potential for many of them to fail near a local minima or perhaps not move very much from the starting values. In this setting, the bootstrap distribution for the failed runs could be very ugly (e.g., bimodal). That might not be the case for Nick's example but in general the potential is there...which is why we have to be cautious when we have such a high convergence failure rate. Ken _______________________________________________________ From: Nick Holford n.holford@auckland.ac.nz Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Thu, June 10, 2004 3:55 pm Ken and Leonid, Thanks for all your thoughtful remarks and suggestions. I am afraid I am rather busy this week (leaving tomorrow for PAGE and 2 weeks in Europe) so I don't have time to take up the suggested investigations right now. But a quick thought -- Can either of you provide a data based example where you can show that bootstrap runs that minimized successfully with $COV produced more reliable parameter estimates than those that did not complete $COV or terminated? It seems that much of the discussion is based on opinion rather than data. I have offered some data to test the hypothesis that successful runs (+/- $COV) are better than all bootstrap runs. I realize it is only one data set (and I am working on another as I type) but it is data not just opinion. Nick -- Nick Holford, Dept Pharmacology & Clinical Pharmacology University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556 http://www.health.auckland.ac.nz/pharmacology/staff/nholford/ _______________________________________________________ From: Leonid Gibiansky lgibiansky@emmes.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Thu, June 10, 2004 4:29 pm Nick, I do not have an opinion on the subject that you raised: "The hypothesis that successful runs (+/- $COV) are better than all bootstrap runs". However, the question of whether to accept "strongly failed" bootstrap runs (that have parameters defined with zero significant digits or with undefined OF) is interesting to discuss. Can someone offer an example when "strongly failed" NONMEM run was used in any paper or regulatory submission? My guess would be no. I would simplify the model, fix some parameters or do any other possible tricks to get reasonable convergence. I think the criteria for accepting the bootstrap runs should be similar. We cannot accept "strongly failed" runs simply because they have parameters similar to those that converged. They need to be treated as failed, with undefined parameters. The next question is what to do with those. One can try to push them to convergence with several starting points (initial parameters). This would be my choice, but it need to be automated. Another option is to place them at the tails of the distribution: say if 5% of runs "strongly failed" then the most you can count on is to look for the bounds in the 5 to 95% interval (5% set aside for the failed runs). But his might be too strict. The other option (actually, my favorite) is to look on the bootstrap as a useful diagnostics of the problem, not the tool to get confidence intervals with great precision. Then 5% of "strongly failed" runs can be ignored and the rest used for approximate diagnostics, investigation of the parameter distributions, publishing nice papers with bootstrap figures, etc. Have a nice trip to PAGE and Europe ! Leonid _______________________________________________________ From: Nick Holford n.holford@auckland.ac.nz Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Thu, June 10, 2004 4:57 pm Leonid, Thank you for your further comments. I am afraid your guess is wrong. The model (a "strongly failed NONMEM run") which is the subject of this discussion has indeed been published in Br J Clin Pharmacol. http://www.blackwell-synergy.com/links/doi/10.1111/j.1365-2125.2004.02114.x/ I can assure you many tricks and initial estimate games were played to try and get a cleaner solution for this model. If you have read the previous discussion you will realize that the *model* can describe a random sample of the original data with successful minimization and $COV. So my contention is that this is not a model problem but a modelling program (NONMEM) problem. I invite you to write to the editor with your opinions if you wish but I would prefer to read some arguments based on data. Nick -- Nick Holford, Dept Pharmacology & Clinical Pharmacology University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556 http://www.health.auckland.ac.nz/pharmacology/staff/nholford/ _______________________________________________________ From: "Kowalski, Ken" Ken.Kowalski@pfizer.com Subject: RE: [NMusers] $OMEGA blocks and log-likelihood profiling Date: Thu, June 10, 2004 5:14 pm Nick, As I indicated in a previous message I don't typically run bootstraps with $COV. I really don't see the value in summarizing the bootstrap results with/without a successful COV step. The issue is the high convergence failure rate and whether or not you can use the estimates from the failed runs to provide valid inference via the empirical distribution generated from your bootstrap samples. The COV step only comes into play as a diagnostic to provide insight into why you are getting such a high convergence failure rate. Ideally, closer attention to the COV step output during model building to guide model selection may help to avoid the high convergence failure rate when performing bootstrapping. In the past you have indicated that you would much rather just go right into bootstrapping rather than take the time to get a successful COV step and review this output for ill-conditioning during model building. That's fine, but then you are more likely to encounter the problem of a high convergence failure rate during bootstrapping. In which case you have more work on the back end to verify that these failed runs can still be used to provide valid inference regarding the uncertainty in the parameters. You may be fine with your one example, but in general its a slippery slope to be on and each case where you have a high convergence failure rate the burden will be on you to verify that the empirical distributions for the failed and successful runs are unchanged before pooling them. If you had a successful COV step from your model fit on the original dataset, which suggested that your model was stable, and you ended up with >90% successful convergence in your bootstrap runs, then I couldn't care less whether you even ran the COV step for the bootstrap runs. Ken _______________________________________________________ From: Leonid Gibiansky lgibiansky@emmes.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Thu, June 10, 2004 5:47 pm Nick Every rule has exceptions. I will not write to the editor, of course (why you included this suggestion to the message?) but this is an open forum, and everybody can express their opinions even the opinions that are not based on the data. My opinion is that the results based on the strongly failed run should be taken with great suspicion, even if they are confirmed by 1000s more strongly failed bootstrap runs. I would agree with you that you can accept the model (treated as black box) and model predictions based on the model diagnostics (who cares how the model was obtained if it describes the data well enough?) but it is more difficult to argue that the parameters and CI obtained by this procedure reflect true parameters and CI. It cannot be only NONMEM problem. NONMEM was able to fit thousands of models to thousands of data sets successfully. The failure of the program on one particular data set means that this particular model and this particular data set have some intrinsic problems so that general and stable nonlinear model fitting algorithm fails on this particular problem. In any case, this means that experience gained on this problem is unique and cannot be (or should not be) generalized to other problems. Leonid _______________________________________________________ From: "Hutmacher, Matt" Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Fri, June 11, 2004 9:53 am Hello all, I have some general overall comments on the discussion based on my understanding of the discussion and on the bootstrap. It seems to me, that for the bootstrap to perform correctly, one must have a "proper" estimate. In the simple case of a univariate random identically distributed random variable (not necessarily known), the sample mean is an estimator of the population mean. This population may be bootstrapped, the sample means for each bootstrap calculated, and the sample means can be ordered to provide an estimate of the confidence interval (which I believe is asymptotically consistent with the true confidence interval for very general situations). In calculating the sample mean, no iteration is necessary, so we are always assured of a "proper" estimate of the mean. When things get more complicated in the non-linear mixed effects world, it would seem to me - to ensure good estimates of the confidence intervals that one would want the "proper" (i.e. global minimum) parameter estimates for each bootstrapped data set. The problem in the practical setting is determining when you have "proper" estimates. This can be difficult to determine even for the original data set. Even when the COV step completes, we may not be sure that we are at a global minimum. To help ensure that we get there, we build stable models and use good starting values. Both of these practices can be applied to the model run on the original data set, but it becomes difficult when bootstrapping because of the large number of runs. Thus, I agree with Ken, that to ensure the best overall performance, not only of the original model run, but the bootstrap procedure, that a stable model should be built and used for inference. As Mats points out, computationally, the COV step is cheap (compared to the bootstrap) in that it gives us an indication of the stability of the model. If the COV step runs, and the model condition number is low, then I would have good faith in proceeding with the bootstrap. However, if the COV step does not run, as Ken indicated, I would be concerned about whether the model has found a saddle point (indicating the estimates are not proper), is overparameterized, etc. I do agree with you Nick, that sometimes the COV may not run for a good model. One often used example is when TLAG is estimated near a data point. Here, the derivative is undefined, so no covariance estimate is available. This model may still yield good estimates and be suitable for inference by bootstrapping. However, in my opinion, for general situations some detective work needs to be done to determine why the COV failed. Otherwise, I would think, some suspicion as to whether the estimates are "proper" would exist in a reviewer's mind. When the COV step fails on the original model, I think the bootstrap could be valuable tool to help show that the estimates are proper. If one looks at the distributions and finds they are reasonable (smooth, unimodal, [and "ideally" symmetric due to the central limit theorem]) then one has some credible evidence that the original model/data is robust. However, one must be careful. In a case like Nick's, where, hypothetically, a large number of runs do not converge, the parameter estimates from the runs that didn't converge could be unnecessarily widening the confidence intervals. Thus, in my opinion, it is good to always include the distribution of the bootstrap estimates in the appendix of one's report. For rounding errors, I agree with Leonid that to some extent, selection of significant digits is arbitrary. However, in my mind, if one can't get three significant digits, one must ask why can't I achieve them. Perhaps if the lowest is 2.9 you could proceed. However, in this case if you change sigdigits to 2 and you get 1.9, wouldn't that be concerning (in the sense of instability)? If the parameter estimates were identical then maybe they are ok, but in my experience this is not usually the case. [I know usually when this happens we increase the sigdigits and trick NONMEM to run the COV using the MSFI option - I state this as somewhat of an old argument or thought process]. If a sigdigit is less than 1, would you trust the estimates. Nick, I would liked to have gone to PAGE and discussed this with you over a pint, as I think you have an interesting problem - that being, "how do you know you have good estimates". There is always room for debate there. But, I have to say, I disagree with the general tone of your 2nd paragraph below. I believe this statement implies (or at least I infer) that this one dataset/model provides proof that the COV step is meaningless. You are using the data to argue your point here, and you claim it is generalizable, which I think is not a scientific argument. You say the proof is in the data and unarguable. However, when data drives a correlation estimate between CL and V to 95%, I believe that you would state that the model is wrong and that this result was in error, even if the model fit the data well (correct me if this is inconsistent with previous emails). You would discount the data in that situation based on an "opinion", wouldn't you? Or is your "opinion" subject matter knowledge... Model fitting is not 100% composed of subject matter knowledge (pharmacology/pharamcokinetics). When one wants to "realize" the data by fitting a model to the data and perform inference, estimation is necessary. Estimation is statistical in nature. Statistical theory plays a valuable role in assessing estimates and ensuring proper inference. Statistics allows us to make probabilistic statements about the data, future predictions, etc. Some statistical rigor is necessary to ensure that the probabilistic statements made will be valid. These "hurdles" have been shown to be valuable over time. If this were not the case, statistics would be such a value part of interpreting clinical trials. Therefore, I would not call good statistical practice "arbitrary" hurdles. Matt _______________________________________________________ From: Nick Holford n.holford@auckland.ac.nz Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Fri, June 11, 2004 5:25 pm Matt, Thanks for your remarks. Most of which I fully agree with. But I do take issue with your faith in statistical theory. For many years it was commonly accepted that the difference in log likelihood under the null was distributed chi-square. But we know from data based experimentation that this is not true for NONMEM. It is also known that NONMEM's standard error's are of little use for confidence intervals because we can find from data based experimentation (using bootstraps) that confidence intervals are often asymmetrical and SE based predictions of parameters such as OMEGA can easily include impossible negative values. Once again statistical theory applied to NONMEM is misleading. These data based experimental tests have forced me to think harder about the assumptions we make when applying statistical theory in this area. So why should I accept the 'good statistical practice' notion that getting the $COV step to run in NONMEM is a marker for a 'stable model' which is somehow more reliable? It is here that I am asking for data. Can anyone support this hypothesis with data based experiments using NONMEM? Of course I would be happy if all my runs converged and the $COV step ran. But in the real world of non-trivial PKPD analysis this cannot be guaranteed and indeed I find the closer one gets to a mechanistically plausible model the harder it is to get these things to happen. So in the real world I live in I feel I cannot rely on statistical theory for NONMEM results but want some data based backup. The bootstrap and randomization test are tools for doing this. You and other have indicated that you think I wish to generalize the results from one study to all cases. If I have given this impression it was not intentional. I am not offering a general theory but I am offering an experiment to test a hypothesis. The hypothesis is that NONMEM runs that converge with $COV are somehow more reliable/stable than those that do not. I really don't have a good idea how to test for model 'stability' but in this context I would consider reliability to mean that the parameter estimates are unbiased. As I understand it this hypothesis is not built on any mathematical theory but arises from 'good statistical practice'. In the single case I have tested I can find no evidence to support this hypothesis. I have read and understand the various viewpoints that have offered reasons why my one experiment may be misleading. I accept these possibilities (e.g. failed runs might widen bootstap confidence intervals) but the data at hand gives no obvious support that this is happening. I am aware of the need to do some 'detective work' to try to understand why a model may be failing to converge. There are numerous ad hoc tricks one can apply e.g. using SLOW or HYBRID or increasing SIGDIG to get results for MSFI with a lower SIGDIG. But at the end of the day, after months of work, I want to move forward with the model that best describes the data and seems to explain how the world works. It is here I am reluctant to throw the baby out with bathwater because the model fails some test of 'good statistical practice' which I cannot find any data to support. Too bad you won't be at PAGE. I look foward to catching up on another occasion. Nick -- Nick Holford, Dept Pharmacology & Clinical Pharmacology University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556 http://www.health.auckland.ac.nz/pharmacology/staff/nholford/ _______________________________________________________ From: "Kowalski, Ken" Ken.Kowalski@pfizer.com Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Tue, June 29, 2004 3:08 pm Nick, I'm sure we're all getting tired of this thread but I just can't leave it where it last ended. Below you ask the rhetorical question why we should consider it good statistical practice to get the COV step to run as a marker for a stable model that is somehow more reliable. I don't consider it good statistical practice to simply use success/failure of the COV step as such a marker. What I and Matt have been saying is that it is good statistical practice to develop models that have a successful COV step AND to inspect the COV step output to assess the stability of the models. It is the stability of the model that allows us to guage our confidence (i.e., reliability) in the parameter estimates that we obtain from NONMEM as being optimal from which to make inference via point and interval (i.e., bootstrapping) estimates. You seem to want to rely solely on your confidence that you have a correctly specified mechanistic model and that is sufficient to have confidence in the estimates regardless of whether NONMEM achieves successful convergence. Surely you would agree that if we knew with 100% certainty that your mechanistic model was correctly specified, and you had a dataset where every subject was sampled only once at the same fixed time point, it would be ludicrous to fit this data in NONMEM (we would expect it not to converge) and put any level of trust in the estimates we obtain. The reliability in the paramater estimates depends not only in our confidence of a correctly specified model (which we can never really know with real data) but also in our confidence that the data in hand contains rich enough information to estimate these parameters. Reliable estimates should not be equated with unbiased estimates. Unbiased estimates speak to the accuracy of the estimates and the correctness of the model which cannot be assessed with real data sets. Even if you wanted to assess the value of simple success/failure of the COV step in your bootstrap runs as a marker of the validity of the results with real data, at best, you can conclude that the distribution of the parameter estimates from the failed COV step runs (and in your example the majority also fail in convergence) are similar to the distribution of the estimates from the successful COV step runs. The problem with real data sets is we don't know what the true distribution of the parameter estimates should be. If you want to assess whether a successful COV step provides any value as a marker in assessing the accuracy and precision of the estimates you need to do this via simulations where you know the true value of the parameters. As you vary the design to change the information content in the data resulting in increased instablity (with fewer data points), I think you will find that the accuracy of the parameter estimates and coverage probabilities of the bootstrap CIs will get worse even though the model is correctly specified (i.e., fitting the same model as you used to simulate the data) regardless of COV step status. Note that the nonparametric bootstrap relies on statistical theory and is not a data-based result. In order for the bootstrap to give us valid confidence intervals we need to rely on the randomness and optimality of the estimates. Rounding errors, lack of convergence, and COV step failures may be indicative that the estimates are sub-optimal (not at the global minimum) and could result in systematic biases that invalidate any inference from the resulting empirical distribution of these sub-optimal estimates. Finally, I'd encourage you to read up more on statistical theory for nonlinear models. There is a wealth of information in the statistical literature dating back to the 60's and 70's that Wald-based symmetric confidence intervals for nonlinear models often do not have proper covergage and that the distributions of many parameter estimates from nonlinear models are asymmetric. A lot of this is in standard texts for nonlinear regression models. Bates & Watts, Nonlinear Regression Analysis and its Applications, Wiley, NY, 1988 is a good text. It is true that maximum likelihood theory states that asymptotically, maximum likelihood estimates have a multivariate normal distribution, however, for population models, Vonesh (Biometrika 1996;83:447-452) has shown that the asymptotics require not only a large number of subjects (N) but also a large number of observations per subject (n). For certain intrinsically nonlinear parameters the magnitude of n necessary to achieve these asymptotics may never be realistically achieved. Furthermore, with regards to NONMEM we are not doing exact maximum likelihood estimation but approximate maximum likelihood estimation due to the first-order approximations that are employed and this also plays into the problem. Mats Karlsson and his colleagues have shown situations when the first-order approximations (especially the FO method) do not perform well in maintaining the nominal type I error rate for likelihood ratio tests. However, they have also shown situations where the type I error rate is maintained based on the Chi-Square distribution. So, situations where the likelihood ratio test do not follow a Chi-Square distribution are not evidence of a failure of statistical theory but rather an indication that our approximations and asymptotics may not be working to our advantage. Application of statistical theory requires us to have an understanding of the properties and limitations of the estimation methods that we employ. We (the PK/PD modeling community) are continuing to contribute to this statistical theory as it applies to NONMEM and its estimation methods. Regards, Ken _______________________________________________________ From: Nick Holford n.holford@auckland.ac.nz Subject: RE:[NMusers] $OMEGA blocks and log-likelihood profiling Date: Wed, 30 Jun 2004 15:51:23 +1200 Ken, Thanks for keeping the ball rolling... I think we need to keep clear a distinction between statistical theory and the assumptions required by the theory when applied to NONMEM. Please note this sentence in my response to Matt (below) "These data based experimental tests have forced me to think harder about the assumptions we make when applying statistical theory in this area." Thanks for the suggestions for improving my education. My own preference for a textbook is Seber GAF, Wild CJ. Non-linear regression. New York: John Wiley & Sons; 1989 (the authors are academic colleagues at the University of Auckland <grin>). I have no quibble with the theories (at least to the extent I understand them) but as you note below they often involve assumptions of normality (e.g. for predicting CIs from SEs) or that the likelihood is correct when applying the chi-square distribution to the likelihood ratio test. Both of these assumptions are dubious when applied to results from NONMEM. Bob Leary provided some additional evidence of the problems of NONMEM's Maximum Approximate Likelihood at the PAGE meeting 2 weeks ago. http://www.page-meeting.org/page/page2004/Leary.pdf On matters that are not based on statistical theory like 'reliability' and 'model stability' which you justify by 'good statistical practice' I can only say that I prefer data based SOPs. If NONMEM's $COV results are only approximately correct then just how valuable is the inspection of the entrails for evidence of model stability? I note that Bates & Watts offer no guidance on either reliability or stability (in the index) but Seber & Wild do offer a discussion of parameter stability but only give a practical example in the case of a model with 2 parameters. There is no mention of model or parameter reliability in the index. My confidence in non-parametric bootstrap results is not simply on the basis of a plausible model but also from consideration of a reasonable design confirmed by simulation based tests. I consider your example of a reductio ad absurdam design to be a straw man in this context. With regard to your point about simulation -- You seem to have forgotten that I reported to you already results of using simulation with this problem (http://www.cognigencorp.com/nonmem/nm/99jul152003.html). There was no evidence of bias or important differences in CIs in runs that failed compared with those that were successful, or those that also completed the covariance step. Your recent suggestion to examine Q-Q plots of the distributions of estimates obtained with varying termination/minimization conditions has been very helpful - thank you. I ended up (with some assistance from Christoffer Turnoe) creating CDFs of the empirical distribution of the non-parametric bootstrap parameters. There are some minor systematic differences between these plots when I compare the worst case (terminated due to proximity to infinite objective function value) with the best (covariance step completed). But the inferences about coverage are not importantly different in the context of the application (Matthews et al.2004). I am currently doing some more runs to increase the total number of $COV successful estimates to refine the evaluation of the distributions. I would be happy to send you the data for you to perform your own examinations. NONMEM parameter estimation involves calculations that are often at the limits of computational precision. I know that Stuart Beal has put a lot of effort into trying to make this as platform independent as possible but it is widely known that NONMEM results depend on compiler (and options) and CPU type. I believe that Stuart fine-tuned NONMEM on a specific platform (Sun workstation and compiler, I think). Heuristic decisions were no doubt made on the basis of performance on this platform. Even if other compilers and processors provide superior numerics NONMEM may not perform so well because of the Sun specific pragmatic implementation. Thus NONMEM running with an AMD Athlon CPU and the Compaq Visual Fortran compiler (my own platform) may fail despite being very close to a solution that would be successful on a Sun. It is my current hypothesis that it is this numerical dice throwing that gives rise to the high minimization failure rate rather than any major deficiency in the data or model that I have been using. My data based experiments are currently aimed at testing this hypothesis. If minimization and/or covariance step success are dependent on pseudo-random numerical issues then I would predict that the distribution of parameter estimates would be very similar irrespective of success of failure. Results to date do not provide evidence to reject this hypothesis. Nick Matthews I, Kirkpatrick C, Holford NHG. Quantitative justification for target concentration intervention - Parameter variability and predictive performance using population pharmacokinetic models for aminoglycosides. British Journal of Clinical Pharmacology 2004;58(1):8-19. -- Nick Holford, Dept Pharmacology & Clinical Pharmacology University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand email:n.holford@auckland.ac.nz tel:+64(9)373-7599x86730 fax:373-7556 http://www.health.auckland.ac.nz/pharmacology/staff/nholford/ _______________________________________________________ From: Kowalski, Ken Ken.Kowalski@pfizer.com Subject: RE: [NMusers] $OMEGA blocks and log-likelihood profiling Date: Fri, 2 Jul 2004 09:37:48 -0400 Nick, It sounds like we are not in disagreement about the value of statistical theory just that we have to understand the assumptions we make when applying it...no argument there. Just as there is a wealth of PK/PD terminology and different words are sometimes used to mean the same thing, the same is true in the statistical literature. I equate stability or rather 'instability' with 'ill-conditioning' which can be a result of 'over-parameterization'. Bates and Watts (pp. 86-91) discusses inspecting the correlation matrix of the parameter estimates to help diagnose collinearity among the parameter estimates (general ill-conditioning) and/or over-parameterization (fitting too many parameters). With regards to 'reliability' I'm using the term as per Webster to mean the extent to which results are reproducible. In this context a model fit might be unreliable because it is difficult to reproduce estimates across platforms/compilers, with different starting values that may only differ by 10%, etc. Certainly numerical instability of NONMEM itself may indeed be the explanation for your particular problem, however, if you don't strive to get the COV step to run and inspect the output during model building before performing bootstrapping a reviewer could always question whether your model is stable. Moreover, Pete Bonate's exercise showed that NONMEM differences across platforms/compilers were most likely to occur when one is dealing with an unstable model. Thus, numerical instability of NONMEM and model instability in many cases may just be two sides of the same coin. Regards, Ken _______________________________________________________