From: "Peterson, Mark" <markp@amgen.com>
Subject: Distributive Computing

Date: Fri, 7 Sep 2001 08:08:01 -0700
 

Hello,
 

I am interested in learning if there is an advantage of setting up a

distributive computing network to process NONMEM runs that are

computationally lengthy (several days) on a single PC. Can anyone elaborate

on the possibility, benefits, difficulties, and technical aspects of

implementing NONMEM on such a network?
 
 

Thank you,

Mark C. Peterson
 

Amgen, Inc.

One Amgen Center Drive

Thousand Oaks, CA 91320-1799

Tel: 805.447.7222

Fax: 805.499.4868

Email: markp@amgen.com
 

*****
 

From: Darin Perusich <Darin.Perusich@cognigencorp.com>

Subject: Re: Distributive Computing

Date: Thu, 11 Oct 2001 12:16:13 +0000

 

Setting up a distributed environment depends on your needs. Are you

going to be running lots of NONMEM jobs all the time, or just a few now

and then? How much money can you spend; it's always as

little as possible right? In many cases, depending on the application,

using some big SMP machine will be the best route. This is not the case

with NONMEM as it is single threaded and cannot take advantage of

multiple CPUs; the NONMEM processes just bounce from CPU to CPU and nothing is

gained. Would you rather buy a 4 processor machine that will cost

$20,000+ or twenty $1000.00 machines?
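
To make the cost comparison concrete, here is a small sketch of the arithmetic. It assumes, as stated above, that each NONMEM run uses exactly one CPU, and it takes the quoted prices at face value ($20,000 for the 4-processor machine, $1,000 per single-CPU PC). The function name is hypothetical.

```python
def cost_per_concurrent_run(total_cost, cpus):
    """Hardware cost per NONMEM run that can execute at the same time.

    Assumes one single-threaded run per CPU, so concurrent runs == CPUs.
    """
    return total_cost / cpus


smp = cost_per_concurrent_run(20_000, 4)        # one 4-processor machine
pcs = cost_per_concurrent_run(20 * 1_000, 20)   # twenty $1000 single-CPU PCs

print(f"SMP: ${smp:,.0f} per concurrent run")
print(f"PCs: ${pcs:,.0f} per concurrent run")
```

On these figures the same budget buys five times as many simultaneous runs on the PCs; neither option makes any single run finish sooner.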
 

Let's look at a few things that need to be considered.
 

1. NONMEM run time - In many cases, runs can go for hours, days, even

weeks. The underlying platform, hardware, and OS must be reliable. If uptime is

a concern, the system should be Unix or Linux.
 

2. Computing hardware - The output of NONMEM runs differs between

processor architectures. Running the same job on the x86(Intel) and

SPARC(Sun) platforms will give hugely different results; therefore, all

the machines must be of the same architecture.
 

3. Compilers - As with the hardware, compiling NONMEM with different

compilers gives different results, even between different versions of

the same compiler. There are also licensing and pricing issues; the GNU compiler

is free and runs on most operating systems, whereas many other compilers cost upwards of

$1000/seat.
 

4. Licensing issues for NONMEM, compilers, operating systems, etc. -

At a minimum, you'll have to purchase enough licenses for NONMEM;

everything else can be done with free tools.
 

5. Management - The less you have to do the better, especially when you

have many machines. This goes back to my first point; as an

administrator, I'd need this to be as reliable as possible - having

machines "blue screen" for no reason would not be acceptable to me

or users. Once a machine has been deployed, it shouldn't need to be

touched. What about deploying new systems; do you want to spend half a

day setting up the systems?
 

6. Job deployment - In such an environment, this is one of the most

important things. How do I get job 'abc-123' out to machines x, y, or z?
 

These are just a few things to think about. The technical aspects and

difficulties run hand-in-hand if you ask me. Someone looking to

implement a distributed network of workstations should have a solid

grasp of networking, OS deployment, and at least know how to use NONMEM.
 

--

Darin Perusich

Unix Systems Administrator

Cognigen Corp.

darinper@cognigencorp.com
 

*****
 

From: Leonid Gibiansky <lgibiansky@emmes.com>

Subject: Re: Distributive Computing

Date: Thu, 11 Oct 2001 09:54:38 -0400

 

At 12:16 PM 10/11/01 +0000, Darin Perusich wrote:
 

>2. Computing hardware - The output of NONMEM runs differ between

>processor architectures. Running the same job on the x86(Intel) and

>SPARC(Sun) platforms will give hugely different results; therefore, all

>the machines must be of the same architecture.
 

Can it really be "hugely different results"? In this case, I would be

hesitant to trust any of those. We have seen cases where the PC version did not

converge whereas the UNIX version was fine, but in those cases switching to the g77

compiler on the PC usually delivered results very similar to those obtained on

UNIX. Could you (I mean, the NONMEM community) please describe some

particular examples of huge differences if you observe them?
 

Thanks,

Leonid
 

*****
 

From: "Sale, Mark" <ms93267@GlaxoWellcome.com>

Subject: RE: Distributive Computing

Date: Thu, 11 Oct 2001 11:22:02 -0400
 

Getting back to the original question. I think that Mark originally asked

about distributed computing in NONMEM, and I got the impression he meant

parallel processing or something like virtual parallel processing. About 6

years ago I worked with SGI to run NONMEM on a 4 processor machine using the

compiler they claimed was the best at parallelizing (aka unrolling) Fortran

code. NONMEM did absolutely no parallel processing, ran entirely on one

processor. So, old-fashioned multi processing doesn't work with the

existing code. Darin correctly points out that NONMEM also will not

multithread. Multithreading requires dedicated coding, and NONMEM is

written as a single thread application. I heard a rumor at one time though

that Steve Shafer had rewritten some parts of PREDPP as multithread.

Multithreading is in principle entirely possible in mixed effect modeling.

A number of groups are working on new applications of mixed effect modeling

(not yet available) that support multi threading, but don't hold your

breath.
 

Mark
 

*****
 

From: José Javier Zarate <jzarate@unav.es>

Subject: Re: Distributive Computing

Date: Sat, 13 Oct 2001 19:54:19 +0200

 

Dear Mr. Perusich
 

On point 3 of your message.
 

My experience on Alpha and x86 machines tells me that g77 is slower than

any other compiler on the market.
 

On points 1 and 5.

My experience with well set up Windows NT 4 and Linux (RedHat 6.1)

machines is that you don't find real differences in uptime between

systems. It is a question of doing a good job on the system you have and

knowing the systems deeply enough for the task.
 

Best wishes
 

--

JJ Zarate

Departamento de Compras

Clínica Universitaria de Navarra

http://www.unav.es/cun/
 
 

*****
 
 

From: Alice Nichols <nichols@bellatlantic.net>

Subject: Re: Distributive Computing

Date: Sun, 14 Oct 2001 15:36:32 -0400

 

Mark,
 

A fairly low-cost option that I use is to have several PCs running the NONMEM

software tied in to the same keyboard & monitor using a switching box for

keyboards/monitors. Not as elegant as having a costly multiprocessor but much

less expensive.
 

Alice
 

Alice Nichols, PhD

Hawthorne Research and Consulting, Inc

132 Hawthorne Rd

King of Prussia, PA 19406
 

*****
 

From: Nick Holford <n.holford@auckland.ac.nz>

Subject: Re: Distributive Computing

Date: Mon, 15 Oct 2001 09:07:47 +1300

Even cheaper options include:
 

1. Use VNC to control your number crunching slave computers.
 

Virtual Network Computing works with almost any operating system to allow remote control. It requires that you have a TCP/IP connection, but you don't need a keyboard or monitor for the slave computers once it is running. The software is free and you can control as many computers as you want from anywhere with an IP connection. The client part of the software (the 'viewer') fits on a single floppy disk so you can travel light!
 

http://www.uk.research.att.com/vnc/
 
 

2. Get PCs with dual CPU motherboard and 2 CPUs.

Generally cheaper to buy a dual CPU machine with all other components in common than to buy 2 machines. You will need to use Windows 2000 but Win2K is very stable. The multiprocessing works nicely, including being able to set NONMEM jobs at a low priority so that you can respond to nmusers without any interference yet have negligible impact on the NONMEM run times.
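
Running a job at low priority can also be scripted. The sketch below is one way to do it from Python and is not necessarily how the priority was set in the setup described above; `run_low_priority` is a hypothetical helper, and the command it launches is a placeholder standing in for the real NONMEM invocation.

```python
import os
import subprocess
import sys


def run_low_priority(cmd):
    """Start cmd at reduced CPU priority and return the Popen handle."""
    if os.name == "nt":
        # Windows: request the lowest ("idle") scheduling priority class.
        return subprocess.Popen(cmd, creationflags=subprocess.IDLE_PRIORITY_CLASS)
    # Unix-like systems: raise the niceness in the child just before it starts.
    return subprocess.Popen(cmd, preexec_fn=lambda: os.nice(19))


# Placeholder command; replace with the site-specific NONMEM command line.
proc = run_low_priority([sys.executable, "-c", "print('run finished')"])
proc.wait()
```

A job started this way only gets CPU time that interactive programs are not using, which is why it should not get in the way of other work on the same machine.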

 

--

Nick Holford, Divn Pharmacology & Clinical Pharmacology

University of Auckland, 85 Park Rd, Private Bag 92019, Auckland, New Zealand

email:n.holford@auckland.ac.nz tel:+64(9)373-7599x6730 fax:373-7556

http://www.phm.auckland.ac.nz/Staff/NHolford/nholford.htm
 

*****
 

From: harrold@sage.che.pitt.edu

Subject: Re: Distributive Computing

Date: Sun, 14 Oct 2001 17:57:59 -0400 (EDT)
 

Sometime in October Nick Holford assaulted keyboard and produced...
 

|Even cheaper options include:

|1. Use VNC to control your number crunching slave computers.

|Virtual Network Computing works with almost any operating system to allow remote control. It requires you have a TCP/IP connection but you dont need a keyboard or monitor for the slave computers once it is running. The software is free and you can control as many computers as you want from anywhere with an IP connection. The client part of the software (the 'viewer') fits on a single floppy disk so you can travel light!

|http://www.uk.research.att.com/vnc/

|

|2. Get PCs with dual CPU motherboard and 2 CPUs.

|Generally cheaper to buy a dual CPU machine with all other components in common than to buy 2 machines. You will need to use Windows 2000 but Win2K is very stable. The multiprocessing works nicely including being able to set NONMEM jobs at a low priority so that you can respond to nmusers without any interference yet have negligble impact on the NONMEM run times.
 

even cheaper: use linux and save $80 per computer for the operating system,

and lose the overhead of a gui. plus the fortran compiler is also free.

use ssh to connect to the computers with x forwarding turned on to display

graphical stuff to your local computer when you want that kind of thing.

this would require some network cards, a switch, and some network cable,

but that would be balanced out because you wouldn't need monitors or

videocards.

--

john
 

*****

 
 
 

From: "Sale, Mark" <ms93267@GlaxoWellcome.com>

Subject: RE: Distributive Computing

Date: Mon, 15 Oct 2001 12:18:58 -0400
 

 

My (additional) cents worth.
 

 

We're currently running a home-grown application to run NONMEM on other

computers. It is really pretty easy: just use Winsock (in VB or VC++) to

send the control file and/or data file. If you have the data file on a

network server (which we usually do), you don't need to send the data file.

Then issue a command, again over TCP/IP via Winsock, to compile and run NONMEM.

When done, send the result and any table files back. This is currently in

prototype for the machine-learning/genetic algorithm application for NONMEM.

Currently, it is very site-specific, meaning a Winsock control is added for

each remote server (currently 1), and the computer name is added manually to

the source code. Eventually, it will be more general (e.g., drop-down menus

that tell you what servers are available, and the load on each so you can

select them). There is a dedicated directory on the remote server that

NONMEM runs in (at low priority). The advantage of this is that you don't

need "full" access to the computer (e.g., you can use anyone's computer

without a security issue), because there are only a limited number of

commands that can be issued (writing a control file, running NONMEM, and

sending the results back). In theory, it could be run on any available

computer (such as the admins'), without impacting the "real" user. Since the

application only sends text over TCP/IP, there is no risk of an executable virus (unless you send

the source code and compile it on the server).

I'm happy to share the VB code to do this, but again, at this point it will

require some work to set up for a specific site.
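
The send / run / return cycle described above can be sketched with plain sockets. This is not the VB/Winsock code mentioned in the message; it is a minimal Python illustration under several assumptions: the client sends only the control file text, the server performs one fixed action, and `RUN_CMD` is a hypothetical stand-in for the site-specific "compile and run NONMEM" step (here it merely copies the control file to a result file). A temporary directory stands in for the dedicated run directory, and there is no authentication.

```python
import socket
import socketserver
import subprocess
import sys
import tempfile
import threading
from pathlib import Path

# Hypothetical placeholder for the real "compile and run NONMEM" command.
RUN_CMD = [sys.executable, "-c",
           "import shutil, sys; shutil.copy(sys.argv[1], sys.argv[2])"]


class RunHandler(socketserver.StreamRequestHandler):
    """Accept one control file, run the fixed command, send the result back."""

    def handle(self):
        control_bytes = self.rfile.read()  # read until the client half-closes
        with tempfile.TemporaryDirectory() as work:
            ctl = Path(work) / "run.ctl"
            out = Path(work) / "run.lst"
            ctl.write_bytes(control_bytes)
            subprocess.run(RUN_CMD + [str(ctl), str(out)], check=True)
            self.wfile.write(out.read_bytes())


def submit(host, port, control_text):
    """Send a control file to the server and return the result text."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(control_text.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)      # signal end of the control file
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")


def demo():
    """Start a server on a free local port, submit one run, print the reply."""
    server = socketserver.TCPServer(("127.0.0.1", 0), RunHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        print(submit(host, port, "$PROBLEM demo\n"))
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    demo()
```

Because the server never executes anything the client names, the client can only trigger the one action the server was written to perform, which is the restricted-command idea described above.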
 

 

Mark
 

*****

 

From: "Banken, Ludger {PDBS~Basel}" <LUDGER.BANKEN@roche.com>

Subject: [NMusers] RE: Distributive Computing

Date: Tue, 16 Oct 2001 11:17:12 +0200
 

It is also possible to use the equivalent of a batch queue instead of starting each NONMEM run separately:

A program can be started on a server (in the DOS box) to automatically execute all control files in a given directory. The program searches a given (network) directory for control files (files with a given extension, e.g. *.CTL). Each control file found is moved to another directory and processed from there. To reduce network traffic, NONMEM should be executed locally, either by moving the control file to a local directory or by copying intermediate results to the local drive before starting the compilation and execution. After the execution, the results are copied to an output directory on the network.
 

This program can be started on several PCs using the same input directory. (Intermediate files from NONMEM have to be stored in a local directory.) Several NONMEM runs are thereby executed in parallel, and each server searches for a new run when it has finished the previous one.
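
A minimal sketch of such a queue worker, in Python rather than a DOS batch program, might look like the following. All names here are hypothetical, `run_cmd` is a placeholder for the site-specific NONMEM command line, and the sketch assumes that renaming a file within the shared directory tree is atomic, so that only one worker can claim a given control file (this should be checked for the network file system in use).

```python
import shutil
import subprocess
import time
from pathlib import Path


def claim_next(queue_dir, claimed_dir, pattern="*.CTL"):
    """Claim one control file by renaming it out of the shared queue.

    Returns the claimed path, or None when the queue is empty.
    """
    claimed_dir.mkdir(parents=True, exist_ok=True)
    for ctl in sorted(queue_dir.glob(pattern)):
        target = claimed_dir / ctl.name
        try:
            ctl.rename(target)  # fails if another worker moved the file first
        except OSError:
            continue
        return target
    return None


def process_next(queue_dir, claimed_dir, local_dir, output_dir, run_cmd):
    """Run one queued control file locally and copy its files to the network."""
    claimed = claim_next(queue_dir, claimed_dir)
    if claimed is None:
        return None
    run_dir = local_dir / claimed.stem  # intermediate files stay on local disk
    run_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(claimed, run_dir / claimed.name)
    # run_cmd is a placeholder for the real NONMEM invocation.
    subprocess.run(run_cmd + [claimed.name], cwd=run_dir, check=True)
    out_dir = output_dir / claimed.stem
    out_dir.mkdir(parents=True, exist_ok=True)
    for item in run_dir.iterdir():
        if item.is_file():
            shutil.copy2(item, out_dir / item.name)
    return claimed.name


def worker_loop(queue_dir, claimed_dir, local_dir, output_dir, run_cmd,
                poll_seconds=60):
    """Keep taking runs from the queue; sleep while it is empty."""
    while True:
        done = process_next(queue_dir, claimed_dir, local_dir, output_dir, run_cmd)
        if done is None:
            time.sleep(poll_seconds)
```

Starting `worker_loop` on each PC with the same `queue_dir` gives the behaviour described above: each machine picks up the next control file as soon as it has finished its previous run.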
 

Ludger
 

Ludger Banken

F.Hoffmann-La Roche Ltd, CH-4070 Basel

Phone +41-61- 688 73 63

Fax +41-61- 688 14 52

E-mail ludger.banken@roche.com
 

*****

 

From: Lars.Lindbom@farmbio.uu.se (Lars Lindbom)

Subject: [NMusers] RE: Distributive Computing

Date: Tue, 16 Oct 2001 12:06:39 +0100 (IST)
 

We are currently using a Linux cluster to run multiple NONMEM jobs in

parallel. The cluster is based on MOSIX, a patch or add-on to the Linux

kernel that enables automatic process migration. Start a job on one

cluster-node and it will move to the best node available. Best means

highest CPU-speed, most RAM and of course lowest load. There are many

pros and cons with this system and you can read all about them on

www.mosix.org but the most important are:
 

- Long NONMEM jobs will benefit the most; they will migrate to the fastest

node and stay there.

- You can write parallel applications that utilise the process migration

of MOSIX, but multithreaded shared-memory applications will not migrate. They will

stay on the node on which they were started.

- The setting up and maintenance of a MOSIX cluster demands some

knowledge about Linux administration.
 

/Lars
 

Below is the output from top, run on our main node just now:
 

 

2:03pm up 17 days, 21:09, 16 users, load average: 3.89, 3.74, 3.38
114 processes: 108 sleeping, 5 running, 0 zombie, 1 stopped
CPU states: 1297.3% user, 0.7% system, 506.1% nice, 0.0% idle
Mem: 512144K av, 503800K used, 8344K free, 0K shrd, 21680K buff
Swap: 1052216K av, 88K used, 1052128K free 309904K cached

  PID USER  PRI  NI  SIZE   RSS SHARE STAT N#  %CPU %MEM   TIME COMMAND
 2294 -----  18   0  3412  3412   328 S     7  99.9  0.6  5463m nonmem
32222 -----  19  19  3624  3624   520 S N   6  99.9  0.7  4494m nonmem
21438 -----  11   0  1852  1852   284 S     3  99.9  0.3  1574m nonmem
17654 -----  17   0  2488  2488   324 S     5  99.8  0.4 10527m nonmem
21769 -----  18   0  3600  3600   528 S     5  99.8  0.7  7769m nonmem
  727 -----  10   0  1340  1340     4 S     6  99.8  0.2  5772m nonmem
20898 -----  10   0  3216  3216   276 S     2  99.8  0.6  2218m nonmem6
22952 -----  17   0  2356  2356   352 R     0  96.9  0.4  1046m nonmem
25028 -----  14   0  2996  2996   860 R     0  93.7  0.5  11:44 nonmem
21736 -----  19  19  2368  2368   336 S N   4  83.6  0.4  1187m nonmem
 2314 -----  19  19  3436  3436   464 S N   4  65.7  0.6  5400m nonmem
22460 -----  19  19  2720  2720   376 S N   4  50.3  0.5 959:00 nonmem
22292 -----  19  19  3408  3408   504 S N   3  49.9  0.6  1048m nonmem
24352 -----  19  19  1868  1868   400 S N   3  49.9  0.3  66:39 nonmem
 2275 -----  19  19  2756  2756   428 S N   2  49.5  0.5  5032m nonmem
 2262 -----  19  19  3236  3236   348 S N   2  49.3  0.6  4506m nonmem
23836 -----  19  19  2728  2728   488 R N   0   8.0  0.5 109:07 nonmem
24961 root   10   0  1108  1108   872 R     0   0.7  0.2   0:09 mtop
   15 root    9   0     0     0     0 R     0   0.1  0.0   7:12 memsorter
    1 root    9   0   544   544   472 S     0   0.0  0.1   0:26 init
    2 root    9   0     0     0     0 S     0   0.0  0.0   0:00 keventd
 

--

Lars Lindbom

PhD-student

Division of Pharmacokinetics and Drug Therapy

Department of Pharmaceutical Biosciences

Box 591

SE-751 24 Uppsala

Sweden

Phone +46 18 471 4291

Fax +46 18 471 4003

email Lars.Lindbom@farmbio.uu.se