Debugging with PVM

To debug your distributed application, set the parameter *_debug, where * is the name of the module you wish to debug, to "1" in the parameter file. This tells PVM to spawn the process or processes in question under a debugger. What PVM actually does in this case is launch the script $PVM_ROOT/lib/debugger. You will almost certainly want to modify this script so that it launches your preferred debugger in the way you see fit. If you have trouble with this, please send e-mail to the list server (see Section 1.6).
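As an illustration, the parameter file entry for debugging a single module might look like the line below. This is only a sketch: it assumes that the module of interest is the LP module, that its name in the *_debug pattern is spelled lp, and that parameters are written one per line as a keyword followed by its value. Substitute the name of the module you actually want to debug.

```text
lp_debug 1
```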

Debugging interacting parallel processes is tricky. The main difficulty is that the order of operations is hard to control. When processes run in parallel, the timing of their interactions varies from run to run with system load, process priorities, and so on, so it may not always be possible to reproduce an error. To make runs as reproducible as possible, make sure the parameter no_cut_timeout appears in the parameter file, or start SYMPHONY with the -a option. This keeps the cut generator from timing out, which is a major source of randomness. In addition, allow only one active node at a time by setting max_active_nodes to "1". This keeps the order in which the search tree is explored from varying between runs. These two steps should allow runs to be reproduced. You still have to be careful, but this should make things easier.
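The two settings above could be combined in the parameter file roughly as follows. This is a sketch rather than a definitive listing: it assumes one parameter per line, and that no_cut_timeout takes effect simply by appearing on a line of its own, as the wording above suggests.

```text
no_cut_timeout
max_active_nodes 1
```

If you start SYMPHONY with the -a option instead, the first line should not be needed.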

Ted Ralphs
2007-12-21