
November 22, 2005

Watchdog Reset

Today's tech tip comes from Geoff Huntley:

This fatal error usually indicates some kind of hardware problem.

Data corruption on the system is possible.

Look for some other message that might help diagnose the problem.

By itself, a watchdog reset doesn’t provide enough information to diagnose the problem: because traps are disabled, all information has been lost. If all that appears on the console is an ok prompt, issue the PROM command below to view the final messages that were logged just before the system failed:
ok f8002010 wector p
The result is a display of messages similar to those produced by the dmesg command. These messages can be useful in finding the cause of system failure.

This message doesn’t come from the kernel, but from the OpenBoot PROM monitor, a piece of Forth software that gives you the ok prompt before you boot UNIX. If the CPU detects a trap when traps are disabled (an unrecoverable error), it signals a watchdog. The OpenBoot PROM monitor detects the watchdog, issues this message, and brings down the system.
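
The rule described above (a trap that arrives while traps are disabled cannot be handled, so the machine is brought down) can be illustrated with a small toy model. This is only a sketch for intuition, not how the CPU or the OpenBoot PROM is actually implemented: the names ToyCpu, take_trap, return_from_trap and run are hypothetical, and the model assumes that entering a trap handler disables further traps until the handler returns.

```python
class WatchdogReset(Exception):
    """Raised by this toy model when a trap arrives while traps are disabled."""


class ToyCpu:
    """Hypothetical CPU model that tracks only a traps-enabled flag."""

    def __init__(self):
        self.traps_enabled = True
        self.log = []

    def take_trap(self, name):
        # A trap with traps disabled cannot be handled: signal a watchdog.
        if not self.traps_enabled:
            raise WatchdogReset(name)
        # Assumption: entering a handler disables further traps.
        self.traps_enabled = False
        self.log.append("handling " + name)

    def return_from_trap(self):
        # Leaving the handler re-enables traps.
        self.traps_enabled = True


def run(cpu, events):
    """Play a list of events; 'return' ends the current trap handler."""
    try:
        for event in events:
            if event == "return":
                cpu.return_from_trap()
            else:
                cpu.take_trap(event)
    except WatchdogReset as err:
        # Stand-in for the monitor reporting the reset and stopping the system.
        return "Watchdog Reset (trap '%s' with traps disabled)" % err
    return "ok"
```

In this model, run(ToyCpu(), ["page_fault", "return", "interrupt", "return"]) completes normally, while run(ToyCpu(), ["page_fault", "bus_error"]) ends in a watchdog reset because the second trap arrives before the first handler has returned.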

Posted by Ozguru at November 22, 2005 06:00 AM