Daylight Saving Changes (AU/Solaris)

I have had to chase up the various ways of resolving the changes to daylight saving for this year (due to the Commonwealth Games).

The solution for Solaris comes from The Good, the Blog & the Ugly:

The beginning and end of Daylight Savings in Australia is controlled by state government regulation, which means it can be changed at relatively short notice (except in Queensland, where it cannot be changed, as the extra daylight will fade curtains and confuse dairy cows).

Question: Has or will Sun release Solaris patches to take account of changes to daylight savings due to happen for the Commonwealth Games in 2006?
Answer: There is an RFE (Request For Enhancement) for this, but we have not yet developed & released patches. If you are a customer and want to be notified when the patches are released, please log a call to this effect. You can reference BugID 6282969 to identify the issue.

Posted by Ozguru at 06:00 AM | Comments (0)

Perfect Spider Score

Q: I have been playing Spider (Solitaire) on my Sun box but I can never get the maximum score (1000 points). The best I have got is around 990.

A: The trick is not to collapse any piles to the stacks at the top. See the picture for a 1000 point game:

2006-01-17--spider.jpg

Posted by Ozguru at 06:00 AM | Comments (0)

Something is wrong with SMC

Q: You mentioned SMC the other day and I tried it but it doesn't seem to work. I get a message that reads "Starting the server for the first time may take a few minutes" and then nothing happens.

A: Yes. Perfectly right.

:-)

SMC runs (well, walks... no, crawls) along in Java. I have never understood why Sun try to write apps in Java and then run (walk, crawl) them under Solaris - Java can be quite snappy on platforms like Mac or Windows but it really drags on Solaris.

2005-12-02--SMC_Startup.jpg

Basically the message you are getting is false advertising unless your definition of "few" extends to "quarter of an hour". Be patient. It might work. Eventually.

Please also note that I stated that Sun did NOT have an equivalent to SMIT but SMC was as close as you could get :-)

Posted by Ozguru at 06:00 AM | Comments (0)

JNI Fibre Drivers

Q1: I have heard that JNI cards are no longer supported....
Q2: Why can't I find a driver for my JNI card....

A: There was a formal announcement about this:

AMCC (formerly JNI) plans to exit the Solaris Fibre Channel HBA market and has terminated their reseller relationship with Hitachi Data Systems and other OEM partners. The last driver qualified, tested & supported by Hitachi Data Systems is v5.3.0.11 for FCE-6460 & FCE-1473. The final driver available from AMCC (v5.3.1.1) has not been tested or qualified by Hitachi Data Systems.
We would advise customers to migrate to a more readily supportable product from either Emulex or QLogic (eg, LP10000 or QLA2340). Both vendors now offer Solaris drivers with the "no reboot" feature.

The announcement is really referring to the JNI badged (as opposed to Sun badged) cards. I believe the drivers for Sun-badged cards are still in Solaris 10 (see this thread). To download the legacy drivers, try this site.

Posted by Ozguru at 06:00 AM | Comments (0)

IDPROM Invalid Contents (280R)

This message can also appear on a Sun Blade 1000; on both machines it should only happen under Solaris 8. Related symptoms include:

  • 'IDPROM Invalid Contents' message at banner during power-on

  • 'Trap32' or 'Trap63' failures at system power-on

  • 'Red State Exception' at power-on

  • Incorrect CPU Module temperature readings

The problem is resolved by a series of patches. SunAlertID 28290 specifically recommends:

  • 111228-01 or later

  • 111293-03 or later

  • 110383-01 or later

  • 108528-09 or later

  • 109888-05 or later

  • 110460-03 or later (SunBlade only)

  • 111292-03 or later

  • 110723-02 or later

  • 109882-04 or later

  • 110800-01 or later (Sun 280R only)

Posted by Ozguru at 06:00 AM | Comments (0)

Does Solaris have SMIT or SMITTY

AIX has some useful administration tools but Solaris does not.

Some versions of Solaris had admintool (users, printers) but the closest equivalent to SMIT would probably be SMC (Sun Management Centre). This is usually installed as an extra software package (not part of the standard install).

Posted by Ozguru at 06:00 AM | Comments (0)

StorEdge D1000 (Again)

A D1000 is a SCSI array. Actually it is two arrays.

OK, I actually had a couple of readers who were curious about this post. What is a D1000 they wanted to know and who cares...

Well, a Sun StorEdge D1000 is really just a differential SCSI array (actually two differential SCSI arrays in a single box).

It was intended for use in a rack mounted environment and could be configured as one array by crosslinking the two internal controllers. When crosslinked, you want to make sure that the individual target addresses do NOT overlap. Hence the comments the other day :-) The two controller cards appear as targets e and f and the disks (in a second generation D1000) would be 0, 1, 2, 3, 4, 5 and 8, 9, a, b, c, d. Note that 7 is not used because that is reserved for the host end of the SCSI connection.

The way I normally find these boxes deployed is in a boot-disk mirror. In this case the host server will have two SCSI controllers - one connected to the left hand array and one connected to the right hand array. Both arrays have their targets configured to start at zero so that the target numbers in the metadisk configuration look "similar" (i.e. less confusing for the poor sods who maintain it). For example the array I set up the other day ended up supplying c1t0, c1t1, c1t2, c1t3 and c5t0, c5t1, c5t2, c5t3. In this case c1t0 was mirrored to c5t0. Same for c1t1 and c5t1. The remaining four disks were assembled into a RAID 5 stripe.

Warning for the uninitiated - don't link the two halves together when you are connected to two controllers. SCSI is a bus architecture, not a ring technology! Use terminators instead.

Posted by Ozguru at 06:00 AM | Comments (4)

StorEdge D1000

This time ... read the fine manual ... just for a change.

Just in case any of you find yourselves installing a Sun StorEdge D1000 at some point in time, I would like to recommend that you get hold of the essential pdf file (805-2624-12) attractively called "SunStorEdge (TM) A1000 and D1000 Installation, Operations, and Service Manual" (look on the SunDocs site).

Why?

Because it has the correct settings for the incy-wincy-teeny-weeny dip-switches.

The very clear and exact instructions that come attached to the first generation D1000 (the one with only 8 disks instead of 12) suggest that the correct settings are switch 1 down and switch 2 up to give the addresses as targets 0, 1, 2, 3, 8, 9, a and b. If you split the D1000 you change switch 2.

This is wrong.

Very wrong.

The rightmost (as viewed from the back) dipswitch (1) controls the right hand array (as viewed from the front). Up gives 8, 9, a and b. Down gives 0, 1, 2 and 3. The next switch (2) controls the left hand side (up for 8, 9, a and b; down for 0, 1, 2 and 3).

Clear now? Probably not. Just follow the manual and completely ignore the instructions on the D1000 itself :-)
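
For what it is worth, the switch-to-target mapping described above can be written down as a small shell sketch. The function names (d1000_targets, d1000_check) are made up for illustration, and the mapping is simply the one given in this post for the first generation (8 disk) box, so check it against the manual before trusting it. It assumes a POSIX shell such as ksh or bash.

```shell
# Sketch only: target IDs for one half of a first generation D1000,
# as described in this post (up = 8 9 a b, down = 0 1 2 3).
# Needs a POSIX shell (ksh, bash); the old Solaris /bin/sh lacks $(...).
d1000_targets() {
    case "$1" in
        up)   echo "8 9 a b" ;;
        down) echo "0 1 2 3" ;;
        *)    echo "usage: d1000_targets up|down" >&2; return 1 ;;
    esac
}

# Switch 1 = right hand array, switch 2 = left hand array.
# If both halves share one bus, the two sets of targets must differ.
d1000_check() {
    right=$(d1000_targets "$1") || return 1
    left=$(d1000_targets "$2") || return 1
    if [ "$right" = "$left" ]; then
        echo "overlap: only safe when the array is split"
    else
        echo "ok: right=$right left=$left"
    fi
}

# Example: switch 1 down, switch 2 up
d1000_check down up
```

Calling d1000_check with the same position for both switches reports the overlap, which is what you want to know before crosslinking the two halves.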

Posted by Ozguru at 06:00 AM | Comments (0)

TechTip: AP and boot disks

Q: I have set up AP as per the normal configuration but I keep getting errors when I try to reboot. I even put a rootdev line in /etc/system but then the next reboot crashed completely.

A: The most likely cause is that you forgot to run apboot on your boot disk. Running apboot would have updated /etc/system with a whole bunch of other key lines:
* Begin AP root info (do not edit)
forceload: drv/ssd
forceload: drv/sf
forceload: drv/socal
forceload: drv/sbus
forceload: drv/ap_dmd
forceload: drv/ap
forceload: drv/pseudo
rootdev: /pseudo/ap_dmd@0:8,blk
* End AP root info (do not edit)

Without those lines, the kernel cannot mount the root filesystem....
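
A quick way to check before rebooting is to look for that block. The sketch below is only an illustration: the function name has_ap_root_info is made up, and it assumes the marker comment and rootdev line look exactly like the listing above.

```shell
# Sketch only: succeed if the given file (default /etc/system) contains
# the AP root marker and a rootdev line, as in the listing above.
has_ap_root_info() {
    file=${1:-/etc/system}
    grep 'Begin AP root info' "$file" >/dev/null 2>&1 &&
        grep '^rootdev:' "$file" >/dev/null 2>&1
}

if has_ap_root_info /etc/system; then
    echo "AP root info present"
else
    echo "AP root info missing: run apboot before rebooting"
fi
```

A hand-added rootdev line on its own fails the check, because the marker and the forceload lines that apboot writes would still be missing.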

Thanks to AD, DG, JT, TH for solving this one.

Posted by Ozguru at 06:00 AM | Comments (0) | TrackBack

Watchdog Reset

Today's tech tip comes from Geoff Huntley:

This fatal error usually indicates some kind of hardware problem.

Data corruption on the system is possible.

Look for some other message that might help diagnose the problem.

By itself, a watchdog reset doesn’t provide enough information; because traps are disabled, all information has been lost. If all that appears on the console is an ok prompt, issue the PROM command below to view the final messages that occurred just before system failure:
ok f8002010 wector p
The result is a display of messages similar to those produced by the dmesg command. These messages can be useful in finding the cause of system failure.

This message doesn’t come from the kernel, but from the OpenBoot PROM monitor, a piece of Forth software that gives you the ok prompt before you boot UNIX. If the CPU detects a trap when traps are disabled (an unrecoverable error), it signals a watchdog. The OpenBoot PROM monitor detects the watchdog, issues this message, and brings down the system.

Posted by Ozguru at 06:00 AM | Comments (0)

The Day SunOS Died

I found this over at Stokely Consulting and figured it was entirely relevant given that I am currently teaching a Sun course that includes the transition from SunOS (BSD) to Solaris (SysV):

The Day SunOS Died
by N.R. "Norm" Lunde, with apologies to Don McLean

Remember when those guys out West
With their longish hair and paisley vests
Were starting up, straight out of UCB?

They used those Motorola chips
Which at the time were really hip
And looked upon the world through VME.

Their first attempt ran like a pig
But it was the start of something big;
They called the next one the Sun-2
And though they only sold a few
It soon gave birth unto the new
Sun-3 which was their pride
And now they're singing

"Bye, bye, SunOS 4.1.3!
ATT System V has replaced BSD.
You can cling to the standards of the industry
But only if you pay the right fee --
Only if you pay the right fee..."

The hardware wasn't all they sold.
Their Berkeley port was solid gold
And interfaced with system V, no less!

They implemented all the stuff
That Berkeley thought would be enough
Then added RPC and NFS.

It was a lot of code to cram
Into just four megs of RAM.
The later revs were really cool
With added values like SunTools
But then they took us all for fools
By peddling Solaris...
And they were singing,

"Bye, bye, SunOS 4.1.3!
ATT System V has replaced BSD.
You can cling to the standards of the industry
But only if you pay the right fee --
Only if you pay the right fee..."

They took a RISC and kindled SPARC.
The difference was like light and dark.
The Sun-4s were the fastest and the best.

The user base was having fun
Installing SunOS 4.1
But what was coming no one could have guessed.

The installed base was sound.
The software did abound.
While all the hackers laughed and played
Already plans were being made
To make the dubious "upgrade"
To Sun's new Solaris...
And Sun was singing,

"Bye, bye, SunOS 4.1.3!
ATT System V has replaced BSD.
You can cling to the standards of the industry
But only if you pay the right fee --
Only if you pay the right fee..."

The cartridge tapes were first to go
And CDROM's a must, you know
And floppy drives will soon go out the door.

I tried to call and ask them why
But they took away my tty
and left my modem lying on the floor.

While they were on a roll
They moved the damned control.
The Ethernet's now twisted pair.
Which no one uses anywhere.
ISDN is still more rare--
The bandwidth's even less!
But still they're singing

"Bye, bye, SunOS 4.1.3!
ATT System V has replaced BSD.
You can cling to the standards of the industry
But only if you pay the right fee --
Only if you pay the right fee..."

The worst of all is what they've done
To software that we used to run
Like dbx and even /bin/cc.

Compilers now have license locks
Wrapped up in OpenWindows crocks
We even have to pay for gcc!

The applications broke;
/usr/local went up in smoke.
The features we've depended on
Before too long will all be gone
But Sun, I'm sure, will carry on
By peddling Solaris,
Forever singing,

"Bye, bye, SunOS 4.1.3!
ATT System V has replaced BSD.
You can cling to the standards of the industry
But only if you pay the right fee --
Only if you pay the right fee..."

Posted by Ozguru at 06:00 AM | Comments (0)

SunOS vs Solaris

The underlying operating system is, was and still will be SunOS. The package (operating system plus windowing system plus extra software) is called Solaris.

The deal was that Sun announced the coming of Solaris (SVR4) and predicted that it would soon replace SunOS (BSD). The problem was that the timing was unrealistic and there was no way it was going to happen on schedule. Enter the marketing geniuses who relabelled the existing operating system Solaris 1 (which meant "Solaris" was available on time, but it was not SVR4). The real Solaris was called Solaris 2.

So what was the relationship between Solaris and SunOS?

  • SunOS (Solaris) - Date (Platforms)

  • 4.0.2 (none) - Sep. 89 (386i)

  • 4.0.3 (none) - May 89 (sun2, sun3/3x, sun4)

  • 4.0.3c (none) - June 89 (Sparc 1)

  • 4.0.3 PSR_A (none) - July 89 (Sun 4/470, 4/490)

  • 4.1 (none) - Mar. 90 (sun3, sun4)

  • 4.1e (none) - Apr. 91 (sun4e)

  • 4.1.1 (none) - Mar. 90 (sun3/3x, sun4)

  • 4.1.1B (1.0) - Feb. 91 (sun4)

  • 4.1.1.1 (1.0) - Jul. 91 (sun3/3x)

  • 4.1.1_U1 (1.0) - Nov. 91 (sun3/3x)

  • 4.1.2 (1.0.1) - Dec. 91 (sun4, sun4m)

  • 4.1.3 (1.1A) - Aug. 92 (sun4, sun4c, sun4m)

  • 4.1.3C (1.1c) - Nov. 93 (Sparc LX/Classic)

  • 4.1.3_U1 (1.1.1) - Dec. 93 (sun4, sun4c, sun4m)

  • 4.1.3_U1B (1.1.1B) - Feb. 94 (sun4, sun4c, sun4m)

  • 4.1.4 (1.1.2) - Nov. 94 (sun4, sun4c, sun4m)

  • 5.0 (2.0) - Jul. 92 (sun4c)

  • 5.1 (2.1) - Dec. 92 (sun4, sun4c, sun4m, x86)

  • 5.2 (2.2) - May 93 (sun4, sun4c, sun4m, sun4d)

  • 5.3 (2.3) - Nov. 93 (sun4, sun4c, sun4m, sun4d)

  • 5.4 (2.4) - Aug. 94 (sun4, sun4c, sun4m, sun4d, x86)

  • 5.5 (2.5) - Nov. 95 (sun4c, sun4m, sun4d, sun4u, x86)

  • 5.5.1 (2.5.1) - May 96 (sun4c, sun4m, sun4d, sun4u, x86, ppc)

  • 5.6 (2.6) - Aug. 97 (sun4c, sun4m, sun4d, sun4u, x86)

  • 5.7 (7) - Oct. 98 (sun4c, sun4m, sun4d, sun4u, x86)

  • 5.8 (8) - Feb. 2000 (sun4m, sun4d, sun4u, x86)

  • 5.9 (9) - May 2002 (sun4m, sun4u, x86)

  • 5.10 (10) - Jan. 2005 (sun4u, x86)
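
For the 5.x releases the pattern in the list is regular enough to script: up to 5.6 the Solaris name is "2." plus whatever follows the "5.", and from 5.7 onwards the "5." is simply dropped. The sketch below turns a SunOS release string (the kind of thing uname -r prints) into the Solaris name. The function name solaris_name is made up for illustration, it deliberately ignores the 4.x releases (which do not follow a rule), and it assumes a POSIX shell such as ksh or bash.

```shell
# Sketch only: map a SunOS 5.x release to its Solaris name, following
# the list above. Needs a POSIX shell for the ${var#prefix} expansion.
solaris_name() {
    case "$1" in
        5.[0-6]|5.[0-6].*) echo "2.${1#5.}" ;;   # 5.0 to 5.6 -> 2.0 to 2.6
        5.*)               echo "${1#5.}" ;;     # 5.7 onwards -> 7, 8, 9, 10
        *)                 echo "unknown"; return 1 ;;
    esac
}

for rel in 5.5.1 5.6 5.8 5.10; do
    echo "SunOS $rel = Solaris $(solaris_name "$rel")"
done
```

On a live system you would feed it the output of uname -r instead of the hard-coded list.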

Posted by Ozguru at 06:00 AM | Comments (1)

T3 Disk Replacement

Warning: Replace only one disk drive in an array at a time to ensure that no data is lost. Ensure that the disk drive is fully re-enabled before replacing another disk drive in the same array.

The default configuration for the array is to automatically spin up and re-enable a replaced disk drive, then automatically reconstruct the data from the parity or hot-spare disk drives. Disk drive spin up takes about 30 seconds, and reconstruction of the data on the disk drive can take one or more hours depending on system activity.

  1. Remove the front panel
  2. Locate the disk to replace. Disks are numbered 1 to 9 starting on the left
  3. Use a small screwdriver to press in and release the black latch
  4. Use the latch to slowly remove the drive (I usually pause a moment after getting the drive about 25% out to wait for it to spin down)
  5. Release the latch on the replacement drive and insert the drive
  6. Press the latch in using your screwdriver
  7. Replace the faceplate
  8. Using the CLI, verify that the insertion worked and data is being restored

Posted by Ozguru at 06:00 AM | Comments (0)

SCSI IDs for Disk Cards

When you install a disk card (see TechTip: Building an E4500/5500/6500 for more information about disk cards), the SCSI IDs seem to come out strange. There is actually a naming scheme which lets you predict the numbers - useful if you are connecting to the internal SCSI bus.

Note that these default drive address settings are assigned by the centerplane slot position when a jumper is *NOT* installed on J0702 and J0703 Pins 1-2.

- If the Board is in Slot 0, the disks will be 4 and 5 (clash with internal tape units)*
- Slot 1, disks will be 6 and 7 (clash with CD-ROM and motherboard)*
- Slot 2, disks 0, 1
- Slot 3, disks 10, 11 (a, b)
- Slot 4, disks 2, 3
- Slot 5, disks 12, 13 (c, d)
- Slot 6, disks 8, 9
- Slot 7, disks 14, 15

If you have a 6500, there are extra slots:
- Slot 8, disks 10, 11 (a, b) - clash with slots 3, 15
- Slot 9, disks 0, 1 - clash with slots 2, 14
- Slot 10, disks 12, 13 - clash with slot 5
- Slot 11, disks 2, 3 - clash with slot 4
- Slot 12, disks 14, 15 - clash with slot 7
- Slot 13, disks 8, 9 - clash with slot 6
- Slot 14, disks 0, 1 - clash with slots 2, 9
- Slot 15, disks 10, 11 - clash with slots 3, 8

On the other hand, if you jumper J0702 (first disk) or J0703 (second disk), then you can set the SCSI ID manually for the disk. Just avoid 4, 5, and 6 if you want to use the internal SCSI bus.

[* So it is just as well you cannot install a disk board in this slot :-)]
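
If you would rather look the numbers up than remember them, the default assignments listed above fit in a short shell sketch. The function names (disk_board_ids, boards_clash) are invented for this example and the table is copied straight from this post, so it only applies when J0702 and J0703 are not jumpered. It assumes a POSIX shell such as ksh or bash.

```shell
# Sketch only: default SCSI IDs for a disk board, by centerplane slot,
# when J0702/J0703 are NOT jumpered. Table copied from this post.
# Needs a POSIX shell (ksh, bash); the old Solaris /bin/sh lacks $(...).
disk_board_ids() {
    case "$1" in
        0)        echo "4 5" ;;
        1)        echo "6 7" ;;
        2|9|14)   echo "0 1" ;;
        3|8|15)   echo "10 11" ;;
        4|11)     echo "2 3" ;;
        5|10)     echo "12 13" ;;
        6|13)     echo "8 9" ;;
        7|12)     echo "14 15" ;;
        *)        echo "unknown slot: $1" >&2; return 1 ;;
    esac
}

# Two boards clash if their slots map to the same pair of IDs.
boards_clash() {
    [ "$(disk_board_ids "$1")" = "$(disk_board_ids "$2")" ]
}

for slot in 2 3 9; do
    echo "slot $slot: IDs $(disk_board_ids "$slot")"
done
```

boards_clash 2 9 succeeds (both give 0 and 1), which matches the clash noted for slot 9 in the list.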

Posted by Ozguru at 06:00 AM | Comments (0)

Adding a Hot Spare in SVM

Sometimes, when creating SVM meta devices, I forget to attach the hotspare disks to the correct meta volumes. This is important because a generic hotspare pool can be used by any meta volume and you may want to reserve it for a particular use. For example, if you had some mirrors and a RAID5 set, you may want to restrict the hotspare to the RAID set because the mirrors already have more redundancy built into them.

It turns out that this is easy to do (although the documentation is hard to find):
metaparam -h hot-spare-pool component

A more complete example (from the manual):

# metaparam -h hsp001 d10
# metastat d10
d10: RAID
State: Okay
Hot spare pool: hsp001

Posted by Ozguru at 06:00 AM | Comments (0)

TurboGX Video Card

Want to use an older Sun box with a newer monitor?

Older Sun systems often used cgsix (GX) graphics cards, which assume a 1152x900 display unless the monitor tells them otherwise (in response to a query from the graphics card). This may have worked with Sun monitors but, more often than not, you find yourself with any old monitor and the sense code idea simply does not work. Sometimes the monitor will start up anyway; other times it will refuse to play.

To override the default behaviour you need to be at the ok prompt. You need to use a supported resolution and frequency such as:
ok setenv output-device screen:r1280x1024x76

If this does not work and you need to set it back, use:
ok setenv output-device screen
ok reset

Posted by Ozguru at 12:00 PM | Comments (0)

PROM Commands

Typing help at the ok prompt will give you a list of main categories of commands available. Typing help category will show help for all commands in the category (use only the first word of the category description). Typing help command will show help and details of the individual command.

A lesser known boot PROM tool is the sifting command. Typing sifting characters will display names of all commands containing your sequence of characters. For example:

ok sifting probe
probe-all probe-scsi-all probe-sbus probe-slots probe probe-fpu probe-virtual lprobe wprobe cprobe
ok

You can then use help command to get the correct syntax.

Posted by Ozguru at 06:00 AM | Comments (0)

Recreate /dev/null

If for some reason you have lost your /dev/null (easily done by moving a file to /dev/null as root), you can recreate it. The simplest way is to recreate the soft link (as root again):

devlinks

But if you have messed up the original device in /devices proceed as follows:

mknod /devices/pseudo/mm@0:null c 13 2
chown root:sys /devices/pseudo/mm@0:null
chmod 666 /devices/pseudo/mm@0:null
cd /dev
ln -s ../devices/pseudo/mm@0:null null

Posted by Ozguru at 06:00 AM | Comments (2)

Bad blocks (mirrored disk)

If you run fsck (or newfs) on a mirrored disk and you get a ‘bad blocks’ message, then check the block numbers against the partition information (from format). A true bad block should never be visible because of the data mirror.

It is possible to create partitions, build filesystems and then change the partitions. fsck will then complain about any blocks which *could* exist in the filesystem but are actually in another partition.

This error does not actually mean that data has been lost. The blocks are flagged regardless of the presence of data.

Posted by Ozguru at 06:00 AM | Comments (0)

Creating Disk Layouts

We already had some tech tips about disk layouts, but they assume that you already have a layout to copy or work with. What if you have a new disk and you want some suggestions?

Well, once upon a time, a long time ago, you used to build separate partitions for / (root), /var, /usr, /opt, /export/home, etc. These days the considerations are somewhat simplified by the sheer size of the average boot disk. The base considerations are: how much swap do you need (older Solaris versions have specific requirements that relate to memory size), how much stuff are you going to load in terms of Solaris optional features and do you feel like playing Russian Roulette? This last point refers to the /var filesystem - the place where all the logs are kept as well as the critical directory /var/sadm.

If you feel lucky, put /var in the root filesystem. If you fail to do corrective surgery on your logs, they will grow and grow. One day, you will get a full filesystem. If this is the root filesystem, your installation is hosed and I hope you have a good backup and recovery procedure.

If you don't feel lucky today, put /var in a filesystem by itself :-)

So we need three filesystems: swap, / (root) and /var. The rest of the space is yours to arrange.

What I usually do (and this is only a rule of thumb) is to think in 6ths or 7ths of a disk. Imagine that your disk has 14087 cylinders. This is roughly 2000 x 7. So we divide the disk into chunks of around 2000 cylinders as follows (most left over cylinders go into swap):

  • Slice 0 = /, starts at 2080, 2000 cylinders long

  • Slice 1 = swap, starts at 0, 2080 cylinders long

  • Slice 2 = reserved - don't change this

  • Slice 3 = /var, starts at 4080, 2000 cylinders long

  • Slice 4 = unused

  • Slice 5 = unused

  • Slice 6 = /export/home, starts at 6080, 8000 cylinders long

  • Slice 7 = unused for now

Slice 7 is actually intended for use later by SVM and there just happen to be 7 cylinders left over :-) Note that / and /var will be almost 10GB each and swap will be slightly larger. /export/home will be almost 40GB.
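
The arithmetic behind that layout is easy to script if you want to try other disk sizes. This is only a sketch of my rule of thumb, not a tool: the function name layout and its three arguments (total cylinders, chunk size, cylinders held back for slice 7) are invented for the example, and it assumes a POSIX shell such as ksh or bash.

```shell
# Sketch only: split a disk into sevenths as in the worked example.
# usage: layout TOTAL_CYLINDERS CHUNK RESERVE
# Prints: slice, mount point, start cylinder, length in cylinders.
# Needs a POSIX shell (ksh, bash); the old Solaris /bin/sh lacks $((...)).
layout() {
    total=$1; chunk=$2; reserve=$3
    # Whatever is left after seven chunks and the reserve goes to swap.
    spare=$(( total - 7 * chunk - reserve ))
    if [ "$spare" -lt 0 ]; then
        echo "chunk too large for this disk" >&2
        return 1
    fi
    swap_len=$(( chunk + spare ))
    var_start=$(( swap_len + chunk ))
    home_start=$(( var_start + chunk ))
    home_len=$(( 4 * chunk ))
    echo "0 / $swap_len $chunk"
    echo "1 swap 0 $swap_len"
    echo "3 /var $var_start $chunk"
    echo "6 /export/home $home_start $home_len"
    echo "7 spare $(( home_start + home_len )) $reserve"
}

layout 14087 2000 7
```

With the numbers from the example (14087 cylinders, chunks of 2000, 7 held back) it prints the same starts and lengths as the list above, with slice 7 starting at cylinder 14080.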

Posted by Ozguru at 06:00 AM | Comments (0)

Ethernet Settings

V240 ALOM Warning lights

Automount /home (Solaris 10)

T3 Serial Cables

Kernel Changes Missing

T3 Disk Lights

Solaris vs Serial Terminals

Combining Solaris Patch Directories

AP and Solaris

Solaris 9, STMS and A5000

Ethernet Devices (Solaris)

SVM / SDS numbering

Who or what was the KOTF?

Rounding with strfmon on Solaris

Replacing an A1000 battery

Building an E4500/5500/6500

Copying Disk Layouts (Solaris)

Ultra 5 vs Ultra 10

Useful Documentation

TechTip: Insufficient metadevice database replicas ...

X11 and Solaris

Solaris & CE

PCI Card Identification

Disk Device Names (Solaris)

V880 and AP

SunOS Bootblocks

Netra SAN Cards

Unmounting Filesystems

Getting to the PROM Prompt

Patching FCAL Loops (V880)

TechTip: Sun Ethernet Ports

Happy Meal Ethernet

Tech Tip: Check the backside

TechTip: Bad blocks (mirrored disk)

TechTip: Adding a NIS Slave Server

TechTip: Solaris Packages

TechTip: Wierd Solaris Bugs