Saturday, July 11, 2009

Installing OpenSolaris 2009.06 on SPARC without wanboot

With the release of OpenSolaris 2009.06, SPARC is now an officially supported platform.

Unfortunately, older SPARCs with an OBP revision earlier than 4.17 are not currently supported because their firmware lacks wanboot. It is still possible to use wanboot on these machines; however, doing so requires SXCE install media.
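
If the machine already runs some release of Solaris, one way to tell whether it falls into this group is to look at the firmware version that prtconf -V reports. The following is a minimal sketch, not part of the original procedure: the function name obp_needs_workaround is made up for illustration, and it assumes the version is printed in the usual "OBP <major>.<minor>.<patch> <date>" form.

```shell
# Hypothetical helper: exit 0 when the OBP version string on stdin
# (assumed form: "OBP 4.5.21 2003/02/24 17:23") is older than 4.17,
# exit 1 when it is 4.17 or newer, exit 2 when the input is not
# recognised.
obp_needs_workaround() {
    read tag version rest || return 2
    [ "$tag" = "OBP" ] || return 2
    major=${version%%.*}      # text before the first dot
    tail=${version#*.}        # text after the first dot
    minor=${tail%%.*}         # second component only
    [ "$major" -lt 4 ] && return 0
    [ "$major" -eq 4 ] && [ "$minor" -lt 17 ] && return 0
    return 1
}

# Intended use on an installed system (assumes prtconf is available):
#   prtconf -V | obp_needs_workaround && echo "firmware lacks wanboot"
```

Only the first two version components are compared, since the cutoff quoted above is 4.17.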

To add insult to injury, the AI client miniroot (understandably) makes a number of assumptions about the environment, particularly with respect to network interface configuration (specifically netbootinfo and dhcpagent). For the curious, these issues have been documented in Bugzilla under Bug 9549.

What follows is a workaround to install OpenSolaris on older SPARC hardware using wanboot from the SXCE install media. This process assumes that you have correctly configured the AI server and client as documented in the OpenSolaris Automated Installer Guide.

To get started, you will need to download and burn the SXCE install media (snv_111 or higher is required due to a number of recent fixes to wanboot). Place the install media into your DVD drive, drop into the PROM, and issue:

{0} ok boot cdrom -F wanboot -o dhcp
At this point, the system will boot wanboot from the install media and pick up its configuration from DHCP.

Eventually, the boot process will fail with an error:
Rebooting with command: boot cdrom -F wanboot -o dhcp
Boot device: /pci@8,700000/scsi@6/disk@6,0:f File and args: -F wanboot -o dhcp
<time unavailable> wanboot info: WAN boot messages->console
<time unavailable> wanboot info: configuring /pci@8,600000/pci@1/network@0

1000 Mbps FDX Link up
<time unavailable> wanboot info: Starting DHCP configuration
<time unavailable> wanboot info: DHCP configuration succeeded

<time unavailable> wanboot info: Default net-config-strategy: dhcp
<time unavailable> wanboot progress: wanbootfs: Read 366 of 366 kB (100%)
<time unavailable> wanboot info: wanbootfs: Download complete
Sat Jul 11 01:40:14 wanboot progress: miniroot: Read 175868 of 175868 kB (100%)
Sat Jul 11 01:40:14 wanboot info: miniroot: Download complete
SunOS Release 5.11 Version snv_111b 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
strplumb: open /devices/pseudo/clone@0:sd failed: 19
Hostname: opensolaris
Remounting root read/write
Probing for device nodes ...
Preparing automated install image for use
The AI image will be retrieved from /export/aiserver/osol-0906-ai-sparc/ directory
Downloading solaris.zlib archive
--18:41:16-- http://10.8.0.8:5555/export/aiserver/osol-0906-ai-sparc//solaris.zlib
=> `/tmp/solaris.zlib'
Connecting to 10.8.0.8:5555... failed: Network is unreachable.
FAILED
Requesting System Maintenance Mode
(See /lib/svc/share/README for more information.)
Console login service(s) cannot run

Enter user name for system maintenance (control-d to bypass):
At this point you will need to log in as root (the password will be empty) and configure your network interfaces manually:
# ifconfig -a plumb
# ifconfig <interface> dhcp
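
Before moving on, it can be worth confirming that the interface actually obtained a lease. ifconfig <interface> dhcp status prints a small table whose second column is the DHCP state; an acquired lease is normally reported as BOUND, while a client still waiting for an offer shows SELECTING. The sketch below pulls that column out using only shell built-ins, since the miniroot may not ship awk or grep. The function name dhcp_state is an assumption made for this example.

```shell
# Hypothetical helper: read `ifconfig <interface> dhcp status` output
# on stdin and print the State column for the interface named in $1.
# Returns 1 if the interface does not appear in the input.
dhcp_state() {
    want=$1
    while read name state rest; do
        if [ "$name" = "$want" ]; then
            echo "$state"
            return 0
        fi
    done
    return 1
}

# Intended use in the maintenance shell:
#   ifconfig <interface> dhcp status | dhcp_state <interface>
```

If the state never leaves SELECTING, the client is not receiving the server's offers, and the manual DNS and service steps that follow will not help until that is resolved.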
You will also need to enable DNS; this is required for pkg(1) to locate and install packages. Note that the wanboot miniroot is very minimal: you do not have access to some of the more common commands such as cp and rmdir. The miniroot also uses dcfs(7FS), which means some additional steps are needed to modify the filesystem contents:
# cat > /etc/resolv.conf
nameserver <address>
^D
# rm /etc/nsswitch.conf
# cat /etc/nsswitch.dns > /etc/nsswitch.conf
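
The same two edits can be wrapped in small functions, which is convenient when several nameservers are involved or when the steps have to be repeated on more than one machine. This is only a sketch: the names write_resolv_conf and replace_file are invented here, the addresses in the usage comment are documentation placeholders, and the code relies solely on rm, cat and echo because cp is not available in the miniroot.

```shell
# Hypothetical helper: write one "nameserver" line per address.
#   $1      file to create (for example /etc/resolv.conf)
#   $2...   nameserver addresses
write_resolv_conf() {
    target=$1
    shift
    rm -f "$target"
    for ns in "$@"; do
        echo "nameserver $ns"
    done > "$target"
}

# Hypothetical stand-in for cp: remove the destination first, then
# recreate it with cat, mirroring the rm + cat sequence shown above.
replace_file() {
    rm -f "$2"
    cat "$1" > "$2"
}

# Intended use in the maintenance shell (placeholder addresses):
#   write_resolv_conf /etc/resolv.conf 192.0.2.53 192.0.2.54
#   replace_file /etc/nsswitch.dns /etc/nsswitch.conf
```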
At this point, the network has been configured correctly; however, we still need to clean up some state to allow the svc:/system/filesystem/root:live-media service to continue booting:
# umount /etc/netboot
# rm -rf /etc/netboot
# umount /tmp
Before clearing the svc:/system/filesystem/root:live-media service, I had to disable a couple of services for a clean boot. The svc:/platform/sun4u/dscp:default and svc:/platform/sun4u/sckmd:default services were not necessary for my hardware and, if left enabled, caused the install process to fail. These services are safe to disable unless you have the hardware they refer to:
# svcadm disable dscp
# svcadm disable sckmd
You are now free to clear the svc:/system/filesystem/root:live-media service:
# svcadm clear root:live-media
Remounting root read/write
Probing for device nodes ...
Preparing automated install image for use
Before logging out, you will also want to clear the svc:/network/dns/multicast:default service to ensure service discovery works correctly:
# svcadm clear dns/multicast
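
Before logging out, a quick way to confirm that nothing is still stuck is to list any services that remain in the maintenance state. The sketch below filters svcs output with shell built-ins only. It assumes that svcs is present in the miniroot alongside svcadm, which I have not verified, and the function name services_in_maintenance is made up for this example.

```shell
# Hypothetical helper: read `svcs -H -o state,fmri` output on stdin
# (one "state fmri" pair per line) and print the FMRI of every
# service whose state is "maintenance".
services_in_maintenance() {
    while read state fmri; do
        if [ "$state" = "maintenance" ]; then
            echo "$fmri"
        fi
    done
}

# Intended use in the maintenance shell:
#   svcs -H -o state,fmri | services_in_maintenance
```

An empty result suggests that every service which had been flagged as failed has been cleared or disabled.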
Once you log out of the maintenance shell, the AI process will continue:
# logout
The AI image will be retrieved from /export/aiserver/osol-0906-ai-sparc/ directory
Downloading solaris.zlib archive
--18:43:25-- http://10.8.0.8:5555/export/aiserver/osol-0906-ai-sparc//solaris.zlib
=> `/tmp/solaris.zlib'
Connecting to 10.8.0.8:5555... connected.
HTTP request sent, awaiting response... 200 OK
Length: 83,334,656 (79M) [text/plain]

100%[====================================>] 83,334,656 50.36M/s

18:43:26 (50.28 MB/s) - `/tmp/solaris.zlib' saved [83334656/83334656]

Downloading solarismisc.zlib archive
--18:43:26-- http://10.8.0.8:5555/export/aiserver/osol-0906-ai-sparc//solarismisc.zlib
=> `/tmp/solarismisc.zlib'
Connecting to 10.8.0.8:5555... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3,857,408 (3.7M) [text/plain]

100%[====================================>] 3,857,408 --.--K/s

18:43:26 (47.53 MB/s) - `/tmp/solarismisc.zlib' saved [3857408/3857408]

--18:43:26-- http://10.8.0.8:5555/export/aiserver/osol-0906-ai-sparc//install.conf
=> `/tmp/install.conf'
Connecting to 10.8.0.8:5555... connected.
HTTP request sent, awaiting response... 200 OK
Length: 61 [text/plain]

100%[====================================>] 61 --.--K/s

18:43:26 (2.44 MB/s) - `/tmp/install.conf' saved [61/61]

Done mounting automated install image
Configuring devices.
Reading ZFS config: done.

opensolaris console login: Service discovery phase initiated
Service name to look up: 0906sparc
Service discovery finished successfully
Process of obtaining configuration manifest initiated
Configuration manifest obtained
Automated Installation started
The progress of the Automated Installation can be followed by viewing the logfile at /tmp/install_log
At this point you should be good to go - Enjoy!

13 comment(s):

  1. Hello,

    Do you know if configuring the interface with a static IP (by changing the "ifconfig <interface> dhcp" step) will work? I would like to use your procedure to avoid using DHCP, which is forbidden on my network.

    Thanks!

  2. It takes some additional steps, but it should work just fine:

    # ifconfig <interface> <address> netmask <netmask> broadcast + up

    # route add default <router_address>

    Mind you, I have not tried this myself, but this should be all it takes to get your interface up manually.

    HTH

  3. Yes, it sounds good and we should try it. Thanks! I will keep you informed ;)

  4. Yes, for static IP, single client, I tried:

    ifconfig -a plumb
    ifconfig network-address
    cat >/etc/resolv.conf
    (install server ip)

    # umount /etc/netboot
    # rm -rf /etc/netboot
    # umount /tmp
    # svcadm disable dscp
    # svcadm disable sckmd
    # svcadm clear root:live-media

    After this point, it burst into life without needing to logout.

    In fact, I am using an Auto-Installer (via Installadm pkg) on a Laptop with the WANboot options + cgi definitions and originally it still failed with the errors displayed above.

  5. ..having just said that, it just drops into single user mode as if I had typed:
    boot net -s..

  6. Out of curiosity, does your install server also provide DNS? Your /etc/resolv.conf file should contain one or more lines like the following:

    nameserver <dns server ip>

    If this is not the case, then the install will indeed fail.

    Also, you must log out for the rest of the process to continue; I believe the actual automated install does not kick off until the console-login service starts.

  7. I used your procedure on an old X1, kinda. It doesn't have a CD but it already had another version of Solaris installed, so I copied wanboot into the /platform/SUNW,UltraAX-i2 directory and did a 'boot disk:a -F wanboot -o dhcp'.

    All went as you indicated until the system tried to DHCP. I can see the discover arrive and the offer go back out but the install client never sees it. It worked on the original boot but not with miniroot booted so I suspect a driver issue. I also had an unexpected message during the boot process, it may be the cause or a symptom.
    'WARNING: pcipsy0: ino 0x6 blocked'.

    I am kind of stuck at this point, any hints would be appreciated.
    P.S. I have more than one of these machines and I tried it on another with identical results.

  8. Out of curiosity, are you using an eri interface? IIRC, the wanboot miniroot does not include support for the eri driver.

    You might want to check the contents of /kernel/drv/sparcv9 to see which chipsets are supported in the wanboot miniroot.

    If you can get your hands on a 'newer' chipset (i.e. ce) you will likely have better luck.

    Another option is to pack your own wanboot image and include the missing driver yourself. I've had limited luck with this as the documentation is pretty sparse. Essentially you will need to mount the wanboot miniroot using dcfs and install the driver.

    HTH

  9. The X1 doesn't have a lot of options. No expansion slots of any kind. It uses a Davicom 9102 chip with the dmfe driver. The driver is there (the eri as well, by the way), it just doesn't work 100%. The machine is running ' 5.10 Generic_141414-02' now with no issues, I just wanted to run OpenSolaris.

  10. Great post - I managed to get OS 2009.06 to install on an Ultra 60, which is allegedly 'not possible' ;).

    I did notice something quite interesting, however: if you have an old-style jumpstart server, you can copy the wanboot binary onto it and set up your client SPARC machine to netboot from it in the original fashion. This actually worked correctly for me - I didn't hit the 'Network is unreachable' error - and it went straight into the AI installation.

  11. I'm having the exact same issue as JohnR, and I'm wondering if anyone has an update on this.

    I have a Sun Fire V100 and am loading wanboot over the network (I use rarp/tftp to load it) with 'boot net -o prompt - install'.

    Just like in the example, it bails and I go into maintenance mode. From here, however, I cannot get anything.

    root:@~# ifconfig -a plumb
    root:@~# ifconfig -a
    dmfe0: flags=1000842[BROADCAST,RUNNING,MULTICAST,IPv4] mtu 1500 index 6
    inet 0.0.0.0 netmask 0
    ether 0:3:ba:16:85:f3
    dmfe1: flags=1000802[BROADCAST,MULTICAST,IPv4] mtu 1500 index 7
    inet 0.0.0.0 netmask 0
    ether 0:3:ba:16:85:f4
    root:@~# ifconfig dmfe0 dhcp

    And nothing happens. Looking at my DHCP server, I see the requests and the responses. But nothing comes back. From my DHCP log:

    Dec 9 16:31:16 wilma dnsmasq[9317]: DHCPDISCOVER(eth0) 00:03:ba:16:85:f3
    Dec 9 16:31:16 wilma dnsmasq[9317]: DHCPOFFER(eth0) 192.168.0.60 00:03:ba:16:85:f3

    But on the Sun, nothing. In fact, the status gives:

    root@:~# ifconfig dmfe0 dhcp status
    Interface State Sent Recv Declined Flags
    dmfe0 SELECTING 5 0 0

    So it seems it isn't receiving any ethernet packets. Now, perhaps this is related to the 'pcipsy0: ino 0x6 blocked' messages?

    So, JohnR, did you get it to boot? Or does anyone else have any ideas?

  12. I think all I had to do was move a cable. Two different Linux distros (Ubuntu and Gentoo) misidentify the Ethernet ports: the one labeled net0 is eth0 and vice versa. With Sun software (boot code and OSOL) the ports are correctly identified, but in the miniroot they are backwards. The easiest thing to do is connect both interfaces to your network, and it should work either way. I will double-check next week, but I wanted to get this out.

