Monday, July 10, 2006

Fun with MSI, ACPI, and FreeBSD.

For the last two weeks I have been working on and off on a problem involving ACPI and some 1U servers using MSI-9618 motherboards. After booting FreeBSD, I couldn't get the second NIC to work at all. Looking at 'vmstat -i' showed that em0 and em1 were sharing an interrupt:


# vmstat -i
interrupt                  total  rate
irq3: sio1              52807402    55
irq4: sio0                  1604     0
irq14: ata0                   36     0
irq16: em0 em1+          2527378     2
irq19: uhci1+              86991     0
cpu0: timer           1886077214  2000
cpu1: timer           1886076861  2000
Total                 3827577486  4058
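
If you have more than a couple of boxes to check, spotting shared interrupts by eye gets old. Here is a rough Python sketch that scans 'vmstat -i' output for irq lines listing more than one device. The function name shared_irqs and the "(more)" marker are my own inventions for illustration, and I'm assuming a trailing '+' on a device name means more handlers are attached than fit in the name.

```python
import re

# The first 'vmstat -i' output from this post, used as sample input.
SAMPLE = """\
interrupt total rate
irq3: sio1 52807402 55
irq4: sio0 1604 0
irq14: ata0 36 0
irq16: em0 em1+ 2527378 2
irq19: uhci1+ 86991 0
cpu0: timer 1886077214 2000
cpu1: timer 1886076861 2000
Total 3827577486 4058
"""

# "irqN: dev [dev ...] total rate"; the header, cpu and Total lines do not match.
IRQ_LINE = re.compile(r"^(irq\d+):\s+(.+?)\s+(\d+)\s+(\d+)$")


def shared_irqs(text):
    """Return {irq: [devices]} for every irq line that lists several devices.

    Assumption: a trailing '+' means extra handlers share the interrupt
    beyond the ones named, so such a line is flagged too.
    """
    shared = {}
    for line in text.splitlines():
        match = IRQ_LINE.match(line.strip())
        if not match:
            continue
        irq = match.group(1)
        names = match.group(2).split()
        truncated = names[-1].endswith("+")
        devices = [name.rstrip("+") for name in names]
        if len(devices) > 1 or truncated:
            shared[irq] = devices + (["(more)"] if truncated else [])
    return shared


if __name__ == "__main__":
    for irq, devices in shared_irqs(SAMPLE).items():
        print(irq, "is shared by", ", ".join(devices))
```

Sharing an interrupt is not a bug in itself, so treat this as a way to find lines worth a second look, not as a verdict.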


From past experience I knew this comes from poor resource assignment by the BIOS when ACPI isn't enabled, so I built and installed the acpi.ko module. These machines are webware 1185s from pogolinux. The chassis manual that comes with them labels them as the P1-103 series, with a part number of MS-9218 1U rackmount.

Then when I booted, it appeared to hang right at the point where it should have switched to multi-user. I enabled ALT_BREAK_TO_DEBUGGER in the kernel and tried again, but the break-to-debugger key sequence did nothing. I posted about the hang on the freebsd-acpi list and got no response. Then, on a whim, I tried sshing to the box... it worked! So it wasn't really hung; only the serial console was unresponsive.

Looking closer at the dmesg, I realized the resources for sio0 and sio1 were wrong! sio0 was getting 0x2F8 and sio1 was getting 0x3F8, when it should be the other way around. I rechecked the BIOS settings and everything was OK. I downloaded a new BIOS from the MSI Taiwan site whose stated fix was for 'redhat 4' installs. That didn't work.

I posted to freebsd-acpi again with what I knew. This time I got a reply, and after some back and forth I generated a new .asl file that probes sio0 and sio1 in the correct order! Now all my interrupts are assigned correctly:


[root@pogo-1 ~]# vmstat -i
interrupt                  total  rate
irq1: atkbd0                   6     0
irq3: sio1              13587145    55
irq4: sio0                  1243     0
irq14: ata0                   36     0
irq16: em0 uhci3            3237     0
irq17: em1                    21     0
irq19: uhci1+              66550     0
cpu0: timer            485520843  2000
cpu1: timer            485520575  2000
Total                  984699656  4056


And everything seems to be working great, although judging by the discussion that followed on freebsd-acpi, the problem in my post seems to be more common than I imagined!