Zotac IONITX Gigabit Ethernet Hangs on Boot
Over the last two months I've purchased 18 Zotac IONITX-A-U and
IONITX-G-E motherboards and tested them extensively with Linux. After
many boot cycles I have conclusive data that there is either a
systematic hardware defect or a problem with the forcedeth driver,
which, as I understand it, nVidia has openly supported for their
chipsets since dropping their proprietary driver a few years back.
The problem with the gig-e when booting is not isolated to one or two
of the systems; it occurs fairly evenly across the entire set. It has
nothing to do with WOL specifically, as I originally speculated, since
it happens at about the same frequency when the motherboard power
switch is manually activated, or even when the driver is removed and
reinserted into the running kernel.
- There is a roughly 8% chance that one of my Zotac IONITX boards
will boot into a state where the gig-e interface reports CRC errors
on nearly every received frame (sometimes on only about 50% of
frames). It happens with the forcedeth driver in the 2.6.31 and
2.6.32 kernels in Ubuntu 9.10 and 10.04 LTS. I am beginning to test
2.6.33.
Whenever the problem occurs, if I can get in on the console, I attempt
to resolve it by removing the forcedeth module and reinserting it
(rmmod, then modprobe). Once again there is about an 8% chance that
this does not work and it needs to be repeated.
- Usually no rx frame errors are reported on the interface, even after
tens or hundreds of millions of packets are received under both light
and extremely heavy load, with both small and jumbo frames.
- About 10% of the time, an otherwise stable interface reports frame
errors that occurred early on. In 90% of those cases it is fewer than
ten frames, in 99% fewer than a hundred, and on one occasion it
reported 26,465 CRC errors and was then stable:
eth0      Link encap:Ethernet  HWaddr 00:01:2e:27:ad:7f
          inet addr:192.168.0.39  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::201:2eff:fe27:ad7f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:7200  Metric:1
          RX packets:132 errors:26465 dropped:0 overruns:0 frame:26465
          TX packets:94 errors:0 dropped:0 overruns:0 carrier:0
          RX bytes:17839 (17.8 KB)  TX bytes:15818 (15.8 KB)
          Interrupt:21 Base address:0x8000
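
If you want to collect these counters across many boot cycles rather
than read them by eye, the RX line can be pulled out of the ifconfig
text with a few lines of Python. This is only a sketch: the function
name parse_rx_counters is my own, and the pattern assumes the older
net-tools output format shown above (newer ifconfig versions print the
counters differently).

```python
import re

# Matches the RX counter line of old-style net-tools ifconfig output, e.g.
# "RX packets:132 errors:26465 dropped:0 overruns:0 frame:26465"
RX_LINE = re.compile(
    r"RX packets:(?P<packets>\d+)\s+errors:(?P<errors>\d+)\s+"
    r"dropped:(?P<dropped>\d+)\s+overruns:(?P<overruns>\d+)\s+"
    r"frame:(?P<frame>\d+)"
)


def parse_rx_counters(ifconfig_text):
    """Return the RX counters from ifconfig output as a dict of ints.

    parse_rx_counters is a hypothetical helper, not part of any tool.
    """
    match = RX_LINE.search(ifconfig_text)
    if match is None:
        raise ValueError("no RX counter line found in ifconfig output")
    return {name: int(value) for name, value in match.groupdict().items()}


# The RX line from the interface dump above.
sample = "RX packets:132 errors:26465 dropped:0 overruns:0 frame:26465"
counters = parse_rx_counters(sample)
print(counters["frame"])  # prints 26465
```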
At this point, the easiest solution is generally a manual reboot if
you can't consistently ping the machine and you have physical access
to it. For a more reliable solution, a script could be written to
check for frame errors after the network interface is brought up, then
repeatedly rmmod and modprobe the driver and check again every few
seconds until the interface comes up cleanly.
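
A rough sketch of such a script in Python follows. I have not run it
against these boards, and it rests on several assumptions: the
interface is eth0, the kernel exposes the rx_crc_errors and
rx_frame_errors counters under /sys/class/net, the script runs as
root, and the distribution reconfigures the interface on its own after
the module is reloaded. The function names and the timing values are
mine. Because a healthy interface sometimes shows a burst of early
errors and then settles, the check is whether the counter is still
growing between two samples, not whether it is nonzero. That only
works if some traffic is arriving during the sampling window.

```python
import subprocess
import time
from pathlib import Path


def read_rx_frame_errors(iface="eth0"):
    """Sum the CRC and frame error counters for iface from sysfs.

    Assumes the interface exists; a missing file raises an exception.
    """
    stats = Path("/sys/class/net") / iface / "statistics"
    names = ("rx_crc_errors", "rx_frame_errors")
    return sum(int((stats / name).read_text()) for name in names)


def reload_driver(module="forcedeth"):
    """Remove and reinsert the driver module (needs root)."""
    subprocess.run(["rmmod", module], check=True)
    subprocess.run(["modprobe", module], check=True)


def bring_up_cleanly(read_errors=read_rx_frame_errors,
                     reload=reload_driver,
                     settle_seconds=5.0,
                     max_attempts=10,
                     sleep=time.sleep):
    """Reload the driver until the error counter stops growing.

    Returns the number of reloads that were needed. The defaults for
    settle_seconds and max_attempts are guesses, not measured values.
    """
    for reloads_done in range(max_attempts):
        sleep(settle_seconds)   # let the link and interface settle
        before = read_errors()
        sleep(settle_seconds)   # sampling window
        after = read_errors()
        if after == before:     # no new errors: treat the link as clean
            return reloads_done
        reload()
    raise RuntimeError("interface still reporting errors after "
                       f"{max_attempts} reloads")
```

Calling bring_up_cleanly() with no arguments from a boot-time script
would exercise the real interface. The callbacks are parameters so the
loop can be tried with fake counters before it is trusted on a remote
machine.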
Clearly the best solution would be a driver or hardware fix.