Ethernet Card and Driver Test
All tests were done using cross-over cables.
The following combinations of ethernet cards and drivers were tested under
Linux (RedHat 6.2; kernel 2.2.16):
- Realtek RTL8139 / rtl8139.o (version 1.07)
- LinkSys LNE100Tx / tulip.o (version 0.91g-ppc)
- 3Com 905B / 3c59x.o (high-performance variant poll-2.2.17-pre18.c;
- 3Com 905B / 3c90x.o (version 1.0.0i)
- Intel EtherExpressPro 100 / eepro100.o (version 1.09j-t)
- D-Link DFE570TX (4-port NIC) / tulip.o
- 3 RTL8139 / rtl8139.o channel bonded
- 3 3Com 905B / 3c59x.o channel bonded
- D-Link DFE570TX / tulip.o 3 ports channel bonded
- D-Link DFE570TX / tulip.o 4 ports channel bonded
- Intel Pro/100+ dual port / eepro100.o 2 ports channel bonded
First test: two identically configured PIII/600MHz
tcp test: ./netperf -H p600 -i 10,2 -I 99,5 -l 60
udp test: ./netperf -H p600 -i 10,2 -I 99,5 -l 60 -t UDP_STREAM -- -m 8192 -s 32768 -S 32768
(for the udp test the throughput at the receiving end is reported)
All numbers in Mbit/s.
      RTL8139  LNE100Tx  3c59x  3c90x  EEPro100  DFE570Tx  3xRTL8139  3x3c59x  DFE570Tx(3)  DFE570Tx(4)  2xEEPro100+
tcp |   85.67     93.72  94.09  94.10     93.38     94.07     228.37   279.55       279.27       238.18       183.26
udp |   87.93     95.75  95.82  83.85     95.75     95.79     127.47   266.15       265.92        61.75       177.49
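To put the bonded columns in perspective, the tcp figures can be divided by the matching single-card figure. The following is a minimal Python sketch that uses only the tcp numbers from the table above. The names `single`, `bonded` and `scaling` are made up for illustration, and using the single EEPro100 result as the baseline for the dual-port Pro/100+ is an assumption (same driver, but a different card).

```python
# tcp throughput in Mbit/s, copied from the table above
single = {"RTL8139": 85.67, "3c59x": 94.09, "DFE570Tx": 94.07, "EEPro100": 93.38}

# bonded configuration -> (single-card baseline, number of ports, tcp Mbit/s)
bonded = {
    "3xRTL8139": ("RTL8139", 3, 228.37),
    "3x3c59x": ("3c59x", 3, 279.55),
    "DFE570Tx(3)": ("DFE570Tx", 3, 279.27),
    "DFE570Tx(4)": ("DFE570Tx", 4, 238.18),
    # assumption: the single EEPro100 result serves as baseline for the Pro/100+
    "2xEEPro100+": ("EEPro100", 2, 183.26),
}


def scaling(name):
    """Return (speedup over one card, efficiency per port) for a bonded setup."""
    base, ports, mbit = bonded[name]
    speedup = mbit / single[base]
    return speedup, speedup / ports


for name in bonded:
    speedup, eff = scaling(name)
    print(f"{name:12s} speedup {speedup:.2f}  per-port efficiency {eff:.0%}")
```

For the three 3C905B cards this gives a speedup of about 2.97 over a single card, while the four bonded DFE570Tx ports only reach about 2.53.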
Second test: Pentium 166MHz and dual PII/400MHz
tcp test: ./netperf -H p166 -i 10,2 -I 99,5 -l 60
udp test: ./netperf -H p166 -i 10,2 -I 99,5 -l 60 -t UDP_STREAM -- -m 8192 -s 32768 -S 32768
tcp test: ./netperf -H p400 -i 10,2 -I 99,5 -l 60
                         RTL8139  LNE100Tx  3c59x  3c90x  EEPro100
2xPII/400 -> P166, tcp |   62.08     65.24  84.62  54.73     61.23
P166 -> 2xPII/400, tcp |   66.36     68.19  92.91  63.76     85.35
P166 -> 2xPII/400, udp |   88.76     85.33  95.40  58.95     95.76
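One way to read the second test is to ask how much of the PIII/600 tcp throughput each card/driver combination still reaches when the slow P166 is the sender. The sketch below does this division in Python, with the numbers copied from the two tables above; the name `retained` is only chosen for illustration.

```python
# tcp throughput in Mbit/s, copied from the two tables above
p600_to_p600 = {"RTL8139": 85.67, "LNE100Tx": 93.72, "3c59x": 94.09,
                "3c90x": 94.10, "EEPro100": 93.38}
p166_to_p400 = {"RTL8139": 66.36, "LNE100Tx": 68.19, "3c59x": 92.91,
                "3c90x": 63.76, "EEPro100": 85.35}


def retained(card):
    """Fraction of the PIII/600 tcp throughput still reached when the P166 sends."""
    return p166_to_p400[card] / p600_to_p600[card]


# list the cards from the smallest drop to the largest
for card in sorted(p600_to_p600, key=retained, reverse=True):
    print(f"{card:9s} {retained(card):.0%}")
```

The 3c59x combination keeps about 99% of its throughput, and the 3c90x driver on the same hardware shows the largest drop.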
Some of the configurations were also benchmarked with Netpipe.
NPtcp results for the Realtek, EtherExpressPro, 3C905B, and DFE570Tx cards:
The full results can be viewed here.
                    latency [µs]  bandwidth [Mb/s]
RTL8139           |           39              81.8
DFE570Tx (1 port) |           42              89.7
Netpipe results for the channel bonded cases:
                      latency [µs]
DFE570Tx (3 ports)  |           43   250   94   199
DFE570Tx (4 ports)  |           43   215   93   173
EEPro100+ (2 ports) |           51   179  102   154
- For channel bonding use the bonding.c module from the 2.2.17 kernel even
if you are using a 2.2.16 kernel. Otherwise bad things (kernel oops)
happen. I found out about this the hard way:
"/etc/rc.d/init.d/network restart" will crash your machine so
badly that it won't even reboot (it hangs on unmounting /proc with
"device busy"). Only power cycling the box will bring it back to life. The
problem is with the bonding module:
"ifconfig bond0 down; rmmod bonding" will yield the same result
(oops). The 2.2.17 module seems to fix this.
- The 3c59x.c from the 2.2.17 kernel produced basically the same results
as the high-performance variant used in the tests above.
The 3c59x.c versions from kernels older than 2.2.17 have bugs and should not be used.
- The 3Com 3c90x.c driver has given me nothing but problems, particularly
in connection with NFS. Under heavy NFS load the interface would simply
lock up and only "/etc/rc.d/init.d/network restart" would solve the
problem. As the tests above show, the performance of the 3c90x driver
is much worse than that of the 3c59x driver.
- Channel bonding with the 3C905B cards/3c59x driver and the DFE570Tx
card/tulip driver works without any problems: just
follow the instructions in
create /etc/sysconfig/network-scripts/ifcfg-bond0 and set up the
/etc/sysconfig/network-scripts/ifcfg-eth1, etc. files. That is all
that is required. With the RealTek cards/driver this fails, because
the MAC addresses are not copied correctly. The only way out is to
actually change the real MAC addresses of the ethernet cards by making
them all equal. Instructions published on the Beowulf mailing list
tell you how this can be achieved.
Also with the RealTek cards the connection would occasionally die
and I had to run "/etc/rc.d/init.d/network restart". This did not
happen with the 3C905B/3c59x cards/driver.
The 3c90x driver cannot be used for channel bonding at all.
I only had two tulip and Intel cards, so I could not test channel
bonding with those cards.
- The RealTek cards were by far the worst in this test.
Heavy udp load freezes the machine. The channel bonded 3 Realtek cards
sometimes locked up the machine so severely that only a hard reboot
(on/off switch) would bring it back. They are cheap, but it seems that
you get what you pay for.
- The results for the tulip, 3Com, Intel cards for PIII/600 -> PIII/600
do not differ significantly. However, they differ in the tests from
the P166 to the dual PII/400. The significance of this test is the
following: the P166 does not have the cpu power to handle 100Mbit/s.
Hence, the transfer rates in this case are not limited by the highest
throughput a particular card/driver combination can handle, but by
the cpu. However, some of the cards are "smarter" than others as they
can offload some of the cpu tasks. Therefore, a higher throughput in
this test indicates a "smarter" card. This should be particularly
important when you channel bond the cards. If the cpu is 100% busy
with maintaining a high throughput, it is impossible to do computation
and communication in parallel. In this area the 3C905B/3c59x outperforms
all the other card/driver combinations.
- Channel bonding three 3C905B using the 3c59x driver works very well: the
bandwidth is basically three times the throughput of a single card.
Channel bonding the RealTek cards is out of the question:
they show a horrendous packet loss under udp. This
would bring NFS to a grinding halt. Also the reliability is poor:
You do not want to be forced to reboot the Beowulf cluster because the
NIC hangs up.
Channel bonding three or four ports on the DFE570Tx actually does not
work as well as with 3 3C905B cards. Performance for 4 ports channel bonded
is worse than the performance for 3 ports. This is almost certainly due
to the load on the CPU: the interrupt processing, etc.,
required for 4 channel bonded ports becomes so high that the CPU cannot
keep up. This becomes particularly clear with UDP traffic (the sender
pushes out packets as fast as it can, the receiver has to do interrupt
processing): the sending CPU pushes out 354.96Mb/s (i.e., basically
4 times the bandwidth of a single card), but at the receiving end
only 61.75Mb/s arrive. The remaining packets are dropped (you
don't want to use NFS in such a configuration).
Unfortunately, even in a configuration with just 3 ports channel
bonded the card does not reach the performance of 3 3C905B cards:
NPmpi (from Netpipe)
compiled with mpich-1.2.1
as well as NPmpi compiled with
show that the 3C905B's outperform the DFE570Tx by about 10%.
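
The udp figures quoted in the notes above for the 4-port DFE570Tx (354.96Mb/s sent, 61.75Mb/s received) translate directly into a packet loss rate. A minimal Python sketch of that arithmetic follows; the function name `udp_loss` is made up for illustration, and it treats the two throughput numbers as a stand-in for packet counts, which is an assumption.

```python
def udp_loss(sent_mbit, received_mbit):
    """Fraction of the sent UDP traffic that never reached the receiver."""
    if sent_mbit <= 0:
        raise ValueError("sent rate must be positive")
    return 1.0 - received_mbit / sent_mbit


# 4-port DFE570Tx: 354.96 Mb/s sent, 61.75 Mb/s received (figures from the text)
loss = udp_loss(354.96, 61.75)
print(f"dropped: {loss:.1%}")
```

In other words, roughly 83% of the udp traffic is lost on the way to the receiver in this configuration.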