Crappy Broadcom 5708 1G ethernet

Started by Unknown April 1, 2015
We are using a Dell 2950 with two Broadcom 5708 1G ethernet adapters, but the second adapter (in NAT mode) keeps crapping out. A similar test setup with a 0.1G Realtek works fine. The binary Linux driver for the Broadcom is a pain in the a* to set up. I would junk it if it were not built into the motherboard. The up-link port seems OK at least. So, we are looking for a second port. One RT8118 is coming next week. Just wondering if we should also try the Intel Pro 1G or any other chips? If all else fails, downgrade to 0.1G. I thought Dell usually uses Intel chips; why go with a crappy Broadcom in this Enterprise rack server?
On 4/1/2015 5:23 PM, edward.ming.lee@gmail.com wrote:
> We are using a Dell 2950 with two Broadcom 5708 1G ethernet adapters, but the
> second adapter (in NAT mode) keeps crapping out. A similar test setup with a
> 0.1G Realtek works fine. The binary Linux driver for the Broadcom is a pain in
> the a* to set up. I would junk it if it were not built into the motherboard.
> The up-link port seems OK at least. So, we are looking for a second port. One
> RT8118 is coming next week. Just wondering if we should also try the Intel
> Pro 1G or any other chips? If all else fails, downgrade to 0.1G. I thought
> Dell usually uses Intel chips; why go with a crappy Broadcom in this
> Enterprise rack server?
The Broadcom devices tend to have lots of bugs. You need to know *exactly* which "bug set" you need to support. Then, find a driver that knows how to handle those bugs.

YMwV
On Wednesday, April 1, 2015 at 5:36:58 PM UTC-7, Don Y wrote:
> The Broadcom devices tend to have lots of bugs. You need to know
> *exactly* which "bug set" you need to support. Then, find a driver
> that knows how to handle those bugs.
>
> YMwV
Yep, now I know. Just junk it.
On 4/1/2015 5:54 PM, edward.ming.lee@gmail.com wrote:
> Yep, now I know. Just junk it.
If you're married to a particular OS (driver, etc.), then you're probably screwed -- or reliant upon someone else to "fix" the driver for your particular silicon. In my case, I just switched to another (BSD) OS that had better support for the hardware that I was using. Problem solved. I have no issues with any of the interfaces on these machines.

Of course, the actual devices in your box will dictate whose solution works best for you.
On Wednesday, April 1, 2015 at 8:51:30 PM UTC-7, Don Y wrote:
> If you're married to a particular OS (driver, etc.), then you're probably
> screwed -- or reliant upon someone else to "fix" the driver for your
> particular silicon. In my case, I just switched to another (BSD) OS
> that had better support for the hardware that I was using. Problem
> solved. I have no issues with any of the interfaces on these machines.
>
> Of course, the actual devices in your box will dictate whose solution
> works best for you.
I usually avoid Broadcom whenever possible. Only binary drivers are available, so nobody else can see or fix problems. The problem seems to appear when both interfaces are active; namely, only for firewall/gateway servers. These boxes were loaded with Intel and QLogic Fibre Channel adapters. I guess they didn't care about the Broadcom problem before. Just wondering if we should go with more Intel or QLogic RJ45 adapters.
On 4/2/2015 10:10 AM, edward.ming.lee@gmail.com wrote:

>> Of course, the actual devices in your box will dictate whose solution
>> works best for you.
>
> I usually avoid Broadcom whenever possible. Only binary drivers are
> available, so nobody else can see or fix problems. The problem seems to
> appear when both interfaces are active; namely, only for firewall/gateway
> servers. These boxes were loaded with Intel and QLogic Fibre Channel
> adapters. I guess they didn't care about the Broadcom problem before. Just
> wondering if we should go with more Intel or QLogic RJ45 adapters.
With the Broadcom devices, there are *two* issues to deal with:
- the actual driver
- the device's firmware
Drivers are usually fixable -- as the sources are available. OTOH, the firmware is another matter.

FWIW, I've had good success with FreeBSD and Broadcom. The same wasn't true about NetBSD (and the exact same devices). E.g., I recall NetBSD getting the interfaces "backwards" (swapped) in some places -- but not in others. This made it a real crap-shoot to try to configure each interface individually (as well as understanding to which *physical* interface a given set of messages pertained!)
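
Since the driver and the firmware are separate pieces, it can help to record which version of each a port is actually running. On Linux, `ethtool -i` reports both. What follows is only a minimal Python sketch that parses that output; the interface name `eth1` and the helper names are assumptions made for illustration and do not come from the posts above.

```python
import subprocess
from typing import Dict


def parse_ethtool_info(text: str) -> Dict[str, str]:
    """Turn the "key: value" lines printed by `ethtool -i` into a dict."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only: values such as bus addresses
        # contain colons of their own.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def read_driver_info(iface: str) -> Dict[str, str]:
    """Run `ethtool -i <iface>` (ethtool must be installed) and parse it."""
    out = subprocess.run(
        ["ethtool", "-i", iface], capture_output=True, text=True, check=True
    ).stdout
    return parse_ethtool_info(out)
```

Calling `read_driver_info("eth1")` and looking at the `driver` and `firmware-version` keys gives the pair to compare against whatever "bug set" a given driver release claims to handle.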
On Thu, 02 Apr 2015 11:10:22 -0700, Don Y <this@is.not.me.com> wrote:

>With the broadcom devices, there are *two* issues to deal with:
>- the actual driver
>- the device's firmware
>Drivers are usually fixable -- as the sources are available.
>OTOH, the firmware is another matter.
>
>FWIW, I've had good success with FreeBSD and Broadcom. The same wasn't
>true about NetBSD (and the exact same devices). E.g., I recall NetBSD
>getting the interfaces "backwards" (swapped) in some places -- but not
>in others. This made it a real crap-shoot to try to configure each
>interface individually (as well as understanding to which *physical*
>interface a given set of messages pertained!)
Apparently the only 'good' NICs are the Intel Pro series. They're the only ones that can saturate the connection reliably.

I think the latest consumer Intel Pro is the CT model.

Cheers
On Thursday, April 2, 2015 at 3:41:35 PM UTC-7, Martin Riddle wrote:
> Apparently the only 'good' NICs are the Intel Pro series.
> They're the only ones that can saturate the connection reliably.
>
> I think the latest consumer Intel Pro is the CT model.
>
> Cheers
Intel bought quite a bit of their NIC IP from Digital Equipment Corp.
On Thursday, April 2, 2015 at 11:10:45 AM UTC-7, Don Y wrote:
> With the broadcom devices, there are *two* issues to deal with:
> - the actual driver
Let's call this "firmware wrapper".
> - the device's firmware
> Drivers are usually fixable -- as the sources are available.
> OTOH, the firmware is another matter.
>
> FWIW, I've had good success with FreeBSD and Broadcom. The same wasn't
> true about NetBSD (and the exact same devices). E.g., I recall NetBSD
> getting the interfaces "backwards" (swapped) in some places -- but not
> in others. This made it a real crap-shoot to try to configure each
> interface individually (as well as understanding to which *physical*
> interface a given set of messages pertained!)
There was a timeout prior to losing the interface. A web search shows some issues with offloading checksums to the driver rather than having them done by the TCP stack. The device is taking too long or not dealing with errors properly. However, there is no easy way to disable these features from the "wrapper". Perhaps some versions of the firmware had this disabled. Or it could be other problems. That's why I hate binary (firmware) drivers. In normal drivers, we can selectively enable/disable features, or at least see the options.
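
For comparison, the generic Linux knob for offload settings is `ethtool -K`. The Python sketch below only builds the commands; it assumes the driver exposes these settings at all, which may not hold for the setup described above. The interface name `eth1` and the function name are hypothetical.

```python
from typing import List, Sequence

# Legacy ethtool feature names: receive checksum offload, transmit checksum
# offload, and TCP segmentation offload.
DEFAULT_FEATURES = ("rx", "tx", "tso")


def offload_commands(
    iface: str,
    features: Sequence[str] = DEFAULT_FEATURES,
    state: str = "off",
) -> List[List[str]]:
    """Build one `ethtool -K` command per feature.

    One feature per command, so that a driver rejecting a single setting
    does not hide which setting it was.
    """
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    return [["ethtool", "-K", iface, feature, state] for feature in features]
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` as root. Turning the features off one at a time and retesting is one way to narrow down whether checksum offload is really what precedes the timeout.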
On 4/3/2015 8:10 AM, edward.ming.lee@gmail.com wrote:
>> With the broadcom devices, there are *two* issues to deal with:
>> - the actual driver
>
> Let's call this "firmware wrapper".
Yes, but it *does* have some functionality. "Wrapper" sort of dismisses that.
> There was a timeout prior to losing the interface.
Exhausting the resources made available to it (e.g., mbufs)?
> Web search shows some issues regarding off-load checksum to the driver,
> rather than done by the TCP stack.
These devices are a lot smarter than "legacy" devices. Essentially, microcoded to embed functionality in the NIC instead of forcing it to be implemented in the driver/OS/stack. Unfortunately, I think some of the implementation choices (hardware *or* firmware) may have been poorly thought out. "We know what the customer needs *better* than [s]he does!"
> The device is taking too long or not dealing with errors properly.
> However, there is no easy way to disable these features from the
> "wrapper". Perhaps some versions of the firmware had this disabled.
Try a FreeBSD LiveCD (-CURRENT is probably 10.something, but the bce(4) driver goes back to 6.mumble). There are probably half a dozen controls that can be tweaked to influence its behavior.

I'd take the "crawl before walk" approach: configure the interfaces for lower data rates (e.g., 100BaseTX FDX) and verify it works reliably. Then, start enabling features (TSO, MSI, etc.) and boosting the interface speed(s) *with* a rich set of resources (i.e., the *most* you can configure). Eventually, throttle back on resources and see what tickles the bug.
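
The "crawl before walk" sequence can be written down as an ordered plan so that each stage is tested before the next one is applied. This is a rough Python sketch using FreeBSD `ifconfig` options; the interface name `bce1`, the stage ordering, and the function name are assumptions, and MSI and resource limits are left out because they are not set through `ifconfig`.

```python
from typing import List, Tuple

# A stage is a short description plus the commands that apply it.
Stage = Tuple[str, List[List[str]]]


def staged_plan(iface: str) -> List[Stage]:
    """Return the stages in the order they should be applied and tested."""

    def ifconfig(*args: str) -> List[str]:
        return ["ifconfig", iface, *args]

    return [
        (
            "100 Mb/s full duplex with offloads disabled",
            [
                ifconfig("media", "100baseTX", "mediaopt", "full-duplex"),
                ifconfig("-rxcsum", "-txcsum", "-tso"),
            ],
        ),
        ("enable checksum offload", [ifconfig("rxcsum", "txcsum")]),
        ("enable TSO", [ifconfig("tso")]),
        ("return to autonegotiated speed", [ifconfig("media", "autoselect")]),
    ]
```

Running the commands of one stage, loading the link for a while, and only then moving on means the first stage that brings the timeout back points at the feature worth a closer look.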
> Or it could be other problems. That's why I hate binary (firmware)
> drivers. In normal drivers, we can selectively enable/disable features, or
> at least see the options.
Yup. At the very least, you can instrument or stub the code to see *why* (where) the problem manifests. And, from there, decide if it's something that you can fix, the vendor must fix, etc.

Here ("office"), it's hard for me to come up with *any* use that even momentarily saturates the Gb links (file transfers, interactive traffic, X protocol, etc). I have a couple of servers "waiting in the wings" because I can't use the capabilities that they have available (in this case, dual 10Gb interfaces). Doubtful I'll ever use that fat of a pipe *in* the house :-/ (hard to even make full use of 100M links)

Good luck!