Parallel DHCP Clients

As of 4.1.1-P1, there is a bug in the DHCP client (at least in the public release from the ISC) that can cause multiple client instances to generate the same “random” XIDs (transaction IDs) and therefore mangle each other's communications when operating in parallel. The problem is that each instance seeds its random number generator with “current time in seconds” + “some seed”. As you will quickly note, launching processes in parallel almost guarantees more than one process per second…

srandom(seed + cur_time);

The fix, which I filed under [ISC-Bugs #21497] (thanks to Jeff Haran for the idea), was simple… Just add getpid().

srandom(seed + (unsigned)cur_time + (unsigned)getpid());
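To make the collision concrete, here is a minimal sketch in C that compares the two seeding schemes. The helper names (seed_time_only, seed_with_pid, first_xid) and the sample values are made up for illustration and are not taken from the ISC source; the only real calls are srandom() and random().

```c
#include <stdlib.h>     /* srandom(), random() */
#include <sys/types.h>  /* pid_t */
#include <time.h>       /* time_t */

/* Original scheme: clients started within the same second share a seed. */
static unsigned seed_time_only(unsigned seed, time_t cur_time)
{
    return seed + (unsigned)cur_time;
}

/* Patched scheme: the PID differs between processes that run concurrently. */
static unsigned seed_with_pid(unsigned seed, time_t cur_time, pid_t pid)
{
    return seed + (unsigned)cur_time + (unsigned)pid;
}

/* Hypothetical helper: the first value random() yields for a given seed. */
static long first_xid(unsigned seed_value)
{
    srandom(seed_value);
    return random();
}
```

Two clients that share a seed and start in the same second get the same result from seed_time_only(), so first_xid() returns the same value for both. With seed_with_pid() their seeds differ, because two processes running at the same time cannot share a PID.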

Of course, fixing this one issue brought two more to the surface. Read on to see what I mean.

  1. At least on Linux, there is a notion of “Primary” and “Secondary” addresses (aliases being one kind of Secondary address). The problem is that when any Primary address is removed, which happens in several different DHCP client state changes, all Secondary addresses are lost. This plays out very badly if any of the Secondary addresses is controlled by a separate instance of the DHCP client state machine. Worse, if a running DHCP client loses state (for example, because its address was removed from under it), you may have to wait for a lease renewal to reactivate the state machine.

  2. That brings up the next issue: the lack of shared state between parallel instances of DHCP clients. Shared state seems unnecessary until you realize that, without it, the Primary address is not guaranteed to arrive first. That also seems of little consequence until you realize that any Secondary addresses assigned before the Primary are obliterated (another loss of state) when the Primary address is set. Oops?

Now, after about three weeks of working on this issue, I think I have found a solution to the Primary/Secondary problem. Hey, 50% isn’t too bad.
The fix is to set a kernel networking flag called “promote_secondaries”, although you will note that this applies only to IPv4.

echo 1 >/proc/sys/net/ipv4/conf/all/promote_secondaries

It is just that simple. Now as soon as you remove a Primary address, the next Secondary address will take its place.
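
The difference the flag makes can be sketched with a toy model in C. This is not kernel code: struct iface, iface_add and iface_del_primary are invented for illustration, and the model only mirrors the behavior described above (slot 0 is the Primary, everything after it is a Secondary).

```c
#include <stddef.h>  /* size_t */
#include <string.h>  /* memmove() */

#define MAX_ADDRS 8

/* Toy interface: addrs[0] is the Primary, the rest are Secondaries. */
struct iface {
    const char *addrs[MAX_ADDRS];
    int count;
    int promote_secondaries;  /* stand-in for the sysctl flag */
};

/* Append an address; whichever address lands first becomes the Primary. */
static int iface_add(struct iface *ifc, const char *addr)
{
    if (ifc->count >= MAX_ADDRS)
        return -1;
    ifc->addrs[ifc->count++] = addr;
    return 0;
}

/* Remove the Primary and return how many addresses are left. */
static int iface_del_primary(struct iface *ifc)
{
    if (ifc->count == 0)
        return 0;
    if (!ifc->promote_secondaries) {
        ifc->count = 0;  /* Secondaries go down with the Primary */
        return 0;
    }
    /* Shift the list up: the first Secondary becomes the new Primary. */
    memmove(&ifc->addrs[0], &ifc->addrs[1],
            (size_t)(ifc->count - 1) * sizeof ifc->addrs[0]);
    return --ifc->count;
}
```

With the flag cleared, deleting the Primary of a two-address interface leaves nothing behind. With the flag set, one address survives and moves into the Primary slot, so the DHCP client that owns it keeps its state.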