I need some help configuring NameServer load balancing for a fail-over setup.
OE is 11.3.3 on AIX 64-bit.
I created two AppServer brokers with the same appserviceNameList: AS_pb00b1 and AS_pb00b2. For AS_pb00b1 I set priorityWeight=100 and for AS_pb00b2 priorityWeight=0. When I start them, all requests go to AS_pb00b1, which is expected.
Now, when I kill AS_pb00b1 (I mean unix kill, not asbman -k), I expect all requests to go to AS_pb00b2 automatically, but that doesn't happen. It takes a minute or two for the NameServer to notice that AS_pb00b1 is dead. If I decrease registrationRetry to 1 and brokerKeepAliveTimeout to 2 (the documentation says it should be higher than registrationRetry), it still takes several seconds.
Is there a way to make the NameServer notice a dead broker immediately? Is there any risk in setting registrationRetry and brokerKeepAliveTimeout to such low values?
I found this KB entry with an Enhancement Request regarding this:
but it's quite old, from the days of OE 10.2B. Is it still valid?
BTW, it works almost the same way even without NameServer LoadBalancer product installed.
Thanks for any suggestions.
The nameserver and the broker communicate over UDP, so there is no connection between them. When you kill AS_pb00b1, the nameserver is unaware that this has happened. It must decide which brokers are active based on when it hears registration messages from them. The timing of this cannot be very precise, even with small values for brokerKeepAliveTimeout and registrationRetry.
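To make the timing argument concrete, here is a minimal Python sketch of a timeout-based liveness check. It is a simplified model under stated assumptions, not the actual NameServer logic: the broker is assumed to send a registration message every registrationRetry seconds starting at t = 0, and the registry is assumed to drop it once brokerKeepAliveTimeout seconds have passed since the last message. The function name detection_delay is made up for this illustration.

```python
import math


def detection_delay(kill_time, registration_retry, keep_alive_timeout):
    """Seconds between a broker dying and a timeout-based registry dropping it.

    Simplified model (an assumption, not the real NameServer algorithm):
    messages are sent at t = 0, retry, 2*retry, ... and the broker is
    dropped keep_alive_timeout seconds after the last message received.
    """
    # Last registration message sent at or before the kill.
    last_message = math.floor(kill_time / registration_retry) * registration_retry
    # The registry only gives up once the timeout has elapsed after that message.
    dropped_at = last_message + keep_alive_timeout
    return dropped_at - kill_time


# Values from the question: registrationRetry=1, brokerKeepAliveTimeout=2.
print(detection_delay(0.5, 1, 2))  # killed midway between two messages
print(detection_delay(1.0, 1, 2))  # killed right after sending a message
```

In this model the delay always falls between (timeout - retry) and timeout, so with the values 1 and 2 a dead broker would still be considered alive for roughly one to two seconds. With no connection to break, the registry cannot react faster than its timeout.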
Making these values too small can increase network traffic unnecessarily. You risk degrading general performance in order to handle an edge case.
Another downside to very low retry values is that you may get false disconnects: your AppServer is up, but a ping was not received within the retry time, so it is removed from the NameServer's list of available AppServers.
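A rough Python sketch of how such a false disconnect could arise. Everything here is hypothetical: the function false_disconnects, the arrival times, and the 30/60 second settings used for comparison are invented for illustration and are not taken from the product. The broker is healthy in both runs; it simply stalls for about 3 seconds once.

```python
def false_disconnects(arrival_times, keep_alive_timeout):
    """Return (previous, next) arrival pairs whose gap exceeds the timeout.

    Each pair marks a window in which a timeout-based registry would have
    dropped a broker that was in fact still running.
    """
    return [
        (prev, nxt)
        for prev, nxt in zip(arrival_times, arrival_times[1:])
        if nxt - prev > keep_alive_timeout
    ]


# Hypothetical arrival times (seconds) with one 3-second stall in each run.
aggressive = false_disconnects([0, 1, 2, 5, 6], keep_alive_timeout=2)
relaxed = false_disconnects([0, 30, 63, 90], keep_alive_timeout=60)

print(aggressive)  # the stall between t=2 and t=5 exceeds the 2-second timeout
print(relaxed)     # the same stall is absorbed by the 60-second timeout
```

The same brief stall that goes unnoticed with a generous timeout is enough to take a live broker out of rotation when the timeout is only a couple of seconds.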
Thanks for your input. I forgot to mention - NameServer and both AppServers reside on the same physical machine, so there should be no delays in network communication.