Why does OpenEdge broker fire too many appserver ??

Posted by dvoyat on 03-Nov-2016 10:39

Running OE11.5 StateLess appserver (using NameServer) (HPUx 11)
Just recently spotted that our broker did fire 48 new agent in 3s.

We've been running from 10 to 58 agent and almost all of the new agent get trimmed within an hour (rougly).
Since the broker is running asbman is reporting a max Client queue depth of 22 which means that at worse we got 22 client request on the queue. The max agent is set to 75 which has not been reached.
Server startup timeout as well request timeout (not applied as max agent not reached) are still the default ones 3s and 15s.

This case happened some week ago and our appserver log setting makes the agent log no longer available but I can clearly see from broker log the 48 "started server" requests.

I recently checked with some fresh agent start and I spotted a delay (6 mins) between broker log file requesting a new agent and agent log file mentionning the start of new agent.

Bellow is an extract of broker log file reporting last new agent started today

[16/11/03@10:23:08.789+0100] P-003568 T-S-0966 2 UB Basic      Started server thread: S-0966. (8101)
[16/11/03@10:23:08.856+0100] P-003568 T-S-0966 2 UB Basic      Started server: /progress/dlc115/bin/_proapsv -logginglevel 3 -logfile /home/eucomm/EUrafgo/log/dbproc/EUrafgo.server.002994.log -ubpid 3568 -Ms 1 -logname EUrafgo -logentrytypes ASPlumbing,DB.Connects -logthreshold 500000000 -numlogfiles 3 -ASID 966 -ubpropfile /home/eucomm/properties/ubroker.properties -ipver IPv4 -pf /home/eucomm/EUrafgo/db/params/eugoAS_utf.pf (8108)
[16/11/03@10:23:09.558+0100] P-003568 T-S-0967 2 UB Basic      Started server thread: S-0967. (8101)
[16/11/03@10:23:09.574+0100] P-003568 T-S-0967 2 UB Basic      Started server: /progress/dlc115/bin/_proapsv -logginglevel 3 -logfile /home/eucomm/EUrafgo/log/dbproc/EUrafgo.server.002994.log -ubpid 3568 -Ms 1 -logname EUrafgo -logentrytypes ASPlumbing,DB.Connects -logthreshold 500000000 -numlogfiles 3 -ASID 967 -ubpropfile /home/eucomm/properties/ubroker.properties -ipver IPv4 -pf /home/eucomm/EUrafgo/db/params/eugoAS_utf.pf (8108)
[16/11/03@10:23:09.575+0100] P-003568 T-S-0968 2 UB Basic      Started server thread: S-0968. (8101)
[16/11/03@10:23:09.587+0100] P-003568 T-S-0968 2 UB Basic      Started server: /progress/dlc115/bin/_proapsv -logginglevel 3 -logfile /home/eucomm/EUrafgo/log/dbproc/EUrafgo.server.002994.log -ubpid 3568 -Ms 1 -logname EUrafgo -logentrytypes ASPlumbing,DB.Connects -logthreshold 500000000 -numlogfiles 3 -ASID 968 -ubpropfile /home/eucomm/properties/ubroker.properties -ipver IPv4 -pf /home/eucomm/EUrafgo/db/params/eugoAS_utf.pf (8108)
[16/11/03@10:23:09.589+0100] P-003568 T-S-0969 2 UB Basic      Started server thread: S-0969. (8101)
[16/11/03@10:23:09.616+0100] P-003568 T-S-0969 2 UB Basic      Started server: /progress/dlc115/bin/_proapsv -logginglevel 3 -logfile /home/eucomm/EUrafgo/log/dbproc/EUrafgo.server.002994.log -ubpid 3568 -Ms 1 -logname EUrafgo -logentrytypes ASPlumbing,DB.Connects -logthreshold 500000000 -numlogfiles 3 -ASID 969 -ubpropfile /home/eucomm/properties/ubroker.properties -ipver IPv4 -pf /home/eucomm/EUrafgo/db/params/eugoAS_utf.pf (8108)
[16/11/03@10:23:09.888+0100] P-003568 T-S-0970 2 UB Basic      Started server thread: S-0970. (8101)
[16/11/03@10:23:09.896+0100] P-003568 T-S-0970 2 UB Basic      Started server: /progress/dlc115/bin/_proapsv -logginglevel 3 -logfile /home/eucomm/EUrafgo/log/dbproc/EUrafgo.server.002994.log -ubpid 3568 -Ms 1 -logname EUrafgo -logentrytypes ASPlumbing,DB.Connects -logthreshold 500000000 -numlogfiles 3 -ASID 970 -ubpropfile /home/eucomm/properties/ubroker.properties -ipver IPv4 -pf /home/eucomm/EUrafgo/db/params/eugoAS_utf.pf (8108)
[16/11/03@10:23:10.009+0100] P-003568 T-S-0971 2 UB Basic      Started server thread: S-0971. (8101)
[16/11/03@10:23:10.025+0100] P-003568 T-S-0971 2 UB Basic      Started server: /progress/dlc115/bin/_proapsv -logginglevel 3 -logfile /home/eucomm/EUrafgo/log/dbproc/EUrafgo.server.002994.log -ubpid 3568 -Ms 1 -logname EUrafgo -logentrytypes ASPlumbing,DB.Connects -logthreshold 500000000 -numlogfiles 3 -ASID 971 -ubpropfile /home/eucomm/properties/ubroker.properties -ipver IPv4 -pf /home/eucomm/EUrafgo/db/params/eugoAS_utf.pf (8108)

and bellow agent log file reporting these start

[16/11/03@10:29:05.843+0000] P-016897 T-000001 2 AS AS Application Server Startup. (5473)
[16/11/03@10:29:06.527+0000] P-016898 T-000001 2 AS AS Application Server Startup. (5473)
[16/11/03@10:29:06.647+0000] P-016899 T-000001 2 AS AS Application Server Startup. (5473)
[16/11/03@10:29:06.810+0000] P-016900 T-000001 2 AS AS Application Server Startup. (5473)
[16/11/03@10:29:06.910+0000] P-016907 T-000001 2 AS AS Application Server Startup. (5473)
[16/11/03@10:29:06.981+0000] P-016906 T-000001 2 AS AS Application Server Startup. (5473)

With my current understanding when all our agent are active the broker should wait 3s (server startup timeout) before started a new agent. There's nothing clear in the doc saying how many new agent started but as a quick guess it shouldn't be more than the current client queue at the time it happens.

Also how to interpret the 6mins delay between broker log showing a new agent request and agent log showing the new agent started.

We don't have any startup, activate, connect,..... procedure defined and the appserver and connected database are on same server. Database log file is not showing anything and we haven't been facing anything critical with the application on the day the appserver peak occured. 

There's something weird at some point so if anyone has clue or experience this is more than welcome ;-)

Rgds

Denis

 

All Replies

Posted by rayherring on 03-Nov-2016 17:59

6min delay could be how long it takes for the agent to fire up everything it needs?

I know in our system we start up by default 15 agents for the broker that our branches use and it takes about 10mins in total to start up all 15 of them, but we have a lot of business entities that each webspeed agent has to fire up and the fact that it is firing 15 of them in total.

In the beginning we used to start up only a few and then let the system bring on the rest as it needed them but the delay from realising it needed a new one to starting it up to the agent then being ready was too long and everyone was complaining.

Posted by dvoyat on 04-Nov-2016 03:31

Thanks Ray.

Delay is one thing but agent# is even more crucial to understand there. Delay may vary depending on the server loading. 6mn rings some bell as we don't fire anything at startup but db connexion.  What is really strange is how broker can fire 48 appservers in approx 3s where we never had more than 22 client (max) in the queue (unless asbman return wrong details). I'll probably contact support to get more clue on their firing rules if possible. I can still move the max agent to lower value to limit any side effect but I'm worrying it can raise timeout error to any new client request (what we didn't get as well as end user complaint). I've some feeling that whenever a bottleneck happen broker get a bit crazy in firing new agent (we had cases with deadlock involving one agent but this time it was different there was no deadlock and it was friday mid afternoon where the business activity usually start to slow down).

This thread is closed