Hello,
We are trying to connect to a SOAP web service. The connection works without problems when the number of clients is small. However, when hundreds of clients (around 1200) try to connect to the SOAP WSDL simultaneously, execution hangs.
After debugging, we found that the process hangs at the CONNECT statement (hWebService:CONNECT).
No error is raised, no STOP condition is raised, and the FINALLY block in the code is not executed.
Below is the code snippet:
loTimeOut = true.
DOBLOCK:
DO STOP-AFTER 60
    ON STOP UNDO DOBLOCK, LEAVE DOBLOCK:
    CREATE SERVER hWebService.
    hWebService:CONNECT("-WSDL " + "#########/*****.wsdl" + " -nohostverify") NO-ERROR.
    IF ERROR-STATUS:ERROR OR
       NOT(VALID-HANDLE(hWebService)) OR
       NOT(hWebService:CONNECTED()) OR
       ERROR-STATUS:NUM-MESSAGES > 0 THEN
    ...
    end.
    loTimeOut = false.
end. /*DOBLOCK*/
IF loTimeOut THEN
DO:
    …
END.
CATCH anyErrorObject AS Progress.Lang.Error:
    …
END CATCH.
FINALLY:
    /*delete handles.*/
END.
That is, control does not move to the next statement, does not reach the "Timeout" block, does not reach the CATCH block, and the FINALLY block is not executed.
Please suggest how this can be avoided. We are consuming a Corticon decision service deployed on Tomcat. We have customized Tomcat with the following properties:
connectionTimeout="60000"
maxThreads="500"
maxConnections="1000"
acceptCount="300"
Thanks,
Archana
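As a rough sanity check on the connector settings quoted above, the arithmetic can be sketched in a few lines of Python. This is only a simplified model, assuming a connector where open connections may outnumber worker threads and where the operating system honours acceptCount exactly (in practice the backlog is OS-dependent). The function name connector_headroom and the bucket names are made up for illustration.

```python
# Connector settings quoted in the post above.
MAX_THREADS = 500        # maxThreads: requests processed at the same time
MAX_CONNECTIONS = 1000   # maxConnections: connections accepted and held open
ACCEPT_COUNT = 300       # acceptCount: OS backlog once maxConnections is reached


def connector_headroom(clients: int) -> dict:
    """Split `clients` simultaneous connection attempts into rough buckets.

    Simplified model (an assumption, not Tomcat's exact behaviour).
    """
    accepted = min(clients, MAX_CONNECTIONS)
    in_service = min(accepted, MAX_THREADS)
    waiting_for_thread = accepted - in_service
    in_backlog = min(max(clients - MAX_CONNECTIONS, 0), ACCEPT_COUNT)
    beyond_backlog = max(clients - MAX_CONNECTIONS - ACCEPT_COUNT, 0)
    return {
        "in_service": in_service,
        "waiting_for_thread": waiting_for_thread,
        "in_backlog": in_backlog,
        "beyond_backlog": beyond_backlog,
    }


if __name__ == "__main__":
    print(connector_headroom(1200))
```

Under this model, 1200 simultaneous clients fit within maxConnections + acceptCount (1300), but only 500 are served at once; 500 more wait for a worker thread and 200 sit in the backlog. That does not by itself explain a hang on the client side, but it shows how many sessions depend on the server draining its queue in time.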
Hi Brian,
We are following the first approach: we create the server object once and connect just once.
However, to replicate the production scenario (where thousands of Progress clients access the SOAP web service at the same time), we are launching simultaneous background Progress sessions (using the -b and -p parameters).
When we increase the number of these sessions to 1200 (a situation that could occur in the real environment), a considerable number of sessions hang at the CONNECT statement.
Just to add: no such problem occurs if we keep a local copy of the WSDL file and use it in the CONNECT statement (however, this is not feasible for production).
Thanks,
Archana
They are running on three different machines (400 sessions on each machine).
Thanks,
Archana
1) On the machine running the Web Service run the netstat -a command and redirect the output to a file.
2) After that is done, kick off the 400 processes on each of the 3 stress test machines.
3) Once you start seeing the problem...
a. Run the netstat -a command again on the machine running the Web Service and redirect the output to a second file.
b. Run the netstat -a command on each of the 3 stress test machines, redirecting the output to different files on each machine.
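Once the netstat output files from the steps above exist, comparing them by hand is tedious. The following Python sketch tallies TCP connection states per file and reports what changed between a "before" and an "after" snapshot. It assumes the state is the last column of each TCP line, as in plain netstat -a output on Windows and Linux; the function names and the sample text in the usage note are made up for illustration.

```python
import re
from collections import Counter

# A state token looks like TIME_WAIT, CLOSE_WAIT, FIN_WAIT_2, ESTABLISHED, ...
STATE_RE = re.compile(r"[A-Z][A-Z0-9_]+")


def count_states(netstat_text: str) -> Counter:
    """Count TCP connection states in the text of one `netstat -a` capture."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Keep only TCP rows ("TCP" on Windows, "tcp"/"tcp6" on Linux).
        if not fields or not fields[0].lower().startswith("tcp"):
            continue
        # Assumption: the state is the last column of the row.
        if STATE_RE.fullmatch(fields[-1]):
            counts[fields[-1]] += 1
    return counts


def state_delta(before: Counter, after: Counter) -> dict:
    """Return per-state change (after minus before), omitting unchanged states."""
    states = sorted(set(before) | set(after))
    return {s: after[s] - before[s] for s in states if after[s] != before[s]}
```

For example, reading the two files captured on the web service machine into strings and calling state_delta(count_states(before_text), count_states(after_text)) gives a short summary of which states grew during the test, which is easier to compare across the four machines than raw netstat listings.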
Hi Brian,
I ran the test as described above; the details are below. The results were not what you suspected.
Before Running the test:
---------------------------------
On the 3 stress machines, there were about 6 TIME_WAIT entries in the "netstat" file.
On the web service machine, there were about 18 TIME_WAIT entries in the "netstat" file.
After Running the test:
--------------------------------
The issue occurred on just one machine, where 130 of the 400 sessions hung. That is, on the 1st and 2nd machines all 400 + 400 requests executed properly, and on the 3rd machine 130 sessions went into the "hang" state.
Strangely, on all three stress machines the number of TIME_WAIT entries was only around 3.
On the web service machine, there were about 28 TIME_WAIT entries, but this count changed rapidly. We then checked and saw around 600 FIN_WAIT_2 entries, which rapidly changed to CLOSE_WAIT.
Thanks,
Archana