Connect SOAP WSDL

Posted by archana.gupta on 28-Dec-2016 06:32

Hello,

We are trying to connect to a SOAP web service. Connecting works without problems when the number of clients is small. However, when hundreds of clients (around 1200) try to connect to the SOAP WSDL simultaneously, execution hangs.

After debugging, we found that the process hangs at the CONNECT statement (hWebService:CONNECT).

This means that no error is raised, no STOP condition is raised, and the FINALLY block in the code is not executed.

Below is the code snippet:

loTimeOut = TRUE.

DOBLOCK:
DO STOP-AFTER 60
ON STOP UNDO DOBLOCK, LEAVE DOBLOCK:

    CREATE SERVER hWebService.
    hWebService:CONNECT("-WSDL " + "#########/*****.wsdl" + " -nohostverify") NO-ERROR.

    IF ERROR-STATUS:ERROR OR
       NOT VALID-HANDLE(hWebService) OR
       NOT hWebService:CONNECTED() OR
       ERROR-STATUS:NUM-MESSAGES > 0 THEN
    DO:
        ...
    END.

    /* Reached only if the block was not stopped by STOP-AFTER. */
    loTimeOut = FALSE.

END. /* DOBLOCK */

IF loTimeOut THEN
DO:
    /* timeout handling */
END.

CATCH anyErrorObject AS Progress.Lang.Error:

END CATCH.

FINALLY:
    /* delete handles */
END.

That is, control does not move to the next statement, does not reach the “Timeout” block or the CATCH block, and the FINALLY block is not executed.

Please suggest how this can be avoided. We are consuming a Corticon decision service deployed on Tomcat. We have customized Tomcat with the following properties:

connectionTimeout="60000"

maxThreads="500"

maxConnections="1000"

acceptCount="300"

Thanks,

Archana 

 

All Replies

Posted by Brian K. Maher on 29-Dec-2016 06:40

Hi Archana,
 
Does your application create the server object once and connect just once or do you follow the create -> connect -> run -> disconnect -> delete model?
 
If you follow the second model, you should switch to the first. Because Web Services use HTTP/HTTPS, which is inherently stateless, you only need to create the server object and connect to the Web Service once during the execution of the application; you can then make calls against that handle as required.
 
Brian

Posted by archana.gupta on 29-Dec-2016 06:58

Hi Brian,

We are following the first approach: we create the server object once and connect just once.

However, to replicate the scenario running in Production (where thousands of Progress clients access the SOAP web service at the same time), we are launching simultaneous background Progress sessions (using the -b -p parameters).

When we increase the count of these sessions to 1200 (a situation that could occur in the real environment), we find that in quite a number of sessions the code hangs at the CONNECT statement.

One more observation: no such problem occurs if we keep a local copy of the WSDL file and use it in the CONNECT statement (however, this is not feasible for production).

Thanks,

Archana

Posted by Brian K. Maher on 29-Dec-2016 07:02

Archana,
 
Are these background processes running on the same machine where the Web Service is running?
 
Brian

Posted by archana.gupta on 29-Dec-2016 07:05

They are running on three different machines (400 sessions on each machine).

Thanks,

Archana

Posted by Brian K. Maher on 29-Dec-2016 07:28

Hi Archana,
 
Okay.  I assume the total environment is 4 machines (3 used to stress test and 1 running the Web Service).
 
Can you do the following?
 

1) On the machine running the Web Service, run the netstat -a command and redirect the output to a file.

2) After that is done, kick off the 400 processes on each of the 3 stress test machines.

3) Once you start seeing the problem...

   a. Run the netstat -a command again on the machine running the Web Service and redirect the output to a second file.

   b. Run the netstat -a command on each of the 3 stress test machines, redirecting the output to different files on each machine.

 
Once all this is done, start by looking at the output file from step 3.a and count how many sockets are sitting in a TIME_WAIT state (note that the text might vary slightly ... TIMED_WAIT or something close to that).  Do the same on each of the output files from step 3.b.  If possible, generate a count of the number of times this value is shown in each file and respond back with that information.
 
The reason I am asking you to do this is because I have a suspicion that you are running out of available sockets in the dynamically allocatable pool of sockets (normally sockets in the 2000 to 4000 port range).  If you see thousands of sockets (particularly in the 2000 to 4000 range) that are sitting in the TIME_WAIT state then you are stalled because there are no dynamically allocatable ports on the server side which are available for reuse.  The norm is that once a socket is disconnected it sits in the TIME_WAIT state for something like 3 to 4 minutes (not absolutely sure on that and it can vary depending on OS config) before it becomes available for reuse.  This is part of the design of TCP/IP so that stray packets from a previous connection do not wrongly get sent to the socket when it is connected to a new client.
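
The counting step described above could be scripted roughly as follows. This is only a minimal Python sketch and not part of the original instructions: the function names, the sample text, and the file name in the usage note are hypothetical. It assumes netstat -a was run without options that add columns after the state, so the state is the last field of each TCP row.

```python
from collections import Counter

# Hypothetical sample of `netstat -a` output, used only for illustration.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:8080           10.0.0.11:2101          TIME_WAIT
tcp        0      0 10.0.0.5:8080           10.0.0.12:2102          FIN_WAIT_2
tcp        0      0 10.0.0.5:8080           10.0.0.13:2103          TIME_WAIT
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""


def tally_tcp_states(netstat_text):
    """Count how often each TCP state appears in saved `netstat -a` output."""
    states = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # TCP rows start with "tcp"/"tcp6" (Unix-like) or "TCP" (Windows).
        if len(fields) < 4 or not fields[0].lower().startswith("tcp"):
            continue
        # Assumption: the state is the last column of the row.
        states[fields[-1]] += 1
    return states


def count_time_wait(states):
    """Sum states that look like TIME_WAIT, allowing spellings such as TIMED_WAIT."""
    return sum(n for name, n in states.items() if "TIME" in name and "WAIT" in name)


def tally_file(path):
    """Tally the TCP states in one saved netstat output file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return tally_tcp_states(handle.read())


sample_states = tally_tcp_states(SAMPLE)
print("TIME_WAIT-like sockets:", count_time_wait(sample_states))
print("All states:", dict(sample_states))
```

For a real run, something like tally_file("server_after.txt") (a hypothetical file name) would be called once per saved netstat file, and the tallies from before and after the test compared.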
 
Brian

Posted by archana.gupta on 30-Dec-2016 07:27

Hi Brian,

I ran the above test case; please find the details below. The results were not what you suspected.

Before Running the test:

---------------------------------

On the 3 stress machines, there were about 6 occurrences of TIME_WAIT in the "netstat" file.

On the web service machine, there were about 18 occurrences of TIME_WAIT in the "netstat" file.

After Running the test:

--------------------------------

The issue occurred on just one machine, where 130 of the 400 sessions hung. That is, on the 1st and 2nd machines all 400 + 400 requests executed properly, and on the 3rd machine 130 sessions went into the "hang" state.

Strangely, on all three stress machines there were only around 3 occurrences of TIME_WAIT.

On the web service machine, there were about 28 occurrences of TIME_WAIT, but this count changed rapidly. We then saw around 600 occurrences of FIN_WAIT_2, which rapidly changed to CLOSE_WAIT.

Thanks,

Archana

 

 

Posted by Brian K. Maher on 30-Dec-2016 08:46

Archana,
 
The counts on the stress test machines are fine so those don’t seem to be the problem.
 
What OS and version is the Web Service running on?
 
Brian

Posted by Brian K. Maher on 30-Dec-2016 09:02

Archana,
 
I think that at this point you need to open an official support request with Technical Support.  If you have an active maintenance agreement with us you should be able to send this entire thread to support by clicking a button.
 
Brian

This thread is closed