TCP/IP write errors - high level of WS agents losing DB conn

Posted by conclarryf on 04-Feb-2014 14:13

Hi All...

Running OE 11.1 on RH Linux 6.4 64-bit, on ESX VM hosts.

We are having widespread issues with our WebSpeed agents losing connections to their databases.  A large number of "TCP/IP write error occurred with errno 32" messages are being logged in all of the connected databases.  The errors appear random, with no pattern that we can see.  We have to bounce WebSpeed to re-establish the connections and get our app working again.

I've opened support cases with PSC and we did make some OS tcp keepalive changes to no avail.  Was hoping someone else in the trenches might have seen this and has a remedy.  Anyone?  Anyone?
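
For reference, errno 32 on Linux is EPIPE ("Broken pipe"): the writer tried to send on a connection whose other end had already gone away. The sketch below is only an illustration of that error code, not of the OpenEdge broker itself. It uses a local socket pair as a stand-in for the database connection, and the function name is hypothetical.

```python
import errno
import os
import socket


def write_after_peer_close():
    """Send on a socket whose peer has closed; return the resulting errno."""
    a, b = socket.socketpair()  # local stand-in for a client/server connection
    a.close()                   # the "client" side disappears
    try:
        b.send(b"data")         # the "server" side still tries to write
    except OSError as exc:
        return exc.errno
    finally:
        b.close()
    return 0


if __name__ == "__main__":
    code = write_after_peer_close()
    # Look up the symbolic name and message for the numeric code.
    print(code, errno.errorcode.get(code), os.strerror(code))
```

So the message in the database log says the server side found the connection already torn down when it tried to write. It does not say which side, or which device in between, tore it down.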

All Replies

Posted by Rob Fitzpatrick on 04-Feb-2014 15:12

Have you traced the network traffic between your host and client machines?  Is there evidence of network reliability issues?  E.g. TCP retransmissions, resets, zero window updates, etc.  Are there devices (switches, routers, firewalls) between client and server that could potentially cause such issues?
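
A rough first check for the retransmissions and resets mentioned above, short of a full packet capture, is to read the kernel's cumulative TCP counters on the Linux host. This is a minimal sketch, assuming the usual /proc/net/snmp layout of one "Tcp:" header line followed by one "Tcp:" values line; the function name is hypothetical, and the counters are host-wide rather than per connection.

```python
import os


def parse_tcp_counters(snmp_text):
    """Return the Tcp counters from /proc/net/snmp-style text as a dict."""
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("expected a Tcp: header line and a Tcp: values line")
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, (int(v) for v in values)))


if __name__ == "__main__":
    path = "/proc/net/snmp"
    if os.path.exists(path):  # Linux only
        with open(path) as f:
            counters = parse_tcp_counters(f.read())
        # Segments sent, segments retransmitted, resets sent.
        for name in ("OutSegs", "RetransSegs", "OutRsts"):
            print(name, counters.get(name))
```

Sampling these a few minutes apart and comparing the change in RetransSegs against OutSegs gives a crude indication of whether the network is dropping traffic. A packet trace is still needed to see what happens to an individual connection.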

Posted by conclarryf on 04-Apr-2014 11:57

Sorry for the delay in responding, Rob.  We resolved it by cutting the TCP_KEEPALIVE_TIMEOUT to 30 seconds.
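
The thread does not say where this setting was changed, so the following is only a generic illustration of what a 30-second keepalive idle time means at the socket level on Linux. It is not the OpenEdge or WebSpeed configuration. The function name is hypothetical, and the probe interval and count are assumed values.

```python
import socket


def enable_keepalive(sock, idle=30, interval=10, count=3):
    """Turn on TCP keepalive; idle/interval are in seconds (assumed values)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These per-socket options exist on Linux; skip them where unavailable.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Idle time before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
        # Time between probes once probing has started.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
        # Unanswered probes before the connection is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock


if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    s.close()
```

With a short idle time, an otherwise quiet connection carries a probe every 30 seconds or so. That can keep it from being treated as idle by a device in the path, though the thread does not confirm that this was the cause here.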

This thread is closed