10054 TCP errors on channel communication - tips
HenriqueS
Posted: Mon Aug 06, 2007 11:26 am

Windows: 10054 (WSAECONNRESET) and/or 10038 (WSAENOTSOCK) TCP errors on the send() or recv() functions.
AIX: 73
Linux: 104
Sun Solaris: 131
HP-UX: 232

Generic explanation:
- "The connection was reset by the remote side executing a 'hard' or 'abortive' close."
- "Reset by peer."

This error can occur in several different situations. It comes from the underlying TCP layer and is rarely related directly to MQ, so it is very unlikely that you will get an MQ-formatted error message (MQ*) for it.
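To make the "abortive close" concrete, here is a minimal sketch that provokes a reset on the loopback interface with no MQ involved. It assumes a POSIX-style `struct linger` layout (two C ints, as on Linux); the function name `provoke_reset` is made up for this illustration.

```python
import errno
import socket
import struct


def provoke_reset():
    """Return the errno a client sees after its peer does an abortive close."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    server_side, _ = listener.accept()
    # SO_LINGER with l_onoff=1 and l_linger=0 makes close() send a TCP RST
    # instead of the normal FIN handshake. Assumes struct linger is two ints.
    server_side.setsockopt(
        socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0)
    )
    server_side.close()
    try:
        client.recv(1)
        return None  # a clean close would land here (recv returns b"")
    except OSError as exc:
        return exc.errno
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    code = provoke_reset()
    print(code, errno.errorcode.get(code))
```

The number printed is the platform's own "connection reset" code, which is why the same condition shows up under a different value on each operating system.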

My intention is that this topic will help future MQ admins searching this forum.

Possibility #1: Windows 2003 security feature
---------------------------------------------
This possibility affects MS SQL Server directly, but I understand it could happen with any software that listens on a TCP port.

Source url: http://msdn2.microsoft.com/en-us/library/ms189083.aspx

"Connections May Be Forcibly Closed When Running on Windows Server 2003 SP1
When testing scalability with a large number of client connection attempts to an instance of the SQL Server Database Engine running on Windows Server 2003 Service Pack 1, Windows may drop connections if the requests arrive faster than SQL Server can service them. This is a security feature of Windows Server 2003 Service Pack 1, which implements a finite queue for incoming TCP connection requests. It results in the following error:

ProviderNum: 7, Error: 10054, ErrorMessage: "TCP Provider: An existing connection was forcibly closed by the remote host ...

To resolve this issue, use the regedit.exe utility to add the following registry key:
Key:   HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\
Type:  DWORD
Name:  SynAttackProtect
Value: 00000000

Security Note:
Setting this registry key may expose the server to a SYN flood, denial-of-service attack. Add this registry value only if necessary and with an understanding of the security risks. Remove this registry value when testing is complete."
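For anyone who prefers the command line to regedit.exe, the sketch below builds the equivalent `reg add` and `reg delete` command lines. It only prints them and runs nothing. The helper names are made up for this illustration; run the printed commands from an elevated prompt only if you accept the risk described in the security note.

```python
# Registry location and value name taken from the Microsoft text quoted above.
KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUE = "SynAttackProtect"


def add_command():
    """Command line that creates the DWORD value and sets it to 0."""
    return ["reg", "add", KEY, "/v", VALUE, "/t", "REG_DWORD", "/d", "0", "/f"]


def delete_command():
    """Command line that removes the value again once testing is complete."""
    return ["reg", "delete", KEY, "/v", VALUE, "/f"]


if __name__ == "__main__":
    # Print only; nothing is changed on the machine running this script.
    print(" ".join(add_command()))
    print(" ".join(delete_command()))
```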

Possibility #2: disable the creation of new threads for responder channels
--------------------------------------------------------------------------

Source url: http://ck20.com/MQ/WMQ%20&%20WBIMB%20SupportPacs/MQ%20TCPIP%20error%20MQNOREMPOOL.txt

Problem: TCP/IP communications between MQ systems do not work, and these
"contact your administrator" errors are reported:

On the bad system:
A communications error for TCP/IP occurred.
An unexpected error occurred in communications.
The return code from the TCP/IP (ioctlsocket) call was 10038 (X'2736'). Record these values and tell the systems administrator.

On the good system:
Error on receive from host MQ057-000200 (127.0.0.1).
An error occurred receiving data from MQ057-000200 (127.0.0.1) over TCP/IP. This may be due to a communications failure.
The return code from the TCP/IP (recv) call was 10054 (X'2746'). Record these values and tell the systems administrator.
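When scanning MQ error logs for these messages, a small parser can help. The sketch below pulls the call name and return code out of lines shaped like the two above and attaches the usual symbolic name. The regular expression is based only on the wording quoted in this post, and the lookup table covers just the codes listed at the top of this topic, so treat both as assumptions and not as a complete reference.

```python
import re

# Codes listed at the top of this topic, with their usual symbolic names.
KNOWN_CODES = {
    10054: "WSAECONNRESET (Windows)",
    10038: "WSAENOTSOCK (Windows)",
    73: "ECONNRESET (AIX)",
    104: "ECONNRESET (Linux)",
    131: "ECONNRESET (Solaris)",
    232: "ECONNRESET (HP-UX)",
}

# Pattern modelled on the log wording quoted in this post.
RETURN_CODE = re.compile(
    r"The return code from the TCP/IP \((\w+)\) call was (\d+) \(X'([0-9A-Fa-f]+)'\)"
)


def parse_return_code(line):
    """Return (call, code, name) for a matching log line, else None."""
    match = RETURN_CODE.search(line)
    if match is None:
        return None
    call, decimal, hexadecimal = match.groups()
    code = int(decimal)
    # The log prints the same value twice; make sure both forms agree.
    if int(hexadecimal, 16) != code:
        raise ValueError("decimal and hex return codes disagree")
    return call, code, KNOWN_CODES.get(code, "unknown")


if __name__ == "__main__":
    sample = "The return code from the TCP/IP (recv) call was 10054 (X'2746')."
    print(parse_return_code(sample))
```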

Solution: Add System Environment variable & reboot:

MQNOREMPOOL=1

Note: setting this environment variable may cause other errors, such as the one described in http://www-1.ibm.com/support/docview.wss?&apar=only&uid=swg1IC43446 (IC43446: CLUSTERS:
AMQ9248 REPORTED WHEN TRYING TO START A CLUSTER SENDER CHANNEL AFTER APPLYING FIX PACK)

More about the MQNOREMPOOL:

Source url: http://www-1.ibm.com/support/docview.wss?rs=171&context=SSFKSJ&dc=DB520&uid=swg21266191&loc=en_US&cs=UTF-8&lang=en&rss=ct171websphere

MQNOREMPOOL=1 disables the channel pooling mechanism. When channel pooling is disabled, all the responder threads are spawned in a single MQ listener process, as opposed to distributing the load among multiple AMQRMPPA processes when channel pooling is enabled.

Setting MQNOREMPOOL=1 can cause severe load levels on the MQ listener process. The use of this environment variable is not supported on OpenVMS due to performance and scalability limitations.

Possibility #3: faulty NIC or driver settings
---------------------------------------------

This is what actually happened in our MQ installation: a fresh IBM iSeries with Windows 2003 R2 installed on it, running WebSphere MQ 6.0 along with a SWIFT connectivity package.

The inbound channel connections could not last for more than 30 or 60 seconds and went to the Inactive state after that.

File transfers and network access also seemed very slow.

Using packet capture software (Wireshark), we caught several TCP resets sent for no apparent reason, along with many packets carrying wrongly calculated TCP checksums.
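For readers wondering what a "wrongly calculated" checksum means in practice, here is a minimal sketch of the 16-bit one's-complement Internet checksum arithmetic that TCP uses. A real TCP checksum is computed over a pseudo-header plus the TCP header and payload; this function only shows the arithmetic over an arbitrary byte string, and its name is chosen for this illustration.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over `data` (big-endian words)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    payload = bytes.fromhex("0001f203f4f5f6f7")
    checksum = internet_checksum(payload)
    print(hex(checksum))
    # A receiver recomputes over data plus checksum; an intact packet gives 0.
    print(internet_checksum(payload + checksum.to_bytes(2, "big")))
```

If a single bit of the data is flipped on the way, the receiver's recomputation no longer gives 0, and that mismatch is what the capture tool reports.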

The first step, which brought some improvement, was to lock the NIC speed to "100 full duplex", because the auto-detection feature could have been working erratically even though Windows reported the link speed as "100 Mbps". The network speed improved a lot.

But the channels were still inconsistent, with periods of inactivity and so on, and the packet capture tool kept reporting TCP packet errors. So we decided to enable another NIC that was already available in the server and disable the currently active one.

This second step solved the problem completely, and our channels have been fully operational for the last 48 hours.

We ended up filing a request for our network operations team to be more careful when making a new server available. I guess a simple but thorough network test (PING did not diagnose this) before releasing a new machine for production could be a real time-saver.
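One possible shape for such a test is a soak check that holds a single TCP connection open for longer than the 30 to 60 seconds our channels survived, echoing data back and forth and comparing it. The sketch below is only an illustration under assumptions: the function names, the default duration, and the bundled one-connection echo server are all made up here, and in practice the echo side would run on the new machine and the client on a peer.

```python
import socket
import threading
import time


def start_echo_server(host="127.0.0.1", port=0):
    """Start a one-connection echo server in a background thread; return its address."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, port))
    listener.listen(1)

    def serve():
        conn, _ = listener.accept()
        with conn:
            while True:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                conn.sendall(chunk)
        listener.close()

    threading.Thread(target=serve, daemon=True).start()
    return listener.getsockname()


def soak_connection(address, duration=120.0, interval=1.0, payload=b"x" * 1024):
    """Hold one connection open for `duration` seconds, echoing data every interval.

    Returns the number of successful round trips. Raises OSError (for example
    ConnectionResetError) or ValueError if the link misbehaves.
    """
    round_trips = 0
    deadline = time.monotonic() + duration
    with socket.create_connection(address, timeout=10) as sock:
        while time.monotonic() < deadline:
            sock.sendall(payload)
            received = b""
            while len(received) < len(payload):
                chunk = sock.recv(len(payload) - len(received))
                if not chunk:
                    raise ConnectionError("peer closed the connection early")
                received += chunk
            if received != payload:
                raise ValueError("echoed data does not match what was sent")
            round_trips += 1
            time.sleep(interval)
    return round_trips


if __name__ == "__main__":
    # Loopback self-test with a short duration; point it at a real peer instead.
    print(soak_connection(start_echo_server(), duration=3.0, interval=0.5))
```

Unlike PING, this exercises a long-lived TCP session, so a reset in the middle of the run surfaces as an exception.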

We spent 6 hours exclusively on this issue.