vsathyan | Centurion | Joined: 10 Mar 2014 | Posts: 121
Posted: Tue Jun 02, 2015 7:33 pm    Post subject: shmmni 100% used - Linux

				Hi all, 
 
We are facing a strange issue in production, without any change to the infrastructure, which was deployed almost 4 months ago.

The System V shared memory identifier limit (shmmni) reaches 100% usage and queue manager performance degrades; finally the queue manager stops accepting new connections, and existing connections fail with MQRC 2009/2059.

There are only 3 server-connection channels in this queue manager, with a total of 34 connections across all three. The server-connection channels have DISCINT = 0 and SHARECNV = 0. Does this create a problem?

There are other queue managers in the network which are far more heavily loaded than this one, but they are using only around 45 of 6400 shmmni sets.

The queue manager is running with only 16 processes under the 'mqm' user account:

ps -ef | grep mqm

The operating system is Oracle Enterprise Linux 6.5, running WebSphere MQ 7.5.0.2.

For a temporary fix we increased shmmni to 8192, but we have to identify and apply a permanent fix for this issue.
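For reference, a change like this is typically checked and applied through the standard sysctl interface (a sketch, assuming the usual /proc and /etc/sysctl.conf mechanism on a box like this OEL 6.5 one; the write commands are shown commented out because they need root):

```shell
# Check the current System V shared memory identifier limit:
cat /proc/sys/kernel/shmmni

# Raise it at runtime (needs root), then persist across reboots:
# sysctl -w kernel.shmmni=8192
# echo 'kernel.shmmni = 8192' >> /etc/sysctl.conf && sysctl -p
```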
 
 
Below are the command outputs:

----------------------------------------------------------

/opt/mqm/bin/mqconfig

System V Shared Memory
  shmmax   68719476736 bytes                      IBM>=268435456   PASS
  shmmni   6400 of 8192 sets              (78%)   IBM>=4096        WARN
  shmall   417061616 of 4294967296 pages   (9%)   IBM>=2097152     PASS
 
 
 
[mqm@server ~]$ free
             total       used       free     shared    buffers     cached
Mem:      16330176   16168032     162144   10594096     103840   14658264
-/+ buffers/cache:    1405928   14924248
Swap:      2097144          0    2097144

-------------------------------------------------------------------
 
 
Also, in the MQ error logs we observed errors logged with MQRC 2071 (MQRC_STORAGE_NOT_AVAILABLE). When we checked the NFS mount, the usage is only around 6%:
 
 
nfsserver:/mq_prodnfs/mq_prodnfs
 
                       50G  2.7G   47G   6% /mqdata
 
 
Out of 50GB, only 2.7GB is used and 47GB free. 
 
We googled MQRC 2071, and as for the causes indicated in a couple of links: the app is hosted on Windows, but it is not putting blank messages either.

The setup had been running fine for nearly 4 months, and we suddenly started facing this issue last Friday.

Your inputs are much appreciated. Thanks in advance for your advice.
		
		
exerk | Jedi Council | Joined: 02 Nov 2006 | Posts: 6339
Posted: Wed Jun 03, 2015 1:13 am

As a starting point, I suggest checking all Change Management records applied on that date to see if any were applied specifically to your server; if so, investigate that change.
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.
		
		
fjb_saper | Grand High Poobah | Joined: 18 Nov 2003 | Posts: 20768 | Location: LI, NY
Posted: Wed Jun 03, 2015 2:52 am

		  
		    
			  
Also, for storage not available: check not only the mqdata file system but also the mqlogs file system.
_________________
MQ & Broker admin
		
		
tczielke | Guardian | Joined: 08 Jul 2010 | Posts: 943 | Location: Illinois, USA
Posted: Wed Jun 03, 2015 4:54 am

		  
		    
			  
When you get to the shmmni 100% full condition, have you confirmed that it is MQ taking up the shared memory segments? Have you checked with something like ipcs -m?

Also, you probably want to look into moving to the latest fix pack, which is 7.5.0.5.
_________________
Working with MQ since 2010.
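To make that check concrete, here is one way to count the System V shared memory segments per owning user and compare against the kernel limit (a sketch, assuming the usual util-linux `ipcs -m` column layout where the owner is the third column):

```shell
# Count shared memory segments per owner; data rows of `ipcs -m` have
# at least 6 fields (key shmid owner perms bytes nattch [status]) and
# the header row is skipped by excluding the literal "owner" column title.
ipcs -m | awk 'NF >= 6 && $3 != "owner" { count[$3]++ }
               END { for (u in count) print u, count[u] }'

# Kernel-wide limit on segment identifiers, for comparison:
cat /proc/sys/kernel/shmmni
```

If the count attributed to mqm is approaching the shmmni value, that points at MQ (or an application connected to it) as the consumer.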
		
		
vsathyan | Centurion | Joined: 10 Mar 2014 | Posts: 121
Posted: Wed Jun 03, 2015 7:59 am

		  
		    
			  
@exerk,

There were no changes applied on that date, but a couple of months ago there was a forced Linux patching. That should not affect the queue manager after 2 months, though.

@fjb_saper,

Unfortunately, data and logs are on the same share/mount (we are in the process of moving the logs to a different mount). There is space available and the usage is only 6%.

@tczielke,

ipcs -m listed active MQ shared memory segments. There were no segments marked for destruction.

On another note, we have identified a damaged queue object, used by the Nastel monitoring agent. The agent process may have been repeatedly trying to access this object, finally creating the problem.

Currently we have stopped the monitoring agent and restarted the queue manager, and as of now shmmni usage is consistently around 34 of 8192 sets for the past 10 hours. We are monitoring it.

Will update you once we have more information.

Thanks all for your valuable time and inputs.

Cheers!
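For that kind of watch, a small snapshot script can be run from cron and appended to a log to trend the usage over time (a hypothetical helper, not part of MQ; it assumes segment rows in `ipcs -m` output start with a 0x key):

```shell
# Snapshot the System V shared memory segment count against the kernel
# shmmni limit. Segment data rows in `ipcs -m` begin with a hex key (0x...).
limit=$(cat /proc/sys/kernel/shmmni)
used=$(ipcs -m | awk '/^0x/ { n++ } END { print n + 0 }')
echo "$(date '+%F %T') shm segments in use: $used of $limit"
```

Redirecting the output to a file once a minute gives a simple timeline to correlate against queue manager events.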
		
		
rammer | Partisan | Joined: 02 May 2002 | Posts: 359 | Location: England
Posted: Wed Jun 03, 2015 2:20 pm

		  
		    
			  
				
   
vsathyan wrote:
> On another note, we have identified a damaged queue object, used by the monitoring agent. The agent process may have been trying to access this object, finally creating the problem.
>
> Currently we have stopped the monitoring agent and restarted the queue manager, and as of now shmmni usage is consistently around 34 of 8192 sets for the past 10 hours. We are monitoring it.

Sounds like a good spot.
		
		
vsathyan | Centurion | Joined: 10 Mar 2014 | Posts: 121
Posted: Tue Jul 14, 2015 8:17 am

		  
		    
			  
Update:

MQ 7.5.0.2 has a memory leak problem, confirmed by IBM and fixed in 7.5.0.5.

We tested before applying the maintenance pack: we reproduced the issue on 7.5.0.2, applied 7.5.0.5, and then tried to reproduce it with the same steps. The memory leak did not occur.

Hope this helps someone who is still using MQ 7.5 :p
_________________
Custom WebSphere MQ Tools Development C# & Java
WebSphere MQ Solution Architect Since 2011
WebSphere MQ Admin Since 2004
		
		
tczielke | Guardian | Joined: 08 Jul 2010 | Posts: 943 | Location: Illinois, USA
Posted: Tue Jul 14, 2015 8:23 am

		  
		    
			  
Thanks for sharing this. Do you have the APAR number that corrects this issue in 7.5.0.5?
_________________
Working with MQ since 2010.
		
		
vsathyan | Centurion | Joined: 10 Mar 2014 | Posts: 121
Posted: Tue Jul 14, 2015 9:47 am

		
		
tczielke | Guardian | Joined: 08 Jul 2010 | Posts: 943 | Location: Illinois, USA
Posted: Tue Jul 14, 2015 9:54 am

		  
		    
			  
Thanks. It looks like that APAR was corrected in 7.5.0.3, too.
_________________
Working with MQ since 2010.
		
		
vsathyan | Centurion | Joined: 10 Mar 2014 | Posts: 121
Posted: Tue Jul 14, 2015 9:58 am

		  
		    
			  
Yeah, it was corrected in 7.5.0.3.

7.5.0.5 has a bunch of fixes applied. When we tested it in our environment there were no side effects, and we can also sustain it for a year or so.

Hence we deployed 7.5.0.5 in our prod environment. The environment is very stable now.

Thanks & Regards,
vsathyan
_________________
Custom WebSphere MQ Tools Development C# & Java
WebSphere MQ Solution Architect Since 2011
WebSphere MQ Admin Since 2004
		
		