AMQ9500 - No Repository storage
fjb_saper
Posted: Tue Dec 18, 2018 8:51 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

Can you describe your system in terms of
number of servers
number of qmgrs / server
RAM per server

_________________
MQ & Broker admin
mvic
Posted: Thu Dec 20, 2018 8:43 am

Jedi

Joined: 09 Mar 2004
Posts: 2080

Your queue manager has reached an internal memory limitation.
Something has presumably "leaked": memory has been grabbed but not released when it should have been.
Trace does not help to see why this has happened. You will probably need to dump off the memory somehow.
Jeff.VT
Posted: Thu Dec 20, 2018 1:57 pm

Acolyte

Joined: 02 Mar 2017
Posts: 68

Ya. IBM L3 just got back to us, and they're going to have us do a dump.

Quote:
This queue manager has "leaked" memory in the cluster cache shared
memory area. The most likely source (as it seems at the moment) would
be memory areas remembering the interest of connected applications
(internally we call them "registration areas")


Due to the scary nature of this dump, I'm going to have to schedule a change to perform it, which might take some time. I'm curious what kind of symptoms having a leaky memory in Cluster Cache would cause.
mvic
Posted: Thu Dec 20, 2018 2:12 pm

Jedi

Joined: 09 Mar 2004
Posts: 2080

Jeff.VT wrote:
I'm curious what kind of symptoms having a leaky memory in Cluster Cache would cause.

Well, on the assumption that it's the same principle as any other memory leak: basically your program tells the OS it needs 1 MB, then 1 MB more, then 1 MB more, and so on, until it has had all the memory the OS virtual memory system is able to give it. The OS then says "no" to everything else that asks for memory.
I assume that if you restart your queue manager the memory will be released to the OS and you can start fresh.
However, then IBM will not be able to say anything much about what was going on, or find an underlying bug that caused it.
How long was your queue manager up and running before this happened to you?
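
To illustrate the principle only, here is a toy Python model. It is a sketch, not how MQ manages its cluster cache: the `FixedPool` class, the capacity of 100 units, and the `run` helper are all made up for the illustration. A caller that releases what it takes can run indefinitely, while one that never releases hits the limit after exactly `capacity` requests.

```python
class FixedPool:
    """Toy fixed-size pool standing in for a bounded memory area (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_use = 0

    def allocate(self, size):
        # Refuse the request once the pool would exceed its fixed capacity.
        if self.in_use + size > self.capacity:
            raise MemoryError("pool exhausted")
        self.in_use += size

    def release(self, size):
        self.in_use = max(0, self.in_use - size)


def run(pool, requests, leak):
    """Attempt `requests` one-unit allocations; return how many succeeded."""
    served = 0
    for _ in range(requests):
        try:
            pool.allocate(1)
        except MemoryError:
            break  # everything after this point is refused
        served += 1
        if not leak:
            pool.release(1)  # a well-behaved caller gives the unit back
    return served


healthy = run(FixedPool(capacity=100), requests=1000, leak=False)
leaky = run(FixedPool(capacity=100), requests=1000, leak=True)
print(healthy, leaky)
```

A restart corresponds to building a fresh `FixedPool`: the leaked units are gone, but so is the evidence of what leaked them.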
Jeff.VT
Posted: Thu Dec 20, 2018 3:14 pm

Acolyte

Joined: 02 Mar 2017
Posts: 68

mvic wrote:
How long was your queue manager up and running before this happened to you?


We fail them over every 3 months or so.

What sparked the issue... I built our queue managers into a cluster many years ago, but we hadn't ever used it. I migrated our outbound messaging to use the cluster queues instead of alias/remote direct queues.

Those cluster queues existed for months before we started to use them. And the Test system was using it for months before that without any problems like this.

After I migrated it, the errors started a few hours later. But I guess the volume just got to a point where it's too much for it. I can back off on the volume by putting some of our more chatty, less important data back to alias/remote queues.

I did run a REFRESH CLUSTER command on all my PRs and one of my FRs, and the error did go away for a few hours.

----------------

Given what L3 said, it makes sense. And the thing that is impacted, "Cluster Cache for interest of connected applications"... I would *Think* what is happening is that when an application is asking for which queue to send something to, it hits the memory leak, and then asks the full repository. The FR gives it the answer.

I would guess that the chattiness between my PRs and FRs has gone way up. But depending on the message, I would have had calls before now if we were dropping any.

They may not be routing perfectly. And if there's a downstream failure that requires an update to the object to propagate, that may not go out. But I believe messages are still GOING out - if only just not always to the 'perfect' destination.

-----------

Today I updated our queues from "On Group" Bind Type to "Not Fixed". But I assume I would need a Cluster Refresh to see if that did anything. "On Open" wouldn't be remotely useful for us since I'd have to re-code my applications so they don't keep the connection open all the time - and I don't control that.
bruce2359
Posted: Thu Dec 20, 2018 3:56 pm

Poobah

Joined: 05 Jan 2008
Posts: 9394
Location: US: west coast, almost. Otherwise, enroute.

Jeff.VT wrote:
Today I updated our queues from "On Group" Bind Type to "Not Fixed". But I assume I would need a Cluster Refresh to see if that did anything. "On Open" wouldn't be remotely useful for us since I'd have to re-code my applications so they don't keep the connection open all the time - and I don't control that.

Uh....

You changed queue attribute BIND option on cluster queues? Why?

NO, you will not need a REFRESH CLUSTER. Cluster software publishes cluster object changes to FRs, and to PRs that have expressed interest.

Bind is not a connect option; rather, it is an open option for a queue. The DEFBIND queue attribute only supplies the default for it.

Does your application specify MQOO_BIND_AS_Q_DEF at MQOPEN? If it instead specifies MQOO_BIND_ON_OPEN, MQOO_BIND_NOT_FIXED, or MQOO_BIND_ON_GROUP explicitly, then your queue attribute change will have no effect.
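
A rough Python sketch of that precedence, for illustration only. It does not call MQ at all; the `effective_bind` helper is hypothetical, and the constant values are written from memory of the MQ C header `cmqc.h`, so treat them as assumptions and check them against your own installation.

```python
# Open-option values as recalled from cmqc.h (assumption: verify locally).
MQOO_OUTPUT = 0x00000010
MQOO_BIND_AS_Q_DEF = 0x00000000  # zero, so it is what you get if no bind flag is set
MQOO_BIND_ON_OPEN = 0x00004000
MQOO_BIND_NOT_FIXED = 0x00008000
MQOO_BIND_ON_GROUP = 0x00400000

# Explicit bind flags mapped to the matching DEFBIND keyword.
EXPLICIT_BIND = {
    MQOO_BIND_ON_OPEN: "OPEN",
    MQOO_BIND_NOT_FIXED: "NOTFIXED",
    MQOO_BIND_ON_GROUP: "GROUP",
}


def effective_bind(open_options, queue_defbind):
    """Return the bind type in force for one MQOPEN (hypothetical helper).

    An explicit MQOO_BIND_* flag wins; otherwise the queue's DEFBIND applies.
    """
    for flag, name in EXPLICIT_BIND.items():
        if open_options & flag:
            return name
    return queue_defbind


# Application defers to the queue definition: an attribute change takes effect.
print(effective_bind(MQOO_OUTPUT | MQOO_BIND_AS_Q_DEF, "NOTFIXED"))
# Application hard-codes its bind option: the queue attribute is ignored.
print(effective_bind(MQOO_OUTPUT | MQOO_BIND_ON_GROUP, "NOTFIXED"))
```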
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
Jeff.VT
Posted: Fri Mar 15, 2019 7:30 am

Acolyte

Joined: 02 Mar 2017
Posts: 68

Just so I'm not one of "those" people...

The problem has been resolved. Or at least we found a workaround.

The problem was the Default Bind on my queues being set to 'On Group'. I don't really understand why. IBM Support suggested there was a bug or something for the On Group binding.

I asked my developers, and they weren't grouping messages at the MQ Layer anyway, so I just set all my clustered queues to NotFixed. The error slowly abated as my queue managers were bounced and failed over in the course of the normal patch cycle.
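
For anyone wanting to make the same change in bulk, a small Python sketch that only generates the MQSC text might look like the following. The queue names are hypothetical, `defbind_commands` is an illustrative helper rather than anything shipped with MQ, and it assumes the clustered queues are defined as local queues (QLOCAL) on the queue manager where the commands are run.

```python
VALID_DEFBIND = {"OPEN", "NOTFIXED", "GROUP"}


def defbind_commands(queue_names, defbind="NOTFIXED"):
    """Build one MQSC ALTER command per queue (illustrative helper)."""
    if defbind not in VALID_DEFBIND:
        raise ValueError(f"unsupported DEFBIND value: {defbind}")
    return [f"ALTER QLOCAL({name}) DEFBIND({defbind})" for name in queue_names]


# Hypothetical queue names; substitute your own clustered queues.
for command in defbind_commands(["APP.OUT.ORDERS", "APP.OUT.EVENTS"]):
    print(command)
```

The printed lines can be reviewed and then fed to runmqsc for the relevant queue manager.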

Thanks to everybody for your help! Strange issue with a strange solution.