
MQSeries.net Forum Index » Clustering » Elongated Recovery in a DRP test

Elongated Recovery in a DRP test
Author Message
cicsprog
Posted: Tue Aug 21, 2018 5:33 am    Post subject: Elongated Recovery in a DRP test

Partisan

Joined: 27 Jan 2002
Posts: 314

Running MQ v8 on z/OS and Linux. A large MQ cluster spans this MQ network. Both full repositories live on z/OS and are dedicated mostly to handling subscription messages for the large cluster.

During a DRP test covering only the Linux MQ apps, the Linux queue managers take a long time to become functional. From what I am told, MQ is trying to resolve the connections for the cluster.

While this probably isn't an optimal DRP test, with the full repos on the mainframe, this customer would like to reduce the recovery time for MQ. My only thought for a resolution is to add another full repository on a Linux MQ.

Would appreciate your input on my solution or another approach. Thanks!!!
Vitor
Posted: Tue Aug 21, 2018 7:03 am    Post subject: Re: Elongated Recovery in a DRP test

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

cicsprog wrote:
Would appreciate your input on my solution or another approach.


What's causing the delay in "resolving the connections for the cluster"? Do you just mean it's taking a while for the DNS to work through? If so, why, and why would having a Linux full repository help with that?


As to your proposed solution, DO NOT ADD A THIRD FR! If you think moving one of the z/OS ones to distributed will help then so be it, but there are 2 FRs in a cluster. Not 3, and 1 FR only on the way to 2.
_________________
Honesty is the best policy.
Insanity is the best defence.
cicsprog
Posted: Tue Aug 21, 2018 7:56 am

Partisan

Joined: 27 Jan 2002
Posts: 314

Ya seeing is believing. Just doing some gap analysis for a client.

I'm being told it's 6 hrs trying to resolve connections. Since the FRs aren't accessible during the test, it's not surprising MQ would thrash around trying to recover.
Vitor
Posted: Tue Aug 21, 2018 8:04 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

cicsprog wrote:
it's not surprising MQ would thrash around trying to recover.


No, it is.

The channels should obey their short and long retry intervals rather than "thrash about", and even if the FRs are unavailable all of the PRs in the cluster should start up the already defined auto channels to the other PRs. Connection should take about 30 seconds. You're going to get a lot of retry messages out of the manually defined channels to the FRs (obviously) but there's no good reason why all that is taking 6 hours. 6 minutes would be a long time. How long until the cluster is available after you fail back and the FRs are available again?
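
To put rough numbers on the retry point, here is a minimal Python sketch of a channel's retry timeline. It assumes the commonly documented channel defaults (SHORTRTY 10, SHORTTMR 60 seconds, LONGRTY 999999999, LONGTMR 1200 seconds); the thread does not say what these channels actually use, so check DISPLAY CHANNEL on the real system. The function names are made up for illustration, and the time spent in each failed connection attempt is ignored.

```python
from itertools import islice

# Assumed defaults, not taken from the thread: verify on the real channels.
SHORTRTY, SHORTTMR = 10, 60            # short retries: count, seconds apart
LONGRTY, LONGTMR = 999_999_999, 1200   # long retries: count, seconds apart


def retry_schedule(short_count=SHORTRTY, short_interval=SHORTTMR,
                   long_count=LONGRTY, long_interval=LONGTMR):
    """Yield the elapsed time (seconds) at which each retry attempt is made."""
    elapsed = 0
    for _ in range(short_count):
        elapsed += short_interval
        yield elapsed
    for _ in range(long_count):
        elapsed += long_interval
        yield elapsed


def attempts_within(seconds, **intervals):
    """Count the retry attempts made in the first `seconds` of an outage."""
    count = 0
    for attempt_time in retry_schedule(**intervals):
        if attempt_time > seconds:
            break
        count += 1
    return count


if __name__ == "__main__":
    # First twelve attempt times: ten short retries, then the long ones begin.
    print(list(islice(retry_schedule(), 12)))
    # Attempts made by one channel over a 6 hour outage.
    print(attempts_within(6 * 3600))  # 27
```

Under those assumed settings a retrying channel makes only 27 attempts in 6 hours, and once its partner is reachable again it should reconnect within one long retry interval (20 minutes) at worst. Retry timing alone would not account for a 6 hour delay.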

Again, what does "resolve connections" mean? The DNS? The PRs certainly don't need to "thrash about" to find all the other PRs (and the cluster doesn't work that way anyway) because they know where they are, so what is happening for 6 hours?
_________________
Honesty is the best policy.
Insanity is the best defence.
cicsprog
Posted: Tue Aug 21, 2018 8:17 am

Partisan

Joined: 27 Jan 2002
Posts: 314

I will try and see if I can get some logs from Linux MQs from past DRs. Otherwise, I need to be present to see what's happening.
bruce2359
Posted: Tue Aug 21, 2018 8:48 am

Poobah

Joined: 05 Jan 2008
Posts: 9394
Location: US: west coast, almost. Otherwise, enroute.

During this 6 hour lag, are channels in RUNNING state? RETRY state? Or, any state other than RUNNING?

Did your DISPLAYs during the 6 hour lag indicate any in-doubt transactions or channels?

You didn't indicate what happens at the end of 6 hours. Did the qmgrs suddenly become functional?
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
Vitor
Posted: Tue Aug 21, 2018 9:02 am

Grand High Poobah

Joined: 11 Nov 2005
Posts: 26093
Location: Texas, USA

cicsprog wrote:
Otherwise, I need to be present to see what's happening.


Or you need someone to be a lot more precise and detailed about what's actually happening. Who's telling you that it's because the queue managers are "resolving connections" (whatever that means) and how did they make that determination? Did the DR'd system start working after 6 hours and they pulled this explanation out of thin air because it sounded technical? Where's their evidence for what's happening?

Most important of all, why would an FR on the distributed side make any difference? Do you think it will make a difference or is this mythical person saying that having a Linux based FR will "resolve connections" quicker?

I suspect the logs will show the queue managers starting after a few minutes and then sitting there. But you don't need to be there to see it; you need this person to tell you, in detail, what they saw and how they drew conclusions from it.
_________________
Honesty is the best policy.
Insanity is the best defence.
cicsprog
Posted: Tue Aug 21, 2018 7:07 pm

Partisan

Joined: 27 Jan 2002
Posts: 314

Thanks for the input. Waiting for a call back from the MQ admin to get some more info and the PMR that was supposedly opened at the time of the issue.
fjb_saper
Posted: Sun Aug 26, 2018 8:24 pm

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

cicsprog wrote:
Thanks for the input. Waiting for a call back from the MQ admin to get some more info and the PMR that was supposedly opened at the time of the issue.

You may also consider disabling reverse DNS lookup. I could imagine that it adds quite a bit to your delays in DR...
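
One way to check whether reverse DNS is slow at the DR site before changing anything is to time a lookup from the Linux host. This is a minimal Python sketch using only the standard library. The function name and the sample address are hypothetical (192.0.2.10 is a documentation-only address), so substitute the IP of a real partner queue manager. My understanding is that MQ v8 controls this behaviour through the queue manager's REVDNS attribute, but treat that as an assumption and confirm it in the documentation for your platform.

```python
import socket
import time


def time_reverse_lookup(ip_address):
    """Return (hostname or None, elapsed seconds) for one reverse DNS lookup."""
    start = time.perf_counter()
    try:
        hostname = socket.gethostbyaddr(ip_address)[0]
    except OSError:
        # Covers socket.herror / socket.gaierror: no PTR record or resolver failure.
        hostname = None
    elapsed = time.perf_counter() - start
    return hostname, elapsed


if __name__ == "__main__":
    # Hypothetical partner address; replace with a real cluster member's IP.
    name, seconds = time_reverse_lookup("192.0.2.10")
    print(f"resolved to {name!r} in {seconds:.2f}s")
```

If a lookup takes many seconds before failing, a slow or unreachable resolver at the DR site is a plausible contributor, since that cost could be paid on each inbound connection.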
_________________
MQ & Broker admin

Copyright © MQSeries.net. All rights reserved.