MQSeries.net :: View topic - How big is your DLQ?


How big is your DLQ?



PeterPotkay

Posted: Tue Jan 28, 2014 2:00 pm Post subject: How big is your DLQ?

Poobah

Joined: 15 May 2001
Posts: 7723

In a shared environment (multiple apps sharing the QM), on a QM that has multiple channels to and from other QMs, how big do you make your QM's DLQ?

For the DLQ's Max Message Size you want to be able to DLQ that occasional 10 MB message that one app sends 2 of once a month.

For the DLQ's Max Q Depth you want to be able to DLQ the 100,000 little 500 byte messages that other app sends every hour.

So in a shared environment you are forced to make the one DLQ able to handle the big messages and the numerous messages. Either use case on its own is not a problem in the DLQ - one 10 MB or 100K of the 500 bytes - who cares.

And then along comes app C. They start pumping messages as fast as they can to their remote queue on QM1 aiming at QM2. And because you set their Max Q depth and Max Q size on QM2 properly, they quickly fill their queues and start spilling into the DLQ.

And now you see why I ask the question - given that you have to cater the DLQ to the occasional single big message and the occasional big group of tiny messages, the DLQ is really big, and this 3rd app has the ability to put a lot of data into the DLQ. Probably so much that the disk space fills up before the DLQ's Max Q depth is reached.

I could make the DLQ's Max Q Depth really low so if full of the biggest possible messages it won't fill disk, to protect against this 3rd app, but then that harmless batch of 100K tiny messages that we were able to DLQ easily will now cause the channel from the other QM to stop. Or, I could leave the Max Q Depth of the DLQ high and knock down the Max message Length, but then that one lonely 10 MB message that I was able to DLQ in the past will cause the channel to stop.

If you multiplied your DLQ's Max Q Depth times its Max message Size, what do you end up with? 1 GB? 10 GB? Did you just max it out at 999,999,999 messages of 100 MB and cross your fingers and toes?
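A minimal sketch of that multiplication, using the workloads from this post. The function names are made up for illustration, and the arithmetic ignores per-message header and queue-file overhead, so treat the result as a rough bound rather than what the queue manager would actually write to disk.

```python
MB = 1024 * 1024

def worst_case_bytes(max_depth: int, max_msg_len: int) -> int:
    """Upper bound on message data if every slot holds a maximum-size message."""
    return max_depth * max_msg_len

def fits(max_depth: int, max_msg_len: int, count: int, size: int) -> bool:
    """True if `count` messages of `size` bytes can all land on the queue."""
    return count <= max_depth and size <= max_msg_len

# Workloads described in the post, as (message count, message size in bytes).
big_monthly = (2, 10 * MB)       # two 10 MB messages once a month
small_hourly = (100_000, 500)    # 100,000 messages of 500 bytes every hour

# A DLQ sized to accept either workload on its own.
depth, length = 100_000, 10 * MB
print("accepts big_monthly: ", fits(depth, length, *big_monthly))
print("accepts small_hourly:", fits(depth, length, *small_hourly))
print(f"worst case: {worst_case_bytes(depth, length) / 2**30:.1f} GiB")
```

With these assumed settings each legitimate workload needs at most about 50 MB, yet the queue as defined admits close to a terabyte, which is the gap the 3rd app can exploit.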

How do you protect against this 3rd app? You can do all you want with setting this app's Max Q Depths and Max Message Sizes, but nothing prevents the app from sending unlimited numbers of messages that are < Max Message Size of their SVRCONN channel as long as they fit into the XMITQ's Max Message Size. And then they can swamp the remote QM's DLQ.

I can set artificially low values for Size and Depth on the DLQ and the XMITQs to push the failure 'up the chain' until the problem app gets a failed MQPUT because the XMITQ behind their Remote Q def is full, but now I'm setting artificially low limits for all other well behaved apps and causing premature QM to QM channel hard stops to prevent that one app from filling a DLQ and then an XMITQ.

At MQ 7.1 I can set up a dedicated set of QM to QM channels with dedicated XMITQs for this 3rd app and set their channels to not use a DLQ and set their queues and XMITQs to an artificially low limit. That way when they fill things up they are only impacting themselves. But that's a one off and sets a bad example. Pretty soon I'm doing this for every app and I have a million SNDR/RCVR channels. Doesn't scale.

I wish we could throttle a SVRCONN channel to limit the number of bytes or number of messages an app could inject into the MQ layer per hour or per day.
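To make the wish concrete: this is not an existing MQ feature, just a sketch of the kind of limit being asked for, written as a token bucket. The class and parameter names are hypothetical. The `cost` argument can be 1 per message to cap message counts, or the message length to cap bytes.

```python
import time

class TokenBucket:
    """Allow work at a steady rate, with a bounded burst."""

    def __init__(self, rate_per_sec: float, capacity: float, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity        # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self, cost: float = 1) -> bool:
        """Return True and spend `cost` tokens if the put is within the limit."""
        now = self.clock()
        # Refill for the time elapsed since the last call, up to capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False

# Example: roughly 100,000 messages per hour with a burst of 1,000.
bucket = TokenBucket(rate_per_sec=100_000 / 3600, capacity=1_000)
accepted = sum(bucket.allow() for _ in range(5_000))
print(f"accepted {accepted} of 5000 puts in one burst")
```

A rejected put here would correspond to the app getting a failure (or a delay) at its own SVRCONN, before the messages ever reach an XMITQ or a remote DLQ.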
_________________
Peter Potkay
Keep Calm and MQ On

exerk

Posted: Tue Jan 28, 2014 10:16 pm Post subject:

Jedi Council

Joined: 02 Nov 2006
Posts: 6339

This could be an argument for class-of-service queue managers, e.g. large persistent message queue manager, small non-persistent message queue manager etc. In the case of the latter, setting NPMSPEED(FAST) on the channels and setting the receiving queue manager's channels to not use the DLQ would be one way of solving a potential DLQ flood. I'm sure there will be dissenting voices; "...why have n queue managers to do the job of one?..." immediately springs to mind.

Peter, great post as I think this will generate a lot of useful discussion, thank you!
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.

Mr Butcher

Posted: Wed Jan 29, 2014 12:19 am Post subject:

Padawan

Joined: 23 May 2005
Posts: 1716

I am on z/OS, so I am able to use a separate pageset for the DLQ. This prevents DLQ messages filling up space used for the application or system queues.

In addition, I use a project specific DLQ handling in test environments. It works like this:

* From the common DLQ, the DLQ handler (always running) sorts incoming messages by project (identified by queue names / qmgr names / flq of queues) into temporary (not in terms of MQ temporary queues) project-related local queues.

* These temporary project-related local queues are triggered, and a batch job is started that sends a notification email to the project owners (and of course MQSeries administration in cc), including some of the messages dumped in a readable format with the DLQ header.

* The batch job also moves the messages from the temporary project-related local queue to a "permanent" project-related local queue and adds an expiration to the messages. These queues are limited by depth, and if the queue is full the remaining messages from the temporary queue get purged.

Assuming the DLQ is big enough and the DLQ handler and the triggered batch jobs are fast enough, your system will never get flooded by DLQ messages, and you will still have enough DLQ space available for project A while project B already filled up its space.
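The steps above can be sketched roughly as follows. This is an in-memory illustration only, not the actual handler: the queue name prefixes, project names, and the `expiry_secs` field are all hypothetical, and Python deques stand in for the temporary and permanent project queues.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional

@dataclass
class DeadLetter:
    dest_queue: str                      # queue the message was originally aimed at
    body: bytes
    expiry_secs: Optional[int] = None    # None means the message never expires

def project_for(dest_queue: str, rules, default: str = "COMMON") -> str:
    """Pick a project from (prefix, project) rules; unmatched goes to the common DLQ."""
    for prefix, project in rules:
        if dest_queue.startswith(prefix):
            return project
    return default

def move_to_permanent(temp: deque, permanent: deque, max_depth: int, expiry_secs: int) -> int:
    """Drain the temporary queue into the depth-limited permanent queue.

    Messages that fit get an expiry stamped on them; the rest are purged.
    Returns the number of purged messages.
    """
    purged = 0
    while temp:
        msg = temp.popleft()
        if len(permanent) < max_depth:
            msg.expiry_secs = expiry_secs
            permanent.append(msg)
        else:
            purged += 1
    return purged

rules = [("PAY.", "PAYMENTS"), ("INV.", "INVENTORY")]
temp = deque(DeadLetter("PAY.REQUEST", b"x") for _ in range(5))
permanent = deque()
print(project_for("PAY.REQUEST", rules), move_to_permanent(temp, permanent, 3, 86_400))
```

The purge branch is what makes this unsuitable for production as described: a full project queue silently discards the overflow.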

Because of the expiration, I never have to clean up DLQ stuff manually again.

Because of the email notification, I never have to chase programmers / project leaders to notify them about error situations.

DLQ messages that do not fit into the DLQ rules will end up in a "common" DLQ and MQSeries administration is notified.

Of course, this is not used in production, because by design messages are purged when the permanent project-related DLQ is full. But it's perfect for test and simulation environments.

A similar approach without purging messages could be used for production. It still enables you to define project-related DLQ space; however, that's just another stage before things (your real DLQ) get filled up. There may be additional solutions, e.g. queue offloading to disk, that may help, or triggering additional actions from the batch job (e.g. stopping channels, put-disabling queues, ...), but I did not think much about that.

So in production, I have a "big" DLQ on a separate pageset. Big is big enough to deal with some big messages, or with many small messages, hoping not to have to deal with everything at the same time.
_________________
Regards, Butcher

zpat

Posted: Wed Jan 29, 2014 12:53 am Post subject:

Jedi Council

Joined: 19 May 2001
Posts: 5867
Location: UK

I would like to see a "pacing" option on svrconn channels to provide a minimum elapsed time (in milliseconds) between the start of the MQI operation and the end of it.

This could reduce the impact of looping (or just badly coded) applications that might otherwise make millions of MQI calls (e.g. MQGET to an empty queue) over the network in a short period.
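A sketch of what such pacing would do, written as a client-side wrapper since no such channel option exists. `paced` and `fake_mqget` are hypothetical names; the point is only that each call is padded out to a minimum elapsed time, which caps the call rate of a looping application.

```python
import time

def paced(min_elapsed_ms: float, clock=time.monotonic, sleep=time.sleep):
    """Decorator: make each call take at least `min_elapsed_ms` from start to end."""
    def decorator(op):
        def wrapper(*args, **kwargs):
            start = clock()
            try:
                return op(*args, **kwargs)
            finally:
                # Pad the call out to the minimum, even if the operation failed.
                remaining = min_elapsed_ms / 1000.0 - (clock() - start)
                if remaining > 0:
                    sleep(remaining)
        return wrapper
    return decorator

@paced(50)
def fake_mqget():
    """Stand-in for an MQGET against an empty queue that returns immediately."""
    return None

started = time.monotonic()
for _ in range(5):
    fake_mqget()
print(f"5 calls took {time.monotonic() - started:.2f}s")
```

With a 50 ms minimum, a tight loop is held to about 20 calls per second per connection instead of thousands.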
_________________
Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error.

smdavies99

Posted: Wed Jan 29, 2014 3:03 am Post subject:

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

I am of the opinion that any message appearing in the DLQ means that something very bad has happened.

If you use BOQ's for rejected messages AND have appropriate app specific monitoring limits in place then most of the time nothing should hit the BOQ.
I have even worked at companies where BOQ's were forbidden even in DEV.

That said, I usually set a very large QDepth and 100 MB maxmsgl for the BOQ.

I just looked at a system that has been up for 18 months without a restart and there is a grand total of 50 messages in the BOQ's and nothing on the DLQ.
All the BOQ messages are logged as 'incorrectly addressed' in the actual data, so there is nothing to worry about there, and the BOQ cleanup app for that subsystem will remove them later today.

With appropriate design and use of expiry times in messages you can make a system self managing. Before anyone says that won't work in their financial business, I know that but none of the messages I'm dealing with currently are finance related. Different business areas means different rules. There is no panacea.
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995

Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.

Michael Dag

Posted: Wed Jan 29, 2014 4:12 am Post subject:

Jedi Knight

Joined: 13 Jun 2002
Posts: 2607
Location: The Netherlands (Amsterdam)

I am in the 'middle'

Horses for courses, so separate low-volume large messages from high-to-medium-volume small messages. The DLQ should be empty in PROD, and sizing requirements should be evaluated to cater for queue-full situations. Bad coding practices or errors should be caught and fixed in DEV and TEST; appropriate acceptance criteria (technical tests of the programs involved) should eliminate most situations.

_________________
Michael

MQSystems Facebook page

mqjeff

Posted: Wed Jan 29, 2014 4:26 am Post subject:

Grand Master

Joined: 25 Jun 2008
Posts: 17447

Is the channel really going to keep up with the sending application?

Or are messages going to fill up the xmitq before they fill up the remote DLQ?

PeterPotkay

Posted: Wed Jan 29, 2014 7:19 am Post subject:

Poobah

Joined: 15 May 2001
Posts: 7723

Mr Butcher wrote:

Of course, this is not used in production, because by design messages are purged when the permanent project-related DLQ is full. But it's perfect for test and simulation environments.

A similar approach without purging messages could be used for production. It still enables you to define project-related DLQ space; however, that's just another stage before things (your real DLQ) get filled up. There may be additional solutions, e.g. queue offloading to disk, that may help, or triggering additional actions from the batch job (e.g. stopping channels, put-disabling queues, ...), but I did not think much about that.

So in production, I have a "big" DLQ on a separate pageset. Big is big enough to deal with some big messages, or with many small messages, hoping not to have to deal with everything at the same time.

Clever solution Mr B, at least for non Prod where automatically purging the messages regardless of their expiry or persistence is potentially an option. A bit of work to set up, but protection.

If we don't have the option to purge messages, and the DLQ in question is not z/OS, any and all queues are sharing the same disk space, so shuffling messages around doesn't buy you much if anything over making the DLQ as big as possible and firing off pagers when a message lands in the DLQ. And then we're right back to the point of my original post.

Boy, it would be great to be able to specify disk space for specific queues on mid tier. Create a 750 GB contact admin qTree, mount it to all your servers, mount all DLQs in that environment to aim at this common chunk of storage. Odds of multiple DLQs filling up with a ton of messages are minimal, but at any one time any one DLQ would have a TON of space.

An RFE is about to be created

_________________
Peter Potkay
Keep Calm and MQ On

PeterPotkay

Posted: Wed Jan 29, 2014 7:25 am Post subject:

Poobah

Joined: 15 May 2001
Posts: 7723

zpat wrote:

I'm going to create an RFE for a throttling option on MQ channels, QM to QM but especially SVRCONNs.

I will add this 'pacing' consideration.

I will share the RFE link when I have it.

Just like a bouncer in a club decides who gets in, they also control how many and how fast they get in. As owner of the MQ Club, I want an MQ Bouncer assigned to each MQ on-ramp into my club.

My audience yesterday was visibly wrinkling their noses and raising their eyebrows when I said I have no way in the MQ Infrastructure layer of throttling or protecting against a legitimate app sending more messages than we planned for, based on what they planned for.
_________________
Peter Potkay
Keep Calm and MQ On

PeterPotkay

Posted: Wed Jan 29, 2014 7:29 am Post subject:

Poobah

Joined: 15 May 2001
Posts: 7723

smdavies99 wrote:

With appropriate design and use of expiry times in messages you can make a system self managing. Before anyone says that won't work in their financial business, I know that but none of the messages I'm dealing with currently are finance related. Different business areas means different rules. There is no panacea.

I don't think the type of business matters, if you don't have the option to just purge messages when things get full.

Backout queues don't come into play if the problem scenario is App A on QM1 sending way too many messages to App B on QM2. Eventually the DLQ or disk fills on QM2, the channel stops, the XMITQ fills on QM1. Finally App A gets a q full on its MQPUT. Long before then other apps are impacted because channels have stopped, or have severely slowed down as they dropped into their Message Retry loop. You hope for that. The alternative is that the QM goes down because the disk filled up.
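The cascade can be sketched as a toy simulation. It is deliberately simplified and makes several assumptions: nothing consumes on QM2, the channel moves messages instantly while it is running, message retry and disk limits are ignored, and depths are counted in messages only. The function and parameter names are made up for illustration.

```python
def simulate(puts: int, xmitq_max: int, dest_max: int, dlq_max: int):
    """App A on QM1 puts `puts` messages aimed at a queue on QM2.

    Returns (dest_depth, dlq_depth, xmitq_depth, first_failed_put), where
    first_failed_put is the number of the first MQPUT that gets a queue-full
    error, or None if every put is accepted.
    """
    dest = dlq = xmitq = 0
    channel_running = True
    first_failed = None
    for n in range(1, puts + 1):
        if xmitq >= xmitq_max:
            first_failed = n          # only now does App A notice anything
            break
        xmitq += 1
        # While running, the channel drains the XMITQ to QM2.
        while channel_running and xmitq > 0:
            if dest < dest_max:
                dest += 1             # normal delivery
            elif dlq < dlq_max:
                dlq += 1              # destination full: spill into the DLQ
            else:
                channel_running = False   # DLQ full too: channel stops
                break
            xmitq -= 1
    return dest, dlq, xmitq, first_failed

print(simulate(puts=100, xmitq_max=10, dest_max=20, dlq_max=30))
```

In this run the app's first failed put is number 61, after the destination queue, the DLQ, and the XMITQ have all filled, and the channel (shared with other apps) stopped ten puts earlier.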
_________________
Peter Potkay
Keep Calm and MQ On

PeterPotkay

Posted: Wed Jan 29, 2014 7:30 am Post subject:

Poobah

Joined: 15 May 2001
Posts: 7723

Michael Dag wrote:

I am in the 'middle'

Agree with all that.

And yet you are still vulnerable to one app gettin' busy with your QM. Yes, after the fact fingers can be pointed to show they sent more than they said they were going to. I want to avoid it getting that far.
_________________
Peter Potkay
Keep Calm and MQ On

PeterPotkay

Posted: Wed Jan 29, 2014 7:35 am Post subject:

Poobah

Joined: 15 May 2001
Posts: 7723

mqjeff wrote:

Is the channel really going to keep up with the sending application?

Or are messages going to fill up the xmitq before they fill up the remote DLQ?

If the message retry count is set to zero that channel is going to offload the messages to the DLQ as fast as possible. In this case it likely will keep up. I'm gonna bet on MQ's speed over an App's speed.

We don't run with message retry set to zero, so we have some throttling between the QMs. But when this occurs the entire channel pauses and other innocent apps' messages are delayed.

Sure, you might say, well, this app is not a candidate for a shared MQ environment - give them their own QMs and Brokers. My point is any app has the potential of going nuts like this.
_________________
Peter Potkay
Keep Calm and MQ On

Michael Dag

Posted: Wed Jan 29, 2014 7:36 am Post subject:

Jedi Knight

Joined: 13 Jun 2002
Posts: 2607
Location: The Netherlands (Amsterdam)

PeterPotkay wrote:

Just like a bouncer in a club decides who gets in, they also control how many and how fast they get in. As owner of the MQ Club, I want an MQ Bouncer assigned to each MQ on-ramp into my club.

My audience yesterday was visibly wrinkling their noses and raising their eyebrows when I said I have no way in the MQ Infrastructure layer of throttling or protecting against a legitimate app sending more messages than we planned for, based on what they planned for.

like the analogy, but seriously... so many infra components that are not able to handle unpredictable load... messages don't come out of nowhere... so if some component seriously starts overproducing messages ... someone messed up or didn't inform you what they were doing... you can't blame the river for overflooding... I am from the Netherlands

_________________
Michael

MQSystems Facebook page

exerk

Posted: Wed Jan 29, 2014 12:08 pm Post subject:

Jedi Council

Joined: 02 Nov 2006
Posts: 6339

Michael Dag wrote:

...like the analogy, but seriously... so many infra components that are not able to handle unpredictable load... messages don't come out of nowhere... so if some component seriously starts overproducing messages ... someone messed up or didn't inform you what they were doing...

And when have you ever got accurate figures from a business unit as to what volumes, sizes, and frequency etc.?
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.

fjb_saper

Posted: Wed Jan 29, 2014 3:20 pm Post subject:

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20763
Location: LI,NY

@Peter....
Why not have Multiple DLQ's on a qmgr?
What I really mean is that you have a DLQ handler on the qmgr DLQ that forwards the messages depending on the source and type etc... to the "application DLQ" and if you keep the DLQ header you can also run a dlq handler on the "app" dlq ...

And of course set max msg length and max msg count on the "app" DLQ accordingly....
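A rough sketch of the forwarding decision, reading the dead-letter header (MQDLH) at the front of the message to find where it was originally headed. Assumptions to note: integers are taken as little-endian and text as ASCII (in practice the message descriptor's Encoding and CCSID say what they are, and z/OS would be EBCDIC), and the queue names and routing table are hypothetical. The whole message, header included, would be forwarded so a DLQ handler can still run on the "app" DLQ.

```python
import struct

# MQDLH fields in order: StrucId, Version, Reason, DestQName, DestQMgrName,
# Encoding, CodedCharSetId, Format, PutApplType, PutApplName, PutDate, PutTime.
# "<" = little-endian integers, which is an assumption for this sketch.
MQDLH = struct.Struct("<4sii48s48sii8si28s8s8s")

def parse_dlh(message: bytes) -> dict:
    """Extract the reason code and original destination from a dead-lettered message."""
    fields = MQDLH.unpack_from(message)
    if fields[0] != b"DLH ":
        raise ValueError("message does not start with an MQDLH")

    def text(raw: bytes) -> str:
        return raw.decode("ascii").rstrip(" \x00")

    return {
        "reason": fields[2],
        "dest_queue": text(fields[3]),
        "dest_qmgr": text(fields[4]),
        "payload": message[MQDLH.size:],
    }

def app_dlq_for(dest_queue: str, routes, default: str = "QM.DLQ.UNROUTED") -> str:
    """Map the original destination to an application DLQ by name prefix."""
    for prefix, app_dlq in routes:
        if dest_queue.startswith(prefix):
            return app_dlq
    return default

# Build a sample dead-lettered message (reason 2053 is MQRC_Q_FULL).
sample = MQDLH.pack(b"DLH ", 1, 2053, b"APPC.ORDERS.IN".ljust(48), b"QM2".ljust(48),
                    546, 819, b"MQSTR   ", 11, b"appc".ljust(28),
                    b"20140129", b"07350000") + b"hello"
info = parse_dlh(sample)
print(info["reason"], info["dest_queue"], app_dlq_for(info["dest_queue"], [("APPC.", "APPC.DLQ")]))
```

Each "app" DLQ can then carry its own max message length and max depth, so one application's flood fills only its own dead-letter space.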

Have fun

_________________
MQ & Broker admin
