MQSeries.net Forum Index » General Discussion » getting persistent messages from the queue is very slow

getting persistent messages from the queue is very slow
an4ous
Posted: Fri Jan 14, 2022 8:27 am    Post subject: getting persistent messages from the queue is very slow

Apprentice

Joined: 14 Jan 2022
Posts: 38

We run IBM MQ 9.1.0.7 LTS on a RHEL 7.9 virtual machine with 12 vCPUs and 32 GiB of RAM (MQ currently uses about 12 GiB; the rest is Linux cache), on SSD with a local ext4 file system and a 10 Gbit network link (actual utilization is a few Mbit per second).

We also have remote clients: Java applications built on Spring Boot with Camel.

These clients get persistent messages from an IBM MQ queue via JMS, transmitting them by HTTP to a remote system. The average message size is about 10 Kbyte.

But getting messages from the IBM MQ queue and transmitting to the remote system is very slow: only ~75-150 messages per second, while up to 500 messages per second are put to the queue, so the queue grows very fast.

We do not observe heavy disk load on the MQ host, but CPU utilization reaches up to 100%. Increasing the number of clients (or threads in one client) does not increase the get rate; it only increases the load average on the MQ server in proportion to the number of MQ clients.

How can we speed up getting messages from the queue?

We asked IBM about this. They checked the runmqras dump from our queue manager, and their answer was:
Possible solutions to that are:
- If getters can act in parallel, start more getters on the same queue.
- If getters are taking longer than necessary to process each message (eg. database updates?), address this in the getter and/or the database.
- If putters can be made to go slower, do so.
- If putters can tolerate queue-full errors, set a lower MAXDEPTH on the queue, then the putters will have to stop and pause when they have filled the queue.
- Use non-persistent messages outside syncpoints wherever this can be tolerated by the application designers. Non-persistent messages will result in much less disk activity, but are more susceptible to being lost on app or VM failure.

Is it possible to identify the bottleneck (hardware, ibm mq server, mq remote clients or target remote system) which makes receiving messages slow?
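
For a sense of scale, the rates quoted in this post imply the backlog growth computed here. This is only a back-of-the-envelope sketch in Python: the function names are illustrative, "10 Kbyte" is assumed to mean 10,000 bytes, and the put rate is assumed to stay at its 500 messages per second peak.

```python
def backlog_growth_per_second(put_rate: float, get_rate: float) -> float:
    """Messages added to the queue depth each second (never negative)."""
    return max(put_rate - get_rate, 0.0)


def backlog_bytes_per_second(put_rate: float, get_rate: float,
                             avg_message_bytes: int) -> float:
    """Approximate payload bytes accumulating on the queue each second."""
    return backlog_growth_per_second(put_rate, get_rate) * avg_message_bytes


if __name__ == "__main__":
    AVG_MESSAGE_BYTES = 10_000  # assumption: "10 Kbyte" taken as 10,000 bytes
    PEAK_PUT_RATE = 500         # messages per second, from the post
    for get_rate in (75, 150):  # observed get-rate range, from the post
        growth = backlog_growth_per_second(PEAK_PUT_RATE, get_rate)
        mb = backlog_bytes_per_second(PEAK_PUT_RATE, get_rate,
                                      AVG_MESSAGE_BYTES) / 1e6
        print(f"get={get_rate}/s: depth grows {growth:.0f} msg/s (~{mb:.2f} MB/s)")
```

Under those assumptions the queue gains 350 to 425 messages per second at peak, roughly 3.5 to 4.25 MB of payload per second, so the getters need to sustain at least the put rate before the depth can stop growing.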
exerk
Posted: Fri Jan 14, 2022 9:59 am

Jedi Council

Joined: 02 Nov 2006
Posts: 6339

Are you able to instrument the getting applications, to see where the latency is? If the getting application is waiting on downstream resources (item two in your list above) there is very little that can be done in MQ to address it, other than slowing the put rate.
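
One way to instrument the getter is to time each stage of its loop separately. The following is a minimal Python sketch of the idea, not the poster's Camel/JMS code: `get_message`, `send_message` and `commit` are hypothetical callables standing in for the JMS receive, the HTTP send to the remote system, and the session commit.

```python
import time
from dataclasses import dataclass


@dataclass
class StageTimings:
    """Accumulated wall-clock seconds per stage of the consume loop."""
    get_seconds: float = 0.0
    send_seconds: float = 0.0
    commit_seconds: float = 0.0
    messages: int = 0


def run_timed_loop(get_message, send_message, commit, max_messages,
                   clock=time.perf_counter):
    """Consume up to max_messages, timing each stage separately.

    get_message, send_message and commit are hypothetical stand-ins for
    the JMS receive, the HTTP send and the session commit.
    """
    timings = StageTimings()
    for _ in range(max_messages):
        t0 = clock()
        message = get_message()
        t1 = clock()
        if message is None:  # queue empty: stop without counting this get
            break
        send_message(message)
        t2 = clock()
        commit()
        t3 = clock()
        timings.get_seconds += t1 - t0
        timings.send_seconds += t2 - t1
        timings.commit_seconds += t3 - t2
        timings.messages += 1
    return timings
```

If `send_seconds` dominates, the time is going downstream of MQ; if `get_seconds` or `commit_seconds` dominates, the queue manager side deserves the closer look.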
_________________
It's puzzling, I don't think I've ever seen anything quite like this before...and it's hard to soar like an eagle when you're surrounded by turkeys.
bruce2359
Posted: Fri Jan 14, 2022 2:12 pm    Post subject: Re: getting persistent messages from the queue is very slow

Poobah

Joined: 05 Jan 2008
Posts: 9394
Location: US: west coast, almost. Otherwise, enroute.

an4ous wrote:
These clients get persistent messages from an IBM MQ queue via JMS, transmitting them by HTTP to a remote system. The average message size is about 10 Kbyte.


an4ous wrote:
But getting messages from the IBM MQ queue and transmitting to the remote system is very slow: only ~75-150 messages per second, while up to 500 messages per second are put to the queue, so the queue grows very fast.

How are these two sentences different?

Are you saying that you have two different applications, AND that one of them behaves well, but the other behaves slowly? Or, is this the same application getting from two different queues?

The first sentence says 'transmitting them by http to remote system.' The second merely says 'transmitting to remote system.' How are these different?

What is the remote system? Is it the same remote system? Is the remote system an MQ qmgr?

The first sentence mentions persistent messages. The second sentence doesn't specify persistent.
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live.
PeterPotkay
Posted: Fri Jan 14, 2022 6:34 pm    Post subject: Re: getting persistent messages from the queue is very slow

Poobah

Joined: 15 May 2001
Posts: 7717

an4ous wrote:

We do not observe heavy disk load on the MQ host, but CPU utilization reaches up to 100%.

Specifically what processes on the server are responsible for the high CPU utilization? Don't guess or assume. The answer will be an important clue.

an4ous wrote:

clients get persistent messages from an IBM MQ queue via JMS, transmitting them by HTTP to a remote system.

Prove to me (and yourself) that the http transmission to the remote system is not the part that is slow. Don't guess or assume - prove it.
_________________
Peter Potkay
Keep Calm and MQ On
an4ous
Posted: Fri Jan 14, 2022 10:32 pm

Apprentice

Joined: 14 Jan 2022
Posts: 38

Thanks for all the answers.


Top 10 processes by CPU load:
Code:

 ps aux | sort -nrk 3,3 | head -n 10
mqm       8650 26.3  9.3 9828272 3073504 ?     Sl    2021 10947:29 /opt/mqm/bin/amqzlaa0 -mMQMGR -fip17
mqm      32744 25.0  9.1 9837568 3015544 ?     Sl    2021 11911:17 /opt/mqm/bin/amqzlaa0 -mMQMGR -fip14
mqm       8255 24.2  9.3 9827352 3074876 ?     Sl    2021 10095:37 /opt/mqm/bin/amqzlaa0 -mMQMGR -fip16
mqm       2563 21.6  9.3 9827700 3070660 ?     Sl    2021 10295:36 /opt/mqm/bin/amqzlaa0 -mMQMGR -fip15
mqm      24695 16.3  9.4 9827092 3091732 ?     Sl    2021 8283:09 /opt/mqm/bin/amqzlaa0 -mMQMGR -fip10
mqm      18682 15.1  9.3 9827108 3085912 ?     Sl    2021 7664:38 /opt/mqm/bin/amqzlaa0 -mMQMGR -fip9
mqm       9546 15.0  9.2 9832932 3030792 ?     Sl    2021 7650:13 /opt/mqm/bin/amqzlaa0 -mMQMGR -fip13
mqm       7266 14.7  9.1 9826880 2997436 ?     Sl    2021 7449:58 /opt/mqm/bin/amqzlaa0 -mMQMGR -fip12
mqm      27261 14.5  9.3 9827140 3079592 ?     Sl    2021 7394:26 /opt/mqm/bin/amqzlaa0 -mMQMGR -fip11
mqm      13921 10.5  9.2 9827084 3045136 ?     Sl    2021 5343:29 /opt/mqm/bin/amqzlaa0 -mMQMGR -fip7
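
A listing like this can be totalled per executable to see which program accounts for the CPU. The sketch below is an illustrative Python helper (the function name is made up for this example) that sums the %CPU column of `ps aux` output. Note that procps `ps` reports %CPU as CPU time divided by the time the process has been running, so it is a lifetime average rather than the current load; a tool such as `top` shows recent usage.

```python
from collections import defaultdict
import os


def cpu_by_command(ps_aux_output: str) -> dict:
    """Sum the %CPU column of `ps aux` output per executable name."""
    totals = defaultdict(float)
    for line in ps_aux_output.splitlines():
        # ps aux has 11 columns; the 11th is the full command line
        fields = line.split(None, 10)
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip the header and malformed lines
        try:
            cpu = float(fields[2])
        except ValueError:
            continue  # %CPU column was not numeric
        executable = os.path.basename(fields[10].split()[0])
        totals[executable] += cpu
    return dict(totals)
```

For the ten rows above, the %CPU values sum to 183.2 for amqzlaa0, which is a little under two of the 12 vCPUs when averaged over the lifetime of those processes.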


Clients get persistent messages from the IBM MQ queue SPLUNK.Q via JMS and send them by HTTP to an external remote system (Splunk). The average size of THESE persistent messages is about 10 Kbyte. But getting persistent messages from THIS queue SPLUNK.Q and sending them to the remote system (Splunk) via these Camel clients is very slow, ~75-150 messages per second, while up to 500 persistent messages per second are put to THIS queue SPLUNK.Q, so the queue grows very fast.

I will ask our developers to check the latency of the transmission.
hughson
Posted: Sat Jan 15, 2022 1:12 am

Padawan

Joined: 09 May 2013
Posts: 1914
Location: Bay of Plenty, New Zealand

an4ous wrote:
Clients get persistent messages from the IBM MQ queue SPLUNK.Q via JMS and send them by HTTP to an external remote system (Splunk). The average size of THESE persistent messages is about 10 Kbyte. But getting persistent messages from THIS queue SPLUNK.Q and sending them to the remote system (Splunk) via these Camel clients is very slow, ~75-150 messages per second

Is this describing one getting application or two different getting applications? It's a bit confusing.
_________________
Morag Hughson @MoragHughson
IBM MQ Technical Education Specialist
Get your IBM MQ training here!
MQGem Software
an4ous
Posted: Sat Jan 15, 2022 1:54 am

Apprentice

Joined: 14 Jan 2022
Posts: 38

hughson wrote:
Is this describing one getting application or two different getting applications? It's a bit confusing.

I am sorry.

We are describing one getting application.

We send persistent messages from MQ to one Splunk server. (Identical) Camel application(s) simply move persistent messages from the MQ queue to Splunk. We tried from one to several Camel applications, which are actually instances of a single Camel application.

We run our Camel application in Kubernetes, so we can quickly scale the replica count from 1 to many. Running many replicas of the Camel application does not increase the rate at which messages move from MQ to Splunk; it only increases the load average on the MQ server (as does increasing the number of consumers in one replica of the Camel application).
zpat
Posted: Sat Jan 15, 2022 2:28 am

Jedi Council

Joined: 19 May 2001
Posts: 5849
Location: UK

Put your MQ transactional log files on the fastest possible storage. Consider reducing triple-write to single-write if the hardware offers enough checking.
_________________
Well, I don't think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error.
hughson
Posted: Sat Jan 15, 2022 2:30 am

Padawan

Joined: 09 May 2013
Posts: 1914
Location: Bay of Plenty, New Zealand

zpat wrote:
Put your MQ transactional log files on the fastest possible storage. Consider reducing triple-write to single-write if the hardware offers enough checking.

This will only help if MQ is the slow part.

I agree with what Peter said:-

PeterPotkay wrote:
an4ous wrote:

clients get persistent messages from an IBM MQ queue via JMS, transmitting them by HTTP to a remote system.

Prove to me (and yourself) that the http transmission to the remote system is not the part that is slow. Don't guess or assume - prove it.


Until you know where the slowdown is, there is no point in changing things.

Cheers,
Morag
Andyh
Posted: Sat Jan 15, 2022 3:06 am

Master

Joined: 29 Jul 2010
Posts: 237

Your output shows 10 amqzlaa0 processes, which suggests around 1000 connections to the queue manager. Is this queue manager primarily handling only this application?
Have you looked at ANY MQ statistics to see how the application is behaving? For example, if the getting application were reconnecting and reopening the queue for every message, then those costs could dominate the overall getting performance.
These statistics will also show whether the messages are being put and got inside or outside syncpoint. MQ concurrency suffers greatly when persistent messages are processed outside syncpoint.
Have you looked at the QSTATUS for the queues in question to see how many handles are typically open for input and output?
The runmqras output would have contained some basic MQ stats, so you could look at that (probably in a file called something like kernel.ldmpa) as a starting point. Looking at the ratios of the different MQI call counts will give you an idea of how much useful work (puts and gets) is happening relative to overhead (e.g. MQCONN, MQOPEN) and whether the persistent messages are being processed inside or outside syncpoint.
The amqsrua sample program can also be used to dynamically query various aspects of MQ performance, both at a queue manager and an individual queue level, including what I/O latency is being suffered when the persistent messages are hardened to the recovery log (which should be tiny with a local SSD).
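
The ratio check on MQI call counts can be sketched as follows. This is an illustrative Python helper, not part of any MQ tooling: the dictionary keys are simply MQI verb names chosen for the example and are not the field names of any particular MQ report, so the counts would have to be extracted by hand first.

```python
def overhead_per_get(counts: dict) -> dict:
    """Ratios of connect/open overhead and commits to MQGET calls.

    `counts` maps MQI verb names to call totals. The keys are
    illustrative; they are not the field names of any MQ report.
    """
    gets = counts.get("MQGET", 0)
    if gets == 0:
        raise ValueError("no MQGET calls recorded")
    return {
        "conn_per_get": counts.get("MQCONN", 0) / gets,
        "open_per_get": counts.get("MQOPEN", 0) / gets,
        "commit_per_get": counts.get("MQCMIT", 0) / gets,
    }
```

A `conn_per_get` or `open_per_get` close to 1 would be consistent with an application that reconnects or reopens the queue for every message. A `commit_per_get` close to 0 would suggest gets outside syncpoint, although large commit batches could also produce a low value.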
an4ous
Posted: Sat Jan 15, 2022 4:10 am

Apprentice

Joined: 14 Jan 2022
Posts: 38

Thanks for your help.

Quote:
Your output shows 10 amqzlaa0 processes, which suggests around 1000 connections to the queue manager. Is this queue manager primarily handling only this application ?

Yes, we have more than 1000 connections to this queue manager. About 60 other Camel applications put persistent messages to the queue for Splunk, and the special Camel application I wrote about gets these messages from the queue and sends them to the Splunk server. We also use other queues, but their contribution is minimal.

Quote:
Have you looked at ANY mq statistics to see how the application is behaving, for example if the getting application were reconnecting and reopening the queue for every message then those costs could dominate the overall getting performance.

Of course, I temporarily enabled statistics and accounting for the queue and viewed them with amqsmon, but I do not know exactly what to look for. The same goes for amqsrua.

Quote:
MQ concurrency suffers greatly when persistent messages are processed outside syncpoint.

Can adding the ImplSyncOpenOutput option to qm.ini help me? https://www.ibm.com/docs/en/ibm-mq/9.1?topic=multiplatforms-implicit-syncpoint

Quote:
The runmqras output would have contained some basic MQ stats, so you could look at that (probably in a file called something like kernel.ldmpa)

In my runmqras archive there is no kernel.ldmpa file or anything similar.

Quote:
which should be tiny with a local SSD

Our circular transaction log (along with the whole virtual machine) is on the SSD disks of the hypervisor host, but it is not very small. Can decreasing the log size increase performance? https://www.ibm.com/docs/en/ibm-mq/9.1?topic=csl-what-happens-if-i-make-my-log-too-large says that large logs do not affect performance, only space usage and the startup time of the MQ server.
bruce2359
Posted: Sat Jan 15, 2022 8:06 am

Poobah

Joined: 05 Jan 2008
Posts: 9394
Location: US: west coast, almost. Otherwise, enroute.

A fundamental question or two for you:

Is this issue new with v9.1.0.7? Did you have the same issue before 9.1.0.7?

Did this configuration behave well, then suddenly start misbehaving?

Is this a new configuration?

It is quite common to be asked to prove that MQ is NOT the cause of this or that problem.
Andyh
Posted: Sat Jan 15, 2022 8:23 am

Master

Joined: 29 Jul 2010
Posts: 237

All of what follows is from memory, and I'd very strongly suggest you verify it on a test system before doing anything on a production system.

Q. Can adding the ImplSyncOpenOutput option to qm.ini help me? https://www.ibm.com/docs/en/ibm-mq/9.1?topic=multiplatforms-implicit-syncpoint
A. ImplSyncOpenOutput only affects MQPUT (i.e. when the queue is opened for output). Getting persistent messages outside syncpoint never really makes sense. The option defaults to 2, indicating that if multiple hCons have the queue open for output concurrently then MQPUTs might be interpreted as an MQPUT;MQCMIT sequence. It's possible that reducing this to 1 might help, but I'd be tempted to try to understand how the apps accessing the relevant queues behave before making speculative changes.

Q. In my runmqras archive there is no kernel.ldmpa file or anything similar.
A. There are various flavours of runmqras that collect differing amounts of data. The standalone command

amqldmpa -m <QMGR> \
-c K \
-O 8 \
-f /var/mqm/errors/kernel.ldmpa \
-n 5 \
-s 60

will APPEND 5 reports (-n) to the file named by the -f parameter (which must be writable by the mqm userid) taken at 60 second intervals (-s).
Each of these reports includes global counters of all MQI calls since the queue manager started. The output from this command is intended for MQ internal debugging, but it includes all sorts of interesting internal counters which give some overall insight into how apps are driving the queue manager.
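
Because the counters are cumulative since queue manager start, the useful figure is the difference between consecutive reports divided by the interval. The sketch below shows that arithmetic in Python, assuming the counters have already been extracted into one dictionary per report; parsing the report file is not shown, and the counter names are whatever the caller pulled out, not anything defined by MQ.

```python
def interval_rates(snapshots: list, interval_seconds: float) -> list:
    """Per-second rates between consecutive cumulative-counter snapshots.

    Each snapshot is a dict of counter name -> total since queue manager
    start. Counter names are supplied by the caller; none are assumed.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    rates = []
    for previous, current in zip(snapshots, snapshots[1:]):
        rates.append({
            name: (current[name] - previous[name]) / interval_seconds
            for name in current
            if name in previous  # only counters present in both reports
        })
    return rates
```

Five reports taken 60 seconds apart yield four intervals, each giving calls per second for that minute, which can then be compared against the put and get rates the applications are expected to achieve.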

Q. Our circular transaction log (along with the whole virtual machine) is on the SSD disks of the hypervisor host, but it is not very small. Can decreasing the log size increase performance? https://www.ibm.com/docs/en/ibm-mq/9.1?topic=csl-what-happens-if-i-make-my-log-too-large says that large logs do not affect performance, only space usage and the startup time of the MQ server.
A. For circular logging, the log should ideally be large enough that secondary extents are not commonly being used. The messaging rates you are talking about are so low that the log size is unlikely to be an issue, but you should certainly look at the log latency (published in the amqsrua data for the recovery log) and ensure it's what you expect (SSD would typically be sub-millisecond; the field is reported in microseconds by amqsrua). For much higher workloads it can help if the recovery log fits into the RAID cache, or similar.
an4ous
Posted: Sat Jan 15, 2022 8:34 am

Apprentice

Joined: 14 Jan 2022
Posts: 38

bruce2359 wrote:
A fundamental question or two for you:

Is this issue new with v9.1.0.7? Did you have the same issue before 9.1.0.7?

Did this configuration behave well, then suddenly start misbehaving?

Is this a new configuration?

It is quite common to be asked to prove that MQ is NOT the cause of this or that problem.


We migrated from IBM MQ 8.0 to MQ 9.1.0.7 LTS in April 2021. We started to notice real performance problems in November 2021.
bruce2359
Posted: Sat Jan 15, 2022 9:15 am

Poobah

Joined: 05 Jan 2008
Posts: 9394
Location: US: west coast, almost. Otherwise, enroute.

an4ous wrote:
bruce2359 wrote:
A fundamental question or two for you:

Is this issue new with v9.1.0.7? Did you have the same issue before 9.1.0.7?

Did this configuration behave well, then suddenly start misbehaving?

Is this a new configuration?

It is quite common to be asked to prove that MQ is NOT the cause of this or that problem.


We migrated from IBM MQ 8.0 to MQ 9.1.0.7 LTS in April 2021. We started to notice real performance problems in November 2021.

What changed in the November timeframe? Please don’t say ‘nothing changed.’