Author |
Message
|
yanaK |
Posted: Mon Jun 08, 2020 10:16 pm Post subject: |
|
|
Acolyte
Joined: 28 May 2020 Posts: 69
|
20480 for 5000 - which process in MQ opens so many files? |
|
|
|
bruce2359 |
Posted: Tue Jun 09, 2020 8:28 am Post subject: |
|
|
Poobah
Joined: 05 Jan 2008 Posts: 9442 Location: US: west coast, almost. Otherwise, enroute.
|
yanaK wrote: |
20480 for 5000 - which process in MQ opens so many files? |
A variety of things, including Socket calls. If client connections are failing, as you describe, each reconnect attempt will drive demand for more resources.
I agree with my worthy colleague that you need to get formal hands-on training in MQ. IBM offers course WM154G IBM MQ V9 System Administration (using Linux for labs), and an equivalent course for Windows. Training will provide the MQ basics, and give you skills in basic problem determination.
https://www.ibm.com/services/learning/course/WM154G
_________________
I like deadlines. I like to wave as they pass by.
ב''ה
Lex Orandi, Lex Credendi, Lex Vivendi. As we Worship, So we Believe, So we Live. |
|
|
|
gbaddeley |
Posted: Tue Jun 09, 2020 3:49 pm Post subject: |
|
|
Jedi Knight
Joined: 25 Mar 2003 Posts: 2527 Location: Melbourne, Australia
|
PeterPotkay wrote: |
We wanted to be able to get to 5000 concurrent client connections.
We upped nofile to 20480 to be able to do this in our environment.
Stop the QM
Increase nofile to 20480 for the mqm account
Log out of mqm
Log back into mqm
Start up the queue manager
Test
We wrote a test harness to launch as many client connections as we needed, and through trial and error we arrived at a nofile of 20480 to reach 5000 client connections.
Your numbers may vary. |
AFAIK, the nofile limit covers all types of UNIX file descriptors (sockets and pipes included), not just handles for open disk files.
_________________
Glenn |
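To illustrate the point about descriptor types, here is a minimal Python sketch, assuming a Linux host where `/proc/<pid>/fd` is readable for the process in question. The function names `classify_fd` and `count_fds` are made up for this example and are not part of MQ. It tallies a process's open descriptors by kind, which shows how much of the nofile budget goes to sockets rather than disk files.

```python
import os
import stat
from collections import Counter


def classify_fd(mode: int) -> str:
    """Map an st_mode value to a coarse descriptor type."""
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "char device"
    return "other"


def count_fds(pid="self") -> Counter:
    """Count the open descriptors of a process by type (Linux /proc only)."""
    fd_dir = f"/proc/{pid}/fd"
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            # stat() follows the /proc symlink to the object behind the fd.
            mode = os.stat(os.path.join(fd_dir, name)).st_mode
        except OSError:
            # The descriptor was closed between listdir() and stat().
            continue
        counts[classify_fd(mode)] += 1
    return counts
```

Calling `count_fds(pid)` for a queue manager process (as mqm or root) returns a `Counter` keyed by type. Every entry in it counts against nofile, whatever its type.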
|
|
|
yanaK |
Posted: Tue Jun 09, 2020 4:14 pm Post subject: |
|
|
Acolyte
Joined: 28 May 2020 Posts: 69
|
I see this in system log:
Code: |
Jun 9 16:44:41 z4321 MQSeries: FFST (23,35,40,536895769) failed: /var/mqm/errors/AMQ16273.0.FDC (errno=0)
Jun 9 16:44:41 z4321 MQSeries: FFST record created in /var/mqm/errors/AMQ16273.0.FDC
Jun 9 16:44:41 z4321 MQSeries: FFST (29,0,66,536895768) failed: /var/mqm/errors/AMQ16273.0.FDC (errno=0)
Jun 9 16:44:41 z4321 MQSeries: FFST record created in /var/mqm/errors/AMQ16273.0.FDC
Jun 9 16:44:42 z4321 agent: 2020-06-09 16:44:42 PDT | ERROR | (domain_forwarder.go:106 in retryTransactions) | Dropped 6 transactions in this retry attempt: 6 for exceeding the retry queue size limit of 30, 0 because the workers are too busy |
I checked that FDC file and this is the header:
Code: |
+-----------------------------------------------------------------------------+
| |
| WebSphere MQ First Failure Symptom Report |
| ========================================= |
| |
| Date/Time :- Tue June 09 2020 16:45:26 PDT |
| UTC Time :- 1591746326.432704 |
| UTC Time Offset :- -420 (PST) |
| Host Name :- z4231 |
| Operating System :- Linux 3.10.0-1127.el7.x86_64 |
| PIDS :- 5724H7230 |
| LVLS :- 7.1.0.7 |
| Product Long Name :- WebSphere MQ for Linux (x86-64 platform) |
| Vendor :- IBM |
| Installation Path :- /opt/mqm |
| Installation Name :- Installation1 (1) |
| Probe Id :- ZL000066 |
| Application Name :- MQM |
| Component :- zlaMain |
| SCCS Info :- lib/zu/amqlrcaa.c, 1.1.1.1 |
| Line Number :- 66 |
| Build Date :- Nov 4 2015 |
| CMVC level :- p710-007-151104 |
| Build Type :- IKAP - (Production) |
| Effective UserID :- 3732 (mqm) |
| Real UserID :- 3732 (mqm) |
| Program Name :- amqzlaa0 |
| Arguments :- -mMQPS3 -fip188 |
| Addressing mode :- 64-bit |
| LANG :- en_US.UTF-8 |
| Process :- 16273 |
| Process(Thread) :- 16273 |
| Thread :- 1 |
| ThreadingModel :- PosixThreads |
| QueueManager :- MQPS3 |
| UserApp :- FALSE |
| ConnId(1) IPCC :- 1109672 |
| ConnId(2) QM :- 43487 |
| Last HQC :- 1.0.0-2956608 |
| Last HSHMEMB :- 2.1.24-40888 |
| Major Errorcode :- xecF_E_UNEXPECTED_RC |
| Minor Errorcode :- xecP_E_PROC_LIMIT |
| Probe Type :- MSGAMQ6118 |
| Probe Severity :- 2 |
| Probe Description :- AMQ6118: An internal WebSphere MQ error has occurred |
| (20006026) |
| FDCSequenceNumber :- 131 |
| Arith1 :- 536895526 (0x20006026) |
| |
+-----------------------------------------------------------------------------+
|
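A rough sketch of pulling the fields out of an FDC header like the one above, assuming the `| Key :- Value |` layout shown there. The helper name `parse_fdc_header` is hypothetical and this is not an IBM tool. It makes it easy to pick out the Probe Id and the Major/Minor Errorcode across many FDC files. The Minor Errorcode here, `xecP_E_PROC_LIMIT`, looks by its name like a process-level resource limit, though that reading is an assumption.

```python
import re

# One header row: "| Key :- Value |". Border and blank rows do not match.
HEADER_FIELD = re.compile(
    r"^\|\s*(?P<key>[^|]+?)\s*:-\s*(?P<value>.*?)\s*\|?\s*$"
)


def parse_fdc_header(text: str) -> dict:
    """Return a dict of the 'Key :- Value' rows found in an FDC header."""
    fields = {}
    for line in text.splitlines():
        match = HEADER_FIELD.match(line.strip())
        if match:
            # Keep the first occurrence if a key repeats.
            fields.setdefault(match.group("key"), match.group("value"))
    return fields
```

With the header above as input, `parse_fdc_header(text)["Minor Errorcode"]` gives `xecP_E_PROC_LIMIT`.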
I doubled the nofile limit but didn't restart the queue manager. How important is the restart, given that the mqconfig output already shows the updated number?
Should I increase file-max too?
I also checked the client AMQERR01 log, and there was nothing in it today.
Whenever this happens, I cannot connect to the queue manager:
Quote: |
AMQ9508: Program cannot connect to the queue manager. |
|
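On the file-max question: nofile is a per-process limit, while fs.file-max is the system-wide ceiling on open file handles. A minimal Python sketch for checking how close the whole system is to that ceiling, assuming Linux, where `/proc/sys/fs/file-nr` holds three numbers (allocated handles, allocated-but-unused handles, and the maximum). The function names are illustrative only.

```python
def parse_file_nr(text: str) -> dict:
    """Parse the three whitespace-separated values of /proc/sys/fs/file-nr."""
    allocated, unused, maximum = (int(value) for value in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


def file_table_usage(text: str) -> float:
    """Fraction of the system-wide handle limit currently in use."""
    info = parse_file_nr(text)
    return (info["allocated"] - info["unused"]) / info["max"]


def read_file_nr(path: str = "/proc/sys/fs/file-nr") -> str:
    """Read the raw file-nr contents (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

If `file_table_usage(read_file_nr())` stays well below 1.0 while connections fail, the system-wide limit is probably not the bottleneck and raising file-max is unlikely to help on its own.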
|
|
|
PeterPotkay |
Posted: Tue Jun 09, 2020 5:40 pm Post subject: |
|
|
Poobah
Joined: 15 May 2001 Posts: 7717
|
yanaK wrote: |
I doubled the nofile limit but didn't restart the queue manager. How important is the restart? |
It's critical.
Stop the QM.
Change the value.
Log out of mqm and log back in as mqm to pick up the changed values.
Start the QM so the QM is running with the updated values.
Test.
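One way to see what a running process actually got, assuming Linux: the kernel exposes each process's limits in `/proc/<pid>/limits`. A minimal Python sketch (the function names are illustrative, not part of MQ) that pulls the soft and hard "Max open files" values out of that file:

```python
def parse_open_files_limit(limits_text: str):
    """Return (soft, hard) from the 'Max open files' row.

    Each value is an int, or None when the kernel reports 'unlimited'.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # The rest of the row is: soft limit, hard limit, units.
            soft, hard = line[len(label):].split()[:2]
            return tuple(
                None if value == "unlimited" else int(value)
                for value in (soft, hard)
            )
    raise ValueError("no 'Max open files' row found")


def open_files_limit(pid):
    """Read the limits of a running process (Linux /proc only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_open_files_limit(handle.read())
```

Run as mqm or root, `open_files_limit(pid)` for a queue manager process such as amqzlaa0 shows the values that process started with. If it still reports the old number after a change, the queue manager has not been restarted from a fresh login.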
Vote for my RFE to make it easier to know what settings a running queue manager is actually using.
https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=42528
_________________
Peter Potkay
Keep Calm and MQ On |
|
|
|
yanaK |
Posted: Tue Jun 09, 2020 6:17 pm Post subject: |
|
|
Acolyte
Joined: 28 May 2020 Posts: 69
|
Thanks @PeterPotkay
And done voting! |
|
|
|
yanaK |
Posted: Sun Jun 21, 2020 7:50 am Post subject: |
|
|
Acolyte
Joined: 28 May 2020 Posts: 69
|
An interesting thing happened here: today when I went in and ran mqconfig, I saw this warning:
Code: |
$ mqconfig
ksh bgnice:on IBM:off WARN |
I increased nofile for the mqm user from 10240 to 20480 and restarted the queue manager 10 days ago, and I never saw this warning before. What does it mean? |
|
|
|
fjb_saper |
Posted: Sun Jun 21, 2020 6:56 pm Post subject: |
|
|
Grand High Poobah
Joined: 18 Nov 2003 Posts: 20729 Location: LI,NY
|
yanaK wrote: |
An interesting thing happened here: today when I went in and ran mqconfig, I saw this warning:
Code: |
$ mqconfig
ksh bgnice:on IBM:off WARN |
I increased nofile for the mqm user from 10240 to 20480 and restarted the queue manager 10 days ago, and I never saw this warning before. What does it mean? |
You're running on Linux, yes? Why ksh and not bash?
_________________
MQ & Broker admin |
|
|
|
yanaK |
Posted: Tue Jun 23, 2020 12:54 pm Post subject: |
|
|
Acolyte
Joined: 28 May 2020 Posts: 69
|
What does that error mean? I am running bash, hence the confusion.
@PeterPotkay, do you know what the root cause was behind the nofile limits being exceeded? |
|
|
|
|