
MQSeries.net Forum Index » Challenge Forum » Challenge Question - 04 / 2008 - Week Two

Challenger
Posted: Sun Apr 06, 2008 2:03 pm    Post subject: Challenge Question - 04 / 2008 - Week Two

Centurion

Joined: 31 Mar 2008
Posts: 115

Answers to questions from Week One of the April 08 Challenge are summarized below. You may ask questions of the Challenger and you may ask the Challenger to run commands to give you more information. You may ask two questions per post. After your two questions are answered, you are free to post another two questions.

You may also ask questions about steps we took at the request of IBM tech support. For example: "Did L3 ask you to stand on your head, and what was the result?" "Why, yes, and it made the Challenger very dizzy."

Don't be discouraged! If we don't make progress in Week Two, the Challenger may take pity upon the forum and post a clue.


Original presenting problem:
After upgrade from MQ v5.3 CSD13 to v6.0.2.2, nine queue managers on one server started fine as 'mqm'. Later in the day, a tenth queue manager on the same server would not, and had to be started by a different user ID. OS=Solaris 9

Subsequent problem:
Troubleshooting revealed that two kernel updates had been missed during upgrade. These were corrected and the server rebooted (bringing all kernel parameters to IBM's specifications). After reboot, all ten queue managers were started immediately and all ten failed to start as 'mqm'.

Information and answers from Week One:
When the qmgr fails to start, we are instantly returned to a command prompt.
No stdout.
No FDCs.
No entries to queue manager AMQERRORx.LOG.

Migration steps were as specified in Quick Beginnings. Upgrade checklist had been used successfully on three previous servers of the same configuration. The Solaris utility "pkg" was used.

'setuid' option was properly accepted during installation.
With OAM disabled, qmgr still will not start.

Resource limits for 'mqm' are:
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 10000
vmemory(kbytes) unlimited

'mqm' is an NIS ID, and is a member of the mqm group.
'mqm' home directory is an NFS mount.
'mqm' is used successfully on other servers in the NIS domain.

'mqm' user and group have appropriate access to /opt/mqm and /var/mqm/.
Space on /var/mqm is more than sufficient for successful operation.

Solaris is properly running in 64-bit mode.
LIBPATH and LD_LIBRARY_PATH variables are not set for 'mqm'.

L3 analyzed a trace of strmqm and found a SIGBUS error. However, the Challenger feels the analysis was superficial, and that L3's trace analysis is not relevant to solving the problem.

Queue managers can be created and deleted as 'mqm'.

Environment: excerpts from 'mqm' environment.
HZ=
LC_COLLATE=en_US.ISO8859-1
LC_CTYPE=en_US.ISO8859-1
LC_MESSAGES=C
LC_MONETARY=en_US.ISO8859-1
LC_NUMERIC=en_US.ISO8859-1
PATH=.:/opt/sfw/bin:/opt/mqm/bin:/opt/mqm/samp/bin:/usr/bin:
SHELL=/bin/sh

Environment: excerpts from user that works.
_=/usr/bin/env
LC_MONETARY=en_US.ISO8859-1
PATH=/usr/bin:/usr/sbin:/opt/sfw/bin:/var/mqm:/var/wmqi:/opt/mqm/bin:/var/mqm/utilities:/opt/mqm/samp/bin:/usr/bin
LC_MESSAGES=C
LC_CTYPE=en_US.ISO8859-1
SHELL=/bin/ksh
HOME=/home/challenger
LC_COLLATE=en_US.ISO8859-1
LC_NUMERIC=en_US.ISO8859-1
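
A minimal sketch of how two environment dumps like the excerpts above could be compared line by line. It assumes each user's environment has been saved with `env > file`; the function name env_diff and the file names in the usage note are hypothetical.

```shell
# Hypothetical helper: show the NAME=value lines that differ between two
# saved "env" dumps. Sorting first makes variable order irrelevant.
env_diff() {
    sort "$1" > "$1.sorted"
    sort "$2" > "$2.sorted"
    # diff exits non-zero when the files differ; that is expected here.
    diff "$1.sorted" "$2.sorted"
}
```

For example, after running `env > /tmp/mqm.env` as 'mqm' and `env > /tmp/ok.env` as the working user, `env_diff /tmp/mqm.env /tmp/ok.env` lists only the variables that are set differently, such as PATH and SHELL in the excerpts above.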
gbaddeley
Posted: Sun Apr 06, 2008 4:22 pm

Jedi

Joined: 25 Mar 2003
Posts: 2492
Location: Melbourne, Australia

It's unusual that mqm directories are in the PATH, even for the mqm user. Normally on Solaris there are symlinks from /usr/bin to /opt/mqm/bin for the main MQ commands. Both users should be able to operate MQ with very simple PATHs, e.g. PATH=/usr/bin:/usr/sbin:/opt/sfw/bin:. and maybe with /var/mqm/utilities:/opt/mqm/samp/bin thrown on the end for good measure.

1. Are the links from /usr/bin set up correctly to /opt/mqm/bin ? (output of ls -l /opt/mqm | grep mq)

2. What difference does it make if mqm's PATH is changed to the user that works?
Gaya3
Posted: Sun Apr 06, 2008 7:50 pm

Jedi

Joined: 12 Sep 2006
Posts: 2493
Location: Boston, US

Please do the following activities

1. alias strmqm="env LIBPATH=/usr/mqm/lib64:$LIBPATH strmqm"
2. crtmqm sample
3. strmqm sample

and let me know the status/results of the same.

Regards
Gayathri
_________________
Regards
Gayathri
-----------------------------------------------
Do Something Before you Die
gbaddeley
Posted: Mon Apr 07, 2008 5:58 pm    Post subject: Re: Challenge Question - 04 / 2008 - Week Two

Jedi

Joined: 25 Mar 2003
Posts: 2492
Location: Melbourne, Australia

Challenger wrote:
Original presenting problem:
After upgrade from MQ v5.3 CSD13 to v6.0.2.2, nine queue managers on one server started fine as 'mqm'. Later in the day, a tenth queue manager on the same server would not, and had to be started by a different user ID. OS=Solaris 9

Subsequent problem:
Troubleshooting revealed that two kernel updates had been missed during upgrade. These were corrected and the server rebooted (bringing all kernel parameters to IBM's specifications). After reboot, all ten queue managers were started immediately and all ten failed to start as 'mqm'.

There are some clues here: 9 qmgrs were running OK, then later in the day it was not possible to start a 10th. After a reboot, none of them could be started. This suggests that something was changed during the first day, and all we have to do is determine what it was.

1. Did the system administrator or MQ administrator make any changes to the system configuration before the 10th qmgr failed to start?
2. What was it?
_________________
Glenn
Challenger
Posted: Mon Apr 07, 2008 8:01 pm

Centurion

Joined: 31 Mar 2008
Posts: 115

I apologize for the delay in responding, and plan to reply tomorrow. Sometimes work gets in the way of the fun stuff.

P.S. The questions are getting more germane as we progress. Good work.
Challenger
Posted: Tue Apr 08, 2008 1:48 pm

Centurion

Joined: 31 Mar 2008
Posts: 115

gbaddeley wrote:

1. Are the links from /usr/bin set up correctly to /opt/mqm/bin ? (output of ls -l /opt/mqm | grep mq)

2. What difference does it make if mqm's PATH is changed to the user that works?


Is 'ls -l /opt/mqm |grep mq' what you want to verify the link? Or is this what you want to see:
mqm:/home/mqm>ls -l /usr/bin | grep strmqm
lrwxrwxrwx 1 root other 19 Mar 1 09:56 strmqm -> /opt/mqm/bin/strmqm


If the path of 'mqm' is changed to be identical to the other user's path, it works! The 'mqm' user can start all ten queue managers.

So now you have a workaround. But why did it work, and what is the root cause? Does something still need to be fixed?
Challenger
Posted: Tue Apr 08, 2008 1:59 pm

Centurion

Joined: 31 Mar 2008
Posts: 115

Gaya3 wrote:
Please do the following activities

1. alias strmqm="env LIBPATH=/usr/mqm/lib64:$LIBPATH strmqm"
2. crtmqm sample
3. strmqm sample

and let me know the status/results of the same.

Regards
Gayathri


mqm:/home/mqm>alias strmqm="env LIBPATH=/usr/mqm/lib64:$LIBPATH strmqm"
mqm:/home/mqm>crtmqm sample
WebSphere MQ queue manager created.
Creating or replacing default objects for sample.
Default objects statistics : 40 created. 0 replaced. 0 failed.
Completing setup.
Setup completed.

mqm:/home/mqm>strmqm sample
mqm:/home/mqm>


New queue manager, 'sample', will not start as 'mqm'.
Challenger
Posted: Tue Apr 08, 2008 2:05 pm    Post subject: Re: Challenge Question - 04 / 2008 - Week Two

Centurion

Joined: 31 Mar 2008
Posts: 115

gbaddeley wrote:
There are some clues here: 9 qmgrs were running OK, then later in the day it was not possible to start a 10th. After a reboot, none of them could be started. This suggests that something was changed during the first day, and all we have to do is determine what it was.

1. Did the system administrator or MQ administrator make any changes to the system configuration before the 10th qmgr failed to start?
2. What was it?


There were no changes made to the system configuration. There were no changes made to any MQ configurations. In fact, I assure you that a comparison of all OS and MQ files 'before & after' would show them to be identical.

But you are correct, something was different. You are so close to the root cause.....
Gaya3
Posted: Tue Apr 08, 2008 8:55 pm

Jedi

Joined: 12 Sep 2006
Posts: 2493
Location: Boston, US

Please answer these questions:

1. For the NIS domain, did you create the IDs on the NIS master server? (Both the user ID and the group ID must be set to mqm.)
2. Is the NFS mount working properly, without any network failures (including setuid and root access)?

These are the two places where I can see the issue could be.

Regards
Gayathri
_________________
Regards
Gayathri
-----------------------------------------------
Do Something Before you Die
gbaddeley
Posted: Tue Apr 08, 2008 11:21 pm

Jedi

Joined: 25 Mar 2003
Posts: 2492
Location: Melbourne, Australia

Challenger wrote:
gbaddeley wrote:

1. Are the links from /usr/bin set up correctly to /opt/mqm/bin ? (output of ls -l /opt/mqm | grep mq)

2. What difference does it make if mqm's PATH is changed to the user that works?


Is 'ls -l /opt/mqm |grep mq' what you want to verify the link? Or is this what you want to see:
mqm:/home/mqm>ls -l /usr/bin | grep strmqm
lrwxrwxrwx 1 root other 19 Mar 1 09:56 strmqm -> /opt/mqm/bin/strmqm


If the path of 'mqm' is changed to be identical to the other user's path, it works! The 'mqm' user can start all ten queue managers.

So now you have a workaround. But why did it work, and what is the root cause? Does something still need to be fixed?


Sorry, I did actually mean ls -l /usr/bin, not /opt/mqm. Thanks, you confirmed that the symlink is OK for strmqm.

Environment excerpts from 'mqm'
PATH=.:/opt/sfw/bin:/opt/mqm/bin:/opt/mqm/samp/bin:/usr/bin:

Environment excerpts from user that works
PATH=/usr/bin:/usr/sbin:/opt/sfw/bin:/var/mqm:/var/wmqi:/opt/mqm/bin:/var/mqm/utilities:/opt/mqm/samp/bin:/usr/bin


When the strmqm command is entered, the shell searches the directories in the PATH from left to right. For the user that works, /usr/bin is first (and this is generally how it should be), and the symlink in here for strmqm points to /opt/mqm/bin/strmqm, where the real binary executable file is.

mqm's PATH is rather crazy. It contains "." (the current directory) first, so if there is a file in the current directory with the name strmqm the shell will try to run it.

Q1: Is there a strmqm file there?
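
The left-to-right search described above can be sketched as follows. This is only an illustration, assuming a POSIX-style shell such as ksh or bash (the `${var%%pattern}` expansions may not be available in the old Solaris /bin/sh); the function name which_all is hypothetical. It prints every executable match for a command name in search order, so a stray copy ahead of the real one shows up as the first line.

```shell
# Hypothetical helper: list every executable called $1 along the
# PATH-style string $2, in the order the shell would search it.
which_all() {
    cmd=$1
    rest=$2:                          # trailing ":" so the last entry is processed too
    while [ -n "$rest" ]; do
        dir=${rest%%:*}               # take the first entry
        rest=${rest#*:}               # and drop it from the remainder
        [ -z "$dir" ] && dir=.        # an empty entry means the current directory
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
        fi
    done
}
```

Running `which_all strmqm "$PATH"` as each user, from the directory where the start-up is attempted, would show whether anything is found before /usr/bin/strmqm.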

/opt/mqm/bin should not be in the PATH, as the correct way to get to MQ commands in here is via the symlinks in /usr/bin.

It's OK to have /opt/mqm/samp/bin in the PATH, as this saves some typing whenever the MQ sample programs need to be run (amqsput, amqsget, amqsbcg etc).

/usr/bin is way up the end of the PATH. It should really be up the front.

It can be argued that "." should not be in the PATH because it is a security & integrity risk. If someone slips a script or binary into your directory which has the same name as a command elsewhere, they can hijack what you are doing.
_________________
Glenn
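
A minimal sketch of removing current-directory entries from a PATH, assuming a POSIX-style shell; the function name clean_path is hypothetical. Note that an empty entry counts as the current directory too, so the trailing colon in mqm's PATH has the same effect as a second ".".

```shell
# Hypothetical helper: print a PATH-style string with "." and empty
# entries removed. The order of the remaining entries is unchanged.
clean_path() {
    result=
    old_ifs=$IFS
    IFS=:
    set -f                            # no filename expansion while splitting
    for dir in $1; do
        case $dir in
            ''|.) ;;                  # skip current-directory entries
            *) result=${result:+$result:}$dir ;;
        esac
    done
    set +f
    IFS=$old_ifs
    printf '%s\n' "$result"
}

clean_path ".:/opt/sfw/bin:/opt/mqm/bin:/opt/mqm/samp/bin:/usr/bin:"
# prints /opt/sfw/bin:/opt/mqm/bin:/opt/mqm/samp/bin:/usr/bin
```

This only removes the risky entries; it does not move /usr/bin to the front, which would still have to be done by hand in the user's profile.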
Challenger
Posted: Wed Apr 09, 2008 7:26 am

Centurion

Joined: 31 Mar 2008
Posts: 115

Gaya3 wrote:

1. For the NIS domain, did you create the IDs on the NIS master server? (Both the user ID and the group ID must be set to mqm.)
2. Is the NFS mount working properly, without any network failures (including setuid and root access)?


Yes, 'mqm' ID was created on the master NIS server and is a member of the mqm group. (In a previous post, I believe I posted the relevant lines from the NIS password and group file.)

No NFS errors or issues. The NIS user is successful on other servers.
Challenger
Posted: Wed Apr 09, 2008 7:54 am

Centurion

Joined: 31 Mar 2008
Posts: 115

We have a winner!

gbaddeley wrote:


Environment excerpts from 'mqm'
PATH=.:/opt/sfw/bin:/opt/mqm/bin:/opt/mqm/samp/bin:/usr/bin:

Environment excerpts from user that works
PATH=/usr/bin:/usr/sbin:/opt/sfw/bin:/var/mqm:/var/wmqi:/opt/mqm/bin:/var/mqm/utilities:/opt/mqm/samp/bin:/usr/bin


When the strmqm command is entered, the shell searches the directories in the PATH from left to right. For the user that works, /usr/bin is first (and this is generally how it should be), and the symlink in here for strmqm points to /opt/mqm/bin/strmqm, where the real binary executable file is.

mqm's PATH is rather crazy. It contains "." (the current directory) first, so if there is a file in the current directory with the name strmqm the shell will try to run it.

Q1: Is there a strmqm file there?

/opt/mqm/bin should not be in the PATH, as the correct way to get to MQ commands in here is via the symlinks in /usr/bin.

It's OK to have /opt/mqm/samp/bin in the PATH, as this saves some typing whenever the MQ sample programs need to be run (amqsput, amqsget, amqsbcg etc).

/usr/bin is way up the end of the PATH. It should really be up the front.

It can be argued that "." should not be in the PATH because it is a security & integrity risk. If someone slips a script or binary into your directory which has the same name as a command elsewhere, they can hijack what you are doing.


You are correct. There is a v5.3 copy of strmqm in /home/mqm. With "." first in the PATH, that outdated copy was executed whenever strmqm was run from that directory.

I inherited this system some months ago, and I must have looked at that copy of strmqm in /home/mqm a million times -- but never thought about that copy of strmqm.

On the day of the upgrade, I apparently was not in /home/mqm when the first nine queue managers were started, so the proper copy of strmqm was run.
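
The dependence on the working directory can be reproduced safely in a scratch directory. This is a sketch only: the sandbox layout and the command name strmqm_demo are hypothetical stand-ins, not real MQ files, and mktemp is assumed to be available.

```shell
# Build a throwaway sandbox with an "old" copy in a home-like directory
# and an "installed" copy in a bin-like directory.
sandbox=$(mktemp -d)
mkdir "$sandbox/home" "$sandbox/bin"
printf '#!/bin/sh\necho "old copy"\n' > "$sandbox/home/strmqm_demo"
printf '#!/bin/sh\necho "installed copy"\n' > "$sandbox/bin/strmqm_demo"
chmod +x "$sandbox/home/strmqm_demo" "$sandbox/bin/strmqm_demo"

# Run the command from directory $1 with "." first in PATH, in a
# subshell so the caller's directory and PATH are left untouched.
run_from() {
    ( cd "$1" && PATH=".:$sandbox/bin:$PATH" && strmqm_demo )
}

run_from "$sandbox/home"   # prints "old copy": the local file shadows the real one
run_from "$sandbox"        # prints "installed copy": nothing local to shadow it
```

The same command line gives a different result depending only on where it is typed, with no file on the system having changed.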

The problem was identified when L3 asked me to do a truss, which showed that v5.3 code was executing. Then they asked for "which strmqm" as both users. It was then that it became clear that strmqm was executing from "./strmqm".

Good work, gbaddeley. PM your shipping information to me, and a valuable mqseries.net coffee mug will be on its way to you!
Challenger
Posted: Wed Apr 09, 2008 8:18 am

Centurion

Joined: 31 Mar 2008
Posts: 115

Here is one more Challenge for you! Just for fun....

Can you guess my true identity from reading all that I've written? I post regularly to the forum, but I am not of jedi status.

If you can guess who I am, you'll receive a trinket from the Challenger's home town.

Who am I?
jefflowrey
Posted: Wed Apr 09, 2008 8:18 am

Grand Poobah

Joined: 16 Oct 2002
Posts: 19981

Challenger wrote:
There is a v5.3 copy of strmqm in /home/mqm.


Oh, man...

Great job everyone, great challenge!
_________________
I am *not* the model of the modern major general.
bbburson
Posted: Wed Apr 09, 2008 9:37 am

Partisan

Joined: 06 Jan 2004
Posts: 378
Location: Nowhere near a queue manager

Challenger wrote:
Here is one more Challenge for you! Just for fun....

Can you guess my true identity from reading all that I've written? I post regularly to the forum, but I am not of jedi status.

If you can guess who I am, you'll receive a trinket from the Challenger's home town.

Who am I?

My guess (and it is PURELY a guess) is Toronto_MQ.