Author |
Message
|
Vitor |
Posted: Thu Sep 07, 2006 1:14 am Post subject: |
|
|
Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
AH - idiot this end was looking in the 6.0 book, which has been rewritten somewhat. Got them now.
I think the best thing here is for us to agree that you're right and I should just sit in a corner for a while writing out "I must not comment on Java matters" 100 times. Maybe that field IS the CCSID of the MQMD - don't see how but what do I know about Java? Really?
As a wise man says - "Grand Master just means posts too much". In this case once too often. Apologies.  _________________ Honesty is the best policy.
Insanity is the best defence. |
deepu4u |
Posted: Thu Sep 07, 2006 1:17 am Post subject: |
|
|
Apprentice
Joined: 20 Jun 2005 Posts: 37
|
Sorry if I hurt your feelings.
Well, I'm working with version 5.3.
CCSID is a variable in the MQEnvironment class. |
Vitor |
Posted: Thu Sep 07, 2006 1:20 am Post subject: |
|
|
Grand High Poobah
Joined: 11 Nov 2005 Posts: 26093 Location: Texas, USA
|
No, no, my bad. Shouldn't talk on subjects I know nothing about! Certainly shouldn't have assumed you were on 6.0!! _________________ Honesty is the best policy.
Insanity is the best defence. |
fjb_saper |
Posted: Thu Sep 07, 2006 2:06 pm Post subject: |
|
|
Grand High Poobah
Joined: 18 Nov 2003 Posts: 20763 Location: LI,NY
|
deepu4u wrote: |
Sorry if I hurt your feelings.
Well, I'm working with version 5.3.
CCSID is a variable in the MQEnvironment class. |
Well, CCSID comes up a little bit everywhere, depending on what effect you are going for.
The conversion behavior of MQ is ruled by 3 fields in the MQMD: Format, CCSID and Encoding.
Encoding determines the way numeric information is stored (IEEE, big endian, little endian, etc.).
CCSID and Format rule text behavior.
The CCSID on the qmgr object, as specified in the MQEnvironment, tells the QMGR the CCSID of the client. This enables the qmgr to make the automatic translation of a message with format MQSTR from the MQMD CCSID to the client's CCSID.
If you leave the CCSID value at 0 for the client, the qmgr you are connected to will assume you have its own CCSID, as shown by "dis qmgr ccsid".
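To make the Encoding versus CCSID distinction concrete, here is a minimal sketch in plain Java. It uses no MQ classes at all, and the class and method names are made up for illustration. The only mapping it assumes is that CCSID 819 corresponds to ISO-8859-1 and CCSID 1208 to UTF-8. It shows that the same number has different byte layouts depending on byte order (what MQ calls Encoding), and the same text has different bytes depending on the character set (what MQ calls CCSID).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not part of any MQ API).
final class EncodingVsCcsid {

    // Serializes a 32-bit integer using the requested byte order.
    public static byte[] intBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Renders bytes as upper-case hex, two digits per byte.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Encoding" in the MQ sense: numeric representation.
        System.out.println(hex(intBytes(258, ByteOrder.BIG_ENDIAN)));    // 00000102
        System.out.println(hex(intBytes(258, ByteOrder.LITTLE_ENDIAN))); // 02010000

        // "CCSID" in the MQ sense: character set of text data.
        String text = "\u00E9"; // e with acute accent
        System.out.println(hex(text.getBytes(StandardCharsets.ISO_8859_1))); // E9   (CCSID 819)
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_8)));      // C3A9 (CCSID 1208)
    }
}
```

A receiver that reads these bytes with the wrong byte order or the wrong character set gets a different number or different text, which is why the MQMD carries both values alongside the Format.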
Enjoy  _________________ MQ & Broker admin |
deepu4u |
Posted: Thu Sep 07, 2006 8:43 pm Post subject: |
|
|
Apprentice
Joined: 20 Jun 2005 Posts: 37
|
Hi...
Please let me know if I got it wrong.
Let's say
Client Enc = c
QM Enc = q
Message Enc = m
So whenever a client puts a message on a queue, the QM would translate the character fields in the MQMD header from c to q, keeping the payload unchanged, i.e. keeping the character payload in m.
When a client gets a message from a queue, the QM will translate the MQMD fields of the message to the client encoding, i.e. from q to c.
Why would the QM translate the message payload when the MQMD has the encoding of the payload? Any application which wants to use this message can see the encoding from the MQMD.
Quote: |
CCSID on the qmgr object as specified in the MQEnvironment is to tell the QMGR the CCSID of the client. This will enable the qmgr to make the automatic translation for a msg with format MQSTR from the MQMD CCSID to the client's CCSID. |
fjb_saper |
Posted: Fri Sep 08, 2006 2:37 am Post subject: |
|
|
Grand High Poobah
Joined: 18 Nov 2003 Posts: 20763 Location: LI,NY
|
deepu4u wrote: |
Hi...
Please let me know if I got it wrong.
Let's say
Client Enc = c
QM Enc = q
Message Enc = m
So whenever a client puts a message on a queue, the QM would translate the character fields in the MQMD header from c to q, keeping the payload unchanged, i.e. keeping the character payload in m.
When a client gets a message from a queue, the QM will translate the MQMD fields of the message to the client encoding, i.e. from q to c.
Why would the QM translate the message payload when the MQMD has the encoding of the payload? Any application which wants to use this message can see the encoding from the MQMD. |
I understand you are using "encoding" as in the XML encoding= declaration. This translates into CCSID in MQ. MQ Encoding is a different animal: it applies only to numeric representation and not to character sets.
fjb_saper wrote: |
CCSID on the qmgr object as specified in the MQEnvironment is to tell the QMGR the CCSID of the client. This will enable the qmgr to make the automatic translation for a msg with format MQSTR from the MQMD CCSID to the client's CCSID. |
Because that is part of the MQ functionality if you request it (the convert option on MQGET). Not everybody has a character set translator...
If you happen to read the JMS spec (or the broker documentation), it says for SOAP over JMS that the JMS "encoding" (here, the CCSID) takes precedence over whatever is put into the SOAP/XML header.
Enjoy
 _________________ MQ & Broker admin |
simon.starkie |
Posted: Wed Sep 20, 2006 8:50 am Post subject: |
|
|
Disciple
Joined: 24 Mar 2002 Posts: 180
|
I got one of these also. My code was doing a "validate without schema" (i.e. external schema name is blank).
java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence.
at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.kp.nps.common.xmlutils.XMLParserImpl.validateXml(XMLParserImpl.java:157)
at org.kp.nps.common.xmlutils.XMLParserImpl.validateNoNamespaceWithInputFile(XMLParserImpl.java:253)
Last edited by simon.starkie on Sat Sep 23, 2006 9:07 am; edited 4 times in total |
simon.starkie |
Posted: Sat Sep 23, 2006 9:09 am Post subject: Yes, there was definitely a non-UTF-8 compliant byte! |
|
|
Disciple
Joined: 24 Mar 2002 Posts: 180
|
Well, the XML document definitely contains bad data.
Using XVI32, I saw an X'C2' in the middle of one of the spasRqrComment.commentText nodes.
Removal of the offending X'C2' byte from the XML message solved the
java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence
problem.
An X'C2' byte is the character "B" in EBCDIC on the mainframe. Since the data originated on a mainframe, it seems likely that this particular byte was not converted somewhere along the line before it was stored in the Oracle staging table that the middleware layer (my code) extracts from.
And X'C2' is apparently allowed in an Oracle column defined as VARCHAR2, so my SELECT statement receives the String containing the X'C2' byte, which goes into a Java VO. My code then transforms the VO to an XML document, which I then validate with Xerces, but the X'C2' byte raises the UTFDataFormatException.
So the short term solution was to remove the bad X'C2' byte from the Oracle table and re-run the middleware extract (actually, in this case, editing the XML document to remove the X'C2' and placing it back on the queue with RFHUTIL was quicker).
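For anyone wanting to catch this kind of byte before the parser does, here is a minimal sketch using only the standard java.nio.charset classes. The class and method names are hypothetical. It relies on the fact that in UTF-8 a byte X'C2' is the lead byte of a two-byte sequence and must be followed by a continuation byte in the range X'80' to X'BF', so a lone X'C2' followed by an ordinary ASCII character is malformed.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Utf8Check {

    // Returns the offset of the first byte that is not valid UTF-8,
    // or -1 if the whole array decodes cleanly.
    public static int firstInvalidOffset(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        // UTF-8 never yields more chars than input bytes, so this cannot overflow.
        CharBuffer out = CharBuffer.allocate(data.length + 1);
        CoderResult result = decoder.decode(in, out, true);
        // On error the input position is left at the start of the bad sequence.
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) {
        byte[] bad = { 0x41, (byte) 0xC2, 0x42 }; // 'A', stray X'C2', 'B'
        System.out.println(firstInvalidOffset(bad)); // 1
    }
}
```

Running a check like this on the message bytes before handing them to Xerces would give the byte offset of the problem, which is more useful for problem determination than the parser's exception alone.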
The longer term solution for me will be to:
1. Try to persuade the owners of the upstream (from me) application to validate their commentString (varchar) field for valid characters before storing it in Oracle. In this case, the field involved was just freeform user defined actuarial comments, so there is no apparent need for special characters outside plain ASCII, such as umlauts, etc. A simple check for A-Z, a-z and 0-9 should be sufficient.
2. Enhance my middleware J2EE code to provide more information about exactly which field in the XML message fails during validation. This will help identify exactly which field in these large XML documents is involved during problem determination, should there be any future re-occurrences of this type of problem.
3. Enhance my middleware error management system so it can tolerate messages with invalid content without breaking the parser. This will probably involve wrapping the original message with CDATA tags. This change can be implemented via a new error management system JAR which all of the middleware apps use for exception processing. This should avoid System Exceptions in the error management layer, which currently percolate back to the middleware application layer and cause more serious problems, such as the MDB listener being stopped, as is required when System Exceptions are thrown. |
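Items 1 and 3 above could be sketched roughly as follows. This is only an illustration with hypothetical names, not the actual middleware code. The allowed character set is an assumption: A-Z, a-z and 0-9 as stated above, plus space and a few punctuation marks, since freeform comments presumably need them.

```java
// Hypothetical helper, for illustration only.
final class CommentTextGuard {

    // Assumed extra characters beyond A-Z, a-z, 0-9 (not from the original post).
    private static final String EXTRA_ALLOWED = " .,;:-";

    // Returns the index of the first character outside the allowed set,
    // or -1 if the whole text passes the check.
    public static int firstDisallowedIndex(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean ok = (c >= 'A' && c <= 'Z')
                    || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9')
                    || EXTRA_ALLOWED.indexOf(c) >= 0;
            if (!ok) {
                return i;
            }
        }
        return -1;
    }

    // Wraps text in a CDATA section. Any "]]>" inside the text would end the
    // section early, so it is split across two adjacent CDATA sections.
    public static String wrapInCdata(String raw) {
        return "<![CDATA[" + raw.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        System.out.println(firstDisallowedIndex("Rate ok\u00C2 now")); // 7
        System.out.println(wrapInCdata("a]]>b")); // <![CDATA[a]]]]><![CDATA[>b]]>
    }
}
```

One caveat on the CDATA idea: a CDATA section only stops markup characters such as < and & from being interpreted. It does not make malformed byte sequences or characters that are illegal in XML acceptable, so the byte-level problem would still need to be dealt with before wrapping.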
fjb_saper |
Posted: Sat Sep 23, 2006 10:30 pm Post subject: |
|
|
Grand High Poobah
Joined: 18 Nov 2003 Posts: 20763 Location: LI,NY
|
Well I wonder why that is....
Is Oracle sending UTF-8 data but not setting the CCSID of the message to 1208 ?
Is Oracle pretending to send UTF-8 data but has non UTF-8 stuff embedded in it?
Are you not requesting the data with CCSID 1208?
Please tell us what is going on.
Thanks.
F.J. _________________ MQ & Broker admin |