| Author | Message |
		
		  | jborella | 
		  
		    
			  
				 Posted: Tue Nov 02, 2010 12:56 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		   Apprentice
 
 Joined: 04 Jun 2009 Posts: 26
  
  | 
		  
		    
			  
				
   
	| kimbert wrote: | 
   
  
	If I have understood this correctly:
 
- the XML has an XML declaration which accurately describes the encoding of the XML document. 
 
- the FileInput node is not reading the XML declaration to determine the encoding. | 
   
 
 
Exactly.
 
   
	| kimbert wrote: | 
   
  
	| That may be a deliberate design decision aimed at making the FileInput node behave consistently with other nodes. | 
   
 
 
Can you elaborate on that? I'm not sure I understand what you mean.
 
   
	| kimbert wrote: | 
   
  
	| I think we should wait and see what the response to the PMR is. | 
   
 
 
Good idea, though I'm interested in your thoughts about consistency with other nodes. I've opened a PMR and am in the process of providing details of the problems I'm experiencing. | 
			   
			 
		   | 
		
		
		
		
		  | kimbert | 
		  
		    
			  
				 Posted: Tue Nov 02, 2010 1:21 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Jedi Council
 
 Joined: 29 Jul 2003 Posts: 5543 Location: Southampton 
  | 
		  
		    
			  
				I mean that the MQInput node takes the encoding from the transport (the MQMD header) and not from the XML declaration. The FileInput node also takes the encoding from the 'transport' (I understand your reservations about that use of the term).
 
 
The FileInput node is intended to read any type of data. Non-XML formats typically do not carry the encoding in the file so a node property is a good way to specify the encoding. | 
			   
			 
		   | 
		
		
		
		
		  | jborella | 
		  
		    
			  
				 Posted: Tue Nov 02, 2010 1:31 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
  
  | 
		  
		    
			  
				
   
	| kimbert wrote: | 
   
  
	| I mean that the MQInput node takes the encoding from the transport (the MQMD header) and not from the XML declaration. The FileInput node also takes the encoding from the 'transport' (I understand your reservations about that use of the term). | 
   
 
 
I think that's spot on; it is exactly what this discussion is about. In MQ, the MQMD header is an external header, so it feels natural to use the MQMD.CodedCharSetId field to decide the character encoding. How can an internal, hardcoded value be considered an external header?
 
 
   
	| kimbert wrote: | 
   
  
	| The FileInput node is intended to read any type of data. Non-XML formats typically do not carry the encoding in the file so a node property is a good way to specify the encoding. | 
   
 
 
Yes and no. I've configured the FileInput node to use the XMLNSC parser, and that should indicate pretty much that I want to parse XML. | 
			   
			 
		   | 
		
		
		
		
		  | fjb_saper | 
		  
		    
			  
				 Posted: Tue Nov 02, 2010 11:52 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Grand High Poobah
 
 Joined: 18 Nov 2003 Posts: 20768 Location: LI,NY 
  | 
		  
		    
			  
				
   
	| jborella wrote: | 
   
  
	
   
	| kimbert wrote: | 
   
  
	| The FileInput node is intended to read any type of data. Non-XML formats typically do not carry the encoding in the file so a node property is a good way to specify the encoding. | 
   
 
 
Yes and no. I've configured the FileInput node to use the XMLNSC parser, and that should indicate pretty much that I want to parse XML. | 
   
 
 
 
You are right, of course, but the XMLNS / XMLNSC parsers can also parse XML data without any encoding declaration, as long as the CCSID of the data is provided to the parser.
 
 
The problem for you, as I see it, is that you want the node to behave in the following fashion:
 
- global default: the CCSID of the box
 
- 1st override: the CCSID set on the node
 
- 2nd override: the CCSID given by the encoding in the file's XML declaration
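 
As a minimal sketch of that precedence (this is not something the FileInput node does today), the three levels could be resolved in ESQL with COALESCE, which returns its first non-NULL argument. The function name ResolveCcsid and its parameters are hypothetical; NULL stands for "not specified at this level".

```esql
-- Hypothetical helper, for illustration only.
-- The most specific source that is not NULL wins:
-- XML declaration first, then the node property, then the box default.
CREATE FUNCTION ResolveCcsid(IN boxCcsid INTEGER, IN nodeCcsid INTEGER, IN declaredCcsid INTEGER)
    RETURNS INTEGER
BEGIN
    RETURN COALESCE(declaredCcsid, nodeCcsid, boxCcsid);
END;
```

With a box default of 437, a node setting of 850 and a declared CCSID of 1252, this would pick 1252; with no declared CCSID it would fall back to 850.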
  
 
 
Have fun
 
_________________
MQ & Broker admin | 
			   
			 
		   | 
		
		
		
		
		  | jborella | 
		  
		    
			  
				 Posted: Tue Nov 23, 2010 12:34 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
  
  | 
		  
		    
			  
				As expected I got a reply from IBM stating:
 
 
"The CCSID is mandatory property on File Input node and defaults to broker system default value. FileInput node uses stream based parser which interprets the physical bit stream of an incoming message based on the CCSID configured on the node and hence there is a need to know the CCSID before the node parsers the message data. There is currently no functionality in the broker  product to allow the XML parsers to change CCSID and Encoding based on the contents of the XML prolog. This is a current limitation in the product."
 
 
The only good thing is that they admit it's a limitation in the product, but that doesn't help me much.
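 
To illustrate IBM's point that the CCSID has to be known before the bytes can be interpreted, here is a small sketch (the variable names are made up for the example): the same single byte casts to a different character depending on the CCSID.

```esql
DECLARE oneByte BLOB X'E6';
-- Code page 850 maps X'E6' to the micro sign
DECLARE asCp850 CHARACTER CAST(oneByte AS CHARACTER CCSID 850);
-- Windows-1252 maps X'E6' to the lowercase letter ae
DECLARE asCp1252 CHARACTER CAST(oneByte AS CHARACTER CCSID 1252);
```

Markup characters such as '<' and '?' are plain ASCII and come out the same in both code pages, which is why an XML declaration can still be read before the real encoding is known.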
 
 
We now use the following workaround: we read the XML with the FileInput node as a BLOB and then run this ESQL:
 
 
   
	| Code: | 
   
  
	CREATE COMPUTE MODULE LOF_Mainflow_mf_v1_SetCharacterSet
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        CALL CopyMessageHeaders();
        DECLARE encoding INTEGER 546;  -- little-endian
        DECLARE cssid INTEGER 850;     -- code page for the first, provisional parse
        DECLARE blobData BLOB InputRoot.BLOB.BLOB;

        -- Strip anything in front of the first '<', such as a byte order mark
        DECLARE tmpStr CHARACTER LTRIM(CAST(blobData AS CHARACTER CCSID cssid ENCODING encoding));
        IF NOT STARTSWITH(tmpStr, '<') THEN
            SET tmpStr = SUBSTRING(tmpStr FROM POSITION('<' IN tmpStr));
            SET blobData = CAST(tmpStr AS BLOB CCSID cssid ENCODING encoding);
        END IF;

        -- Parse as CP850 first, only to read the encoding from the XML declaration
        CREATE LASTCHILD OF OutputRoot DOMAIN('XMLNSC') PARSE(blobData, encoding, cssid);

        -- Only WINDOWS-1252 is recognised here;
        -- any other (or missing) encoding falls back to UTF-8
        IF UPPER(OutputRoot.XMLNSC.(XMLNSC.XmlDeclaration)*.Encoding) = 'WINDOWS-1252' THEN
            SET cssid = 1252;
        ELSE
            SET cssid = 1208;
        END IF;

        -- Discard the provisional tree and reparse the BLOB with the chosen CCSID
        SET OutputRoot.XMLNSC = NULL;
        CREATE LASTCHILD OF OutputRoot DOMAIN('XMLNSC') PARSE(blobData, encoding, cssid);
        SET OutputRoot.Properties.Encoding = encoding;
        SET OutputRoot.Properties.CodedCharSetId = cssid;

        RETURN TRUE;
    END;

    -- Copies every child of InputRoot except the last one (the message body)
    CREATE PROCEDURE CopyMessageHeaders() BEGIN
        DECLARE I INTEGER 1;
        DECLARE J INTEGER;
        SET J = CARDINALITY(InputRoot.*[]);
        WHILE I < J DO
            SET OutputRoot.*[I] = InputRoot.*[I];
            SET I = I + 1;
        END WHILE;
    END;

    CREATE PROCEDURE CopyEntireMessage() BEGIN
        SET OutputRoot = InputRoot;
    END;
END MODULE; | 
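 
The workaround above only recognises WINDOWS-1252 and treats everything else as UTF-8. If more encodings turn up, the lookup could be moved into a small function, roughly like the sketch below. The name EncodingNameToCcsid is hypothetical, the list is deliberately short, and it only makes sense for ASCII-compatible encodings, because the declaration is first read as CP850.

```esql
-- Hypothetical helper, not part of the original flow.
-- Maps the encoding name from the XML declaration to a CCSID.
CREATE FUNCTION EncodingNameToCcsid(IN encodingName CHARACTER) RETURNS INTEGER
BEGIN
    RETURN CASE UPPER(encodingName)
        WHEN 'WINDOWS-1252' THEN 1252
        WHEN 'ISO-8859-1'   THEN 819
        ELSE 1208  -- UTF-8 as the fallback for anything not listed
    END;
END;
```

In Main(), the IF block would then shrink to a single SET of cssid from EncodingNameToCcsid(OutputRoot.XMLNSC.(XMLNSC.XmlDeclaration)*.Encoding).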
   
 
 | 
			   
			 
		   | 
		
		
		
		
		  | kimbert | 
		  
		    
			  
				 Posted: Tue Nov 23, 2010 3:35 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
  | 
		  
		    
			  
				| Glad you got it working. Thank you for being a good citizen and posting the solution. | 
			   
			 
		   | 
		
		
		
		
		  | jborella | 
		  
		    
			  
				 Posted: Wed Dec 01, 2010 1:04 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
  
  | 
		  
		    
			  
				
   
	| kimbert wrote: | 
   
  
	| Glad you got it working. Thank you for being a good citizen and posting the solution. | 
   
 
 
You are welcome. I always find it frustrating when a thread isn't closed with some kind of solution or conclusion. Thank you all for the prompt and insightful responses. | 
			   
			 
		   | 
		
		