| Author | Message |
		
shwetabh WMB
Posted: Sun Jul 17, 2016 6:55 pm    Post subject: Cdata Parsing issue
Novice
Joined: 15 Jul 2016 Posts: 23
		  
		    
			  
Hi Everyone,

We are hitting an issue while parsing a CDATA section. I am using IIB v9004.

We have two flows:

a) MQInput -> Compute -> MQOutput

In this flow we receive the message as a BLOB, do some validation, add a CDATA section to one field of the XML using the REPLACE command, and send the result to the output node as a BLOB.

b) MQInput -> Compute -> MQOutput

In this flow we parse the message coming from flow a) using the XMLNSC parser.

Here it behaves differently for different messages.

In some cases the CDATA is split into a list, i.e. if the value is [[India] abc xyz [erty]

the field is populated as a list with the values 1) [[India   2) ] abc xyz [erty   3) ]

Using a Trace node after the MQInput node, I can also see 4 rows.

However, this does not happen in every case.

Has anyone faced the same issue? If anyone knows the resolution, please help.

With Regards,
			   
			 
		   | 
		
		
		
		
mqjeff
Posted: Mon Jul 18, 2016 3:33 am
Grand Master
Joined: 25 Jun 2008 Posts: 17447
		  
		    
			  
				It sounds like an issue with your code.
 
 
Or the program sending the message to you.
_________________
chmod -R ugo-wx /
			   
			 
		   | 
		
		
		
		
shwetabh WMB
Posted: Mon Jul 18, 2016 5:13 am
Novice
Joined: 15 Jul 2016 Posts: 23
		  
		    
			  
I'm not sure that is the case. We are getting this intermittently, and the only thing that changes is the value of the one tag containing the CDATA. If it were a code issue, it should happen in every case.

Also, in flow a) we are just mapping the header, adding the CDATA to the field, and mapping the final BLOB to OutputRoot.

And in flow b) the XMLNSC parser is set on the input node, which is where we are having the issue.

However, I will post the trace to give a clearer picture.
			   
			 
		   | 
		
		
		
		
mqjeff
Posted: Mon Jul 18, 2016 5:19 am
Grand Master
Joined: 25 Jun 2008 Posts: 17447
		  
		    
			  
				If it was a code issue, it would only occur when certain fields of a message came in.
 
 
If it was an issue with the message, it would only occur at certain points in your code.
 
 
If it's the XMLNSC parser on the input node that's failing, then the input message is bad.
 
 
Not your fault, not your problem.  Tell the people sending the message to stop sending bad XML.
			   
			 
		   | 
		
		
		
		
timber
Posted: Mon Jul 18, 2016 2:40 pm
Grand Master
Joined: 25 Aug 2015 Posts: 1292
		  
		    
			  
				
   
Quote:
CDATA is added in one field of the XML using the REPLACE command and is sent to the output node as a BLOB
   
 
That sounds like a rather unsafe way to add a CDATA section to an XML message. What happens if the BLOB contains characters that are illegal in XML? What if the BLOB happens to contain the CDATA termination string ']]>' and so requires two consecutive CDATA sections in order to remain valid XML?
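A minimal sketch of the two checks described above, written in Python rather than ESQL purely for illustration. The helper names `to_cdata` and `roundtrip` are hypothetical. The sketch rejects characters that XML 1.0 forbids and splits any ']]>' across two consecutive CDATA sections so that the result stays well-formed.

```python
import re
import xml.etree.ElementTree as ET

# Matches any character that XML 1.0 does not allow in a document.
_ILLEGAL_XML = re.compile("[^\t\n\r\x20-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]")


def to_cdata(text: str) -> str:
    """Wrap text in CDATA, using two sections wherever ']]>' occurs (hypothetical helper)."""
    if _ILLEGAL_XML.search(text):
        raise ValueError("text contains a character that is illegal in XML 1.0")
    # ']]>' cannot appear inside a CDATA section, so close the section
    # after ']]' and open a new one before '>'.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def roundtrip(text: str) -> str:
    """Parse the wrapped text back, to confirm the original value survives."""
    return ET.fromstring("<B>" + to_cdata(text) + "</B>").text
```

For example, `roundtrip("a]]>b")` should give back `a]]>b` even though the serialized field contains two CDATA sections.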
 
 
   
Quote:
In some cases the CDATA is split into a list, i.e. if the value is [[India] abc xyz [erty]

the field is populated as a list with the values 1) [[India   2) ] abc xyz [erty   3) ]

Using a Trace node after the MQInput node, I can also see 4 rows.

However, this does not happen in every case.
   
 
I suspect that this is 'as-designed'. It is not always possible to represent a piece of data using a single CDATA section. It's possible that the XMLNSC parser splits a CDATA section and starts a new element in the message tree when it sees the string '[['. You could experiment and see if I'm right.
 
   
Quote:
Has anyone faced the same issue? If anyone knows the resolution, please help.
   
 
I don't think it's a bug. I think it is a feature of XMLNSC, and your message flow needs to cope with whatever text elements XMLNSC puts into the message tree.
 
Before you protest...XMLNSC is doing pretty much the same as any DOM-compliant XML processor. You would have to cater for the possibility of multiple CDATA sections if you were writing Java.
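As a rough illustration of that point, here is a sketch using Python's `xml.dom.minidom` instead of Java. The element names A and B and the helper `element_text` are assumptions made for the example. A DOM parser may expose each CDATA section as its own child node, so the code joins every text-like child rather than reading only the first one.

```python
from xml.dom import Node, minidom


def element_text(element) -> str:
    """Concatenate all text and CDATA children of a DOM element (hypothetical helper)."""
    text_types = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    return "".join(
        child.data for child in element.childNodes if child.nodeType in text_types
    )


# Build a field whose value arrives as three consecutive CDATA sections,
# using the pieces reported in the thread.
pieces = ["[[India", "] abc xyz [erty", "]"]
xml_text = "<A><B>" + "".join(f"<![CDATA[{p}]]>" for p in pieces) + "</B></A>"

doc = minidom.parseString(xml_text)
field = doc.getElementsByTagName("B")[0]
value = element_text(field)
```

Code that only looks at the first child node would see just part of the value, which is the situation the flow needs to cope with.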
 
 
   
Quote:
However, I will post the trace to give a clearer picture.
   
 
That's a good idea. Please use [c o d e] tags (there is a button above the edit window) to make the trace readable.
			   
			 
		   | 
		
		
		
		
shwetabh WMB
Posted: Mon Jul 18, 2016 8:51 pm
Novice
Joined: 15 Jul 2016 Posts: 23
		  
		    
			  
That's true, but the message being parsed in our code does not contain ]]>, so nothing in it should create multiple CDATA sections.

If the data contains ]]>, is there a way to keep only one CDATA section?

I think we can't; we can only work around it. In our present codebase, where multiple CDATA sections are created, we store the value in a CHARACTER variable and then reassign it to the output tree field to make it one CDATA field. This is our present approach to get our code working.

But my concern is why XMLNSC is parsing it this way.

If we enable trace, it says InputRoot.XMLNSC.A.B is resolved to ROW.

However, for some inputs it says InputRoot.XMLNSC.A.B is assigned to OutputRoot without saying anything about ROW.

The values in both cases follow the same pattern.
			   
			 
		   | 
		
		
		
		
timber
Posted: Tue Jul 19, 2016 12:00 am
Grand Master
Joined: 25 Aug 2015 Posts: 1292
		  
		    
			  
				
   
Quote:
But my concern is why XMLNSC is parsing it this way.
   
 
Feel free to open a PMR and ask IBM. But my point stands; you *could* get multiple CDATA sections in future, and your message flow should probably cope with that possibility.
 
 
   
Quote:
In our present codebase, where multiple CDATA sections are created, we store the value in a CHARACTER variable and then reassign it to the output tree field to make it one CDATA field. This is our present approach to get our code working.
   
 
Why are you bothering to combine the CDATA sections? The XML parser in the downstream application will not care whether it gets one CDATA section or multiple. Either way it

a) is valid XML and

b) should be interpreted in exactly the same way

Or are you saying that the downstream application relies on receiving all of the text content of a tag in exactly one CDATA section? If so, it is badly designed.
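A small check of points a) and b), sketched with Python's `xml.etree.ElementTree`. The documents and the helper `field_text` are made up for the example, reusing the value from the first post. Both forms parse without error and yield the same text for the field.

```python
import xml.etree.ElementTree as ET

# The same value serialized as one CDATA section and as three.
single = "<A><B><![CDATA[[[India] abc xyz [erty]]]></B></A>"
split = "<A><B><![CDATA[[[India]]><![CDATA[] abc xyz [erty]]><![CDATA[]]]></B></A>"


def field_text(xml_text: str) -> str:
    """Return the text content of element B (hypothetical helper)."""
    return ET.fromstring(xml_text).findtext("B")


same = field_text(single) == field_text(split)
```

A consumer that reads the element's text content this way cannot tell the two serializations apart.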
			   
			 
		   | 
		
		
		
		
shwetabh WMB
Posted: Tue Jul 19, 2016 12:17 am
Novice
Joined: 15 Jul 2016 Posts: 23
		  
		    
			  
How are we going to store the value of the field in the database unless it is in one CDATA field? Currently it is reported as ROW, as mentioned above. Can we assign that to a CHAR field in the DB? It will fail.

Also, it does not come as ROW for every input message. It happens intermittently.

Taking a value from the source and storing it in database columns should not be bad design. It is a simple design. The code will fail with a type cast error when inserting a value made of multiple CDATA sections into the DB.
			   
			 
		   | 
		
		
		
		
mqjeff
Posted: Tue Jul 19, 2016 4:03 am
Grand Master
Joined: 25 Jun 2008 Posts: 17447
		  
		    
			  
				Is it code downstream of your code that's trying to store the data in the database?
 
 
If it's *your* code trying to store the data in a database, then you shouldn't be trying to stick a CDATA section in there.
 
 
You should be taking the CDATA section and either parsing it into pieces to insert into columns in the db, or converting it into a string and then inserting that into a single column.
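The second option can be sketched as follows, in Python with an in-memory SQLite database rather than ESQL and the real database. The table `comments`, its columns, and the function `store_comment` are hypothetical names chosen for the example. The text segments are joined into one string first, and only that string is bound to the INSERT.

```python
import sqlite3


def store_comment(conn, message_id, segments):
    """Join the text segments of a field and insert them as one string."""
    comment = "".join(segments)
    # Bind the plain string as a parameter; never pass a parsed element.
    conn.execute(
        "INSERT INTO comments (message_id, body) VALUES (?, ?)",
        (message_id, comment),
    )
    conn.commit()
    return comment


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (message_id TEXT PRIMARY KEY, body TEXT)")

# The three pieces reported in the thread end up in a single column value.
store_comment(conn, "msg-1", ["[[India", "] abc xyz [erty", "]"])
stored = conn.execute(
    "SELECT body FROM comments WHERE message_id = ?", ("msg-1",)
).fetchone()[0]
```

Because the join happens before the insert, the same code path works whether the parser delivered one segment or several.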
 
 
But if your code is the same all the time, and the message always goes through the same code path, and sometimes it works and sometimes it doesn't, then it's the message you're receiving that's broken.
			   
			 
		   | 
		
		
		
		
shwetabh WMB
Posted: Tue Jul 19, 2016 4:18 am
Novice
Joined: 15 Jul 2016 Posts: 23
		  
		    
			  
We have a downstream system to which we route the message. Storing the value in the database is one of the activities. The data in the DB will be used by the business for monitoring.
			   
			 
		   | 
		
		
		
		
mqjeff
Posted: Tue Jul 19, 2016 4:24 am
Grand Master
Joined: 25 Jun 2008 Posts: 17447
		  
		    
			  
				Ok.
 
 
So you should be inserting the contents of the cdata section into the database in a way that matches the columns of the database table.
 
 
That is, you shouldn't be passing the IIB element that represents the cdata section directly to the INSERT statement.
 
 
Again.  
 
 
Does the message always go through the same path through your code?  That is, does it take the same branch at every IF statement, and so on?


If it does, and sometimes it fails and sometimes it doesn't, then it's not likely an issue with your code! It's an issue with the message, or with the database, or with something else.


You need to use tracing of some kind to find out *what the failure is*. Then you can take the right steps to debug it.
			   
			 
		   | 
		
		
		
		
shwetabh WMB
Posted: Tue Jul 19, 2016 4:36 am
Novice
Joined: 15 Jul 2016 Posts: 23
		  
		    
			  
1) We used trace, and the field shows up as ROW in the cases where it fails. In the other cases it does not show ROW.

2) The datatype is CHAR, and it works fine except for these few cases.

3) Yes, it passes through the same lines of ESQL code.

4) The XML is well-formed. I validated it with another XML plugin (I used Notepad++).

We have a resolution: we cast the value to CHAR and then insert it into the DB.

However, we are trying to find the root cause of this behaviour.

We are trying to work out how the XML is parsed when no schema is defined on the input node. Which grammar does the parser use while parsing? Do we need to define the specific field (the one causing the issue) as PCDATA in a schema and use that schema? Can we use <!DOCTYPE in the XML to declare that field as #PCDATA, and set the window preference to preserve whitespace for PCDATA?

Meanwhile, we have raised a PMR for this.
			   
			 
		   | 
		
		
		
		
fjb_saper
Posted: Tue Jul 19, 2016 4:38 am
Grand High Poobah
Joined: 18 Nov 2003 Posts: 20768 Location: LI,NY
		  
		    
			  
				Well it might be an issue with his code, if the DB expects a CLOB for the message and it is currently mapped to a CData section.
 
 
He needs to get the CData content (an aggregation of potentially multiple CData entries) and store that in the DB. He needs to be aware that there is no "one to one" mapping of a CData section to the DB.
_________________
MQ & Broker admin
			   
			 
		   | 
		
		
		
		
shwetabh WMB
Posted: Tue Jul 19, 2016 4:45 am
Novice
Joined: 15 Jul 2016 Posts: 23
		  
		    
			  
				  
 
 
Yes, and as said before we have rectified it by casting to CHAR before inserting into the DB. Also, the DB field is CHAR.

But as I said, we are trying to find out why one CDATA section gets divided into multiple CDATA sections after passing through the MQInput node, and why it appears as ROW in the trace.

Many downstream systems might fail because of this. This field value is a sort of comment, which can be in HTML format or plain text.
			   
			 
		   | 
		
		
		
		
shwetabh WMB
Posted: Tue Jul 19, 2016 4:48 am
Novice
Joined: 15 Jul 2016 Posts: 23
		  
		    
			  
And we can insert a value with a single CDATA section into a CHAR column if it is within range. All values with a single CDATA section are inserted successfully.
			   
			 
		   | 
		
		