		  | gisly | 
		  
		    
			  
				 Posted: Wed Mar 25, 2015 12:19 am    Post subject: Retaining the original value of an escaped XML string | 
				     | 
			   
			 
		   | 
		
		
		   Apprentice
 
 Joined: 10 May 2012 Posts: 29
  
  | 
		  
		    
			  
				Hi!
 
I've got a problem with Message Broker v. 7.
 
 
I obtain a response from an HTTP-node. The XML in the response has a tag, which contains an escaped XML string.
 
It looks like this:
 
   
	| Code: | 
   
  
	<response>
 
&lt;someTag&gt;SOME TEXT CONTENT WITH XML ENTITIES LIKE &amp;&lt;/someTag&gt;
 
</response> | 
   
 
 
 
Then I have to transform this "inner" XML into an XML tree in a different message flow.
 
This is how I do it:
 
   
	| Code: | 
   
  
	
 
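-- Convert the text value of the response tag to a BLOB.
DECLARE responseAsBlob BLOB CAST(the_reference_to_the_response_tag AS BLOB CCSID InputRoot.Properties.CodedCharSetId);
 
-- Parse that BLOB (the variable declared above) into a new XMLNSC tree.
CREATE LASTCHILD OF Environment.Variables.ServiceResult DOMAIN('XMLNSC') PARSE(responseAsBlob CCSID InputRoot.Properties.CodedCharSetId);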
 
 | 
   
 
 
 
 
 
However, WMB does not preserve the original content of the string. Instead, it unescapes all of it.
 
As a result, the CREATE LASTCHILD ... PARSE statement works with a string that looks like this:
 
   
	| Code: | 
   
  
	| <someTag>SOME TEXT CONTENT WITH XML ENTITIES LIKE &</someTag> | 
   
 
 
 
Therefore, the string is no longer well-formed XML, because it contains a dangling "&".
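 
The behaviour described above is not specific to WMB and can be reproduced with any conforming XML parser. The following is a minimal sketch in Python (used purely for illustration, since the flow itself is ESQL). It assumes the inner markup arrives escaped exactly once, as `&lt;someTag&gt;...&amp;&lt;/someTag&gt;`; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Outer document as assumed on the wire: the inner markup is escaped exactly once.
outer = (
    "<response>"
    "&lt;someTag&gt;SOME TEXT CONTENT WITH XML ENTITIES LIKE &amp;&lt;/someTag&gt;"
    "</response>"
)

# Parsing the outer document resolves every entity reference once,
# so the text value now contains a bare '&'.
inner_text = ET.fromstring(outer).text
print(inner_text)

# A second parse of the recovered string is rejected because of that bare '&'.
try:
    ET.fromstring(inner_text)
    inner_is_well_formed = True
except ET.ParseError:
    inner_is_well_formed = False
print(inner_is_well_formed)
```

The first parse is the step that corresponds to reading the value of the response tag in the flow; the second corresponds to the CREATE LASTCHILD ... PARSE statement.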
 
 
My question is:
 
is there any way either to prevent WMB from unescaping these characters 
 
or to make it escape them again when parsing?
 
 
The only solution I know now is to simply replace these ampersands manually. But I don't really like it. | 
			   
			 
		   | 
		
		
		
		
		  | fjb_saper | 
		  
		    
			  
				 Posted: Wed Mar 25, 2015 4:20 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Grand High Poobah
 
 Joined: 18 Nov 2003 Posts: 20768 Location: LI,NY 
  | 
		  
		    
			  
This is working as designed. It looks like the problem is with your original input. See the end of the content:
 
   
	| Code: | 
   
  
	| ... WITH XML ENTITIES LIKE &amp;&lt;/someTag&gt; | 
   
 
 
Which translates as expected into:
 
   
	| Code: | 
   
  
	| ... WITH XML ENTITIES LIKE &</someTag> | 
   
 
 
Now if the dangling & is not right, that's the fault of your input. Tell them to fix it!   _________________ MQ & Broker admin | 
			   
			 
		   | 
		
		
		
		
		  | gisly | 
		  
		    
			  
				 Posted: Wed Mar 25, 2015 4:37 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		   Apprentice
 
 Joined: 10 May 2012 Posts: 29
  
  | 
		  
		    
			  
				Hi!
 
But this
 
   
	| Code: | 
   
  
	<response> 
 
&lt;someTag&gt;SOME TEXT CONTENT WITH XML ENTITIES LIKE &amp;&lt;/someTag&gt; 
 
</response> | 
   
 
 
is valid XML, isn't it?
 
 
If it were simply 
 
   
	| Code: | 
   
  
	| <someTag>SOME TEXT CONTENT WITH XML ENTITIES LIKE &amp;</someTag> | 
   
 
, it would generate correct XML. So the problem is that WMB is too clever and unescapes the inner entities. | 
			   
			 
		   | 
		
		
		
		
		  | fjb_saper | 
		  
		    
			  
				 Posted: Wed Mar 25, 2015 4:46 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Grand High Poobah
 
 Joined: 18 Nov 2003 Posts: 20768 Location: LI,NY 
  | 
		  
		    
			  
Now I get your meaning... However, how is the broker supposed to know which XML to escape and which not to?
 
 
I believe your original input should look like
 
   
	| Code: | 
   
  
	<response>
 
&lt;someTag&gt;SOME TEXT CONTENT WITH XML ENTITIES LIKE &amp;amp;&lt;/someTag&gt; 
 
</response> | 
   
 
 
and that this text would then be correctly restored with an escaped ampersand at the end... In other words your dangling & needs to be double escaped...   _________________ MQ & Broker admin | 
			   
			 
		   | 
		
		
		
		
		  | joebuckeye | 
		  
		    
			  
				 Posted: Wed Mar 25, 2015 4:56 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Partisan
 
 Joined: 24 Aug 2007 Posts: 365 Location: Columbus, OH 
  | 
		  
		    
			  
				I would say any good XML parser would create the final string you are seeing.
 
 
How is any parser supposed to know you want it to translate &lt; into < but keep the &amp; as &amp; ? | 
			   
			 
		   | 
		
		
		
		
		  | fjb_saper | 
		  
		    
			  
				 Posted: Wed Mar 25, 2015 5:10 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Grand High Poobah
 
 Joined: 18 Nov 2003 Posts: 20768 Location: LI,NY 
  | 
		  
		    
			  
				
   
	| joebuckeye wrote: | 
   
  
	I would say any good XML parser would create the final string you are seeing.
 
 
How is any parser supposed to know you want it to translate &lt; into < but keep the &amp; as &amp; ? | 
   
 
 
 
So you're telling me that the output of escaping 
 
 is not going to be
 
   
	| Code: | 
   
  
	| &amp;</tag> | 
   
 
 ?? _________________ MQ & Broker admin | 
			   
			 
		   | 
		
		
		
		
		  | kimbert | 
		  
		    
			  
				 Posted: Wed Mar 25, 2015 5:47 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Jedi Council
 
 Joined: 29 Jul 2003 Posts: 5543 Location: Southampton 
  | 
		  
		    
			  
				fjb_saper's first reply is 100% correct.
 
 
gisly: Your reply to fjb_saper is missing the point. Yes, this:
   
	| Code: | 
   
  
	<response>
 
&lt;someTag&gt;SOME TEXT CONTENT WITH XML ENTITIES LIKE &amp;&lt;/someTag&gt;
 
</response> | 
   
 
is valid XML. 
 
 
But this 'XML' document is not valid:
   
	| Code: | 
   
  
	| <someTag>SOME TEXT CONTENT WITH XML ENTITIES LIKE &</someTag> | 
   
 
 
 
I think somebody has tried to construct a test for handling of & but they have forgotten to escape the & in the original XML. Which is exactly what fjb_saper said. _________________ Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too. | 
			   
			 
		   | 
		
		
		
		
		  | joebuckeye | 
		  
		    
			  
				 Posted: Wed Mar 25, 2015 6:05 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Partisan
 
 Joined: 24 Aug 2007 Posts: 365 Location: Columbus, OH 
  | 
		  
		    
			  
				
   
	| fjb_saper wrote: | 
   
  
	
   
	| joebuckeye wrote: | 
   
  
	I would say any good XML parser would create the final string you are seeing.
 
 
How is any parser supposed to know you want it to translate &lt; into < but keep the &amp; as &amp; ? | 
   
 
 
 
So you're telling me that the output of escaping 
 
 is not going to be
 
   
	| Code: | 
   
  
	| &amp;</tag> | 
   
 
 ?? | 
   
 
 
 
Sorry, I was saying that if you hand the string:
 
   
	| Code: | 
   
  
	<response>
 
&lt;someTag&gt;SOME TEXT CONTENT WITH XML ENTITIES LIKE &amp;&lt;/someTag&gt;
 
</response> | 
   
 
 
 
to an XML parser you will get what he got.  My reply was to him, not you (which was not clear in my response as you replied after I opened the thread).
 
 
So I agree that the problem was on the input side in creating the "XML string" the OP received and that a proper parser should have created the string you mention with '&amp;'. | 
			   
			 
		   | 
		
		
		
		
		  | gisly | 
		  
		    
			  
				 Posted: Wed Mar 25, 2015 6:26 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		   Apprentice
 
 Joined: 10 May 2012 Posts: 29
  
  | 
		  
		    
			  
				Thanks.
 
Well, my problem is that what they are sending has to be a correct XML. And that's all they are obliged to do. I see your point, but I don't think they're obliged to escape the ampersand "two times".
 
 
So, the only way to handle it is to replace it within the flow. | 
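 
For reference, the "replace it within the flow" workaround could be sketched as follows, in Python purely for illustration. The helper name `reescape_bare_ampersands` and the regular expression are assumptions, not something taken from the flow. The idea is to re-escape only those ampersands that do not already start an entity reference.

```python
import re
import xml.etree.ElementTree as ET

# Matches '&' that does NOT start a named or numeric entity reference.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z_][A-Za-z0-9._-]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)

def reescape_bare_ampersands(fragment: str) -> str:
    """Hypothetical helper: turn each bare '&' into '&amp;'."""
    return BARE_AMPERSAND.sub("&amp;", fragment)

# The string the flow ends up with after the outer document has been parsed.
broken = "<someTag>SOME TEXT CONTENT WITH XML ENTITIES LIKE &</someTag>"
repaired = reescape_bare_ampersands(broken)
print(repaired)

# The repaired fragment parses, and the text value contains the plain '&' again.
print(ET.fromstring(repaired).text)
```

This is a heuristic rather than a real fix. It cannot tell an intended literal such as the text "&amp;" from an entity reference, and it does nothing for a bare "<" inside the text, so correcting the data at the sender is more robust.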
			   
			 
		   | 
		
		
		
		
		  | kimbert | 
		  
		    
			  
				 Posted: Wed Mar 25, 2015 7:48 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Jedi Council
 
 Joined: 29 Jul 2003 Posts: 5543 Location: Southampton 
  | 
		  
		    
			  
				
   
	| Quote: | 
   
  
	| what they are sending has to be a correct XML | 
   
 
I agree. But that's a statement of the obvious.
 
   
	| Quote: | 
   
  
	| And that's all they are obliged to do. | 
   
 
Not true. They are also obliged to ensure that the content of the 'response' tag is a valid XML document.
 
But the sender is trying to embed '<someTag>SOME TEXT CONTENT WITH XML ENTITIES LIKE &</someTag>'. That is not valid XML (try it in your browser). Which is why your message flow has a problem when it tries to parse it.
 
   
	| Quote: | 
   
  
	| I see your point... | 
   
 
Kind of you to say so, but I don't think you do... yet.
   
	| Quote: | 
   
  
	| I don't think they're obliged to escape the ampersand "two times".  | 
   
 
I do. They are obliged to escape it first in XML document 2, and then they are obliged to escape the leading '&' on the '&amp;' when it gets embedded in XML document 1. 
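 
The two levels of escaping can be sketched from the sender's side as follows, in Python purely for illustration. The variable names are hypothetical, and `xml.sax.saxutils.escape` stands in for whatever serializer the sender actually uses.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

text = "SOME TEXT CONTENT WITH XML ENTITIES LIKE &"

# XML document 2 (inner): escape the text once so the inner document is well-formed.
inner_doc = "<someTag>" + escape(text) + "</someTag>"

# XML document 1 (outer): escape the whole inner document again when embedding it.
# The '&' of the inner '&amp;' becomes '&amp;amp;' at this point.
outer_doc = "<response>" + escape(inner_doc) + "</response>"
print(outer_doc)

# Receiver: each parse removes exactly one level of escaping.
recovered_inner = ET.fromstring(outer_doc).text
recovered_text = ET.fromstring(recovered_inner).text
print(recovered_text)
```

With both levels applied, the string recovered from the outer document is itself well-formed, so the second parse succeeds without any manual replacement.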
 
 
Your turn   _________________ Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too. | 
			   
			 
		   | 
		
		
		
		
		  | gisly | 
		  
		    
			  
				 Posted: Mon Jul 06, 2015 7:20 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		   Apprentice
 
 Joined: 10 May 2012 Posts: 29
  
  | 
		  
		    
			  
				Hi!
 
 
Sorry for replying only now. We finally realized kimbert was right.
 
So we managed to persuade them to send us correct data. | 
			   
			 
		   | 
		
		