| Author | Message |
		
		  | Dharamteja | 
		  
		    
			  
				 Posted: Wed Jun 27, 2018 12:53 pm    Post subject: DFDL Parsing characters with accent marks | 
				     | 
			   
			 
		   | 
		
		
		   Newbie
 
 Joined: 27 Jun 2018 Posts: 2
  
  | 
		  
		    
			  
				Hi all,

I'm converting data from pipe-delimited to fixed-length format, but my input data contains some accented characters, and because of them the fixed-length fields do not come out at the right length.

Could you please tell me which encoding (code page) I should use in the DFDL properties? I have tried ISO-8859-1, but it did not help. I have also tried UTF-8, but then the accented characters are changed to other characters. I need to parse the entire data without any length issues. Please help. | 
			   
			 
		   | 
		
		
		
		
		  | rekarm01 | 
		  
		    
			  
				 Posted: Wed Jun 27, 2018 5:06 pm    Post subject: Re: DFDL Parsing characters with accent marks | 
				     | 
			   
			 
		   | 
		
		
		   Grand Master
 
 Joined: 25 Jun 2008 Posts: 1415
  
  | 
		  
		    
			  
				
   
	| Dharamteja wrote: | 
   
  
	| I'm converting data from pipe delimited to Fixed length | 
   
 
 
Fixed length what? Bits? Bytes? Characters? Something else?  "Fixed length" doesn't mean much without also stating the length units.
 
 
   
	| Dharamteja wrote: | 
   
  
	| could you please tell me which Encoding(code page) should I use in DFDL property | 
   
 
 
That partly depends on what the receiving application(s) will tolerate.  But if the DFDL schema indicates (or implies) that lengthUnits is 'bytes', then it's probably better to stick with a single-byte character set, such as ISO 8859-1.
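
To illustrate why the length units matter, here is a rough Python sketch (not DFDL, and the sample value is made up). It only shows that the same text has the same byte count as character count in a single-byte character set, but a larger byte count in UTF-8:

```python
# Made-up sample value containing one accented character.
text = "José"

latin1 = text.encode("iso-8859-1")  # single-byte character set
utf8 = text.encode("utf-8")         # é takes two bytes here

char_count = len(text)
latin1_bytes = len(latin1)
utf8_bytes = len(utf8)

print(char_count, latin1_bytes, utf8_bytes)
```

So a field declared as a fixed number of bytes lines up with the character count in ISO 8859-1, but not necessarily in UTF-8.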
 
 
   
	| Dharamteja wrote: | 
   
  
	| I have tried with ISO-8859-1 but no use | 
   
 
 
Tried how?  What does "no use" mean?  More details would help here.
 
 
   
	| Dharamteja wrote: | 
   
  
	| and I have tried with UTF-8 also but accent characters are changed to another character | 
   
 
 
Changed how?  Are they changed within the message, or is whatever display tool used to view the contents just not displaying them correctly? | 
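
One common way this symptom shows up is a viewer that reads UTF-8 bytes as if they were ISO 8859-1. The Python sketch below reproduces that effect with a made-up value; it is only a guess at what might be happening in this case, not a diagnosis:

```python
original = "café"                           # made-up sample value
utf8_bytes = original.encode("utf-8")       # bytes as written in UTF-8

# A tool that assumes ISO 8859-1 shows each UTF-8 byte as its own character.
misread = utf8_bytes.decode("iso-8859-1")

# The bytes themselves are intact: decoding them as UTF-8 restores the text.
recovered = utf8_bytes.decode("utf-8")

print(misread, recovered)
```

If the displayed text looks like `misread` but the underlying bytes decode cleanly as UTF-8, the message is fine and only the display tool is wrong.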
			   
			 
		   | 
		
		
		
		
		  | fjb_saper | 
		  
		    
			  
				 Posted: Wed Jun 27, 2018 8:09 pm    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Grand High Poobah
 
 Joined: 18 Nov 2003 Posts: 20768 Location: LI,NY 
  | 
		  
		    
			  
				Did you try using the InputRoot.Properties.CodedCharSetId for the output?
 
This way no conversion should be necessary for the characters in question.   _________________ MQ & Broker admin | 
			   
			 
		   | 
		
		
		
		
		  | timber | 
		  
		    
			  
				 Posted: Thu Jun 28, 2018 2:58 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		    Grand Master
 
 Joined: 25 Aug 2015 Posts: 1292
  
  | 
		  
		    
			  
				
   
	| Quote: | 
   
  
	| I'm converting data from pipe delimited to Fixed length but my input data contains some accent mark characters | 
   
 
You are not thinking clearly about the problem. 
 
Your message flow needs to map the input data to the output data. Not just the field names but also the *values*. Every character that could possibly occur in the input data needs to be mapped to the output, or else trigger an error. 
 
 
You have told us almost nothing about your output format. What length units are the 'fixed length' fields using? What character encodings will the receiving application accept? If you send multi-byte characters as part of a fixed-number-of-bytes field, will the receiving application be able to handle them? | 
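
As a sketch of the last question (plain Python, with a hypothetical 4-byte field width and a made-up value), this shows what can go wrong when multi-byte characters are cut to a fixed number of bytes:

```python
encoded = "José".encode("utf-8")   # 5 bytes: é is 0xC3 0xA9
truncated = encoded[:4]            # cut to a hypothetical 4-byte field

try:
    truncated.decode("utf-8")
    split_detected = False
except UnicodeDecodeError:
    # The cut landed in the middle of the two-byte é.
    split_detected = True

print(split_detected)
```

A receiver that decodes such a field as UTF-8 would have to cope with the incomplete trailing byte.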
			   
			 
		   | 
		
		
		
		
		  | Dharamteja | 
		  
		    
			  
				 Posted: Tue Jul 03, 2018 6:30 am    Post subject:  | 
				     | 
			   
			 
		   | 
		
		
		   Newbie
 
 Joined: 27 Jun 2018 Posts: 2
  
  | 
		  
		    
			  
				Thanks for your responses. My issue is resolved.

I set the encoding to UTF-8 in the input DFDL (pipe-delimited) and made the same change in the output DFDL (fixed-length). I also changed the length units from bytes to characters, and after that the lengths came out correctly.
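
For anyone reading later, here is a small Python sketch of the difference between the two length units when the encoding is UTF-8. The function names and the 6-unit field width are made up for illustration, and this is not how the DFDL serializer is implemented:

```python
def fixed_by_chars(value: str, width: int) -> bytes:
    """Pad to `width` characters, then encode as UTF-8."""
    if len(value) > width:
        raise ValueError("value longer than field")
    return value.ljust(width).encode("utf-8")


def fixed_by_bytes(value: str, width: int) -> bytes:
    """Encode as UTF-8, then pad to `width` bytes."""
    encoded = value.encode("utf-8")
    if len(encoded) > width:
        raise ValueError("value longer than field")
    return encoded.ljust(width, b" ")


by_chars = fixed_by_chars("José", 6)   # always 6 characters, byte count varies
by_bytes = fixed_by_bytes("José", 6)   # always 6 bytes, character count varies

print(len(by_chars.decode("utf-8")), len(by_chars))
print(len(by_bytes.decode("utf-8")), len(by_bytes))
```

With character units every field has the same number of characters regardless of accents, which matches the behaviour described above.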
 
 
I pass the same data (as a BLOB) to another flow that inserts it into an AS400 database. There I parse it as a BLOB and convert it to characters with CCSID 1208, without specifying an encoding, and the accented characters are inserted into the database as they are (not converted to other characters). | 
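
CCSID 1208 is UTF-8, so the conversion above amounts to decoding the BLOB bytes as UTF-8. A minimal Python sketch of that idea follows; the lookup table and function name are hypothetical and only cover the two code pages mentioned in this thread:

```python
# Hypothetical lookup from IBM CCSID to a Python codec name (illustration only).
CCSID_TO_CODEC = {1208: "utf-8", 819: "iso-8859-1"}


def blob_to_text(blob: bytes, ccsid: int) -> str:
    """Decode raw bytes using the codec that corresponds to the CCSID."""
    return blob.decode(CCSID_TO_CODEC[ccsid])


blob = bytes.fromhex("4a6f73c3a9")   # made-up UTF-8 payload
print(blob_to_text(blob, 1208))
```

Decoding the same bytes with CCSID 819 instead would turn the accented character into two other characters, which is why the CCSID has to match the encoding the data was written in.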
			   
			 
		   | 
		
		