
MQSeries.net Forum Index » WebSphere Message Broker (ACE) Support » Special characters getting converted to different characters

mhd_zabi
PostPosted: Sun Sep 25, 2016 8:37 am    Post subject: Special characters getting converted to different characters Reply with quote

Newbie

Joined: 25 Sep 2016
Posts: 7
Location: Mangalore

Hi All,
I am using IIB9 on AIX 7.1, and I am facing an issue while reading a CSV file.
The flow doesn't do much. It reads the input CSV file record by record, skipping the header record, and then appends each record to the output file. Basically, the flow just removes the header record from the input and creates the output with the header record removed.
Now the problem comes with the special characters. Characters like ä and ö are getting converted to Ã¤ and Ã¶ respectively. I checked the input file and it seems to be a UTF-8 encoded file (I checked by opening the file in Notepad++), but the output file with the changed characters seems to be ANSI encoded.

Can anyone suggest how this is getting changed?

Thanks.
fjb_saper
PostPosted: Sun Sep 25, 2016 11:27 am    Post subject: Reply with quote

Grand High Poobah

Joined: 18 Nov 2003
Posts: 20696
Location: LI,NY

What is the CCSID on OutputRoot.Properties when writing the file?
_________________
MQ & Broker admin
mhd_zabi
PostPosted: Sun Sep 25, 2016 8:32 pm    Post subject: Reply with quote

Newbie

Joined: 25 Sep 2016
Posts: 7
Location: Mangalore

The CCSID is 1208 and the Encoding is 273 in both the input and output properties.

The strange thing is that yesterday, when I tried to read the input as Whole File and just copied it to the output, the special characters were passed through as-is. It's only when I try to transform the file that the special characters are getting changed.
adubya
PostPosted: Sun Sep 25, 2016 11:54 pm    Post subject: Reply with quote

Partisan

Joined: 25 Aug 2011
Posts: 377
Location: GU12, UK

How are you performing the transformation? Java, ESQL, or a built-in node?

If you are using Java, check that any methods which handle byte arrays are using the correct encoding parameters.
_________________
Independent Middleware Consultant
andy@knownentity.com
mhd_zabi
PostPosted: Mon Sep 26, 2016 2:38 am    Post subject: Reply with quote

Newbie

Joined: 25 Sep 2016
Posts: 7
Location: Mangalore

I am using ESQL, but all it does is:
Call CopyMessageHeaders
Call CopyEntireMessage
Set the output filename
Check whether the IsEmpty flag is TRUE or FALSE, based on which I send the data to the Finish File terminal to complete the file.

The transformation of removing the header record from the input is done by the Records and Elements properties on the FileInput node, where I check the box to skip the first record.
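
The logic described above could be sketched in ESQL roughly as follows. This is an illustration only: the module name, the output filename, and the location of the IsEmpty flag are assumptions, not the poster's actual code.

Code:

CREATE COMPUTE MODULE RemoveHeader_Compute
	CREATE FUNCTION Main() RETURNS BOOLEAN
	BEGIN
		-- CopyEntireMessage() already copies the headers, so a separate
		-- CALL CopyMessageHeaders() would be redundant.
		CALL CopyEntireMessage();

		-- Name the output file (hypothetical filename).
		SET OutputLocalEnvironment = InputLocalEnvironment;
		SET OutputLocalEnvironment.Destination.File.Name = 'records_noheader.csv';

		-- Route based on the empty-record flag (flag location assumed) to the
		-- terminal wired to the FileOutput node's Finish File input.
		IF InputLocalEnvironment.File.IsEmpty THEN
			PROPAGATE TO TERMINAL 'out1';
		ELSE
			PROPAGATE TO TERMINAL 'out';
		END IF;
		RETURN FALSE;
	END;
END MODULE;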
smdavies99
PostPosted: Mon Sep 26, 2016 2:51 am    Post subject: Reply with quote

Jedi Council

Joined: 10 Feb 2003
Posts: 6076
Location: Somewhere over the Rainbow this side of Never-never land.

This:
mhd_zabi wrote:

Call CopyMessageHeaders
Call CopyEntireMessage
Set the output filename

is incorrect.

You call ONE or the other of

Call CopyMessageHeaders
Call CopyEntireMessage

not both.
In your case, as you are not doing anything to the message body,

Code:

Call CopyEntireMessage

is the right one to use.
_________________
WMQ User since 1999
MQSI/WBI/WMB/'Thingy' User since 2002
Linux user since 1995

Every time you reinvent the wheel the more square it gets (anon). If in doubt think and investigate before you ask silly questions.
timber
PostPosted: Mon Sep 26, 2016 2:53 am    Post subject: Reply with quote

Grand Master

Joined: 25 Aug 2015
Posts: 1280

Quote:
The strange thing is that yesterday, when I tried to read the input as Whole File and just copied it to the output, the special characters were passed through as-is. It's only when I try to transform the file that the special characters are getting changed.
That's not strange at all. If the flow does not change the message tree then the output will be copied from the input bitstream.
Quote:
I checked the input file and it seems to be a UTF-8 encoded file (I checked by opening the file in Notepad++), but the output file with the changed characters seems to be ANSI encoded.
Seems to be? On what basis are you asserting this? Most character encodings will look reasonably OK when interpreted as UTF-8 if you are viewing mostly ASCII characters, so the fact that Notepad++ displayed the file correctly doesn't prove very much.

You cannot make assumptions about the character encoding. You absolutely must find out what encoding the sender used when the file was written. Then specify that encoding in your message flow.
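
For example (a sketch, not a prescription), once you know what the sender wrote, you can state the character set explicitly when the output message is serialized rather than relying on defaults. 1208 is the CCSID for UTF-8; the value here is only an assumption about this particular sender:

Code:

-- Pin the character set used when the output message is serialized.
SET OutputRoot.Properties.CodedCharSetId = 1208;  -- UTF-8; use the sender's actual encoding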
rekarm01
PostPosted: Mon Sep 26, 2016 8:23 am    Post subject: Reply with quote

Grand Master

Joined: 25 Jun 2008
Posts: 1415

mhd_zabi wrote:
CCSID is 1208 and Encoding is 273 in both input and output properties.

There is also the possibility that the message flow works, but something is wrong with whatever application interprets/renders/displays the output message. For example, if NotePad++ requires a BOM in order to correctly detect UTF-8, but the message flow does not preserve the BOM when transforming the message, then NotePad++ would garble the message. Applications such as rfhutil and amqsbcg0 that can display message bytes in hexadecimal are much more useful for unambiguously examining messages.
mhd_zabi
PostPosted: Mon Sep 26, 2016 11:27 pm    Post subject: Reply with quote

Newbie

Joined: 25 Sep 2016
Posts: 7
Location: Mangalore

Thank you all for your inputs, especially rekarm01.
I explored some more based on what rekarm01 said, and below is what I found.
The input we received from the source was UTF-8 with a BOM, as I could see the BOM hex code in the input. The output file that I generated did not have this. Notepad++ still detects the encoding without the BOM and shows it as "UTF-8 without BOM", but Excel and our end application (which in this case is Siebel) are not able to detect the encoding as UTF-8, in which case they treat it as ANSI and corrupt the special characters. The BOM seems to be getting removed when I remove the header record from the input.
I was wondering whether there is some way to retain the BOM even after removing the header record.
timber
PostPosted: Tue Sep 27, 2016 1:49 am    Post subject: Reply with quote

Grand Master

Joined: 25 Aug 2015
Posts: 1280

The BOM is just a sequence of 3 bytes. You could write those 3 bytes to the file before appending the other records.

Alternatively, you could un-select 'Skip first record' and put in some logic to truncate everything except the first character of the first record (rather than skipping it completely).

Before you do either of those, you may want to ask yourself whether the BOM is actually required by the receiver.
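
A sketch of the first option, assuming a BLOB message is propagated to the FileOutput node ahead of the data records (EF BB BF is the three-byte UTF-8 BOM):

Code:

-- Emit the UTF-8 byte order mark as the first write to the file.
SET OutputRoot.Properties = InputRoot.Properties;
SET OutputRoot.BLOB.BLOB = X'EFBBBF';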
mhd_zabi
PostPosted: Thu Sep 29, 2016 1:46 am    Post subject: Reply with quote

Newbie

Joined: 25 Sep 2016
Posts: 7
Location: Mangalore

Thanks, timber.

I was able to add the BOM to the beginning of the file by converting the message to a bitstream, and it worked fine. The end application was able to read the file as UTF-8 and processed the characters correctly.
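
A possible reconstruction of that fix (the poster did not show code, so the parser domain and the ASBITSTREAM clauses here are assumptions):

Code:

-- Serialize the parsed record back to bytes, then prepend the UTF-8 BOM.
DECLARE recBytes BLOB;
SET recBytes = ASBITSTREAM(InputRoot.DFDL              -- parser domain assumed
                           CCSID InputRoot.Properties.CodedCharSetId
                           ENCODING InputRoot.Properties.Encoding);
SET OutputRoot.BLOB.BLOB = X'EFBBBF' || recBytes;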

Thank you all for your help.