Hi All,
Linux, OE 11.4; CPINTERNAL is 8859-1.
I'm not sure if anyone has encountered the following before:
Webservice with an output longchar that returns a UTF-8 XML string (allegedly).
When the XML contains any accented letters or certain other chars, a .NET client calling the webservice can't parse the XML.
If I write a test program I get data; however, if I COPY-LOB the UTF-8 longchar to disk it's also mangled. If I repeat the same with NO-CONVERT, the XML appears valid and can be opened as a valid XML file from disk.
The following highlights the issue I'm having:
1. This, when opened in IE, XML Notepad, etc., is perfectly valid:
DEFINE TEMP-TABLE ttData NO-UNDO
FIELD MyValue AS CHARACTER.
CREATE ttData.
ttData.MyValue = "Annie Kotzé".
FOR EACH ttData:
MESSAGE
ttData.MyValue
VIEW-AS ALERT-BOX INFO BUTTONS OK.
END.
TEMP-TABLE ttData:WRITE-XML("FILE","C:\temp\out.xml",TRUE,"UTF-8").
2. This writes "improper" text to disk (an implicit conversion, I'm guessing).
Replace TEMP-TABLE ttData:WRITE-XML("FILE","C:\temp\out.xml",TRUE,"UTF-8"). with:
DEFINE VARIABLE cData as LONGCHAR NO-UNDO.
FIX-CODE-PAGE(cData) = "UTF-8".
TEMP-TABLE ttData:WRITE-XML("LONGCHAR",cData,TRUE,"UTF-8").
copy-lob cData to file "c:\temp\out.xml".
3. Same as 2 above, but change copy-lob cData to file "c:\temp\out.xml". to copy-lob cData to file "c:\temp\out.xml" NO-CONVERT. This is now "valid".
So I'm thinking that there is some kind of implicit conversion happening between the procedure's output longchar and the adapter / Tomcat?
Any ideas?
Thanks in advance :)
The "implicit conversion" is the same implicit conversion if you use OUTPUT TO "foo.txt". If you don't tell the AVM what the target codepage is, it will use -cpstream. The COPY-LOB statement (and the LONGCHAR itself) has no knowledge that the data in the LONGCHAR is an XML document with an encoding declaration. The document contains the text 'encoding="UTF-8"', but that is just a string of text to the LONGCHAR. COPY-LOB is doing what you asked it to do: copy the LONGCHAR of a given codepage to a file. Since there is no CONVERT TARGET CODEPAGE on your COPY-LOB, it is implicitly converting to -cpstream.
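To make the effect concrete, here is a small byte-level illustration. It is written in Python rather than ABL, purely to show what happens to the bytes, and it assumes -cpstream is ISO8859-1 (the thread only states CPINTERNAL, so that value is a guess). The XML string and the helper name is_valid_utf8 are made up for the example.

```python
# Illustration only (Python, not ABL): what an implicit conversion to a
# single-byte stream codepage does to a document that declares UTF-8.
# Assumption: -cpstream is ISO8859-1; the thread never states its value.

# "\u00e9" is the accented e from the sample value "Annie Kotzé".
xml_text = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<MyValue>Annie Kotz\u00e9</MyValue>"
)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Like COPY-LOB ... NO-CONVERT: the UTF-8 bytes reach the file untouched.
no_convert_bytes = xml_text.encode("utf-8")

# Like COPY-LOB with no CONVERT phrase: the text is converted to the
# (assumed) stream codepage, so the accented e becomes the single byte 0xE9.
converted_bytes = xml_text.encode("iso-8859-1")

print(is_valid_utf8(no_convert_bytes))  # True
print(is_valid_utf8(converted_bytes))   # False
```

The second file still says encoding="UTF-8" in its declaration, but the byte 0xE9 on its own is not a valid UTF-8 sequence, which is why a parser rejects it. On the ABL side, the two ways out are the ones already named: NO-CONVERT, or a CONVERT TARGET CODEPAGE "UTF-8" phrase on the COPY-LOB.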
Hi Garry,
I guessed as much with the COPY-LOB, but it seems (if I delete the odd chars out of the source db tables) that there is another implicit conversion going on between the procedure and the webservice adapter/Tomcat too (which is what I was trying to imply with my examples). If there _is_ an implicit conversion going on, is there any way to make it not happen? :)
Sorry, I am not sure I make the connection between what your samples show and the "implicit conversion going on between the procedure and the webservice adapter/tomcat". If I understand you correctly, you have an OE web service that is generating an XML response, and this XML response is malformed because the encoding declaration states the document is UTF-8 but it contains malformed UTF-8 chars. Is this correct? And the malformed XML looks the same as the output from your code sample above. Is this correct?
One of the webservice's outputs is a string field. The dataset is written to a longchar, and the longchar is passed back out. It might be completely coincidental, but when chars such as é are in the output, only those chars are mangled. When those chars are removed, a straight COPY-LOB to disk produces parseable XML, and the caller of the webservice also succeeds.
It sounds like the same conversion is happening, which is why the character is mangled. Is it a specific subset of accented characters that is affected, or are all non-ASCII chars affected?
Can you provide a code sample of what is happening in your web service? From your description, you have:
procedure wsproc:
define output parameter cout as longchar no-undo.
...
hds:write-xml("LONGCHAR",cout,true,"UTF-8").
Is this correct?
I would suggest logging a call with TS with a reproducible case to get a better understanding of where the conversion is happening. I don't believe this is a bug; I just can't put my finger on the exact place this conversion is happening. It might be your -cpstream again, converting the data before it is sent to the WSA. Is it possible to try a simple case with -cpstream UTF-8 on the appserver agents?
Have a look at these KB entries:
knowledgebase.progress.com/.../000037525
knowledgebase.progress.com/.../000038017
I did not analyse your case, but at first glance it may have something in common with them.