moonrisesystems

DELPHI XE - ClientDataSet problem with Unicode chars in XML file

I have a simple XML file with two fields, Name and Value. I want to use this XML file like a database table, i.e. locate values by name and update values by name. I have code that does all of this using a TClientDataSet and a TXMLTransformProvider, with two .xtr files defining the 'ToDataPacket' and 'ToXML' transformations. I also have a TDataSource tied to a DBGrid to display the records. Everything works fine until I hit a name/value record whose value field contains Unicode characters. In that case the dataset returns the single-byte representation '??????' rather than the actual Unicode characters. Strangely enough, when I WRITE to the file, replacing a value with a Unicode string, the string goes into the XML file correctly, but it still is not read back and displayed correctly.

The XML file is a Unicode file with a little-endian BOM and displays correctly in IE. I have tried both including and omitting encoding="UTF-16" in the header. The ClientDataSet has the fields defined as ftWideString. The .xtr files also define the fields as 'string.uni', in both the DataPacket and the XML .xtr file.

What am I missing?  There is probably a simple answer to this, but you know how things are after staring at something for a while...!

Emmanuel PASQUIER

If you can successfully write your Unicode data, then the problem is not in the TClientDataSet but somewhere between it and the display. Maybe try a different DBGrid component (there is a good one in the Jedi library, JVCL).

Or can you post your application and XML file here, so that we can look at it?

rinfo

You need to add this in the uses clause
{$IFNDEF UNICODE}
uses SwSystem;
{$ENDIF}

and procedure to call would be something like this
{$IFDEF UNICODE}
  CDS.LoadFromFile(GetCurrentDir + '\CDS.XML');
{$ELSE}
  CDS.LoadFromFile(gsAppPath + 'CDS.XML');
{$ENDIF}

{$IFDEF UNICODE}
  CDS.SaveToFile(GetCurrentDir + '\CDS.XML', dfXML);
{$ELSE}
  CDS.SaveToFile(gsAppPath + 'CDS.XML', dfXML);
{$ENDIF}

ref : http://docwiki.embarcadero.com/CodeExamples/en/ClientDataSet_%28Delphi%29

ASKER

The compiler directive UNICODE is defined by default in XE, which is what I want anyway so this is not the issue.

Also, I get the same result using TJvDBGrid.  The problem is definitely in the CDS (or the XMLTransFormProvider) somewhere.  I am not planning to use the data in any data aware controls, simply obtaining the values, which in my app represent settings.

To recreate this is a bit involved and requires creating xtr files using the XMLMapper utility which only comes with the enterprise version unfortunately.  I have included the contents of those files below.
You need a form with the ClientDataSet and XMLTransformProvider components filled out with the items below.  Once the data is set, a simple button with the code below will allow you to recreate this.  I hope I don't have typos here...!

1) Here is a sample of my xml file - saved as a unicode file (v6settings.xml):
<?xml version="1.0" ?> 
<v60netstopdata>
<nssettings>
  <ns_name>SettingNumber1</ns_name> 
  <ns_type>string</ns_type> 
  <ns_value>ValueNumber1</ns_value> 
  </nssettings>
<nssettings>
  <ns_name>SettingNumber2</ns_name> 
  <ns_type>string</ns_type> 
  <ns_value>¿¿¿¿</ns_value> 
  </nssettings>
<nssettings>
  <ns_name>SettingNumber3</ns_name> 
  <ns_type>string</ns_type> 
  <ns_value>ValueNumber3</ns_value> 
  </nssettings>
</v60netstopdata>

Note SettingNumber2 has a Unicode value.

(Also note: the ns_type field is used by my program and is not part of the problem here...)

All I am attempting to do is 1) locate a specific setting and 2) retrieve its value:
if ClientDataSet1.Locate('ns_name','SettingNumber2',[loCaseInsensitive]) then
  SomeVariable := ClientDataSet1.FieldValues['ns_value'];

This works for everything in my real XML file EXCEPT for Unicode values. The return for this example value is '????'.

When I go to change a value however it works fine - it correctly replaces the ns_value with the unicode string.
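if ClientDataSet1.Locate('ns_name','SettingNumber2',[loCaseInsensitive]) then
begin
  ClientDataSet1.Edit;
  ClientDataSet1.FieldValues['ns_value'] := '¿¿¿¿';
  ClientDataSet1.Post;
  // Push the change through the provider so it reaches the XML file.
  ClientDataSet1.ApplyUpdates(0);
end;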

The ClientDataSet has the fields defined as ftWideString, and the .xtr files list the datatypes as string.uni. Also, the ClientDataSet's IndexFieldNames is set to ns_name.

Here is the TransFormRead xtr file contents: (v6settingsToDP.xtr)
<XmlTransformation Version="1.0"><Transform Direction="ToCds" DataEncoding="UTF-16"><SelectEach dest="DATAPACKET\ROWDATA\ROW" from="\v60netstopdata\nssettings"><Select dest="@ns_name" from="\ns_name"/><Select dest="@ns_type" from="\ns_type"/><Select dest="@ns_value" from="\ns_value"/></SelectEach></Transform><XmlSchema RootName="v60netstopdata"><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="v60netstopdata" type="v60netstopdataType"/>
  <xs:complexType name="v60netstopdataType">
    <xs:sequence>
      <xs:element name="nssettings" type="nssettingsType" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="nssettings" type="nssettingsType"/>
  <xs:complexType name="nssettingsType">
    <xs:sequence>
      <xs:element name="ns_name" type="ns_nameType"/>
      <xs:element name="ns_type" type="ns_typeType"/>
      <xs:element name="ns_value" type="ns_valueType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="ns_name" type="ns_nameType"/>
  <xs:simpleType name="ns_nameType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_type" type="ns_typeType"/>
  <xs:simpleType name="ns_typeType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_value" type="ns_valueType"/>
  <xs:simpleType name="ns_valueType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
</xs:schema>]]></XmlSchema><CdsSkeleton/><XslTransform/><Skeleton><![CDATA[<?xml version="1.0"?><DATAPACKET Version="2.0"><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/></DATAPACKET>
]]></Skeleton></XmlTransformation>


Here is the TransFormWrite xtr file contents (v6settingsToXML.xtr)
<XmlTransformation Version="1.0"><Transform Direction="ToXml" DataEncoding="UTF-16"><SelectEach from="DATAPACKET\ROWDATA\ROW" dest="\v60netstopdata\nssettings"><Select from="@ns_name" dest="\ns_name"/><Select from="@ns_type" dest="\ns_type"/><Select from="@ns_value" dest="\ns_value"/></SelectEach></Transform><XmlSchema RootName="v60netstopdata"><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="ns_name" type="ns_nameType"/>
  <xs:simpleType name="ns_nameType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_type" type="ns_typeType"/>
  <xs:simpleType name="ns_typeType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_value" type="ns_valueType"/>
  <xs:simpleType name="ns_valueType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="nssettings" type="nssettingsType"/>
  <xs:complexType name="nssettingsType">
    <xs:sequence>
      <xs:element name="ns_name" type="ns_nameType"/>
      <xs:element name="ns_type" type="ns_typeType"/>
      <xs:element name="ns_value" type="ns_valueType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="v60netstopdata" type="v60netstopdataType"/>
  <xs:complexType name="v60netstopdataType">
    <xs:sequence>
      <xs:element name="nssettings" type="nssettingsType" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>]]></XmlSchema><CdsSkeleton><![CDATA[<DATAPACKET Version="2.0"><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/></DATAPACKET>
]]></CdsSkeleton><XslTransform/><Skeleton><![CDATA[<?xml version="1.0"?>
<v60netstopdata><nssettings><ns_name></ns_name><ns_type></ns_type><ns_value></ns_value></nssettings></v60netstopdata>
]]></Skeleton></XmlTransformation>


Finally, I also just confirmed that this issue exists in XE2 as well.

Phew... what a storm you are raising just to read XML values...

Have you considered simply using XML components: load your file and browse the XML tree to get the values?

I use a good one, much more efficient than the crappy MS XML COM object:
OpenXML: http://philo.de/xml/
The utility library it needs is also part of Jedi, so you don't need to download it again.
It is fast, memory-savvy and very easy to use, all compared to MSXML, which is a big ugly pig.
The only problem: the documentation is a bit outdated. But you can find tutorials that you can easily adapt to the new methods.
I think I only needed to recompile it to use it with XE. I can't remember whether I had to make any adjustments, but if I did, they were quickly sorted out or I would remember.

Try it for size, you won't regret it.

If all I was trying to do was read a few XML values, I would agree that this is overkill. My XML file contains over 4500 records and I need to use it like a standard database table, meaning indexed lookups (browsing a tree likely won't perform well enough), reads and writes. All of this works just fine (and quickly) as it is, with the exception of a read when a Unicode value is present.

The more I work with this, the more it appears to be a bug. In the XMLMapper.exe utility, when you open an XML file and define the datatype of each field, it correctly displays Unicode values for the field in the Node Properties - Sample Values field. But when you click on the Mapping tab, select 'XML to DataPacket' and then click 'Create and Test Transformation', the datapacket values show ???? rather than the Unicode characters. If you select 'DataPacket to XML' and then click 'Create and Test Transformation', the values shown are correct.

ASKER CERTIFIED SOLUTION
Emmanuel PASQUIER
This solution is only available to members.

I have placed an incident report with Embarcadero on this issue and, sure enough, they have confirmed that there is a problem in the way TClientDataSet / TXMLTransformProvider handles Unicode values. It is supposed to work as I had expected and coded for.

It looks like there is no correct answer to my specific question, but I will spend some time looking at OpenXML as suggested and will post back my results.

Hi! I'm just doing a little "after-sales service" on my advice about OpenXML. Did you have any luck with it?

SOLUTION
This solution is only available to members.

The solution was a suggestion to try a specific, different approach which I had not thought of. No specific steps or code were provided; I did the research and developed the code for the solution myself. But the approach was absolutely the way to go.
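
Since no code for that approach appears in the thread, here is a minimal sketch of what a parser-based workaround could look like. It is not the asker's code, and it uses the IXMLDocument wrapper that ships with Delphi rather than OpenXML (whose API is not shown in this thread). The unit name SettingsLoader and the procedure LoadSettings are hypothetical, the field sizes are simply copied from the WIDTH values in the .xtr files, and the dataset is assumed to be a plain TClientDataSet with no provider attached and no persistent fields. The idea is to read the XML with a parser and fill an in-memory ClientDataSet, so that indexed Locate calls still work without going through TXMLTransformProvider.

```delphi
unit SettingsLoader; // hypothetical unit name

interface

uses
  // Unit names as in XE; XE2 and later also accept the scoped names
  // Data.DB, Datasnap.DBClient, Xml.XMLIntf and Xml.XMLDoc.
  DB, DBClient, XMLIntf, XMLDoc;

// Hypothetical helper: fills CDS with one row per <nssettings> element.
procedure LoadSettings(const FileName: string; CDS: TClientDataSet);

implementation

procedure LoadSettings(const FileName: string; CDS: TClientDataSet);
var
  Doc: IXMLDocument;
  Root, Rec: IXMLNode;
  I: Integer;
begin
  // Build the table structure in memory (sizes are an assumption).
  CDS.Close;
  CDS.FieldDefs.Clear;
  CDS.FieldDefs.Add('ns_name', ftWideString, 110);
  CDS.FieldDefs.Add('ns_type', ftString, 8);
  CDS.FieldDefs.Add('ns_value', ftWideString, 2390);
  CDS.CreateDataSet;
  CDS.IndexFieldNames := 'ns_name';

  // Let the XML parser handle the file encoding.
  Doc := LoadXMLDocument(FileName);
  Root := Doc.DocumentElement;
  for I := 0 to Root.ChildNodes.Count - 1 do
  begin
    Rec := Root.ChildNodes[I];
    // Skip anything that is not a settings record (e.g. whitespace nodes).
    if Rec.NodeName = 'nssettings' then
    begin
      CDS.Append;
      CDS.FieldByName('ns_name').AsString := Rec.ChildNodes['ns_name'].Text;
      CDS.FieldByName('ns_type').AsString := Rec.ChildNodes['ns_type'].Text;
      CDS.FieldByName('ns_value').AsString := Rec.ChildNodes['ns_value'].Text;
      CDS.Post;
    end;
  end;
end;

end.
```

After LoadSettings('v6settings.xml', ClientDataSet1), the Locate call shown earlier in the thread can be used unchanged. Writing changes back to the XML file would need a matching save routine, which is not sketched here. With the default MSXML DOM vendor, a console program or worker thread would also need to call CoInitialize first.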

Hi all,

In reply to the question asked by moonrisesystems:

Open the unit DSIntf.pas and find the function
function StringToVariantArray(const S: RawByteString): OleVariant;
and change it to
function StringToVariantArray(const S: UTF8String): OleVariant;
This works 100% for me.

Best regards,
ambako_georgia

P.S. Also open the unit Xmlxform.pas, find the line
Result := DSIntf.StringToVariantArray(AnsiString(S));
and change it to
Result := DSIntf.StringToVariantArray(S);

Then delete DSIntf.dcu and Xmlxform.dcu and rebuild your projects.