• Status: Solved

DELPHI XE - ClientDataSet problem with Unicode chars in XML file

I have a simple XML file with two fields, Name and Value. I want to use this XML file like a database table, i.e. locate values by name and update values by name as well. I have code that does all of this using a TClientDataSet and a TXMLTransformProvider, with two .xtr files defining the 'ToDataPacket' and 'ToXML' transformations. I also have a TDataSource tied to a DBGrid to display the records. Everything works fine until I hit a name/value record in which the value field contains Unicode characters. In this case, the dataset returns the single-byte representation '??????' rather than the actual Unicode characters. Strangely enough, when I WRITE to the file, replacing a value with a Unicode string, it goes into the XML file correctly, but it still will not be read back and displayed correctly.

The XML file is a Unicode file with a little-endian BOM and displays correctly in IE. I have tried both including and omitting encoding="UTF-16" in the header. The ClientDataSet has the fields defined as ftWideString. The .xtr files also define the fields as 'string.uni' in both the DataPacket and XML transformations.

What am I missing?  There is probably a simple answer to this, but you know how things are after staring at something for a while...!

Asked by: moonrisesystems
 
epasquier Commented:
If you can successfully write your Unicode data, then the problem is not in the TClientDataSet but somewhere between it and the display. Try a different DBGrid component (there is a good one in the JEDI JVCL library).

Or can you post your application and XML file here, so that we can look at them?
 
rinfo Commented:
You need to add this to the uses clause:
{$IFNDEF UNICODE}
uses SwSystem;
{$ENDIF}

and the calls to load and save would look something like this:
{$IFDEF UNICODE}
  CDS.LoadFromFile(GetCurrentDir + '\CDS.XML');
{$ELSE}
  CDS.LoadFromFile(gsAppPath + 'CDS.XML');
{$ENDIF}

{$IFDEF UNICODE}
  CDS.SaveToFile(GetCurrentDir + '\CDS.XML', dfXML);
{$ELSE}
  CDS.SaveToFile(gsAppPath + 'CDS.XML', dfXML);
{$ENDIF}

Ref: http://docwiki.embarcadero.com/CodeExamples/en/ClientDataSet_%28Delphi%29
 
moonrisesystems (Author) Commented:
The compiler directive UNICODE is defined by default in XE, which is what I want anyway so this is not the issue.

Also, I get the same result using TJvDBGrid.  The problem is definitely in the CDS (or the XMLTransFormProvider) somewhere.  I am not planning to use the data in any data aware controls, simply obtaining the values, which in my app represent settings.

To recreate this is a bit involved and requires creating xtr files using the XMLMapper utility which only comes with the enterprise version unfortunately.  I have included the contents of those files below.
You need a form with the ClientDataSet and XMLTransformProvider components filled out with the items below.  Once the data is set, a simple button with the code below will allow you to recreate this.  I hope I don't have typos here...!

1) Here is a sample of my xml file - saved as a unicode file (v6settings.xml):
<?xml version="1.0" ?> 
<v60netstopdata>
<nssettings>
  <ns_name>SettingNumber1</ns_name> 
  <ns_type>string</ns_type> 
  <ns_value>ValueNumber1</ns_value> 
  </nssettings>
<nssettings>
  <ns_name>SettingNumber2</ns_name> 
  <ns_type>string</ns_type> 
  <ns_value>¿¿¿¿</ns_value> 
  </nssettings>
<nssettings>
  <ns_name>SettingNumber3</ns_name> 
  <ns_type>string</ns_type> 
  <ns_value>ValueNumber3</ns_value> 
  </nssettings>
</v60netstopdata>


Note SettingNumber2 has a Unicode value.

(Also note: the ns_type field is used by my program and is not part of the problem here...)

All I am attempting to do is 1) locate a specific setting and 2) retrieve its value:
if ClientDataSet1.Locate('ns_name','SettingNumber2',[loCaseInsensitive]) then
  SomeVariable := ClientDataSet1.FieldValues['ns_value'];


This works for everything in my real XML file EXCEPT for Unicode values. The return for this example value is '????'.

When I go to change a value, however, it works fine: it correctly replaces the ns_value with the Unicode string.
if ClientDataSet1.Locate('ns_name','SettingNumber2',[loCaseInsensitive]) then
begin
  ClientDataSet1.Edit;
  ClientDataSet1.FieldValues['ns_value'] := '¿¿¿¿';
  ClientDataSet1.Post;
  ClientDataSet1.ApplyUpdates(0);  // push the change through the provider to the XML file
end;


The ClientDataSet has the fields defined as ftWideString, and the .xtr files have the datatypes listed as string.uni. Also, the ClientDataSet's IndexFieldNames property is set to ns_name.

Here is the TransFormRead xtr file contents: (v6settingsToDP.xtr)
<XmlTransformation Version="1.0"><Transform Direction="ToCds" DataEncoding="UTF-16"><SelectEach dest="DATAPACKET\ROWDATA\ROW" from="\v60netstopdata\nssettings"><Select dest="@ns_name" from="\ns_name"/><Select dest="@ns_type" from="\ns_type"/><Select dest="@ns_value" from="\ns_value"/></SelectEach></Transform><XmlSchema RootName="v60netstopdata"><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="v60netstopdata" type="v60netstopdataType"/>
  <xs:complexType name="v60netstopdataType">
    <xs:sequence>
      <xs:element name="nssettings" type="nssettingsType" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="nssettings" type="nssettingsType"/>
  <xs:complexType name="nssettingsType">
    <xs:sequence>
      <xs:element name="ns_name" type="ns_nameType"/>
      <xs:element name="ns_type" type="ns_typeType"/>
      <xs:element name="ns_value" type="ns_valueType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="ns_name" type="ns_nameType"/>
  <xs:simpleType name="ns_nameType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_type" type="ns_typeType"/>
  <xs:simpleType name="ns_typeType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_value" type="ns_valueType"/>
  <xs:simpleType name="ns_valueType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
</xs:schema>]]></XmlSchema><CdsSkeleton/><XslTransform/><Skeleton><![CDATA[<?xml version="1.0"?><DATAPACKET Version="2.0"><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/></DATAPACKET>
]]></Skeleton></XmlTransformation>



Here is the TransFormWrite xtr file contents (v6settingsToXML.xtr)
<XmlTransformation Version="1.0"><Transform Direction="ToXml" DataEncoding="UTF-16"><SelectEach from="DATAPACKET\ROWDATA\ROW" dest="\v60netstopdata\nssettings"><Select from="@ns_name" dest="\ns_name"/><Select from="@ns_type" dest="\ns_type"/><Select from="@ns_value" dest="\ns_value"/></SelectEach></Transform><XmlSchema RootName="v60netstopdata"><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="ns_name" type="ns_nameType"/>
  <xs:simpleType name="ns_nameType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_type" type="ns_typeType"/>
  <xs:simpleType name="ns_typeType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_value" type="ns_valueType"/>
  <xs:simpleType name="ns_valueType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="nssettings" type="nssettingsType"/>
  <xs:complexType name="nssettingsType">
    <xs:sequence>
      <xs:element name="ns_name" type="ns_nameType"/>
      <xs:element name="ns_type" type="ns_typeType"/>
      <xs:element name="ns_value" type="ns_valueType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="v60netstopdata" type="v60netstopdataType"/>
  <xs:complexType name="v60netstopdataType">
    <xs:sequence>
      <xs:element name="nssettings" type="nssettingsType" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>]]></XmlSchema><CdsSkeleton><![CDATA[<DATAPACKET Version="2.0"><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/></DATAPACKET>
]]></CdsSkeleton><XslTransform/><Skeleton><![CDATA[<?xml version="1.0"?>
<v60netstopdata><nssettings><ns_name></ns_name><ns_type></ns_type><ns_value></ns_value></nssettings></v60netstopdata>
]]></Skeleton></XmlTransformation>



Finally - I also just confirmed that this issue exists in XE2 also...
 
epasquier Commented:
Phew... that is a lot of machinery just to read XML values.

Have you considered simply using an XML component: load your file and walk the XML tree to get the values?

I use a good one, much more efficient than the MSXML COM object:
OpenXML: http://philo.de/xml/
The utility library it needs is also part of JEDI, so you don't need to download it again.
It is fast, light on memory, and very easy to use, all of this compared to MSXML, which is big and heavy.
The only problem is that the documentation is a bit outdated, but you can find tutorials that are easy to adapt to the new methods.
I think I only needed to recompile it to use it with XE. I can't remember whether I had to make any adjustments, but if I did, they were sorted out quickly or I would remember.

Try it for size, you won't regret it.
 
moonrisesystems (Author) Commented:
If all I was trying to do was read a few XML values I agree that this is overkill.  My xml file contains over 4500 records and I need to use it like a standard database table meaning indexed lookups (browsing a tree likely won't perform well enough), Reads and Writes.  All of this works just fine (and quickly) as it is with the exception of a Read when a Unicode value is present.  

The more I work with this, the more it appears to be a bug. In the XMLMapper.exe utility, when you open an XML file and define the datatype of each field, it correctly displays Unicode values for the field in the Node Properties - Sample Values field. But when you click on the Mapping tab, select 'XML to DataPacket', and then click 'Create and Test Transformation', the datapacket values show ???? instead of the Unicode characters. If you select 'DataPacket to XML' and then click 'Create and Test Transformation', the values shown are correct.
 
epasquier Commented:
4500 records are nothing for OpenXML.
I use it to load and manipulate 50 MB XML files, and it does so in a few (dozen) seconds. Well, it's 50 MB after all, so you have to expect it to take a little time.

Load your data into an internal data structure (hash table or dictionary) -> less than 1 s.
Access the data from your dictionary (almost instantaneous for only 4500 records with a hash) and write it back only once, when you need to (again less than 1 s).
And the memory footprint of your internal data will be minimal.

XML is an EXCHANGE format. It is a very poor way of storing internal data.
 
moonrisesystems (Author) Commented:
I have filed an incident report with Embarcadero on this issue, and sure enough they have confirmed that there is a problem in the way TClientDataSet / TXMLTransformProvider handles Unicode values. It is supposed to work the way I had expected and coded for.

It looks like there is no correct answer to my specific question, but I will spend some time looking at OpenXML as suggested and will post back my results.
 
epasquier Commented:
Hi! I'm just doing a little "after-sales service" for my advice about OpenXML. Did you have any luck with it?
 
moonrisesystems (Author) Commented:
Hi! I was going to respond today, so I am glad you reminded me. It took me a while to get back to this because I was pulled in another direction for a few days...

I did not get too far with OpenXML, mainly due to time constraints and the lack of documentation. I did manage to determine that using TXMLDocument was way too slow for my requirements. But one suggestion that you made in your advice, 'Load your data into an internal data structure (hash table or dictionary)', gave me just enough of a clue to solve my problem, and it performs blazingly fast as well!

With my original code I was attempting to use the XML file itself as the data structure, which is why I needed the .xtr transformation files for the TClientDataSet to work properly. Now that I load the data into a TDictionary, loading the entire data file and accessing specific records is instantaneous, and the code to make it all work is much simpler.

So I formally accept your suggestion as the solution to my problem and award you the points! Thanks for the advice!

As for the TClientDataSet problem, I filed a report with QC at Embarcadero, and after going back and forth a few times they have confirmed the issue and found the problem code. It is still listed as OPEN, but I would think that the fix will be in the next update...
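
For readers who want to try the same approach, here is a minimal sketch of loading the settings file into a TDictionary. It is not the code actually used in the solution, which was not posted. The function name LoadSettings is hypothetical, and parsing with LoadXMLDocument (from the XMLDoc unit) is an assumption chosen only to keep the example short; any Unicode-aware parser could fill the dictionary instead. It also assumes a VCL application, where COM is already initialised for the default MSXML parser.

```delphi
uses
  SysUtils, Variants, Generics.Collections, XMLIntf, XMLDoc;

// Hypothetical helper: reads every <nssettings> element of the file into a
// dictionary keyed by ns_name. The caller owns (and must free) the result.
function LoadSettings(const FileName: string): TDictionary<string, string>;
var
  Doc: IXMLDocument;
  Root, Node: IXMLNode;
  I: Integer;
begin
  Result := TDictionary<string, string>.Create;
  try
    Doc := LoadXMLDocument(FileName);  // assumption: the parser reads the BOM/encoding
    Root := Doc.DocumentElement;       // <v60netstopdata>
    for I := 0 to Root.ChildNodes.Count - 1 do
    begin
      Node := Root.ChildNodes[I];
      if Node.NodeName = 'nssettings' then
        // VarToStr turns an empty (Null) element value into ''
        Result.AddOrSetValue(VarToStr(Node.ChildValues['ns_name']),
                             VarToStr(Node.ChildValues['ns_value']));
    end;
  except
    Result.Free;  // do not leak the dictionary if loading fails
    raise;
  end;
end;
```

A lookup is then a single call such as Settings.TryGetValue('SettingNumber2', SomeVariable). Note that keys in this sketch are case-sensitive, unlike the loCaseInsensitive Locate used earlier in the thread.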
 
moonrisesystems (Author) Commented:
The solution was a suggestion to try a specific, different approach that I had not thought of. No specific steps or code were provided; I did the research and developed the code for the solution. But the approach was absolutely the way to go.
 
ambako_georgia Commented:
Hi all, in reply to the question asked by moonrisesystems:

Open unit DSIntf.pas and find the function
function StringToVariantArray(const S: RawByteString): OleVariant;
and change its declaration to
function StringToVariantArray(const S: UTF8String): OleVariant;
This works 100%.

Best regards,
ambako_georgia
 
ambako_georgia Commented:
P.S. Also open unit Xmlxform.pas and find the line
Result := DSIntf.StringToVariantArray(AnsiString(S));
and change it to
Result := DSIntf.StringToVariantArray(S);

Then delete DSIntf.dcu and Xmlxform.dcu and rebuild the project.
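
A small console sketch may help show why the AnsiString(S) cast in that line is a plausible source of the '????' values. It is only an illustration of the two string conversions, not the RTL code itself; the program name and the sample characters are made up for the example.

```delphi
program CastDemo;
{$APPTYPE CONSOLE}
uses
  SysUtils;
var
  Original: string;      // UnicodeString in Delphi 2009 and later
  ViaAnsi: AnsiString;
  ViaUtf8: UTF8String;
begin
  Original := #$65E5#$672C#$8A9E;   // three CJK characters, given as code points
  ViaAnsi := AnsiString(Original);  // converts to the system ANSI code page
  ViaUtf8 := UTF8String(Original);  // converts to UTF-8, which can hold any character
  // On a code page that lacks these characters the ANSI round trip fails,
  // because each unmappable character is replaced by '?'.
  Writeln('ANSI round trip intact:  ', string(ViaAnsi) = Original);
  Writeln('UTF-8 round trip intact: ', string(ViaUtf8) = Original);
end.
```

The ANSI result depends on the machine's code page, while the UTF-8 round trip preserves the text, which is consistent with the change to a UTF8String parameter suggested above.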