Solved

DELPHI XE - ClientDataSet problem with Unicode chars in XML file

Posted on 2011-09-09
Last Modified: 2012-05-12
I have a simple XML file with 2 fields: Name and Value. I want to use this XML file like a database table, i.e. locate values based on name and update values based on name as well. I have code that does all of this using a TClientDataSet and an XMLTransformProvider, with two .xtr files defining the 'ToDataPacket' and 'ToXML' transformations. I also have a TDataSource tied to a DB Grid to display the records. Everything works fine until I hit a name/value record in which the value field contains Unicode characters. In this case, the TDataSet returns the single-byte representation '??????' rather than the actual Unicode chars. Strangely enough, when I WRITE to the file, replacing a value with a Unicode string, it correctly goes into the XML file, but it still will not be 'read' out and displayed correctly.

The XML file is a Unicode file with a little-endian BOM and displays correctly in IE. I have tried both including and not including 'encoding="UTF-16"' in the header. The ClientDataSet has the fields defined as ftWideString. The .xtr files also define the fields as 'string.uni', both in the DataPacket and the XML .xtr file.

What am I missing?  There is probably a simple answer to this, but you know how things are after staring at something for a while...!

Question by:moonrisesystems
 
LVL 25

Expert Comment

by:epasquier
ID: 36515933
If you can successfully write your Unicode data, then the problem is not in the TClientDataSet but somewhere between it and the display. Maybe try a different DBGrid component (there is a good one in the Jedi JVCL library).

Or can you post your application and XML file here, so that we can look at it?
 
LVL 9

Expert Comment

by:rinfo
ID: 36515974
You need to add this in the uses clause
{$IFNDEF UNICODE}
uses SwSystem;
{$ENDIF}

and procedure to call would be something like this
{$IFDEF UNICODE}
  CDS.LoadFromFile(GetCurrentDir + '\CDS.XML');
{$ELSE}
  CDS.LoadFromFile(gsAppPath + 'CDS.XML');
{$ENDIF}

{$IFDEF UNICODE}
  CDS.SaveToFile(GetCurrentDir + '\CDS.XML', dfXML);
{$ELSE}
  CDS.SaveToFile(gsAppPath + 'CDS.XML', dfXML);
{$ENDIF}

ref : http://docwiki.embarcadero.com/CodeExamples/en/ClientDataSet_%28Delphi%29

 
LVL 1

Author Comment

by:moonrisesystems
ID: 36517328
The compiler directive UNICODE is defined by default in XE, which is what I want anyway so this is not the issue.

Also, I get the same result using TJvDBGrid.  The problem is definitely in the CDS (or the XMLTransFormProvider) somewhere.  I am not planning to use the data in any data aware controls, simply obtaining the values, which in my app represent settings.

To recreate this is a bit involved and requires creating xtr files using the XMLMapper utility which only comes with the enterprise version unfortunately.  I have included the contents of those files below.
You need a form with the ClientDataSet and XMLTransformProvider components filled out with the items below.  Once the data is set, a simple button with the code below will allow you to recreate this.  I hope I don't have typos here...!

1) Here is a sample of my xml file - saved as a unicode file (v6settings.xml):
<?xml version="1.0" ?> 
<v60netstopdata>
<nssettings>
  <ns_name>SettingNumber1</ns_name> 
  <ns_type>string</ns_type> 
  <ns_value>ValueNumber1</ns_value> 
  </nssettings>
<nssettings>
  <ns_name>SettingNumber2</ns_name> 
  <ns_type>string</ns_type> 
  <ns_value>¿¿¿¿</ns_value> 
  </nssettings>
<nssettings>
  <ns_name>SettingNumber3</ns_name> 
  <ns_type>string</ns_type> 
  <ns_value>ValueNumber3</ns_value> 
  </nssettings>
</v60netstopdata>


Note SettingNumber2 has a Unicode value.

(Also note: the ns_type field is used by my program and is not part of the problem here...)

All I am attempting to do is 1) locate a specific setting and 2) retrieve its value:
if ClientDataSet1.Locate('ns_name','SettingNumber2',[loCaseInsensitive]) then
  SomeVariable := ClientDataSet1.FieldValues['ns_value'];


This works for everything in my real XML file EXCEPT for Unicode values. The return for this example value is '????'.

When I go to change a value, however, it works fine: it correctly replaces the ns_value with the Unicode string.
if ClientDataSet1.Locate('ns_name','SettingNumber2',[loCaseInsensitive]) then
begin
  ClientDataSet1.Edit;
  ClientDataSet1.FieldValues['ns_value'] := '¿¿¿¿';
  ClientDataSet1.Post;
  ClientDataSet1.ApplyUpdates(0);
end;


The ClientDataSet has the fields defined as ftWideString, and the .xtr files have the datatypes listed as string.uni. Also, the ClientDataSet's IndexFieldNames is set to ns_name.

Here is the TransFormRead xtr file contents: (v6settingsToDP.xtr)
<XmlTransformation Version="1.0"><Transform Direction="ToCds" DataEncoding="UTF-16"><SelectEach dest="DATAPACKET\ROWDATA\ROW" from="\v60netstopdata\nssettings"><Select dest="@ns_name" from="\ns_name"/><Select dest="@ns_type" from="\ns_type"/><Select dest="@ns_value" from="\ns_value"/></SelectEach></Transform><XmlSchema RootName="v60netstopdata"><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="v60netstopdata" type="v60netstopdataType"/>
  <xs:complexType name="v60netstopdataType">
    <xs:sequence>
      <xs:element name="nssettings" type="nssettingsType" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="nssettings" type="nssettingsType"/>
  <xs:complexType name="nssettingsType">
    <xs:sequence>
      <xs:element name="ns_name" type="ns_nameType"/>
      <xs:element name="ns_type" type="ns_typeType"/>
      <xs:element name="ns_value" type="ns_valueType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="ns_name" type="ns_nameType"/>
  <xs:simpleType name="ns_nameType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_type" type="ns_typeType"/>
  <xs:simpleType name="ns_typeType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_value" type="ns_valueType"/>
  <xs:simpleType name="ns_valueType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
</xs:schema>]]></XmlSchema><CdsSkeleton/><XslTransform/><Skeleton><![CDATA[<?xml version="1.0"?><DATAPACKET Version="2.0"><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/></DATAPACKET>
]]></Skeleton></XmlTransformation>



Here is the TransFormWrite xtr file contents (v6settingsToXML.xtr)
<XmlTransformation Version="1.0"><Transform Direction="ToXml" DataEncoding="UTF-16"><SelectEach from="DATAPACKET\ROWDATA\ROW" dest="\v60netstopdata\nssettings"><Select from="@ns_name" dest="\ns_name"/><Select from="@ns_type" dest="\ns_type"/><Select from="@ns_value" dest="\ns_value"/></SelectEach></Transform><XmlSchema RootName="v60netstopdata"><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="ns_name" type="ns_nameType"/>
  <xs:simpleType name="ns_nameType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_type" type="ns_typeType"/>
  <xs:simpleType name="ns_typeType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="ns_value" type="ns_valueType"/>
  <xs:simpleType name="ns_valueType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:element name="nssettings" type="nssettingsType"/>
  <xs:complexType name="nssettingsType">
    <xs:sequence>
      <xs:element name="ns_name" type="ns_nameType"/>
      <xs:element name="ns_type" type="ns_typeType"/>
      <xs:element name="ns_value" type="ns_valueType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="v60netstopdata" type="v60netstopdataType"/>
  <xs:complexType name="v60netstopdataType">
    <xs:sequence>
      <xs:element name="nssettings" type="nssettingsType" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>]]></XmlSchema><CdsSkeleton><![CDATA[<DATAPACKET Version="2.0"><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/><METADATA><FIELDS><FIELD attrname="ns_name" fieldtype="string.uni" WIDTH="110"/><FIELD attrname="ns_type" fieldtype="string" WIDTH="8"/><FIELD attrname="ns_value" fieldtype="string.uni" WIDTH="2390"/></FIELDS><PARAMS/></METADATA><ROWDATA/></DATAPACKET>
]]></CdsSkeleton><XslTransform/><Skeleton><![CDATA[<?xml version="1.0"?>
<v60netstopdata><nssettings><ns_name></ns_name><ns_type></ns_type><ns_value></ns_value></nssettings></v60netstopdata>
]]></Skeleton></XmlTransformation>



Finally - I also just confirmed that this issue exists in XE2 also...
 
LVL 25

Expert Comment

by:epasquier
ID: 36517384
Pfeeewww... what a hell you are raising just to read XML values...

Have you considered simply using XML components to load your file and browse your XML tree to get the values?

I use a good one, much more efficient than MS XML crappy COM object
OpenXML : http://philo.de/xml/
the utility library needed is also part of Jedi, so you don't need to download it again
It is fast, memory savvy, very easy to use.. All this compared to MSXML which is a big ugly pig.
Only problem : documentation is a bit outdated. But you can find tutorials that you can adapt easily to the new methods.
I think I only needed recompile to use it with XE. I can't remember if I had to make some adjustments, but if I did it was quickly sorted out or I would remember.

Try it for size, you won't regret it.
 
LVL 1

Author Comment

by:moonrisesystems
ID: 36524369
If all I was trying to do was read a few XML values I agree that this is overkill.  My xml file contains over 4500 records and I need to use it like a standard database table meaning indexed lookups (browsing a tree likely won't perform well enough), Reads and Writes.  All of this works just fine (and quickly) as it is with the exception of a Read when a Unicode value is present.  

The more I work with this, the more it appears to be a bug. In the XMLMapper.exe utility, when you open an XML file and define the datatypes of each field, it correctly displays Unicode values for the field in the Node Properties - Sample Values field. But when you click on the Mapping tab, select 'XML to DataPacket', and then click 'Create and Test Transformation', the datapacket values show ???? rather than the Unicode characters. If you select 'DataPacket to XML' and then click 'Create and Test Transformation', the values shown are correct...

 
LVL 25

Accepted Solution

by:
epasquier earned 500 total points
ID: 36524464
4500 records are nothing for OpenXML.
I use it to load and manipulate 50 MB XML files, and it can do so in a few (dozen) seconds. Well, it's 50 MB after all, so you have to expect it to take a little time.

Load your data into an internal data structure (hashfile or dictionary) -> less than 1s
Access data from your dictionary (almost instantaneous for only 4500 records with a hash) and write it back only once, when you need to (again less than 1s).
And the memory footprint of your internal data will be minimal.

XML is an EXCHANGE format. It is a most crappy way of storing internal data.

 
LVL 1

Author Comment

by:moonrisesystems
ID: 36531842
I have placed an incident report with Embarcadero on this issue, and sure enough they have confirmed that there is a problem in the way TClientDataSet / TXMLTransformProvider handles Unicode values. It is supposed to work as I had expected and coded for.

It looks like there is no correct answer to my specific question, but I will spend some time looking at OpenXML as suggested and will post back my results.
 
LVL 25

Expert Comment

by:epasquier
ID: 36894826
Hi! I'm just doing a little "after-sale service" for my advice about OpenXML. Did you have any luck with it?
 
LVL 1

Assisted Solution

by:moonrisesystems
moonrisesystems earned 0 total points
ID: 36903231
Hi!  I was going to respond back today so I am glad you reminded me.  It took me a while to get back to this because I was pulled in another direction for a few days...

I did not get too far with OpenXML, mainly due to time constraints and lack of documentation. I did manage to determine that using TXMLDocument was way too slow for my requirements. But one suggestion that you made in your advice, 'Load your data into an internal data structure (hashfile or dictionary)', gave me just enough of a clue to solve my problem, and it performs blazingly fast as well!

With my original code I was attempting to use the XML file itself as the data structure, which is why I needed the .xtr transformation files for the TClientDataSet to work properly. But now that I load the data into a TDictionary, loading the entire data file and accessing specific records is instantaneous, and the code to make it all work is much simpler.
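
A minimal sketch of what such a TDictionary lookup might look like. No code for the actual solution was posted in this thread, so everything here is an assumption for illustration: the program name, the KeyOf helper, and the hard-coded sample pairs, which stand in for whatever single pass over the XML file fills the dictionary. Keys are lower-cased to mimic the loCaseInsensitive option used with Locate above.

```pascal
program SettingsDictionarySketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Generics.Collections;

// Hypothetical helper: normalise keys so lookups ignore case,
// similar to Locate(..., [loCaseInsensitive]).
function KeyOf(const Name: string): string;
begin
  Result := AnsiLowerCase(Name);
end;

var
  Settings: TDictionary<string, string>;
  Value: string;
begin
  Settings := TDictionary<string, string>.Create;
  try
    // Assumption: in a real application these pairs would be read
    // from the XML file once at start-up, not hard-coded.
    Settings.Add(KeyOf('SettingNumber1'), 'ValueNumber1');
    Settings.Add(KeyOf('SettingNumber2'), #$4E2D#$6587); // two-character Unicode sample value
    Settings.Add(KeyOf('SettingNumber3'), 'ValueNumber3');

    // Read: a hash lookup, no dataset or transformation file involved.
    if Settings.TryGetValue(KeyOf('settingnumber2'), Value) then
      Writeln('Found a value of length ', Length(Value));

    // Write: replace the value in memory; the file would be written
    // back once, when the settings are saved.
    Settings.AddOrSetValue(KeyOf('SettingNumber2'), 'NewValue');
  finally
    Settings.Free;
  end;
end.
```

Because string is UnicodeString in Delphi 2009 and later, the values stay Unicode throughout and never pass through the ANSI conversion that produced the '????' results.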

So I formally accept your suggestion as the solution to my problem and award you the points!  Thanks for the Advice!

As for the TClientDataSet problem, I placed a report with QC at Embarcadero, and after going back and forth a few times they have confirmed it and found the problem code. It is still listed as OPEN, but I would think that the fix will be in the next update...
 
LVL 1

Author Closing Comment

by:moonrisesystems
ID: 36935120
The solution was a suggestion to try a specific, different approach which I had not thought of. There were no specific steps or code provided; I did the research and developed the code for the solution. But the approach was absolutely the way to go.
 

Expert Comment

by:ambako_georgia
ID: 37795944
hi all
"Asked by: moonrisesystems"

Open the unit DSIntf.pas,
find the function --> function StringToVariantArray(const S: RawByteString): OleVariant;
and change it to --> function StringToVariantArray(const S: UTF8String): OleVariant;
Works 100%.

Best regards,
ambako_georgia
 

Expert Comment

by:ambako_georgia
ID: 37796302
P.S.

Also open the unit Xmlxform.pas,
find the line --> Result := DSIntf.StringToVariantArray(AnsiString(S));
and change it to --> Result := DSIntf.StringToVariantArray(S);

Then delete
DSIntf.dcu and Xmlxform.dcu

and rebuild your projects.