Solved

HOW DO I ESCAPE XML RESERVED CHARACTERS LIKE &, <, >, ETC. ?

Posted on 2004-04-28
9
18,832 Views
Last Modified: 2013-11-19
I have created a PL/SQL Procedure to query data from some tables and creating an
XML document. Some of the data contains XML reserved characters like the
ampersand(&), less-than sign (<), greater-than sign (>), apostrophe(') and
quotation mark(").
--
Is there a PL/SQL Function or procedure that will escape all XML reserved
characters?
--
Here is one of my PL/SQL functions:
--
/***********************************************************
*
**********************************************************/
FUNCTION get_contributors_fragment
(
p_description_id IN NUMBER DEFAULT -1
)
RETURN CLOB
AS
-- Local Variables.
CURSOR cur
(
p_id IN NUMBER
) IS
SELECT '<contributor>' ||
'<person>' ||
'<person_id>'||personID||'</person_id>' ||
'<person_name>' ||
'<last_name>'||lastname||'</last_name>' ||
'<first_name>'||firstname||'</first_name>' ||
'</person_name>' ||
'</person>' ||
'<institution>' ||
'<institution_id>'||institutionID||'</institution_id>' ||
'<institution_name>'||institutionName||'</institution_name>' ||
'<institution_code>'||institutionCode||'</institution_code>' ||
'<url>'||institutionWebsite||'</url>' ||
'</institution>' ||
'</contributor>' AS XML
FROM xylem.v_contributor
WHERE descriptionid = 114;
--
xmldoc CLOB := NULL;
BEGIN
FOR rec IN cur( p_description_id )
LOOP
xmldoc := xmldoc || rec.xml;
END LOOP;
IF xmldoc IS NOT NULL THEN
xmldoc := '<contributors>' || xmldoc || '</contributors>';
END IF;
RETURN xmldoc;
END;
0
Comment
Question by:tdsimpso
  • 4
  • 3
  • 2
9 Comments
 
LVL 23

Expert Comment

by:seazodiac
ID: 10940468
First of all, let's hope that you are in Oracle9i:)


here is the solution from Oracle


Problem Description
-------------------
 
You have an '&' (ampersand) stored as a part of the data inside the database
table. You want to retrieve the data from table in XML format using the Oracle
provided XML SQL Utility(XSU) and retain the format of the text (for example if you
have data that is already in XML format).  You use DBMS_XMLQuery package to query
the data and the output result shows '&amp;' instead of '&' (ampersand).
 
 
Solution Description
--------------------
 
There is no way of easily handling this using the DBMS_XMLQUERY package,
use the DBMS_XMLGEN package that comes with 9i database instead.
 
The package is similar to the DBMS_XMLQuery package, except that it is written
in C and compiled into the database kernel.
 
The package has  a procedure "setConvertSpecialChars(ctxHandle, boolean);"  through
which you can set whether special characters should be converted or not
to their escaped representation. The default is to perform the conversion.  
 
 
Code Sample using DBMS_XMLGEN Pacakge to Escape special XML Characters
======================================================================
 
 
REM serveroutput on so we can see the results
set serveroutput on
REM scan off to prevent parsing of ampersands in sql*plus
set scan off
 
 
REM Drop table if present
 
drop table xmltab;
 
REM Create table
   
create table xmltab (specchar varchar2(4000));
 
REM Insert the data that has special XML characters like '&', '<' and '>'
 
   insert into xmltab values ('my text &amp; comment');
   insert into xmltab values ('<index>1</index><data>My Data</data>');
 
REM  Run the anonymous PL/SQL block to retrieve the data in XML format from
REM  the "xmltab" table  
   
declare
   
  -- declare variables
  queryCtx DBMS_XMLGEN.ctxhandle;
  result CLOB;
  xmlstr varchar2(32767);
  line varchar2(4000);
 
begin
 
   -- create a new context handle from a passed in SQL query
   queryCtx := DBMS_XMLGEN.newContext('select * from xmltab');
 
   -- set the conversion of special xml characters to false
   DBMS_XMLGEN.setConvertSpecialChars(queryCtx, false);
     
   -- get XML
   result := DBMS_XMLGEN.getXML(queryCtx);
 
   -- to print the data
   xmlstr := dbms_lob.SUBSTR(result,32767);
   loop
      exit when xmlstr is null;
      line := substr(xmlstr,1,instr(xmlstr,chr(10))-1);
      dbms_output.put_line('| '||line);
      xmlstr := substr(xmlstr,instr(xmlstr,chr(10))+1);
   end loop;
 
   -- close the context
   DBMS_XMLGEN.closeContext(queryCtx);  
end;
/
   
 
Result
========
 
| <?xml version="1.0"?>
| <ROWSET>
|  <ROW>
|   <SPECCHAR>my text &amp; comment</SPECCHAR>
|  </ROW>
|  <ROW>
|   <SPECCHAR><index>1</index><data>My Data</data></SPECCHAR>
|  </ROW>
| </ROWSET>
 
 
Explanation
-----------
 
In raw XML, the ampersand(&), less-than sign(<), greater-than sign(>), double  
quotes(") and apostrophes(') are interpreted as part of the markup instead of
character data.  Therefore the XSU encodes the raw ampersand as '&amp;' when
converting the text data stored inside the database into a well-formed XML
document.  This can be a problem if the data you are processing is already in XML
format and so Oracle has provided an API call to make this easier to do in the DBMS_XMLGEN package.
 

0
 

Author Comment

by:tdsimpso
ID: 10940789
I saw this article but the problem is I have a custom XML document and the result of:

| <?xml version="1.0"?>
| <ROWSET>
|  <ROW>
|   <SPECCHAR>my text &amp; comment</SPECCHAR>
|  </ROW>
|  <ROW>
|   <SPECCHAR><index>1</index><data>My Data</data></SPECCHAR>
|  </ROW>
| </ROWSET>

Does not work for me.

I need to extract "my text &amp; comment" and put it in my XML Document:
| <?xml version="1.0"?>
|  <book>
|   <title>my text &amp; comment</title>
|  </book>
|  <book>
|   <title><index>1</index><data>My Data</data></title>
|  </book>


So I need something like this:

Select XMLFormat( title ) as xmlTitle from book;

The "XMLFormat" Function would escape all XML reserved characters.

I do not need to format the whole record set.

0
 

Author Comment

by:tdsimpso
ID: 10940998
I altered the example for my needs like this:

DECLARE
  -- declare variables
  queryCtx DBMS_XMLGEN.ctxhandle := NULL;
  result CLOB;
  xmlstr VARCHAR2(32767);
  line VARCHAR2(4000);
BEGIN
   --
   -- create a new context handle from a passed in SQL query
   queryCtx := DBMS_XMLGEN.newContext('SELECT featureid, itemnumber, description FROM v_feature WHERE featureid < 20 AND Itemnumber IS NOT NULL');
   --
   -- set the conversion of special xml characters to false
   --DBMS_XMLGEN.setConvertSpecialChars(queryCtx, FALSE);
   --
   DBMS_XMLGEN.setRowTag( queryCtx, 'feature');
   DBMS_XMLGEN.setRowSetTag( queryCtx, 'features' );
   -- get XML
   result := DBMS_XMLGEN.getXML(queryCtx);
   --
   -- to print the data
   xmlstr := dbms_lob.SUBSTR(result,32767);
   --DBMS_OUTPUT.PUT_LINE( xmlstr );
   LOOP
      EXIT WHEN xmlstr IS NULL;
      line := SUBSTR(xmlstr,1,INSTR(xmlstr,CHR(10))-1);
      DBMS_OUTPUT.PUT_LINE('| '||line);
      xmlstr := SUBSTR(xmlstr,INSTR(xmlstr,CHR(10))+1);
   END LOOP;
  --
   -- close the context
   DBMS_XMLGEN.closeContext(queryCtx);  
END;

And I got this in return:
|<?xml version="1.0"?>
| <features>
|  <feature>
|   <FEATUREID>2</FEATUREID>
|   <ITEMNUMBER>1</ITEMNUMBER>
|   <DESCRIPTION>Growth ring boundaries distinct</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>3</FEATUREID>
|   <ITEMNUMBER>2</ITEMNUMBER>
|   <DESCRIPTION>Growth ring boundaries indistinct or absent</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>6</FEATUREID>
|   <ITEMNUMBER>3</ITEMNUMBER>
|   <DESCRIPTION>Wood ring-porous</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>7</FEATUREID>
|   <ITEMNUMBER>4</ITEMNUMBER>
|   <DESCRIPTION>Wood semi-ring-porous</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>8</FEATUREID>
|   <ITEMNUMBER>5</ITEMNUMBER>
|   <DESCRIPTION>Wood diffuse-porous</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>10</FEATUREID>
|   <ITEMNUMBER>6</ITEMNUMBER>
|   <DESCRIPTION>Vessels in tangential bands</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>11</FEATUREID>
|   <ITEMNUMBER>7</ITEMNUMBER>
|   <DESCRIPTION>Vessels in diagonal and / or radial pattern</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>12</FEATUREID>
|   <ITEMNUMBER>8</ITEMNUMBER>
|   <DESCRIPTION>Vessels in dendritic pattern</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>14</FEATUREID>
|   <ITEMNUMBER>9</ITEMNUMBER>
|   <DESCRIPTION>Vessels exclusively solitary (90% or more)</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>15</FEATUREID>
|   <ITEMNUMBER>10</ITEMNUMBER>
|   <DESCRIPTION>Vessels in radial multiples of 4 or more common</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>16</FEATUREID>
|   <ITEMNUMBER>11</ITEMNUMBER>
|   <DESCRIPTION>Vessel clusters common</DESCRIPTION>
|  </feature>
|  <feature>
|   <FEATUREID>18</FEATUREID>
|   <ITEMNUMBER>12</ITEMNUMBER>
|   <DESCRIPTION>Solitary vessel outline angular</DESCRIPTION>
|  </feature>
| </features>

I was able the change the labels of the RowSet Tag and Row Tag to <features> and <feature>.
The problem is that I need to remove this: <?xml version="1.0"?> because I am appending this to my own XML Documents.  I also need to change the column name/tag
from <FEATUREID> to <feature_id>
from <ITEMNUMBER> to <item_number>
from <DESCRIPTION> to <feature_desc>

I can not find anyway to change the other element tags.

Any Ideas?
0
 
LVL 76

Assisted Solution

by:slightwv (䄆 Netminder)
slightwv (䄆 Netminder) earned 30 total points
ID: 10941135
If you don't wish to use the DOM or XMLQuery/XMLGen, just create the function that you suggested and brute force the replacements:

note: typed in and not tested/validated (and only show for '&' and '<'

create or replace function XMLformat (myStr in varchar2) return varchar2 is
begin
  return(  replace(replace(myStr,'&','&amp;),'<','&lt;')  );
end;

0
Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

 
LVL 23

Accepted Solution

by:
seazodiac earned 470 total points
ID: 10941198
@tdsimpso: ----------->I also need to change the column name/tag
from <FEATUREID> to <feature_id>
from <ITEMNUMBER> to <item_number>
from <DESCRIPTION> to <feature_desc>



well this is pretty easy, you just need to change your query like this:

queryCtx := DBMS_XMLGEN.newContext('SELECT featureid "feature_id", itemnumber "item_number", description "feature_desc" FROM v_feature WHERE featureid < 20 AND Itemnumber IS NOT NULL');
0
 
LVL 23

Expert Comment

by:seazodiac
ID: 10941233
For this :

<?xml version="1.0"?>


Header definition, you cannot shake it off , without it, XML documents generated by Oracle would not be considered complete...
0
 

Author Comment

by:tdsimpso
ID: 10941787
Thanks seazodiac.  Just what I needed.  Too bad I can not strip off the header definition.

slightwv, how can the DOM in Oracle be useful.  Is the DOM use to create more complicated XML Documents?
This might be another another expert question.  I'll create another one.
0
 

Author Comment

by:tdsimpso
ID: 10941875
I just posted another question about the Oracle DBMS_XMLDOM.

http://www.experts-exchange.com/Databases/Oracle/Q_20971066.html
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 10942030
already responded to........

Yes, to DOM allows more complex features. In a nutshell:  It gives you access to the nuts and bolts in the XML.
0

Featured Post

How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

Join & Write a Comment

I remember the day when someone asked me to create a user for an application developement. The user should be able to create views and materialized views and, so, I used the following syntax: (CODE) This way, I guessed, I would ensure that use…
I will show you how to create a ASP.NET Captcha control without using any HTTP HANDELRS or what so ever. you can easily plug it into your web pages. For Example a = 2 + 3 (where 2 and 3 are 2 random numbers) Session("Answer") = 5 then we…
This video explains what a user managed backup is and shows how to take one, providing a couple of simple example scripts.
The viewer will learn how to look for a specific file type in a local or remote server directory using PHP.

705 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now