Tree regular expressions

Hello,

If I understand correctly, RELAX NG is a tree regular expression language for XML, right?
http://relaxng.org/#other-pages

"Classical" regular expressions can also be used to do substitutions, such as:
++++++++++++++++++++++++++++++++++++
$ sed -e "s/\([0-9]\)\([a-z]\)/CHANGED(\2 \1)/g"
This is a sample text with 2a inside.
This is a sample text with CHANGED(a 2) inside.
++++++++++++++++++++++++++++++++++++

Is it possible to use RELAX NG to match parts of an XML document and make substitutions?
It would be an advanced version of XSLT.

For instance, I could look for "card" elements that do not have an email and insert a default one, turning the first element below into the second:
  <card>
    <name>Smith</name>
  </card>

  <card>
    <name>Smith</name>
    <email>Smith@example.com</email>
  </card>

-------
Do you know whether a solution for this already exists?

Best regards,
David Portabella

dportabella Asked:
 
Geert Bormans, Information Architect, Commented:
>I don't see why that could not be also the case with trees, instead of strings.

You are correct, there is no reason why that could not be the case,
but RELAX NG aims at validation only, not substitution.

I am not aware of any projects extending RELAX NG in the way you describe.

I came across this:
http://www.springerlink.com/content/rdmgm0jadccwp290/
I don't know whether it is useful, or whether any of Tadeusz' work is available online,
but it might be worthwhile sending him an email.

cheers

Geert
 
Geert Bormans, Information Architect, Commented:
RELAX NG is above all a schema language; it is for validation only.
One of the things it does differently from W3C XML Schema is that it doesn't change the XML document
(W3C XML Schema does things such as providing default values for attributes, among others).
RELAX NG doesn't even do that; it just tells you whether an XML document is valid according to the schema or not.

The term "tree regular expression" is meant to express the following.
RELAX NG describes "patterns" of XML documents (like a regex describes a pattern of a string).
XML documents are like trees (hierarchical), hence RELAX NG describes tree patterns;
it is a sort of regular expression language for XML documents.
A RELAX NG validator checks whether there is a match between the patterns in the schema and the document;
if there is, the document is "valid".
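
To make the regex analogy concrete, here is a minimal Python sketch. It is not RELAX NG and not how a real validator works: it simply gives each element name a classical regular expression over the names of its child elements, and calls a document "valid" when every element's children match. The `addressBook` wrapper element, the `CONTENT_MODELS` table and the `is_valid` function are assumptions made up for this illustration; text content and attributes are ignored.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models: element name -> regex over the
# space-joined names of its child elements.
CONTENT_MODELS = {
    "addressBook": r"(card( card)*)?",  # zero or more cards
    "card": r"name( email)?",           # a name, then an optional email
    "name": r"",                        # no child elements
    "email": r"",                       # no child elements
}

def is_valid(element):
    """Return True if this element and all its descendants match their models."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return False  # unknown element name
    child_names = " ".join(child.tag for child in element)
    if re.fullmatch(model, child_names) is None:
        return False
    return all(is_valid(child) for child in element)
```

Note that, like a validator, `is_valid` only answers yes or no; it never modifies the tree it is given.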

You can't use a RelaxNG schema for changing your document
... that would be a transformation, which you need to express in XSLT.

You could build an application that alters the document, based on a RELAX NG schema.
That application would be best developed in XSLT
(I am not aware of such an application floating around anywhere).
You could also abuse Schematron's alerting mechanism to achieve what you want,
but I am quite convinced that XSLT is really the way to go with this.

Using XSLT's push mechanism with apply-templates,
and starting from an identity transform,
the XSLT to achieve what you need would be fairly simple:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Identity transform: copy every node and its attributes unchanged -->
    <xsl:template match="node()">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Override for card: copy it, then append a default email if it has none -->
    <xsl:template match="card">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates select="node()"/>
            <xsl:if test="not(email)">
                <email><xsl:value-of select="name"/><xsl:text>@example.com</xsl:text></email>
            </xsl:if>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

but I suspect you need a more generic approach?

cheers

Geert

 
dportabella (Author) Commented:
Gertone,

Thanks for the info.
With a regular expression, you can say whether a string is valid or not.
Furthermore, you can use it to transform the string.

I don't see why that could not be also the case with trees, instead of strings.
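
As a rough illustration of what I mean, here is a minimal Python sketch of a "match and rewrite" step on trees, in the spirit of `s/pattern/replacement/` on strings. It is only an assumption of how such a thing could look, not an existing tool: `substitute`, `card_without_email` and `add_default_email` are hypothetical names, and the "pattern" is a plain Python predicate rather than a real tree regular expression.

```python
import xml.etree.ElementTree as ET

def substitute(root, matches, rewrite):
    """Apply rewrite(element) in place to every element where matches(element) is true.

    Returns the number of elements rewritten.
    """
    count = 0
    # Take a snapshot first so elements added by rewrite() are not visited.
    for element in list(root.iter()):
        if matches(element):
            rewrite(element)
            count += 1
    return count

def card_without_email(element):
    """The 'pattern': a card element that has no email child."""
    return element.tag == "card" and element.find("email") is None

def add_default_email(card):
    """The 'replacement': append an email built from the card's name."""
    email = ET.SubElement(card, "email")
    email.text = card.findtext("name", default="unknown") + "@example.com"
```

Calling `substitute(root, card_without_email, add_default_email)` on a parsed document would add `<email>Smith@example.com</email>` to a card that only contains `<name>Smith</name>`, and leave cards that already have an email untouched.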

Thanks also for the XSLT example; however, the example of the email was a toy example.
As you say, I would need a more generic approach.

Before your message, I tried to look around for "Tree regular expressions substitutions" but I did not find anything. With your message, I realized that the correct keyword is "transformation" instead of "substitution". Looking around for "Tree regular expressions transformation", I found:
>Parse::Eyapp introduces a new language called Tree Regular Expressions that easies the transformation of trees.
>http://nereida.deioc.ull.es/~pl/perlexamples/node239.html

which maybe could do the job.

However, I would prefer to find a more standard project (maybe using RELAX NG).

I do think that other people have thought of extending RELAX NG to use it also for transformation.
Any idea of how to find such a project? (even if it is not based on RELAX NG)

 
dportabella (Author) Commented:
Geert, thanks for the info.

I wrote an email to him. Let's see.
I also realized that I only need to look for "transformation language for XML",
and several alternatives to XSLT appear.

One that seems to be what I was looking for is Xcerpt: https://sourceforge.net/projects/xcerpt/
which is also implemented in Java (which I need).
Unfortunately, this project is still in development and there is very little documentation.

So, still looking for an appropriate package.
The problem is not yet solved, but I think that I should already award you the points.

Many thanks,
David Portabella
 
Geert Bormans, Information Architect, Commented:
Welcome,
good luck
(I'll give Xcerpt a look)
Question has a verified solution.
