Solved

Transforming XML into formatted text using XSLT

Posted on 2004-04-19
Last Modified: 2012-06-27
I want to transform XML files exported from Treepad into formatted text files that ShadowPlan on my Palm can import. All of the tutorials and help I can find on XSLT focus on XML -> XHTML, and there is very little about using XSLT to generate formatted text. I've been only partially successful so far and suspect I'm approaching this from the wrong angle. I can't get <xsl:output method="text" indent="yes"> to work the way I believe it should.

The files are in a tree structure where each element can have an article/note attached.

Here are examples of the XML that will be used as input, and then the format that the output needs to be in to be imported into Shadow.

TreePad XML exported file:
<?xml version="1.0"?>
<treepad_xml version="1.0">
      <database>
            <name>Test Tree</name>
            <node>
                  <title>Test Tree</title>
                  <article datatype="Text"/>
                  <node>
                        <title>Item 1</title>
                        <article datatype="Text">This is item 1</article>
                        <node>
                              <title>Item 1a</title>
                              <article datatype="Text">This is item 1a</article>
                        </node>
                  </node>
                  <node>
                        <title>Item 2</title>
                        <article datatype="Text">This is item 2</article>
                        <node>
                              <title>Item 2a</title>
                              <article datatype="Text">This is item 2a</article>
                        </node>
                  </node>
            </node>
      </database>
</treepad_xml>


Rules for importing text into Shadow:
1) notes can be on multiple lines but must start with <Note: and end with >
2) items can be indented with tabs or spaces to indicate hierarchy. This example uses tabs to make the indentations clear.

The above XML file should result in exactly this text file here:

Item 1
<Note: This is item 1>
      Item 1a
<Note: This is item 1a>
Item 2
<Note: This is item 2>
      Item 2a
<Note: This is item 2a>

Question by:mbryan822
6 Comments

Expert Comment

by:metalmickey
So for every node you essentially want to generate an unordered list, with sub-lists for each nested node?

Expert Comment

by:metalmickey
This XSLT will transform your XML into a tree structure, although this is the HTML way of doing it.


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <xsl:output method="html" indent="yes"/>
  <xsl:template match="/">
    <html>
      <head>
        <title/>
      </head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="database">
    <ul>
      <xsl:apply-templates/>
    </ul>
  </xsl:template>
  <xsl:template match="node">
    <ul>
      <xsl:apply-templates/>
    </ul>
  </xsl:template>
  <xsl:template match="name">
    <li>
      <xsl:apply-templates/>
    </li>
  </xsl:template>
  <xsl:template match="title">
    <li>
      <xsl:apply-templates/>
    </li>
  </xsl:template>
  <xsl:template match="article">
    <li>
      <xsl:apply-templates/>
    </li>
  </xsl:template>
  <xsl:template match="@datatype"/>
</xsl:stylesheet>

You'll need to translate the <ul> elements into line breaks and the <li> elements into tab indentation. Since the text output has no markup around it, producing the indentation with the XSL above may be difficult.

It's not the solution, so no points here, but it may provide some insight into the transformation structure of the XSLT.


HTH

MM




Accepted Solution

by:Yury_Delendik (earned 500 total points)
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >

  <!-- Plain-text output: only what the templates write is emitted -->
  <xsl:output method="text" />

  <xsl:template match="database">
    <xsl:apply-templates select="node" />
  </xsl:template>

  <xsl:template match="node">
    <!-- Indentation string passed down from the parent; empty at the top level -->
    <xsl:param name="ident" select="''" />

    <!-- Title line, followed by CRLF -->
    <xsl:value-of select="$ident" />
    <xsl:value-of select="title" />
    <xsl:text>&#13;&#10;</xsl:text>

    <!-- Note line, written at the same indentation as the title -->
    <xsl:value-of select="$ident" />
    <xsl:text>&lt;Note: </xsl:text>
    <xsl:value-of select="article" />
    <xsl:text>&gt;&#13;&#10;</xsl:text>

    <!-- Recurse into child nodes with two more spaces of indentation -->
    <xsl:apply-templates select="node">
       <xsl:with-param name="ident" select="concat($ident, '  ')" />
    </xsl:apply-templates>
  </xsl:template>
</xsl:stylesheet>

Makes:

Test Tree
<Note: >
  Item 1
  <Note: This is item 1>
    Item 1a
    <Note: This is item 1a>
  Item 2
  <Note: This is item 2>
    Item 2a
    <Note: This is item 2a>

Author Comment

by:mbryan822
metalmickey - thanks for the info you posted, this will help me even though it's not exactly what I need right now. I appreciate the time you took to post the info you did. I'm trying to learn XSLT and anything helps right now.

--------------------------

Yury - your solution is VERY close but there are a couple small changes that need to be made to make the resulting text work with Shadow's import mechanism. I'm not good enough with XSLT yet (this is my first attempt at using it) and can't figure out how to modify your solution to get what I need.

1. Indentation should be one space (or tab) per level. Your solution creates 2 spaces instead of 1 for each level of indentation. This confuses Shadow, and the file doesn't import correctly until I manually remove the extra spaces.

2. The <Note: lines must have no spaces preceding them, or they are seen as nodes instead of notes for the preceding node. See my original post: the <Note: lines there are not indented.
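
Both points trace back to two spots in the accepted stylesheet: the with-param appends a two-character string per level, and $ident is written before the <Note: line as well as before the title. A minimal sketch of an adjusted version follows. The parameter is renamed "indent" here, and skipping the top-level "Test Tree" node is an assumption based on the expected output in the original post, which starts at Item 1.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumption: the single top-level node ("Test Tree") is only a container,
       so processing starts with its child nodes. -->
  <xsl:template match="/">
    <xsl:apply-templates select="treepad_xml/database/node/node"/>
  </xsl:template>

  <xsl:template match="node">
    <xsl:param name="indent" select="''"/>

    <!-- Title line: one space of indentation per nesting level. -->
    <xsl:value-of select="$indent"/>
    <xsl:value-of select="title"/>
    <xsl:text>&#13;&#10;</xsl:text>

    <!-- Note line: no indent is written, so it always starts in column 1. -->
    <xsl:text>&lt;Note: </xsl:text>
    <xsl:value-of select="article"/>
    <xsl:text>&gt;&#13;&#10;</xsl:text>

    <!-- Children get exactly one more space than their parent. -->
    <xsl:apply-templates select="node">
      <xsl:with-param name="indent" select="concat($indent, ' ')"/>
    </xsl:apply-templates>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the first sample export, this should leave Item 1 and Item 2 unindented, indent Item 1a and Item 2a by one space, and start every <Note: line in column 1. To use tabs instead, the ' ' in the concat() call can be replaced with '&#9;'.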

Author Comment

by:mbryan822
I am increasing the point value for this question because I can't find a solution yet even though Yury's example was very close.

I must update the test .xml file because the one I submitted had been modified after being exported by Treepad. I guess I modified it in my attempts to get something working.

Here is what the test tree XML file that I originally posted really should look like:

<?xml version="1.0"?>
<treepad_xml version="1.0">
      <database>
            <name>
Test Tree
</name>
            <node>
                  <title>
Test Tree
</title>
                  <article datatype="Text">

</article>
                  <node>
                        <title>
Item 1
</title>
                        <article datatype="Text">
This is item 1&#13;
</article>
                        <node>
                              <title>
Item 1a
</title>
                              <article datatype="Text">
This is item 1a
</article>
                        </node>
                  </node>
                  <node>
                        <title>
Item 2
</title>
                        <article datatype="Text">
This is item 2&#13;
</article>
                        <node>
                              <title>
Item 2a
</title>
                              <article datatype="Text">
This is item 2a&#13;
</article>
                        </node>
                  </node>
            </node>
      </database>
</treepad_xml>


I need the output to look like this:

Item 1
<Note: This is item 1>
 Item 1a
<Note: This is item 1a>
Item 2
<Note: This is item 2>
 Item 2a
<Note: This is item 2a>


Note that Item 1a and Item 2a are indented by one space. This causes them to become children of Item 1 and Item 2 respectively when imported into Shadow on my Palm. If Item 1a had a child it would be indented with 2 spaces. Tabs are also ok instead of spaces, but it must remain consistent throughout the file.
Also note that the <Note: lines must appear on one line.

If I can get it to work with this .xml file as well as Yury's original solution did, I could finish the job with a simple AWK script. But, it seems to me that XSLT should be able to do it all.

I really appreciate any help anyone can give me with this. Even hints are very welcome. The things I'm not understanding are:
1. How do I control indentation?
2. How do I force CRLFs where I need them?
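
On both questions: with method="text" the processor writes only the text the stylesheet produces, and indent="yes" has no effect, since it applies to the xml and html output methods. Indentation can be carried down the tree as a template parameter, and line breaks can be written explicitly as &#13;&#10;. The sketch below also wraps title and article in normalize-space() to absorb the extra line breaks and &#13; characters in the real export. Assumptions: the outer "Test Tree" node is skipped, collapsing a multi-line note onto one line is acceptable, and the variable name "crlf" is purely illustrative.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- indent="yes" is left out: it does nothing for text output. -->
  <xsl:output method="text"/>

  <!-- The line ending lives in one place; use only &#10; if plain LF is wanted. -->
  <xsl:variable name="crlf"><xsl:text>&#13;&#10;</xsl:text></xsl:variable>

  <!-- Assumption: start below the top-level container node. -->
  <xsl:template match="/">
    <xsl:apply-templates select="treepad_xml/database/node/node"/>
  </xsl:template>

  <xsl:template match="node">
    <xsl:param name="indent" select="''"/>

    <!-- normalize-space() trims the surrounding line breaks Treepad writes
         and collapses internal whitespace runs (including &#13;) to one space. -->
    <xsl:value-of select="concat($indent, normalize-space(title), $crlf)"/>

    <!-- The note is never indented, so it starts in column 1. -->
    <xsl:value-of select="concat('&lt;Note: ', normalize-space(article), '&gt;', $crlf)"/>

    <!-- One extra space per level controls the hierarchy. -->
    <xsl:apply-templates select="node">
      <xsl:with-param name="indent" select="concat($indent, ' ')"/>
    </xsl:apply-templates>
  </xsl:template>
</xsl:stylesheet>
```

If notes need to keep their internal line breaks, normalize-space(article) would have to be replaced with something less aggressive, at the cost of handling the leading and trailing line breaks separately.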
Extra Credit:
1. On the Palm, notes have a 4k limit, so for this to *really* work, it will need to split long notes up into 4k chunks. Depending on how difficult this will be to do, this project may not be worth pursuing. To work properly, the resulting file would have to look something like the example below.
(using the above example, if Item 1a had a note larger than 8k but smaller than 12k, it would split the note into 3 parts, each a child of Item 1a)

 Item 1a
  Part 1
<Note: this is the first 4k>
  Part 2
<Note: this is the second 4k>
  Part 3
<Note: this is the remainder of the original note>

Is this sort of thing even possible with XSLT?
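
Splitting is possible in XSLT 1.0 with a recursive named template built on substring() and string-length(). A rough sketch follows, not tested against ShadowPlan. The template name "split-note", the "limit" parameter and its value of 4096 are assumptions, and the limit is counted in characters rather than bytes, so notes containing multi-byte characters may need a smaller value.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumption: "4k" is taken as 4096 characters. -->
  <xsl:param name="limit" select="4096"/>

  <xsl:template match="/">
    <xsl:apply-templates select="treepad_xml/database/node/node"/>
  </xsl:template>

  <xsl:template match="node">
    <xsl:param name="indent" select="''"/>
    <xsl:variable name="note" select="normalize-space(article)"/>

    <xsl:value-of select="concat($indent, normalize-space(title), '&#13;&#10;')"/>
    <xsl:choose>
      <!-- Long note: emit "Part N" children, one level deeper, instead of one note. -->
      <xsl:when test="string-length($note) &gt; $limit">
        <xsl:call-template name="split-note">
          <xsl:with-param name="text" select="$note"/>
          <xsl:with-param name="indent" select="concat($indent, ' ')"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Short note: write it directly under the item. -->
      <xsl:otherwise>
        <xsl:value-of select="concat('&lt;Note: ', $note, '&gt;&#13;&#10;')"/>
      </xsl:otherwise>
    </xsl:choose>

    <xsl:apply-templates select="node">
      <xsl:with-param name="indent" select="concat($indent, ' ')"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Hypothetical helper: writes one chunk per call and recurses on the rest. -->
  <xsl:template name="split-note">
    <xsl:param name="text"/>
    <xsl:param name="indent"/>
    <xsl:param name="part" select="1"/>
    <xsl:if test="string-length($text) &gt; 0">
      <xsl:value-of select="concat($indent, 'Part ', $part, '&#13;&#10;')"/>
      <xsl:value-of select="concat('&lt;Note: ', substring($text, 1, $limit), '&gt;&#13;&#10;')"/>
      <xsl:call-template name="split-note">
        <xsl:with-param name="text" select="substring($text, $limit + 1)"/>
        <xsl:with-param name="indent" select="$indent"/>
        <xsl:with-param name="part" select="$part + 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The chunks are cut at a fixed character offset, so a split can land in the middle of a word. Because limit is a top-level parameter, a processor that supports passing stylesheet parameters can override it without editing the file.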

Thank you for any information, clues, hints or suggestions.
If I can make this work for large Treepad files, I will post the solution in the utilities section of the Treepad website so that other Treepad users who also use Shadow on the Palm can share data too.

Author Comment

by:mbryan822
I made a mistake!
I said:
Also note that the <Note: lines must appear on one line.

this really should say:
Also note that the <Note: lines must start in column 1 with no preceding spaces. They can appear on one line, or multiple lines. The rules for Notes are:
1. start with <Note: with no preceding spaces
2. end with the first ">" encountered - this means that the Treepad file cannot contain "<" or ">" characters.

Sorry for the oversight.
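
One way to guard against stray "<" or ">" characters in an article is to filter them with translate() before the note is written. A minimal sketch, assuming that swapping them for parentheses is acceptable; titles and indentation are left out to keep the example short.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Writes one note per article element, in document order. -->
  <xsl:template match="/">
    <xsl:for-each select="//article">
      <xsl:text>&lt;Note: </xsl:text>
      <!-- Assumption: angle brackets become parentheses. Passing '' as the
           third argument would drop them instead. -->
      <xsl:value-of select="translate(., '&lt;&gt;', '()')"/>
      <xsl:text>&gt;&#13;&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With this in place the only ">" in each note is the closing one written by the stylesheet itself.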