Adding FO  to an existing XSL that was used to convert XML to HTML.  I need to generate a PDF version of the output

Posted on 2014-07-30
Last Modified: 2014-08-06
I have a number of successful XML/XSL transformation that generate complex HTML pages.  I need to make instead a PDF that can be embedded in an XML file instead of an HTML that is displayed in response.  I have started with my simplest XML/XSL to figure out how to make this work.

I see in docs that this is a two step process but I am not sure how I need to accomplish this solution.

I find tutorials on XSL with fo, but do I use my current transformation and output XML, then run that through another XSL with the FO commands to apply formatting to tables and such.  Or is this stuff I need to add to my current XSL sheet by inserting FO entries throughout my format?

The biggest part of the output is various tables, so it would be nice to be able to adjust the "class" attributes I use to format into ones that pick up formatting from the fo templates and attributes.

But I am not clear on the overall approach.  Once I get that ahh-haa moment when I can see to full picture I can start hitting the detail of what is needed.

BTW - I use <oXygen/> XML Editor to test my setups before I deploy them to the Linux Web Server with Apache FOP.  I did get a simple setup I found in Experts to work, but I can not seem to build that forward to function with my XSL.

I would be glad to provide samples if this is too vague.
Question by:nlpalmquist
  • 3
  • 3
LVL 60

Expert Comment

by:Geert Bormans
ID: 40230505
Since I have no clue about the complexity of the transforms, I give you some generic hints of what you can do (and what I have done before so I know it makes sense)

I usually take the other way round. If I need both XSL-FO and HTML, I tend to develop teh XSL-FO first and derive the HTML from there (using an XSLT that does XSL-FO into HTML (there is a good example on the RENDER-X website you can use as a basis to that

Your situation is different since you already have the XSLT for HTML.
If at all possible, try to start from the existing XHTML
If you need to use th eHTML as a starting point for the next XSLT you will need to change the output method
in the existing XSLT to have a wellformed XML as HTML (xsl:output method="xml")
I would consider that my first option, develop an XSLT that transforms an HTML into XSL-FO

second viable option means interfering the existing XSLT and overload templates, that is somewhat more advanced and we will discuss that if the first option does not work for you

If you are not bound to FOP but could eg. use ANtenna House Formatter instead... just add a different media type to your CSS and let Antenna House format the HTML + CSS into PDF. I did pretty fancy books using that technique

Using Oxygen transform scenarios you can set up pipelines by checking all transforms at once in the scenario, but I assume you knew that

good luck
LVL 62

Expert Comment

ID: 40231001
Nobody did it before,.. In 20 years of XSL and PDF...

Author Comment

ID: 40232084

I am stuck with Apache FOP.  Could you look at the xsl and rough in how I could add FO?  I have tried to follow tutorials and samples and I am missing something.

I have attached my last files.  Sample XML and my current version for XSL-FO.  Maybe I need to remove some of the HTML tags that make no sense like HEAD and BODY.

I have broken down my output to Headings, Body, and Footer.  Which I hope will generate a header on each page, the body that might run from page to page like a normal book by filling a page and then starting a new page, and a footer on each page.

It seems to generate what looks to me like a valid FO file.

I am not worried about the stylesheets, I figure I have to replace that by inserting some kind of FO:BLOCK and FO:INLINE but want to get a basic translation to work before I attempt that step.
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

LVL 60

Expert Comment

by:Geert Bormans
ID: 40232340
@gheist, I really don't know how to make sense of your comment

I have been creating PDF from XML, HTML and SGML for about 25 years now, using DSSSL, XSL-FO, OmniMark, XSLT, CSS, FrameMaker, InDesign,  and a whole bunch of high end publishing tools for really tricky books and demanding publishers... you might be surprised what can be done
Creating the PDF is the hard part, embedding it in XML is but a matter of referencing it (if a reference is acceptable) or encoding it using Base64 and stuffing that in as string text
SO I don't know what your problem is

@nlpalmquist, I will look into your xsl later this evening or tomorrow, will return from holidays two days from now, so can't rush it

Author Comment

ID: 40232708
I've been messing with this all day and the basic theory I have found is that I need to look at my HTML output document from my normal XML/XSL transformation like it is just another XML input document.  Maybe it needs a bit of tweaking for paging purposes at that first level.

If I do that I can look for HTML tags like <td></td> and and insert the FO commands into the document instead,

would change to
  <fo:table-cell border="solid 1px black">
            <fo:block font-weight="bold">Something</fo:block>
I figure I can apply templates to make this happen but I can only get a sample from one of the tutorials to work when it is just straight FO to start with.  

I don't see how to do that except to run my command line to make the HTML and then pick up the HTML document as the input XML and apply the FO stylesheet to make the PDF.

I have been trying to do both formats at the same time and maybe that is my problem.

I really appreciate any direction you can give me.  I can certainly build the details once I can get a small solution working.
LVL 60

Accepted Solution

Geert Bormans earned 500 total points
ID: 40232966
basic theory I have found is that I need to look at my HTML output document from my normal XML/XSL transformation like it is just another XML input document.

Yes, that is exactly what I meant with my first option.
You might need to change the serialisation settings so that the input of the next transform is really XML (XHTML) and not HTML. Or don't serialise at all in between steps

The HTML table model does not match the FO table model 100%, so there is some tweaking to do other than just making each td into a fo:table-cell

Author Closing Comment

ID: 40244592
Thank you for confirming my assumptions.  I always find docs written by programmers to assume many things that should be stated more exactly.

Featured Post

The Eight Noble Truths of Backup and Recovery

How can IT departments tackle the challenges of a Big Data world? This white paper provides a roadmap to success and helps companies ensure that all their data is safe and secure, no matter if it resides on-premise with physical or virtual machines or in the cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
parse convert xml feed to text (python) 2 81
Create XML 5 49
MS Access XML Export Query Setup Multiple Tag Values 15 35
Issue with XSLT mapping 10 35
Preface This article introduces an authentication and authorization system for a website.  It is understood by the author and the project contributors that there is no such thing as a "one size fits all" system.  That being said, there is a certa…
JavaScript has plenty of pieces of code people often just copy/paste from somewhere but never quite fully understand. Self-Executing functions are just one good example that I'll try to demystify here.
Viewers will learn about arithmetic and Boolean expressions in Java and the logical operators used to create Boolean expressions. We will cover the symbols used for arithmetic expressions and define each logical operator and how to use them in Boole…
HTML5 has deprecated a few of the older ways of showing media as well as offering up a new way to create games and animations. Audio, video, and canvas are just a few of the adjustments made between XHTML and HTML5. As we learned in our last micr…

776 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question