jdharsha


stream bytearray from java as image during pdf creation using Apache FOP

Hello,
How can I stream a byte array from Java and print it as an image on a PDF that is being created dynamically with Apache FOP?
Thanks,
mrcoffee365

What have you tried so far?  Do you just need a link to the FOP documentation online?

Usually people post the code they have tried so far, and post the error messages they get with their question.

FOP documentation:
http://xmlgraphics.apache.org/fop/1.1/servlets.html
One code example site:
http://www.brucephillips.name/blog/index.cfm/2008/11/4/Use-Apache-FOP-Library-To-Create-A-PDF-From-XML
jdharsha

ASKER

Sure, please find below what I did so far, let me know if this helps...

I have a helper where I set the byte stream into a pojo like below:

      byte[] imageInBytes = object.getLogo();
      try {
            InputStream in = new ByteArrayInputStream(imageInBytes);
            BufferedImage bImageFromConvert = ImageIO.read(in);
            dataBean.setLogo(bImageFromConvert.toString());
      } catch (IOException e) {
            log.error("Error creating Logo", e);
      }
            
and finally set into an object representation of this XML like below..

objXML.setLOGO(dataBean.getLogo());

Now I'm not sure what the syntax would be to stream that into the XSLT.

<fo:external-graphic src='../LOGO'/>
So you don't want to use FOP to create a PDF file?  Streaming xml to xslt is a different thing.  

Be warned that the FOP handling of graphics is poor.  It's possible, but tricky, especially if there is more than one graphic.  Shouldn't be so, but it is.
I'm using FOP to create the PDF. My bad, I don't mean streaming; rather, I want to set this "ByteArrayInputStream" image into the XML and retrieve it while generating the PDF.

FYI, I prepare the XML by populating its corresponding object representation in Java and then apply the XSL template to this XML. Please let me know if this doesn't make sense.
Okay -- so the xsl question was not related, right?

We'll assume you have a String with the XML in it for passing to FOP.

But wait -- you mean you want to put binary into an XML object?  That is not how it's done.  You put the binary file somewhere on a web server, and refer to it with a url in the xml file.  There's specific syntax for it.

However, I see that there is a mechanism for instream foreign objects, which might be what you need -- have you checked the examples in the documentation?
http://xmlgraphics.apache.org/fop/examples.html
Yes, let's just say I have a String with the XML to pass to FOP.

I understand and used the case where the image file is somewhere on the file system and refer that file as below:

<fo:external-graphic src='url("/images/logo.jpg")' content-width="100px" content-height="50px"/>

But the challenge I have here is to retrieve this image from the database and place it on the PDF while generating the PDF. If you mean that I need to first write it out to the file system and refer to it with a URL in the XML, then let me know. Otherwise, I was wondering if there is a way to send it across directly without writing to the file system?
Great -- you're right, that's the syntax for including the graphic.  Which means it's on a web server somewhere.  You might need to do a fully qualified url for the image, e.g., "http://x.y.com/images/logo.jpg" though -- we found that it bypassed some problems to use the fully qualified url.

Or are you just creating PDFs which are not being served by a web server?

Have you checked the xmlgraphics fop example in embedding called ExampleSVG2PDF.java?  That might be a good start for what you want.  I think you still will want to get the jpg from the db (it actually lives as a binary obj in the db?) and write it to the file system, but I'm sure you could also create an in-memory binary instantiation of the jpg and go from there.
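For the write-to-disk route mentioned here, a minimal sketch might look like the following. It uses only JDK classes; the class name TempLogoFile and the method writeLogo are made up for illustration, and it assumes the resulting file: URL is then placed in the src attribute of fo:external-graphic, in the same way as the /images/logo.jpg example earlier in the thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not from the thread): dumps image bytes to a temp file
// and hands back a file: URL that an fo:external-graphic src could point at.
final class TempLogoFile {

    static String writeLogo(byte[] imageBytes, String suffix) {
        try {
            // A fresh temp file per call, so different vendors never collide
            Path file = Files.createTempFile("logo-", suffix);
            // Ask the JVM to remove the file on normal exit
            file.toFile().deleteOnExit();
            Files.write(file, imageBytes);
            return file.toUri().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] placeholder = {1, 2, 3}; // stand-in for the bytes read from the DB
        String src = writeLogo(placeholder, ".png");
        System.out.println("<fo:external-graphic src=\"url('" + src + "')\"/>");
    }
}
```

Because each call creates its own file, one file per vendor logo is possible; whether leaving temp files around is acceptable depends on the deployment.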
If you are creating a PDF and have a web server for displaying the same document, we have had good luck lately with wkhtmltopdf:
https://code.google.com/p/wkhtmltopdf/

It's another approach.  Easier than FOP, but it has its own challenges.  Again -- the jpg will have to be written to the file system and be available to the web server as a url for generating the html page which will be converted to a PDF.
I realize that I cannot write the image out to the file system and then refer to it in the XML, as I will have different images that need to be associated depending on the business. For example, let's say I have different vendors and each vendor has a company logo; when I send this PDF to the buyer, he should see the logo of the vendor with whom he did business.

I think I will have to figure out a way to import this image on the fly from the DB on this PDF. BTW, I do not have an option of switching away from FOP at this point, though it is good to know.

Thanks,
I think the easiest method for you to use would be to code a custom URIResolver and make up your own URI scheme name to refer to your images. It is a bit hard to give you an example that is fully tailored to your situation since you haven't posted much of your code, but by going off what you have and some of the subsequent posts that you have made, I have come up with something that should hopefully be fairly close to what you need and that also runs standalone if you just want to check what I have done.

Ok, first is the java code of a class that runs standalone...
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.StringReader;

import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;

import org.apache.fop.apps.Fop;
import org.apache.fop.apps.FopFactory;
import org.apache.fop.apps.MimeConstants;

public class TestApacheFop {

    public static void main(String[] args) throws Exception {
        String inputXml = "<details vendor=\"cat\">These are the details of the CAT vendor</details>";
        
        FopFactory fopFactory = FopFactory.newInstance();
        fopFactory.setURIResolver(new DatabaseImageURIResolver());
        BufferedOutputStream os = new BufferedOutputStream(new FileOutputStream("fop.pdf"));
        Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, os);
        
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer(new StreamSource(new FileInputStream("fop.fo")));
        
        transformer.transform(new StreamSource(new StringReader(inputXml)), new SAXResult(fop.getDefaultHandler()));
        
        os.flush();
        os.close();
    }
    
    private static class DatabaseImageURIResolver implements URIResolver {
        @Override
        public Source resolve(String href, String base) throws TransformerException {
            if (href.startsWith("db:")) {
                // It's a db:vendorName URL so attempt to get the image from the db using the requested vendor name
                
                // Remove the "db:" prefix from the href string to give us the vendor name
                String vendor = href.substring(3);
                return new StreamSource(DatabaseUtils.getLogoImageStreamForVendor(vendor));
            }
            
            // Not for us to resolve
            return null;
        }
    }
    
    private static class DatabaseUtils {
        public static InputStream getLogoImageStreamForVendor(String vendor) {
            // Normally this method (or something like it) would actually query the DB, get the BLOB for the vendor argument supplied and return an InputStream from the BLOB
            //  There really shouldn't be a need to use an intermediate byte[] but if you had other requirements you could do that. Note that FOP will cache the image for
            //  a particular URI so even if you use the same image multiple times in your document, the DB will still only get queried once.
            
            // But in this example method, I will just load an image from the file system and return an InputStream for that....
            try {
                return new FileInputStream(vendor + ".png");
            } catch (FileNotFoundException e) {
                // Can't resolve the vendor name to a PNG file on the filesystem so return null;
                return null;
            }
        }
    }
    
}


And the XSL that is referred to in the above code (fop.fo). It's just a simple XSL that operates on that input XML but also generates the right URI to refer to the correct vendor logo...
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/details">
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" font-family="Times Roman" font-size="12pt">

<fo:layout-master-set>
    <fo:simple-page-master margin-right="1.5cm" margin-left="1.5cm" margin-bottom="2cm" margin-top="1cm" page-width="21cm" page-height="29.7cm" master-name="left">
      <fo:region-body margin-top="0.5cm" margin-bottom="1.7cm"/>
      <fo:region-before extent="0.5cm"/>
      <fo:region-after extent="1.5cm"/>
    </fo:simple-page-master>

</fo:layout-master-set>

<fo:page-sequence id="N2528" master-reference="left">

<fo:flow flow-name="xsl-region-body">
<fo:block>
  <fo:block font-size="16pt" font-weight="bold" space-before.minimum="1em" space-before.optimum="1.5em" space-before.maximum="2em">
  <xsl:value-of select="@vendor"/>
  </fo:block>
  <fo:block>
   <fo:external-graphic><xsl:attribute name="src" >db:<xsl:value-of select="@vendor"/></xsl:attribute></fo:external-graphic>
   <xsl:value-of select="."/>
  </fo:block>
</fo:block>

</fo:flow>
</fo:page-sequence>

</fo:root>
</xsl:template>
</xsl:stylesheet>


A few things to note...

- In this I have just used a static input XML rather than the object representation that you mention above, but there isn't a lot of difference. Just use whatever code you have to marshal your object into (I'm guessing) a DOM Source that you supply to the transform to get FOP to do its thing.

- You don't mention how the vendor name is selected in order to get the correct logo from the database. I have made an assumption that perhaps that vendor name might be part of the input XML (String or object) and so that is where the selection of the logo starts (the vendor="cat" attribute on the root element of the input XML)

- The XSL gets the vendor name from the input XML (in this case "cat") and it constructs the required fo:external-graphic element with its src attribute set (in this case to... src="db:cat"). The db: is what distinguishes this URI from any other, like http: or file:, etc

- The custom URIResolver in the java code above checks to see if the URI requested starts with db:   If it doesn't, it just returns "null" so that other URIResolvers (or other mechanisms) can be used to resolve the URI. This allows normal URIs that might refer to static images on the filesystem, for example, to continue to work normally. If it does start with db:   it strips off the db: part and then passes the rest of the URI ("cat" in this case) to some code that can return an InputStream for it. This will be your existing DB code, or something like it, but in this simple example it just looks to the file system for a file called cat.png

- The selection of "db" as the URI scheme prefix was entirely arbitrary. You could use any value that you are happy with, as long as the XSL creates the URI with the right prefix and the URIResolver detects the right prefix.
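The prefix check and the "return null when it is not ours" behaviour described in these notes can be tried out without FOP, because URIResolver and StreamSource are part of the JDK. The following is only a sketch: the class name PrefixImageResolver and the Function-based loader are made up for illustration, with the loader standing in for whatever DB lookup you already have.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.function.Function;

import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

// Hypothetical variation on DatabaseImageURIResolver: the scheme prefix and
// the lookup code are both passed in, so the same class works for any prefix.
final class PrefixImageResolver implements URIResolver {
    private final String prefix;
    private final Function<String, InputStream> loader;

    PrefixImageResolver(String scheme, Function<String, InputStream> loader) {
        this.prefix = scheme + ":";
        this.loader = loader;
    }

    @Override
    public Source resolve(String href, String base) {
        if (href == null || !href.startsWith(prefix)) {
            // Not for us to resolve; let other mechanisms handle it
            return null;
        }
        // Everything after "<scheme>:" is the key handed to the loader
        String key = href.substring(prefix.length());
        InputStream in = loader.apply(key);
        return in == null ? null : new StreamSource(in);
    }

    public static void main(String[] args) {
        byte[] placeholder = {1, 2, 3}; // stand-in for a logo loaded from the DB
        PrefixImageResolver resolver = new PrefixImageResolver("db",
                vendor -> "cat".equals(vendor) ? new ByteArrayInputStream(placeholder) : null);

        System.out.println("db:cat resolved: " + (resolver.resolve("db:cat", null) != null));
        System.out.println("file URI resolved: " + (resolver.resolve("file:/images/logo.jpg", null) != null));
    }
}
```

Returning null for an unknown key as well as for a foreign scheme is a design choice in this sketch; throwing a TransformerException for an unknown vendor would be an equally reasonable alternative.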



Hopefully, this is all fairly self-explanatory but I do realise that there are a lot of moving parts to this, so if you have any questions, just let me know!
Thanks mccarl, I haven't had a chance to get to this yet, but it looks like this should work, and maybe I will need some help in the process of getting to it. I will implement it and then accept the solution soon.
Thanks again,
Yeah, no worries. Let me know if there are any problems, whenever you are able to try it out.
Hey Mccarl,
Sorry for taking so long to get to this. First, let me make sure I understand your solution: we set the URI resolver on the FopFactory object, and it then resolves the URI on the fly during the transformation by doing a DB query to get the corresponding image stream, right?

But in my case I already have the byte stream with me while I'm preparing this XML document, and I think all I need to know is how to set this byte array stream into the XML so that when I feed the XML to FOP and apply the XSL template I can read the stream, i.e. display the image on the PDF. I apologize if what I said made no sense.

I would really appreciate your help,

Thanks for your patience!
I think it would help if you looked at some of the documentation we have been recommending.  The apache fop site has specific examples about instream svg images:
http://xmlgraphics.apache.org/fop/examples.html

Try the recommended methods, and post the results.
But in my case I already have the byte stream with me while I'm preparing this XML document
Ok, I was just assuming that the only reason that you already had the byte stream was because you had just read it from the DB, and I was just combining the two steps of reading from the DB and producing the PDF into one step and making the code a little bit more streamlined. What this approach allows is the ability for the input document to refer to any number of different images, and just have them resolved from the DB as needed.

But anyway, I don't fully know your entire requirements or exactly what you are doing so if you have the byte stream already (due to some other reason) and you can't defer reading it from the DB until you are producing the document then we can still do it that way.

Just let me ask one question though... Is there only one image that needs to be written into the output file? (There may be multiple instances of that image, but only one distinct image). If so, you can do something like this...
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.StringReader;

import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;

import org.apache.fop.apps.Fop;
import org.apache.fop.apps.FopFactory;
import org.apache.fop.apps.MimeConstants;

public class TestApacheFop {

    public static void main(String[] args) throws Exception {
        String inputXml = "<details vendor=\"cat\">These are the details of the CAT vendor</details>";
        
        byte[] image = object.getLogo();            //  Or however you get the byte[] that contains your image data

        FopFactory fopFactory = FopFactory.newInstance();
        fopFactory.setURIResolver(new ByteArrayURIResolver(image));
        BufferedOutputStream os = new BufferedOutputStream(new FileOutputStream("fop.pdf"));
        Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, os);
        
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer(new StreamSource(new FileInputStream("fop.fo")));
        
        transformer.transform(new StreamSource(new StringReader(inputXml)), new SAXResult(fop.getDefaultHandler()));
        
        os.flush();
        os.close();
    }
    
    private static class ByteArrayURIResolver implements URIResolver {
        private byte[] image;

        public ByteArrayURIResolver(byte[] image) {
            this.image = image;
        }

        @Override
        public Source resolve(String href, String base) throws TransformerException {
            if (href.startsWith("mem:")) {
                // It's a mem:_______ URL so stream out the bytes from the byte array
                return new StreamSource(new ByteArrayInputStream(image));
            }
            
            // Not for us to resolve
            return null;
        }
    }
    
}


<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/details">
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" font-family="Times Roman" font-size="12pt">

<fo:layout-master-set>
    <fo:simple-page-master margin-right="1.5cm" margin-left="1.5cm" margin-bottom="2cm" margin-top="1cm" page-width="21cm" page-height="29.7cm" master-name="left">
      <fo:region-body margin-top="0.5cm" margin-bottom="1.7cm"/>
      <fo:region-before extent="0.5cm"/>
      <fo:region-after extent="1.5cm"/>
    </fo:simple-page-master>

</fo:layout-master-set>

<fo:page-sequence id="N2528" master-reference="left">

<fo:flow flow-name="xsl-region-body">
<fo:block>
  <fo:block font-size="16pt" font-weight="bold" space-before.minimum="1em" space-before.optimum="1.5em" space-before.maximum="2em">
  <xsl:value-of select="@vendor"/>
  </fo:block>
  <fo:block>
   <fo:external-graphic src="mem:dummyString" />
   <xsl:value-of select="."/>
  </fo:block>
</fo:block>

</fo:flow>
</fo:page-sequence>

</fo:root>
</xsl:template>
</xsl:stylesheet>


Note: I have changed the names of some things to make it more obvious that the image bytes aren't directly from the DB but from a byte stream in memory. In particular, the URI in the <fo:external-graphic> element now just has to start with "mem:", the part after that is not used, because there is only one image that can be inserted (the one passed to the constructor of ByteArrayURIResolver).

This is why I originally wrote it to refer to the DB directly, because then you can potentially access any number of images during the processing of the file. If you still need to use more than one image, and you still can't do it directly from the DB, let me know because we could modify things to be able to access multiple byte[] arrays (but I just didn't want to complicate things)
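For what it's worth, the multiple-image variant could be sketched roughly like this, using the part of the URI after "mem:" as a lookup key instead of ignoring it. The class name NamedByteArrayURIResolver and its put method are made up for illustration; only JDK classes are used, so it can be tried without FOP.

```java
import java.io.ByteArrayInputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import javax.xml.transform.Source;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

// Hypothetical extension of ByteArrayURIResolver: several in-memory images,
// each registered under a name and referenced from the XSL as mem:<name>.
final class NamedByteArrayURIResolver implements URIResolver {
    private static final String PREFIX = "mem:";
    private final Map<String, byte[]> images = new ConcurrentHashMap<>();

    // Register an image; the array is copied so later changes by the caller
    // do not affect what gets rendered.
    void put(String name, byte[] image) {
        images.put(name, image.clone());
    }

    @Override
    public Source resolve(String href, String base) {
        if (href == null || !href.startsWith(PREFIX)) {
            // Not for us to resolve
            return null;
        }
        byte[] image = images.get(href.substring(PREFIX.length()));
        // An unregistered name also falls through as "not resolved"
        return image == null ? null : new StreamSource(new ByteArrayInputStream(image));
    }

    public static void main(String[] args) {
        NamedByteArrayURIResolver resolver = new NamedByteArrayURIResolver();
        resolver.put("logo", new byte[] {1, 2, 3});      // placeholder bytes
        resolver.put("signature", new byte[] {4, 5, 6}); // placeholder bytes

        System.out.println("mem:logo resolved: " + (resolver.resolve("mem:logo", null) != null));
        System.out.println("mem:missing resolved: " + (resolver.resolve("mem:missing", null) != null));
    }
}
```

The XSL would then use, for example, src="mem:logo" and src="mem:signature" in separate fo:external-graphic elements.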
Mccarl,

Thanks for your response again!

Yes, there is only one image that needs to be written out to the output file.

"The question I have is, do I have to use URI Resolver?"

As I said earlier, I have a byte[] and an "object representation" (i.e. a pojo) of an XML document, and all I have to do is set this streamsource into that "pojo". It might be as simple as it sounds, but my problem is that I'm not sure what data type I need to use for the element in the "XSD" that generates this "pojo".

Note: I have some architectural constraints in using URI resolver.

Ex:
 in the xsd:

     <xsd:schema>
       <xsd:complexType name="POLICY_DATA">
         <xsd:sequence>
           .
           .
           <xsd:element name="LOGO" type="xsd:?????" />
           .
           .
         </xsd:sequence>
       </xsd:complexType>
     </xsd:schema>

in JAVA:

dataObj.setLOGO(new StreamSource(new ByteArrayInputStream(image)));

//Where dataObj is the object representation of the xsd mentioned above.

Thanks for your help!
all I have to do is set this streamsource into that "pojo"..
I think you are looking at this in the wrong way. Yes, there are ways to set binary content into the POJO so that it ultimately appears in the XML, in fact there are probably many ways of doing this.

However, the problem is in what Apache FOP will accept in terms of input XML to be able to produce the output PDF file. And (as we have said a few times now) I don't think Apache FOP has ANY way of accepting the image bytes from directly in the XML input. Except, as mrcoffee has said, if your image is in an XML representation, such as SVG. But if you are dealing with just the bytes of, for example, a JPEG image, then as far as I know there is no way.
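One avenue that has not come up in this thread and may be worth checking: the FOP documentation for later releases describes support for RFC 2397 data: URLs in fo:external-graphic. If the FOP version in use supports that (please verify against its documentation, this is an assumption here), the image bytes could travel inside the XML as Base64 text. A minimal sketch of building such a URL, with a made-up class name and the MIME type assumed to be known by the caller:

```java
import java.util.Base64;

// Hypothetical helper: turns raw image bytes into an RFC 2397 data: URL string.
final class DataUriBuilder {

    static String toDataUri(String mimeType, byte[] data) {
        // Standard (not URL-safe) Base64, without line breaks
        return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] placeholder = {1, 2, 3}; // stand-in for real PNG or JPEG bytes
        System.out.println(toDataUri("image/png", placeholder));
    }
}
```

Under that assumption, the XSD element could be declared as xsd:string if the pojo holds the whole data: URL, or as xsd:base64Binary (which JAXB maps to byte[]) if the XSL adds the "data:image/png;base64," prefix itself. Large images will make the XML correspondingly large.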


So the question comes back to...
I have some architectural constraints in using URI resolver
What are these perceived constraints? I would be surprised if they are not able to be overcome with a bit of help and direction. As I don't see any other way to do what you want to do, could you post the issue that you see (and hopefully some code) and we can work through those?
Thanks for the explanation and I think I'm starting to understand...and I'm with you now but I still have to overcome a problem.
      
Here is the problem..

As I mentioned, I'm populating a "data bean", which happens in a separate class; let's call it the "Bean Builder" class. After the bean is prepared, the call is directed to the class where I instantiate the FopFactory; let's call this the "Report Builder" class (please find the code below). In the "Report Builder" class, all I am supposed to do is read the XML as a string and the template as a byte array. I'm not supposed to set the "image" byte stream there, but in the "Bean Builder" class.

Now, should I be creating another FopFactory instance and setting the MIME types etc. in the "Bean Builder" class? If so, should I be worried about memory issues? Or is there a way that I could reuse the same instance?

I hope this makes some sense; please let me know if you need more info. Again, thanks for your patience.
      
public ByteArrayOutputStream buildPdfReport(PdfDataBean dataBean) throws Exception {
            String xmlString = dataBean.getXmlDataString();
            byte[] template = dataBean.getXslData();

            FopFactory fopFactory = FopFactory.newInstance();

            //IMPORTANT: MUST load font config before creating userAgent
            loadFontConfig(fopFactory);

//Note: for debugging ONLY the javax.xml.transform.TransformerFactory will show line #s
            //TransformerFactory tFactory = TransformerFactory.newInstance();
            TransformerFactory tFactory = TransformerFactoryImpl.newInstance();

            ByteArrayOutputStream out = new ByteArrayOutputStream();

            //Setup custom FOUserAgent
            FOUserAgent userAgent = fopFactory.newFOUserAgent();

            //this is necessary to include images
            URL baseURL = this.getClass().getClassLoader().getResource("fop");
            if (null != baseURL) {
                  userAgent.setBaseURL(baseURL.toExternalForm());
            }

            //Setup FOP
            try {
                  Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, userAgent, out);

                  ByteArrayInputStream bs = new ByteArrayInputStream(template);
                  URL commonTemplatesUrl = PdfReportBuilder.class.getResource("/fop/xsl/");
                  Source xsltSrc = new StreamSource(bs, commonTemplatesUrl.toExternalForm());

                  //Setup Transformer
                  Transformer transformer = tFactory.newTransformer(xsltSrc);

                  //Make sure the XSL transformation's result is piped through to FOP
                  Result res = new SAXResult(fop.getDefaultHandler());
                  StringReader reader = new StringReader((String) xmlString);

                  //Setup input
                  StreamSource src = new StreamSource();
                  src.setReader(reader);

                  //Start the transformation and rendering process
                  transformer.transform(src, res);


            } catch (Exception e) {
                  log.error("Error creating pdf ", e);
                  throw e;
            }

            return out;

      }
Ok, so is this a possibility...

Change the definition of PdfDataBean so that you can store the byte[] within it, i.e. have a method such as public void setLogo(byte[] image) that you call from your BeanBuilder class to store the byte[] of the image. You would also have a method such as public byte[] getLogo() that you would call from your ReportBuilder class to get the image bytes. Then your buildPdfReport method could look like this...
public ByteArrayOutputStream buildPdfReport(PdfDataBean dataBean) throws Exception {
            String xmlString = dataBean.getXmlDataString();
            byte[] template = dataBean.getXslData();
            byte[] image = dataBean.getLogo();

            FopFactory fopFactory = FopFactory.newInstance();
            fopFactory.setURIResolver(new ByteArrayURIResolver(image));


            //IMPORTANT: MUST load font config before creating userAgent
            loadFontConfig(fopFactory);

//Note: for debugging ONLY the javax.xml.transform.TransformerFactory will show line #s
            //TransformerFactory tFactory = TransformerFactory.newInstance();
            TransformerFactory tFactory = TransformerFactoryImpl.newInstance();

            ByteArrayOutputStream out = new ByteArrayOutputStream();

            //Setup custom FOUserAgent
            FOUserAgent userAgent = fopFactory.newFOUserAgent();

            //this is necessary to include images
            URL baseURL = this.getClass().getClassLoader().getResource("fop");
            if (null != baseURL) {
                  userAgent.setBaseURL(baseURL.toExternalForm());
            }

            //Setup FOP
            try {
                  Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, userAgent, out);

                  ByteArrayInputStream bs = new ByteArrayInputStream(template);
                  URL commonTemplatesUrl = PdfReportBuilder.class.getResource("/fop/xsl/");
                  Source xsltSrc = new StreamSource(bs, commonTemplatesUrl.toExternalForm());

                  //Setup Transformer
                  Transformer transformer = tFactory.newTransformer(xsltSrc);

                  //Make sure the XSL transformation's result is piped through to FOP
                  Result res = new SAXResult(fop.getDefaultHandler());
                  StringReader reader = new StringReader((String) xmlString);

                  //Setup input
                  StreamSource src = new StreamSource();
                  src.setReader(reader);

                  //Start the transformation and rendering process
                  transformer.transform(src, res);


            } catch (Exception e) {
                  log.error("Error creating pdf ", e);
                  throw e;
            }

            return out;

      }


And ByteArrayURIResolver is defined exactly the same as in my previous post, and your XSL is something like what I posted in the previous post too, i.e. the <fo:external-graphic src="mem:dummyString" /> line of that example is what you need to place in YOUR xsl file at the place where you would like the logo image to appear.
"Ok, so is this a possibility..."

I'm afraid not. I did think about that, but here is the thing: "buildPdfReport" belongs to a common class that is used by different "BeanBuilder" classes, so I can't set something there that is specific to one BeanBuilder class.
ASKER CERTIFIED SOLUTION
mccarl
mccarl,
I was convinced by your solution but was not so sure about the "Code Review" thing; I went ahead and implemented it anyway. So far things look OK and I haven't heard anything from the reviewer.

I really appreciate your help and patience!!

Thanks a lot again!!
You're welcome. I'm glad that you were able to get something working! :)