Urgently need upload advice

Gene Klamerus
Last Modified: 2012-05-10
Okay,

This is related to a previous post, but updated as facts have changed.

We have a Java/JBoss solution that receives file uploads from clients.  Some clients are fairly limited in capability, but they can do HTTP POSTs, so we have them POSTing XML.  SOAP and other approaches are too complex for some of them.

This application needs to "catch" each file along with a few attribute fields, which are used to store the file in our repository.

Both our server and the client applications are all running in our data center with good fast networking.

What we have in place is that the clients create an XML string with a few tags for the attributes, and the content of the file is base64 encoded in a <Content></Content> tag.

The XML looks something like:

<?xml version="1.0" encoding="UTF-8"?>
<Request Type="CreateDocument">
      <CreateDocument DocBase="asdfaqsd" Username="loginID" Password="qkjowe">
            <DocInfo Type="invoice" ParentFolders="\Temp\subfolder" ACL="customer_invoice_acl">
                  <Attributes>
                        <Attrib Name="object_name" Value="customer_invoice.doc" />
                        <Attrib Name="authors" Value="Bill|Bob|Joe" />
                  </Attributes>
                  <Content Type="pdf">as1239asdfjlkjqweoyalkj12#98732j#4234uoiu23423kjad</Content>
            </DocInfo>
      </CreateDocument>
</Request>

The files are up to 50 MB so far, but we could have some that are 150 MB (on disk).  This makes for a very large XML document.
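As a rough illustration of the sizes involved, here is a minimal sketch of the arithmetic. It relies on two general facts rather than anything measured on our system: base64 emits 4 characters for every 3 input bytes, and a Java String of this era holds its text as UTF-16 (2 bytes per character). The class and method names are made up for the example.

```java
// Hypothetical helper, only for estimating sizes.
class PayloadSize {
    /** Base64 emits 4 characters for every 3 input bytes, rounding up. */
    static long base64Chars(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    /** Heap bytes needed if the text is held as UTF-16 chars (2 bytes each). */
    static long utf16Bytes(long chars) {
        return 2 * chars;
    }
}
```

For a 150 MB file (150 * 1024 * 1024 bytes), base64Chars gives about 200 million characters, and holding that as a single in-memory string takes about 400 MB before any copies are made.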

We don't have network issues because this is all in the data center, and the links are pretty fast.

Originally we were fighting with the heap not being big enough for these files, but after some adjustments that's okay.

Now we're using up CPU in parsing the XML.

From what people are telling me, it's the SAX parsing: it has to reallocate a string large enough to hold the content as it's uploaded, copy the old string into the new one, append some more data, and then repeat, repeat, repeat.  It's taking 2.5 hours to complete.
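If the cause really is the copy-and-append pattern described above (I have not confirmed that), the cost can be estimated by counting copied characters. The sketch below is only a model: the 8192-character chunk size, the class name, and the method names are assumptions for illustration, and the second method approximates a buffer that doubles its capacity, which is roughly how StringBuilder grows.

```java
// Hypothetical cost model; counts character copies, does not move real data.
class AppendCost {
    /** Each append builds a new string holding the old contents plus the chunk. */
    static long charsCopiedByConcat(long total, long chunk) {
        long copied = 0, length = 0;
        while (length < total) {
            length += Math.min(chunk, total - length);
            copied += length; // the whole new string is written on every append
        }
        return copied;
    }

    /** Buffer that doubles its capacity when full; initialCapacity must be > 0. */
    static long charsCopiedByDoubling(long total, long chunk, long initialCapacity) {
        long copied = 0, length = 0, capacity = initialCapacity;
        while (length < total) {
            long n = Math.min(chunk, total - length);
            if (length + n > capacity) {
                while (capacity < length + n) capacity *= 2;
                copied += length; // existing contents move to the larger array once
            }
            copied += n; // the new chunk itself
            length += n;
        }
        return copied;
    }
}
```

For about 200 million characters arriving in 8192-character chunks, the first method comes out at roughly 2.7 trillion character copies, while the doubling buffer stays in the hundreds of millions. That gap would be consistent with a multi-hour run, though it does not prove where our time is actually going.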

I'm wondering if there isn't a simpler way to deal with this upload than parsing it through SAX, so that we can avoid this.  In the end we need to turn it into a byte[] to load into our storage system.

Is there a way to send an XML POST to a file directly, or to begin sending just the <Content> data to a file, so that we're not fighting with all this string extending and processing?  We can't really move to one of the more elegant solutions of attachments and such.
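To make the question concrete, this is the kind of thing I have in mind: a minimal sketch, not our actual code, of a SAX handler that decodes the base64 text of <Content> chunk by chunk and writes the bytes straight to an OutputStream, so the full string is never assembled. It assumes Java 8 or later for java.util.Base64 (an older JVM would need a different decoder), a parser that is not namespace aware so qName is "Content", and the class name ContentExtractor is made up.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Base64;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: streams decoded <Content> bytes to 'out'.
class ContentExtractor extends DefaultHandler {
    private final OutputStream out;
    // Holds only the current chunk plus at most 3 leftover base64 characters.
    private final StringBuilder pending = new StringBuilder();
    private boolean inContent;

    ContentExtractor(OutputStream out) {
        this.out = out;
    }

    /** Parses the XML from 'xml' and writes the decoded content to 'out'. */
    static void extract(InputStream xml, OutputStream out) throws Exception {
        SAXParserFactory.newInstance().newSAXParser().parse(xml, new ContentExtractor(out));
        out.flush();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("Content".equals(qName)) inContent = true;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (!inContent) return;
        for (int i = start; i < start + length; i++) {
            if (!Character.isWhitespace(ch[i])) pending.append(ch[i]); // drop line breaks
        }
        // Base64 decodes in groups of 4 characters; keep the remainder for the next chunk.
        decodeAndWrite(pending.length() - pending.length() % 4);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("Content".equals(qName)) {
            decodeAndWrite(pending.length()); // final group, padded or not
            inContent = false;
        }
    }

    private void decodeAndWrite(int count) throws SAXException {
        if (count == 0) return;
        byte[] decoded = Base64.getDecoder().decode(pending.substring(0, count));
        pending.delete(0, count);
        try {
            out.write(decoded);
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }
}
```

In a servlet this would presumably be called with the request's input stream and a buffered FileOutputStream for a temp file, with the attribute tags collected separately in startElement. Whether that fits into our JBoss setup is exactly what I'm unsure about.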

We were doing something like this previously with ASP and didn't seem to have issues, but we certainly do with Java and JBoss.
