This is related to a previous post, but updated as facts have changed.
We have a Java JBoss solution which receives file uploads from clients. Some clients are fairly limited in capability, but they can do POSTs, so we have them POSTing XML. SOAP and other approaches are too complex for some of them.
This application needs to "catch" the files and a few attribute fields which are used to store them in our repository.
Both our server and the client applications are all running in our data center with good fast networking.
What we have in place is that the clients create an XML string with a few tags for the attributes and then the content of the file is base64 encoded in a <content></content> tag.
The XML looks something like:
<?xml version="1.0" encoding="UTF-8"?>
<CreateDocument DocBase="asdfaqsd" Username="loginID" Password="qkjowe">
<DocInfo Type="invoice" ParentFolders="\Temp\subfolder" ACL="customer_invoice_acl">
<Attrib Name="object_name" Value="customer_invoice.doc" />
<Attrib Name="authors" Value="Bill|Bob|Joe" />
The files are up to 50 MB so far, but we could have some that are 150 MB (on disk). This makes for a large XML.
Network speed isn't the problem, because everything is in the data center and the links are fast.
Originally we were fighting with the heap not being big enough for these files, but now with adjustments that's okay.
Now we're using up CPU in parsing the XML.
From what people are telling me, it's the SAX parser: as the upload arrives, it has to reallocate a string large enough to hold the data so far, copy the old string into the new one, append some more data, and then repeat, repeat, repeat. An upload takes 2.5 hours to complete.
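To put rough numbers on the pattern described above, here is a small counting model. It is only a sketch: the class name, the method names and the chunk sizes are hypothetical, and it models the copying I've been told about rather than measuring what our parser actually does. It compares growing a buffer by exactly one chunk per append with doubling the capacity when it runs out.

```java
// Hypothetical model of buffer growth, not a measurement of any real parser.
class CopyCost {

    // Every append allocates a buffer of exactly old + chunk characters and
    // copies everything into it, so each step copies the whole new size.
    // Assumes totalChars is a multiple of chunk.
    static long copiedWhenGrowingByChunk(long totalChars, long chunk) {
        long copied = 0;
        for (long size = chunk; size <= totalChars; size += chunk) {
            copied += size;
        }
        return copied;
    }

    // The capacity doubles when the next chunk would not fit, so the existing
    // contents are only recopied occasionally.
    static long copiedWithDoubling(long totalChars, long chunk) {
        long copied = 0;
        long capacity = chunk;
        long size = 0;
        while (size < totalChars) {
            if (size + chunk > capacity) {
                capacity *= 2;
                copied += size; // move existing contents to the larger buffer
            }
            copied += chunk;    // append the new chunk
            size += chunk;
        }
        return copied;
    }

    public static void main(String[] args) {
        long total = 200L * 1024 * 1024; // assumed size of the base64 text
        long chunk = 8L * 1024;          // assumed chunk size per append
        System.out.println("grow by chunk: " + copiedWhenGrowingByChunk(total, chunk));
        System.out.println("doubling:      " + copiedWithDoubling(total, chunk));
    }
}
```

If the first pattern really is what's happening, the copying grows with the square of the file size, which would fit the runtimes we're seeing.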
I'm wondering if there isn't a simpler way to deal with this upload than accumulating the whole thing through SAX. In the end we need to turn the content into bytes to load into our storage system.
Is there a way to send an XML POST directly to a file, or to start writing just the <content> data to a file as it arrives, so that we're not fighting with all this string extending and processing? We can't really move to one of the more elegant solutions such as attachments.
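To make the question concrete, this is roughly the kind of thing I have in mind: a SAX handler that decodes the base64 text of <content> chunk by chunk and writes the bytes straight to an output stream, instead of building one big string. It is a minimal sketch, not tested against our payloads. The class name ContentStreamer and the extract helper are made up for illustration, and it assumes the element is literally named content and that Java 8's java.util.Base64 is available.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Base64;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: streams the decoded <content> payload to 'out'.
class ContentStreamer extends DefaultHandler {
    private final OutputStream out;
    // Holds only characters that have not been decoded yet.
    private final StringBuilder pending = new StringBuilder();
    private boolean inContent;

    ContentStreamer(OutputStream out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if ("content".equals(qName)) {
            inContent = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!inContent) {
            return;
        }
        // SAX may deliver the text in many chunks; drop line breaks and spaces.
        for (int i = start; i < start + length; i++) {
            if (!Character.isWhitespace(ch[i])) {
                pending.append(ch[i]);
            }
        }
        flush(false);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if ("content".equals(qName)) {
            flush(true);
            inContent = false;
        }
    }

    // Base64 decodes in groups of 4 characters, so only whole groups are
    // decoded mid-stream; any remainder waits for the next chunk.
    private void flush(boolean last) {
        int usable = last ? pending.length() : pending.length() - pending.length() % 4;
        if (usable == 0) {
            return;
        }
        byte[] decoded = Base64.getDecoder().decode(pending.substring(0, usable));
        try {
            out.write(decoded);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        pending.delete(0, usable);
    }

    // Convenience wrapper: parse 'xml' and write the decoded content to 'out'.
    static void extract(InputStream xml, OutputStream out) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(xml, new ContentStreamer(out));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

On the server, the input would presumably be the request's input stream and the output a FileOutputStream on a temp file, so memory use stays at roughly one parser chunk regardless of file size. The attribute tags would still need to be handled in startElement.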
We were doing something like this previously with ASP and didn't seem to have issues, but we certainly are with our Java and JBoss.