How to remove empty lines from the HTML result of an XQuery with Saxon?

I run an XQuery with Saxon (and TagSoup) from the command line, and the HTML result contains runs of unwanted consecutive empty lines.

Is there something like an XQuery "declare option" or a command-line switch to strip all the empty lines, or at least to collapse the runs of consecutive empty lines?
Geert Bormans (Information Architect) commented:
Since you are outputting text streams, I don't think there is a strip-space-like function in XQuery.
But I guess you can achieve a lot just by using normalize-space().
lucavilla (Author) commented:
Here I'm outputting HTML streams (see "html result"). Does that call for a different solution?

Where should I put the normalize-space() instruction?
Geert Bormans (Information Architect) commented:
Whichever way you look at it, the newlines are in text nodes.
It is a matter of finding where they are structurally and then using normalize-space() on those text nodes.

normalize-space() strips leading and trailing whitespace and changes every internal sequence of whitespace (spaces, tabs, newlines) into a single space. It is a very important function for normalisation of whitespace.
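
To make that behaviour concrete, here is a minimal sketch. The sample string is made up for illustration and simply stands in for a text node that contains stray newlines:

```xquery
(: Hypothetical sample: leading spaces, three consecutive newlines
   (written as character references) and some extra inner spaces. :)
let $raw := '  first line&#10;&#10;&#10;   second   line  '
return normalize-space($raw)
(: result: "first line second line" :)
```

Note that the newlines are not just thinned out: they are replaced by a single space, so the line structure of the original text is lost.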

I think you need to take stricter control of what you pass through and what you don't, and use normalize-space() on every text string that has spurious whitespace.
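
One possible way to apply this to a whole result, sketched under some assumptions: the function name local:clean is hypothetical, and doc('test.html') stands for the already-parsed (TagSoup) input. The function copies the tree, drops whitespace-only text nodes and normalizes the others:

```xquery
(: local:clean is a hypothetical helper name, not part of Saxon or XQuery. :)
declare function local:clean($n as node()) as node()* {
  typeswitch ($n)
    (: rebuild the document node around the cleaned children :)
    case document-node() return
      document { for $c in $n/node() return local:clean($c) }
    (: copy the element and its attributes, then recurse into the children :)
    case element() return
      element { node-name($n) } {
        $n/@*,
        for $c in $n/node() return local:clean($c)
      }
    (: drop whitespace-only text nodes, normalize the rest :)
    case text() return
      if (normalize-space($n) = '')
      then ()
      else text { normalize-space($n) }
    (: comments and processing instructions pass through unchanged :)
    default return $n
};

local:clean(doc('test.html'))
```

This is deliberately blunt. In mixed content it also removes the spaces next to inline elements (the text around a b or a element, for example), and it flattens the content of pre elements, so in practice it would probably have to be restricted to the elements where the spurious newlines actually occur.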

It sounds as if you are grabbing a web page (I assume that because of the TagSoup), doing some operations on it and spitting it back out as a web page. Have you considered XSLT instead of XQuery? It sounds like a better tool for such an operation, since XSLT gives better control over whitespace (and mixed content).

lucavilla (Author) commented:
Hmmm, you're convincing me to switch to XSLT.

I'm grabbing web pages and checking them for specific content through XPath expressions.

How would I convert a simple XQuery like the following into XSLT?
 declare default element namespace '';
 declare option saxon:output 'method=html';
lucavilla (Author) commented:
By the way, what about "XQuery Update" vs. XSLT?
Geert Bormans (Information Architect) commented:
What is the point of having html as the output method if all you output is text?
Is it part of a larger process, and are you making sure the character set is HTML-ready?
That would be the only use I see for this, and there must be better ways to do that.

XSLT would easily add the skeleton tags (html, body, etc.),
but if you just need the string, this XQuery would do fine.

I assume there are many span elements in this second div and that the newlines are in there:
doc('test.html')//*[@id="foobar"]/div/div[2]/span/text()/tokenize(normalize-space(.), ' ')
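
If a single string is wanted rather than a sequence of tokens, a variant along the following lines might do. It is only a sketch, reusing the same assumed path and file name as above: the text nodes are joined with a space first and the result is normalized once.

```xquery
(: Join all matching text nodes with a space, then collapse the whitespace once. :)
normalize-space(
  string-join(doc('test.html')//*[@id="foobar"]/div/div[2]/span/text(), ' ')
)
```

Joining before normalizing means that empty or whitespace-only text nodes do not leave double spaces behind in the result.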

You can't be updating an XML database (XQuery Update) here, because the XQuery works on the intermediate result that comes out of TagSoup. With the TagSoup layer in between, the XQuery has no access to the original source.
