Marco Gasi


Formatting problem saving parsed HTML content to a file

Hi all.
I'm using simple_html_dom.php to parse some pages. Everything works fine, but when I need the li content I get the whole list as one item instead of getting each list element separately.

I use this function:
function getTextBetweenTags( $string, $tagname )
{
	global $tokens;
	$html = new simple_html_dom();
	$html->load( $string );
	foreach ( $html->find( $tagname ) as $element )
	{
		$tokens[] = $element->plaintext;
	}
}

I only run this on my localhost, so I'm not worried about using global. Now suppose I have this HTML:

<h1>header1 </h1>
<h3>header3 </h3>
<ul>
<li>item1</li>
<li>item2</li>
<li>item3</li>
</ul>

Using the function above I get the correct result, and if I print the resulting array I get 5 array elements. But I want to put these elements in a JSON file to speed up the use of a jQuery plugin for instant translation (jquery.lang.js). So I'm using this piece of code:

		$json = array();
		foreach ($tokens as $t)
		{
			$t = trim($t);
			// Wrap the key in literal double quotes so the output matches the expected format
			$json[] = '"' . $t . '"' . ":\r\n" . "\"\",\r\n";
		}

I would expect to get this:

"header1":
"",
"header3":
"",
"item1":
"",
"item2":
"",
"item3":
"",

But I get this instead:

"header1":
"",
"header3":
"",
"item1             item2                item3":
"",

Any idea?
Thanks in advance
Marco

Ray Paseur

I think I might approach this a little differently.  Also, if you haven't seen it yet, this is one of the best comments ever on the difficulties of parsing complex markup.

Please see: http://iconoun.com/demo/temp_marqusg.php
<?php // demo/temp_marqusg.php

/**
 * http://www.experts-exchange.com/questions/28692697/Fromatting-problem-saving-to-file-parsed-html-content.html
 *
 * http://php.net/manual/en/function.simplexml-load-string.php
 * http://php.net/manual/en/function.json-encode.php
 */
error_reporting(E_ALL);
echo '<pre>';

// SOME TEST DATA
$htm = <<<EOD
<h1>header1 </h1>
<h3>header3 </h3>
<ul>
<li>item1</li>
<li>item2</li>
<li>item3</li>
</ul>
EOD;

// WRAP THE HTML INTO AN XML DOCUMENT
$doc = '<wrap>' . $htm . '</wrap>';

// TRY TO MAKE AN OBJECT
$obj = SimpleXML_Load_String($doc);

// ACTIVATE THIS TO SEE THE OBJECT
// var_dump($obj);

// TRY TO MAKE A JSON STRING
$jso = json_encode($obj, JSON_PRETTY_PRINT);
echo htmlentities($jso);

Marco Gasi (ASKER)

Lol, I had read that post: that is the reason I moved to a DOM parser script...
Thanks for your reply, Ray: your script is wonderful. But given that I need output like the one I described above, do I then have to process the JSON produced by your code to format it as I need, or is there some other technique to do it?
Another important point is that I don't need the whole document content, just the content of some tags, leaving the rest as it is. As I said, I use this to speed up the creation of the JSON files which will hold the translation of the website text, so I only need to parse the tags that contain text to translate. Since I have a series of pages which are all identical (they describe the company products), I know I need to translate just the h1, h3, h4 and li elements.
Well, if I use the native DOM parser directly:

		$dom = new DOMDocument;
		$dom->loadHTML( $content );
		$li = $dom->getElementsByTagName('li');
		foreach ( $li as $l )
		{
			$tokens[] = $l->nodeValue;
		}
		foreach ($tokens as $t)
		{
			echo '"' . $t . '"' . ":<br>" . "\"\",<br>";
		}

I get this:

"item1 item2 item3":
"",
"item1":
"",
"item2":
"",
"item3":
"",

That is, for each ul tag I first get all the li items merged into one array item, and then I get them separately. What does this mean?

Your output looks more like the third element in tokens is your "ul" element instead of three "li" elements.

Can you show the code you're using to call getTextBetweenTags?

Also, you're loading up the DOM every time that getTextBetweenTags is called. It'd be a lot more efficient to load the DOM once and have getTextBetweenTags call that loaded/parsed object each time.
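
The load-once idea can be sketched roughly as follows. This is a minimal sketch that swaps in PHP's built-in DOMDocument for simple_html_dom so that it is self-contained; `collectText()` is a hypothetical helper name, not something from either library.

```php
<?php
// Sketch: parse the markup once, then query the same parsed object per tag name.
// collectText() is a hypothetical helper introduced for illustration only.

function collectText(DOMDocument $dom, array $tagNames): array
{
    $tokens = [];
    foreach ($tagNames as $tagName) {
        foreach ($dom->getElementsByTagName($tagName) as $element) {
            // Collapse runs of whitespace so source indentation does not leak into the text
            $text = trim(preg_replace('/\s+/', ' ', $element->textContent));
            if ($text !== '') {
                $tokens[] = $text;
            }
        }
    }
    return $tokens;
}

$content = <<<EOD
<h1>header1 </h1>
<h3>header3 </h3>
<ul>
<li>item1</li>
<li>item2</li>
<li>item3</li>
</ul>
EOD;

libxml_use_internal_errors(true);   // keep parser warnings out of the page output
$dom = new DOMDocument();
$dom->loadHTML($content);           // the document is parsed exactly once
libxml_clear_errors();

$tokens = collectText($dom, ['h1', 'h3', 'li']);
print_r($tokens);
```

Note that the results are grouped by tag name (all h1, then all h3, then all li) rather than returned in document order, and that returning the array avoids the global.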

Thanks gr8gonzo for your reply. I'm away right now, but please look at my last comment: even using DOM in the way I have shown gives the same result. I agree with you: it's probably the whole ul element. How can I exclude it?
Anyway, I call that function this way:
getTextBetweenTags($content, 'li'); 

Please show us a "real world" test case so we can see what the entire document looks like.  There may be easier ways to do this, and the most accurate test data set will show the best results.

With pleasure, but I can't do it just now. I'll do it later, within two or three hours. I'll post here the full file and the full script I'm using to process it.

OK, great - Just need the input and the expected output.  No need to post the script code.

OK, and in any case the script is already all here :-)
Here's the input:
<section class="content page">
	<div id="page-title"><h1>Alberi luminosi</h1></div>
	<div class="container page-container">
		<div class=" row">
			<div class="col-12">
				<ul>
					<li class="parent">
						<a class="fancybox" href="<?php echo img_url( 'arboles/01.jpg' ) ?>">
							<img class="imgFLthumb" src="<?php echo img_url( 'arboles/01.jpg' ) ?>" />
						</a>
						<h4 lang="it">Albero luminoso</h4>
						<ul> <li lang="it">Altezza mt 2,10</li>
							<li lang="it">Nr 6 rami + tronco</li>
							<li lang="it">Alimentazione 230 volts /24 volts, con trasformatore 700 Led, 50 Watt</li>
							<li lang="it">Colori: arancio, giallo, rosso, blu, verde</li></ul>
					</li>
					<li class="parent">
						<a class="fancybox" href="<?php echo img_url( 'arboles/02.jpg' ) ?>">
							<img class="imgFLthumb" src="<?php echo img_url( 'arboles/02.jpg' ) ?>" />
						</a>
						<h4 lang="it">Albero luminoso LED mod. "MELO"</h4>
						<ul> <li lang="it">Altezza mt 3,00</li>
							<li lang="it">Alimentazione 230 volts/ 24 volts, con trasformatore</li>
							<li lang="it">Disponibile con Foglie e Mele o Foglie e Fiori</li></ul>
					</li>
					<li class="parent">
						<a class="fancybox" href="<?php echo img_url( 'arboles/03.jpg' ) ?>">
							<img class="imgFLthumb" src="<?php echo img_url( 'arboles/03.jpg' ) ?>" />
						</a>
						<h4 lang="it">Albero luminoso LED</h4>
						<ul> <li lang="it">Altezza mt 5,00</li>
							<li lang="it">Alimentazione 230 volts/ 24 volts, con trasformatore</li>
							<li lang="it">Multicolor con 5200 Led, controllo gioco luci tramite telecomando</li>
							<li lang="it">IN OFFERTA SPECIALE FINO AD ESAURIMENTO SCORTE</li></ul>
					</li>
					<li class="parent">
						<a class="fancybox" href="<?php echo img_url( 'arboles/04.jpg' ) ?>">
							<img class="imgFLthumb" src="<?php echo img_url( 'arboles/04.jpg' ) ?>" />
						</a>
						<h4 lang="it">Albero luminoso LED mod. FICUS</h4>
						<ul> <li lang="it">Altezza mt 3,00</li>
							<li lang="it">Alimentazione 230 volts /24 volts, con trasformatore, Consumo 100 Watt</li>
							<li lang="it">Controllo movimento luci con telecomando</li>
							<li lang="it">Colori: Rosso, Bianco e Celeste</li> </ul>
					</li>
					<li class="parent">
						<a class="fancybox" href="<?php echo img_url( 'arboles/05.jpg' ) ?>">
							<img class="imgFLthumb" src="<?php echo img_url( 'arboles/05.jpg' ) ?>" />
						</a>
						<h4 lang="it">Albero luminoso LED mod "S"</h4>
						<ul> <li lang="it">Altezza mt.2</li>
							<li lang="it">Alimentazione 230 volts/24 volts, con trasformatore</li>
							<li lang="it">Consumo 80 Watt</li>
							<li lang="it">Tronco Nero Foglie verdi e Fiori Azzurri,rossi o viola</li>
							<li lang="it">Tronco Bianco Foglie verdi con fiori bianchi</li>
							<li lang="it">Tronco bianco foglie bianche con fiori bianchi</li></ul>
					</li>
					<li class="parent">
						<a class="fancybox" href="<?php echo img_url( 'arboles/06.jpg' ) ?>">
							<img class="imgFLthumb" src="<?php echo img_url( 'arboles/06.jpg' ) ?>" />
						</a>
						<h4 lang="it">Albero luminoso LED Mod.DUBAI</h4>
						<ul> <li lang="it">Altezza mt 1,30</li>
							<li lang="it">Alimentazione 230 volts /24 volts, con trasformatore, Cons. 50W</li>
							<li lang="it">Con foglie verdi e fiori rossi, celesti o viola</li></ul>
					</li>
					<li class="parent">
						<a class="fancybox" href="<?php echo img_url( 'arboles/07.jpg' ) ?>">
							<img class="imgFLthumb" src="<?php echo img_url( 'arboles/07.jpg' ) ?>" />
						</a>
						<h4 lang="it">Albero LED Tronco "L" Mod. CBL01/CBL02</h4>
						<ul> <li lang="it">Altezza mt 1,50/1,80</li>
							<li lang="it">Alimentazione 230 volts /24 volts, con trasformatore, Cons. 80W</li>
							<li lang="it">Con foglie verdi e fiori: Rossi, Blu o Viola</li></ul>
					</li>
					<li class="parent">
						<a class="fancybox" href="<?php echo img_url( 'arboles/08.jpg' ) ?>">
							<img class="imgFLthumb" src="<?php echo img_url( 'arboles/08.jpg' ) ?>" />
						</a>
						<h4 lang="it">Albero luminoso LED mod. CBL01</h4>
						<ul> <li lang="it">Altezza Totale Mt. 2,50</li>
							<li lang="it">Bellissimo con Nr. 1950 Led tra foglie e fiori</li>
							<li lang="it">Alimentazione 230 volts/ 24 volts, con trasformatore</li>
							<li lang="it">Disponibile con Foglie col. Verde e Fiori: Rossi, Viola, Blu</li></ul>
					</li>
				</ul>
			</div>
		</div>
	</div>
</section>

About the output, I just need the plain text inside the HTML tags: the best would be to get all tags which have some text within them and put that text in an array. Then I could process the array and get an output like this:

"sometext":
"",
"someothertext":
"",

and so on.
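
One way to go from the array to that kind of output is to let `json_encode` do the quoting. This is only a sketch under an assumption: that the translation file can be an ordinary JSON object whose keys are the source strings and whose values are empty placeholders (the exact format jquery.lang.js expects is not shown in this thread). `tokensToJson()` is a hypothetical helper name.

```php
<?php
// Sketch: turn a list of extracted strings into a JSON object of
// "source text" => "" pairs. tokensToJson() is a hypothetical helper.

function tokensToJson(array $tokens): string
{
    $map = [];
    foreach ($tokens as $token) {
        // Normalise whitespace so indentation inside the markup does not end up in the key
        $key = trim(preg_replace('/\s+/', ' ', $token));
        if ($key !== '') {
            $map[$key] = '';   // empty translation, to be filled in by hand later
        }
    }
    // json_encode adds the braces and commas and escapes embedded double quotes
    return json_encode($map, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE);
}

$tokens = ['sometext', ' someothertext ', 'Albero luminoso LED mod. "MELO"'];
$json = tokensToJson($tokens);
echo $json, PHP_EOL;
```

Building the string by hand leaves a trailing comma and no surrounding braces, and a heading such as `mod. "MELO"` would break the quoting; encoding an array sidesteps both. Repeated strings collapse into a single key, which is probably what a translation table wants anyway.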

I started working on this today to make a tedious part of my job easier and quicker and... well, you know how it has gone :-)

ASKER CERTIFIED SOLUTION
Ray Paseur

This solution is only available to members of Experts Exchange.

Hi Ray.
I tried your code and it worked fine, except for a strange error I can't fix: please test it on this file:
<section class="content page">
	<div id="page-title"><h1 lang="it">Cinema 10D</h1></div>
	<div class="container page-container">
		<div class=" row">
			<div class="col-md-6 col-sm-6 col-xs-12">
				<h4 lang="it">CHI NON HA VISTO UN FILM IN 10D? </h4>
				<p lang="it">Punti di forza del cinema 10D: bassi costi di gestione, incassi immediati, con la possibilita' di riscatto dell'investimento in pochi mesi con un target di clienti che va da 3 anni fino oltre 70 anni.</p>
			</div>
			<div class="col-md-6 col-sm-6 col-xs-12">
				<h4 lang="it">CINEMA 8D/10D SPETTACOLO VIAGGIANTE!</h4>
				<p lang="it">Personalizziamo camion e rimorchi per spettacoli viaggianti - ideale per gli operatori del carnevale. Costruito con i più alti standard di sicurezza, tutto compatibile con certificati di legge CEE. Contattaci per maggior info.
					Oggi anche in noleggio (8 o 12 posti).</p>
			</div>
			<div class="col-md-12" style="margin: 30px auto;">
				<h5 lang="it">EFFECI MOVIE dispone dei migliori film presenti sul mercato, direttamente dalla sede U.S.A., con prezzi decisamente imbattibili. Offriamo una ricca gamma di filmati 3D a vostra scelta con temi sempre diversi per intrattenere piccoli e grandi. Dai un'cocchiata al <a href="<?php echo base_url( 'peliculones' ) ?>">catalogo dei film</a></h5> 
			</div>
		</div>
		<div class="row oferta">
			<div class="col-md-6 col-sm-6 col-xs-12">
				<img src="<?php echo pics_url( '00_cine02.gif' ) ?>" alt="cine" />
				<p lang="it">I nostri ingegneri troveranno le migliori soluzioni in base alle vostre esigenze, progettando il vostro cinema tridimensionale.</p>
			</div>
			<div class="col-md-6 col-sm-6 col-xs-12">
				<h3 lang="it">OFFERTA SPECIALE! (low cost)</h3>
				<ul>
					<li lang="it">CINEMA 6D con pistoni pneumatici</li>
					<li lang="it">8 posti- Poltrone ecologiche in pelle</li>
					<li lang="it">Sound Gold system- 4 speakers</li>
					<li lang="it">DLP proiettori full HD</li>
					<li lang="it">3D schermo polarizzato su misura</li> 
					<li lang="it">100 occhiali 3D </li>
					<li lang="it">10 film 3D</li>
					<li lang="it">3D software alta definizione</li>
					<li lang="it">Compressore silenziato</li>
					<li lang="it">Effetti Speciali:	solletico alle gambe, getto d'aria, bolle di sapone, strobo e movimento delle sedie. </li>
					<li lang="it">Possibilità di aumentare le poltrone fino12/16 posti.</li>
					<li lang="it"><h5 lang="it">&euro; 17.900</h5></li>
				</ul>
			</div>
		</div>
		<div class="row">
			<div class="col-md-12 col-sm-12 col-xs-12" style="margin: 40px auto;">
				<h3 lang="it">TUTTI I NOSTRI CINEMA SONO COSTRUITI CON I PIU' ALTI STANDARD DI SICUREZZA
					INTERAMENTE CONFORME ALLE NORMATIVE CEE 20155</h3>
			</div>
		</div>
		<div class="row">
			<div class="row-same-height row-full-height">
				<div class="ol-xs-4 col-xs-height orange col-full-height">
					<ul>
						<h4 lang="it">CARATTERISTICHE</h4>
						<li lang="it">Certificazioni CE per ogni singolo pezzo.</li>
						<li lang="it">Ogni poltrona è di materiale ecologico garantito 5 anni</li>
						<li lang="it">Sound Gold system (4speakers)</li>
						<li lang="it">DLP proiettori full HD</li>
						<li lang="it">3D schermo Polarizzato</li>
						<li lang="it">occhiali 3D</li>
						<li lang="it">movie 3D in dotazione</li>
						<li lang="it">3D software alta definizione</li>
						<li lang="it">compressore silenziato HP</li>
					</ul>
				</div>
				<div class="col-xs-4 col-xs-height blue col-full-height">
					<h4 lang="it">EFFETTI STANDARD</h4>
					<ul>
						<li lang="it">Video 3D full HD</li>
						<li lang="it">Solletico alle gambe</li>
						<li lang="it">Soffio al collo</li>
						<li lang="it">Strobo</li>
						<li lang="it">Carosello di luci</li>
						<li lang="it">Getto d'acqua</li>
						<li lang="it">Getto d'aria </li>
						<li lang="it">Bolle di sapone</li>
						<li lang="it"><li lang="it">Laser</li>
						<li lang="it">Fumo</li> 
						<li lang="it">Effetto tornado</li>
					</ul>
				</div>
				<div class="col-xs-4 col-xs-height magenta col-full-height">
					<h4 lang="it">EFFETTI EXTRA</h4>
					<ul>
						<li lang="it">Profumo</li>
						<li lang="it">Neve</li> 
						<li lang="it">solletico alle mani</li>
						<li lang="it">tremolio</li> 
						<li lang="it">Effetto fuoco</li>
						<li lang="it">Dolby Sorround Sound 5.1.</li>
						<li lang="it">Contapersone</li>
						<li lang="it">Telecamera+TV </li>
					</ul>
				</div>
			</div>
			<div class="row">
				<div class="col-md-12 col-sm-12 col-xs-12" style="margin: 40px auto;">
					<h3 lang="it">Non hai un locale per installare il tuo cinema multidimensionale?</h3>
					<h3 lang="it">Noleggia il nostro box personalizzabile!</h3>
				</div>
				<div class="col-md-6 col-sm-6 col-xs-12">
					<img src="<?php echo pics_url( '00_cine_03.jpg' ) ?>" alt="cine" />
				</div>
				<div class="col-md-6 col-sm-6 col-xs-12">
					<p lang="it">Un opportunita vantaggiosa per guadagnare affittando e con possibilita di acquistare il cinema a rate.
						Minimo 3 mesi ad un massimo di 12, pagando mensilmente con rata anticipata ed un piccolo deposito che andra a scalare su prezzo di riscatto.</p>
					<p lang="it">Il Cinema tridimensionale e' compresivo di: Nr. 15 Film</p>
					<p lang="it">Effetti: aria, soffio sul collo, solletico alle gambe, bolle sapone, effetto tornado, strobo, laser, fumo, carosello luci, subwoofer + casse audio Special Gold</p>
				</div>
			</div>
		</div>
	</div>
</section>

I get opening-closing tag mismatch errors I can't understand.
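
A possible way to track these down, sketched under the assumption that the markup is being fed to `simplexml_load_string` as in the earlier example: ask libxml to collect its errors so each one can be traced to a line of the input. In the file above, one `<li lang="it">` is opened twice in front of "Laser" and closed only once, which is enough to make an XML parser complain; `&euro;` is an HTML entity that plain XML does not define, so it is another likely candidate. The fragment below reproduces only the first of these.

```php
<?php
// Sketch: collect libxml's parse errors instead of letting them surface as
// PHP warnings, so that each one can be matched to a line of the input.

$fragment = <<<EOD
<ul>
<li lang="it">Bolle di sapone</li>
<li lang="it"><li lang="it">Laser</li>
<li lang="it">Fumo</li>
</ul>
EOD;

libxml_use_internal_errors(true);
$obj = simplexml_load_string('<wrap>' . $fragment . '</wrap>');

// Grab the collected errors, then empty libxml's buffer
$errors = libxml_get_errors();
libxml_clear_errors();

if ($obj === false) {
    foreach ($errors as $error) {
        // Each entry is a LibXMLError object with line, column and message properties
        printf("line %d, column %d: %s\n", $error->line, $error->column, trim($error->message));
    }
}
```

The reported line numbers refer to the string handed to the parser, so wrapping or whitespace collapsing done beforehand will shift them relative to the original file.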
Another problem is that if I try to extend your code to process h1 and h3 elements too I get empty files:

		$htm = str_replace('<?', '&lt;?', $htm);
		$htm = str_replace('?>', '?&gt;', $htm);

		// SOME SIGNAL STRINGS
		$h1     = '<h1 lang="it">';
		$end_h1 = '</h1>';
		$h3     = '<h3 lang="it">';
		$end_h3 = '</h3>';
		$h4     = '<h4 lang="it">';
		$end_h4 = '</h4>';
		$ul     = '<ul>';
		$end_ul = '</ul>';
		$bl     = '<data>';
		$end_bl = '</data>';

		// BREAK THE HTML STRING INTO DATA UNITS ON THE H1 TAGS
		$arr = explode($h1, $htm);
		unset($arr[0]);
		foreach ($arr as $key => $sub)
		{
				$sub = $h1 . $sub;
				$poz = strpos($sub, $end_ul);
				$sub = substr($sub,0,$poz);
				$sub .= $end_ul;
				$arr[$key] = $bl . $sub . $end_bl;
		}

		// ACTIVATE THIS TO SEE THE ARRAY (USE "VIEW SOURCE")
		// var_dump($arr);

		// TIDY UP THE HTML STRING
		$htm = implode(NULL, $arr);
		// BREAK THE HTML STRING INTO DATA UNITS ON THE H3 TAGS
		$arr = explode($h3, $htm);
		unset($arr[0]);
		foreach ($arr as $key => $sub)
		{
				$sub = $h3 . $sub;
				$poz = strpos($sub, $end_ul);
				$sub = substr($sub,0,$poz);
				$sub .= $end_ul;
				$arr[$key] = $bl . $sub . $end_bl;
		}

		// ACTIVATE THIS TO SEE THE ARRAY (USE "VIEW SOURCE")
		// var_dump($arr);

		// TIDY UP THE HTML STRING
		$htm = implode(NULL, $arr);
		
		
		// BREAK THE HTML STRING INTO DATA UNITS ON THE H4 TAGS
		$arr = explode($h4, $htm);
		unset($arr[0]);
		foreach ($arr as $key => $sub)
		{
				$sub = $h4 . $sub;
				$poz = strpos($sub, $end_ul);
				$sub = substr($sub,0,$poz);
				$sub .= $end_ul;
				$arr[$key] = $bl . $sub . $end_bl;
		}

		// ACTIVATE THIS TO SEE THE ARRAY (USE "VIEW SOURCE")
		// var_dump($arr);

		// TIDY UP THE HTML STRING
		$htm = implode(NULL, $arr);
		$htm = preg_replace('/\s\s+/', ' ', $htm);

		// WRAP THE HTML STRING INTO AN XML DOCUMENT
		$doc = '<wrap>' . $htm . '</wrap>';

		// ACTIVATE THIS TO SHOW THE XML DOCUMENT
		// echo htmlentities($doc);

		// TRY TO MAKE AN OBJECT
		$obj = SimpleXML_Load_String($doc);

		// ACTIVATE THIS TO SEE THE OBJECT
		// var_dump($obj);

		// PROCESS THE OBJECT TO DISPLAY THE PARTS
		foreach ($obj->data as $element)
		{
//				echo PHP_EOL . $element->h4;
				$tokens[] = PHP_EOL . '"' . $element->h1 . '":';
				$tokens[] = PHP_EOL . '"",';
//				echo PHP_EOL . $element->h4;
				foreach($element->ul->li as $item)
				{
//						echo PHP_EOL . '   ' . $item;
					$tokens[] = PHP_EOL . '"' . $item . '":';
					$tokens[] = PHP_EOL . '"",';
				}
//				echo PHP_EOL;
		}
		file_put_contents( $newname, $tokens );

It's evident I don't understand the logic of your code: can you explain, please?

Finally, I just don't understand what's wrong in this code:

	for ( $i = 0; $i < count( $files ); $i++ )
	{
		$fn = $files[ $i ];
		$fn = str_replace( '\\', '/', $fn );
		$parts = pathinfo( $fn );
		$fname = $parts[ 'basename' ];
		$dirname = $parts[ 'dirname' ];
		$el = explode( '.', $fname );
		$json_name = $el[ 0 ] . '.json';
		echo "<br>Processing file $fn<br>";
		$content = file_get_contents( $fn );
		$matches = array();
		$tokens = array();
		$dom = new DOMDocument;
		libxml_use_internal_errors(true);
		$dom->loadHTML( $content );
		libxml_clear_errors();
		$li = $dom->getElementsByTagName('li');
		foreach ( $li as $l )
		{
			$tokens[] = $l->nodeValue;
		}

		$json = array();
		foreach ( $tokens as $t )
		{
			$t = trim( $t );
			$json[] = '"' . $t . '"' . ":" . PHP_EOL;
			$json[] = '"",' . PHP_EOL;
		}
		$newname = $dirname . '/' . $json_name;
		// Write the formatted lines ($json), not the raw text nodes ($tokens)
		file_put_contents( $newname, $json );
	}
	echo "Done!";

I get an extra element which contains the whole list, so I get:
"item1  item2  item3":
"",
"item1":
"",
"item2":
"",
"item3":
"",

Just curious... Where does this HTML come from?  I see PHP scripts embedded in the HTML document, so I'm wondering if the information you want to capture is available in another form (perhaps a database or text / template file) without the markup.

The HTML was written by me: it is a view in a CodeIgniter site with no database, even if there should be one - and yes, I know Laravel is great and I'm learning it, but it requires more time than I have just now :-)
I appreciate your problem-solving oriented approach, but I would really like to understand the unexpected result of getElementsByTagName...

SOLUTION

This solution is only available to members of Experts Exchange.

Unfortunately no, Ray.
The output I need, that is, the final JSON file, is something like this:

"Cinema 10D":
"",
"OFFERTA SPECIALE! (low cost)":
"",
"TUTTI I NOSTRI CINEMA SONO COSTRUITI CON I PIU' ALTI STANDARD DI SICUREZZA 	INTERAMENTE CONFORME ALLE NORMATIVE CEE 20155":
"",
"Non hai un locale per installare il tuo cinema multidimensionale?":
"",
"Noleggia il nostro box personalizzabile!":
"",

Anyway, it seems that at least part of the issue originates from the fact that I have nested lists, and within these lists I have h4 and h3 tags: if I use my original code, which uses the simple_html_dom.php script, without processing h3 and h4 specifically, I get a better result. But I still get this behavior:

If my list is like the following one:
					<li class="parent">
						<a class="fancybox" href="<?php echo img_url( 'efectosespeciales/01.jpg' ) ?>">
							<img class="imgFLthumb" src="<?php echo img_url( 'efectosespeciales/01.jpg' ) ?>" />
						</a>
						<h4 lang="it">Macchina Spara Coriandoli funzionante con bombole c02</h4>
						<ul><li lang="it">Cod PTC01</li>
							<li lang="it">Incredibili! per eventi in stadio e palazzetti</li>
							<li lang="it">Due modelli medio e Grande, completi di cassa in alluminio</li>
							<li lang="it">richiudibile con maniglie</li></ul>
					</li>

I get this:
"Macchina Spara Coriandoli funzionante con bombole c02  						Cod PTC01  							Incredibili! per eventi in stadio e palazzetti  							Due modelli medio e Grande, completi di cassa in alluminio  							richiudibile con maniglie":
"",
"Cod PTC01":
"",
"Incredibili! per eventi in stadio e palazzetti":
"",
"Due modelli medio e Grande, completi di cassa in alluminio":
"",
"richiudibile con maniglie":
"",

That is, the first item is the whole content of the <li class='parent'>, nested list included, and then the parser goes on to parse the nested list and returns its items one by one.
So the parser works as expected, and it is my markup, even though it validates, that trips it up.
Thank you for your help.
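
For nested lists like this, one option is to ask only for the list items that do not themselves contain another `li`. The following is a minimal sketch using PHP's built-in DOMDocument and DOMXPath rather than simple_html_dom; the sample markup is a shortened copy of the list above, and the XPath expression is just one way to express "leaf" items.

```php
<?php
// Sketch: select only the <li> elements that do not contain another <li>,
// so a parent item is not returned with the text of its whole sub-list.

$content = <<<EOD
<ul>
  <li class="parent">
    <h4 lang="it">Macchina Spara Coriandoli</h4>
    <ul><li lang="it">Cod PTC01</li>
      <li lang="it">richiudibile con maniglie</li></ul>
  </li>
</ul>
EOD;

libxml_use_internal_errors(true);   // keep parser warnings out of the page output
$dom = new DOMDocument();
$dom->loadHTML($content);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

$tokens = [];
// h4 headings plus "leaf" list items (li elements with no li descendant)
foreach ($xpath->query('//h4 | //li[not(.//li)]') as $node) {
    $tokens[] = trim(preg_replace('/\s+/', ' ', $node->textContent));
}
print_r($tokens);
```

The `li class="parent"` element is skipped because it has `li` descendants, while its `h4` is still picked up by the first half of the expression. The sample is ASCII only; pages with accented characters may also need an encoding hint when loading.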

Thank you Ray: I always learn something from you.

Thanks, Marco.  Sorry I couldn't get an exact solution for you.  All the best, ~Ray