Solved

Detecting lines with a "wrong" line break

Posted on 2008-10-15
Last Modified: 2010-03-05
I have a very long XML file and have spotted some lines that are not well-formed. There seem to be line breaks inside some of the tag content, so I have something like:

<Value> bla bla </Value>
<Value> bad
line </Value
<Value> bla bla <Value>

I'm looking for a regexp to detect the bad line and chomp it so that I will have:

<Value> bla bla </Value>
<Value> bad line </Value
<Value> bla bla <Value>

I tried the code below, but it didn't work: it chomped all lines.
Beware that some of the good lines might also have a whitespace character after the last ">".

foreach $line (@lines) {
   print $line;
   if (!($line =~ m/>$/)) {
      print LOG "HIT LINE $line";
      chomp($line);
      print OUT $line;
   } else {
      print OUT $line;
   }  
}


Question by:ventumsolve
3 Comments
 
LVL 17

Accepted Solution

by: mjcoyne (earned 500 total points)
ID: 22719221
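#!/usr/bin/perl
use strict;
use warnings;

my @lines = <DATA>;

# A line that does not start with "<" continues the previous line, so
# replace the previous line's trailing whitespace and newline with one space.
# The loop starts at 1 (line 0 has no predecessor) and includes the last line.
for my $i (1 .. $#lines) {
    if ($lines[$i] !~ /^</) {
        $lines[$i-1] =~ s/\s*\z/ /;
    }
}

print @lines;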

__DATA__
<Value> bla bla </Value>
<Value> bad
line </Value
<Value> bla bla <Value>
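
The same idea can be sketched as a reusable function. This is only an illustration, not part of the accepted answer: the name join_broken_lines is hypothetical, and it assumes that every good line starts with "<" and that continuation lines never do. It strips trailing whitespace (including a carriage return from CRLF files) before joining, and it copes with a value that is split over more than two lines.

```perl
use strict;
use warnings;

# Hypothetical helper: joins continuation lines (lines that do not start
# with "<") onto the previous line. Takes raw lines, returns repaired
# lines, each ending in "\n".
sub join_broken_lines {
    my @out;
    for my $line (@_) {
        (my $text = $line) =~ s/\s+\z//;   # drop newline, CR, trailing blanks
        next unless length $text;          # skip blank lines
        if (@out && $text !~ /^\s*</) {
            $out[-1] .= " $text";          # continuation: append to previous line
        } else {
            push @out, $text;              # normal line: start a new entry
        }
    }
    return map { "$_\n" } @out;
}

print join_broken_lines(
    "<Value> bla bla </Value>\n",
    "<Value> bad\n",
    "line </Value>\n",
);
```

For a real file, the lines would come from a filehandle instead of the literal list used here.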
 
LVL 6

Expert Comment

by:RSLE
ID: 22721409

my $data = join("", @lines);
$data =~ s/[\r\n]//g;             ## remove all line breaks (LF and CR)
$data =~ s{(</[^>]+>)}{$1\n}g;    ## re-add one after each closing tag
print $data;

 

Author Comment

by:ventumsolve
ID: 22747994
Thanks.
I would still have liked to see what was wrong with my anchor m/>$/.
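
Nobody in the thread answered this, so the following is only a likely explanation. In Perl, $ matches at the very end of the string or just before a final "\n". If the file has Windows (CRLF) line endings, which is an assumption here, every line ends in ">\r\n", so m/>$/ never matches and every line is treated as bad and chomped. The trailing whitespace mentioned in the question has the same effect on the lines that have it. A minimal sketch with hypothetical sample lines, comparing the original anchor with a more tolerant one (the helper name ends_with_tag is made up for the example):

```perl
use strict;
use warnings;

# Hypothetical helper: true when the line ends with ">", tolerating
# trailing whitespace and a carriage return before the newline.
sub ends_with_tag {
    my ($line) = @_;
    return $line =~ />\s*$/ ? 1 : 0;
}

# Hypothetical sample lines; the "\r\n" endings are an assumption about the file.
my @lines = (
    "<Value> bla bla </Value>\r\n",
    "<Value> bad\r\n",
    "line </Value> \r\n",
);

for my $line (@lines) {
    my $strict  = $line =~ />$/ ? 1 : 0;   # the anchor from the question
    my $lenient = ends_with_tag($line);
    (my $shown = $line) =~ s/\s+\z//;      # strip line ending for display only
    print "strict=$strict lenient=$lenient  $shown\n";
}
```

With these sample lines the strict test fails on all three, while the lenient test flags only the middle line as broken.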
