Solved

detecting lines with "wrong" linebreak

Posted on 2008-10-15
Last Modified: 2010-03-05
I have a very long XML file and have spotted some lines that are not well-formed. There seem to be line breaks inside some of the tag content, so I have something like:

<Value> bla bla </Value>
<Value> bad
line </Value
<Value> bla bla <Value>

I'm looking for a regexp to detect the bad line and chomp it, so that I end up with:

<Value> bla bla </Value>
<Value> bad line </Value
<Value> bla bla <Value>

I tried the code below, but it didn't work: it chomped all lines.
Beware that some of the good lines might also have a whitespace character after the last ">".

foreach $line (@lines) {
   print $line;
   if (!($line =~ m/>$/)) {
      print LOG "HIT LINE $line";
      chomp($line);
      print OUT $line;
   } else {
      print OUT $line;
   }  
}


Question by:ventumsolve
3 Comments
 
LVL 17

Accepted Solution

by: mjcoyne (earned 2000 total points)
ID: 22719221
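#!/usr/bin/perl
use strict;
use warnings;

my @lines = <DATA>;

# A line that does not start with "<" continues the previous line,
# so replace the previous line's trailing newline with a single space.
for my $i (1 .. $#lines) {
    if ($lines[$i] !~ /^</) {
        $lines[$i-1] =~ s/\s+\z/ /;
    }
}

print @lines;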

__DATA__
<Value> bla bla </Value>
<Value> bad
line </Value
<Value> bla bla <Value>

Expert Comment

by:RSLE
ID: 22721409

$data = join("", @lines);
$data =~ s/[\r\n]//g;           ## remove all line breaks (LF and CR)
$data =~ s/(<\/.+?>)/$1\n/g;    ## re-add one after each closing tag
print $data;


Author Comment

by:ventumsolve
ID: 22747994
Thanks,
I would have liked to see what was wrong with my anchor m/>$/.
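
One plausible explanation, offered as a sketch rather than a confirmed diagnosis: in Perl, $ matches at the end of the string or just before a final newline, so m/>$/ does succeed on a line that ends in ">\n". It fails as soon as anything else sits between the ">" and the newline, for example a trailing space or the "\r" of a CRLF line ending. If the file had CRLF endings (an assumption, the thread does not say), every line would fail the test and get chomped, which matches the symptom described. Allowing optional whitespace before the anchor, m/>\s*$/, covers both cases. The helper names and the sample data below are hypothetical, and the sample uses properly closed tags.

```perl
use strict;
use warnings;

# Returns 1 when the last non-whitespace character of the line is ">".
# \s* absorbs trailing spaces, tabs, and a "\r" left over from CRLF endings.
sub ends_with_tag {
    my ($line) = @_;
    return $line =~ />\s*$/ ? 1 : 0;
}

# Joins continuation lines: a line that does not end with a tag has its
# trailing whitespace (including the line break) replaced by one space.
sub join_broken_lines {
    my @lines = @_;
    my $out = '';
    for my $line (@lines) {
        if (ends_with_tag($line)) {
            $out .= $line;
        } else {
            (my $copy = $line) =~ s/\s+\z/ /;
            $out .= $copy;
        }
    }
    return $out;
}

# Hypothetical sample with CRLF line endings.
my @sample = (
    "<Value> bla bla </Value>\r\n",
    "<Value> bad\r\n",
    "line </Value>\r\n",
);
print join_broken_lines(@sample);
```

With this test, only the line "<Value> bad" is joined to the one after it; the other lines keep their line breaks.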
