Solved

Perl Mass File Edit

Posted on 2004-10-14
Last Modified: 2011-09-20
I want a program that adds a single line to the beginning of every .html file in a directory and, recursively, in every directory below it. I'm assuming the best way to do this is with Perl, but I'm not sure where to start. Can someone point me in the right direction?

Cheers,

Darrell
Question by:redneon
 
LVL 48

Expert Comment

by:Tintin
ID: 12312263
You can use Perl; however, if you are on a Unix system, it's easier to use a shell script.

Assuming the line to add is in a file called /tmp/newline, run:

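#!/bin/sh
# Note: the backtick loop splits on whitespace, so this assumes no spaces in file names.
for file in `find /dir/with/htmlpages -type f -name "*.html"`
do
   echo "Amending $file"
   # Build the new file in a temp file, then replace the original only on success.
   cat /tmp/newline "$file" >/tmp/$$ && mv /tmp/$$ "$file"
done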
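
If any of the file names contain spaces, a variant that lets find hand the paths straight to the shell may be safer. This is only a minimal sketch, assuming mktemp is available on your system; prepend_line is a made-up helper name, not part of the script above:

```shell
#!/bin/sh
# prepend_line LINE_FILE DIR   (hypothetical helper name)
# Prepends the contents of LINE_FILE to every regular *.html file under DIR.
prepend_line() {
    line_file=$1
    dir=$2
    # find passes the paths as separate arguments, so names with spaces survive intact.
    find "$dir" -type f -name '*.html' -exec sh -c '
        line_file=$1
        shift
        for f in "$@"; do
            tmp=$(mktemp) || exit 1
            # Build the new content first; overwrite the original only on success.
            cat "$line_file" "$f" >"$tmp" && cat "$tmp" >"$f"
            rm -f "$tmp"
        done
    ' sh "$line_file" {} +
}
```

Usage would look like: prepend_line /tmp/newline /dir/with/htmlpages
Writing back with cat rather than mv keeps the original file's permissions.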
 
LVL 18

Accepted Solution

by:kandura
kandura earned 100 total points
ID: 12317751
In case you're on Windows or another OS where Tintin's solution won't work:

There was an article on Perl.com yesterday about mass file edits: http://www.perl.com/pub/a/2004/10/14/file_editing.html
The section on File::Find is relevant to what you want.
 
LVL 13

Expert Comment

by:gripe
ID: 12321707
WARNING - Do not cut and paste this code; read the entire response before attempting it.

The following one-liner will insert 'INSERT THIS LINE' on the first line of every .html file under the current directory:

perl -MFile::Find -we'find sub { /\.html$/ && do { push @ARGV, $File::Find::name;}}, ".";  $^I=".bak"; while (<>) { print "INSERT THIS LINE\n" if $. == 1; print;} continue { close ARGV if eof; }'

Broken down:

-MFile::Find <-- use the 'File::Find' module
-we <-- Enable warnings and 'execute' the following perl block

# Code block:
find sub { /\.html$/ && do { push @ARGV, $File::Find::name; }}, ".";
# Use the 'find' subroutine from 'File::Find' and
# pass it a subroutine reference as the first argument; it
# will be executed for every file found. In this case, it pushes the path
# (relative to the starting directory) of every '.html' file into the @ARGV array.
# The regex is matched against $_, which 'find' sets to the name of the current
# file as it iterates over every file under the starting directory (explained below).
#
# The second argument to find (".") is the directory to start the 'find' routine in.
# In this case, the current directory. This will work recursively.


$^I=".bak";
# Turn on 'edit in place' and back up edited files with the '.bak' extension

while (<>) { print "INSERT THIS LINE\n" if $. == 1; print; } continue { close ARGV if eof; }
# Iterate over all of the files in @ARGV and edit them in place.
# If $. (the line number within the current file) is 1, print out "INSERT THIS LINE\n".
# Print the current line.
# With $^I set, 'print' writes to the file being edited, so whatever is
# printed becomes the new contents of that file.
# The continue block ensures that the current file is closed when finished editing,
# which resets the line numbering ($.) for each file.
#
# end code

This is both a very powerful and very dangerous routine. It WILL edit things in place, and you should not use it unless you understand what it's doing and how. You could very easily edit, overwrite or destroy all kinds of things you don't want to. It backs up each file it edits to a file with the same name plus the '.bak' extension, but if you happen to run the same (wrong) command twice, the second run overwrites those backups with the already-edited files and the originals are lost.

BE CAREFUL :)

Do not just cut and paste this and expect it to do what you want. Any questions or requests for explanation welcome.
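
Because each edited file gets a '.bak' copy next to it, a bad run can be undone as long as the command has not been run a second time. A rough sketch for a Unix shell, assuming the backups are named like page.html.bak; restore_backups is a made-up helper name:

```shell
#!/bin/sh
# restore_backups DIR   (hypothetical helper name)
# Moves every *.html.bak under DIR back over its edited *.html counterpart.
restore_backups() {
    find "$1" -type f -name '*.html.bak' -exec sh -c '
        for bak in "$@"; do
            # Strip the trailing .bak to recover the original file name.
            mv -- "$bak" "${bak%.bak}"
        done
    ' sh {} +
}
```

Usage would look like: restore_backups .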

Thanks
Matt
 
LVL 13

Expert Comment

by:gripe
ID: 12321742
The same one-liner as a standalone script 'edit.pl':

#!/usr/bin/perl

use warnings;
use strict;
use File::Find;

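find \&wanted, ".";
# With an empty @ARGV, <> would read from STDIN instead, so stop here.
die "No .html files found\n" unless @ARGV;
$^I=".bak";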

while (<>) {
        print "INSERT THIS LINE\n" if $. == 1;
        print;
} continue { close ARGV if eof; }

sub wanted {
        /\.html$/ && do {
                push @ARGV, $File::Find::name;
        }
}
 
LVL 84

Assisted Solution

by:ozo
ozo earned 100 total points
ID: 12670712
use warnings;
use strict;
use File::Find;
use Tie::File;
find sub{
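    # Only plain text files whose names end in .html
    /\.html$/ && -f && -T && do {
        # Skip this file if it cannot be tied
        tie my @array, 'Tie::File', $_ or do { warn "$_: $!"; return };
        unshift @array, "INSERT THIS LINE";
        untie @array;
    }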
},".";