Solved

Copy html file with reference

Posted on 2016-08-01
Last Modified: 2016-11-02
Hi All,
I have a requirement to copy an html file into a specific folder if it contains a pdf file name.
For example...
in "Temp" folder
Temp->Html folder contains two txt files and multiple html files.
1.txt
     t1.pdf
     t2.pdf
Check for t1.pdf in the html files and copy any html file that contains it into the "download" folder.
Check for t2.pdf in the html files and copy any html file that contains it into the "download" folder.
.....

Can you please provide any reference or sample perl script to achieve this?
Thanks,
Shail
Question by: Shailesh Shinde
24 Comments
 
LVL 25

Expert Comment

by:SStory
ID: 41750931
You can check whether the pdf names appear in the html files by doing:
     #!/bin/bash
     files = `grep -l t[1-2].pdf /Temp/Html/*.html`



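This gives you a bash variable holding the names of all matching files. This isn't perl, but it can easily be done with bash. Just take those files and copy them to the desired directory... Watch out for bash substitution: an unquoted pattern such as t[1-2].pdf can be expanded by the shell before grep sees it.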
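The "take those files and copy them" step could look roughly like the sketch below. It is only an illustration: the scratch directory, the sample html files and the "download" location are made up so the snippet can run on its own, and the pdf names are still hard-coded as in the command above.

```shell
#!/bin/bash
# Hypothetical scratch layout so the sketch is self-contained.
work=$(mktemp -d)
mkdir -p "$work/Temp/Html" "$work/download"
printf '<a href="t1.pdf">one</a>\n'  > "$work/Temp/Html/a.html"
printf '<a href="other.pdf">x</a>\n' > "$work/Temp/Html/b.html"
printf '<a href="t2.pdf">two</a>\n'  > "$work/Temp/Html/c.html"

# Quote the pattern so the shell hands it to grep unchanged,
# and escape the dot so it matches a literal "." only.
grep -l 't[1-2]\.pdf' "$work"/Temp/Html/*.html |
while IFS= read -r file; do
    # Copy (not move) each html file that mentions one of the pdf names.
    cp -- "$file" "$work/download/"
done

ls "$work/download"
```

With these sample files, only a.html and c.html end up in the download folder. Reading the file names line by line avoids storing them in one variable, which would break on names containing spaces.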
0
 
LVL 11

Expert Comment

by:tel2
ID: 41751241
Hi SStory,
I don't think it will work with spaces around the '='.
Instead of this:
    files = `grep -l t[1-2].pdf /Temp/Html/*.html`
I think it should be this:
    files=`grep -l t[1-2].pdf /Temp/Html/*.html`

Hi Shailesh,
What do you mean by "Temp->Html"?
Do you mean "Temp/Html" or what?
0
 
LVL 26

Accepted Solution

by:
wilcoxon earned 500 total points
ID: 41751469
This should do what you want...
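use strict;
use warnings;
use File::Copy qw(move);    # provides move(); perl has no built-in mv

my $dir  = shift or die "Usage:  $0 src_dir dest_dir\n";
my $dest = shift or die "Usage:  $0 src_dir dest_dir\n";
opendir my $dh, $dir or die "could not open dir $dir: $!";
my @files = grep { -f "$dir/$_" } readdir $dh;
closedir $dh;

# collect the pdf names listed in every .txt file
my @pdfs;
foreach my $txt (grep /\.txt$/, @files) {
    open my $in, '<', "$dir/$txt" or die "could not open $dir/$txt: $!";
    while (<$in>) {
        s/^\s+|\s+$//g;              # trim indentation and line endings
        push @pdfs, $_ if length;    # skip blank lines
    }
    close $in;
}
die "no pdf names found in the .txt files in $dir\n" unless @pdfs;

# build one alternation; quotemeta makes the "." in each name literal
my $pdfrx = join '|', map { quotemeta($_) } @pdfs;
foreach my $html (grep /\.html$/, @files) {
    open my $in, '<', "$dir/$html" or die "could not open $dir/$html: $!";
    my $found = grep /\b(?:$pdfrx)\b/, <$in>;
    close $in;
    next unless $found;
    # move() relocates the file; import and call File::Copy's copy() instead to keep the original
    move("$dir/$html", "$dest/$html") or die "could not move $dir/$html: $!";
}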


0
 
LVL 25

Expert Comment

by:SStory
ID: 41756417
tel2,

Look as I might, I don't see any quotes around my =. I do have backticks around the command to be executed. I also do see any difference in what you showed me from mine and then yours:
    files = `grep -l t[1-2].pdf /Temp/Html/*.html`


"I think it should be this:"
    files=`grep -l t[1-2].pdf /Temp/Html/*.html`



the only difference I notice is that you took out a space before the =

I actually tested my solution and based upon what the OP seems to have said, it seemed to work fine.
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 41756481
SStory, there are two problems with your provided solution for what the original poster asked:
  • The solution is not in perl.
  • You do not read the names of the pdf files from the text file (you hard-code t1 and t2).
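For comparison, the second point can also be handled in bash. The sketch below assumes the layout described in the question (pdf names listed one per line, possibly indented, in .txt files next to the html files); the scratch directory and sample files are invented so that it runs on its own.

```shell
#!/bin/bash
# Hypothetical scratch layout mirroring Temp/Html and a download folder.
work=$(mktemp -d)
src="$work/Temp/Html"
dest="$work/download"
mkdir -p "$src" "$dest"
printf '     t1.pdf\n     t2.pdf\n'   > "$src/1.txt"
printf '<a href="t1.pdf">one</a>\n'   > "$src/a.html"
printf '<a href="t9.pdf">nine</a>\n'  > "$src/b.html"

# Gather the names from every .txt file: strip leading/trailing
# whitespace and drop empty lines.
names="$work/names.list"
sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$src"/*.txt > "$names"

# -F: treat each name as a fixed string, -f: read the names from a file,
# -l: print only the names of the html files that match.
grep -l -F -f "$names" "$src"/*.html |
while IFS= read -r file; do
    cp -- "$file" "$dest/"
done

ls "$dest"
```

Note that a fixed-string match is a substring match, so a listed name of t1.pdf would also match an html file that only mentions cat1.pdf; the perl script's word-boundary check is stricter in that respect.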
0
 
LVL 25

Expert Comment

by:SStory
ID: 41756669
wilcoxon, OK.  I do agree it is not in perl.  I don't see where the OP mentions the file name being in a text file...but maybe you are a better interpreter than I.  If you are correct then it would need a slight modification.  I did want the OP to realize that it could easily be done without Perl if my interpretation was correct.
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 41756736
I agree the OP's question could be clearer about where the pdf names come from and what the txt files are for.
  • He mentions 2 txt files but then only lists 1.
  • Are the t1.pdf and t2.pdf listed in the txt file or are they actual files also found in the directory?
    • If listed in the txt file, should all pdfs listed in any txt file be used to check the html files?
    • If in the directory, should all pdfs found in the dir be checked in the html files?  If so, what are the txt files for?

Shailesh Shinde, can you answer the above questions so we make sure we are providing you with the solution you want?
0
 
LVL 11

Expert Comment

by:tel2
ID: 41757442
Hi SStory,

> "Look as I might, I don't see any quotes around my =."
I didn't say there were quotes around your =.  I put my own quotes around it because I was quoting a portion of your code.  When I said:
> "I don't think it will work with spaces around the '='.
I was referring to the 'spaces' around it.

> "I do have backticks around the command to be executed. I also do see any difference in what you showed me from mine and then yours:
    files = `grep -l t[1-2].pdf /Temp/Html/*.html`
I think it should be this:
    files=`grep -l t[1-2].pdf /Temp/Html/*.html`
the only difference I notice is that you took out a space before the ="

Yes, and I also removed a space after it.

If I run your command (with spaces) in bash, I get this error:
   files: command not found
but if I remove the spaces, I get no error.
It has always been my understanding & experience that shells like sh, ksh & bash don't allow spaces on either side of equals signs in assignments like this, and the above error message supports that theory.  So I don't know how it could be working for you.  Any ideas how it could be...anyone?
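A small sketch that shows the difference, assuming there is no command actually called "files" on the PATH (the value "a.html" is just a placeholder):

```shell
#!/bin/bash
# With spaces, bash treats "files" as a command name and "=" as its first argument.
status_with_spaces=0
files = "a.html" 2>/dev/null || status_with_spaces=$?

# Without spaces, the same line is a plain variable assignment.
files="a.html"
status_without_spaces=$?

echo "with spaces:    exit status $status_with_spaces"
echo "without spaces: exit status $status_without_spaces, files=$files"
```

The first attempt fails with exit status 127, which is what bash returns for "command not found"; the second succeeds and sets the variable.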
0
 
LVL 3

Author Comment

by:Shailesh Shinde
ID: 41757714
Hi,
Sorry for the delay in my comments.
Yes, t1.pdf and t2.pdf are listed in the .txt file, and they need to be checked against the html files in the Temp->Html folder. If t1.pdf is found in any of the html files, copy that html file into another folder.

Thanks,
Shail
0
 
LVL 3

Author Comment

by:Shailesh Shinde
ID: 41757718
Hi,
I will test wilcoxon's perl script for my requirement.

Thanks,
Shail
0

 
LVL 11

Expert Comment

by:tel2
ID: 41757723
Hi again Shailesh,

I know I've asked these questions before, but I haven't seen answers yet, so I'm going to ask them again:
    What do you mean by "Temp->Html"?
    Do you mean "Temp/Html" or what?
0
 
LVL 25

Expert Comment

by:SStory
ID: 41758188
tel2: Sorry, my bad. Somehow I saw "quotes"
You are right it didn't need spaces. even though in a bash script file.
0
 
LVL 11

Expert Comment

by:tel2
ID: 41758641
Thanks SStory.

No worries, all is forgiven.

There's still something I don't understand about this, though.  When you say:
> "You are right it didn't need spaces. even though in a bash script file."
I don't think it's a matter of 'need'.  To me that implies they are optional, but from my experience, it just won't work if you have any spaces around the '='.  Do you agree?

> "I actually tested my solution and based upon what the OP seems to have said, it seemed to work fine."
So are you saying you get no errors when you have spaces around the '='?
0
 
LVL 3

Author Comment

by:Shailesh Shinde
ID: 41769843
Hi,
sorry for delay.
Temp->Html is actually a folder path
There is temp folder and html folder is nested in this temp folder.

Thanks,
Shail
0
 
LVL 25

Expert Comment

by:SStory
ID: 41770178
tel2: no, it didn't work for me either.
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 41770346
Did you ever test my perl script?  Did it meet your needs?  If not, what is not working?
0
 
LVL 11

Expert Comment

by:tel2
ID: 41770990
Hi Shailesh,

> "Temp->Html is actually a folder path
> There is temp folder and html folder is nested in this temp folder."

If the Html folder is inside the Temp folder, then you represent it like this in UNIX/Linux:
    Temp/Html
That is how your commands (like "mkdir", "cd", etc) will refer to it, too.
Please don't represent that like Temp->Html in future, as that will confuse some people...like me.  Looks like you have some kind of pointer or link.
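As a quick illustration of how commands refer to the nested folder (run here in a throwaway directory, which is an assumption made only for this sketch):

```shell
#!/bin/bash
# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work" || exit 1

# Create Html inside Temp in one step, then change into it.
mkdir -p Temp/Html
cd Temp/Html || exit 1

# Show the last two components of the current path.
nested=$(basename "$PWD")
parent=$(basename "$(dirname "$PWD")")
echo "$parent/$nested"
```

This prints Temp/Html, the same slash-separated form you would pass to grep, cp or a perl script.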


Hi SStory,
> "tel2: no, it didn't work for me either."
OK, but if it didn't work for you, then what did you mean by this?:
> "I actually tested my solution and based upon what the OP seems to have said, it seemed to work fine."
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 41852280
There are certainly responses so this should not be deleted without assigning points.

The OP said he would test the script I provided but then never reported whether it worked or what issues, if any, came up.
1
 
LVL 26

Expert Comment

by:wilcoxon
ID: 41863987
2) Accept comments as solution

https:#41751469 appears to be the only comment with a solution
0
 
LVL 11

Expert Comment

by:tel2
ID: 41864622
I agree with wilcoxon's (totally unbiased) suggestion...     8)
...even if his link to his post doesn't work properly.  I think it should read: https:#a41751469
0
 
LVL 3

Author Comment

by:Shailesh Shinde
ID: 41870432
Hi All,
I have checked the Workspace for this question to accept multiple solutions. However, that option is not available for this question because it has already been accepted with the solution detailed below:

wilcoxon
Accepted Solution 2016-08-10 at 21:04:59, ID: 41751469

Thanks,
Shail
0
