Solved

Copy html file with reference

Posted on 2016-08-01
Last Modified: 2016-11-02
Hi All,
I have a requirement to copy an html file into a specific folder if it contains a pdf file name.
For example...
in "Temp" folder
Temp->Html folder contains two txt files and multiple html files.
1.txt
     t1.pdf
     t2.pdf
Check for t1.pdf in the html files and copy each html file that contains it into the "download" folder.
Check for t2.pdf in the html files and copy each html file that contains it into the "download" folder.
.....

Can you please provide any reference or sample perl script to achieve this?
Thanks,
Shail
Question by:Shailesh Shinde
24 Comments
 
LVL 25

Expert Comment

by:SStory
ID: 41750931
You can check whether the pdf names appear in the html files by doing:
     #!/bin/bash
     files = `grep -l t[1-2].pdf /Temp/Html/*.html`


This puts the names of all matching files into a bash variable. This isn't Perl, but it can easily be done with bash. Just take those files and copy them to the desired directory. Watch out for bash substitution.
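A minimal sketch of the copy step described above, not SStory's exact command. It assumes the two pdf names are hard-coded and builds a throwaway directory tree (the `work` variable and the sample file contents are made up for illustration). The pattern is quoted so that bash does not expand `[12]` as a glob before grep sees it.

```shell
#!/bin/bash
# Sketch only: build a throwaway copy of the layout described in the question.
work=$(mktemp -d)
mkdir -p "$work/Temp/Html" "$work/download"
echo '<a href="t1.pdf">one</a>'  > "$work/Temp/Html/a.html"
echo '<a href="other.pdf">x</a>' > "$work/Temp/Html/b.html"
echo '<a href="t2.pdf">two</a>'  > "$work/Temp/Html/c.html"

# grep -l prints only the names of the files that contain a match.
# The single quotes keep the shell away from [12]; \. matches a literal dot.
grep -l 't[12]\.pdf' "$work"/Temp/Html/*.html | while IFS= read -r file; do
    cp -- "$file" "$work/download/"
done

ls "$work/download"
```

Reading the names line by line, instead of word-splitting a variable, keeps file names with spaces intact.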
 
LVL 12

Expert Comment

by:tel2
ID: 41751241
Hi SStory,
I don't think it will work with spaces around the '='.
Instead of this:
    files = `grep -l t[1-2].pdf /Temp/Html/*.html`
I think it should be this:
    files=`grep -l t[1-2].pdf /Temp/Html/*.html`

Hi Shailesh,
What do you mean by "Temp->Html"?
Do you mean "Temp/Html" or what?
 
LVL 26

Accepted Solution

by:
wilcoxon earned 500 total points
ID: 41751469
This should do what you want...
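use strict;
use warnings;
use File::Copy qw(copy);

my $dir  = shift or die "Usage:  $0 src_dir dest_dir\n";
my $dest = shift or die "Usage:  $0 src_dir dest_dir\n";

# list the plain files in the source directory
opendir my $dh, $dir or die "could not open dir $dir: $!";
my @files = grep { -f "$dir/$_" } readdir $dh;
closedir $dh;

# collect the pdf names listed in every .txt file
my @pdfs;
foreach my $txt (grep /\.txt$/, @files) {
    open my $in, '<', "$dir/$txt" or die "could not open $dir/$txt: $!";
    while (<$in>) {
        s/^\s+|\s+$//g;                      # trim indentation and line endings
        push @pdfs, quotemeta $_ if length;  # escape regex metacharacters such as "."
    }
    close $in;
}
die "no pdf names found in the .txt files in $dir\n" unless @pdfs;

# one alternation that matches any of the listed names literally
my $pdfrx = join '|', @pdfs;
foreach my $html (grep /\.html$/, @files) {
    open my $in, '<', "$dir/$html" or die "could not open $dir/$html: $!";
    my $found = grep /\b(?:$pdfrx)\b/, <$in>;
    close $in;
    next unless $found;
    # the question asks for a copy, so the original file is left in place
    copy("$dir/$html", "$dest/$html") or die "could not copy $dir/$html: $!";
}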

 
LVL 25

Expert Comment

by:SStory
ID: 41756417
tel2,

Look as I might, I don't see any quotes around my =. I do have backticks around the command to be executed. I also don't see any difference in what you showed me from mine and then yours:
    files = `grep -l t[1-2].pdf /Temp/Html/*.html`

"I think it should be this:"
    files=`grep -l t[1-2].pdf /Temp/Html/*.html`


the only difference I notice is that you took out a space before the =

I actually tested my solution and based upon what the OP seems to have said, it seemed to work fine.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 41756481
SStory, there are two problems with your provided solution for what the original poster asked:
  • The solution is not in perl.
  • You do not read the names of the pdf files from the text file (you hard-code t1 and t2).
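For comparison, here is a rough bash sketch of the second point: taking the pdf names from the .txt files instead of hard-coding them. It assumes every non-blank line of every .txt file is one pdf name; the `src`, `dest` and `names` variables and the sample files are hypothetical stand-ins for Temp/Html and the download folder.

```shell
#!/bin/bash
src=$(mktemp -d)     # stands in for Temp/Html
dest=$(mktemp -d)    # stands in for the download folder
printf '     t1.pdf\n     t2.pdf\n' > "$src/1.txt"
echo '<a href="t1.pdf">one</a>'  > "$src/a.html"
echo '<a href="other.pdf">x</a>' > "$src/b.html"

# Gather the names from every .txt file: trim surrounding whitespace, drop blank lines.
names=$(mktemp)
sed 's/^[[:space:]]*//; s/[[:space:]]*$//' "$src"/*.txt | grep -v '^$' > "$names"

# -F treats each name as a fixed string (so "." is literal); -f reads the patterns from a file.
grep -lF -f "$names" "$src"/*.html | while IFS= read -r file; do
    cp -- "$file" "$dest/"
done
rm -f "$names"

ls "$dest"
```

Dropping blank lines matters here, because an empty fixed-string pattern would match every line of every html file.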
 
LVL 25

Expert Comment

by:SStory
ID: 41756669
wilcoxon, OK.  I do agree it is not in perl.  I don't see where the OP mentions the file name being in a text file...but maybe you are a better interpreter than I.  If you are correct then it would need a slight modification.  I did want the OP to realize that it could easily be done without Perl if my interpretation was correct.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 41756736
I agree the OP's question could be clearer about where the pdf names come from and what the txt files are for.
  • He mentions 2 txt files but then only lists 1.
  • Are the t1.pdf and t2.pdf listed in the txt file or are they actual files also found in the directory?
    • If listed in the txt file, should all pdfs listed in any txt file be used to check the html files?
    • If in the directory, should all pdfs found in the dir be checked in the html files?  If so, what are the txt files for?

Shailesh Shinde, can you answer the above questions so we make sure we are providing you with the solution you want?
 
LVL 12

Expert Comment

by:tel2
ID: 41757442
Hi SStory,

> "Look as I might, I don't see any quotes around my =."
I didn't say there were quotes around your =.  I put my own quotes around it because I was quoting a portion of your code.  When I said:
> "I don't think it will work with spaces around the '='.
I was referring to the 'spaces' around it.

> "I do have backticks around the command to be executed. I also don't see any difference in what you showed me from mine and then yours:
    files = `grep -l t[1-2].pdf /Temp/Html/*.html`
I think it should be this:
    files=`grep -l t[1-2].pdf /Temp/Html/*.html`
the only difference I notice is that you took out a space before the ="

Yes, and I also removed a space after it.

If I run your command (with spaces) in bash, I get this error:
   files: command not found
but if I remove the spaces, I get no error.
It has always been my understanding & experience that shells like sh, ksh & bash don't allow spaces on either side of equals signs in assignments like this, and the above error message supports that theory.  So I don't know how it could be working for you.  Any ideas how it could be...anyone?
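The behaviour described above can be reproduced with a small sketch. It assumes there is no command called `files` on the PATH, and the `with_spaces` variable is only there for illustration.

```shell
#!/bin/bash
# With spaces, bash parses "files" as a command name and "=" as its first argument.
if files = "a.html" 2>/dev/null; then
    with_spaces="ran"
else
    with_spaces="failed"     # bash reports "files: command not found"
fi

# Without spaces, the same line is a variable assignment.
files="a.html"

echo "with spaces: $with_spaces"
echo "without spaces: files=$files"
```

The same rule applies when the right-hand side is a command substitution in backticks or `$(...)`.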
 
LVL 3

Author Comment

by:Shailesh Shinde
ID: 41757714
Hi,
Sorry for the delay in my comments.
Yes, t1.pdf and t2.pdf are listed in the .txt file, and they need to be checked against the html files in the Temp->Html folder. If t1.pdf is found in any of the html files, copy that html file into another folder.

Thanks,
Shail
 
LVL 3

Author Comment

by:Shailesh Shinde
ID: 41757718
Hi,
I will test wilcoxon's perl script for my requirement.

Thanks,
Shail
 
LVL 12

Expert Comment

by:tel2
ID: 41757723
Hi again Shailesh,

I know I've asked these questions before, but I haven't seen answers yet, so I'm going to ask them again:
    What do you mean by "Temp->Html"?
    Do you mean "Temp/Html" or what?
 
LVL 25

Expert Comment

by:SStory
ID: 41758188
tel2: Sorry, my bad. Somehow I saw "quotes"
You are right it didn't need spaces. even though in a bash script file.
 
LVL 12

Expert Comment

by:tel2
ID: 41758641
Thanks SStory.

No worries, all is forgiven.

There's still something I don't understand about this, though.  When you say:
> "You are right it didn't need spaces. even though in a bash script file."
I don't think it's a matter of 'need'.  To me that implies they are optional, but from my experience, it just won't work if you have any spaces around the '='.  Do you agree?

> "I actually tested my solution and based upon what the OP seems to have said, it seemed to work fine."
So are you saying you get no errors when you have spaces around the '='?
 
LVL 3

Author Comment

by:Shailesh Shinde
ID: 41769843
Hi,
Sorry for the delay.
Temp->Html is actually a folder path
There is temp folder and html folder is nested in this temp folder.

Thanks,
Shail
 
LVL 25

Expert Comment

by:SStory
ID: 41770178
tel2: no, it didn't work for me either.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 41770346
Did you ever test my perl script?  Did it meet your needs?  If not, what is not working?
 
LVL 12

Expert Comment

by:tel2
ID: 41770990
Hi Shailesh,

> "Temp->Html is actually a folder path
> There is temp folder and html folder is nested in this temp folder."

If the Html folder is inside the Temp folder, then you represent it like this in UNIX/Linux:
    Temp/Html
That is how your commands (like "mkdir", "cd", etc) will refer to it, too.
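As a small illustration of that notation, the sketch below creates the nested folders and moves into them. The `base` scratch directory and the `parent` and `nested` variables are hypothetical, used only so the example does not touch real folders.

```shell
#!/bin/bash
base=$(mktemp -d)             # hypothetical scratch location
mkdir -p "$base/Temp/Html"    # -p creates Temp first, then Html inside it
cd "$base/Temp" || exit 1
ls                            # lists the Html folder
cd Html || exit 1

# Rebuild the relative path from the current directory and its parent.
nested=$(basename "$PWD")
parent=$(basename "$(dirname "$PWD")")
echo "$parent/$nested"
```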
Please don't represent that like Temp->Html in future, as that will confuse some people...like me.  Looks like you have some kind of pointer or link.


Hi SStory,
> "tel2: no, it didn't work for me either."
OK, but if it didn't work for you, then what did you mean by this?:
> "I actually tested my solution and based upon what the OP seems to have said, it seemed to work fine."
 
LVL 26

Expert Comment

by:wilcoxon
ID: 41852280
There are certainly responses so this should not be deleted without assigning points.

The OP said he would test the script I provided but then never reported whether it worked or what any issues were.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 41863987
2) Accept comments as solution

https:#41751469 appears to be the only comment with a solution
 
LVL 12

Expert Comment

by:tel2
ID: 41864622
I agree with wilcoxon's (totally unbiased) suggestion...     8)
...even if his link to his post doesn't work properly.  I think it should read: https:#a41751469
 
LVL 3

Author Comment

by:Shailesh Shinde
ID: 41870432
Hi All,
I have checked the Workspace for this question in order to accept multiple solutions. However, that option is not available for this question, because it has already been accepted with the solution detailed below:

wilcoxon
Accepted Solution 2016-08-10 at 21:04:59, ID: 41751469

Thanks,
Shail

Question has a verified solution.
