Screen Print

Hi

An HTTPS website has multiple links. I have to click on each link and, once it opens, take a screen print and save it as a TIFF in a specific folder. Is there a way to download all the files from the webpage automatically? I am trying to automate this with wget, but I cannot figure out the complete path of the website's submenu pages.

Thanks
UK
Nirvanamanager Asked:

Nirvanamanager (Author) Commented:
Thank you

dbrunton Commented:
Try WinHTTrack: https://www.httrack.com/

Website copier that might do what you want.  Might.  It's not perfect.

Nirvanamanager (Author) Commented:
I have tried HTTrack and wget as well, but they do not solve my problem. I am not able to get the complete URL path of the subpages. For example, say there are 30 links on yahoo.com and I have to click each link and take a screen print. I am not able to see the URL path of each link; I can see only yahoo.com, not the full path.

Nirvanamanager (Author) Commented:
What I am looking for is this: I work for a company where a lot of customers send their invoices as scanned PDFs. Once an invoice is scanned, it will be saved in a database and a lunk will be created for that particular invoice. I need to click on each link and, once the invoice opens, I will take a screen print for audit purpose.

Merete Commented:
Would it help to use Greenshot to capture the page? Greenshot can capture the entire page or a region, watermark it, save it, and print it. It also includes a screen editor.
However, Greenshot's capture of an entire web page only works with Internet Explorer.
http://getgreenshot.org/
Greenshot - Screen Capture how to
https://www.youtube.com/watch?v=VTtQPx8F9O8

Scott Fell, EE MVE (Developer & EE Moderator) Commented:
I'm confused by what you are asking. At first it sounds like you just want to capture screen prints. But in your follow-up, it sounds like you are getting PDFs sent to you and now you need to save them in some form.

Are the PDF invoices coming to you from vendors to be paid? Or are they invoices you sent out that your customers saved and sent back in for one reason or another?

Then what is saved to a database? The PDF, or are you trying to extract text?

"a lunk will be created "  What is a lunk?

"take a screen print for audit purpose."  What are you doing with the screenshot? Saving the image, or printing it out?

If you can detail your workflow even more, we can help you.  I think what you need and what you think you need are two different things.

Nirvanamanager (Author) Commented:
Hi Scott,

Sorry for not detailing it out properly. Here are the detailed steps.

1. I open a website (epserver.com)
2. Login with user credentials
3. Browse to the respective page by vendor (selecting from a drop-down)
4. The page has links; clicking on one opens an invoice
5. Take the screen print
6. Save it in the local drive

I am trying to automate steps 4, 5, and 6. Greenshot would do steps 5 and 6; however, clicking on each link in the page is a challenge. Is there a way that I can click on each link automatically?

Solutions that I have already tried:

1. Wget: with wget I am unable to browse to the exact page (the vendor page)
2. Greenshot: yes, it is useful for steps 5 and 6

I hope the question is clear.

By the way, 'lunk' is a typo; it should be link.

Scott Fell, EE MVE (Developer & EE Moderator) Commented:
Do you control the server?

Nirvanamanager (Author) Commented:
No, I don't have control over the server.

Scott Fell, EE MVE (Developer & EE Moderator) Commented:
Where does the database come in?  What are you saving to the database?  A link, image, text?  What are you doing with the screen print? printing, saving?

Nirvanamanager (Author) Commented:
By "database" I was referring to the shared drive where the screen prints are saved. I shouldn't have called it a database.

The link is an image; I am not sure what format it is (JPEG, PNG, TIFF).

The screen print is saved in a shared drive.

Scott Fell, EE MVE (Developer & EE Moderator) Commented:
You can use Adobe Acrobat Pro to manage this workflow: https://acrobat.adobe.com/us/en/products/acrobat-pro.html

If you are looking to script something on your own, I have ImageMagick on my server (http://www.imagemagick.org/script/index.php) and have used it to create automated workflows. You essentially feed it command lines via your script. There are other similar software products that can do the same. Joe Winograd has some articles on this topic: http://www.experts-exchange.com/articles/13696/Batch-Conversion-of-PDF-and-TIFF-files-via-Command-Line-Interface.html and

If you want to script something from scratch, you should make a start on your own, and Experts here can help you troubleshoot any issues you have.

Nirvanamanager (Author) Commented:
Thanks a lot, Scott. I will try these today at work, and I will try to provide some screenshots of what I am trying to achieve, so that we are on the same page.

Nirvanamanager (Author) Commented:
Does anybody else have a solution? I thought it would be a damn easy one for all the geniuses out here.

Nirvanamanager (Author) Commented:
OK, let's break this down.

From a spreadsheet, I copy the contents of cell A1 and search for it in the webpage, click the match when I find it, and then loop the same steps for cell A2.

Can this be automated?

Ark Commented:
Assuming you have the HTML code of the main page, here is a function to grab all links ("<a>...</a>" tags) from the HTML:
Imports System.Net
Imports System.Text.RegularExpressions
    Private Class LinkItem
        Public Text As String
        Public url As String
    End Class

    Private Function getLinks(html As String) As List(Of LinkItem)
        Dim lst As New List(Of LinkItem)
        Dim m = Regex.Match(html, "<body[^>]*>(.*?)</body>", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then
            Dim matches As MatchCollection = Regex.Matches(m.Value, "<a.*?href=[""'](?<url>.*?)[""'].*?>(?<name>.*?)</a>", RegexOptions.IgnoreCase)
            For Each m1 As Match In matches
                lst.Add(New LinkItem With {.Text = m1.Groups(2).Value, .url = m1.Groups(1).Value})
            Next
        End If
        Return lst
    End Function
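
Tying this back to the spreadsheet question, here is a rough usage sketch of the same regex idea. It assumes the cell values (A1, A2, ...) have already been read into a list; FindLinks, the sample HTML, and the INV-00x values are hypothetical names made up for illustration, not part of the site in question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module LinkFilterDemo
    ' Hypothetical helper: returns (text, url) pairs for anchors whose text
    ' contains any of the search terms (case-insensitive).
    Function FindLinks(html As String, terms As IEnumerable(Of String)) As List(Of KeyValuePair(Of String, String))
        Dim result As New List(Of KeyValuePair(Of String, String))
        Dim pattern = "<a.*?href=[""'](?<url>.*?)[""'].*?>(?<name>.*?)</a>"
        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase Or RegexOptions.Singleline)
            Dim name = m.Groups("name").Value
            If terms.Any(Function(t) name.IndexOf(t, StringComparison.OrdinalIgnoreCase) >= 0) Then
                result.Add(New KeyValuePair(Of String, String)(name, m.Groups("url").Value))
            End If
        Next
        Return result
    End Function

    Sub Main()
        ' Made-up sample page with two invoice links
        Dim html = "<a href='inv1.html'>INV-001</a> <a href='inv2.html'>INV-002</a>"
        ' Sample values standing in for spreadsheet cells A1, A2, ...
        Dim cells = New String() {"INV-002"}
        For Each hit In FindLinks(html, cells)
            Console.WriteLine(hit.Key & " -> " & hit.Value)
        Next
    End Sub
End Module
```

Each matching URL could then be passed to whatever step loads the page and takes the screenshot.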

Next, using a hidden WebBrowser control, you can take a screenshot of each URL found in the hrefs. Full example:
Imports System.Net
Imports System.Text.RegularExpressions

Public Class Form1
    Private Class LinkItem
        Public Text As String
        Public url As String
    End Class

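    ' Output folder, current output file name, and a flag set when the page has finished loading
    Private folderPath As String = "e:\temp", fileName As String = "", _loaded As Boolean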
    Private WithEvents wb As New WebBrowser With {.ScriptErrorsSuppressed = True}

    Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
        Dim baseUrl = "http://www.vbtutor.net/vb2010/index.html"
         Using client As New WebClient
            If client.Proxy IsNot Nothing Then
                client.Proxy.Credentials = CredentialCache.DefaultCredentials
            End If
            Dim html = client.DownloadString(baseUrl)
            Dim links = getLinks(html)
            Dim count As Integer
            For Each link In links
                If link.Text.Contains("Managing") Then
                    count += 1
                    Label1.Text = "Loading " & link.Text & "..."
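                    ' Link text may contain characters that are not valid in file names; replace them
                    fileName = String.Join("_", link.Text.Split(IO.Path.GetInvalidFileNameChars())) & ".tiff"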
                    _loaded = False
                    Dim url = link.url
                    If Not url.StartsWith("http") Then
                        url = baseUrl.Substring(0, baseUrl.LastIndexOf("/") + 1) & link.url
                    End If
                    wb.Navigate(url)
                    Do While Not _loaded
                        My.Application.DoEvents()
                        Threading.Thread.Sleep(200)
                    Loop
                End If
            Next
        End Using
        MsgBox("Done")
    End Sub

    Private Function getLinks(html As String) As List(Of LinkItem)
        Dim lst As New List(Of LinkItem)
        Dim m = Regex.Match(html, "<body[^>]*>(.*?)</body>", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then
            Dim matches As MatchCollection = Regex.Matches(m.Value, "<a.*?href=[""'](?<url>.*?)[""'].*?>(?<name>.*?)</a>", RegexOptions.IgnoreCase)
            For Each m1 As Match In matches
                lst.Add(New LinkItem With {.Text = m1.Groups(2).Value, .url = m1.Groups(1).Value})
            Next
        End If
        Return lst
    End Function

    Private Sub wb_DocumentCompleted(sender As Object, e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles wb.DocumentCompleted
        ' Start from a fixed viewport, then grow the control to the full document height
        wb.ClientSize = New Size(1024, 768)
        Dim Height As Integer = wb.Document.Body.ScrollRectangle.Bottom
        If Height = 0 Then Height = 768
        wb.ClientSize = New Size(1024, Height)
        ' Render the whole page into a bitmap and save it as TIFF
        Using Bmp = New Bitmap(wb.Bounds.Width, Height)
            wb.DrawToBitmap(Bmp, wb.Bounds)
            Bmp.Save(IO.Path.Combine(folderPath, fileName), Drawing.Imaging.ImageFormat.Tiff)
        End Using
        ' Signal the waiting loop in Button1_Click that this page is done
        _loaded = True
    End Sub
End Class
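
The example above builds absolute URLs by cutting the base URL at its last slash, which would not handle root-relative links such as "/index.html". One possible alternative, sketched here on the assumption that the .NET Uri class is acceptable in your project, is to let the framework resolve the link. ResolveUrl is a hypothetical helper name, not part of the code above.

```vbnet
Imports System

Module UrlHelper
    ' Hypothetical helper: resolves a link's href against the page it was found on.
    Public Function ResolveUrl(baseUrl As String, href As String) As String
        Dim resolved As Uri = Nothing
        ' Uri.TryCreate handles absolute, relative, and root-relative hrefs
        If Uri.TryCreate(New Uri(baseUrl), href, resolved) Then
            Return resolved.AbsoluteUri
        End If
        ' Fall back to the raw href if it cannot be parsed
        Return href
    End Function

    Sub Main()
        Dim page = "http://www.vbtutor.net/vb2010/index.html"
        Console.WriteLine(ResolveUrl(page, "vb2010_lesson1.html"))
        Console.WriteLine(ResolveUrl(page, "/index.html"))
    End Sub
End Module
```

In the button handler, the result of such a helper could be passed to wb.Navigate in place of the manual Substring logic.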

Note that for an HTTPS connection you may get a security alert. To avoid it, call
ServicePointManager.ServerCertificateValidationCallback = (Function(sender, certificate, chain, sslPolicyErrors) True)

before requesting the main page.

Nirvanamanager (Author) Commented:
Thanks a lot, Ark. I will try this. Thanks a lot again.

Nirvanamanager (Author) Commented:
Hi Ark, thank you, and I am extremely sorry for the late reply. Considering I am very new to VBA, could you let me know how to run the code you provided?

Ark Commented:
Actually, this code is for a VB.NET WinForms application. You can make a COM-visible DLL from it and call it from VBA.
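
As a minimal sketch of what "COM-visible" could look like, assuming a VB.NET class library project: the class name, the ProgId "InvoiceTools.ScreenCapture", and the BuildFileName method are all hypothetical and only illustrate the attributes involved.

```vbnet
Imports System.IO
Imports System.Runtime.InteropServices

' Hypothetical COM-visible wrapper; class name, ProgId, and method are illustrative only.
<ComVisible(True), ClassInterface(ClassInterfaceType.AutoDual), ProgId("InvoiceTools.ScreenCapture")> _
Public Class ScreenCapture
    ' COM clients need a public parameterless constructor
    Public Sub New()
    End Sub

    ' Builds a safe .tiff path from a folder and a link's text
    Public Function BuildFileName(folder As String, linkText As String) As String
        Dim safeName = String.Join("_", linkText.Split(Path.GetInvalidFileNameChars()))
        Return Path.Combine(folder, safeName & ".tiff")
    End Function
End Class
```

After building with the project's "Register for COM interop" option enabled (or registering the DLL with regasm), VBA should be able to create the object with CreateObject("InvoiceTools.ScreenCapture") and call its methods. The capture logic from the form would still have to be moved into a class like this one.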

Nirvanamanager (Author) Commented:
Thank you very much
Visual Basic Classic

Question has a verified solution.
