Luis Diaz

Export HTML webpages

Hello experts,

I have created a large number of questions on EE, and I would like to access the information I need quickly.
In the current EE version, the Questions section is not easy to use. This is something I miss a lot from the previous version, in which I was able to sort my questions by date and, if I am not wrong, perform a word search on my questions.

Example: If I want to find a question/solution from 2017, I need to know exactly which page it is on.
One alternative is to use the Knowledge Base feature; however, it would be complicated to assign a KM link & label to each of my questions/solutions (839). I have already done it for 100.
The other alternative, the one that I prefer, is to simply export all questions as HTML through PowerShell or Windows Batch.

Through PowerShell: I am already able to get CSV files, so I think that I can get HTML files too:

Function Get-CsvFromWeb {
[CmdletBinding()]
Param(
	[String]$Directory,
	[String]$Name,
	[String]$IDapi
)
	[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
	$user= ""
	$pass= ""
	$timeStamp = Get-Date -Format 'yyyyMMdd_HHmmss'
	If (-not (Test-Path -Path $Directory)) {
		New-Item -Path $Directory -ItemType Directory -Force | Out-Null
		Write-Verbose "Target directory '$($Directory)' created"
	}
	If ([IO.Path]::GetExtension($Name) -ne '.csv') {
		$Name = $Name + '.csv'
	}
	$outFile = Join-Path -Path $Directory -ChildPath ($Name -replace '\.csv$', "_$($timeStamp).csv")
	$bytes = [System.Text.Encoding]::ASCII.GetBytes("$($user):$($pass)")
	$base64 = [System.Convert]::ToBase64String($bytes)
	$basicAuthValue = "Basic $($base64)"
	$headers = @{Authorization = $basicAuthValue}
	$url = ""
	Invoke-WebRequest -Uri $url -Headers $headers -Method POST -OutFile $outFile -ErrorAction Stop
	Write-Verbose "File saved to '$($outFile)'"
}

$currentDir = Split-Path -Path $Script:MyInvocation.MyCommand.Path
$exportDir = $currentDir +'
# Example:
Get-CsvFromWeb -Directory $exportDir -Name -IDapi 3 -Verbose 


If someone can help me to get HTML files it would be great.
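
A minimal sketch of how the CSV function above might be adapted to save a page as HTML. The function names `ConvertTo-HtmlFileName` and `Save-WebPageAsHtml` are hypothetical, and the sketch assumes a plain GET request (the default method of `Invoke-WebRequest`, instead of the POST used for the CSV call) returns the page. Whether EE serves the full question to an unauthenticated request is an assumption this thread does not confirm.

```powershell
# Sketch only: both function names are hypothetical helpers, not EE tooling.
function ConvertTo-HtmlFileName {
    param([Parameter(Mandatory)][string]$Url)
    # Last path segment of the URL, e.g. 'Powershell-export-html-files.html'
    $leaf = ([Uri]$Url).Segments[-1].Trim('/')
    # Prefix the numeric question ID when the URL has the /questions/<id>/ shape
    if ($Url -match '/questions/(\d+)/') { $leaf = "$($Matches[1])-$leaf" }
    # Replace characters that are not allowed in Windows file names
    $leaf = $leaf -replace '[\\/:*?"<>|]', '_'
    if ([IO.Path]::GetExtension($leaf) -ne '.html') { $leaf += '.html' }
    return $leaf
}

function Save-WebPageAsHtml {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)][string]$Url,
        [Parameter(Mandatory)][string]$Directory
    )
    [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
    if (-not (Test-Path -Path $Directory)) {
        New-Item -Path $Directory -ItemType Directory -Force | Out-Null
    }
    $outFile = Join-Path -Path $Directory -ChildPath (ConvertTo-HtmlFileName -Url $Url)
    # Plain GET; -UseBasicParsing avoids the Internet Explorer dependency in Windows PowerShell 5.1
    Invoke-WebRequest -Uri $Url -OutFile $outFile -UseBasicParsing -ErrorAction Stop
    Write-Verbose "File saved to '$outFile'"
    return $outFile
}
```

Calling `Save-WebPageAsHtml -Url <question URL> -Directory <export folder> -Verbose` would then write one `.html` file per question, named after the question ID and title taken from the URL.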
Hello There

If you mean Experts Exchange, you can simply search for whatever you want. Just open your Profile -> Contributions, use the search, and type the keyword. You can also sort by date, title, topic...
Luis Diaz (ASKER)

I know but this doesn't filter all my questions. I prefer to export all my HTML questions/solutions.
I am afraid this would be really hard or even impossible.

If you use the correct keywords on EE, you will always find what you need.
I would do a feature request
Luis,

Are you using the Advanced Search option for Your Questions in your profile?  If so, what are you trying to do that isn't supported with the options there?

User generated image

»bp
Hi Bill,

The issue is that this option is not available in the My Questions section.
User generated image
I can do it by going to the search options, but it is not straightforward or easy to use. By the way, how can I find my member ID?
In the field explanation I see a cross. Does this filter really work?
User generated image
Thank you for your help.
Open your Profile -> Contributions -> click on Advanced Search
Yes, it takes an extra click or two, but basically:

  • Go to your account profile
  • On the overview page that comes up, down from the top just a bit, there is a section like shown below.  Click on the number next to "Questions".
  • Then click on "Advanced Search"
  • (that will also show you your member number)

User generated image

»bp
Hi Luis,

> I prefer to export all my HTML questions/solutions.

It's easy to do that (I already did...see below), but my question is...what are you going to do with the exported (downloaded) HTML files?

I did this in AutoHotkey, not PowerShell. I wrote these two articles here at EE about downloading from the Internet:
How to download number of Views, Endorsements, Points for Experts Exchange Articles and Videos
Automatically download files from the web - AutoHotkey Script

Also, I published two five-minute EE video Micro Tutorials showing how the script in the first article works:
How to download number of Views, Endorsements, Points for Experts Exchange Articles and Videos--Demo
ArticlesVideosEE: Download statistics on Experts Exchange Articles and Videos - Demo of Enhancements

Leveraging that code, I already have a preliminary version of a script that downloads EE questions into HTML files. It requires a text file with the URLs, which is easy to create from your EE Profile...takes a few minutes using a decent text editor...in fact, the text file with URLs for all your questions (828!) is attached. I tested my code on a few of those (not all 828) and it worked fine...here's the logfile from the run:

20200418094718 DownloadQuestionsEE Version 1.1 Beginning date and time: 2020-04-18_09.47.18 Input File: c:\temp\LuisDiazQuestions.txt
20200418094719 Success: https://www.experts-exchange.com/questions/29179154/Powershell-Execute-script-through-a-bat.html
20200418094720 Success: https://www.experts-exchange.com/questions/29179134/Powershell-export-html-files.html
20200418094721 Success: https://www.experts-exchange.com/questions/29179132/Windows-batch-transpose-VBScript-to-Windows-batch-copy-process.html
20200418094721 Ending date and time: 2020-04-18_09.47.21 Succeeded: 3 Failed: 0

The HTML file from this question is attached as an example. But my question remains...what are you going to do with them? Regards, Joe

Edit: The AutoHotkey script is attached. You'll need to change the variables in the *** begin variables to change *** section. Regards, Joe
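
For readers who prefer PowerShell to AutoHotkey, the general idea (read a text file with one URL per line, download each page, write a timestamped log) can be sketched as follows. This is not the attached DownloadQuestionsEE.ahk script and makes no claim about how it works internally. The function name `Invoke-UrlListDownload`, the `-Downloader` hook, and the log wording are assumptions chosen to resemble the logfile shown above.

```powershell
# Sketch only: Invoke-UrlListDownload and its -Downloader hook are hypothetical.
function Invoke-UrlListDownload {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)][string]$UrlFile,   # plain text file, one URL per line
        [Parameter(Mandatory)][string]$OutDir,
        [Parameter(Mandatory)][string]$LogFile,
        # Receives ($url, $outFile); defaults to a plain GET saved to disk
        [scriptblock]$Downloader = {
            param($url, $outFile)
            Invoke-WebRequest -Uri $url -OutFile $outFile -UseBasicParsing -ErrorAction Stop
        }
    )
    if (-not (Test-Path -Path $OutDir)) {
        New-Item -Path $OutDir -ItemType Directory -Force | Out-Null
    }
    $ok = 0
    $failed = 0
    # Skip blank lines and surrounding whitespace
    $urls = Get-Content -Path $UrlFile | ForEach-Object { $_.Trim() } | Where-Object { $_ }
    foreach ($url in $urls) {
        $stamp = Get-Date -Format 'yyyyMMddHHmmss'
        try {
            # Name the output file after the last URL segment
            $leaf = ([Uri]$url).Segments[-1]
            $outFile = Join-Path -Path $OutDir -ChildPath $leaf
            & $Downloader $url $outFile | Out-Null
            $ok++
            Add-Content -Path $LogFile -Value "$stamp Success: $url"
        } catch {
            $failed++
            Add-Content -Path $LogFile -Value "$stamp Failed: $url ($($_.Exception.Message))"
        }
    }
    Add-Content -Path $LogFile -Value "$(Get-Date -Format 'yyyyMMddHHmmss') Succeeded: $ok Failed: $failed"
    [pscustomobject]@{ Succeeded = $ok; Failed = $failed }
}
```

Keeping the download step in a replaceable script block means the loop and the logging can be tried against a fake downloader before pointing it at all 828 URLs.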
29179134-Powershell-export-html-fil.html
LuisDiazQuestionsAll.txt
DownloadQuestionsEE.ahk
Hi Joe,

Great news!
I will look at your procedure.

>what are you going to do with the exported (downloaded) HTML files?

To have them for backup purposes and to be able to work on EE scripts when I am offline (plane, train, vacation with my parents and sister, etc.).
> I will look at your procedure.

OK, I'll go back to my previous comment and edit it to attach the script so that everything is contained in a single post.

> be able to work on EE scripts when I am in offline

Well, keep in mind that web links will, of course, not work. You'll probably have enough downloaded text in most cases to get some work done, but not everything will be there for you.

Regards, Joe
And of course comments / solutions that have "attachments" won't be available.

Another approach could be something like HTTrack Website Copier, but I haven't really thought out how to get the right full page of all your questions to drive it, so that it could then download them plus their attached content. It would take a lot of tweaking of options and trial and error, I suspect, and I'm not sure if it would result in something great or not... Just brainstorming...


»bp
Agree with Bill that HTTrack may work, although I, too, haven't thought about how to automate it for many pages. Also, you'll probably need to be careful in limiting its mirroring depth and perhaps setting other limits because EE has a ton of links. Here's an EE question where I posted its Limits settings dialog, which may be helpful as you think about using it:

https://www.experts-exchange.com//questions/29071836/how-can-i-download-a-simple-site-that-i-can-then-browse-offline-when-i-dont-have-internet-on-the-train.html

Regards, Joe
Hello Experts,

Thank you very much for this useful advice.

I will start using HTTrack, which seems to be very powerful and light.
I am going to adopt a hybrid approach here:
1. List all EE question titles as recommended by Joe. This will allow me to have all my questions in one place and search for a keyword to identify the full question title or ID.
2. Use Advanced Search to get the online version.
3. If necessary, download it as recommended by Joe.

@Joe: I started using DownloadQuestionsEE.ahk, but I don't know why my LuisDiazQuestions.txt is empty. I modified the variables in the "begin variables to change" section, but I don't know what is wrong here.

I have attached my files.
If you can, please let me know what is wrong.
DownloadQuestionsEE.ahk
LuisDiazQuestions.txt
Hi Luis,

> please let me know what is wrong

(1) Download the LuisDiazQuestionsAll.txt file in my main post above. I named it with "All" because it has all 828 of your questions. However, when I tested the script, I put just three of them in the file called LuisDiazQuestions.txt because I didn't want to download all 828.

(2) To download all 828, change line 72 in the script to this:

FileListInput:="C:\Users\luis-\Documents\5.Prog\2.Autohot-Vbs-Cmd-Ps\2.1Autohotkeys\LuisDiazQuestionsAll.txt" ; full path to plain text file with a question URL on each line

To test just a few, make a file with just a few links in it and name it LuisDiazQuestionsTest.txt. Then change line 72 to this:

FileListInput:="C:\Users\luis-\Documents\5.Prog\2.Autohot-Vbs-Cmd-Ps\2.1Autohotkeys\LuisDiazQuestionsTest.txt" ; full path to plain text file with a question URL on each line

The point is, as the comment on the line says, the FileListInput variable is the full path to a plain text file with a question URL on each line. Regards, Joe
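
Since an empty input file was the problem here, a quick sanity check of the list before running a downloader can save time. The following is a minimal sketch in PowerShell; the function name `Test-UrlListFile` and the shape of the returned object are hypothetical, and the check only verifies that each non-blank line looks like an http(s) URL.

```powershell
# Sketch only: Test-UrlListFile is a hypothetical helper.
function Test-UrlListFile {
    param([Parameter(Mandatory)][string]$Path)
    if (-not (Test-Path -Path $Path -PathType Leaf)) {
        throw "Input file not found: $Path"
    }
    # Non-blank lines, trimmed
    $lines = @(Get-Content -Path $Path | ForEach-Object { $_.Trim() } | Where-Object { $_ })
    # Lines that do not look like a single http(s) URL
    $bad = @($lines | Where-Object { $_ -notmatch '^https?://\S+$' })
    [pscustomobject]@{
        Path     = $Path
        UrlCount = $lines.Count - $bad.Count
        BadLines = $bad
        # Usable only when there is at least one line and every line is a URL
        IsUsable = ($lines.Count -gt 0 -and $bad.Count -eq 0)
    }
}
```

An empty file gives `UrlCount` 0 and `IsUsable` `$false`, which would flag the situation described above before any download is attempted.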
Hi Joe,

Downloading the HTML webpages is clear.

What is not clear is the following:

> It requires a text file with the URLs, which is easy to create from your EE Profile...takes a few minutes using a decent text editor...in fact, the text file with URLs for all your questions (828!) is attached.

Was this operation done manually?
> This operation was done manually?

Yes. I did it by saving the Contributions>Questions pages at your EE Profile. You can do 200 at a time, so I needed to save five pages to get all 828, then easily pulled out the URLs using my fav text editor. Probably took less than 10 minutes.
Ah OK, now I understand. Thank you for this update.
I took a look at the Contributions view and I think it is very powerful: it has the newest/oldest filter, the grid/detail view, and the capability to display 200 questions instead of the 25 in the current Questions section. I don't understand why this view is so hidden and not emphasized! It should replace the Questions section. The only issue I see is that titles are not properly displayed, so I was wondering, @Joe, how you copied the various question titles if they are not properly displayed?
User generated image
> I think it is very powerful it has the newest/oldest filter

Yes, very powerful, and the filter has much more than just Newest/Oldest choice...lots of ways to sort!

> the capability to display 200 questions

Yes, 200 is nice, but 500 would be better. :)

> @Joe how you did to copy various questions titles if they are not properly displayed?

They're cut off only in the display, but under the covers, the URL has the full title. For example, here are a few lines (with long titles) from one of your saved pages:

<https://www.experts-exchange.com/questions/29179320/Excel-VBA-copy-sheets-from-one-workbook-to-another.html>
<https://www.experts-exchange.com/questions/29179132/Windows-batch-transpose-VBScript-to-Windows-batch-copy-process.html>
<https://www.experts-exchange.com/questions/29176456/Powershell-Windows-Batch-VB-Script-Save-as-multiple-files-with-new-extension.html>
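
The manual text-editor step of pulling the URLs out of the saved Contributions pages could also be scripted. Here is a minimal PowerShell sketch, assuming the saved pages contain absolute question URLs of the shape shown above; the function name `Get-QuestionUrl` is hypothetical, and pages that store relative links would need a different pattern.

```powershell
# Sketch only: Get-QuestionUrl is a hypothetical helper.
function Get-QuestionUrl {
    param([Parameter(Mandatory)][string[]]$Path)   # one or more saved Contributions pages
    # https://www.experts-exchange.com/questions/<numeric id>/<title>.html
    $pattern = 'https://www\.experts-exchange\.com/questions/\d+/[^"''<>\s]+\.html'
    $Path |
        ForEach-Object { Get-Content -Path $_ -Raw } |
        ForEach-Object { [regex]::Matches($_, $pattern) } |
        ForEach-Object { $_.Value } |
        Sort-Object -Unique   # each question usually appears more than once per page
}
```

Piping the result to `Set-Content` would produce a plain text file with one question URL per line, which is the input format described earlier in this thread.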

Regards, Joe
Thank you for this message, Joe. I will not assign a solution yet, as I would like to analyse the Contributions view and the other topics mentioned in depth this week; otherwise I risk losing track of this important question.

Regards,
Luis.
> Thank you for this message Joe ... I would like to deeply analyse this week contribution view and other topics mentioned

You're welcome, Luis, take all the time you need for your analysis. Regards, Joe
Hi Joe,

Sorry to disturb you, but I am trying to apply the procedure recommended at:

#a43069359


Selecting the text and pasting it into Notepad++ doesn't give me any line with https://www.experts-exchange.com/questions/IDnumber

User generated image
If you can let me know how to proceed, it would be great.

Question IDs are very important to me personally. The idea is to manage them in a txt file through a manual process, as it cannot be automated, but I need to know how I can get this information.

Regards,
Luis.
ASKER CERTIFIED SOLUTION
Joe Winograd
This solution is only available to members.
Thank you Joe.

Got it. Now I am able to find them.

User generated image
Thank you for your help.

Regards,
Luis.
You're welcome, Luis. That's great news! Regards, Joe