Parsing output with VBS or JS

Bryan71 asked:

I am trying to build a quick extract to regularly pull my company's Facebook fans from the API. I am using iMacros and have written code to pull one of our Facebook page URLs from a csv file (column 1), and then pull the fan name and id along with the source url using a regular expression and write it to a csv file (output: name, id, source url).

The place I am stuck (I am not a programmer) is that this data comes from the Facebook Graph API rather than an html page, so I can only get it to pull the first data pair. Evidently what I have is correct so far, but what needs to happen is that I need a VBS or JS script that I launch, which runs the macro, loops through the Graph API page pulling all data pairs and writes them to a csv file, then pulls the next row from my source csv file of our company pages and parses that output, until there are no more source rows with a url. The script currently will either pull the first data pair using regex or pull the entire page of text as shown below, but it just writes the results to the csv file with no organization at all. Unfortunately, after days of trial and error, it is beyond my abilities.

So, here is a sample of the text that I am pulling that needs to be parsed. The only field I really need is the id paired with the source url (so I can keep track of the fans for each of our unique pages).

{
"data": [
{
"name": "Fan Name...",
"id": "123455767"
},
{
"name": "Fan Name...",
"id": "123455767555777"
},
{
"name": "Fan Name...",
"id": "1234557675656565"
}
]
}
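
For reference, a block like the one above is plain JSON, so once the page text is in a script it can be parsed directly instead of scraped one pair at a time. A minimal Javascript sketch (the url value and the sample text here are just placeholders for whichever page the block came from):

// Minimal sketch: parse one Graph API response and emit "url,name,id" rows.
// "url" and "text" are placeholders for this example.
var url  = "https://graph.facebook.com/SOME_PAGE/...";   // hypothetical source url
var text = '{"data":[{"name":"Fan Name...","id":"123455767"},{"name":"Fan Name...","id":"123455767555777"}]}';

var obj  = JSON.parse(text);   // jQuery's $.parseJSON(text) works the same way
var rows = [];
for (var i = 0; i < obj.data.length; i++) {
    rows.push(url + "," + obj.data[i].name + "," + obj.data[i].id);
}
// rows now holds one CSV line per fan, not just the first pair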

Here is a sample of the code I have that actually pulls the first pair from the page and writes it to the csv file. I have another version of this code that does not use regex and can pull all of the text from the entire page, but that text then needs to be parsed into the output file.

VERSION BUILD=7010818
TAB T=1
TAB CLOSEALLOTHERS
SET !EXTRACT_TEST_POPUP NO
SET !ERRORIGNORE YES

SET !DATASOURCE extract1.csv
'Number of columns in the CSV file.
SET !DATASOURCE_COLUMNS 1
'Start at line 2 to skip the headers in the file
SET !LOOP 2
'Increase the current position in the file with each loop
SET !DATASOURCE_LINE {{!LOOP}}

SEARCH SOURCE=REGEXP:"name":\s"([^"]+)" EXTRACT="$1"
SEARCH SOURCE=REGEXP:"id":\s(\d*)" EXTRACT="$1"

ADD !EXTRACT {{!URLCURRENT}}

SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\admin\Desktop\folder1 FILE=extract1.csv
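
A note on the two SEARCH lines above: a single regular-expression match only returns the first name/id pair on the page. If the full page text is available to a script, the same patterns can be applied with a global flag to collect every pair. A rough Javascript sketch, where pageText stands in for whatever the macro extracted:

// Rough sketch: apply the name/id pattern globally instead of once.
// "pageText" is a placeholder for the full Graph API page text.
var pageText = '... full text of the graph page ...';
var pairRe   = /"name":\s*"([^"]+)",\s*"id":\s*"(\d+)"/g;   // g flag = every match
var m, pairs = [];
while ((m = pairRe.exec(pageText)) !== null) {
    pairs.push({ name: m[1], id: m[2] });
}
// pairs[i].name and pairs[i].id now cover all fans on the page, not just the first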

Here is a script I found that I am not using correctly since I don't understand how to do this.



Now, the solution as I understand it is that I need to launch a script that runs the iMacros code and pulls all of the text, parses it, and writes it to the csv file, then runs the iMacros code again to pull the next source url, pulls that text and parses it, and so on until there are no additional source urls, then shuts down. Any help is appreciated as I have already burned up many hours trying to solve this before bothering anyone else...
Option Explicit

Dim iim1,iret
Set iim1 = CreateObject("imacros")
iret = iim1.iimInit("",FALSE) 'connect to open iMacros browser window 

Dim macro

Dim counter
counter = 1 

Dim extraction, extractionArray(5)

do while not (iret < 0)
   macro = "CODE:"
   macro = macro + "VERSION BUILD=7010818     "+vbNewLine
   macro = macro + "TAB T=1      "    + vbNewLine
   macro = macro + "TAB CLOSEALLOTHERS     "+vbNewLine
        macro = macro + "SET !EXTRACT_TEST_POPUP NO      "    + vbNewLine
        macro = macro + "SET !ERRORIGNORE YES      "    + vbNewLine
   macro = macro + "SET !DATASOURCE extract1.csv      "    + vbNewLine
                         'Number of columns in the CSV file. This must be accurate!
        macro = macro + "SET !DATASOURCE_COLUMNS 1      "    + vbNewLine
                         'Start at line 2 to skip the headers in the file
        macro = macro + "SET !LOOP 1      "    + vbNewLine
                         'Increase the current position in the file with each loop 
        macro = macro + "SET !DATASOURCE_LINE {{!LOOP}}      "    + vbNewLine
        macro = macro + "URL GOTO={{!COL1}}      "    + vbNewLine
        macro = macro + "SEARCH SOURCE=REGEXP:"name":\s"([^"]+)" EXTRACT="$1"      "    + vbNewLine
        macro = macro + "SEARCH SOURCE=REGEXP:"id":\s"(\d*)" EXTRACT="$1"      "    + vbNewLine
        macro = macro + "ADD !EXTRACT {{!URLCURRENT}}      "    + vbNewLine
        macro = macro + "SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\admin\Desktop\folder1 FILE=results1.csv      "    + vbNewLine
   iret = iim1.iimPlay(macro)
   
   counter = counter + 3
   if (iret < 0) then
      
   end if
loop

iret = iim1.iimPlay(macro)


Bryan71 (Asker):

Here is a revised request that may simplify the solution. Since I have the first part working (extracting all text from multiple pages into a text file that needs to be parsed), maybe I could just get some assistance with parsing that text file into a csv file.

Here is what the extracted text file looks like:

"https://longurlofsourcepage.com/1234444555543","{
   ""data"": [
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""100000012345""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""121212122255""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""188552255888""
      },
    ]
}"
"https://longurlofsourcepage.com/123998877445","{
   ""data"": [
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""5488996325874""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""959595858587874""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""323265659898774""
      },
    ]
}"
"https://longurlofsourcepage.com/4565446544457789","{
   ""data"": [
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""852852852852""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""7474745454565656""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""85856565353535""
      },
    ]
}"

Here is what I need it parsed into:

Source URL (appears just before each page's extract), name, id
https://longurl....., name, id
https://longurl....., name, id
https://longurl....., name, id
https://longurl2....., name, id
https://longurl2....., name, id
https://longurl2....., name, id
https://longurl3....., name, id
https://longurl3....., name, id
https://longurl3....., name, id
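
For what it's worth, the extract above is essentially a CSV wrapper around the JSON: each record starts with a quoted source url, and the JSON that follows has all of its quotes doubled. One way to parse it is to split the file on the lines that start a new url, undo the doubled quotes, and hand each block to a JSON parser. A rough Javascript sketch, assuming the whole file has already been read into a string named fileText:

// Rough sketch: turn the extract file into "url,name,id" lines.
// Assumes the file content is already in the string "fileText".
var rows = [];
var blocks = fileText.split(/\r?\n(?="https)/);             // one block per source url
for (var b = 0; b < blocks.length; b++) {
    var m = blocks[b].match(/^"([^"]+)","([\s\S]*)"\s*$/);  // 1 = url, 2 = escaped JSON
    if (!m) continue;
    var jsonText = m[2].replace(/""/g, '"')                 // undo the doubled quotes
                       .replace(/,\s*\]/g, ']');            // drop the trailing comma before ]
    var obj = JSON.parse(jsonText);
    for (var i = 0; i < obj.data.length; i++) {
        rows.push(m[1] + "," + obj.data[i].name + "," + obj.data[i].id);
    }
}
var csv = rows.join("\n");                                  // ready to save as output.csv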

If combining the two activities is too complex, I can always run it in two steps like this.....

Thanks!
leakim971:
I don't see the urls in your sample...
You may use something like this :
Additionally, if you want to save the content of the textarea to your computer, it is only possible in MS Internet Explorer: http://www.c-point.com/JavaScript/articles/file_access_with_JavaScript.htm

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
<script language="javascript" src="http://code.jquery.com/jquery-1.4.2.min.js"></script>
<script language="javascript">

	var json = {"data":[{"name":"Fan Name...","id":"123455767"},{"name":"Fan Name...","id":"123455767555777"},{"name":"Fan Name...","id":"1234557675656565"}]};

	$(document).ready(function() {
		for(var i=0;i<json.data.length;i++) {
			$("textarea").append( json.data[i].name + "," + json.data[i].id + "\n" );
		}
	});

</script>
</head>
<body>
<textarea cols="64" rows="64"></textarea>
</body>
</html>


Bryan71 (Asker):

leakim971, thanks for the contribution. My comment just above yours explained a simpler parsing request than my original post. I have an exact extract of the txt document that my original code outputs. So, let me restate: I need a parser that I can point at a specific text document structured like the sample, of varied length (an unknown number of fan names/ids per source url; see sample). When I run the parser, it writes the output to a csv file in 3 columns (the source url in front of each fan name/id that follows it). Just below my comment above, I put a sample of what the output of the parser would need to be.

Hopefully that helps describe what is needed.....Thanks!
>I have an exact extract of the txt document that my original code outputs
Yes, from the FB Graph API you get a JSON object that is easy to parse with Javascript.

Here is that original JSON again:

{
"data": [
{
"name": "Fan Name...",
"id": "123455767"
},
{
"name": "Fan Name...",
"id": "123455767555777"
},
{
"name": "Fan Name...",
"id": "1234557675656565"
}
]
}


In my html page, the same object is on line 9:


var json = {"data":[{"name":"Fan Name...","id":"123455767"},{"name":"Fan Name...","id":"123455767555777"},{"name":"Fan Name...","id":"1234557675656565"}]};


From this object we create CSV content ready to put in a file on IE using ActiveX : http://www.c-point.com/JavaScript/articles/file_access_with_JavaScript.htm
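
For illustration, a minimal sketch of that IE-only save step using the Scripting.FileSystemObject ActiveX object (the output path and the csvContent string are placeholders, and IE must allow ActiveX for this to run):

// Minimal IE-only sketch: write a CSV string to disk via ActiveX.
// The path below is hypothetical; adjust it to wherever the file should go.
function saveCsv(csvContent) {
    var fso  = new ActiveXObject("Scripting.FileSystemObject");
    var file = fso.CreateTextFile("C:\\folder1\\fans.csv", true);  // true = overwrite
    file.Write(csvContent);
    file.Close();
}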

If you open the page I provided, you will not see the URL part (the first column in your CSV content) because I don't know where you get it.
I saw you posted other content, but it's not easy to parse (for me); the original format from your first post is better because it's a true JSON object that is easy to parse with Javascript.



"https://longurlofsourcepage.com/1234444555543","{
   ""data"": [
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""100000012345""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""121212122255""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""188552255888""
      },
    ]
}"
"https://longurlofsourcepage.com/123998877445","{
   ""data"": [
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""5488996325874""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""959595858587874""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""323265659898774""
      },
    ]
}"
"https://longurlofsourcepage.com/4565446544457789","{
   ""data"": [
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""852852852852""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""7474745454565656""
      },
      {
         ""name"": ""blah blah blah..."",
         ""id"": ""85856565353535""
      },
    ]
}


Bryan71 (Asker):

leakim971, thanks for the additional explanation. A couple of comments: I need to do this for a group of our pages and want to do them all from a single spreadsheet or DB rather than one at a time. Also, I am not sure how to use this even for one page. I open it in IE... then how do I navigate to one of the graph pages to have it parse the fans and put them in a csv file?
please provide sample data excluding your code solution

1. a sample or mock-up of the raw data
2. what you want to do with the data (e.g. output the data as an html table, etc)
Bryan71 (Asker):

Here is a sample of the file that needs to be parsed (sample.txt).

Also, here is the desired output file (output.csv).

Thx!
sample.txt
Bryan71 (Asker):

Output file format from parsing...
output.csv
Bryan71 (Asker):

Again....the sample.txt file could have more than 3 urls with data...so the parser needs to keep parsing until it comes to the end of the .txt file.
Could you confirm you built the sample.txt file yourself? The same address appears four times instead of distinct addresses.
If yes, please confirm you agree with the following:


Clipboard01.jpg
Bryan71 (Asker):

Actually each URL will be different.......what is important is that the URL will be just before the fans for the page. When another URL is shown, then the following fans are from that page....until you reach the end of the file....
ok, are you ok with the picture ?
Bryan71 (Asker):

Sorry about the sample files.......I have corrected the sample.txt file to correctly reflect the URL changes...the output.csv file is correct.....sorry for any confusion. Here is a correctly matched set of input/output files.....


sample.txt
output.csv
Bryan71 (Asker):

leakim971:
ok, are you ok with the picture ?

Yes.....as long as the URL changes with each group of fans.....see corrected sample.txt file....and output.csv file. They are correctly matched now to perfectly represent the source file and output file.....thx!
The following code reads a file sample.txt on a server:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
<script language="javascript" src="http://code.jquery.com/jquery-1.4.2.min.js"></script>
<script language="javascript">
	$(document).ready(function() {
		$(document).ready(function() {
			$.get("sample.txt", function(data) {				
				data = data.replace(/""id"": ""|""data"": \[|""name"": ""|"",|"",\r|""\n/g,function($1) {
					return $1.split("\"\"").join("\"");
				});
				data = data.replace(/},\n\s+]/g, "}]").replace(/}"\x0d\x0a"|}"\x0d"|}"\x0a"/g, "},\"").split("\",\"{").join("\":{");
				data = "{" + data.substring(0, data.length-1) + "}";
				//$("textarea").append( data );
				var json = $.parseJSON( data );
				for(var j  in json) {
					url = json[j];
					for(var i=0;i<json[j].data.length;i++) {
						$("textarea").append( j + "," + json[j].data[i].name + "," + json[j].data[i].id + "\n" );
					}
				}
			});
		});
	});
</script>
</head>
<body>
<textarea cols="80" rows="16"></textarea>
</body>
</html>


Bryan71 (Asker):

Interesting. So if I have the HTML file.....what directory should the "sample.txt" file be in for the parsed text to appear in the browser window? Will it save it as a csv file, or would I then copy and paste it from the browser window into a spreadsheet? Just trying to establish how I would use this in my process....
Bryan71 (Asker):

Also, I noticed you have called out at the bottom of the code how many rows and columns. What happens if the parsed text is 15,000 rows long? There is no way to know how many rows until it is parsed.....
>Interesting. So if I have the HTML file.....what directory should the "sample.txt" file be in in order for the parsed text to appear in the browser window?
Same as the HTML file

>Will it save it as a csv file or would I then copy and paste it from the browser window into a spreadsheet?
That's the next step. If the current code works we can try to do better.


Bryan71 (Asker):

Not sure how to test this.......I pasted the text from the sample.txt into the window in IE and it just shows the pasted text not parsed....How can I test it?
>I pasted the text from the sample.txt into the window in IE and it just shows the pasted text not parsed....How can I test it?

Put your sample.txt file in the same folder as your HTML page and open the page. The HTML page will itself open the sample.txt file, parse it, and put the content in the initially empty textarea.
Bryan71 (Asker):

I put them both directly in the C:\ directory, launched in IE with ActiveX controls active......nothing happens. I'd love to give it a try to see how it works.....but the text box in IE just sits empty.....
>I put them both directly in the C:\ directory

You need a web server.
Bryan71 (Asker):

Yes....I loaded them on a webserver still with no result.

http://mototek.us/file/extract.html

http://mototek.us/file/sample.txt

Opened them in IE.....still an empty page. Any ideas?
ok, your sample.txt file is not the same, give me a few seconds to check it.
Bryan71 (Asker):

Yes.....the sample.txt could be thousands of lines long.....maybe hundreds of thousands........
Bryan71 (Asker):

and could have 100,000 fans for one page even....so the number of URLs and the number of Fans could be any number....
your copy/paste was bad, so the sample is not correct. I made a correction for the last char only:

<script language="javascript">

	$(document).ready(function() {
		$(document).ready(function() {
			$.get("sample.txt", function(data) {		
				data = data.replace(/""id"": ""|""data"": \[|""name"": ""|"",|"",\r|""\n/g,function($1) {
					return $1.split("\"\"").join("\"");
				});
				data = data.replace(/},\n\s+]/g, "}]").replace(/}"\x0d\x0a"|}"\x0d"|}"\x0a"/g, "},\"").split("\",\"{").join("\":{");
				data = "{" + data.substring(0, data.lastIndexOf("\"")) + "}";
				try {
					var json = $.parseJSON( data );
				}
				catch(e) {
					$("textarea").append( data );
				}
				for(var j  in json) {
					url = json[j];
					for(var i=0;i<json[j].data.length;i++) {
						$("textarea").append( j + "," + json[j].data[i].name + "," + json[j].data[i].id + "\n" );
					}
				}
			});
		});
	});

</script>


Bryan71 (Asker):

OK....I put the corrected code on the URLs above.....now it does show the original text file in the window.....but does not parse it yet.....any thoughts?
It's because there's an error in the file; you can check it with this tool: http://jsonlint.com/
Clipboard02.jpg
Bryan71 (Asker):

I'm not sure how the json tool helps me at this point.....

I did get the code to work on my computer with the original sample.txt file. I cannot get it to function when I multiply that file times 10....100....whatever. This is very cool so far....
>I'm not sure how the json tool helps me at this point.....
Copy the content of the textarea you get into the json tool's textarea and click on Validate.

>I did get the code to work on my computer with the original sample.txt file.
Yes, it works fine :)

>I cannot get it to function when I multiply that file times 10....100....whatever.
I understand your preoccupation is with the number of records, but it should not be. Your first big file was built by hand and I found errors in it. I hope the one you're trying is a true one...

>This is very cool so far....
If it works... To find where the error is, use the tool, or give me a true file to let me check it.
Bryan71 (Asker):

I just ran the script and have an updated sample output to parse. I was very careful only to make it anonymous (url, name, id)...otherwise it is identical output.....
newsample.txt
Works fine for me:


https://long.url.com/123456789/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name1,52525252525252
https://long.url.com/123456789/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name2,454545454545454
https://long.url.com/123456789/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name3,8787878778787
https://long.url.com/123456789/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,¿¿¿¿¿ ¿¿¿¿¿¿¿¿,36555225511111
https://long.url.com/123456789/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name5,100000444455454588
https://long.url.com/123456789/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name6,10000008528528527
https://long.url.com/87456922541253/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name1b,1000085555
https://long.url.com/87456922541253/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name2b,8585858566561323232
https://long.url.com/87456922541253/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name3b,1052366654441
https://long.url.com/87456922541253/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name4b,54545454545252525
https://long.url.com/87456922541253/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name5b,100077774445
https://long.url.com/65656565454545/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name1c,8522222555555
https://long.url.com/65656565454545/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name2c,8855221111447
https://long.url.com/65656565454545/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name3c,121212121212120
https://long.url.com/65656565454545/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name4c,5252552525545454
https://long.url.com/65656565454545/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name5c,45545455452521111
https://long.url.com/65656565454545/5555gg55gg55.147821596|S5W-NLcb5DufFYiNFp1c03yz4Jg,Person Name6c,152222222111445


ASKER CERTIFIED SOLUTION from leakim971 (solution content available to Experts Exchange members only)
Bryan71 (Asker):

Yes....I did get it to work as well! This is excellent! I can see why being very careful with the exact sample.txt file format is so important now.

Since I need to use this in a web browser (IE), is it hard to add 2 controls to it? One to "browse" for the .txt file to parse (instead of hard coding in the file name), and the second to "save as" and save the output as a .csv file when it is completed.

This is awesome.......

>is it hard to add 2 controls to it?

Check the link provided ID:33840387
Bryan71 (Asker):

Do I just paste the controls before and after the code above in the html file? This looks very cool......
no, too difficult to integrate, so forget it; just copy/paste into a text file and save it manually.
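
For context, the "browse" half of that request is the hard part in the IE versions of that era; in modern browsers it can be done with a plain file input and the File API. A hedged sketch, assuming an <input type="file" id="pick"> on the page and reusing the parsing logic above:

// Hedged sketch of the "browse" control using the HTML5 File API
// (modern browsers only; not the IE versions discussed in this thread).
document.getElementById("pick").addEventListener("change", function (evt) {
    var reader = new FileReader();
    reader.onload = function () {
        var fileText = reader.result;   // hand this string to the parsing code
    };
    reader.readAsText(evt.target.files[0]);
});
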
Bryan71 (Asker):

Great job. This gets me most of the way there....I appreciate the assistance.
Thanks for the points!
Bryan71 (Asker):

Hey leakim971, I did a very large extract of many URLs with thousands of fans.....and no matter what I do, it processes the file but when it is done it just shows the original file in the window (the parser must be finding something unexpected). Can I post the source file for you to look at to see if you can determine what is happening? Thanks!
For sure!
Bryan71 (Asker):

Thanks....

mototek.us/file/groups1-output-edited.txt
got it, you can remove it
Try this :


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
<script language="javascript" src="http://code.jquery.com/jquery-1.4.2.min.js"></script>
<script language="javascript">

	$(document).ready(function() {
		$(document).ready(function() {
			$.get("groups1-output-edited.txt", function(d) {
				var end = 0;
				// Walk the file block by block; each block starts at a "http url line.
				// Loop ends once the final block (which runs to the end of the file) is processed.
				while(end < d.length) {
					var str = end;
					end = d.indexOf("\"http", str + 1);
					if(end<0) end = d.length;
					var data = d.substring(str, end);
					data = data.replace(/""id"": ""|""data"": \[|""name"": ""|"",|"",\r|""\n/g,function($1) {
						return $1.split("\"\"").join("\"");
					});
					data = data.replace(/},\n\s+]/g, "}]").replace(/}"\x0d\x0a"|}"\x0d"|}"\x0a"/g, "},\"").split("\",\"{").join("\":{");
					data = "{" + data.substring(0, data.lastIndexOf("\"")) + "}";
					try {
						var json = $.parseJSON( data );
					}
					catch(e) {
						$("textarea:eq(1)").append( data );
					}
					for(var j  in json) {
						url = json[j];
						for(var i=0;i<json[j].data.length;i++) {
							$("textarea:eq(0)").append( j + "," + json[j].data[i].name + "," + json[j].data[i].id + "\n" );
						}
					}
				}
			});
		});
	});

</script>
</head>
<body>
<textarea cols="120" rows="16"></textarea><br />
<h1>Bad :</h1><br />
<textarea cols="120" rows="16"></textarea>
</body>
</html>


A console application with VB.NET :

(if you want the csv file, hire me or send me an email)
Imports System.IO

Module Module1

    Sub Main()
        Dim fs As New FileStream("D:\\groups1-output-edited.txt", FileMode.Open, FileAccess.Read)
        Dim d As New StreamReader(fs)
        d.BaseStream.Seek(0, SeekOrigin.Begin)
        Dim bytes(fs.Length) As Char
        Dim s As String
        s = d.Read(bytes, 0, fs.Length)
        d.Close()
        s = CStr(bytes)

        fs = New FileStream("D:\\groups1-output-edited.csv", FileMode.Create, FileAccess.Write)
        Dim w As New StreamWriter(fs)
        w.BaseStream.Seek(0, SeekOrigin.End)

        Dim fin As Integer = 0
        While fin <> -1
            Dim beg As Integer = fin
            fin = s.IndexOf("""" & "http", beg + 1)
            Dim data
            If fin >= 0 Then
                data = s.Substring(beg, fin - beg)
            Else
                data = s.Substring(beg)
            End If
            Dim lines() As String = Data.Split(Chr(10))
            If lines.Length > 6 Then
                Dim b = 1
                Dim e = lines(0).IndexOf("""" & "," & """" & "{") - 1
                Dim url As String = lines(0).Substring(b, e)
                For i As Integer = 3 To lines.Length - 3 Step 4
                    Dim name As String = lines(i)
                    If name.IndexOf("name") < 0 Then
                        name = ""
                        i -= 1
                    Else
                        b = name.IndexOf("""" & """" & "name" & """" & """" & ": " & """")
                        e = name.LastIndexOf("""" & """")
                        name = name.Substring(b + 12, e - b - 12)
                    End If
                    Dim id As String = lines(i + 1)
                    b = id.IndexOf("""" & """" & "id" & """" & """" & ": " & """")
                    e = id.LastIndexOf("""" & """")
                    id = id.Substring(b + 10, e - b - 10)

                    w.WriteLine(url & "," & name & "," & id)

                Next i
            End If
        End While

        w.Close()
    End Sub

End Module


Bryan71 (Asker):

Wow...that worked great.....I had waited overnight on the IE version and couldn't get it exported without freezing the browser.....

I loaded MS Visual Studio and ran this from there... I was not sure how else to execute it outside of that. It worked in 12 seconds.... I was beginning to think I needed a Perl script to do it fast.

When we need enhancements or other tools I would be happy to hire you on those projects....let me know how we would contact you...Thanks!
EE puts a Hire Me button in the member profile, don't hesitate!

>was not sure how else to execute it outside of that.
You can use the following :

example : C:\Projects\ConsoleApplication3\ConsoleApplication3\bin\Release>ConsoleApplication3 d:\groups1-output-edited.txt d:\groups1-output-edited.csv
Imports System.IO

Module Module1

    Sub Main(ByVal sArgs() As String)
        If sArgs.Length <> 2 Then
            Console.WriteLine("please provide source and target filenames")
        Else
            Try
                Dim fs As New FileStream(sArgs(0), FileMode.Open, FileAccess.Read)
                Dim d As New StreamReader(fs)
                d.BaseStream.Seek(0, SeekOrigin.Begin)
                Dim bytes(fs.Length) As Char
                Dim s As String
                s = d.Read(bytes, 0, fs.Length)
                d.Close()
                s = CStr(bytes)

                fs = New FileStream(sArgs(1), FileMode.Create, FileAccess.Write)
                Dim w As New StreamWriter(fs)
                w.BaseStream.Seek(0, SeekOrigin.End)

                Dim fin As Integer = 0
                While fin <> -1
                    Dim beg As Integer = fin
                    fin = s.IndexOf("""" & "http", beg + 1)
                    Dim data
                    If fin >= 0 Then
                        data = s.Substring(beg, fin - beg)
                    Else
                        data = s.Substring(beg)
                    End If
                    Dim lines() As String = data.Split(Chr(10))
                    If lines.Length > 6 Then
                        Dim b = 1
                        Dim e = lines(0).IndexOf("""" & "," & """" & "{") - 1
                        Dim url As String = lines(0).Substring(b, e)
                        For i As Integer = 3 To lines.Length - 3 Step 4
                            Dim name As String = lines(i)
                            If name.IndexOf("name") < 0 Then
                                name = ""
                                i -= 1
                            Else
                                b = name.IndexOf("""" & """" & "name" & """" & """" & ": " & """")
                                e = name.LastIndexOf("""" & """")
                                name = name.Substring(b + 12, e - b - 12)
                            End If
                            Dim id As String = lines(i + 1)
                            b = id.IndexOf("""" & """" & "id" & """" & """" & ": " & """")
                            e = id.LastIndexOf("""" & """")
                            id = id.Substring(b + 10, e - b - 10)

                            w.WriteLine(url & "," & name & "," & id)

                        Next i
                    End If
                End While
                w.Close()
            Catch e As FileNotFoundException
                Console.WriteLine("File not found, check filename and path")
            Catch e As IOException
                Console.WriteLine("IO Error, check filename and path")
            Catch e As UnauthorizedAccessException
                Console.WriteLine("IO Error, you're not allowed to use this path")
            Catch e As Exception
                Console.WriteLine(e)
            End Try
        End If

    End Sub

End Module
