  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 823

How do I extract data from my Gmail mailbox?

I want to extract links from emails in my Yahoo and Gmail accounts and save them in a text file.

How do I do that? Thanks for your help.
Asked by: wildgoose1
1 Solution
 
97WideGlide Commented:

You'll want to fetch your Gmail via POP, then use something like:

http://www.spadixbd.com/extractor/

I'm not sure whether Yahoo has POP access; I don't think it does. And I'm not sure how you would do it via the web interface.
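
If you'd rather script the POP step than go through a mail client, here is a minimal sketch in Python using only the standard library. It assumes POP has been enabled in the Gmail settings and that the account accepts the password you pass in (it may need an app-specific one). The function names `lines_to_message`, `links_in_message` and `fetch_links` are made up for this example, and the URL pattern is deliberately simple, so treat it as a starting point rather than a finished tool.

```python
import poplib
import re
from email import message_from_bytes
from email.message import Message

# Simple pattern for http/https links; an assumption, not a full URL grammar.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def _decode(payload: bytes, charset) -> str:
    """Decode a message part, falling back to UTF-8 for unknown charsets."""
    try:
        return payload.decode(charset or "utf-8", errors="replace")
    except LookupError:
        return payload.decode("utf-8", errors="replace")


def lines_to_message(lines: list[bytes]) -> Message:
    """Join the raw lines returned by POP3 RETR into a parsed message."""
    return message_from_bytes(b"\r\n".join(lines))


def links_in_message(msg: Message) -> list[str]:
    """Collect http(s) links from every text part of one message."""
    links = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        text = _decode(payload, part.get_content_charset())
        # Trim punctuation that usually belongs to the sentence, not the link.
        links.extend(m.rstrip(".,;:!?") for m in URL_RE.findall(text))
    return links


def fetch_links(user: str, password: str,
                host: str = "pop.gmail.com", port: int = 995) -> list[str]:
    """Download every message over POP3/SSL and return the links found."""
    conn = poplib.POP3_SSL(host, port)
    try:
        conn.user(user)
        conn.pass_(password)
        count, _size = conn.stat()
        links = []
        for i in range(1, count + 1):
            _resp, lines, _octets = conn.retr(i)
            links.extend(links_in_message(lines_to_message(lines)))
        return links
    finally:
        conn.quit()
```

Calling `fetch_links("you@gmail.com", "your-password")` and writing the result to a file, one link per line, would give the text file asked for in the question. Whether the same approach works for Yahoo depends on whether that account offers POP at all.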
 
wildgoose1 (Author) Commented:
Thanks for your comment. This works with MS Outlook and Outlook Express, but I need something that can just access my Gmail and pull out the information I need.

Is there any way to do that without using Outlook?
 
97WideGlide Commented:
Hm, I'll be interested to hear other suggestions, but it seems to me that you would either have to click through your emails manually or have some bot do it for you. I'm not sure what bots are available to do that.
Of course, you could have a custom app do it, but that's another story.
 
wildgoose1 (Author) Commented:
Is there any way to do it with Thunderbird?
 
wildgoose1 (Author) Commented:
Or export all the emails to a single text file and extract the links from there?
 
97WideGlide Commented:
I'm not familiar with Thunderbird, but I'm pretty sure it would work just fine. You just have to find where, and in what format, Thunderbird stores your downloaded messages, then scrape the links out.

Depending on how much work you want to do, here is a link that might help your search:

http://schf.uc.org/articles/2007/02/14/scraping-gmail-with-mechanize-and-hpricot

Seems like other people have wanted to do the same thing.
I haven't found any off-the-shelf solutions yet.
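
For the Thunderbird route, a rough sketch of the "scrape the links out" step in Python is below. It assumes Thunderbird is using its default mbox storage, where each folder is a single file (for example one named `Inbox`) inside the profile's mail directory. The path you pass in, the function names `extract_links_from_mbox` and `save_links`, and the simple URL pattern are all assumptions for illustration.

```python
import mailbox
import re

# Simple pattern for http/https links; an assumption, not a full URL grammar.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def _decode(payload: bytes, charset) -> str:
    """Decode a message part, falling back to UTF-8 for unknown charsets."""
    try:
        return payload.decode(charset or "utf-8", errors="replace")
    except LookupError:
        return payload.decode("utf-8", errors="replace")


def extract_links_from_mbox(mbox_path: str) -> list[str]:
    """Return the unique links found in all text parts, in first-seen order."""
    seen = set()
    links = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            for part in msg.walk():
                if part.get_content_maintype() != "text":
                    continue
                payload = part.get_payload(decode=True)
                if payload is None:
                    continue
                text = _decode(payload, part.get_content_charset())
                for match in URL_RE.findall(text):
                    link = match.rstrip(".,;:!?")
                    if link not in seen:
                        seen.add(link)
                        links.append(link)
    finally:
        box.close()
    return links


def save_links(links: list[str], out_path: str) -> None:
    """Write one link per line to a text file."""
    with open(out_path, "w", encoding="utf-8") as fh:
        for link in links:
            fh.write(link + "\n")
```

Something like `save_links(extract_links_from_mbox("/path/to/profile/Mail/.../Inbox"), "links.txt")` would then produce the text file; the path is a placeholder you would replace with wherever your own profile keeps its mail.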
 
wildgoose1 (Author) Commented:
That looks interesting, but I have no idea what mechanize and hpricot are.
How do I run that script? Any help with that would be awesome, thanks!
 
wildgoose1 (Author) Commented:
Just being able to extract all the emails in their entirety from Thunderbird or Gmail directly would be great.
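
One way to get everything into a single text file, sketched in Python: read a Thunderbird folder file (assumed to be in the default mbox format) and write each message's sender, subject and plain-text body to one output file. The names `message_text` and `export_mbox_to_text` are invented for this example, and it only keeps two headers and the `text/plain` parts, so HTML-only messages and attachments are skipped.

```python
import mailbox


def message_text(msg) -> str:
    """Return the concatenated text/plain parts of one message."""
    chunks = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label; fall back to UTF-8.
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


def export_mbox_to_text(mbox_path: str, out_path: str) -> int:
    """Write all messages to one text file and return how many were written."""
    box = mailbox.mbox(mbox_path, create=False)
    count = 0
    try:
        with open(out_path, "w", encoding="utf-8", errors="replace") as out:
            for msg in box:
                out.write(f"From: {msg.get('From', '')}\n")
                out.write(f"Subject: {msg.get('Subject', '')}\n\n")
                out.write(message_text(msg))
                out.write("\n" + "-" * 60 + "\n")
                count += 1
    finally:
        box.close()
    return count
```

The resulting file can then be searched for links with any text tool or a short script.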
 
97WideGlide Commented:
Try it and see if it works with Thunderbird.
 
wildgoose1 (Author) Commented:
Try what, exactly?
 
97WideGlide Commented:
Download your email using Thunderbird.
Download the link extractor which I mentioned in the first post.
Point the link extractor at your downloaded email.
See if it works for you.

This isn't really something that I do on a regular basis, and I'm afraid I can't give you much help getting it done. I can only point you in the direction of what I might do. The heavy lifting is up to you.
 
wildgoose1 (Author) Commented:
It only works with MS Outlook and Outlook Express.
So I'm back to square one.

I'm quite prepared to do the heavy lifting, as you put it, but I need pointing in the direction of solving this problem, not round in circles.
 
97WideGlide Commented:
According to their website, the software works with any type of file, and they list quite a few in detail:

Search and extract (http, ftp, email, news, phone, fax) links or text from any type of file with this powerful, automated link or URL extractor.
Extract Link is a powerful, highly accurate, fast threaded link extractor utility to search and extract link (http, ftp, email, news, phone, fax) from any type of file (Html, Word, Excel, executables, ZIP, and so on). User can save the results in text or excel file and the output file can then be easily imported in any complex database tool as desire. It presents results in link, base, domain separately and supports link compare, URL extraction depth, duplicate link/base removal, domain check list, filters, etc.

This unique link extractor can also search inside archived files (ace, arc, arj, bh, cab, jar, lha, rar, tar, zip, etc.) for specific type of link and extract them.

Can you please tell me why it only works with MS Outlook and Outlook Express for you?

Download the extractor?
Click on "new search"?
Find where Thunderbird stores your messages?
   http://email.about.com/cs/mozillatips/qt/et082002.htm
Point the extractor at that directory and define the type of links you want to extract?
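
If the extractor really won't cooperate, the last step above (point something at the directory and pick the link types) can be approximated with a short Python script. This is only a sketch: the `LINK_PATTERNS` table, the `scan_directory` function and the choice to skip `.msf` index files are assumptions for illustration, and the patterns are much cruder than a dedicated tool's. It reads each file as raw text, so links inside encoded message bodies may be missed.

```python
import os
import re

# Rough patterns for the link types of interest (illustrative only).
LINK_PATTERNS = {
    "http": re.compile(r"https?://[^\s<>\"')\]]+"),
    "ftp": re.compile(r"ftp://[^\s<>\"')\]]+"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def scan_directory(root: str, kinds=("http",),
                   skip_suffixes=(".msf",)) -> list[str]:
    """Walk a directory tree and return unique links of the chosen kinds."""
    seen = set()
    found = []
    for dirpath, dirs, files in os.walk(root):
        dirs.sort()  # make the traversal order predictable
        for name in sorted(files):
            if name.endswith(skip_suffixes):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="replace") as fh:
                    text = fh.read()  # whole file in memory; fine for a sketch
            except OSError:
                continue  # unreadable file, move on
            for kind in kinds:
                for match in LINK_PATTERNS[kind].findall(text):
                    link = match.rstrip(".,;:!?")
                    if link not in seen:
                        seen.add(link)
                        found.append(link)
    return found
```

For example, `scan_directory("/path/to/thunderbird/profile/Mail", kinds=("http", "ftp"))` would return the web and FTP links, which can then be written to a text file one per line. The path is a placeholder.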
