• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 96
  • Last Modified:

How to format data in an excel spreedsheet so it shows a unique list of data in the following format

my goal: to export a list of email addresses from the sent items folder from my outlook 2010 and format the data into this format so i could import it into my firewall whitelist as a .txt file:

Allow, *@mxtoolbox.com, *
Allow, *@amazon.com, *
Allow, *@flip.com, *
Allow, *@demo.com, *
Allow, *@gmail.com, *
Allow, *@yahoo.com, *


I was able to export the email addresses from all the domains i have sent to (that were in my sent items folder in outlook 2010)

I did this by using this process:

Basically you can use the Outlook Import/Export function to achieve this. It's typically used to migrate the actual emails, but it suits the purpose of getting just the email addresses (or any other value for that matter).

Here is how to do it in Outlook (version 2007):

Click on 'File', 'Import and Export' to open the Wizard
Select 'Export to a File' [then click Next]
Select 'Microsoft Excel 97-2003' or a CSV file [click Next]
Select an Outlook folder [Next]
Click on Browse to select a location and then type the file name to export the information to (terminated by .xls) [Next]
Click on Map Custom Fields. This will bring up a list of all the available fields that are available in that folder.
Since we are only interested in Email address, click on Clear Map and then from the Left side click on From: (address) and drag that to the Right list
Click OK
Click Finish
Open the output file using Excel

but upon export via Outlook 2010 to a .xls file the output has given me a lot of extra data such as the following:

/o=yahoo/ou=first administrative group/cn=Recipients/cn=Joe
/o=yahoo/ou=first administrative group/cn=Recipients/cn=Mike
/o=yahoo/ou=first administrative group/cn=Recipients/cn=Bill
/o=yahoo/ou=first administrative group/cn=Recipients/cn=Ron
Helen@aol.com
Helen@aol.com
Helen@aol.com;Sarah@aol.com
Tim@mxtoolbox.com
Jen@amazon.com
Mark@flip.com;Joe@flip.com


As you could see there are many problems with this data:

1) All email addresses need to be removed from this data
2) the duplicate email addresses need to be removed from not only the entire column but also from the same cell.
3) out of the remaining data a list of unique domains names needs to be created
4) this unique list of domain names needs to be put in this format as a .txt file:


Allow, *@amazon.com, *
Allow, *@flip.com, *
Allow, *@aol.com, *
Allow, *@gmail.com, *
Allow, *@yahoo.com, *


Can anyone help me figure out how to do this in excel or do you know of any other third party software that will get the same?

Thank you in advance for your time!
0
IT_Field_Technician
Asked:
IT_Field_Technician
  • 7
  • 3
  • 2
1 Solution
 
Rgonzo1971Commented:
Hi,

pls try the code on the exported list in Excel it creates a new workbook that you can save as txt

Sub macro()
Dim Res() As Variant
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
Application.ScreenUpdating = False
Dim regExp As Object
Set regExp = CreateObject("vbscript.regexp")

With regExp
    .Pattern = "\@[^;]+"
    .Global = True
    For Each c In Range(Range("A1"), Range("A" & Rows.Count).End(xlUp))
        Set matches = .Execute(c)
        For Each Match In matches
            If Not dict.exists(Match.Value) Then
                dict.Add Match.Value, Match.Value
            End If
        Next
    Next
End With

Res = dict.Items()
Workbooks.Add
For Idx = 0 To UBound(Res)
    Range("A1").Offset(Idx) = "Allow, *" & Res(Idx) & ", *"
Next
Application.ScreenUpdating = True

End Sub

Open in new window

Regards
0
 
KoenCommented:
besides the technical solution... whitelisting domains is almost never a good solution:
- you are continuously updating the list
- basic spammer and spoofers use hotmail.com or yahoo.com which you will probably have in your whitelist for business reasons anyway

A far better way is to check mail validity at your mailgate, i.e. check whether the targetted e-mail exists on your domain, if not bounce with (or without) notification, then pass the rest through a decent spam filter.
0
 
IT_Field_TechnicianAuthor Commented:
thank you Rgonzo1971 i will try this.

Koen - I am very interested in this - can you please elaborate - is this a feature in exchange or something? Does this approach have an official technical name?

thank you both!
0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 
IT_Field_TechnicianAuthor Commented:
Rgonzo1971 - how do i execute this script?

I believe I need to create a new module but I am not sure what to do after that?
0
 
IT_Field_TechnicianAuthor Commented:
OK Rgonzo1971 - I was able to press the Run icon in the VBA editor and the code WORKED perfectly!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Thanks you so much!

Is there a way to edit the code so it would automatically remove common email addresses?
0
 
IT_Field_TechnicianAuthor Commented:
Is there a way to edit the code so it would automatically remove common email addresses?

Like:

live.com
email.com
yahoo.com
gmail.com
paypal.com
ect...
0
 
Rgonzo1971Commented:
then try
Sub macro()
Dim Res() As Variant
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
Application.ScreenUpdating = False
Dim regExp As Object
Set regExp = CreateObject("vbscript.regexp")

With regExp
    .Pattern = "(\@)([^;]+)"
    .Global = True
    For Each c In Range(Range("A1"), Range("A" & Rows.Count).End(xlUp))
        Set matches = .Execute(c)
        For Each Match In matches
            If Not dict.exists(Match.submatches(1)) Then
                dict.Add Match.submatches(1), Match.submatches(1)
            End If
        Next
    Next
End With

Res = dict.Items()
Workbooks.Add
Range("A1").Resize(UBound(Res) + 1) = WorksheetFunction.Transpose(Res)
Application.ScreenUpdating = True

End Sub

Open in new window

0
 
KoenCommented:
yes, it is about handling your mail in different steps, where the first step is the 'Before Arrival' test.
This is a topic typically threated in discussions about backscatter (if you don't do anything here, your mail server will send NDR's to the (often innocent domains being abused by spammers).

Typically a Before Arrival test will drop (not bounce) any mail send from the outside to a non-existing e-mail address, this avoiding sending 'false NDR's' (downside is that the few real mistakes (typo's) are dropped also, but in my opinion it does not outweigh the advantages).

there are 3th party products out there (no advertising), that can do it for you (no need to change your Exchange environment), but there is also elements you can do on the Exchange itself (without extra investment).

I am however not a techie... so I cannot tell you exactly what and how... sorry

but google on 'avoiding backscatter' , 'recipient filtering' and you should find plenty of info.

in this question on this forum you can also find some explanations and links on recipient filtering
0
 
IT_Field_TechnicianAuthor Commented:
Rgonzo1971

That script did execute without any errors but the results were not what i was looking for.

It gave me results of a bunch of email addresses first along with the data that was un-needed after that.

The first VBA code seemed to work perfectly but i need VBA code that will at that point take the results in this format:

Allow, *@domain1.com, *
Allow, *@flip.com, *
Allow, *@domain2.com, *
Allow, *@gmail.com, *
Allow, *@yahoo.com, *
Allow, *@domain3.com, *
Allow, *@domain4.com, *
Allow, *@aol.com, *
Allow, *@domain5.com, *

and remove any lines/cells that have commonly spoofed email domains such as:

live.com
email.com
yahoo.com
gmail.com
paypal.com
ect...

(I will enter my own list if you provide1-3 examples domains in the code you provide)
but yet keep the same format of the results such as:

Allow, *@domain1.com, *
Allow, *@domain2.com, *
Allow, *@domain3.com, *
Allow, *@domain4.com, *
Allow, *@domain5.com, *

Thank you again for all your help!!
0
 
IT_Field_TechnicianAuthor Commented:
Understood Koen Thanks for the information!
0
 
Rgonzo1971Commented:
Corrected code
Sub macro()
Dim Res() As Variant
Dim regExp As Object
Dim dict As Object
Dim dictEx As Object
Set dict = CreateObject("Scripting.Dictionary")
Set dictEx = CreateObject("Scripting.Dictionary")
Set regExp = CreateObject("vbscript.regexp")

strExceptionList = "live.com,email.com,yahoo.com,gmail.com,paypal.com"

Application.ScreenUpdating = False

For Each itm In Split(strExceptionList, ",")
    If Not dictEx.exists(itm) Then dictEx.Add itm, itm
Next
With regExp
    .Pattern = "(\@)([^;]+)"
    .Global = True
    For Each c In Range(Range("A1"), Range("A" & Rows.Count).End(xlUp))
        Set matches = .Execute(c)
        For Each Match In matches
            If Not dict.exists(Match.Value) And Not dictEx.exists(Match.submatches(1)) Then
                dict.Add Match.Value, Match.Value
            End If
        Next
    Next
End With

Res = dict.Items()
Workbooks.Add
For Idx = 0 To UBound(Res)
    Range("A1").Offset(Idx) = "Allow, *" & Res(Idx) & ", *"
Next
Application.ScreenUpdating = True

End Sub

Open in new window

0
 
IT_Field_TechnicianAuthor Commented:
It worked thanks!
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

  • 7
  • 3
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now