Flag or remove emails, obscured emails, or telephone numbers using PHP

hi

I'm writing a system that's free to use but requires users to subscribe before they can send messages containing contact info such as email or phone numbers.

Looking for an @ or (at) is obviously quite simple, and should flag accounts that try that method to circumvent subscribing so we can check them. Does anyone have any ideas how I could also check for phone numbers this way, for example by flagging 4 or more digits in a row?

I'm using PHP by the way.

Thanks
Neil

Sam Wallis

You should use regular expressions; in PHP, that means preg_match().

Julian Hansen

What if I send this

john(dot)smith(at)somewhere(dot)com
Or
john.smith @ somewhere . com

There are many ways of obscuring something - how meticulous do you want to get?

Neil Thompson

ASKER

Hi Julian,

There is obviously an nth degree to which people can go to try to get round it, but I just want to catch a fair few "chancers". Anyone using @ I can obviously grab, as most people won't use it in an "about me" text.

I can also check for the biggies (gmail, Hotmail, yahoo, .uk, .com, (dot), (.)), but it would also be handy to look for groups of digits, say 4 or more in a row, such as 07941 111222.
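
A minimal sketch of the "4 or more digits in a row" check, assuming spaces, dots, and hyphens between digits should be ignored. The function name `containsDigitRun` and its default threshold are made up for illustration and are not from this thread.

```php
<?php
// Hypothetical helper: true when $text contains a run of $minDigits or more
// digits, ignoring spaces, dots, and hyphens that sit between two digits.
function containsDigitRun(string $text, int $minDigits = 4): bool
{
    // Collapse separators so "07941 111222" and "07941-111-222" look alike.
    $collapsed = preg_replace('/(?<=\d)[\s.\-]+(?=\d)/', '', $text);

    // Look for a run of at least $minDigits consecutive digits.
    return preg_match('/\d{' . $minDigits . ',}/', $collapsed) === 1;
}

var_dump(containsDigitRun('call me on 07941 111222'));  // bool(true)
var_dump(containsDigitRun('I have 2 cats and 3 dogs')); // bool(false)
```

A check like this will also trip on years and prices, so it is probably better suited to flagging for review than to silent blocking.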

Julian Hansen

Then you should be looking at using regular expressions.
preg_match()

The next question is - do you want to block the message from being sent OR do you want to remove the offending content and send the modified message on?

Neil Thompson

ASKER

Ideally just remove it and send the message on, as that avoids the need for user intervention.
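
A rough sketch of the "remove and send" idea using preg_replace() with an array of patterns. The helper name `redactContactInfo`, the placeholder text, and the two patterns are assumptions for illustration; they only catch plain email addresses and digit runs, not obscured forms like (at) or (dot).

```php
<?php
// Hypothetical helper: replaces obvious contact details with a placeholder.
function redactContactInfo(string $text, string $placeholder = '[removed]'): string
{
    $patterns = [
        // Plain email addresses such as bob@testemail.co.uk
        '/[A-Z0-9._%+\-]+@[A-Z0-9.\-]+\.[A-Z]{2,}/i',
        // 4 or more digits, allowing a single space or hyphen between digits
        '/\d(?:[ \-]?\d){3,}/',
    ];

    // preg_replace() applies each pattern in turn to the result of the last.
    return preg_replace($patterns, $placeholder, $text);
}

echo redactContactInfo('Mail bob@testemail.co.uk or call 07912 123456');
// Mail [removed] or call [removed]
```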

gr8gonzo

My two cents - it's not worth it to try to analyze every message for these patterns. Like other people have said, if someone WANTS to send this information, then there are too many ways around it. The simplest workaround a sender can use is to split the content across several messages:

ME: My email is
ME: johnsmith
ME: at gmail
ME: dot com

Since the messages could be spread out, and earlier portions could already be in the hands of the recipient by the time enough data has accumulated to recognize an attempt to send an email address, it's virtually impossible to stop without human moderation (the concept of the "radio broadcast delay", where an "editor" has 2-3 seconds to review message content and flag or block it before the data makes it to the recipient).

If you wanted this kind of human-based, delayed moderation, you'd need to implement something that looks for keywords in conversations, so that the human moderator only has to review content that is potentially in violation of your terms and conditions. For example, look for @ and # and ( ) characters, and keywords like "phone", "phon", "fone", "ph#", etc., and if you find those, simply flag the conversation to begin moderation so that the messages are routed through a human moderator first (until the moderator senses that the conversation is okay and turns off the flag).
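
As a rough sketch, the trigger check could look something like this. The function name `needsModeration` is hypothetical, and the trigger list only covers the characters and keyword variants mentioned above; storing the flag and routing messages to a moderator are left out.

```php
<?php
// Hypothetical helper: should this message switch its conversation into
// moderated mode? Deliberately over-eager, since a human reviews the result.
function needsModeration(string $message): bool
{
    // Trigger characters: @ # ( )  (the # also covers "ph#")
    if (preg_match('/[@#()]/', $message) === 1) {
        return true;
    }

    // Keyword fragments: "phon" covers phone/phon, "fone" covers fone.
    return preg_match('/phon|fone/i', $message) === 1;
}

var_dump(needsModeration('whats ur fone number'));   // bool(true)
var_dump(needsModeration('nice to meet you, Bob'));  // bool(false)
```

Substring matching like this will also flag words such as "symphony", which is acceptable only because a moderator makes the final call.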

If you don't have the human resources, then your best bet might be to limit the number of messages or the overall number of words exchanged between two people. In theory, most people need to have some minimum amount of trust in a person before they exchange contact information, and that trust is normally gained by conversation. So after a certain amount of conversation, cut it off unless one of them is a subscriber (don't force both of them to be subscribers - if this is a matchmaking chat type of service, your site's success depends on enabling connections, so providing the least resistance to continuing a successful connection is in your best interest).
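
The cut-off rule might be sketched like this. The function name, its parameters, and the limit of 20 messages are all placeholders for illustration; how the message count and subscription status are stored is up to your system.

```php
<?php
// Hypothetical check: may another message be sent in this conversation?
// $messageCount is the number of messages exchanged so far between the pair.
function canContinueConversation(
    int $messageCount,
    bool $senderSubscribed,
    bool $recipientSubscribed,
    int $freeLimit = 20
): bool {
    // Only one of the two participants needs to be a subscriber.
    if ($senderSubscribed || $recipientSubscribed) {
        return true;
    }

    // Otherwise allow messages only until the free limit is reached.
    return $messageCount < $freeLimit;
}

var_dump(canContinueConversation(5, false, false));  // bool(true)
var_dump(canContinueConversation(20, false, false)); // bool(false)
var_dump(canContinueConversation(20, false, true));  // bool(true)
```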

Neil Thompson

ASKER

Thanks all,

Taking on board your great comments, I'm going to try to just sniff out the obvious and flag that for moderation before delivery.

The code below obviously matches @, but how can I add more things to check, or do I need a separate preg_match for every one I want?

Ideally I would like something like this but it obviously doesn't work:
preg_match('/(@)(hotmail)(gmail)(079)/', $userText , $matches, PREG_OFFSET_CAPTURE);

<?php
$userText = "hi, my name is bob send me an email bob @ hotmail .(dot) co DOT UK or call 07912 123456 or bob@testemail.co.uk";
preg_match('/(@)/', $userText , $matches, PREG_OFFSET_CAPTURE);
print_r($matches);
?> 

Array
(
    [0] => Array
        (
            [0] => @
            [1] => 55
        )

    [1] => Array
        (
            [0] => @
            [1] => 55
        )

)
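
One possible way to check several things in a single pattern is regex alternation with `|`, combined with preg_match_all() so that every hit is collected rather than just the first. This is only a sketch (the accepted answer is not visible here), and the list of alternatives is an example, not a complete filter.

```php
<?php
$userText = "hi, my name is bob send me an email bob @ hotmail .(dot) co DOT UK or call 07912 123456 or bob@testemail.co.uk";

// One pattern, several alternatives separated by |.
// The i flag makes the match case-insensitive; \d{4,} is 4 or more digits.
$pattern = '/@|hotmail|gmail|yahoo|\(dot\)|\d{4,}/i';

// preg_match() stops at the first hit; preg_match_all() finds them all
// and returns the number of matches.
$count = preg_match_all($pattern, $userText, $matches);

print_r($matches[0]);
```

For the sample text this finds six matches: "@", "hotmail", "(dot)", "07912", "123456", and the second "@". The bare "DOT" is not caught because the pattern only lists the parenthesised form.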


ASKER CERTIFIED SOLUTION

gr8gonzo

Neil Thompson

ASKER

Excellent, many thanks for your code and thoughts