Verifying email value syntax

Can someone tell me how a column could be checked to verify that it's contents which "should" be email addresses contain the @ character and at least one period?  Thanks!!
BasssqueAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

slightwv (䄆 Netminder) Commented:
Quick and dirty but won't work on everything:
case when regexp_like(email_address,'^.+@.+\..+$') then 'Good' else 'Bad' end
slightwv (䄆 Netminder) Commented:
There are a LOT of email address regular expression validators out there.  Just pick one and use it.  You might need to tweak it a little to make the syntax Oracle compliant but most regex's can be used.
slightwv (䄆 Netminder) Commented:
This one is a lot better than my first one but still won't get all bad data:
case when regexp_like(email_address,'^[^@.]+@[^@.]+\.[^@.]+$') then 'Good' else 'Bad' end
Your Guide to Achieving IT Business Success

The IT Service Excellence Tool Kit has best practices to keep your clients happy and business booming. Inside, you’ll find everything you need to increase client satisfaction and retention, become more competitive, and increase your overall success.

BasssqueAuthor Commented:
how about this one?
case when REGEXP_LIKE (CNT1_EMAIL,'^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}$')
slightwv (䄆 Netminder) Commented:
>>how about this one?

Does it work for you?

I saw that one as well but it is OLD.  There are so many new top level domains, that regex won't work for them.

http://data.iana.org/TLD/tlds-alpha-by-domain.txt

If I created:
slightwv@i_drink.beer

Would it show 'Good' or 'Bad'?
BasssqueAuthor Commented:
Can't get either of them to work, here is an example

SELECT
STUDENTS.STUDENT_NUMBER,
STUDENTS.LAST_NAME,
STUDENTS.FIRST_NAME,
STUDENTS.SCHOOLID,
STUDENTS.GRADE_LEVEL,

case when REGEXP_LIKE (CNT1_EMAIL,'^[^@.]+@[^@.]+\.[^@.]+$')
                        then ''
                        else CNT1_EMAIL || '<-- invalid_format' end CNT1_EMAIL

FROM
"PS"."STUDENTS" "STUDENTS"
LEFT JOIN "PS"."CONTACTS" "CONTACTS" ON "STUDENTS"."DCID" = "CONTACTS"."STUDENTSDCID"

WHERE
(
REGEXP_LIKE (CNT1_EMAIL,'^[^@.]+@[^@.]+\.[^@.]+$') AND
(CONTACTS.CNT1_EMAIL IS NOT NULL) and (CONTACTS.CNT1_EMAIL != ' ')
)
slightwv (䄆 Netminder) Commented:
What isn't working?  do you get an error or just no data?

If you use the regexp_like in the where clause, then everything returned must match the pattern so, you should not get any 'invalid' ones?


>>(CONTACTS.CNT1_EMAIL IS NOT NULL) and (CONTACTS.CNT1_EMAIL != ' ')

Make it one using trim.  No need to check for a single space.  What if there are two or ten spaces in the field?
(trim(CONTACTS.CNT1_EMAIL) IS NOT NULL)
BasssqueAuthor Commented:
I wasn't getting any data before, that works great.  Good point!
The regex statement outputs any addresses with periods before the @ sign though which is unwanted.  Can any periods before the @ be ignored?
slightwv (䄆 Netminder) Commented:
>>Can any periods before the @ be ignored?

You can include/ignore anything you want.

As far as periods before the '@', is this not a valid email address:
jane.doe@some.domain.com
BasssqueAuthor Commented:
it's valid
I'm only looking to output invalid addresses though
slightwv (䄆 Netminder) Commented:
OH, I was confused.  I didn't test 'jane.doe@some.domain.com' with my regex.

I'll see what I can do.
BasssqueAuthor Commented:
this is what I have so far btw
just need to ignore any periods before the @ character :-)

case when REGEXP_LIKE (CNT1_EMAIL,'^[^@.]+@[^@.]+\.[^@.]+$') then ''
                        when instr(CONTACTS.CNT1_EMAIL,'@')=0 then 'missing_@'
                        when instr(CONTACTS.CNT1_EMAIL,'.')=0 then 'missing_period'
                        else CNT1_EMAIL || '<-- invalid_format' end CNT1_EMAIL
Alex [***Alex140181***]Software DeveloperCommented:
Taken from various sources:

http://emailregex.com/

General Email Regex (RFC 5322 Official Standard)
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

Open in new window


Oracle PL/SQL Email Regex
SELECT email 
FROM table_name
WHERE REGEXP_LIKE (email, '[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,4}');

Open in new window


Other:

http://nerd.net.au/oracle/49-apex-application-express/validation-regex/108-email-regex-validation
^[a-zA-Z0-9][a-zA-Z0-9\.\-]{1,}@[a-zA-Z0-9]{1}[a-zA-Z0-9\.\-]{1,}\.{1}[a-zA-Z]{2,4}$

Open in new window


Basics:
https://galobalda.wordpress.com/2013/07/10/an-attempt-to-validate-email-addresses-with-a-regular-expression/

General (awesome site for testing regular expressions):
https://regex101.com/
slightwv (䄆 Netminder) Commented:
Try this one.  Again, far from perfect but I've not really seen a really good one.

case when regexp_like(email_address, '^[^@]+\.?[^@]*@[^@.]+\.[^@.]+$' ) then 'Good' else 'Bad' end
slightwv (䄆 Netminder) Commented:
Alex,
Most of those aren't valid.  They are all old.  I didn't test the RFC one but the others are out of date.
BasssqueAuthor Commented:
works well, still looking through all of the data
only found one issue so far
if the address contains more than two periods after the @ then it's included in the results which should not be the case
for example first.last@school1.miami.fl.us is valid
slightwv (䄆 Netminder) Commented:
Try this tweak:
 regexp_like(email_address, '^[^@]+\.?[^@]*@[^@.]+(\.[^@.]+)+$' )

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
BasssqueAuthor Commented:
Fantastic!! :-)

This is what I have right now, I can't thank you enough for your help!

case when REGEXP_LIKE (CNT1_EMAIL, '^[^@]+\.?[^@]*@[^@.]+(\.[^@.]+)+$' ) then ''
                        when instr(CONTACTS.CNT1_EMAIL,'@')=0 then CNT1_EMAIL || ' <-- missing_@'
                        when instr(CONTACTS.CNT1_EMAIL,'@')>1 then CNT1_EMAIL || ' <-- multiple_addresses'
                        when instr(CONTACTS.CNT1_EMAIL,'.')=0 then CNT1_EMAIL || ' <-- missing_period'
                        else CNT1_EMAIL || ' <-- invalid_format' end CNT1_EMAIL
BasssqueAuthor Commented:
found one last issue
if the address contains multiple periods before the @ then it's output as invalid
slightwv (䄆 Netminder) Commented:
There will be more, remember, what I posted won't catch everything.

I'm not going to be able to post something that will catch everything.

This appears to fix the q..@q.q issue:
regexp_like(email_address, '^([^@.]+\.?)+@[^@.]+(\.[^@.]+)+$')
BasssqueAuthor Commented:
Doesn't like that one
never completes and doesn't output any field values
slightwv (䄆 Netminder) Commented:
I ran it against 11.2.0.2.  I can't remember your Oracle version.

I'll see if I can come up without the new group.
BasssqueAuthor Commented:
12
slightwv (䄆 Netminder) Commented:
Try my stand-alone test case.  I just tried it with 12.1.0.2 and it ran fine:
with mydata as (
select null email_address from dual
union all
select 'q@q@q.q' email_address from dual
union all
select 'q..@q.q' email_address from dual
union all
select 'q.@q.q' email_address from dual
union all
select 'q@q.q' email_address from dual
union all
select 'q.q.q@q.q.q' email_address from dual
union all
select 'q.q@q.q' email_address from dual
union all
select 'q.q.q.q.q@q.q.q.q.q' email_address from dual
union all
select 'q@q..q' email_address from dual
union all
select 'qqq' email_address from dual
)
select email_address,
case when
regexp_like(email_address, '^([^@.]+\.?)+@[^@.]+(\.[^@.]+)+$')
then 'Good' else 'Bad' end
from mydata
/

Open in new window

BasssqueAuthor Commented:
Can it be modified to only output the invalid values though?
slightwv (䄆 Netminder) Commented:
Sure.

Add it to the where clause:

select ...
where ...
and NOT regexp_like ...
BasssqueAuthor Commented:
Throws an invalid operational error message
where
(CONTACTS.CNT1_EMAIL NOT REGEXP_LIKE '^[^@]+\.?[^@]*@[^@.]+(\.[^@.]+)+$' )
slightwv (䄆 Netminder) Commented:
You need the complete regexp_like function call...

select ...
from ...
where ...
and NOT REGEXP_LIKE (CNT1_EMAIL, '^[^@]+\.?[^@]*@[^@.]+(\.[^@.]+)+$' )
BasssqueAuthor Commented:
If I do that then it gives me all the null fields from the <-- missing_@ statement
even with an is not null in the where statement
slightwv (䄆 Netminder) Commented:
>>If I do that then it gives me all the null fields from

I didn't say it was the ONLY thing in the where clause, if you don't want nulls, add to the where clause.

select ...
from ...
where ...
and NOT REGEXP_LIKE (CNT1_EMAIL, '^[^@]+\.?[^@]*@[^@.]+(\.[^@.]+)+$' )
and trim(CNT1_EMAIL) is not null
...
BasssqueAuthor Commented:
NICE!
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Query Syntax

From novice to tech pro — start learning today.