Solved

C# Regex: Extract domain only

Posted on 2013-01-13
Last Modified: 2013-01-13
Hi people, I am trying to get a regex working.

I want to extract the protocol and the domain from URLs found in a text using a regex.

Example:

---1---
http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx
>>> Extracting = http (protocol)
>>> Extracting = derekslager.com


---2---
http://pew.no-ip.biz/layouts/15/start.aspx#/anyingstrangelink
>>> Extracting = http (protocol)
>>> Extracting = pew.no-ip.biz

---3---
http://cdn.http.pri.streamotor.com/1/2/video-sd.mp4?Expires=1350310145&Key-Pair-Id=APKAJUAT6SMTUDSHTR3Q&Signature=fQ2bVVTZDdtoaFsc41cnR0GgoA2Y
>>> Extracting = http (protocol)
>>> Extracting = cdn.http.pri.streamotor.com

---4---
http://www.google.com (note the www prefix, which should be dropped)
>>> Extracting = http (protocol)
>>> Extracting = google.com

I tried to achieve this with the regex below, but it fails on example 2. I haven't tried examples 3 and 4 yet.

            var regex = new Regex(@"(((file|gopher|news|nntp|telnet|http|ftp|https|ftps|sftp)://)|(www\.))+(([a-zA-Z0-9\._-]+\.[a-zA-Z]{2,6})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))(/[a-zA-Z0-9\&%_\./-~-]*)?");

Thanks
Question by:chugarah

Accepted Solution

by: käµfm³d
I'd say start with this:

var regex = new Regex("(?<=(file|gopher|news|nntp|telnet|http|ftp|https|ftps|sftp)://)[^/]+");

...and then write rules in C# code to determine what you want to filter out of the host value that is returned. It will be much easier to write, read, and understand.
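
As a rough illustration of that approach, here is a minimal sketch. It is a variation on the pattern above, not the exact one: it uses named groups instead of a lookbehind so that the protocol and the host both come back from a single match, and it also excludes whitespace from the host so a match stops at the end of a URL embedded in text. The class name DomainExtractor, the method TryExtract, and the group names are hypothetical, chosen only for this example. The one filtering rule shown (dropping a leading "www.") is written in plain C#.

```csharp
using System;
using System.Text.RegularExpressions;

static class DomainExtractor
{
    // Assumption: named groups replace the lookbehind of the pattern above,
    // and \s is excluded so the host stops at whitespace in surrounding text.
    private static readonly Regex UrlRegex = new Regex(
        @"(?<protocol>file|gopher|news|nntp|telnet|http|ftp|https|ftps|sftp)://(?<host>[^/\s]+)",
        RegexOptions.IgnoreCase);

    // Finds the first URL in the text. Returns false when none is found.
    public static bool TryExtract(string text, out string protocol, out string domain)
    {
        protocol = null;
        domain = null;

        Match m = UrlRegex.Match(text);
        if (!m.Success)
            return false;

        protocol = m.Groups["protocol"].Value;
        domain = m.Groups["host"].Value;

        // Filtering rule in C# rather than in the regex: drop a leading "www."
        if (domain.StartsWith("www.", StringComparison.OrdinalIgnoreCase))
            domain = domain.Substring(4);

        return true;
    }

    static void Main()
    {
        string[] samples =
        {
            "http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx",
            "http://pew.no-ip.biz/layouts/15/start.aspx#/anyingstrangelink",
            "http://cdn.http.pri.streamotor.com/1/2/video-sd.mp4?Expires=1350310145",
            "see http://www.google.com for details"
        };

        foreach (string sample in samples)
        {
            string protocol;
            string domain;
            if (DomainExtractor.TryExtract(sample, out protocol, out domain))
                Console.WriteLine(protocol + " -> " + domain);
        }
    }
}
```

For the four sample strings this should print the protocol http followed by derekslager.com, pew.no-ip.biz, cdn.http.pri.streamotor.com and google.com. The sketch does not handle ports or user info in the host (for example user@host:8080), nor a URL that has a query string or fragment directly after the host with no path. Those cases would need further rules of the same kind.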

Author Closing Comment

by:chugarah
Thanks, a good start