I've got hundreds of domains all over the world, under various TLDs and ccTLDs, plus a list of their subdomains (thousands). I want to use a script to match each entry against the list of TLDs and cut any subdomains off the front of the input...
TLDs (for this example; see the attached txt for the full list):
and so on. Sometimes I've got subdomains, other times not, so regex *seems* out of the question because the number of dots varies so much (anywhere from 2 to 6 dots per entry) in the domain list.
So I was thinking: read each line of the domain list (above), match a TLD at the end and put that aside, then match anything left of the TLD up to one dot or the beginning of the line (if no dot is found), and then combine the two into one whole domain (name plus TLD).
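The steps above could be sketched roughly like this in Python. This is only a minimal sketch under some assumptions: the TLD list has one entry per line (such as `com` or `co.uk`, with or without a leading dot), and the function names `load_tlds` and `base_domain` and the sample hostnames are made up for illustration. It tries the longest possible suffix first, so a multi-part entry like `co.uk` is preferred over plain `uk`.

```python
def load_tlds(lines):
    """Normalise TLD entries: lowercase, no leading dot, blank lines skipped."""
    return {line.strip().lower().lstrip(".") for line in lines if line.strip()}


def base_domain(host, tlds):
    """Return the one label left of the longest matching TLD, plus that TLD.

    Returns None when no listed TLD matches or nothing precedes the TLD.
    """
    labels = host.strip().lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest,
    # so "co.uk" is matched before "uk".
    for i in range(1, len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in tlds:
            return labels[i - 1] + "." + suffix
    return None


# Hypothetical sample data; in practice both lists would be read from the files.
tlds = load_tlds(["com", ".co.uk", "uk", "de"])
hosts = ["www.example.com", "mail.eu.example.co.uk", "example.de", "localhost"]

# Collect into a set so each domain appears only once, however many subdomains it had.
unique = sorted({d for d in (base_domain(h, tlds) for h in hosts) if d})
print(unique)
```

Because the entry is split on dots and compared label by label, the number of dots in an entry doesn't matter, and no regex is needed. Entries that match no listed TLD come back as `None`, so they can be logged and checked by hand instead of being dropped silently.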
Again, I've got thousands of subdomains and hundreds of TLDs. Our registrar is a mess because it's not letting us export them as just the TLDs :( We can only export the DNS records for some reason (and we want to leave this registrar).
I am attaching a list of valid TLDs (ccTLD and gTLD). I'd like the script to read from that and from the other file, strip off any subdomains, and leave me with just the base domains (name plus TLD), like