john lambert
 asked on

Regex: how do I include some characters and exclude others?

I use this tool, Word List Updater 2.7.
All I want is to filter out (exclude) all email domains and characters like these: ®©�ØÇÖÄüèöµÃ‖|¦ and to keep (include) these: :^/\\,.+ .;
The code below excludes these too, which I don't want: ^/\\,.+:;
^[^/\\{«»„““”‘’|\n\t….,;`^"<>'}+:?®©�ØÇÖÄüèöµÃ‖|¦]*$

This is the list I want to filter:

john>123
john:123
john;123
john/123
john@123
john!123
john#123
john%123
john^123
john=123
john*1
john&1
john+1
john\123
èáàúùóò
acan
itoh
ö
ü


Ã
Ä
Ö
Ç
Ø
RE
�^�O�OsG���
w���n���
john-123
john_123
marcy
µ
john
marcy
michael
test
&amp;lt;
&amp;lt
&amp;gt;
&amp;gt
&amp;lt;&amp;gt;
&amp;
&amp

^
��
¦
johnny$1234
john~123
john)123
john(123



So the final list must look like this:

john>123
john:123
john;123
john/123
john@123
john!123
john#123
john%123
john^123
john=123
john*1
john&1
john+1
john\123
acan
itoh
RE
john-123
john_123
marcy
john
marcy
michael
test
^
johnny$1234
john~123
john)123
john(123
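
One way to read the two lists above is as a whitelist rather than a blacklist: keep a line only if every character is printable ASCII. The sketch below is in PowerShell and rests on two assumptions that are not stated in the question: that the range `\x21-\x7E` is the intended set of allowed characters, and that lines starting with `&` are entity residue to be dropped. `$sample` and `$keepPattern` are illustrative names.

```powershell
# Hypothetical sample taken from the list in the question.
$sample = @('john:123', 'john^123', 'john\123', 'èáàúùóò', 'ö', '', '&amp;lt;', 'µ', 'marcy', '^')

# Whitelist: one or more printable ASCII characters (0x21-0x7E) and nothing else.
# Assumption: a leading '&' marks entity residue, so (?!&) rejects those lines.
$keepPattern = '^(?!&)[\x21-\x7E]+$'

# -cmatch is the case-sensitive form of -match, so the range is taken literally.
$kept = $sample | Where-Object { $_ -cmatch $keepPattern }
$kept
# -> john:123, john^123, john\123, marcy, ^
```

Because the pattern lists what is allowed, new kinds of garbage characters are rejected without having to be added to the class one by one.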


Regular Expressions


8/22/2022 - Mon
Rgonzo1971

Hi,

Please try:


^[^{«»„““”‘’|\n\t…`"<>'}?®©�ØÇÖÄüèöµÃ‖|¦]*$ 


Regards
john lambert

ASKER
No, that doesn't work; this is the result. Please try the same tool I use; it is easy to find and clean.

john/123
test:123
test;123
john@123
john!123
john#123
john%123
john^123
john=123
john*1
john&1
john+1
john\123
èáàúùóò
acan
itoh
ö
ü


Ã
Ä
Ö
Ç
Ø
RE
�^�O�OsG���
w���n���
john-123
john_123
marcy
µ
john
marcy
michael
test
&amp;lt;
&amp;lt
&amp;gt;
&amp;gt
&amp;lt;&amp;gt;
&amp;
&amp

^
��
¦
johnny$1234
john~123
john)123
john(123


john lambert

ASKER
Or I can give you mine.
Rgonzo1971

Sorry can't help further
john lambert

ASKER
OK, no problem. Thanks anyway.
SubSun

Are you reading from a text file? If so, I can try a PowerShell script to achieve the result.
john lambert

ASKER
Yes, I use that tool with a huge text file (8 GB).
SubSun

Try the following code. It works for the input and output data posted in the question.
GC C:\temp\input.txt | ?{$_ -match "^(?!(&|\?)).*([$>:;/@!#%^=*&+\\\-_$~)(]).*(?<!;)$|^[a-zA-Z].*[a-zA-Z]$"} | Out-File C:\temp\output.txt


john lambert

ASKER
In your result, this must disappear too: �^�O�OsG���
and these 2 lines too:

john@yahoo.com
john@live.com

It can handle huge text files, right? 8-10 GB?
I added 2 new lines, 2 emails, so: no emails in the output.

john/123
test:123
test;123
john@123
john@yahoo.com
john@live.com
john!123
john#123
john%123
john^123
john=123
john*1
john&1
john+1
john\123
�^�O�OsG���
john-123
john_123
^
johnny$1234
john~123
john)123
john(123


SubSun

Try..
GC C:\temp\input.txt | ?{$_ -match "^(?!(&|\?|�)).*([$>:;/@!#%^=*&+\\\-_$~)(]).*(?<!;)$|^[a-zA-Z].*[a-zA-Z]$"} | Out-File C:\temp\output.txt


john lambert

ASKER
Yes, much better. One last favour: please make the emails disappear.

john@yahoo.com
john@live.com
john lambert

ASKER
So the final list must look like this:

Input list:

john/123
test:123
test;123
john@123
john!123
john@yahoo.com
john@live.com
john#123
john%123
john^123
john=123
john*1
john&1
john+1
john\123
èáàúùóò
acan
itoh
ö
ü



Ã
Ä
Ö
Ç
Ø
RE

�^�O�OsG���
w���n���
john-123
john_123
marcy
µ
john
marcy
michael
test
&amp;lt;
&amp;lt
&amp;gt;
&amp;gt
&amp;lt;&amp;gt;
&amp;
&amp

^
��
¦
johnny$1234
john~123
john)123
john(123





output list:
---------------
john/123
test:123
test;123
john@123
john!123
john#123
john%123
john^123
john=123
john*1
john&1
john+1
john\123
acan
itoh
RE
john-123
john_123
marcy
john
marcy
michael
test
johnny$1234
john~123
john)123
john(123


SubSun

Try..
GC C:\temp\input.txt | ?{$_ -match "^(?!(&|\?|�)).*([$>:;/@!#%^=*&+\\\-_$~)(]).*(?<!;)$|^[a-zA-Z].*[a-zA-Z]$" -and $_ -notmatch '\w+@\w+\.\w+'} | Out-File C:\temp\output.txt
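
A note on the `-notmatch '\w+@\w+\.\w+'` clause added above: it drops any line containing word characters, an `@`, more word characters, a dot, and more word characters. That is why `john@123` survives (no dot after the `@`) while `john@yahoo.com` does not. Since `\w` does not match `-`, a hyphenated domain would probably slip through. The sketch below compares the thread's pattern with a broader variant; `$broaderPattern` and the address `john@my-mail.com` are assumptions for illustration, not something posted in the thread.

```powershell
# Sample lines; 'john@my-mail.com' is a hypothetical address with a hyphenated domain.
$lines = @('john@123', 'john@yahoo.com', 'john@live.com', 'john@my-mail.com', 'john!123')

# Pattern from the thread: word chars, '@', word chars, '.', word chars.
$threadPattern = '\w+@\w+\.\w+'

# Broader variant (an assumption, not from the thread): allows '-' in the
# domain and '.', '+', '-' in the local part.
$broaderPattern = '[\w.+-]+@[\w-]+(\.[\w-]+)+'

$keptByThread  = $lines | Where-Object { $_ -notmatch $threadPattern }
$keptByBroader = $lines | Where-Object { $_ -notmatch $broaderPattern }

$keptByThread    # -> john@123, john@my-mail.com, john!123
$keptByBroader   # -> john@123, john!123
```

Neither pattern is a full email validator; both only aim to recognise lines that look like addresses in a word list.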


john lambert

ASKER
Yes, perfect! Just to make everything perfect, could you modify your script a bit to do this too?

1. Handle huge files (10, 20, 50 GB)?
2. Can you remove extra spaces?
3. Can you remove duplicates from huge files using this regex?


For example, input:
john
john
john^123
  money
carlos 123
  marcos 123
john&123
john&123



output:
------------
john
john^123
money
carlos123 
marcos123
john&123


SubSun

1. Handle huge files (10, 20, 50 GB)?
Not sure, I have not tested it.
2. Can you remove duplicates from huge files using this regex?

3. Can you remove extra spaces?
GC C:\temp\input.txt | ?{$_ -match "^(?!(&|\?|�)).*([$>:;/@!#%^=*&+\\\-_$~)(]).*(?<!;)$|^[a-zA-Z].*[a-zA-Z]$" -and $_ -notmatch '\w+@\w+\.\w+'} | %{$_.Trim()} | Select -Unique | Out-File C:\temp\output.txt
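
On the untested huge-file question: the sketch below is one possible way to stream the file line by line and deduplicate with a hash set instead of `Select -Unique`, which may become slow on very large inputs. It has not been run on multi-gigabyte files. The function name `Invoke-WordListFilter` and its parameters are hypothetical, and the hash set still keeps every unique line in memory, so a file with tens of gigabytes of distinct lines might not fit.

```powershell
# Hypothetical helper: filter, trim, and deduplicate a word list in one pass.
# Pass full paths; .NET file APIs do not follow the PowerShell current location.
function Invoke-WordListFilter {
    param(
        [string]$InputPath,
        [string]$OutputPath,
        [string]$KeepPattern,   # a line must match this to be kept
        [string]$DropPattern    # a line matching this is dropped (e.g. emails)
    )

    # Ordinal, case-sensitive set of lines already written.
    $seen   = New-Object 'System.Collections.Generic.HashSet[string]'
    $writer = New-Object System.IO.StreamWriter($OutputPath)
    try {
        # ReadLines yields one line at a time instead of loading the whole file.
        foreach ($line in [System.IO.File]::ReadLines($InputPath)) {
            if ($line -notmatch $KeepPattern) { continue }
            if ($line -match $DropPattern)    { continue }

            $trimmed = $line.Trim()
            # Add() returns $false when the line was seen before.
            if ($seen.Add($trimmed)) { $writer.WriteLine($trimmed) }
        }
    }
    finally {
        $writer.Dispose()
    }
}

# Example call (paths from the thread; patterns as posted above):
# Invoke-WordListFilter -InputPath 'C:\temp\input.txt' -OutputPath 'C:\temp\output.txt' `
#     -KeepPattern '<keep regex from the post above>' -DropPattern '\w+@\w+\.\w+'
```

Note that `StreamWriter` writes UTF-8 by default, whereas `Out-File` in Windows PowerShell writes UTF-16, so the output encoding differs from the one-liner.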


john lambert

ASKER
It doesn't work; I can't see these lines:
input
-------
  money
carlos 123
  marcos 123



Output (these 3 lines don't appear):
----------
money
carlos 123
 marcos 123


SubSun

Try..
GC C:\temp\input.txt | ?{$_ -match "^(?!(&|\?|�)).*([$>:;/@!#%^=*&+\\\-_$~)(\s]).*(?<!;)$|^[a-zA-Z].*[a-zA-Z]$" -and $_ -notmatch '\w+@\w+\.\w+'} | %{$_.Trim()} | Select -Unique | Out-File C:\temp\output.txt


john lambert

ASKER
I can see in your snapshot:

carlos 123
marcos 123

I want:

carlos123
marcos123

No extra spaces at all.

My output is this; I used your last string:

john/123
test:123
test;123
john@123
john^123
john!123
john#123
john%123
john=123
john*1
john&1
john+1
john\123
acan
itoh
RE
john-123
john_123
marcy
john
michael
test
^
johnny$1234
john~123
john)123
john(123
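
On the "no extra spaces at all" request above: `Trim()` only strips whitespace at the start and end of a line, which is why `carlos 123` keeps its inner space. A minimal sketch of removing every whitespace character with `-replace` follows; it is not necessarily what the accepted answer does, and the variable names are illustrative.

```powershell
# Sample lines from the asker's example input.
$lines = @('  money', 'carlos 123', '  marcos 123', 'john&123')

# Trim() removes only leading and trailing whitespace.
$trimmedOnly = $lines | ForEach-Object { $_.Trim() }

# -replace with '\s+' removes every run of whitespace, including inner spaces.
$noSpaces = $lines | ForEach-Object { $_ -replace '\s+', '' }

$trimmedOnly   # -> money, carlos 123, marcos 123, john&123
$noSpaces      # -> money, carlos123, marcos123, john&123
```

In the pipeline from the earlier posts, the `-replace` step would take the place of `%{$_.Trim()}` and should run before `Select -Unique`, so that lines differing only in spacing collapse into one.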


ASKER CERTIFIED SOLUTION
SubSun

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
john lambert

ASKER
Oh my God! That was hard work; you have nerves of steel, SubSun. It seems to remove duplicates as well, amazing. I just hope it can handle HUGE text files like 8 GB or 20 GB. I will test tomorrow with huge text files; I hope it works fine.
john lambert

ASKER
Working perfectly, thank you. Just tell me: does it handle huge text files?
SubSun

I just tested the code I posted in my last comment, and I am not getting �^�O�OsG��� in the output.
john lambert

ASKER
No, my mistake: the code is PERFECT. CONGRATULATIONS! 1000 THANKS!
You worked a lot. God bless you!
john lambert

ASKER
Working perfectly. Thanks for your hard work, God bless you.