john lambert
asked on
Regex code to filter this type of combo email
Is it possible to write regex code to filter this type of combo email?
Input:
3:awhitcomb@gmail.com:aberw:0x77E61D83DD3D2961059CC734E58E644852D172CC:''
1st output:
awhitcomb@gmail.com:0x77E61D83DD3D2961059CC734E58E644852D172CC
2nd output:
aberw:0x77E61D83DD3D2961059CC734E58E644852D172CC
ASKER
It doesn't work for me..
.*?:(.*?):(.*?):(.*?):
Use group 1 and 3 for first output
Use group 2 and 3 for second output
HTH,
Dan
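A minimal sketch of the group idea outside the asker's tool, assuming a POSIX shell with `sed -E` available. The function names `email_hash` and `username_hash` and the sample line are made up for illustration. POSIX extended regex has no non-greedy `.*?`, so each field is written as `[^:]*` instead, which stops at the next colon in the same way.

```shell
# Hypothetical helpers; the field layout id:email:username:hash:rest is assumed.
# Group 1 = email, group 2 = username, group 3 = hash.

# First output: groups 1 and 3 (email:hash)
email_hash() {
  sed -E 's/^[^:]*:([^:]*):([^:]*):([^:]*):.*$/\1:\3/'
}

# Second output: groups 2 and 3 (username:hash)
username_hash() {
  sed -E 's/^[^:]*:([^:]*):([^:]*):([^:]*):.*$/\2:\3/'
}

# Made-up sample line, not real data
printf '%s\n' "7:user@example.com:alice:0xABC123:''" | email_hash
printf '%s\n' "7:user@example.com:alice:0xABC123:''" | username_hash
```

Both functions read standard input and write standard output, so a large file can be streamed through them line by line, for example `email_hash < input.txt > emailhash.txt`. Lines that do not match the pattern pass through unchanged.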
ASKER
Well, neither of these 2 codes works in my tool, Word List Updater 2.7..ufff
ASKER
I don't know why it isn't working... can we try this one too, please??
Input:
179819085:best-boy23@seznam.cz:0179819085:548af13bdd5fc92c120491aef92f9a22:John:John:Pepa:1975-08-08:37:M:65:969:900563
Output:
best-boy23@seznam.cz:548af13bdd5fc92c120491aef92f9a22
ASKER
.*?:(.*?:).*?:(.*?):.*
ASKER
test content:
Here is my cleaning tool. It is very small, you can download it and try. It doesn't work:
https://www.sendspace.com/file/5v7gw2
john@yahoo.com:lazio:::lazio99:::juventus@1
john@yahoo.com:lazio:::lazio99
3:awhitcomb@gmail.com:aberw:0x77E61D83DD3D2961059CC734E58E644852D172CC:''
john@yahoo.com:lazio:lazio99:juventus@@
dgfd
Your tool can't use multiple groups.
Why don't you use Notepad++ or EditPad or any other editor with a proper Regex implementation?
Anyway, you can obtain all 3 groups in your tool by using this retain pattern:
.*?:(.*?:.*?:.*?):.*
ASKER
Why? Because this tool supports 10000000000 GB txt files (for example, my file is 7 GB), and also because it is very, very fast and never gives me errors.
OK. Then after the first step (when you obtain awhitcomb@gmail.com:aberw:0x77E61D83DD3D2961059CC734E58E644852D172CC) use the following retain pattern to get the second option (aberw:0x77E61D83DD3D2961059CC734E58E644852D172CC):
.*?:(.*)
For the first option I don't know how you can get it. Using this as remove pattern almost works, but deletes the : also:
:.*?:
ASKER
With the first code I got this:
awhitcomb@gmail.com:aberw:0x77E61D83DD3D2961059CC734E58E644852D172CC:''
but the second code doesn't work, I tried all 4 options.
ASKER
anyone? ufff
I'll have to meditate a bit about that, but here are two tips I can offer for now:
- The x64 version of Notepad++ supports very big files, too.
- The free tool Expresso is wonderful for developing and testing regular expressions (even though it doesn't support testing on gigabyte-sized data piles ...)
ASKER
It doesn't support my huge 16 GB list, it says too big. Does it support at least 6 GB?
ASKER
First I want this:
Input:
139903:gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc:Comandante Ch?:::1962-04-22:50:M:66:729:67793
Output:
gcoquio@surfeu.ch:8c5b7bb6042110ac96c9ae351dbd7fbc
ASKER
Or split it step by step. First this:
139903:gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc
then this:
gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc
then this:
guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc
then this:
gcoquio@surfeu.ch::8c5b7bb6042110ac96c9ae351dbd7fbc
ASKER
This (^(.*?):(.*?)) works to cut this. Input:
139903:gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc:Comandante Ch?:::1962-04-22:50:M:66:729:67793
output:
gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc:Comandante Ch?:::1962-04-22:50:M:66:729:67793
Can you please send a sample file of a few dozen lines?
If all the lines have the same number of fields, with : as delimiter, this looks like a job for awk.
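Before relying on fixed field positions, it may be worth checking whether every line really has the same number of fields. A rough sketch, assuming a POSIX shell with awk and sort. The function name `field_counts` and the sample input are hypothetical.

```shell
# Hypothetical helper: prints "<number of fields> <number of lines>" pairs
# for colon-delimited input read from standard input.
field_counts() {
  awk -F: '{ count[NF]++ } END { for (n in count) print n, count[n] }' | sort -n
}

# Made-up sample: two lines with 4 fields, one line with 2 fields
printf '%s\n' 'a:b:c:d' 'e:f:g:h' 'i:j' | field_counts
```

If the result is a single row, every line has the same layout and plain field numbers are safe to use. Note that empty fields, as in `:::`, still count as fields.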
ASKER
By the way, I am using Windows, not Linux.
OK, so far I have cut the first field and everything from ::: onward. Here is what I have now:
gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc:Comandante Ch?
helil38@yahoo.de:paar48of:4708471dadc188ddc28ca02ad3203c00:Ilse Stelz
fabpatsmash2000@yahoo.com:fabpatsmash:8155adbe513300ff7b12a99ee717d12e:Fabpatsmash
nosekeponer_666@hotmail.com:devil_cara:e6b28db7802e90c9372b2069ed9b3e47:Javi
maisesap@hotmail.com:diablilla1:06c0681c95e6d499eb653073e1ed4bb5:Carmen
eloy_malaguita@hotmail.com:cojone:50fccc7c8b7417438df5e34d019c9036:Yo
chikito200@hotmail.com:ckikito200:45c8d284d70198a61b45524a9ce29795:Preguntamelo
kicks_r_us@hotmail.com:oulanem:81b74ebe1e8baae94d4f6c3d1e82673a:Oulanem
zerdesht03@hotmail.com:xolefize:e10adc3949ba59abbe56e057f20f883e:Ahmed
ropa60@hotmail.com:249886.scheda.batty35030:b51418d07c873d79c1ac21e5d9ea0dd0:Anto76
Hmmm - how about this regex:
(\d*?:){0,1}(.*?@.*?):(.*?):(.*?):
First result would be $2:$4
Second result would be $3:$4
Snippet from Expresso attached ...
Seems to work fine with the last sample text, too.
Expresso-Sample.png
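The point of the optional first group is that the same pattern handles lines with and without the leading numeric id. A sketch of that idea, assuming a POSIX shell with `sed -E`. The function name `email_hash_optional_id` and the sample lines are made up, and `[^:]*` stands in for the non-greedy `.*?`, which POSIX extended regex does not have.

```shell
# Hypothetical helper: group 1 = optional "digits:" prefix, group 2 = email,
# group 3 = username, group 4 = hash. Prints email:hash ($2:$4 above).
email_hash_optional_id() {
  sed -E 's/^([0-9]*:)?([^:@]*@[^:]*):([^:]*):([^:]*):.*$/\2:\4/'
}

# Made-up sample lines: one with a leading id, one without
printf '%s\n' '9:user@example.com:alice:abc123:Name' | email_hash_optional_id
printf '%s\n' 'user@example.com:alice:abc123:Name' | email_hash_optional_id
```

Replacing `\2:\4` with `\3:\4` would give the username:hash variant. Lines that do not match are passed through unchanged, so odd records remain visible in the output.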
Who cares what you're using? Windows 10 has bash shell, you can download gawk for windows from here:
http://gnuwin32.sourceforge.net/packages/gawk.htm
The beauty of awk is that it works with fields, like Excel. For example, if sterge.txt has the content that you posted above,
awk 'BEGIN { FS=":" } { print $1 }' sterge.txt
will print
gcoquio@surfeu.ch
helil38@yahoo.de
fabpatsmash2000@yahoo.com
nosekeponer_666@hotmail.com
maisesap@hotmail.com
eloy_malaguita@hotmail.com
chikito200@hotmail.com
kicks_r_us@hotmail.com
zerdesht03@hotmail.com
ropa60@hotmail.com
awk 'BEGIN { FS=":" } { print $1":"$3 }' sterge.txt
will print
gcoquio@surfeu.ch:8c5b7bb6042110ac96c9ae351dbd7fbc
helil38@yahoo.de:4708471dadc188ddc28ca02ad3203c00
fabpatsmash2000@yahoo.com:8155adbe513300ff7b12a99ee717d12e
nosekeponer_666@hotmail.com:e6b28db7802e90c9372b2069ed9b3e47
maisesap@hotmail.com:06c0681c95e6d499eb653073e1ed4bb5
eloy_malaguita@hotmail.com:50fccc7c8b7417438df5e34d019c9036
chikito200@hotmail.com:45c8d284d70198a61b45524a9ce29795
kicks_r_us@hotmail.com:81b74ebe1e8baae94d4f6c3d1e82673a
zerdesht03@hotmail.com:e10adc3949ba59abbe56e057f20f883e
ropa60@hotmail.com:b51418d07c873d79c1ac21e5d9ea0dd0
ASKER
C:\Program Files (x86)\GnuWin32\bin>awk 'BEGIN { FS=":" } { print $1 }' sterge.txt
awk: 'BEGIN
awk: ^ invalid char ''' in expression
In Windows awk you're stuck with double quotes:
D:\portables\Gnu utilities\bin>awk "BEGIN { FS=""":""" } { print $1 }" sterge.txt
D:\portables\Gnu utilities\bin>awk "BEGIN { FS=""":""" } { print $1""":"""$2 }" sterge.txt
gcoquio@surfeu.ch:guillaume_tell
helil38@yahoo.de:paar48of
fabpatsmash2000@yahoo.com:fabpatsmash
nosekeponer_666@hotmail.com:devil_cara
maisesap@hotmail.com:diablilla1
eloy_malaguita@hotmail.com:cojone
chikito200@hotmail.com:ckikito200
kicks_r_us@hotmail.com:oulanem
zerdesht03@hotmail.com:xolefize
ropa60@hotmail.com:249886.scheda.batty35030
ASKER
The output is this:
john@yahoo.com
john@yahoo.com
3
139903
john@yahoo.com
dgfd
Not good, I need
email:hash
then.......
username:hash
ASKER
And this doesn't work to filter email:pass. I also need an output.txt to save the result. The file is 16 GB.
That's why I asked for a sample file, to see the input.
If what you posted after I asked is the input, then:
email:hash
D:\portables\Gnu utilities\bin>awk "BEGIN { FS=""":""" } { print $1""":"""$3 }" sterge.txt
gcoquio@surfeu.ch:8c5b7bb6042110ac96c9ae351dbd7fbc
helil38@yahoo.de:4708471dadc188ddc28ca02ad3203c00
fabpatsmash2000@yahoo.com:8155adbe513300ff7b12a99ee717d12e
nosekeponer_666@hotmail.com:e6b28db7802e90c9372b2069ed9b3e47
maisesap@hotmail.com:06c0681c95e6d499eb653073e1ed4bb5
eloy_malaguita@hotmail.com:50fccc7c8b7417438df5e34d019c9036
chikito200@hotmail.com:45c8d284d70198a61b45524a9ce29795
kicks_r_us@hotmail.com:81b74ebe1e8baae94d4f6c3d1e82673a
zerdesht03@hotmail.com:e10adc3949ba59abbe56e057f20f883e
ropa60@hotmail.com:b51418d07c873d79c1ac21e5d9ea0dd0
username:hash
D:\portables\Gnu utilities\bin>awk "BEGIN { FS=""":""" } { print $2""":"""$3 }" sterge.txt
guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc
paar48of:4708471dadc188ddc28ca02ad3203c00
fabpatsmash:8155adbe513300ff7b12a99ee717d12e
devil_cara:e6b28db7802e90c9372b2069ed9b3e47
diablilla1:06c0681c95e6d499eb653073e1ed4bb5
cojone:50fccc7c8b7417438df5e34d019c9036
ckikito200:45c8d284d70198a61b45524a9ce29795
oulanem:81b74ebe1e8baae94d4f6c3d1e82673a
xolefize:e10adc3949ba59abbe56e057f20f883e
249886.scheda.batty35030:b51418d07c873d79c1ac21e5d9ea0dd0
ASKER
Oh my god, it is working! But can you save the result to ''output.txt'', please? The result is huge.
OK, so with this sample it works perfectly:
helil38@yahoo.de:paar48of:4708471dadc188ddc28ca02ad3203c00:Ilse Stelz
But first I had to cut the lines using a code for Word List Updater 2.7. Can you try to make it work with this original sample?
139903:gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc:Comandante Ch?:::1962-04-22:50:M:66:729:67793
Thank you, you won 90% of the battle, you're great, I love you man!!!
Since you only have an additional field, add 1 to the fields in the awk commands:
sterge.txt:
139903:gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc:Comandante Ch?:::1962-04-22:50:M:66:729:67793
D:\portables\Gnu utilities\bin>awk "BEGIN { FS=""":""" } { print $3""":"""$4 }" sterge.txt
guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc
D:\portables\Gnu utilities\bin>awk "BEGIN { FS=""":""" } { print $2""":"""$4 }" sterge.txt
gcoquio@surfeu.ch:8c5b7bb6042110ac96c9ae351dbd7fbc
Want to save the output?
D:\portables\Gnu utilities\bin>awk "BEGIN { FS=""":""" } { print $2""":"""$4 }" sterge.txt > output.txt
ASKER
Saving the output works perfectly, but the results are not OK. I need 2 codes, one to save email:hash and one to save username:hash, with the above original sample.
??
awk "BEGIN { FS=""":""" } { print $2""":"""$4 }" sterge.txt > emailhash.txt
awk "BEGIN { FS=""":""" } { print $3""":"""$4 }" sterge.txt > usernamehash.txt
ASKER
No no, look, this is the original sample. Try to see if you can split using the original sample, the one from the very beginning:
139903:gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc:Comandante Ch?:::1962-04-22:50:M:66:729:67793
60099:gcoquio@mia.uk:gianii:8c5b7bb6042110ac96c9ae351dbd7fbc:rihagd:::1990-01-22:80:M:66:729:66666
first extract,email:hash:
gcoquio@surfeu.ch:8c5b7bb6042110ac96c9ae351dbd7fbc
gcoquio@mia.uk:8c5b7bb6042110ac96c9ae351dbd7fbc
then extract username:hash
surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc
gianii:8c5b7bb6042110ac96c9ae351dbd7fbc:rihagd
So what is the problem??
sterge.txt:
139903:gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc:Comandante Ch?:::1962-04-22:50:M:66:729:67793
60099:gcoquio@mia.uk:gianii:8c5b7bb6042110ac96c9ae351dbd7fbc:rihagd:::1990-01-22:80:M:66:729:66666
Commands:
D:\portables\Gnu utilities\bin>awk "BEGIN { FS=""":""" } { print $2""":"""$4 }" sterge.txt > emailhash.txt
D:\portables\Gnu utilities\bin>awk "BEGIN { FS=""":""" } { print $3""":"""$4 }" sterge.txt > usernamehash.txt
Results:
emailhash.txt:
gcoquio@surfeu.ch:8c5b7bb6042110ac96c9ae351dbd7fbc
gcoquio@mia.uk:8c5b7bb6042110ac96c9ae351dbd7fbc
usernamehash.txt:
guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc
gianii:8c5b7bb6042110ac96c9ae351dbd7fbc
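On a file of many gigabytes, running awk twice means reading the input twice. A possible variant, sketched here under the assumption of a POSIX shell (not the Windows command prompt used above), writes both result files in a single pass. The function name `split_combo` and its argument order are hypothetical.

```shell
# Hypothetical helper.
# $1 = input file, $2 = email:hash output file, $3 = username:hash output file
# Assumes the layout id:email:username:hash:... from the sample above.
split_combo() {
  awk -F: -v eh="$2" -v uh="$3" '{
    print $2 ":" $4 > eh   # email:hash
    print $3 ":" $4 > uh   # username:hash
  }' "$1"
}
```

It would be called as `split_combo sterge.txt emailhash.txt usernamehash.txt`. awk keeps both output files open for the whole run, so each input line is read only once.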
ASKER
Works perfectly, you're in good shape, my god!!!
What about this, is it possible?
Original sample:
139903:gcoquio@surfeu.ch:guillaume_tell:8c5b7bb6042110ac96c9ae351dbd7fbc:Comandante Ch?:::1962-04-22:50:M:66:729:67793
60099:gcoquio@mia.uk:gianii:8c5b7bb6042110ac96c9ae351dbd7fbc:rihagd:::1990-01-22:80:M:66:729:66666
Reversed username:hash, look at the output:
8c5b7bb6042110ac96c9ae351dbd7fbc:Comandante Ch?
8c5b7bb6042110ac96c9ae351dbd7fbc:rihagd
Or, haha, this is crazy: is it possible to save the reversed version in this format? Or to save in the above format and then use another command to reverse it?
Comandante Ch?:8c5b7bb6042110ac96c9ae351dbd7fbc
rihagd:8c5b7bb6042110ac96c9ae351dbd7fbc
You did not look at awk, did you? It simply uses : as a field separator (that's what FS=':' means) and then it splits every line at every : and counts the parts as fields.
On your sample, the id is the first field, the email the 2nd, the username 3rd field, the hash is the 4th field, in the 5th you have some other username and so on.
Want to print the 5th field and then the 4th?
D:\portables\Gnu utilities\bin>awk "BEGIN { FS=""":""" } { print $5""":"""$4 }" sterge.txt
Comandante Ch?:8c5b7bb6042110ac96c9ae351dbd7fbc
rihagd:8c5b7bb6042110ac96c9ae351dbd7fbc
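Since every combination is just a matter of which two field numbers get printed, the choice can be turned into parameters. A small sketch, assuming a POSIX shell with awk. The function name `pick_fields` and the sample line are made up.

```shell
# Hypothetical helper: prints field $1 and field $2 of each colon-delimited
# input line, joined by a colon. Reads standard input.
pick_fields() {
  awk -F: -v a="$1" -v b="$2" '{ print $a ":" $b }'
}

# Made-up sample line with the layout id:email:username:hash:name
printf '%s\n' '1:user@example.com:alice:h1:Name' | pick_fields 5 4
printf '%s\n' '1:user@example.com:alice:h1:Name' | pick_fields 2 4
```

Here `pick_fields 5 4` gives name:hash and `pick_fields 2 4` gives email:hash, so swapping the order or picking other columns only means changing the two numbers.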
ASKER
This is more normal, and it is the last favour I ask:
Input:
54:jmshaw3567@hotmail.com:54:0x7BABC233DE26AB19EAD1B9C278128D5C434910EE:''
56:mark13886570@sina.com:56:0x66036CCD51CD5EB31978E803784D79CC5DADFBEC:''
Output:
jmshaw3567@hotmail.com:0x7BABC233DE26AB19EAD1B9C278128D5C434910EE
mark13886570@sina.com:0x66036CCD51CD5EB31978E803784D79CC5DADFBEC
Tomorrow, when my head is clear, hehe, I will try to learn to customize it and follow your examples.
ASKER
yes
Well......amazing!! No words! I honestly thought I would never solve these problems.
ASKER
Thank you very much!!! You're the best!!! Thank you thank you thank you, God bless you!!!
ASKER
thank you....
You could use subgroups 1 and 3 first, and then subgroups 2 and 3.
Regards