pvsbandi
asked on
List the words in Notepad++
Hi,
I have a huge text file which I opened in Notepad++.
I want to find the words starting with SP_ and ending in a space.
Then I would like all of these words listed in a column.
How can I achieve it?
Please help.
Use the regex "SP_.* " in the Notepad++ Find dialog.
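One caveat worth checking before relying on this pattern: ".*" is greedy, so on a line with several spaces it can run on to the last space instead of stopping at the end of the word. A minimal sketch in Python, assuming a made-up sample line (the asker's file is not shown) and using Python's re module as a stand-in for the Notepad++ regex engine:

```python
import re

# Hypothetical sample line; not taken from the asker's file.
sample = "EXEC SP_foo @id = 1 and more text"

# Greedy: ".*" grabs as much as it can, so the match ends at the LAST space.
greedy = re.findall(r"SP_.* ", sample)

# A run of non-whitespace characters stops at the first space after the word.
word_only = re.findall(r"SP_\S+", sample)

print(greedy)
print(word_only)
```

In this sketch the greedy pattern returns most of the line, while the \S+ version returns only the word.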
ASKER
Thanks, Shaun! I want all those words starting with SP_ and ending in a blank space, as a list.
Can I achieve that?
Yes, use Find All in Current Document
ASKER
I did. But I'm getting the entire line that has the match.
I just need the words matching the pattern.
Copy all those matches into Excel and use Text to Columns with : as the delimiter.
ASKER
I tried copying it into Excel and parsing it, but it turned out to be very cumbersome.
Is there a way in Notepad++ to list those words starting with SP_ and ending in a space?
ASKER CERTIFIED SOLUTION
ASKER
Wonderful!
ASKER
Can you explain the logic, if you don't mind?
>> Can you explain the logic, if you don't mind?
You mean the regex?
ASKER
Yes, please
>> (\bSP_\S+)
The \b is the word-boundary metacharacter. You use it to make sure you don't end up matching in the middle of a word. For instance, suppose you want to find the word "character" only when it stands alone, not inside another word, and your search text is "Regular expressions have awesome metacharacters!". If you search for \bcharacter\b, it will not match, because "character" appears there only inside "metacharacters".
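The word-boundary behaviour described above can be checked with a short Python sketch. The first string is the one from the explanation; the second sentence is made up here to show a case where the word does stand alone:

```python
import re

text = "Regular expressions have awesome metacharacters!"

# Without boundaries, "character" is found inside "metacharacters".
inside = re.search(r"character", text)

# With \b on both sides the word must stand alone, so nothing matches here.
isolated = re.search(r"\bcharacter\b", text)

# Made-up sentence where the word is surrounded by non-word characters.
standalone = re.search(r"\bcharacter\b", "Each character counts.")

print(inside is not None, isolated is None, standalone is not None)
```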
The \S+ means one or more non-whitespace characters. Since your text contains "SP_" followed by some characters other than spaces, \S seems appropriate. If you know the names can only contain alphanumeric characters, you could also have used (\bSP_[a-zA-Z0-9]+)
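If a script is an option, the same pattern gives the column of words directly. This is only a sketch in Python, not the Notepad++ procedure from the thread, and the input text is invented for illustration:

```python
import re

# Hypothetical input; the asker's real file is not shown in the thread.
text = (
    "CALL SP_foo now\n"
    "nothing to see here\n"
    "run SP_bar and then SP_baz again\n"
    "MY_SP_ignored stays out\n"
)

# \b keeps "MY_SP_ignored" out, because "_" before "SP_" is a word character.
words = re.findall(r"\bSP_\S+", text)

# One word per line, i.e. the requested column.
column = "\n".join(words)
print(column)

# \S+ also picks up trailing punctuation; the alphanumeric class does not.
loose = re.findall(r"\bSP_\S+", "EXEC SP_foo;")
strict = re.findall(r"\bSP_[a-zA-Z0-9]+", "EXEC SP_foo;")
```

Note that [a-zA-Z0-9] excludes the underscore, so a name with further underscores after the SP_ prefix would be cut short by the stricter pattern.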
>>^(SP_.+)$
At this point you have all the words on their own lines. The "^" anchors the match at the start of a line, the ".+" means "one or more characters other than a newline", and the "$" anchors the match at the end of the line.
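The step that used this pattern is not shown in the thread, so the sketch below only illustrates what the anchors do, on made-up sample text. In Python, ^ and $ match at every line only with re.MULTILINE; Notepad++ treats them as line anchors by default.

```python
import re

# Made-up sample: two lines that start with SP_ and one that does not.
text = "SP_foo\nLine 23: SP_bar\nSP_baz\n"

# ^ and $ pin the match to a whole line; the parentheses capture that line.
whole_lines = re.findall(r"^(SP_.+)$", text, flags=re.MULTILINE)

print(whole_lines)
```

Only the lines that begin with SP_ are captured; the line with a prefix in front of the word is skipped.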
>> ^[^S]+
At this point, your search text looks like:
Line 1: SP_foo
Line 23: SP_bar
Line 1000: SP_baz
If you look closely, every word is still on a line by itself, but each line carries the useless prefix "Line #: ". What you are interested in begins at the "S", so just discard everything before it.
The "[^S]" means "any character that is not an S". So the whole expression matches one or more non-"S" characters at the start of a line, which here is the prefix.
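The prefix removal can be sketched in Python using the three sample lines above. Replacing the match with an empty string is assumed here as the natural counterpart of an empty "Replace with" field:

```python
import re

# The sample search results from the explanation above.
results = "Line 1: SP_foo\nLine 23: SP_bar\nLine 1000: SP_baz\n"

# At the start of each line, delete everything up to the first capital "S".
cleaned = re.sub(r"^[^S]+", "", results, flags=re.MULTILINE)

print(cleaned)
```

The pattern stops at the first capital "S" on the line, so it relies on the prefix not containing one.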
ASKER
Thanks a lot for that explanation!
Is there a book or a reference manual for the Notepad++ Regexp?
Not that I am aware of. I learned regular expressions when I was learning Perl. Look for Perl-compatible regular expression tutorials.
ASKER
Thanks again!