pvsbandi


List the words in Notepad++

Hi,

I have a huge text file which I opened in Notepad++.
I want to find the words starting with SP_ and ending in a space.
Then I would like all of these words listed in a column.

How can I achieve this?
Please help.
Shaun Vermaak

Use the regex "SP_.* " in the Find dialog of Notepad++, with Search Mode set to Regular expression.

ASKER

Thanks, Shaun! I want all the words starting with SP_ and ending in a blank space, as a list.
Can I achieve that?
Yes, use Find All in Current Document.
I did, but I'm getting the entire line that contains the match.
I just need the words matching the pattern.
Copy all those matches into Excel and use Text to Columns with : as the delimiter.
I tried copying them into Excel and parsing them, but it turned out to be very cumbersome.
Is there a way in Notepad++ to list just the words starting with SP_ and ending in a space?
ASKER CERTIFIED SOLUTION
hielo
Wonderful!
Can you explain the logic, if you don't mind?
>> Can you explain the logic, if you don't mind?
You mean the regex?
Yes, please
>> (\bSP_\S+)
The \b is the word boundary metacharacter. You use it to make sure you don't end up matching in the middle of a word. For instance, suppose you want to find the word "character" only when it stands alone, not when it is embedded in a longer word, and your search text is "Regular expressions have awesome metacharacters!". Then you need to search for \bcharacter\b, which does not match here because "character" appears only inside "metacharacters".
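
To make the \b behaviour concrete, here is a minimal sketch in Python. Python's re module is not the engine Notepad++ uses, so treat it only as an illustration of the same idea; the variable names and the second sample sentence are made up for the example.

```python
import re

text = "Regular expressions have awesome metacharacters!"

# Without boundaries, "character" is found inside "metacharacters".
inside = re.search(r"character", text)

# With \b on both sides, the embedded occurrence is rejected.
isolated = re.search(r"\bcharacter\b", text)

# A standalone occurrence of the word still matches.
standalone = re.search(r"\bcharacter\b", "a special character here")

print(inside is not None, isolated is None, standalone is not None)
```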

The \S+ means one or more non-whitespace characters. Since your text contains "SP_" followed by characters other than spaces, \S seems appropriate. If you know the names can only contain alphanumeric characters, you could also have used (\bSP_[a-zA-Z0-9]+)
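
A short Python sketch of the two patterns side by side may help. The sample string is invented for the example, and Python's re module stands in for the Notepad++ engine, so details such as case sensitivity can differ from what the Find dialog does.

```python
import re

# Hypothetical sample text; "XSP_Hidden" has no word boundary before "SP_".
sample = "EXEC SP_GetOrders @id; call SP_Update_Row now, not XSP_Hidden here"

# \S+ keeps going until the next whitespace character.
loose = re.findall(r"\bSP_\S+", sample)

# [a-zA-Z0-9]+ stops at the first character that is not a letter or digit,
# so a name with a second underscore gets cut short.
strict = re.findall(r"\bSP_[a-zA-Z0-9]+", sample)

print(loose)
print(strict)
```

Note that the alphanumeric variant truncates SP_Update_Row at the second underscore, which is why \S+ is the safer choice when the names may contain underscores.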

>>^(SP_.+)$
At this point you have all the words on their own lines. The "^" anchors the match to the start of a line and the "$" anchors it to the end of the line. The ".+" means "one or more characters other than a newline".
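
As a rough illustration of the anchors, here is the same pattern in Python. The input lines are invented, and Python needs the re.MULTILINE flag before ^ and $ will match at every line rather than only at the ends of the whole string.

```python
import re

# Hypothetical input: only lines that begin with SP_ should match.
lines = "SP_foo\nnot a match SP_bar\nSP_baz qux"

# ^ and $ pin the match to a whole line; .+ never crosses a newline.
whole = re.findall(r"^(SP_.+)$", lines, flags=re.MULTILINE)

print(whole)
```

The second line is skipped because it does not start with SP_, and the third is captured in full, space included, because .+ matches any character except a newline.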

>> ^[^S]+
At this point, your search text looks like:
Line 1: SP_foo
Line 23: SP_bar
Line 1000: SP_baz

If you pay attention, you still have every word on a line by itself, but each one carries the useless prefix "Line #: ". What you are interested in begins at the "S", so just discard everything before the "S".
The "[^S]" means "any character that is not an S". So the whole expression matches, at the start of each line, a run of one or more characters that are not "S".
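
Here is a small Python sketch of that last clean-up, using the three sample lines above. It assumes the match is replaced with nothing, and that the prefix never contains a capital "S" of its own; if it did, the match would stop early at that letter.

```python
import re

# The sample lines from the explanation above.
results = "Line 1: SP_foo\nLine 23: SP_bar\nLine 1000: SP_baz"

# At the start of each line, remove the run of characters that are not "S".
cleaned = re.sub(r"^[^S]+", "", results, flags=re.MULTILINE)

words = cleaned.splitlines()
print(words)
```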
Thanks a lot for that explanation!
Is there a book or a reference manual for Notepad++ regular expressions?
Not that I am aware of. I learned regular expressions when I was learning Perl. Look for Perl-compatible regular expression tutorials.
Thanks again!
http://www.regular-expressions.info

From the creators of RegexBuddy.

HTH,
Dan