Solved

select following word only

Posted on 2010-11-22
14
324 Views
Last Modified: 2012-05-10
Hello
Want to use regexpression to be able to select the word that follows the word "join" - the following word may contain / or - etc but with no gaps - example

testing for join table_2/3 is my ........
select ... inner join reference/1 is the table

so, what I would like returned is
table_2/3
reference/1

my code below works but selects the whole line that has the word "join" in it


freq = Hash.new(0)
myarray=[]
mystring=File.open("U:/A_CRYSTAL_REPORTS_MASTER/Report Definitions/tables_inline_views_used.txt") do |r|
  r.each do |record|
        if record =~/^(?=.*?\bjoin\b).*$/i  then
    record = record.gsub(/\n/,'')#\n removes n so as not to count \n - new line
    myarray<<record
  end
  end
  end

puts myarray.inspect

  myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
  #THIS SORTS BY MA X COUNTS OF WORDS
  freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   
 #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
 
  exit
0
Comment
Question by:philsivyer
  • 7
  • 7
14 Comments
 
LVL 60

Expert Comment

by:Geert Bormans
Comment Utility
try this,

note that there is a difference in matching (that is in the if)
and what you finally select from the regex '\1' refers to the stuff matched inside the first ()
myarray=[]

  r.each do |record|

    if record =~/\bjoin\b/i  then

    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$/i,'\1')

    myarray<<record

    end

  end



puts myarray.inspect

Open in new window

0
 

Author Comment

by:philsivyer
Comment Utility
One query

this is the code..
freq = Hash.new(0)
myarray=[]
mystring=File.open("U:/A_CRYSTAL_REPORTS_MASTER/Report Definitions/tables_inline_views_used.txt") do |r|

myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\.-_]+).*$/i,'\1')
    myarray<<record
    end
  end
end

myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
  #THIS SORTS BY MA X COUNTS OF WORDS
  freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   
 #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
    exit

THIS IS THE TEXT ITS LOOKING AT.....
 Table or Inline View? ....  FROM FUND_VISIT
 LEFT JOIN T_SECURITY_USER TSU1 ON F.DD_ANALYST_ID = TSU1.USER_ID
 LEFT JOIN T_SECURITY_USER TSU2 ON F.PRIMARY_ANALYST_ID = TSU2.USER_ID
 LEFT OUTER JOIN FUND_FP_STS T3 ON T1.FP_STATUS_ID = T3.FP_STS_ID
 INNER JOIN CRYSTAL.V_CITY_S T7 ON T1.CITY_ID = T7.CITY_ID
 INNER JOIN STRAT T8 ON T1.STRATEGY_ID = T8.STRAT_ID
  Table or Inline View? ....  FROM (SELECT FUND_ID,
  Table or Inline View? ....  FROM (SELECT t1.Fund_id AS FUND_ID,
 
NAME OF REPORT: DD02
  Table or Inline View? ....  FROM .dd_scorecard ddsc
 INNER JOIN INSIGHT.DD_CATEGORY_COMMENT DDCC ON DDSC.SCORECARD_ID
 INNER JOIN INSIGHT.DD_CATEGORY_GRADE DDCG ON DDCC.AREA_ID
   Table or Inline View? ....  FROM FUND_VISIT
 
 
THIS IS THE RESULT - OK EXCEPT FOR LAST LINE "n..7" - ANY IDEAS?
DD_CATEGORY_GRADE..1
DD_CATEGORY_COMMENT..1
FUND_FP_STS..1
CRYSTAL..1
STRAT..1
V_CITY_S..1
INSIGHT..2
T_SECURITY_USER..2
n..7
0
 
LVL 60

Expert Comment

by:Geert Bormans
Comment Utility
put a chomp ending this line
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$/i,'\1').chomp
   
0
 
LVL 60

Expert Comment

by:Geert Bormans
Comment Utility
or match the newline with the match
   record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$\n?/i,'\1')
 
0
 

Author Comment

by:philsivyer
Comment Utility
I'm getting some additional lines that I would not expect - I have attached the txt file in question just in case I have missed something.

req = Hash.new(0)
myarray=[]
mystring=File.open("U:/tables_inline_views_used.txt") do |r|

myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\.-_]+).*$/i,'\1').chomp
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$\n?/i,'\1')
    myarray<<record
    end
  end
end

myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
   freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
    exit
tables-inline-views-used.txt
0
 
LVL 60

Expert Comment

by:Geert Bormans
Comment Utility
it would help if you said which lines you did not expect.
Looking for them in the input would be the next thing to do,
maybe you can get an idea what is wrong doing that
Anyway. please let me know which lines you don't expect

I see something going weird because of this construct
JOIN ._SESSION
I assume you want to allow '.' in the follow-up word too

change the regex then
 record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.]+).*$/i,'\1').chomp
 
0
 

Author Comment

by:philsivyer
Comment Utility
I would not expect to see for example
table
inline


... any given line that has the word "join" in it then I want to see the very next complete string eg
from this string below........
 INNER JOIN FUND_LIST_VALUE FLV ON DDSC.SCORECARD_TYPE = FLV.VAL_ID
  Table or Inline View? ....  INNER JOIN (SELECT DISTINCT A.fp_sts_name,
  Table or Inline View? ....  FROM (SELECT t7.city_id AS city_id,

I would now expect to see....
FUND_LIST_VALUE
(SELECT
0
Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

 
LVL 60

Expert Comment

by:Geert Bormans
Comment Utility
yes, but that is caused by the non matching lines
I have not created the same regex in the matching if statment as in the next line,
the problem with gsub is that it returns the string unchanged if it doesn't match
I am trying to figure out what you want in the string, so it doesn't give you (SELECT, so that entire line is split

add the ( to the pattern and it will be OK

   record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.\(]+).*$/i,'\1').chomp
 
you might need to add more to the character class
0
 

Author Comment

by:philsivyer
Comment Utility
Below gives me what I'm looking for with one exception.

freq = Hash.new(0)
myarray=[]
mystring=File.open("U:/A_CRYSTAL_REPORTS_MASTER/Report Definitions/tables_inline_views_used.txt") do |r|

myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\.-_]+).*$/i,'\1').chomp
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.]+).*$/i,'\1').chomp
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$\n?/i,'\1')
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.\(]+).*$/i,'\1').chomp
    unless record =~ /select|fund|inner|insight/i then
    myarray<<record
   
    end
  end
end
end

myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
   freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
    exit


There are some lines in the text file with the following..
INNER JOIN CRYSTAL.V_CITY_S T7 ON T1.CITY_ID = T7.CITY_ID

for some reason it retuns 2 lines i.e
CRYSTAL
V_CITY_S

it should return
CRYSTAL.V_CITY_S

must be something to do with the period dot .

0
 

Author Comment

by:philsivyer
Comment Utility
As a test I went to the text file and typed in a new line JOIN CRYSTAL.V_CITY_S
and the code returned
CRYSTALV_CITY_S # without the period dot - how strange
0
 
LVL 60

Accepted Solution

by:
Geert Bormans earned 500 total points
Comment Utility
Not strange at all,
it goes well whilst creating myarray
it goes wrong a bit later whilst splitting in words for creating freq
a dot is not part of \w
you could change this
myarray.inspect.scan(/[\w\.]+/) {|word| freq[word] = freq[word]+1}
to include a dot
0
 

Author Comment

by:philsivyer
Comment Utility
Thanks Gertone  - that did it.
0
 

Author Closing Comment

by:philsivyer
Comment Utility
Thanks
0
 
LVL 60

Expert Comment

by:Geert Bormans
Comment Utility
welcome
0

Featured Post

Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

Join & Write a Comment

Suggested Solutions

I recently rediscovered rails when I needed a holiday project and decided to build a management dashboard for the company where I work.  With it being a project done in my free time, I could focus my time on learning the basics rather than trying to…
In Ruby, Call or invoke a API DLL library is easily via Win32API class, win32-api gem or other gems. For general DLL API call, there are quite a few references, some good tips list below: http://www.rubytips.org/2008/05/13/accessing-windows-api-fro…
In this tutorial you'll learn about bandwidth monitoring with flows and packet sniffing with our network monitoring solution PRTG Network Monitor (https://www.paessler.com/prtg). If you're interested in additional methods for monitoring bandwidt…
You have products, that come in variants and want to set different prices for them? Watch this micro tutorial that describes how to configure prices for Magento super attributes. Assigning simple products to configurable: We assigned simple products…

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now