Solved

select following word only

Posted on 2010-11-22
14
334 Views
Last Modified: 2012-05-10
Hello
Want to use regexpression to be able to select the word that follows the word "join" - the following word may contain / or - etc but with no gaps - example

testing for join table_2/3 is my ........
select ... inner join reference/1 is the table

so, what I would like returned is
table_2/3
reference/1

my code below works but selects the whole line that has the word "join" in it


freq = Hash.new(0)
myarray=[]
mystring=File.open("U:/A_CRYSTAL_REPORTS_MASTER/Report Definitions/tables_inline_views_used.txt") do |r|
  r.each do |record|
        if record =~/^(?=.*?\bjoin\b).*$/i  then
    record = record.gsub(/\n/,'')#\n removes n so as not to count \n - new line
    myarray<<record
  end
  end
  end

puts myarray.inspect

  myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
  #THIS SORTS BY MA X COUNTS OF WORDS
  freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   
 #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
 
  exit
0
Comment
Question by:philsivyer
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 7
14 Comments
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 34186950
try this,

note that there is a difference in matching (that is in the if)
and what you finally select from the regex '\1' refers to the stuff matched inside the first ()
myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$/i,'\1')
    myarray<<record
    end
  end

puts myarray.inspect

Open in new window

0
 

Author Comment

by:philsivyer
ID: 34187150
One query

this is the code..
freq = Hash.new(0)
myarray=[]
mystring=File.open("U:/A_CRYSTAL_REPORTS_MASTER/Report Definitions/tables_inline_views_used.txt") do |r|

myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\.-_]+).*$/i,'\1')
    myarray<<record
    end
  end
end

myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
  #THIS SORTS BY MA X COUNTS OF WORDS
  freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   
 #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
    exit

THIS IS THE TEXT ITS LOOKING AT.....
 Table or Inline View? ....  FROM FUND_VISIT
 LEFT JOIN T_SECURITY_USER TSU1 ON F.DD_ANALYST_ID = TSU1.USER_ID
 LEFT JOIN T_SECURITY_USER TSU2 ON F.PRIMARY_ANALYST_ID = TSU2.USER_ID
 LEFT OUTER JOIN FUND_FP_STS T3 ON T1.FP_STATUS_ID = T3.FP_STS_ID
 INNER JOIN CRYSTAL.V_CITY_S T7 ON T1.CITY_ID = T7.CITY_ID
 INNER JOIN STRAT T8 ON T1.STRATEGY_ID = T8.STRAT_ID
  Table or Inline View? ....  FROM (SELECT FUND_ID,
  Table or Inline View? ....  FROM (SELECT t1.Fund_id AS FUND_ID,
 
NAME OF REPORT: DD02
  Table or Inline View? ....  FROM .dd_scorecard ddsc
 INNER JOIN INSIGHT.DD_CATEGORY_COMMENT DDCC ON DDSC.SCORECARD_ID
 INNER JOIN INSIGHT.DD_CATEGORY_GRADE DDCG ON DDCC.AREA_ID
   Table or Inline View? ....  FROM FUND_VISIT
 
 
THIS IS THE RESULT - OK EXCEPT FOR LAST LINE "n..7" - ANY IDEAS?
DD_CATEGORY_GRADE..1
DD_CATEGORY_COMMENT..1
FUND_FP_STS..1
CRYSTAL..1
STRAT..1
V_CITY_S..1
INSIGHT..2
T_SECURITY_USER..2
n..7
0
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 34187360
put a chomp ending this line
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$/i,'\1').chomp
   
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 60

Expert Comment

by:Geert Bormans
ID: 34187365
or match the newline with the match
   record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$\n?/i,'\1')
 
0
 

Author Comment

by:philsivyer
ID: 34187793
I'm getting some additional lines that I would not expect - I have attached the txt file in question just in case I have missed something.

req = Hash.new(0)
myarray=[]
mystring=File.open("U:/tables_inline_views_used.txt") do |r|

myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\.-_]+).*$/i,'\1').chomp
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$\n?/i,'\1')
    myarray<<record
    end
  end
end

myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
   freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
    exit
tables-inline-views-used.txt
0
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 34188002
it would help if you said which lines you did not expect.
Looking for them in the input would be the next thing to do,
maybe you can get an idea what is wrong doing that
Anyway. please let me know which lines you don't expect

I see something going weird because of this construct
JOIN ._SESSION
I assume you want to allow '.' in the follow-up word too

change the regex then
 record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.]+).*$/i,'\1').chomp
 
0
 

Author Comment

by:philsivyer
ID: 34188102
I would not expect to see for example
table
inline


... any given line that has the word "join" in it then I want to see the very next complete string eg
from this string below........
 INNER JOIN FUND_LIST_VALUE FLV ON DDSC.SCORECARD_TYPE = FLV.VAL_ID
  Table or Inline View? ....  INNER JOIN (SELECT DISTINCT A.fp_sts_name,
  Table or Inline View? ....  FROM (SELECT t7.city_id AS city_id,

I would now expect to see....
FUND_LIST_VALUE
(SELECT
0
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 34188239
yes, but that is caused by the non matching lines
I have not created the same regex in the matching if statment as in the next line,
the problem with gsub is that it returns the string unchanged if it doesn't match
I am trying to figure out what you want in the string, so it doesn't give you (SELECT, so that entire line is split

add the ( to the pattern and it will be OK

   record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.\(]+).*$/i,'\1').chomp
 
you might need to add more to the character class
0
 

Author Comment

by:philsivyer
ID: 34188895
Below gives me what I'm looking for with one exception.

freq = Hash.new(0)
myarray=[]
mystring=File.open("U:/A_CRYSTAL_REPORTS_MASTER/Report Definitions/tables_inline_views_used.txt") do |r|

myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\.-_]+).*$/i,'\1').chomp
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.]+).*$/i,'\1').chomp
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$\n?/i,'\1')
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.\(]+).*$/i,'\1').chomp
    unless record =~ /select|fund|inner|insight/i then
    myarray<<record
   
    end
  end
end
end

myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
   freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
    exit


There are some lines in the text file with the following..
INNER JOIN CRYSTAL.V_CITY_S T7 ON T1.CITY_ID = T7.CITY_ID

for some reason it retuns 2 lines i.e
CRYSTAL
V_CITY_S

it should return
CRYSTAL.V_CITY_S

must be something to do with the period dot .

0
 

Author Comment

by:philsivyer
ID: 34189100
As a test I went to the text file and typed in a new line JOIN CRYSTAL.V_CITY_S
and the code returned
CRYSTALV_CITY_S # without the period dot - how strange
0
 
LVL 60

Accepted Solution

by:
Geert Bormans earned 500 total points
ID: 34189762
Not strange at all,
it goes well whilst creating myarray
it goes wrong a bit later whilst splitting in words for creating freq
a dot is not part of \w
you could change this
myarray.inspect.scan(/[\w\.]+/) {|word| freq[word] = freq[word]+1}
to include a dot
0
 

Author Comment

by:philsivyer
ID: 34194770
Thanks Gertone  - that did it.
0
 

Author Closing Comment

by:philsivyer
ID: 34194773
Thanks
0
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 34195410
welcome
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Article by: narshlob
If you've ever programmed in Ruby and have come across either a proc or a lambda, you might have been wondering what the difference is between the two and when you would use one over the other. This article will try to explain the difference between…
Recently I spent hours debugging an issue in a Rails project where ActiveRecord was causing MySQL errors trying to create a User object of a class at the top level of a Single Table Inheritance model structure.  It turns out `.create` behaves differ…
Michael from AdRem Software explains how to view the most utilized and worst performing nodes in your network, by accessing the Top Charts view in NetCrunch network monitor (https://www.adremsoft.com/). Top Charts is a view in which you can set seve…
Monitoring a network: how to monitor network services and why? Michael Kulchisky, MCSE, MCSA, MCP, VTSP, VSP, CCSP outlines the philosophy behind service monitoring and why a handshake validation is critical in network monitoring. Software utilized …
Suggested Courses

632 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question