Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 338
  • Last Modified:

select following word only

Hello
Want to use regexpression to be able to select the word that follows the word "join" - the following word may contain / or - etc but with no gaps - example

testing for join table_2/3 is my ........
select ... inner join reference/1 is the table

so, what I would like returned is
table_2/3
reference/1

my code below works but selects the whole line that has the word "join" in it


freq = Hash.new(0)
myarray=[]
mystring=File.open("U:/A_CRYSTAL_REPORTS_MASTER/Report Definitions/tables_inline_views_used.txt") do |r|
  r.each do |record|
        if record =~/^(?=.*?\bjoin\b).*$/i  then
    record = record.gsub(/\n/,'')#\n removes n so as not to count \n - new line
    myarray<<record
  end
  end
  end

puts myarray.inspect

  myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
  #THIS SORTS BY MA X COUNTS OF WORDS
  freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   
 #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
 
  exit
0
philsivyer
Asked:
philsivyer
  • 7
  • 7
1 Solution
 
Geert BormansCommented:
try this,

note that there is a difference in matching (that is in the if)
and what you finally select from the regex '\1' refers to the stuff matched inside the first ()
myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$/i,'\1')
    myarray<<record
    end
  end

puts myarray.inspect

Open in new window

0
 
philsivyerAuthor Commented:
One query

this is the code..
freq = Hash.new(0)
myarray=[]
mystring=File.open("U:/A_CRYSTAL_REPORTS_MASTER/Report Definitions/tables_inline_views_used.txt") do |r|

myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\.-_]+).*$/i,'\1')
    myarray<<record
    end
  end
end

myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
  #THIS SORTS BY MA X COUNTS OF WORDS
  freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   
 #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
    exit

THIS IS THE TEXT ITS LOOKING AT.....
 Table or Inline View? ....  FROM FUND_VISIT
 LEFT JOIN T_SECURITY_USER TSU1 ON F.DD_ANALYST_ID = TSU1.USER_ID
 LEFT JOIN T_SECURITY_USER TSU2 ON F.PRIMARY_ANALYST_ID = TSU2.USER_ID
 LEFT OUTER JOIN FUND_FP_STS T3 ON T1.FP_STATUS_ID = T3.FP_STS_ID
 INNER JOIN CRYSTAL.V_CITY_S T7 ON T1.CITY_ID = T7.CITY_ID
 INNER JOIN STRAT T8 ON T1.STRATEGY_ID = T8.STRAT_ID
  Table or Inline View? ....  FROM (SELECT FUND_ID,
  Table or Inline View? ....  FROM (SELECT t1.Fund_id AS FUND_ID,
 
NAME OF REPORT: DD02
  Table or Inline View? ....  FROM .dd_scorecard ddsc
 INNER JOIN INSIGHT.DD_CATEGORY_COMMENT DDCC ON DDSC.SCORECARD_ID
 INNER JOIN INSIGHT.DD_CATEGORY_GRADE DDCG ON DDCC.AREA_ID
   Table or Inline View? ....  FROM FUND_VISIT
 
 
THIS IS THE RESULT - OK EXCEPT FOR LAST LINE "n..7" - ANY IDEAS?
DD_CATEGORY_GRADE..1
DD_CATEGORY_COMMENT..1
FUND_FP_STS..1
CRYSTAL..1
STRAT..1
V_CITY_S..1
INSIGHT..2
T_SECURITY_USER..2
n..7
0
 
Geert BormansCommented:
put a chomp ending this line
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$/i,'\1').chomp
   
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
Geert BormansCommented:
or match the newline with the match
   record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$\n?/i,'\1')
 
0
 
philsivyerAuthor Commented:
I'm getting some additional lines that I would not expect - I have attached the txt file in question just in case I have missed something.

req = Hash.new(0)
myarray=[]
mystring=File.open("U:/tables_inline_views_used.txt") do |r|

myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\.-_]+).*$/i,'\1').chomp
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$\n?/i,'\1')
    myarray<<record
    end
  end
end

myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
   freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
    exit
tables-inline-views-used.txt
0
 
Geert BormansCommented:
it would help if you said which lines you did not expect.
Looking for them in the input would be the next thing to do,
maybe you can get an idea what is wrong doing that
Anyway. please let me know which lines you don't expect

I see something going weird because of this construct
JOIN ._SESSION
I assume you want to allow '.' in the follow-up word too

change the regex then
 record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.]+).*$/i,'\1').chomp
 
0
 
philsivyerAuthor Commented:
I would not expect to see for example
table
inline


... any given line that has the word "join" in it then I want to see the very next complete string eg
from this string below........
 INNER JOIN FUND_LIST_VALUE FLV ON DDSC.SCORECARD_TYPE = FLV.VAL_ID
  Table or Inline View? ....  INNER JOIN (SELECT DISTINCT A.fp_sts_name,
  Table or Inline View? ....  FROM (SELECT t7.city_id AS city_id,

I would now expect to see....
FUND_LIST_VALUE
(SELECT
0
 
Geert BormansCommented:
yes, but that is caused by the non matching lines
I have not created the same regex in the matching if statment as in the next line,
the problem with gsub is that it returns the string unchanged if it doesn't match
I am trying to figure out what you want in the string, so it doesn't give you (SELECT, so that entire line is split

add the ( to the pattern and it will be OK

   record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.\(]+).*$/i,'\1').chomp
 
you might need to add more to the character class
0
 
philsivyerAuthor Commented:
Below gives me what I'm looking for with one exception.

freq = Hash.new(0)
myarray=[]
mystring=File.open("U:/A_CRYSTAL_REPORTS_MASTER/Report Definitions/tables_inline_views_used.txt") do |r|

myarray=[]
  r.each do |record|
    if record =~/\bjoin\b/i  then
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\.-_]+).*$/i,'\1').chomp
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.]+).*$/i,'\1').chomp
    #record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_]+).*$\n?/i,'\1')
    record = record.gsub(/^.*\bjoin\b\s*([a-z0-9\/\-_\.\(]+).*$/i,'\1').chomp
    unless record =~ /select|fund|inner|insight/i then
    myarray<<record
   
    end
  end
end
end

myarray.inspect.scan(/\w+/) {|word| freq[word] = freq[word]+1}
   freq=freq.sort {|k,v| k[1]<=>v[1]}
  freq.each {|k,v| puts "#{k}..#{v}"}
   #freq.keys.sort.each {|k| print k, " - ", freq[k], "\n"}  
    exit


There are some lines in the text file with the following..
INNER JOIN CRYSTAL.V_CITY_S T7 ON T1.CITY_ID = T7.CITY_ID

for some reason it retuns 2 lines i.e
CRYSTAL
V_CITY_S

it should return
CRYSTAL.V_CITY_S

must be something to do with the period dot .

0
 
philsivyerAuthor Commented:
As a test I went to the text file and typed in a new line JOIN CRYSTAL.V_CITY_S
and the code returned
CRYSTALV_CITY_S # without the period dot - how strange
0
 
Geert BormansCommented:
Not strange at all,
it goes well whilst creating myarray
it goes wrong a bit later whilst splitting in words for creating freq
a dot is not part of \w
you could change this
myarray.inspect.scan(/[\w\.]+/) {|word| freq[word] = freq[word]+1}
to include a dot
0
 
philsivyerAuthor Commented:
Thanks Gertone  - that did it.
0
 
philsivyerAuthor Commented:
Thanks
0
 
Geert BormansCommented:
welcome
0

Featured Post

Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

  • 7
  • 7
Tackle projects and never again get stuck behind a technical roadblock.
Join Now