Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 251
  • Last Modified:

LIKE Search Logic: SQL Server 2008

Hi Experts,
Trying to create search logic, and my write / execute, look for errors, try again method is taking far too long with over 8000 records to play with! Something new always pops up that needs fixing!

So, with COMMUNICATIONS as COM and ORGANISATIONS as O, any thoughts on how best to refine the search below to find PLACES OF LEARNING?

Note:
Exclude all cubs and brownies (NEGATE).
Email domains of .ac.uk and .sch.uk is only used by education & schools in the United Kingdom – however may also include cubs and brownies.
Not all schools have a .ac.uk or .sch.uk domain name
Some towns and schools have the word BROWN or CUB (rare) in it.
WHERE 
 COM.number LIKE '%.ac.uk' 
 OR COM.number LIKE '%.sch.uk%'
 OR O.name LIKE '%scho%'
 OR O.name LIKE '%pre-scho%' 
 OR O.name LIKE '%prescho%'
 OR O.name LIKE '%infant%' 
 OR O.name LIKE '%nursery%' 
 OR O.name LIKE '%primary%' 
 OR O.name LIKE '%secondary%' 
 OR O.name LIKE '%college%'
 OR O.name LIKE '%hysgol%' --Welsh for "schooling"
 OR O.name LIKE '%ysgol%' --Welsh for school.
 OR O.name LIKE '%kinderg%'
 OR O.name LIKE '%kinter%'
 OR O.name LIKE '%brown[^ies]%' --negate brownies

Open in new window

Thanks in advance!

John Diddy
0
JohnDiddy77
Asked:
JohnDiddy77
  • 3
  • 2
1 Solution
 
LowfatspreadCommented:
post your full query...

#you haven't really expalined what you're trying to achieve and what its based upon....

however at least 3 of the conditions are unnecessary as you already covered them

a fulltext search may be more appropriate .... again more background information
on the data you are searching....


WHERE 
 COM.number LIKE '%.ac.uk' 
 OR COM.number LIKE '%.sch.uk%'
 OR O.name LIKE '%scho%'
-- OR O.name LIKE '%pre-scho%' -- included in scho
-- OR O.name LIKE '%prescho%'  -- included in scho
 OR O.name LIKE '%infant%' 
 OR O.name LIKE '%nursery%' 
 OR O.name LIKE '%primary%' 
 OR O.name LIKE '%secondary%' 
 OR O.name LIKE '%college%'
-- OR O.name LIKE '%hysgol%' --Welsh for "schooling" -- included in ysgol
 OR O.name LIKE '%ysgol%' --Welsh for school. 
 OR O.name LIKE '%kinderg%'
 OR O.name LIKE '%kinter%'
 OR O.name LIKE '%brown[^ies]%' --negate brownies

Open in new window

0
 
JohnDiddy77Author Commented:
SELECT DISTINCT
	isnull(C.forenames + ' ', '') + isnull(C.surname, '') AS 'Label Name',
	C.forenames AS 'Forename',
	C.surname AS 'Surname',
	O.name AS 'Organisation',	
	MAX(CASE COM.device WHEN 'EM' THEN number END) AS Email,
	MAX (CASE AC.campaign WHEN 'MPC' THEN 'Y' ELSE 'N' END) AS '2008/09',
	MAX (CASE AC.campaign WHEN 'P09' THEN 'Y' ELSE 'N' END) AS '2009/10',
	MAX (CASE AC.campaign WHEN 'P10' THEN 'Y' ELSE 'N' END) AS '2010/11'

FROM dbo.appeal_collections AS AC
	
	INNER JOIN unmanned_collections AS UMC
		ON AC.collection_number = UMC.collection_number
		
	INNER JOIN dbo.organisations AS O
		ON UMC.organisation_number = O.organisation_number

	INNER JOIN contacts AS C
		ON UMC.contact_number = C.contact_number
	
	INNER JOIN dbo.communications AS COM
		ON C.contact_number = COM.contact_number

WHERE
	(AC.campaign ='MPC' OR AC.campaign = 'P09' OR AC.campaign = 'P10')
	
	AND COM.device = 'EM' --also check for inverted comma.
	
	AND (COM.number LIKE '%.ac.uk' 
		OR COM.number LIKE '%.sch.uk%'
		OR O.name LIKE '%scho%'
		OR O.name LIKE '%pre-scho%' 
		OR O.name LIKE '%prescho%'
		OR O.name LIKE '%infant%' 
		OR O.name LIKE '%nursery%' 
		OR O.name LIKE '%primary%' 
		OR O.name LIKE '%secondary%' 
		OR O.name LIKE '%college%'
		OR O.name LIKE '%hysgol%'	
		OR O.name LIKE '%ysgol%’
OR O.name LIKE '%kinderg%'
		OR O.name LIKE '%kinter%'
		OR O.name LIKE '%brown[^ies]%'	
		)

Open in new window

SELECT DISTINCT
	isnull(C.forenames + ' ', '') + isnull(C.surname, '') AS 'Label Name',
	C.forenames AS 'Forename',
	C.surname AS 'Surname',
	O.name AS 'Organisation',	
	MAX(CASE COM.device WHEN 'EM' THEN number END) AS Email,
	MAX (CASE AC.campaign WHEN 'MPC' THEN 'Y' ELSE 'N' END) AS '2008/09',
	MAX (CASE AC.campaign WHEN 'P09' THEN 'Y' ELSE 'N' END) AS '2009/10',
	MAX (CASE AC.campaign WHEN 'P10' THEN 'Y' ELSE 'N' END) AS '2010/11'

FROM dbo.appeal_collections AS AC
	
	INNER JOIN unmanned_collections AS UMC
		ON AC.collection_number = UMC.collection_number
		
	INNER JOIN dbo.organisations AS O
		ON UMC.organisation_number = O.organisation_number

	INNER JOIN contacts AS C
		ON UMC.contact_number = C.contact_number
	
	INNER JOIN dbo.communications AS COM
		ON C.contact_number = COM.contact_number

WHERE
	(AC.campaign ='MPC' OR AC.campaign = 'P09' OR AC.campaign = 'P10')
	
	AND COM.device = 'EM'
	
	AND (COM.number LIKE '%.ac.uk' 
		OR COM.number LIKE '%.sch.uk%'
		OR O.name LIKE '%scho%'
		OR O.name LIKE '%pre-scho%' 
		OR O.name LIKE '%prescho%'
		OR O.name LIKE '%infant%' 
		OR O.name LIKE '%nursery%' 
		OR O.name LIKE '%primary%' 
		OR O.name LIKE '%secondary%' 
		OR O.name LIKE '%college%'
		OR O.name LIKE '%hysgol%'	
		OR O.name LIKE '%ysgol%’
OR O.name LIKE '%kinderg%'
		OR O.name LIKE '%kinter%'
		OR O.name LIKE '%brown[^ies]%'	
		)

Open in new window

0
 
JohnDiddy77Author Commented:
@ LowFat:

As alluded to in my original post, I'm searching organisation names and email addresses to find schools / and any form of educational establishment. However, I wish to exclude after-school clubs and similar activites with "Cubs" and "Brownies" being the main culprits.

I hope this is clearly.

PS Don't know why it's posting the code twice!
0
 
LowfatspreadCommented:
ok ....

i think you need to analyse what words/phrases are in your data... use the first...

and then make a decision based on the frequency...

and add them into the exclusion list in the main query
select word,count(distinct o.organisation_number) as InOrgs,count(*) as Occurs
  from (
select substring(o.name+' ',v.number,charindex(' ',o.name+' ',v.number)-v.number) as word
     ,o.organisation_number
  from master.dbo.spt_,values as v
  cross join (select replace(replace(replace(replace(replace(name,'.',' '),',',' '),'(',' '),')',' '),';',' ')
                   as  name
                     ,organisation_number
                from dbo.organisations) as o
 where v.type='p'
 and v.number between 1 and len(o.name)+1
 and substring(' '+o.name,v.number,1)=' '
 ) as x
group by word
order by 3 desc,2 desc,1
 



SELECT 
	isnull(forename + ' ', '') + isnull(surname, '') AS 'Label Name'
        ,x.*
  from (
Select 	C.forenames AS 'Forename',
	C.surname AS 'Surname',
	O.name AS 'Organisation',	
	MAX(com.number ) AS Email,
	MAX (CASE AC.campaign WHEN 'MPC' THEN 'Y' ELSE 'N' END) AS '2008/09',
	MAX (CASE AC.campaign WHEN 'P09' THEN 'Y' ELSE 'N' END) AS '2009/10',
	MAX (CASE AC.campaign WHEN 'P10' THEN 'Y' ELSE 'N' END) AS '2010/11'

FROM dbo.appeal_collections AS AC
INNER JOIN unmanned_collections AS UMC
   oN AC.collection_number = UMC.collection_number
INNER JOIN dbo.organisations AS O
   ON UMC.organisation_number = O.organisation_number
INNER JOIN contacts AS C
   ON UMC.contact_number = C.contact_number
INNER JOIN dbo.communications AS COM
   ON C.contact_number = COM.contact_number

WHERE 	AC.campaign in ('MPC', 'P09', 'P10')
  AND COM.device = 'EM'
  AND (COM.number LIKE '%.ac.uk' 
		OR COM.number LIKE '%.sch.uk%'
		OR O.name LIKE '%scho%'
--		OR O.name LIKE '%pre-scho%' 
--		OR O.name LIKE '%prescho%'
		OR O.name LIKE '%infant%' 
		OR O.name LIKE '%nursery%' 
		OR O.name LIKE '%primary%' 
		OR O.name LIKE '%secondary%' 
		OR O.name LIKE '%college%'
--		OR O.name LIKE '%hysgol%'	
		OR O.name LIKE '%ysgol%’
                OR O.name LIKE '%kinderg%'
		OR O.name LIKE '%kinter%'
	)
group by forenames,surname,o.name
) as x
Where x.Organisation not like '%brownies%'
  and x.organisation not like '%club%'
  and x.organisation not like '% cub%'
  and x.organisation not like '%beaver%'
  and x.organisation not like '%rainbow%'
  and x.organisation not like '%scout%'
  and x.organisation not like '%Guide%'
  and x.organisation not like '%TWGA%'
order by x.organisation

Open in new window

0
 
JohnDiddy77Author Commented:
@ LowFat:
 master.dbo.spt_values?!

Really!?!

That, right there, is a monster! Your original code above didn't work so had to step through it and figure out the analysis bit, and learned a lot!

Question

Re: Analysis Section, how can I adapt this to work for a specific subset of of the Organisation Table, namely the organisations involved in my query above?
SELECT 
  WORD,
  COUNT(DISTINCT x.organisation_number) AS InOrgs,
  COUNT(*) AS Occurs

FROM 
(
SELECT 
		SUBSTRING(o.name + ' ',v.number,
		CHARINDEX(' ',o.name+ ' ',v.number)-v.number) AS word, o.organisation_number
		
	FROM
		master.dbo.spt_values AS v
             CROSS JOIN
		 (SELECT 
		   REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(name,'.',' '),',',' '),'(',' '),')',' '),';',' ') AS  name, organisation_number
  FROM dbo.organisations) AS O
		
		  WHERE v.type='p'
		   AND v.number BETWEEN 1 AND len(o.name)+1
		  AND substring(' '+o.name,v.number,1)=' '
) AS x

GROUP BY word

ORDER BY 3 DESC,2 DESC,1

Open in new window

0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

  • 3
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now