Link to home
Create AccountLog in
Avatar of howyue
howyueFlag for Malaysia

asked on

Compare Character By Character Between Columns

i have records like the following:
ColA
-------
ABC
ABC(001)
ABC(002)

i wan to write a SQL query that can return me a string of 'ABC'. Comparing all 3 columns character by character from left to right until it find an character that cannot match all 3 rows. Without using cursor or loop if possible. Thank you.
Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

so, you want the longest matching combination?
will there be other values to be returned for other rows?
please clarify.
Maybe it's possible to do without a loop, but hardly without using a table with character indexes so that you "fake" a loop in a query.

You only have to compare the first and the last of the selected strings, as those will differ most:
declare @first varchar(100)
declare @last varchar(100)
 
select @first = min(ColA), @Last = max(ColA) from SomeTable
 
declare @len int
set @len = 0
 
while (substring(@first,1,@len) = substring(@last,1,@len)) begin
   set @len = @len + 1
end

Open in new window

Correction. It should of course be:

while (substring(@first,1,@len+1) = substring(@last,1,@len+1)) begin
Avatar of howyue

ASKER

angelIII:
yes. actually there's another column that determine the group of the records. please refer to the code belows. with the sample data, the result should be ABC, DEF-, GHI.

The idea is i'm migrating data from anonymous source, and i need to get the distinct of ColA into another new table.
Create Table Test (
ColA VARCHAR(100),
ColB Int
)
Go
Insert Into Test Values ('ABC', 1)
Insert Into Test Values ('ABC-A', 1)
Insert Into Test Values ('ABC-AA', 1)
Insert Into Test Values ('ABC-B', 1)
Insert Into Test Values ('DEF-C', 2)
Insert Into Test Values ('DEF-D', 2)
Insert Into Test Values ('GHI A', 3)
Insert Into Test Values ('GHI-B', 3)
Insert Into Test Values ('GHI-C', 3)
Insert Into Test Values ('GHI-D', 3)
Go

Open in new window

Avatar of howyue

ASKER

GreenGhost:
i don't understand....
What is it that you don't understand?
Avatar of howyue

ASKER

GreenGhost:
what is the result of your code? @len is the position of the chracter that starts to differ?
and i don't think it's feasible when there is other values to be returned for other rows
SOLUTION
Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

Link to home
membership
Create an account to see this answer
Signing up is free. No credit card required.
Create Account
function dbo.GetMaxCommon
create function dbo.GetMaxCommon(@colB int )
returns varchar(100)
as
begin
  declare @x int
  declare @res varchar(100)
 
  select @x =max(len(cola)) 
  from test
  where colB = @colB
  
 
  select top 1 @res = max(s.colAX)
  from (
    select colB, colA, left(t.colA,x.n) colAX, x.n
    from dbo.CreateRows(@x) x
    cross join test t
    where x.n <= ( select min(len(i.colA)) from test i where i.colB = t.colB )
    and t.colB = @colB
   ) s
  group by s.colB, s.n
  having min(s.colAX) = max(s.ColAX)
  order by s.n desc
 
  return @res
end

Open in new window

function dbo.CreateRows
create function dbo.CreateRows(@n int)
returns @t table ( n int )
as
begin
  declare @r int
  set @r = @n + 1
 
  while (@n>0)
  begin 
    insert into @t (n ) values( @r - @n)
    set @n = @n -1
  end
  return
end

Open in new window

ASKER CERTIFIED SOLUTION
Link to home
membership
Create an account to see this answer
Signing up is free. No credit card required.
Create Account
Avatar of howyue

ASKER

GreenGhost: at 1st glance i don understand ur solution, with ur sample code it helped a lot. and ur solution look more neat.

Good job for both of you.  i attached my original code, which now i'm thinkin to enhanced or swap my solution with ur solution. i'm wondering, with all these 3 solution, which will be the most efficient?
Declare @i Int
Declare @m Int
Declare @Name VARCHAR(100)
Declare @ColB VARCHAR(500)
DECLARE @Test Table (ColA varchar(100), ColB Int )
 
Set @i = 1
Declare c Cursor For
	Select Distinct ColB From Test
OPEN c
FETCH NEXT FROM c INTO @ColB
 
WHILE @@FETCH_STATUS = 0
BEGIN
 
	Select @m = Max(Len(ColA)) + 1 From Test Where ColB = @ColB
	While @i < @m Begin
		If (Select Count(Name) From (Select Substring(ColA, 1, @i) Name From Test Where ColB = @ColB Group By Substring(ColA, 1, @i)) a) = 1 Begin
			Set @Name = (Select Substring(ColA, 1, @i) From Test Where ColB = @ColB Group By Substring(ColA, 1, @i))
			Set @i = @i + 1
		End Else 
			Set @i = @m
	End
	Insert Into @Test Values (@Name, @ColB)
	Set @i = 0
 
FETCH NEXT FROM c INTO @ColB
END 
CLOSE c
DEALLOCATE c
 
Select * From @Test

Open in new window

SOLUTION
Link to home
membership
Create an account to see this answer
Signing up is free. No credit card required.
Create Account
Avatar of howyue

ASKER

Thank guys~
I shall test all the solutions myself. Good job!