PeterBaileyUk
asked on
sql server query
I have this query
I would like to know how I can get the number of words so in this case Field NoOfWords would =3 in each row as the original string contained 3 words before it was split.
here ive contained the where clause to one client code obviously there are many.
I dont know if its related but I would have preferred the word position just to be sequential ie 1 2 3 etc as opposed to the starting position however its not a deal breaker as I can still determine the original word sequence.
The data looks like
use Dictionary
GO
SELECT SMMT.[MVRIS CODE], SMMT.MARQUE, s.value AS [Word],CONCAT(ISNULL(RTRIM(LTRIM(SMMT.[Model Range])),''), CHAR(32), ISNULL(RTRIM(LTRIM(SMMT.[RANGE SERIES])),''), CHAR(32), ISNULL(RTRIM(LTRIM(SMMT.[VARIANT])),'')) AS ConcatDesc, LEN(s.VALUE) AS WordLength, CHARINDEX(s.value,CONCAT(ISNULL(RTRIM(LTRIM(SMMT.[Model Range])),''), CHAR(32), ISNULL(RTRIM(LTRIM(SMMT.[RANGE SERIES])),''), CHAR(32), ISNULL(RTRIM(LTRIM(SMMT.[VARIANT])),''))) AS wordPos, smmt.[MVRIS CODE] + '_' + s.value + '_' + CAST(CHARINDEX(s.value,CONCAT(ISNULL(RTRIM(LTRIM(SMMT.[Model Range])),''), CHAR(32), ISNULL(RTRIM(LTRIM(SMMT.[RANGE SERIES])),''), CHAR(32), ISNULL(RTRIM(LTRIM(SMMT.[VARIANT])),'')))AS nvarchar) AS ClientCodeWordPosition, 'SMMT' AS ClientName
FROM ClientData.[dbo].SMMT SMMT
CROSS APPLY STRING_SPLIT(CONCAT(ISNULL(RTRIM(LTRIM(SMMT.[Model Range])),''), CHAR(32), ISNULL(RTRIM(LTRIM(SMMT.[RANGE SERIES])),''), CHAR(32), ISNULL(RTRIM(LTRIM(SMMT.[VARIANT])),'')), CHAR(32)) s
WHERE smmt.MARQUE='PEUGEOT' AND SMMT.[MODEL RANGE]='ElDDIS' AND SMMT.[MVRIS CODE]='D39GA' AND s.VALUE <>' '
I would like to know how I can get the number of words so in this case Field NoOfWords would =3 in each row as the original string contained 3 words before it was split.
here ive contained the where clause to one client code obviously there are many.
I dont know if its related but I would have preferred the word position just to be sequential ie 1 2 3 etc as opposed to the starting position however its not a deal breaker as I can still determine the original word sequence.
The data looks like
ClientCodeWordPosition MVRIS CODE ClientName Word WordLength ConcatDesc wordPos MARQUE
D39GA_ELDDIS_1 D39GA SMMT ELDDIS 6 ELDDIS AUTOQUEST D'ARTAGNAN 1 PEUGEOT
D39GA_AUTOQUEST_9 D39GA SMMT AUTOQUEST 9 ELDDIS AUTOQUEST D'ARTAGNAN 9 PEUGEOT
D39GA_D'ARTAGNAN_19 D39GA SMMT D'ARTAGNAN 10 ELDDIS AUTOQUEST D'ARTAGNAN 19 PEUGEOT
ASKER
yes.
ASKER
no apologies mvris code is one field [mvris code] d39ga
ASKER
wait i put into excel
ee.xlsx
ee.xlsx
a "generic" approach is to
replace spaces with nothing,
then compare the length of the original string to the replaced string = number of replaced spaces + 1 = number of words
BUT you need to be careful of multiple spaces, so this should be a close approximation.
select
(len(replace(ConcatDesc,' ',' ')) - len(replace(replace(Concat Desc,' ',' '),' ',''))) + 1 as num_words
, *
from smmt
However, that concatenation is of these columns:
SMMT.[Model Range]
SMMT.[RANGE SERIES]
SMMT.[VARIANT]
which do not appear as separated columns in the sample data, you could (should?) count the number of words in each of those parts before the conatenation, and you could do that within a cross apply if you prefer e.g.
cross apply (
select
len(SMMT.[Model Range]) - len(replace(SMMT.[Model Range],' ','')) + 1
+ len(SMMT.[RANGE SERIES]) - len(replace(SMMT.[RANGE SERIES],' ','')) + 1
+ len(SMMT.[VARIANT]) - len(replace(SMMT.[VARIANT] ,' ',''))
+ 1
) ca
replace spaces with nothing,
then compare the length of the original string to the replaced string = number of replaced spaces + 1 = number of words
BUT you need to be careful of multiple spaces, so this should be a close approximation.
select
(len(replace(ConcatDesc,' ',' ')) - len(replace(replace(Concat
, *
from smmt
However, that concatenation is of these columns:
SMMT.[Model Range]
SMMT.[RANGE SERIES]
SMMT.[VARIANT]
which do not appear as separated columns in the sample data, you could (should?) count the number of words in each of those parts before the conatenation, and you could do that within a cross apply if you prefer e.g.
cross apply (
select
len(SMMT.[Model Range]) - len(replace(SMMT.[Model Range],' ','')) + 1
+ len(SMMT.[RANGE SERIES]) - len(replace(SMMT.[RANGE SERIES],' ','')) + 1
+ len(SMMT.[VARIANT]) - len(replace(SMMT.[VARIANT]
+ 1
) ca
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
thank you it was a great approach and I learned a new way of the language..thank you
welcome... buy me a coffee [_]?
Open in new window