• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 689
  • Last Modified:

Need help with Invalid length parameter passed to the substring function

I need to parse out several lines of email headers contained in a text type column

Here's an example

Date: Thu, 27 Apr 2000 09:05:22 -0400
From: John Doe <bill@website.com>
Subject: test
To: <BillSmith@website2.com>
Mime-Version: 1.0
Organization: Company, Inc.

I need to pull the Date:, From: Subject: and To: lines to their own columns. I know how to use SUBSTRING(EmailText,PATINDEX('%From:%',EmailText),40) to get 40 chars of the From: line but would like to pull everything up to the linefeed for each of the above.

I accepted the following as a solution previously but discovered it includes the trailing char(10) which shows up in the column as a trailing pair of boxes when I SELECT INTO a new table:

SELECt SUBSTRING(EmailText,PATINDEX('%From:%',EmailText),PATINDEX('%'+CHAR(10)+'%',SUBSTRING(EmailText,PATINDEX('%From:%',EmailText),Len(EmailText))))
 
I tried the following but I get "Invalid length parameter passed to the substring function" due to missing data in certain rows:

SELECt SUBSTRING(EmailText,PATINDEX('%From:%',EmailText),PATINDEX('%'+CHAR(10)+'%',SUBSTRING(EmailText,PATINDEX('%From:%',EmailText),Len(EmailText)))-2)

I tried eliminating incomplete headers through the WHERE clause by making sure that the strings 'From:','To:','Subject:' and 'Date:' are in the header but sometimes they appear with no following text which results in the invalid length parameter passed error.
0
dsteinschneider
Asked:
dsteinschneider
  • 3
  • 2
1 Solution
 
Anthony PerkinsCommented:
You may want to resort to using a function like this:

CREATE Function dbo.udf_GetEmailText(
            @EmailText text,
            @Field varchar(20))

Returns varchar(1000)

Begin
Declare @FromPos integer,
      @ToPos integer,
      @LF char(1),
      @Value varchar(1000)

Set @LF = CHAR(10)

Set @FromPos = CHARINDEX(@Field, @EmailText)
If @FromPos > 0
   Begin
      Set @FromPos = @FromPos + LEN(@Field)
      Set @ToPos = CHARINDEX(@LF, @EmailText, @FromPos)
      If @ToPos = 0
         Begin
            Set @ToPos = DATALENGTH(@EmailText)
         End
      Set @Value = SUBSTRING(@EmailText, @FromPos, @ToPos - @FromPos + 1)
   End
Else
   Begin
      Set @Value = Null       -- Or '' whichever you prefer
   End

Return @Value
End

You can then call it as follows:

Select      ID,
      EmailText,
      dbo.udf_GetEmailText(EmailText, 'Date:'),
      dbo.udf_GetEmailText(EmailText, 'From:'),
      dbo.udf_GetEmailText(EmailText, 'Subject:'),
      dbo.udf_GetEmailText(EmailText, 'To:'),
      dbo.udf_GetEmailText(EmailText, 'Mime-Version:'),
      dbo.udf_GetEmailText(EmailText, 'Organization:')
From      YourTableName
0
 
Anthony PerkinsCommented:
The reason I suggest you do it this way, is that it will give you greater flexibility in modifying it.  For example, you may find you are getting false positives:  The "From:" may be embedded in the text somewhere.  What you would have to do then is search for CHAR(10) + 'From:'

Nome other notes:
1.  Since the column is a Text datatype use DATALENGTH instead of LEN
2.  I am making the assumption all the "lines" end in CHAR(10), except for optionally the last.
3.  If you return Null instead of '' it will alert you to the fact that the "field" did not exist as opposed to being really empty.
0
 
dsteinschneiderAuthor Commented:
Thanks,
I agree with you three notes. The way you wrote the function makes it very clear. It works well except that each column has a leading space. I tried changing this:

Set @Value = SUBSTRING(@EmailText, @FromPos+1, @ToPos - @FromPos-2)

Looked good for the top 2000 rows but somewhere beyond that it resulted in the invalid parameter for substring error.

0
 
Anthony PerkinsCommented:
>>It works well except that each column has a leading space. <<
Right that is because there is a space after every "Field".  If you do not need the space, just change the query to (assuming that all "Fields" have a space at the end):
Select     ID,
     EmailText,
     dbo.udf_GetEmailText(EmailText, 'Date: '),
     dbo.udf_GetEmailText(EmailText, 'From: '),
     dbo.udf_GetEmailText(EmailText, 'Subject: '),
     dbo.udf_GetEmailText(EmailText, 'To: '),
     dbo.udf_GetEmailText(EmailText, 'Mime-Version: '),
     dbo.udf_GetEmailText(EmailText, 'Organization: ')
From     YourTableName
0
 
dsteinschneiderAuthor Commented:
I had tried already to modify the query by altering the second parameter but I still got the leading space.

I realized my last attempt to shift the @FromPos and @ToPos was faulty - here's the change that works:

Set @Value = SUBSTRING(@EmailText, @FromPos+1, @ToPos - (@FromPos+1))

Thanks again for your help - this is for a compliance report.
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

  • 3
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now