
Reading from textfile where vbcrlf is a line delimiter as well as embedded within the data

Hi,

I need urgent help reading from a text file that uses vbCrLf as the row delimiter (end of line). However, vbCrLf can sometimes also appear in the data within a row. This causes the reader to think it has encountered a new row even though it has not yet finished reading/parsing the full row.

I am using a pipe symbol "|" as the column delimiter. There are 51 columns in each row.

Any ideas?

The simplest would be to use a combo delimiter of vbCrLf and "|" to see if it has really reached the end of the row (this does not seem possible to do).

Another thing is to make sure that the 51 columns are parsed for each row/line even if the vbcrlf is encountered before the row is read/parsed to the end. However, I do not know how to do this. Concrete code example needed...

I need urgent help on this. Please see my code for details.

Many thanks.
Dim ZipToUnpack As String = fullzippath
        Dim TargetDir As String = Server.MapPath("uploads/")
        Dim ez As ZipEntry


        Using zip1 As ZipFile = ZipFile.Read(ZipToUnpack)
            
            ' based on entry name, size, date, etc.   
            For Each ez In zip1
                ez.Extract(TargetDir, ExtractExistingFileAction.OverwriteSilently)

                ' Once extracted, let us know
                dvResults.InnerHtml &= "<br />File <b>" & ez.FileName & "</b> has been extracted and uploaded." ' & TargetDir

                ' Open the file, using a streamreader
                Dim objSR As StreamReader = File.OpenText(TargetDir & ez.FileName)

                ' Read the file's contents into a variable
                Dim contents As String = objSR.ReadToEnd()

                ' Start: write Biodata CROSSREF to table
                If ez.FileName.Contains("Cross") Then

                    ' the array of rows
                    Dim arrRows() As String = contents.Split(vbCrLf) 'contents.Split("|" + vbCrLf)
                    Dim arrCells() As String


                    ' Loop through its contents
                    For Each a As Object In arrRows

                        Dim tbl As New Object
                        tbl = New tblBiodata_Cross_Reference_TEMP
                        dcMeheret.ExecuteCommand("Truncate table tblBiodata_Cross_Reference_TEMP")

                        ' Skip any line containing "GUID" (presumably the header row)
                        If a.Contains("GUID") Then Continue For

                        ' Split each row into the 3 columns
                        arrCells = a.Split("|")

                        ' Add each cell into the linq table row
                        With tbl
                            .Row = arrCells(0)
                            .Case_GUIDFrom = New Guid(arrCells(1))
                            .FullCaseNoFrom = arrCells(2)
                        End With

                        dcMeheret.tblBiodata_Cross_Reference_TEMPs.InsertOnSubmit(tbl)
                    Next
                End If
                ' End: write Biodata CROSSREF to table
                ' *** End: write biotable to respective tables. ***


                ' Close the streamreader
                objSR.Close()

            Next

            ' Save changes
            dcMeheret.SubmitChanges()

    End Sub


Commented:
Do the 'extra' vbCrLFs appear anywhere in the data delimited by the | symbols?
Éric Moreau, Senior .Net Consultant

Commented:
Commented:
==> The simplest would be to use a combo delimiter of vbcrlf and "|" to see if it has really reached
==> the end of the row (this does not seem possible to do).

With a couple of extra statements this can actually be done.  Here's a small example.  The basic idea is to convert the | CRLF combination to a single character that can then be split on with the Split() command.  Naturally you want to pick a new delimiter character that won't appear in the data.  Here's a sample bit of code to give you the idea:

Const ForReading = 1
Const Delim = "^"
Set objFSO = CreateObject("Scripting.FileSystemObject")
strData = objFSO.OpenTextFile("c:\temp\EE26252114.txt", ForReading).ReadAll
strData = Replace(strData, "|" & vbCRLF, Delim)
arrLines = Split(strData, Delim)
For Each strLine in arrLines
  wscript.echo "[" & strLine & "]"
Next

I ran this with the following input file:

1|2|3|4a
4b
4c|5|6|
1|2|3|4a
4b
4c|5|6|
1|2|3|4a
4b
4c|5|6|

And got the following output:

[1|2|3|4a
4b
4c|5|6]
[1|2|3|4a
4b
4c|5|6]
[1|2|3|4a
4b
4c|5|6]

While this may not look a lot different, it demonstrates that the input was correctly grouped: each record shows up in the output inside its own [ ] pair.  Naturally, the extra CRLFs I put in the fourth delimited field still cause line breaks when the output is displayed.  In your code, at the point where I do the ECHO of the input record, you would likely do a Split() on the "|" to get each field to process.
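
For the VB.NET code in the question, a rough equivalent might look like the sketch below. It relies on the String.Split overload that takes an array of string separators, so no substitute delimiter character is needed. The module name, the SplitRecords helper, and the sample data are made up for illustration, and it assumes (like the script above) that every row ends with a trailing "|".

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module RecordSplitDemo
    ' Hypothetical helper: only "|" immediately followed by CRLF ends a record,
    ' so a bare CRLF inside a field stays part of that field.
    Function SplitRecords(contents As String) As String()
        Dim rowEnd() As String = {"|" & vbCrLf}
        ' RemoveEmptyEntries drops the empty piece after the final "|" + CRLF.
        Return contents.Split(rowEnd, StringSplitOptions.RemoveEmptyEntries)
    End Function

    Sub Main()
        ' Made-up sample: two rows of six fields; field 4 contains line breaks.
        Dim row As String = "1|2|3|4a" & vbCrLf & "4b" & vbCrLf & "4c|5|6|" & vbCrLf
        Dim contents As String = row & row

        For Each record As String In SplitRecords(contents)
            ' Each record is now safe to split into its columns.
            Dim cells() As String = record.Split("|"c)
            Console.WriteLine("{0} fields, first = {1}, last = {2}",
                              cells.Length, cells(0), cells(cells.Length - 1))
        Next
    End Sub
End Module
```

One caveat with this sketch: a field that itself begins with a line break (so the data contains "|" directly followed by CRLF in the middle of a row) would still be mistaken for the end of a row.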

Hope this helps and makes sense; let me know if you have any questions.  I should note that this approach assumes a trailing delimiter after the last field of each row in your input file.  If that is not the case, then we have a problem.

~bp
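
If the rows have no trailing "|", the column-count idea from the question is another way to go: keep gluing physical lines together until the expected number of fields has been seen. The following is only a minimal sketch, assuming "|" never occurs inside a field and that the last column never contains a line break (otherwise the end of a row is ambiguous). ReadRecords, the module name, and the sample data are hypothetical and not taken from the original code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic

Module ColumnCountDemo
    ' Hypothetical helper: joins physical lines until a record holds at
    ' least expectedColumns pipe-delimited fields, then emits its cells.
    Function ReadRecords(lines As IEnumerable(Of String),
                         expectedColumns As Integer) As List(Of String())
        Dim records As New List(Of String())
        Dim buffer As String = Nothing

        For Each line As String In lines
            ' Ignore blank lines between records (e.g. after the final CRLF).
            If buffer Is Nothing AndAlso line.Length = 0 Then Continue For

            If buffer Is Nothing Then
                buffer = line
            Else
                ' Put back the line break that belonged to the field data.
                buffer = buffer & vbCrLf & line
            End If

            Dim cells() As String = buffer.Split("|"c)
            If cells.Length >= expectedColumns Then
                records.Add(cells)
                buffer = Nothing
            End If
        Next

        ' Leftover text means the last record never reached the column count.
        If buffer IsNot Nothing Then
            Throw New InvalidDataException("Incomplete final record: " & buffer)
        End If
        Return records
    End Function

    Sub Main()
        ' Made-up sample: one six-column row spread over three physical lines.
        Dim lines() As String = {"1|2|3|4a", "4b", "4c|5|6"}
        For Each cells As String() In ReadRecords(lines, 6)
            Console.WriteLine("{0} fields, last = {1}", cells.Length, cells(cells.Length - 1))
        Next
    End Sub
End Module
```

For the real file, the lines could come from File.ReadLines (on .NET 4 or later) or a StreamReader.ReadLine loop, with expectedColumns set to 51. Re-splitting the buffer on every line is wasteful for very long records, but it keeps the sketch short.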