Solved

Most efficient way to loop through a txt file and find a specific "Block of data" using VB

Posted on 2014-10-06
Last Modified: 2014-10-13
Hi experts,
I have a question about design and performance.

I need the most efficient way to loop through a txt file and find a specific "block of data",
which will then be inserted into an SQL table. (I'm fine with the SQL part.)
The file size is at most 50 MB.
The data is well-defined, and the file looks like this:

BEGIN DATA FOR ID XXXXXX
**ERRORS M(20) < 10 INDICATES THAT THE CELL IS SUSPECT**ERRORS**
***ERRORS  jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
***ERRORS**dlhagdlajgdlakjalkfjalflajgfljakgflagdljfgalkjdgf
END DATA OUTPUT FOR ID XXXXXX
Qotiqyrtpoqyptqptqw
etypqytqwetyqwyitqor
BEGIN DATA FOR ID YYYYYY
**ERRORS**jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
**ERRORS jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
**ERRORS**12424335353545454646464664
**ERRORS**vnvcbvxcm,vxcbxc,nvx,cb,xnvx,nb,xmnvx,cm
***ERRORS**
***ERRORS**nvnmnbmnnbnnmmn
END DATA OUTPUT FOR ID YYYYYY
Etc, etc…

I want to:
   1. Parse the ID
   2. Find the lines containing "ERRORS"
   3. Build a new record (Id, Comment) for each of those lines, like:

Id          Comment  
xxxxxx **ERRORS M(20) < 10 INDICATES THAT THE CELL IS SUSPECT**ERRORS**
xxxxxx ***ERRORS  jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
xxxxxx ***ERRORS**dlhagdlajgdlakjalkfjalflajgfljakgflagdljfgalkjdgf
YYYYYY **ERRORS**jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
YYYYYY **ERRORS jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
YYYYYY **ERRORS**12424335353545454646464664
YYYYYY **ERRORS**vnvcbvxcm,vxcbxc,nvx,cb,xnvx,nb,xmnvx,cm
YYYYYY ***ERRORS**
YYYYYY ***ERRORS**nvnmnbmnnbnnmmn


Code where I need help (with another loop):
Do Until EOF(1)
    Line Input #1, MyTextLine
    LineNo = LineNo + 1
    If Mid(MyTextLine, 1, 17) = "BEGIN DATA FOR ID" Then
        ID = Mid(MyTextLine, 19, 6)
        Debug.Print ID

        ' Need another loop here to find the lines with "ERRORS"

    End If
Loop


I hope this is enough information.
Looking forward to a very elegant solution.
Question by:PDF

Expert Comment

by:Martin Liss
Unless I misunderstand the question, you don't need a second loop.

Dim FF As Integer
Dim MyTextLine As String
Dim ID As String

FF = FreeFile

Open "C:\temp\errors.txt" For Input As #FF
Do Until EOF(FF)
    Line Input #FF, MyTextLine
    'LineNo = LineNo + 1
    Select Case True
        Case Mid(MyTextLine, 1, 17) = "BEGIN DATA FOR ID"
            ID = Mid(MyTextLine, 19, 6)
            Debug.Print ID
        Case InStr(1, "*ERRORS", MyTextLine) > 0   ' arguments reversed here; corrected in the next comment
            ' write error to SQL
    End Select
Loop
Close #FF


Accepted Solution

by:
Martin Liss earned 500 total points
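The second Case in my code above (the "*ERRORS" test) has its InStr arguments reversed. It should be:

Case InStr(1, MyTextLine, "*ERRORS") > 0

And for consistency's sake, the first Case (the Mid test) could also be written as follows. InStr is also faster than Mid. (Here's an article on the subject.) Note that InStr(...) > 0 matches the text anywhere in the line, whereas the Mid test matches it only at the start of the line.

Case InStr(1, MyTextLine, "BEGIN DATA FOR ID") > 0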
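
To make the argument order concrete, here is a small sketch; the procedure name InStrDemo and the sample strings are made up for illustration. InStr takes the start position, then the string being searched, then the string being sought. Comparing the result with 1 instead of 0 restricts the match to the start of the line, which is what the original Mid test did.

```vb
Sub InStrDemo()
    Dim sampleLine As String
    sampleLine = "***ERRORS**nvnmnbmnnbnnmmn"   ' hypothetical sample line

    ' Correct order: InStr(start, stringSearched, stringSought)
    Debug.Print InStr(1, sampleLine, "*ERRORS")   ' prints 3

    ' Reversed order: looks for the whole line inside "*ERRORS"
    Debug.Print InStr(1, "*ERRORS", sampleLine)   ' prints 0

    ' "= 1" only accepts a match at the very start of the line
    Debug.Print InStr(1, "BEGIN DATA FOR ID XXXXXX", "BEGIN DATA FOR ID") = 1   ' prints True
    Debug.Print InStr(1, "xx BEGIN DATA FOR ID", "BEGIN DATA FOR ID") = 1       ' prints False
End Sub
```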

Author Closing Comment

by:PDF
MartinLiss,

Thank you for your prompt reply.
Works Perfectly!

I replaced the line
    Case InStr(1, MyTextLine, "*ERRORS") > 0
with
    Case InStr(MyTextLine, "ERRORS") > 0
   
Thanks again!
Sincerely
PDF

Expert Comment

by:Martin Liss
OK. With or without the "1", the search starts at position one because that's the default, although I think it's better to include the value explicitly. In any case, you're welcome, and I'm glad I was able to help.

In my profile you'll find links to some articles I've written that may interest you.
Marty - MVP 2009 to 2014

Author Comment

by:PDF
Hi MartinLiss,

Just to be 100% sure that I have all the lines in that block of data
that start with "****errors", I would like to use a loop.
If the outputs are identical, then I can use the accepted solution.
(This is in case a line doesn't start with ****errors but is still part of the block.)
Thank you very much, and thank you for the link.

I appreciate you a lot!

Expert Comment

by:Martin Liss
Just to be 100% sure that I have all the lines in that block of data
that start with "****errors", I would like to use a loop.
If the outputs are identical, then I can use the accepted solution.
(This is in case a line doesn't start with ****errors but is still part of the block.)
Thank you very much, and thank you for the link.

I'm sorry but I don't know if you are asking me a question.

In any case you're welcome and I'm glad I was able to help.

In my profile you'll find links to some articles I've written that may interest you.
Marty - MVP 2009 to 2014

Author Comment

by:PDF
Well,
my question was how "to loop through a txt file and find a specific "Block of data"".
I accepted your comment, BUT
I want to be absolutely certain that I have all the lines, and I would like to use a loop.

Expert Comment

by:Martin Liss
Trust me, there's no need to use a second loop. The code as is looks at every line in the file and...

If the line contains "BEGIN DATA FOR ID", it stores the ID.
If the line contains "ERRORS", it writes the line to SQL (after you add the proper code to do that).
All other lines are ignored.
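
The single-pass logic described above can be sketched as a function that pairs each "ERRORS" line with the most recently seen ID, giving the (Id, Comment) records the question asks for. The names CollectErrorRecords and DemoCollect, the Tab separator, and passing the lines in as an array are all assumptions made so the sketch can be tried without a file or a database.

```vb
' Returns a Collection of "ID<Tab>Comment" strings (hypothetical record format).
Function CollectErrorRecords(lines As Variant) As Collection
    Dim records As New Collection
    Dim currentId As String
    Dim i As Long

    For i = LBound(lines) To UBound(lines)
        If InStr(1, lines(i), "BEGIN DATA FOR ID") = 1 Then
            ' Remember the ID until the next BEGIN line replaces it
            currentId = Mid$(lines(i), 19, 6)
        ElseIf InStr(1, lines(i), "ERRORS") > 0 Then
            ' Pair the line with the ID seen most recently
            records.Add currentId & vbTab & lines(i)
        End If
        ' All other lines are ignored
    Next i

    Set CollectErrorRecords = records
End Function

Sub DemoCollect()
    Dim recs As Collection
    Set recs = CollectErrorRecords(Array( _
        "BEGIN DATA FOR ID XXXXXX", _
        "**ERRORS**abc", _
        "some other line", _
        "END DATA OUTPUT FOR ID XXXXXX"))
    Debug.Print recs.Count   ' prints 1
    Debug.Print recs(1)      ' prints XXXXXX, a Tab, then **ERRORS**abc
End Sub
```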

Author Comment

by:PDF
Right, a line that does not start with "ERRORS" is ignored, but it can still be part of the block.

Unfortunately, there can be lines that do not start with "ERRORS" but are part of the block:

BEGIN DATA FOR ID XXXXXX
ryyyyywerty
ssgfshghgd
afdgfdsgfh
aaaaaaaaaaaaaaaaadg
**ERRORS-----   Start of block
ggaewgleealf   ---- need to be part of sql
 ***ERRORS  jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd
ahgjajgfdakjkas   ----need to be part of sql
***ERRORS**dlhagdlajgdlakjalkfjalflajgfljakgflagdljfgalkjdgf
END DATA OUTPUT FOR ID XXXXXX

Expert Comment

by:Martin Liss
So are you saying that everything between the "BEGIN DATA FOR ID XXXXXX" line and the "END DATA OUTPUT FOR ID XXXXXX" line is part of the same block? If not, then using the same lines, please show what each block should contain.

Author Comment

by:PDF
The block of data for each entry starts as:
"BEGIN DATA FOR ID XXXXXX"
data
data
data
100 more lines of data
**ERRORS-----   Start of Errors block
 ***ERRORS**dlha
 ***ERRORS**dlha
Data
Data
 ***ERRORS**dlha
Data
End:"END DATA OUTPUT FOR ID XXXXXX"

I need all lines from the first **ERRORS line through "END DATA OUTPUT FOR ID XXXXXX".

Expert Comment

by:Martin Liss
Just to be 100% clear, do you mean these lines?

**ERRORS-----   Start of Errors block
 ***ERRORS**dlha
 ***ERRORS**dlha
Data
Data
 ***ERRORS**dlha
Data

or do you mean these lines?

Data
Data
Data

Author Comment

by:PDF
All lines:

**ERRORS-----   Start of Errors block
  ***ERRORS**dlha
  ***ERRORS**dlha
 Data
 Data
  ***ERRORS**dlha
 Data
End:"END DATA OUTPUT FOR ID XXXXXX"

Expert Comment

by:Martin Liss
Try this.

Dim FF As Integer
Dim MyTextLine As String
Dim ID As String
Dim bErrorFound As Boolean

FF = FreeFile

Open "C:\temp\errors.txt" For Input As #FF
Do Until EOF(FF)
    Line Input #FF, MyTextLine
    Select Case True
        Case InStr(1, UCase(MyTextLine), "BEGIN DATA FOR ID") > 0
            ID = Mid(MyTextLine, 19, 6)
            bErrorFound = False
            Debug.Print ""
            Debug.Print ID
        Case InStr(1, UCase(MyTextLine), "END DATA") > 0
            bErrorFound = False
            ' Write this line to SQL
            Debug.Print "written to SQL: " & MyTextLine
        Case Else
            If InStr(1, UCase(MyTextLine), "ERRORS") > 0 Or bErrorFound Then
                ' write error to SQL
                Debug.Print "written to SQL: " & MyTextLine
                bErrorFound = True
            End If
    End Select
Loop
Close #FF
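
Since the original question asked about efficiency for files of up to 50 MB, one commonly used alternative to Line Input is to read the whole file into a string with a single binary Get and then Split it into lines. This is only a sketch and was not proposed in the thread: the function name ReadAllLines is made up, and it assumes the file uses CR+LF line endings and fits comfortably in memory. Whether it is actually faster for a given file is worth measuring.

```vb
' Reads an entire text file at once and returns its lines as a String array.
' Assumes CR+LF line endings (an assumption, not stated in the thread).
Function ReadAllLines(ByVal filePath As String) As String()
    Dim fileNum As Integer
    Dim buffer As String

    fileNum = FreeFile
    Open filePath For Binary Access Read As #fileNum
    buffer = Space$(LOF(fileNum))   ' pre-size the buffer to the file length
    Get #fileNum, , buffer          ' fill the buffer in one call
    Close #fileNum

    ReadAllLines = Split(buffer, vbCrLf)
End Function
```

The returned array can be walked with a For loop from LBound to UBound, applying the same Select Case logic to each element. If the file ends with a line break, the last element is an empty string, which that logic simply ignores.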


Author Comment

by:PDF
Thank you for posting code.
 I'll get back to you tomorrow.

Author Comment

by:PDF
EXCELLENT !!!

Thank you, MartinLiss for your help!

Expert Comment

by:Martin Liss
You're welcome.