Solved

Most efficient way to loop through a txt file and find a specific "Block of data" using VB

Posted on 2014-10-06
Last Modified: 2014-10-13
Hi experts,
I have a question about design and performance.

I need the most efficient way to loop through a txt file and find a specific "block of data",
which will then be inserted into a SQL table. (I'm fine with the SQL part.)
The file is at most 50 MB.
The data is well defined, and the file looks like this:

BEGIN DATA FOR ID XXXXXX
**ERRORS M(20) < 10 INDICATES THAT THE CELL IS SUSPECT**ERRORS**
***ERRORS  jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
***ERRORS**dlhagdlajgdlakjalkfjalflajgfljakgflagdljfgalkjdgf
END DATA OUTPUT FOR ID XXXXXX
Qotiqyrtpoqyptqptqw
etypqytqwetyqwyitqor
BEGIN DATA FOR ID YYYYYY
**ERRORS**jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
**ERRORS jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
**ERRORS**12424335353545454646464664
**ERRORS**vnvcbvxcm,vxcbxc,nvx,cb,xnvx,nb,xmnvx,cm
***ERRORS**
***ERRORS**nvnmnbmnnbnnmmn
END DATA OUTPUT FOR ID YYYYYY
Etc, etc…

I want to:
   1. Parse the ID
   2. Find each line with "ERRORS"
   3. Build a new record as (Id, Comment), like this:

Id          Comment  
xxxxxx **ERRORS M(20) < 10 INDICATES THAT THE CELL IS SUSPECT**ERRORS**
xxxxxx ***ERRORS  jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
xxxxxx ***ERRORS**dlhagdlajgdlakjalkfjalflajgfljakgflagdljfgalkjdgf
YYYYYY **ERRORS**jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
YYYYYY **ERRORS jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd;ghds;ghsdfhg;dfhgsfd
YYYYYY **ERRORS**12424335353545454646464664
YYYYYY **ERRORS**vnvcbvxcm,vxcbxc,nvx,cb,xnvx,nb,xmnvx,cm
YYYYYY ***ERRORS**
YYYYYY ***ERRORS**nvnmnbmnnbnnmmn


Code where I need help (with another loop):
Do Until EOF(1)
    Line Input #1, MyTextLine
    LineNo = LineNo + 1
    If Mid(MyTextLine, 1, 17) = "BEGIN DATA FOR ID" Then
        ID = Mid(MyTextLine, 19, 6)
        Debug.Print ID

        ''' need another loop to find the lines with "ERRORS"

    End If
Loop


I hope this is enough information.
Looking forward to a very elegant solution.
Question by:PDF
17 Comments
 
LVL 47

Expert Comment

by:Martin Liss
ID: 40364834
Unless I misunderstand the question, you don't need a second loop.

Dim FF As Integer
Dim MyTextLine As String
Dim ID As String

FF = FreeFile

Open "C:\temp\errors.txt" For Input As #FF
Do Until EOF(FF)
    Line Input #FF, MyTextLine
    'LineNo = LineNo + 1
    Select Case True
        Case Mid(MyTextLine, 1, 17) = "BEGIN DATA FOR ID"
            ID = Mid(MyTextLine, 19, 6)
            Debug.Print ID
        Case InStr(1, "*ERRORS", MyTextLine) > 0   ' arguments reversed here; corrected in the follow-up comment
            ' write error to SQL
    End Select
Loop
Close #FF

 
LVL 47

Accepted Solution

by:
Martin Liss earned 500 total points
ID: 40364838
Line 15 in my code above should be Case InStr(1, MyTextLine, "*ERRORS") > 0

And for consistency's sake, line 12 (the Case Mid test) could also be written as follows. InStr is also faster than Mid. (Here's an article on the subject.)

Case InStr(1, MyTextLine, "BEGIN DATA FOR ID") > 0
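
The correction above comes down to argument order: InStr takes the string being searched first and the text being looked for second. A minimal sketch of that, assuming a hypothetical Sub name and made-up sample lines; it also shows that a Mid prefix test and an InStr test are not strictly equivalent, since InStr matches anywhere in the line:

```vb
Sub DemoInStrVersusMid()
    ' Hypothetical sample line, modeled on the file in the question.
    Dim sampleLine As String
    sampleLine = "***ERRORS**nvnmnbmnnbnnmmn"

    ' InStr(start, stringToSearch, stringToFind) returns the 1-based
    ' position of the first match, or 0 when there is no match.
    Debug.Print InStr(1, sampleLine, "*ERRORS")   ' 3
    Debug.Print InStr(1, "*ERRORS", sampleLine)   ' 0: the long line is not inside "*ERRORS"

    ' Mid(...) = "..." is True only when the line starts with the text;
    ' InStr(...) > 0 is True when the text appears anywhere in the line.
    Dim otherLine As String
    otherLine = "SEE BEGIN DATA FOR ID ABOVE"     ' hypothetical line
    Debug.Print Mid(otherLine, 1, 17) = "BEGIN DATA FOR ID"    ' False
    Debug.Print InStr(1, otherLine, "BEGIN DATA FOR ID") > 0   ' True
End Sub
```

If the marker can only ever appear at the start of a line, both tests behave the same on this file. The distinction matters because ID = Mid(MyTextLine, 19, 6) assumes the marker begins at position 1.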
 

Author Closing Comment

by:PDF
ID: 40365028
MartinLiss,

Thank you for your prompt reply.
Works Perfectly!

I replaced the line
    Case InStr(1, MyTextLine, "*ERRORS") > 0
with
    Case InStr(MyTextLine, "ERRORS") > 0
   
Thanks again!
Sincerely
PDF

 
LVL 47

Expert Comment

by:Martin Liss
ID: 40365038
OK. With or without the "1", the search starts at position one because that's the default, though I think it's better to include the value explicitly. In any case, you're welcome, and I'm glad I was able to help.
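
A short sketch of the point about the default start position, assuming a hypothetical Sub name and sample line. It also shows one case where the start value cannot be left out: when a compare argument such as vbTextCompare is supplied.

```vb
Sub DemoInStrStartArgument()
    Dim sampleLine As String
    sampleLine = "**errors mixed CASE"   ' hypothetical line

    ' With no start argument the search begins at position 1,
    ' so these two calls return the same value.
    Debug.Print InStr(sampleLine, "errors") = InStr(1, sampleLine, "errors")   ' True

    ' Supplying a compare argument makes the start argument mandatory.
    ' vbTextCompare requests a case-insensitive search.
    Debug.Print InStr(1, sampleLine, "ERRORS", vbTextCompare)   ' 3
End Sub
```

Without a compare argument, case sensitivity follows the module's Option Compare setting, so passing vbTextCompare is one way to make the intent explicit.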

In my profile you'll find links to some articles I've written that may interest you.
Marty - MVP 2009 to 2014
 

Author Comment

by:PDF
ID: 40375739
Hi MartinLiss,

Just to be 100% sure that I capture all lines in the block of data
that starts with "****errors", I would like to use a loop.
If the outputs are identical, then I can use the accepted solution.
(This covers the case where a line doesn't start with ****errors but is still part of the block.)
Thank you very much, and thank you for the link.


I appreciate you a lot!
 
LVL 47

Expert Comment

by:Martin Liss
ID: 40375744
Just to be 100% sure that I capture all lines in the block of data
that starts with "****errors", I would like to use a loop.
If the outputs are identical, then I can use the accepted solution.
(This covers the case where a line doesn't start with ****errors but is still part of the block.)
Thank you very much, and thank you for the link.

I'm sorry but I don't know if you are asking me a question.

In any case you're welcome and I'm glad I was able to help.

 

Author Comment

by:PDF
ID: 40375779
Well,
my question was how "to loop through a txt file and find specific 'Block of data'".
I accepted your comment, but
I want to be absolutely certain that I have all the lines, and I would like to use a loop.
 
LVL 47

Expert Comment

by:Martin Liss
ID: 40375809
Trust me, there's no need to use a second loop. The code as is looks at every line in the file and...

If the line contains "BEGIN DATA FOR ID" it stores the ID
If the line contains "ERRORS" it writes the line to SQL (once you add the proper code to do that)
All other lines are ignored.
 

Author Comment

by:PDF
ID: 40375822
Right, a line that does not start with "ERRORS" is ignored.

Unfortunately, there can be lines that do not start with "ERRORS" but are still part of the block:
 

BEGIN DATA FOR ID XXXXXX
ryyyyywerty
ssgfshghgd
afdgfdsgfh
aaaaaaaaaaaaaaaaadg
**ERRORS-----   Start of block
ggaewgleealf   ---- need to be part of sql
 ***ERRORS  jgkhdfghdkjfhghg;dsfh;ghs;ghsd;ghsd
ahgjajgfdakjkas   ----need to be part of sql
***ERRORS**dlhagdlajgdlakjalkfjalflajgfljakgflagdljfgalkjdgf
END DATA OUTPUT FOR ID XXXXXX
 
LVL 47

Expert Comment

by:Martin Liss
ID: 40375848
So are you saying that everything between the "BEGIN DATA FOR ID XXXXXX" line and the "END DATA OUTPUT FOR ID XXXXXX" line is part of the same block? If not, then using the same lines, please show what each block should contain.
 

Author Comment

by:PDF
ID: 40375866
The block of data for each entry starts as:
"BEGIN DATA FOR ID XXXXXX"
data
data
data
100 more lines of data
**ERRORS-----   Start of Errors block
 ***ERRORS**dlha
 ***ERRORS**dlha
Data
Data
 ***ERRORS**dlha
Data
End:"END DATA OUTPUT FOR ID XXXXXX"

I need all lines between **ERRORS and "END DATA OUTPUT FOR ID XXXXXX"
 
LVL 47

Expert Comment

by:Martin Liss
ID: 40375915
Just to be 100% clear, do you mean these lines?

**ERRORS-----   Start of Errors block
 ***ERRORS**dlha
 ***ERRORS**dlha
Data
Data
 ***ERRORS**dlha
Data

or do you mean these lines?

Data
Data
Data
 

Author Comment

by:PDF
ID: 40375919
All lines:

**ERRORS-----   Start of Errors block
  ***ERRORS**dlha
  ***ERRORS**dlha
 Data
 Data
  ***ERRORS**dlha
 Data
End:"END DATA OUTPUT FOR ID XXXXXX"
 
LVL 47

Expert Comment

by:Martin Liss
ID: 40375992
Try this.

Dim FF As Integer
Dim MyTextLine As String
Dim ID As String
Dim bErrorFound As Boolean

FF = FreeFile

Open "C:\temp\errors.txt" For Input As #FF
Do Until EOF(FF)
    Line Input #FF, MyTextLine
    Select Case True
        Case InStr(1, UCase(MyTextLine), "BEGIN DATA FOR ID") > 0
            ID = Mid(MyTextLine, 19, 6)
            bErrorFound = False
            Debug.Print ""
            Debug.Print ID
        Case InStr(1, UCase(MyTextLine), "END DATA") > 0
            bErrorFound = False
            ' Write this line to SQL
            Debug.Print "written to SQL: " & MyTextLine
        Case Else
            If InStr(1, UCase(MyTextLine), "ERRORS") > 0 Or bErrorFound Then
                ' write error to SQL
                Debug.Print "written to SQL: " & MyTextLine
                bErrorFound = True
            End If
    End Select
Loop
Close #FF
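
The loop above prints each matching line but does not yet pair it with its ID, which the question asked for as (Id, Comment). A minimal sketch of that pairing follows, using the same flag logic but working on an in-memory array so it can be tried without a file. The names CollectErrorLines and DemoCollectErrorLines, the tab separator, and the sample lines are all assumptions made for illustration; vbTextCompare stands in for the UCase calls.

```vb
' Hypothetical helper: returns a Collection of "ID<tab>line" strings.
Function CollectErrorLines(lines As Variant) As Collection
    Dim result As New Collection
    Dim currentId As String
    Dim inErrors As Boolean
    Dim textLine As String
    Dim i As Long

    For i = LBound(lines) To UBound(lines)
        textLine = lines(i)
        If InStr(1, textLine, "BEGIN DATA FOR ID", vbTextCompare) > 0 Then
            ' New block: remember its ID and reset the flag.
            currentId = Mid(textLine, 19, 6)
            inErrors = False
        ElseIf InStr(1, textLine, "END DATA", vbTextCompare) > 0 Then
            ' As in the loop above, the END line itself is always kept.
            result.Add currentId & vbTab & textLine
            inErrors = False
        ElseIf inErrors Or InStr(1, textLine, "ERRORS", vbTextCompare) > 0 Then
            ' First ERRORS line turns the flag on; every later line
            ' in the block is kept until END DATA.
            inErrors = True
            result.Add currentId & vbTab & textLine
        End If
    Next i

    Set CollectErrorLines = result
End Function

Sub DemoCollectErrorLines()
    Dim sample As Variant
    Dim item As Variant

    ' Made-up sample block in the shape described in the thread.
    sample = Array("BEGIN DATA FOR ID XXXXXX", "data", _
                   "**ERRORS-----   Start of Errors block", "Data", _
                   "END DATA OUTPUT FOR ID XXXXXX")

    For Each item In CollectErrorLines(sample)
        Debug.Print item
    Next item
End Sub
```

In a real run, each collected entry would be split on the tab and passed to whatever SQL insert code is already in place; that part is left out here because the thread does not show it.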

 

Author Comment

by:PDF
ID: 40376089
Thank you for posting code.
 I'll get back to you tomorrow.
 

Author Comment

by:PDF
ID: 40377629
EXCELLENT !!!

Thank you, MartinLiss for your help!
 
LVL 47

Expert Comment

by:Martin Liss
ID: 40377640
You're welcome.