mondintator
VB.net - What is the fastest way to read a text file?
I need to read a text file line by line and pull different pieces of text from each line.
At the moment I'm using objreader line by line. Is this the fastest way to be able to read specific information from each line from the entire file?
Dim FileName As String = [myfilename]
Dim objreader As New System.IO.StreamReader(FileName)
Do While objreader.Peek() <> -1
TextLine = objreader.ReadLine() & vbNewLine
Line_Number = Line_Number + 1
Loop
objreader.Close()
Here is an example using File.ReadAllLines:
' Open the file to read from.
Dim path as string = "C:\sample.txt"
Dim readText() As String = File.ReadAllLines(path)
Dim s As String
For Each s In readText
Console.WriteLine(s)
Next
How big is your text file, how much data will it contain in worst case?
ASKER
I use ReadAllLines first so that I can show progress while the file is being read. But as I read each line of the file, I need to pick out specific strings from it and put them into a table.
How do you pick the strings from each line to put into the table? Is there a separator in each line that you split on?
ASKER
No, I pick up the text by its position using Mid(TextLine, starting number, length).
How big is your text file, and how much data will it contain in the worst case? Do you think it is currently taking a lot of time?
ASKER
Text files can be huge - usually under 250,000KB, but could be bigger
OK, that's around 250 MB.
Coming back to your previous comment, can you please explain what you observed when you used File.ReadAllLines?
You can use a BackgroundWorker thread to read all the lines and extract the parts of each string, using Mid or by position, into a table.
Until that operation has finished, you can show the progress in your UI.
Do you see any issues with this approach?
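A minimal sketch of the BackgroundWorker idea above, not a definitive implementation. It assumes a WinForms form whose designer already contains a Button named Button1 and a ProgressBar named ProgressBar1 (both names are placeholders), and the file path is hypothetical. Progress is estimated from the position of the underlying stream rather than from a line count, so the file is only read once; because StreamReader buffers its reads, the percentage advances in steps.

```vbnet
Imports System.ComponentModel
Imports System.IO

Public Class Form1
    ' WithEvents lets the Handles clauses below bind to the worker's events.
    Private WithEvents worker As New BackgroundWorker()

    ' Hypothetical path; replace with the real input file.
    Private Myfilename As String = "C:\PathToFile\FilenameHere.txt"

    Private Sub Button1_Click(ByVal sender As Object, ByVal e As EventArgs) Handles Button1.Click
        worker.WorkerReportsProgress = True
        If Not worker.IsBusy Then
            worker.RunWorkerAsync(Myfilename)
        End If
    End Sub

    ' Runs on a background thread, so it must not touch any controls.
    Private Sub worker_DoWork(ByVal sender As Object, ByVal e As DoWorkEventArgs) Handles worker.DoWork
        Dim fileName As String = CStr(e.Argument)
        Dim lastPercent As Integer = -1

        Using reader As New StreamReader(fileName)
            Dim total As Long = reader.BaseStream.Length
            Dim line As String = reader.ReadLine()

            While line IsNot Nothing
                ' Extract the fixed-position fields here, e.g. Mid(line, 46, 11).

                ' Approximate progress from how far the stream has been consumed.
                Dim percent As Integer = 100
                If total > 0 Then
                    percent = CInt(reader.BaseStream.Position * 100 \ total)
                End If

                ' Only report when the whole-number percentage changes.
                If percent <> lastPercent Then
                    lastPercent = percent
                    worker.ReportProgress(percent)
                End If

                line = reader.ReadLine()
            End While
        End Using
    End Sub

    ' Raised on the UI thread, so updating the ProgressBar here is safe.
    Private Sub worker_ProgressChanged(ByVal sender As Object, ByVal e As ProgressChangedEventArgs) Handles worker.ProgressChanged
        ProgressBar1.Value = e.ProgressPercentage
    End Sub
End Class
```

Reporting only when the percentage changes keeps the UI thread from being flooded with one event per line on a file with millions of lines.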
ASKER
I only use ReadAllLines to get the total number of lines in the file, purely so I can track progress as I read line by line.
Can I use ReadAllLines and then read parts of each string using Mid? Is this faster than ReadLine?
ReadAllLines returns an array of strings, and the size of the array is the number of lines. Yes, I would expect that reading all the lines with ReadAllLines and then extracting each part with Mid (in a For Each over the string array) would be faster than reading the file line by line. How much faster is something you would have to measure.
Would it be possible for you to provide a sample of your text file, along with the logic you use to split each line into the parts that go into the table?
I normally do this exercise myself (log start time and end time) to determine which is faster.
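The timing exercise described above could look something like the following sketch. It only measures the raw read, not the Mid extraction; the file path and the helper names ReadWithStreamReader and ReadWithReadAllLines are hypothetical. It sticks to calls that exist in .NET Framework 3.5.

```vbnet
Imports System.Diagnostics
Imports System.IO

Module ReadTiming
    ' Reads line by line with a StreamReader and returns the number of lines seen.
    Function ReadWithStreamReader(ByVal filePath As String) As Integer
        Dim count As Integer = 0
        Using reader As New StreamReader(filePath)
            Dim line As String = reader.ReadLine()
            While line IsNot Nothing
                count += 1
                line = reader.ReadLine()
            End While
        End Using
        Return count
    End Function

    ' Loads the whole file into a string array and returns the number of lines.
    Function ReadWithReadAllLines(ByVal filePath As String) As Integer
        Return File.ReadAllLines(filePath).Length
    End Function

    Sub Main()
        ' Hypothetical path; point this at a representative input file.
        Dim filePath As String = "C:\sample.txt"

        Dim sw As Stopwatch = Stopwatch.StartNew()
        Dim n1 As Integer = ReadWithStreamReader(filePath)
        sw.Stop()
        Console.WriteLine("StreamReader: {0} lines in {1} ms", n1, sw.ElapsedMilliseconds)

        sw = Stopwatch.StartNew()
        Dim n2 As Integer = ReadWithReadAllLines(filePath)
        sw.Stop()
        Console.WriteLine("ReadAllLines: {0} lines in {1} ms", n2, sw.ElapsedMilliseconds)
    End Sub
End Module
```

Whichever method runs second benefits from the operating system's file cache, so it is worth running each measurement several times, and in both orders, before drawing a conclusion.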
ASKER
Public Function Read_File()
Read_Ahead()
Line_Number = 0
Dim FileName As String = Myfilename
Dim objreader As New System.IO.StreamReader(FileName)
Do While objreader.Peek() <> -1
TextLine = objreader.ReadLine() & vbNewLine
Line_Number = Line_Number + 1
MyCroppedField1 = Mid(TextLine, 46, 11)
MyCroppedField2 = Mid(TextLine, 62, 6)
MyCroppedField3 = Mid(TextLine, 73, 2)
Loop
objreader.Close()
Return True
End Function
Private Sub Read_Ahead()
Dim FileName As String = Myfilename
Dim Total_Recs_ReadAhead As Integer = System.IO.File.ReadAllLines(FileName).Length
Total_recs = Total_Recs_ReadAhead
End Sub
Thanks for sharing. Give me some time and I shall get back to you with my findings.
Can you also please clarify the following?
1) Which .NET Framework version are you targeting? Will your process (exe) be 64-bit or 32-bit?
2) What is your target deployment environment? I'm looking for information such as which processor, how many cores, how much RAM and which OS.
ASKER
vb.net. 64 bit. Not sure about target deployment environment / processors, ram etc. Windows 7 or 10.
OK, thanks. Would it be .NET Framework 3.5, 4.0 or 4.5?
FYI: to see which .NET Framework version the current project targets, right-click the project, choose Properties, open the Compile tab, and click Advanced Compile Options.
ASKER
3.5
I would use something like:
Sub Read_File()
Dim lines = File.ReadAllLines(MyfileName)
Total_recs = lines.Count
Dim parsed = (From line In lines Select New With {.Field1 = Mid(line, 46, 11), .Field2 = Mid(line, 62, 6), .Field3 = Mid(line, 73, 2)})
End Sub
The second read of the file is not needed, because you already know the line count from the first ReadAllLines call; you simply need to combine both methods. The LINQ query lets you parse the lines into a readily readable Enumerable.
Proof of concept:
Imports System.IO
Module Module1
Sub Main()
Dim lines = File.ReadAllLines("EE_Q28908699.txt")
Console.WriteLine("File has {0} lines.", lines.Count)
Dim parsed = (From line In lines Select New With {.Field1 = Mid(line, 46, 11), .Field2 = Mid(line, 62, 6), .Field3 = Mid(line, 73, 2)})
For Each line In parsed
Console.WriteLine(line)
Next
Console.ReadLine()
End Sub
End Module
-saige-
Simply read the whole file and use the Split method on the result to split it into an array of strings, using the line break as the separator. Each element of the array will be a line in the file.
Dim lines() As String
Using reader As New StreamReader("C:\Users\Jacques\Desktop\Test.txt")
' Split on whole line breaks. Passing Environment.NewLine on its own relies on an
' implicit String-to-Char conversion and does not treat CR+LF as one separator.
lines = reader.ReadToEnd().Split(New String() {vbCrLf, vbLf}, StringSplitOptions.None)
End Using
For Each line As String In lines
'Do what you want with the line
Debug.WriteLine(line)
Next
Hi mondintator;
As stated above, reading through the file more than once will cause longer processing time, so read it once and cache the data. The sample code below will help you do that, and also lets you track which line the processing is up to.
Imports System.IO
Imports System.Linq
Public Class Form1
'' update file location on the next line
Dim Myfilename As String = "C:\PathToFile\FilenameHere.txt"
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim Line_Number = 0
Dim dataFields As New List(Of MyData)
'' Lambda Expression to process the input file, this line of code is not executed
'' until it is called below.
Dim processFile As Action(Of String) =
Sub(inputFile)
'' Get the file name from the input parameter
Dim FileName As String = inputFile
'' Used to hold all the input lines from the file as a list of strings
Dim Lines As List(Of String)
'' Using statement Opens and Close and Dispose the file when finished
Using objReader As New StreamReader(FileName)
'' Splits on carriage returns and line feeds and drops empty entries, so blank lines are removed too
Lines = objReader.ReadToEnd().Split(New Char() {ControlChars.Cr, ControlChars.Lf}, StringSplitOptions.RemoveEmptyEntries).ToList()
End Using
'' Parses the line as needed
For Each line In Lines
'' you can track progress and update progress in the For Each loop.
'' I used a custom class but you can use your table object in its place
Dim lineData As New MyData
'' Substring is zero-based (Mid is one-based) and throws if the line is shorter than 74 characters
lineData.Field1 = line.Substring(45, 11)
lineData.Field2 = line.Substring(61, 6)
lineData.Field3 = line.Substring(72, 2)
'' Add the parsed data to the list
dataFields.Add(lineData)
Next
End Sub
'' Runs the above lambda expression
processFile(Myfilename)
End Sub
End Class
'' Holds the fields parsed from one line of the file.
Public Class MyData
Public Property Field1 As String
Public Property Field2 As String
Public Property Field3 As String
End Class
ASKER
Top class once again
There is also the TextFieldParser class, which helps if your file is comma-separated, semicolon-separated, or uses any other separator.
There is also a way to perform the read asynchronously, so that it happens on a separate thread while the main thread carries on with its normal UI work. But if other operations in your app depend on the file having been read, you can do the read at startup.
So the question is what you want to do after reading all the lines. If you need to parse each string further using a separator, you can use TextFieldParser. If the work can happen in parallel, it can be done on a separate thread.
Otherwise you can use File.ReadAllLines.
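For the separator case mentioned above, a minimal TextFieldParser sketch might look like this. It assumes a comma-separated input, which is not the asker's fixed-position layout, and the file path is hypothetical.

```vbnet
Imports Microsoft.VisualBasic.FileIO

Module DelimitedRead
    Sub Main()
        ' Hypothetical comma-separated file; replace with a real path.
        Dim filePath As String = "C:\sample.csv"

        Using parser As New TextFieldParser(filePath)
            ' Treat each line as a list of fields separated by commas.
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")

            While Not parser.EndOfData
                ' ReadFields returns the fields of the next line as a string array.
                Dim fields As String() = parser.ReadFields()
                If fields.Length > 0 Then
                    Console.WriteLine("{0} fields, first = {1}", fields.Length, fields(0))
                End If
            End While
        End Using
    End Sub
End Module
```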