The Effective Way to Find the Last Line Content of a Large Text File

Ryan Chong
CERTIFIED EXPERT
(NIV) Hebrews 10:35 So do not throw away your confidence; it will be richly rewarded.
Published:
Edited by: Andrew Leniart
Data can be kept in different media, Sometimes, the data need to be extracted, transformed or loaded in different ways.

For this article, I'm going to demonstrate an effective way to find the last line content of a large text file using VBA codes.
I recently helped to answer a question regarding Find Last *.txt Line in VB6.

What is the fastest way to find the value of the last line in a plain text file with records... There are usually thousands of records so a quick and efficient way is necessary.

Firstly, I was thinking I could loop the file content, which would eventually grab the content of the last line. But due to the fact that there could have been thousands of records, doing a looping is usually an inefficient solution.

We can measure the performance by comparing different approaches.

  • Method 1 - Input As
  • Method 2 - File System Object
  • Method 3 - ActiveX Data Objects Library (ADO)

In order to make such a measurement, let's try to prepare a large file first.

I'm creating a table with 10000 records, and then output it as a simple comma delimited file.
 

Let's start the test!



Method 1 - Input As

Public Declare Function GetTickCount Lib "kernel32.dll" () As Long

Sub Test_InputAs()
    Dim Value As String
    For i = 1 To 20
        StartTime = GetTickCount()
        Value = GetLastLineFromFile_InputAs("C:\yourpath\source.txt")
        Debug.Print Value
        Elapsed = (GetTickCount() - StartTime)
        Debug.Print Elapsed / 1000# & " seconds"
    Next
End Sub

Public Function GetLastLineFromFile_InputAs(FilePath As String) As String
    Dim FileNum As Integer
    Dim TextLine As String

    FileNum = FreeFile()
    Open FilePath For Input As #FileNum

    Do While Not EOF(FileNum)
        Line Input #FileNum, TextLine
    Loop
    
    Close FileNum
    GetLastLineFromFile_InputAs = TextLine
End Function

Method 2 - File System Object

Public Declare Function GetTickCount Lib "kernel32.dll" () As Long

Sub Test_FSO()
    Dim Value As String
    For i = 1 To 20
        StartTime = GetTickCount()
        Value = GetLastLineFromFile_FSO("C:\yourpath\source.txt")
        Debug.Print Value
        Elapsed = (GetTickCount() - StartTime)
        Debug.Print Elapsed / 1000# & " seconds"
    Next
End Sub

Public Function GetLastLineFromFile_FSO(FilePath As String) As String
    Dim TextLine As String
   
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile = objFSO.OpenTextFile(FilePath, 1)
   
    Do Until objFile.AtEndOfStream
        TextLine = objFile.ReadLine
    Loop

    objFile.Close
    GetLastLineFromFile_FSO = TextLine
End Function

Method 3 - ActiveX Data Objects Library (ADO)

Test 1

Public Declare Function GetTickCount Lib "kernel32.dll" () As Long

Sub Test_ADO()
    Dim Value As String
    For i = 1 To 20
        StartTime = GetTickCount()
        Value = GetLastLineFromFile_ADO("C:\yourpath\", "source.txt")
        Debug.Print Value
        Elapsed = (GetTickCount() - StartTime)
        Debug.Print Elapsed / 1000# & " seconds"
    Next
End Sub

Public Function GetLastLineFromFile_ADO(FilePath As String, FileName As String) As String
    Dim TextLine As String
   
    Dim conn As ADODB.Connection
    Dim rs As ADODB.Recordset
    Dim connString As String, SQL As String
   
    connString = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" & FilePath & ";Extensions=asc,csv,tab,txt;"
    Set conn = New ADODB.Connection
    conn.ConnectionString = connString
    conn.Open
   
    SQL = "Select * from [" & FileName & "]"
    Set rs = New ADODB.Recordset
    rs.Open SQL, conn, 1, 1
    If rs.EOF = False Then
        rs.MoveLast
        GetLastLineFromFile_ADO = rs(0) & "," & rs(1) & "," & rs(2)
    End If
   
    rs.Close
    Set rs = Nothing
    conn.Close
    Set conn = Nothing
End Function

Test 2

And what if we change the line from:

GetLastLineFromFile_ADO = rs(0) & "," & rs(1) & "," & rs(2)

To:

GetLastLineFromFile_ADO = rs(0)

Does it affect the results?



The Result

So, here we go!

Here is the result after 20 times of loading.

The figures shown are in seconds.

From the result, you will see the winner is using Method 1 - Input As.

This is because of the fact that the 'Input As' command is lite enough to read the text file. While other methods could be as powerful, they're spending more time on transforming the text content to use data that is wrapped with objects.

Having said that, this is just one corner of the test and I believe there are other factors that may be affecting the results.

1
440 Views
Ryan Chong
CERTIFIED EXPERT
(NIV) Hebrews 10:35 So do not throw away your confidence; it will be richly rewarded.

Comments (0)

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.

Get access with a 7-day free trial.
You Belong in the World's Smartest IT Community