Find distinct value from values in a text file....

I have a text file of size >5,000,000 KB. It has a header and a trailer record, and all records between these two are body records. The records are pipe-delimited. Each body record has 236 columns. I want to find out the distinct values present in columns 168 and 169. The file is too large to import into Excel, otherwise I could have set a filter on each column and viewed the distinct values. It is also too large to open in MS-Word. How can I go about finding out the distinct values in the two columns I am interested in?

Thank you.
Who is Participating?
dsackerContract ERP Admin/ConsultantCommented:
VBScript or Powershell would work.

You'd read each record in, use the Split function to reference YourRecord.Split("|")(167) and YourRecord.Split("|")(168), concatenate them into another array, testing first whether you already placed it in that other array, sort it if you wish, then echo it out.
dsackerContract ERP Admin/ConsultantCommented:
This VBScript template might work:
Option Explicit

Call Main

Sub Main
    Dim blnFound
    Dim cnt
    Dim ndx
    Dim objFSO
    Dim objFileIn
    Dim strArray()
    Dim strFileName
    Dim strKey
    Dim strRecord

    cnt = -1
    strFileName = "C:\Temp\YourInputFile.txt"
    Set objFSO     = CreateObject("Scripting.FileSystemObject")
    Set objFileIn  = objFSO.OpenTextFile(strFileName, 1, 2)

    Do While objFileIn.AtEndOfStream = False
        strRecord = objFileIn.ReadLine
        strKey = Split(strRecord, "|")(167) & "|" & Split(strRecord, "|")(168)
        If cnt = -1 Then
            cnt = cnt + 1
            Redim strArray(cnt)
            strArray(cnt) = strKey
            blnFound = False
            For ndx = 0 To UBound(strArray)
                If strArray(ndx) = strKey Then
                    blnFound = True
                    Exit For
                End If
            If blnFound = False Then
                cnt = cnt + 1
                Redim Preserve strArray(cnt)
                strArray(cnt) = strKey
            End If
        End If

    Set objFileIn  = Nothing
    Set objFSO     = Nothing

    For ndx = 0 To Ubound(strArray)
        WScript.Echo strArray(ndx)
End Sub

Open in new window

dsackerContract ERP Admin/ConsultantCommented:
Even sweeter is the small amount of Powershell needed:
$file = Get-Content C:\Temp\YourInputFile.txt
$array = @()
ForEach ($line in $file) {
    $fields = $line.Split("|");
    $key = $fields[167]+ "|" + $fields[168]
    If ($array.Contains($key) -eq $False) { $array += $key }

Open in new window

Cloud Class® Course: SQL Server Core 2016

This course will introduce you to SQL Server Core 2016, as well as teach you about SSMS, data tools, installation, server configuration, using Management Studio, and writing and executing queries.

aej1973Author Commented:
Thanks dsacker, how will the output from this poweshell look? Thank you.

dsackerContract ERP Admin/ConsultantCommented:
The lone "$array" will spit out the two values you're interested in. If you need to change that and grab other values, just tack them on.

Play with it some. Powershell can be addicting. :)
aej1973Author Commented:
Will do, thank you.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.