Load different files and arrange them into one file


Hi Experts,

I have a question that is a bit complex for me: it involves loading several text files and arranging them into one.

I have several text files like this:
test1.txt
test2.txt
test3.txt
The number of files is unknown, but I know that each file name starts with "test", followed by a number starting from 1, and then the extension ".txt". Each file contains four numbers, like this:
test1.txt:
1
2
3
4


test2.txt:
2
6
9
8


test3.txt:
9
10
12
13


I want to load all the files and then output one file (output.txt) in a specific format. The function will first load the first file, which is test1.txt:
1
2
3
4

Save it to the output.txt file, then load the second file and try to match the numbers in test2.txt against test1.txt. As you can see, there is a match: the number 2. It then adds test2.txt to the output file, starting at the line that holds the number 2. So the result is:
output.txt:
1
2	/2
3	/6
4	/9
	/8


And then load the other files with the same idea, so the final result will be like this:
1
2	/2
3	/6
4	/9	/9
	/8	/10
		/12
		/13

And so on. To summarise: load the first text file and write its content to the output file (output.txt). Then load the second file and try to match its first number against the previous file's data in output.txt. When it finds a match (it should find one, and it might be the first, second, third or fourth number), it adds the second file's numbers alongside, starting at the matched number.

I hope I have explained the idea clearly; please let me know if you have any questions. Thanks in advance.
Note: I am using VB.NET 2008.
Regards
Sat80 Asked:
nepaluz Commented:
I'd suggest you load all files once and output the unique contents to one file.

1. Get all files beginning with "test" into an array
Dim FileList = Directory.GetFiles("C:\", "test?.txt")


2. Iterate through the array
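        ' Requires Imports System.IO and Imports System.Linq.
        ' Start from an empty array: calling Union on Nothing would throw.
        Dim OutPut As String() = New String() {}
        For Each x As String In FileList
            ' Union returns an IEnumerable, so convert back to an array
            OutPut = OutPut.Union(File.ReadAllLines(x)).ToArray()
        Next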


3. Then write the unique contents to your output
File.WriteAllLines("C:\output.txt", OutPut)


NOTE: I am not sure about the search pattern for the files, but you get the gist...
Sat80 (Author) Commented:
nepaluz, thanks for your reply.
Yes, you are right, your method is good, but the search-and-match step is still the complex part of the question.
nepaluz Commented:
Actually there is just a small issue.

This will list all files in the directory C:\ from test1.txt to test9.txt
Dim FileList = Directory.GetFiles("C:\", "test?.txt") 


This will list ALL files beginning with "test" and ending with ".txt"
Dim FileList = Directory.GetFiles("C:\", "test*.txt") 


If you want to narrow that down, I believe it can be done! (Master Gates MUST have thought of it)
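One possible way to narrow it down, as a rough sketch only: list with the broad "test*.txt" pattern and then keep just the names of the form test<number>.txt with a regular expression. The module and function names here (TestFileFilter, IsNumberedTestFile, GetNumberedTestFiles) are made up for illustration and are not from this thread.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text.RegularExpressions

Module TestFileFilter
    ' Hypothetical helper: True only for names like test1.txt or test12.txt
    ' (case-insensitive). Names such as testing.txt or test.txt are rejected.
    Function IsNumberedTestFile(ByVal FilePath As String) As Boolean
        Dim Name As String = Path.GetFileName(FilePath)
        Return Regex.IsMatch(Name, "^test\d+\.txt$", RegexOptions.IgnoreCase)
    End Function

    ' Hypothetical helper: lists test*.txt in a folder, then drops every
    ' file whose name is not "test" followed only by digits.
    Function GetNumberedTestFiles(ByVal FolderPath As String) As List(Of String)
        Dim Result As New List(Of String)
        For Each FilePath As String In Directory.GetFiles(FolderPath, "test*.txt")
            If IsNumberedTestFile(FilePath) Then
                Result.Add(FilePath)
            End If
        Next
        Return Result
    End Function
End Module
```

The list returned by GetNumberedTestFiles is in whatever order the file system reports, so it would still need sorting before use.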
Mike Tomlinson (High School Computer Science, Computer Applications, and Mathematics Teachers) Commented:
How big do the numbers in test?.txt go?

If they go above 9, and you want them in numeric order, then you'll need to SORT the list of files by parsing out the numbers and converting them to ints so you can have them in the right order.  (Otherwise you'd get an alphanumeric sort...1 followed by 10, will come before 2 etc...)

Will there ALWAYS be a match in the previous file?  What should happen if there isn't?


Sat80 (Author) Commented:
nepaluz, thanks.

Idle_Mind, I appreciate your reply :)
The number on each line is less than 9, and each file holds only 4 or 5 numbers. As for the match: in most cases there is one, but in some cases there is no match, and then we can start from the end of the file, for example:

test1.txt:
1
2
3
4


test2.txt
5
6
7
8


Then the output.txt file should look like this:
1
2
3
4
	/5
	/6
	/7
	/8


Thanks
Mike Tomlinson (High School Computer Science, Computer Applications, and Mathematics Teachers) Commented:
Cool...what about the FILENAMES though?  Will the number in the FILENAME go above 9?

The reason I ask is that "test10.txt" would come before "test2.txt" using a normal sort routine.

Do you need them sorted based on the NUMBER in the FileName?
test1.txt
test2.txt
...
test10.txt
Mike Tomlinson (High School Computer Science, Computer Applications, and Mathematics Teachers) Commented:
...oh, and will you match ONLY in the PREVIOUS file?
(we should only look for a match in the previous {rightmost} "column" and not any others)
Sat80 (Author) Commented:
Hi, the file number is unknown, so it might go above 10.
Yes, we only match each file with the previous one, so:
match test2.txt with test1.txt
match test5.txt with test4.txt

Each file is matched against the previous one, which is the column immediately to its left in the output.txt file.
Thanks
Sat80 (Author) Commented:
Hi Idle_Mind, sorry to bother you, but I hope you can help me with this as always :) Thanks.
Mike Tomlinson (High School Computer Science, Computer Applications, and Mathematics Teachers) Commented:
Went out to a movie with the family.  I'll see if I can whip something up later tonight after the kids go to bed...  =)
Mike Tomlinson (High School Computer Science, Computer Applications, and Mathematics Teachers) Commented:
I ~think~ this does what you want...
Imports System.IO
Imports System.Text.RegularExpressions
Public Class Form1

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        CombineTextFiles("C:\Users\Mike\Documents", "test*.txt", "Output.txt")
    End Sub

    Private Sub CombineTextFiles(ByVal FolderPath As String, ByVal FileNameFilter As String, ByVal OutputFile As String)
        Dim FileNames As New List(Of String)
        FileNames.AddRange(Directory.GetFiles(FolderPath, FileNameFilter))
        FileNames.Sort(AddressOf NumericFileSorter)

        Dim Width As Integer = 8      ' Fixed width of each output column
        Dim Column As Integer = 0     ' Index of the column being written
        Dim PrevStart As Integer = 0  ' Output row holding the previous file's first line
        Dim OutputIndex As Integer
        Dim OutputLines As New List(Of String)
        Dim CurLines As List(Of String) = Nothing
        Dim PrevLines As List(Of String) = Nothing
        For i As Integer = 0 To FileNames.Count - 1
            CurLines = New List(Of String)
            CurLines.AddRange(File.ReadAllLines(FileNames(i)))
            If CurLines.Count = 0 Then Continue For ' Skip empty files

            If IsNothing(PrevLines) Then
                ' First file: written as-is (no "/" prefix) in column 0
                For Each Line As String In CurLines
                    OutputLines.Add(Line.PadRight(Width))
                Next
                PrevStart = 0
            Else
                ' Look for this file's first number in the previous file only
                Dim MatchIndex As Integer = PrevLines.IndexOf(CurLines(0))
                If MatchIndex = -1 Then
                    OutputIndex = OutputLines.Count ' No match: start below everything
                Else
                    OutputIndex = PrevStart + MatchIndex ' Row holding the matched number
                End If
                PrevStart = OutputIndex
                For Each Line As String In CurLines
                    If OutputIndex < OutputLines.Count Then
                        ' Pad first so a short row still lines up with this column
                        OutputLines(OutputIndex) = OutputLines(OutputIndex).PadRight(Column * Width) & ("/" & Line).PadRight(Width)
                    Else
                        OutputLines.Add(New String(" "c, Column * Width) & ("/" & Line).PadRight(Width))
                    End If
                    OutputIndex = OutputIndex + 1
                Next
            End If

            Column = Column + 1
            PrevLines = CurLines
        Next

        Dim FullPathOutputFile As String = Path.Combine(FolderPath, OutputFile)
        File.WriteAllLines(FullPathOutputFile, OutputLines.ToArray)
        Process.Start(FullPathOutputFile) ' <-- Just so you can see the output file easily
    End Sub

    Private Function NumericFileSorter(ByVal File1 As String, ByVal File2 As String) As Integer
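        ' Search the file name only, so digits in the folder path are ignored
        Dim Match1 As Match = Regex.Match(Path.GetFileName(File1), "\d+")
        Dim Match2 As Match = Regex.Match(Path.GetFileName(File2), "\d+")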
        If Match1.Success AndAlso Match2.Success Then
            Dim Num1 As Integer, Num2 As Integer
            If Integer.TryParse(Match1.ToString(), Num1) AndAlso Integer.TryParse(Match2.ToString(), Num2) Then
                Dim result As Integer = Num1.CompareTo(Num2)
                If result <> 0 Then
                    Return result
                End If
            End If
        ElseIf Match1.Success AndAlso Not Match2.Success Then
            Return -1
        ElseIf Not Match1.Success AndAlso Match2.Success Then
            Return 1
        End If
        Return File1.CompareTo(File2)
    End Function

End Class

Sat80 (Author) Commented:
Hi Idle_Mind, hope you have enjoyed the movie with your family :)
Thank you for the code as it's working perfectly. You are really a professional programmer :) thanks again.
Regards
Sat80 (Author) Commented:
Thanks
Mike Tomlinson (High School Computer Science, Computer Applications, and Mathematics Teachers) Commented:
I was sleepy when I wrote that so please test with a wide variety of inputs and let me know if you find any bugs!  =O
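For trying a variety of inputs without touching the disk, here is a minimal in-memory sketch of the same column-merge idea. It is not the code posted above: the names MergeSketch and MergeColumns are hypothetical, each String() stands in for one file's lines, and the arrays are assumed to be in file-number order already.

```vbnet
Imports System
Imports System.Collections.Generic

Module MergeSketch
    ' Hypothetical helper: merges the "files" column by column and returns
    ' the lines that would be written to output.txt.
    Function MergeColumns(ByVal Files As List(Of String()), ByVal Width As Integer) As List(Of String)
        Dim Output As New List(Of String)
        Dim Prev As String() = Nothing
        Dim PrevStart As Integer = 0  ' Output row of the previous file's first line
        Dim Column As Integer = 0
        For Each Cur As String() In Files
            If Cur.Length = 0 Then Continue For ' Skip empty files
            Dim Start As Integer = 0
            If Prev IsNot Nothing Then
                ' Match this file's first number against the previous file only
                Dim MatchIndex As Integer = Array.IndexOf(Prev, Cur(0))
                If MatchIndex = -1 Then
                    Start = Output.Count ' No match: start below everything
                Else
                    Start = PrevStart + MatchIndex
                End If
            End If
            For k As Integer = 0 To Cur.Length - 1
                Dim Cell As String = Cur(k)
                If Column > 0 Then Cell = "/" & Cell ' Only later columns get "/"
                Dim Row As Integer = Start + k
                If Row >= Output.Count Then Output.Add("")
                ' Pad the row out to this column before appending the cell
                Output(Row) = Output(Row).PadRight(Column * Width) & Cell.PadRight(Width)
            Next
            PrevStart = Start
            Prev = Cur
            Column += 1
        Next
        Return Output
    End Function
End Module
```

Calling MergeColumns with the three sample files from the question and a width of 8 should give seven rows, with the fourth row holding 4, /9 and /9 in consecutive columns. Inputs of different lengths, or with no match at all, can be checked the same way.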