Solved

Load different files and arrange them into one file

Posted on 2011-02-21
Last Modified: 2012-05-11

Hi Experts,

My question is a bit complex for me: I need to load several text files and arrange their contents into one file.

I have a set of text files like this:
test1.txt
test2.txt
test3.txt
The number of files is unknown, but each file name starts with "test", followed by a number counting up from 1, and then the extension ".txt". Each file contains four numbers, like this:
test1.txt:
1
2
3
4

test2.txt:
2
6
9
8

test3.txt:
9
10
12
13

The idea is to load all the files and write one file (output.txt) in a specific format. The function first loads the first file, test1.txt:
1
2
3
4

It saves that to output.txt, then loads the second file and tries to match the first number of test2.txt against the numbers of test1.txt. Here there is a match: the number 2. It then writes test2.txt into the output file, starting on the row that holds 2. The result is:
Output.txt
1
2	/2
3	/6
4	/9
	/8

The remaining files are then loaded the same way, so the final result looks like this:
1
2	/2
3	/6
4	/9	/9
	/8	/10
		/12
		/13

And so on. To summarise: load the first text file and write its content to the output file (output.txt). Then load the second file and look for its first number in the data already in output.txt. When a match is found (there should be one; it might be the first, second, third, or fourth number), write the second file's numbers next to the matched number and continue downward.

I hope I have explained the idea clearly; please let me know if you have any questions. Thanks in advance.
Note: I am using VB.NET 2008.
Regards
Question by:Sat80
14 Comments
 
LVL 17

Assisted Solution

by:nepaluz
nepaluz earned 100 total points
ID: 34943688
I'd suggest you load all files once and output the unique contents to one file.

1. Get all files beginning with "test" into an array
Dim FileList = Directory.GetFiles("C:\", "test?.txt")

2. Iterate through the array
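        ' Start from an empty array; calling Union on Nothing throws ArgumentNullException
        Dim OutPut As String() = New String() {}
        For Each x As String In FileList
            ' Union returns IEnumerable(Of String), so convert back to an array
            OutPut = OutPut.Union(File.ReadAllLines(x)).ToArray()
        Next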

3. Then write the unique contents to your output
File.WriteAllLines("C:\output.txt", OutPut)

NOTE: I am not sure about the search pattern for the files, but you get the gist...
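
For readers following along, here is a minimal sketch of what Union does to the sample data from the question. The module name UnionDemo is made up for illustration. It suggests why this approach alone does not produce the requested layout: repeated numbers are dropped and no column positions are kept.

```vbnet
Imports System
Imports System.Linq

Module UnionDemo
    Sub Main()
        ' Contents of test1.txt and test2.txt from the question
        Dim first As String() = {"1", "2", "3", "4"}
        Dim second As String() = {"2", "6", "9", "8"}

        ' Union keeps the first occurrence of each value and drops repeats
        Dim merged As String() = first.Union(second).ToArray()

        Console.WriteLine(String.Join(", ", merged))   ' 1, 2, 3, 4, 6, 9, 8
    End Sub
End Module
```

The second "2" disappears, so the information about where test2.txt lines up against test1.txt is lost.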
0
 

Author Comment

by:Sat80
ID: 34943818
nepaluz, thanks for your reply.
Yes, your method is good, but the search-and-match step is still the complex part of the question.
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 34943893
Actually there is just a small issue.

This will list the files in the directory C:\ from test1.txt to test9.txt (the ? wildcard stands for any single character, not only a digit)
Dim FileList = Directory.GetFiles("C:\", "test?.txt") 

This will list ALL files beginning with test and ending with .txt
Dim FileList = Directory.GetFiles("C:\", "test*.txt") 

If you want to narrow that down, I believe it can be done! (Master Gates MUST have thought of it)
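
One possible way to narrow it down, sketched here as an assumption rather than anything built into GetFiles: fetch the broad test*.txt list and then filter the names with a regular expression. FilterNumberedFiles and the sample paths are hypothetical names used only for this example.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text.RegularExpressions

Module FilterDemo
    ' Keeps only names of the form test<digits>.txt (case-insensitive)
    Function FilterNumberedFiles(ByVal paths As IEnumerable(Of String)) As List(Of String)
        Dim pattern As New Regex("^test\d+\.txt$", RegexOptions.IgnoreCase)
        Dim result As New List(Of String)
        For Each p As String In paths
            ' Test the file name only, not the folder part of the path
            If pattern.IsMatch(Path.GetFileName(p)) Then
                result.Add(p)
            End If
        Next
        Return result
    End Function

    Sub Main()
        ' In real use, pass Directory.GetFiles("C:\", "test*.txt") instead of this sample list
        Dim candidates As String() = {"C:\test1.txt", "C:\test12.txt", "C:\testing.txt", "C:\test.txt"}
        For Each p As String In FilterNumberedFiles(candidates)
            Console.WriteLine(p)
        Next
    End Sub
End Module
```

On Windows this should keep C:\test1.txt and C:\test12.txt and reject the other two names.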
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 34944540
How big do the numbers in test?.txt go?

If they go above 9, and you want them in numeric order, then you'll need to SORT the list of files by parsing out the numbers and converting them to ints so you can have them in the right order.  (Otherwise you'd get an alphanumeric sort...1 followed by 10, will come before 2 etc...)

Will there ALWAYS be a match in the previous file?  What should happen if there isn't?


0
 

Author Comment

by:Sat80
ID: 34944722
nepaluz, thanks.

Idle_Mind, I appreciate your reply :)
The count of numbers in each file is less than 9, probably only 4 or 5. As for the match: in most cases there is one, but in some cases there is not, and then the new file should start after the end of the previous one, such as:

test1.txt:
1
2
3
4

test2.txt
5
6
7
8

Then the output.txt file should look like this:
1
2
3
4
	/5
	/6
	/7
	/8

Thanks
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 34944791
Cool...what about the FILENAMES though?  Will the number in the FILENAME go above 9?

The reason I ask is that "test10.txt" would come before "test2.txt" using a normal sort routine.

Do you need them sorted based on the NUMBER in the FileName?
test1.txt
test2.txt
...
test10.txt
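
A small sketch of the difference described above, using three sample names. FileNumber and CompareByNumber are hypothetical helpers written for this illustration; the first sort is a plain ordinal string sort, the second compares the parsed numbers.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text.RegularExpressions

Module SortDemo
    ' Extracts the first run of digits from the file name, or -1 if there is none
    Function FileNumber(ByVal filePath As String) As Integer
        Dim m As Match = Regex.Match(Path.GetFileNameWithoutExtension(filePath), "\d+")
        Dim n As Integer
        If m.Success AndAlso Integer.TryParse(m.Value, n) Then Return n
        Return -1
    End Function

    Function CompareByNumber(ByVal a As String, ByVal b As String) As Integer
        Return FileNumber(a).CompareTo(FileNumber(b))
    End Function

    Sub Main()
        Dim names As New List(Of String)(New String() {"test10.txt", "test2.txt", "test1.txt"})

        ' Plain string sort: compares character by character
        names.Sort(StringComparer.Ordinal)
        Console.WriteLine(String.Join(" ", names.ToArray()))   ' test1.txt test10.txt test2.txt

        ' Numeric sort: compares the numbers parsed from the names
        names.Sort(AddressOf CompareByNumber)
        Console.WriteLine(String.Join(" ", names.ToArray()))   ' test1.txt test2.txt test10.txt
    End Sub
End Module
```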
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 34944807
...oh, and will you match ONLY in the PREVIOUS file?
(we should only look for a match in the previous {rightmost} "column" and not any others)
0
 

Author Comment

by:Sat80
ID: 34944866
Hi, the number of files is unknown, so the number in the file name might go above 10.
Yes we only match each file with the previous one so:
match test2.txt with test1.txt
match test5.txt with test4.txt

Each file is matched against the previous one, which is also the column directly to its left in the output.txt file.
Thanks
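
The rules agreed so far (match only against the previous file; with no match, start below the existing rows) can be sketched as one small helper. This is only an illustration under those assumptions: StartRow, prevStart and totalRows are names invented here, and rows are counted from zero.

```vbnet
Imports System
Imports System.Collections.Generic

Module PlacementDemo
    ' Returns the output row where the current file's column should start.
    ' prevStart: row where the previous file's column starts.
    ' totalRows: number of rows already in the output.
    Function StartRow(ByVal prevLines As List(Of String), ByVal prevStart As Integer, _
                      ByVal firstValue As String, ByVal totalRows As Integer) As Integer
        Dim index As Integer = prevLines.IndexOf(firstValue)
        If index = -1 Then
            Return totalRows          ' No match: continue below the existing rows
        End If
        Return prevStart + index      ' Match: share the row of the matching number
    End Function

    Sub Main()
        Dim test1 As New List(Of String)(New String() {"1", "2", "3", "4"})
        ' test2.txt starts with "2", which sits on row 1 of test1.txt
        Console.WriteLine(StartRow(test1, 0, "2", 4))   ' 1
        ' A file starting with "5" has no match, so it starts on row 4
        Console.WriteLine(StartRow(test1, 0, "5", 4))   ' 4
    End Sub
End Module
```

Keeping the previous column's start row around avoids having to search the already formatted output text for the matching number.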
0
 

Author Comment

by:Sat80
ID: 34948067
Hi Idle_Mind, sorry to bother you, but I hope you can help me with this as always :) Thanks.
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 34948208
Went out to a movie with the family.  I'll see if I can whip something up later tonight after the kids go to bed...  =)
0
 
LVL 85

Accepted Solution

by:
Mike Tomlinson earned 400 total points
ID: 34949183
I ~think~ this does what you want...
Imports System.IO
Imports System.Text.RegularExpressions
Public Class Form1

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        CombineTextFiles("C:\Users\Mike\Documents", "test*.txt", "Output.txt")
    End Sub

    ' Merges every file matching FileNameFilter into one column-aligned output file.
    Private Sub CombineTextFiles(ByVal FolderPath As String, ByVal FileNameFilter As String, ByVal OutputFile As String)
        ' Sort by the number in the file name so test10.txt comes after test9.txt
        Dim FileNames As New List(Of String)
        FileNames.AddRange(Directory.GetFiles(FolderPath, FileNameFilter))
        FileNames.Sort(AddressOf NumericFileSorter)

        Dim Width As Integer = 8          ' Fixed width of each output column
        Dim Column As Integer = 0         ' Zero-based column of the current file
        Dim Match As Integer
        Dim OutputIndex As Integer
        Dim PrevStart As Integer = 0      ' Output row where the previous file's column starts
        Dim OutputLines As New List(Of String)
        Dim CurLines As List(Of String) = Nothing
        Dim PrevLines As List(Of String) = Nothing
        For i As Integer = 0 To FileNames.Count - 1
            CurLines = New List(Of String)
            CurLines.AddRange(File.ReadAllLines(FileNames(i)))
            If CurLines.Count = 0 Then Continue For ' Skip empty files

            If IsNothing(PrevLines) Then
                ' First file: its lines form the leftmost column as-is
                For Each Line As String In CurLines
                    OutputLines.Add(Line.PadRight(Width))
                Next
            Else
                ' Look for the first number of this file in the previous file
                Match = PrevLines.IndexOf(CurLines(0))
                If Match = -1 Then
                    OutputIndex = OutputLines.Count ' No match: start below everything written so far
                Else
                    OutputIndex = PrevStart + Match ' Match: start on the row of the matching number
                End If
                PrevStart = OutputIndex
                For Each Line As String In CurLines
                    If OutputIndex < OutputLines.Count Then
                        ' Pad the existing row up to this column before appending
                        OutputLines(OutputIndex) = OutputLines(OutputIndex).PadRight(Column * Width) & ("/" & Line).PadRight(Width)
                    Else
                        OutputLines.Add(New String(" "c, Column * Width) & ("/" & Line).PadRight(Width))
                    End If
                    OutputIndex = OutputIndex + 1
                Next
            End If

            Column = Column + 1
            PrevLines = CurLines
        Next

        Dim FullPathOutputFile As String = Path.Combine(FolderPath, OutputFile)
        File.WriteAllLines(FullPathOutputFile, OutputLines.ToArray)
        Process.Start(FullPathOutputFile) ' <-- Just so you can see the output file easily
    End Sub

    Private Function NumericFileSorter(ByVal File1 As String, ByVal File2 As String) As Integer
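        ' Compare the numbers in the file names only, so digits in the folder path are ignored
        Dim Match1 As Match = Regex.Match(Path.GetFileName(File1), "\d+")
        Dim Match2 As Match = Regex.Match(Path.GetFileName(File2), "\d+")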
        If Match1.Success AndAlso Match2.Success Then
            Dim Num1 As Integer, Num2 As Integer
            If Integer.TryParse(Match1.ToString(), Num1) AndAlso Integer.TryParse(Match2.ToString(), Num2) Then
                Dim result As Integer = Num1.CompareTo(Num2)
                If result <> 0 Then
                    Return result
                End If
            End If
        ElseIf Match1.Success AndAlso Not Match2.Success Then
            Return -1
        ElseIf Not Match1.Success AndAlso Match2.Success Then
            Return 1
        End If
        Return File1.CompareTo(File2)
    End Function

End Class

0
 

Author Comment

by:Sat80
ID: 34950027
Hi Idle_Mind, hope you have enjoyed the movie with your family :)
Thank you for the code as it's working perfectly. You are really a professional programmer :) thanks again.
Regards
0
 

Author Closing Comment

by:Sat80
ID: 34950044
Thanks
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 34951588
I was sleepy when I wrote that so please test with a wide variety of inputs and let me know if you find any bugs!  =O
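
For anyone who wants to take up that suggestion, a minimal sketch for generating test inputs follows. The folder name CombineTest and the contents of test10.txt are made up for illustration; the first three files are the samples from the question.

```vbnet
Imports System
Imports System.IO

Module TestDataDemo
    Sub Main()
        ' Hypothetical scratch folder; any empty folder works
        Dim folder As String = Path.Combine(Path.GetTempPath(), "CombineTest")
        Directory.CreateDirectory(folder)

        ' The three sample files from the question
        File.WriteAllLines(Path.Combine(folder, "test1.txt"), New String() {"1", "2", "3", "4"})
        File.WriteAllLines(Path.Combine(folder, "test2.txt"), New String() {"2", "6", "9", "8"})
        File.WriteAllLines(Path.Combine(folder, "test3.txt"), New String() {"9", "10", "12", "13"})
        ' Edge cases: a two-digit file number and a file with no match in its predecessor
        File.WriteAllLines(Path.Combine(folder, "test10.txt"), New String() {"40", "41", "42", "43"})

        Console.WriteLine("Sample files written to " & folder)
    End Sub
End Module
```

Passing the printed folder to CombineTextFiles with the filter "test*.txt" exercises the numeric file ordering and the no-match case in one run.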
0
