• Status: Solved

Merge files with VBS

The original question is here
All of the files have the same header (the first line). I want to merge the content of all the files, but keep only one header.
ReneGe's code works, but I am worried about its speed.
@ECHO OFF

SET Output=Output.txt

FOR /F %%A IN ('dir /b chr*.txt') DO Call :GetHeader "%%~fA"
FOR /F %%A IN ('dir /b chr*.txt') DO FOR /F "usebackq Skip=1 delims=" %%B IN ("%%A") DO ECHO %%B>>"%Output%"
EXIT

:GetHeader
FOR /F "usebackq delims=" %%A IN ("%~1") DO (
	ECHO %%A>"%Output%"
	EXIT /b
)

huacat's code
copy chr1.assoc QT.assoc
for /L %%A in (2,1,23) do (
  type chr%%A.assoc | find /v "header" >> qt.assoc
)

I tried huacat's code, but it only merged two files, and I am not sure why.
Thanks.
Asked by: zhshqzyc

ReneGe Commented:
How long does it take to run my script?

zhshqzyc (Author) Commented:
After almost 30 minutes it had processed roughly 8210 KB, about 5% of the data.

RobSampson Commented:
Hi, here's a VBS that should do the job, and also show you which file it's currently processing, so at least you know it's doing something.

Regards,

Rob.
' Specify the folder that contains the files
strDir = "C:\Files"
' Specify the extension of the files to be read
strExt = ".txt"
' Specify the output file, which should be in a separate folder
strOutput = "C:\Output.txt"

If LCase(Right(Wscript.FullName, 11)) = "wscript.exe" Then
    strPath = Wscript.ScriptFullName
    strCommand = "%comspec% /c cscript  """ & strPath & """"
    Set objShell = CreateObject("Wscript.Shell")
    objShell.Run(strCommand), 1, True
    Wscript.Quit
End If

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOutput = objFSO.CreateTextFile(strOutput, True)
blnHeaderWritten = False
For Each objFile In objFSO.GetFolder(strDir).Files
	If Right(LCase(objFile.Name), Len(strExt)) = LCase(strExt) Then
		WScript.Echo "Processing " & objFile.Name & "..."
		Set objInput = objFSO.OpenTextFile(objFile.Path, 1, False)
		If blnHeaderWritten = False Then
			objOutput.WriteLine objInput.ReadAll
			blnHeaderWritten = True
		Else
			objInput.SkipLine
			objOutput.WriteLine objInput.ReadAll
		End If
		objInput.Close
	End If
Next
objOutput.Close
MsgBox "Done. Please see " & strOutput

ReneGe Commented:
Now, how long does it take to run Rob's script?

RobSampson Commented:
Yes, that would certainly be interesting. I can't really see that it would be much different... it all depends on the number of files and the size of each.

Rob.

huacat Commented:
copy chr1.assoc QT.assoc  
for /L %%A in (2,1,23) do type (%%A).assoc | find /v "CHR      SNP      N_MISS" >> qt.assoc

I put the above code in a .bat file, created 23 files to test it, and it ran correctly.
Please remember that the character after CHR must be a TAB character if your header uses TABs to separate columns.

zhshqzyc (Author) Commented:
I am new to VBS. Do I need to install something to run VBS?

ReneGe Commented:
Copy Rob's script into a text file with a .vbs extension.
Then just run it by double-clicking it.

zhshqzyc (Author) Commented:
One more thing: the order I need the files copied in is
chr1,chr2,chr3,..chr9,chr10,...chr20,chr21,chr22,chr23

However, except for huacat's code, the other scripts process the files in this order:
chr10,...chr2,chr3,...chr9

I hope that somebody can modify them.
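
The chr10-before-chr2 order is what plain text sorting of file names produces. As a minimal sketch in Python (not one of the thread's scripts; the helper name natural_key and the sample names are illustrative assumptions), the names can be sorted by their numeric part instead:

```python
import re

def natural_key(name):
    """Split a file name into text and integer parts so chr2 sorts before chr10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]

# Hypothetical sample of the file names discussed in the thread
names = ["chr10.txt", "chr2.txt", "chr1.txt", "chr23.txt", "chr9.txt"]

plain = sorted(names)                      # character-by-character order
ordered = sorted(names, key=natural_key)   # numeric-aware order

print(plain)
print(ordered)
```

When the numbering is fixed, as it is here (1 to 23), simply looping over the numbers and building each file name is the simpler option.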

ReneGe Commented:
Based on huacat's script, are your files always going to be numbered from 2 to 23, or will that change?

zhshqzyc (Author) Commented:
Yes, it is always from 1 to 23,

ReneGe Commented:
@zhshqzyc
Regardless of the file order, have you had a chance to compare the execution speed of my batch file with that of Rob's VBS?

@RobSampson
Could you please change your script so it reads the files from 1 to 23, like: [for /L %%A in (1,1,23) DO ...]

zhshqzyc (Author) Commented:
Yes.
@RobSampson
Your code is fast; it took only 5 minutes. But the copying order is scrambled:
10,4,8,5,11,9,12,6,7,13,23,18,14,...

@ReneGe,
Thanks for your help.

@huacat,
Maybe your code is the fastest, but I have to enter the header manually.

RobSampson Commented:
Sorry about the file order. Do the files all follow the same naming pattern, like
chr1.txt
chr2.txt
....
chr23.txt

If so, that's easy. Try this.

Regards,

Rob.
' Specify the folder that contains the files
strDir = "C:\Files"
' Specify the beginning of the file name
strNameStart = "chr"
' Specify the lower bound of the numbering of the file to start at
intStartNum = 1
' Specify the upper bound of the numbering of the file to end at
intEndNum = 23
' Specify the extension of the files to be read
strExt = ".txt"
' Specify the output file, which should be in a separate folder
strOutput = "C:\Output.txt"

If LCase(Right(Wscript.FullName, 11)) = "wscript.exe" Then
    strPath = Wscript.ScriptFullName
    strCommand = "%comspec% /c cscript  """ & strPath & """"
    Set objShell = CreateObject("Wscript.Shell")
    objShell.Run(strCommand), 1, True
    Wscript.Quit
End If

If Right(strDir, 1) <> "\" Then strDir = strDir & "\"
If Left(strExt , 1) <> "." Then strExt = "." & strExt
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOutput = objFSO.CreateTextFile(strOutput, True)
blnHeaderWritten = False
For intNum = intStartNum To intEndNum
	strFilePath = strDir & strNameStart & intNum & strExt
	If objFSO.FileExists(strFilePath) = True Then
		WScript.Echo "Processing " & strFilePath & "..."
		Set objInput = objFSO.OpenTextFile(strFilePath, 1, False)
		If blnHeaderWritten = False Then
			objOutput.WriteLine objInput.ReadAll
			blnHeaderWritten = True
		Else
			objInput.SkipLine
			objOutput.WriteLine objInput.ReadAll
		End If
		objInput.Close
	End If
Next
objOutput.Close
MsgBox "Done. Please see " & strOutput

zhshqzyc (Author) Commented:
Okay. Many thanks.
@ReneGe:
Could you combine your code with huacat's?
First get the header with your code, then use find /v, etc.

ReneGe Commented:
I doubt that combining my code with huacat's would improve performance compared with Rob's script.

That is because huacat's script requires more processing than my script. And since you saw a major improvement with Rob's script, my conclusion is obvious.

So, tell us, how did Rob's script perform?

Cheers,
Rene

huacat Commented:
Rob's code is great!

Hi Zhshqzyc,

If you don't want to type in the header, that's easy:

for /f "delims=" %%i in (chr1.assoc) do (set header=%%i)&(goto :next)
:next
copy chr1.assoc QT.assoc  
for /L %%A in (2,1,23) do type chr%%A.assoc | find /v "%header%" >> qt.assoc

ReneGe Commented:
Speed of light

(Please make sure you split points with all, and according to their contribution/effort.)

@ECHO OFF

SET Output=Output.txt

FOR /F %%A IN ('dir /b chr1.txt') DO Call :GetHeader "%%~fA"
FOR /L %%A IN (1,1,23) DO IF EXIST chr%%A.txt (
	ECHO EXTRACTING: "chr%%A.txt"
	FINDSTR -v "%Header%" "chr%%A.txt">>"%Output%"
)
pause
EXIT

:GetHeader
FOR /F "usebackq delims=" %%A IN ("%~1") DO (
	SET Header=%%A
	ECHO %%A>"%Output%"
	EXIT /b
)

ReneGe Commented:
I made a slight modification to huacat's script, which is what inspired my previous one.

 
@echo off
SET Output=Output.txt
for /f "delims=" %%i in (chr1.txt) do (set header=%%i)&(goto :next)
:next
ECHO %Header%>"%Output%"
for /L %%A in (1,1,23) do IF EXIST "chr%%A.txt" (
	ECHO EXTRACTING: "chr%%A.txt"
	FINDSTR -v "%Header%" "chr%%A.txt">>"%Output%"
)
PAUSE

zhshqzyc (Author) Commented:
@huacat

I uploaded three large files to SkyDrive.
I tested them with your code but had no luck; only two files were merged.
Thanks for your help.

huacat Commented:
Hi Zhshqzyc,

I tested with the batch code and it works fine; it combines your 3 files.

BUT, I found:
The header of chr2.lmiss is different from the others.
 CHR          SNP   N_MISS   N_GENO   F_MISS               // chr2's header; note the extra space after CHR
 CHR         SNP   N_MISS   N_GENO   F_MISS                // the other files' header

So the result file qt.assoc includes two header lines, because CMD's find matches the search string exactly.

My code only works when every file has exactly the same header. If the header differs between files, please use Rob's code.
Rob's code should be fast and can handle a different header line in each file.

Good luck.
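
To illustrate the mismatch, here is a minimal Python sketch (not part of the thread's scripts; the function name same_header is an illustrative assumption). It compares the two header lines quoted above, first as exact strings and then by column names only, ignoring the amount of spacing:

```python
def same_header(line_a, line_b):
    """Compare two header lines by their column names, ignoring spacing."""
    return line_a.split() == line_b.split()

# The two header variants quoted above (they differ by one space after CHR)
chr2_header = " CHR          SNP   N_MISS   N_GENO   F_MISS"
other_header = " CHR         SNP   N_MISS   N_GENO   F_MISS"

exact_match = chr2_header == other_header
column_match = same_header(chr2_header, other_header)

print(exact_match)
print(column_match)
```

An exact-text filter keeps the chr2 header as a data row, while a comparison by column names, or simply skipping the first line of every file after the first, does not depend on the spacing.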

zhshqzyc (Author) Commented:
Thanks.

ReneGe Commented:
Found that too.

I also found that the character sequences in the header do not appear anywhere else in the files.

Also, since the header always contains the same column names, I just changed the Findstr search string to a single header element.

 
@echo off
SET Output=Output.txt
for /f "delims=" %%i in (chr1.lmiss) do (set header=%%i)&(goto :next)
:next
ECHO %Header%>"%Output%"
for /L %%A in (1,1,23) do IF EXIST "chr%%A.lmiss" (
	ECHO EXTRACTING: "chr%%A.lmiss"
	FINDSTR -v "F_MISS" "chr%%A.lmiss">>"%Output%"
)
PAUSE

zhshqzyc (Author) Commented:
Yes. By the way, does Rob's code have a small error? It seems to replace the header with a blank row:
there is a blank row between file 1 and file 2.

ReneGe Commented:
Have you tried my last script version?

zhshqzyc (Author) Commented:
@ReneGe:
I tried it, it is great!
Regards.

ReneGe Commented:
Glad I could help.

RobSampson Commented:
Thanks for the grade. The blank row between file1 and file2 is most likely caused by a blank line at the end of file1.

Otherwise, to remove it we'd need to do some more string processing, which would slow things down over such large files...

Rob.
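
Another possible cause, assuming each input file ends with a line break, is that ReadAll returns the text including that final line break and WriteLine then appends one more. A minimal Python sketch of a merge that avoids this by copying line by line follows; it is not Rob's script, and the function name merge_with_single_header and the sample data are illustrative assumptions:

```python
import os
import tempfile

def merge_with_single_header(paths, output_path):
    """Copy the first file whole, then every later file minus its first line."""
    with open(output_path, "w", newline="") as out:
        for index, path in enumerate(paths):
            with open(path, "r", newline="") as src:
                for line_no, line in enumerate(src):
                    if index > 0 and line_no == 0:
                        continue  # skip the repeated header
                    if not line.endswith("\n"):
                        line += "\n"  # last line had no line break; add exactly one
                    out.write(line)

# Demo on hypothetical sample data in a temporary folder
workdir = tempfile.mkdtemp()
paths = []
for n in (1, 2):
    p = os.path.join(workdir, f"chr{n}.txt")
    with open(p, "w", newline="") as f:
        f.write("CHR\tSNP\n" + f"{n}\trs{n}\n")
    paths.append(p)

output_path = os.path.join(workdir, "Output.txt")
merge_with_single_header(paths, output_path)
with open(output_path, newline="") as f:
    merged = f.read()
print(merged)
```

Because the lines are streamed rather than read into memory in one piece, no extra string processing is needed to drop the blank row.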
Question has a verified solution.
