zhshqzyc
asked on
Merge files with VBS
The original question is here
The files all have the same header (the first line). I want the content of all the files, but with only one header.
ReneGe's code is okay.
But I am worried about the speed.
Thanks.
@ECHO OFF
SET Output=Output.txt
FOR /F %%A IN ('dir /b chr*.txt') DO Call :GetHeader "%%~fA"
FOR /F %%A IN ('dir /b chr*.txt') DO FOR /F "usebackq Skip=1 delims=" %%B IN ("%%A") DO ECHO %%B>>"%Output%"
EXIT
:GetHeader
FOR /F "usebackq delims=" %%A IN ("%~1") DO (
ECHO %%A>"%Output%"
EXIT /b
)
huacat's code:
copy chr1.assoc QT.assoc
for /L %%A in (2,1,23) do (
type chr%%A.assoc | find /v "header" >> qt.assoc
)
I tried it, but it only merged two files. I'm not sure why. Thanks.
How much time does it take to run my script?
ASKER
After almost 30 minutes it had processed roughly 8210 KB, about 5% of the data.
Hi, here's a VBS that should do the job, and also show you which file it's currently processing, so at least you know it's doing something.
Regards,
Rob.
' Specify the folder that contains the files
strDir = "C:\Files"
' Specify the extension of the files to be read
strExt = ".txt"
' Specify the output file, which should be in a separate folder
strOutput = "C:\Output.txt"
If LCase(Right(Wscript.FullName, 11)) = "wscript.exe" Then
strPath = Wscript.ScriptFullName
strCommand = "%comspec% /c cscript """ & strPath & """"
Set objShell = CreateObject("Wscript.Shell")
objShell.Run(strCommand), 1, True
Wscript.Quit
End If
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOutput = objFSO.CreateTextFile(strOutput, True)
blnHeaderWritten = False
For Each objFile In objFSO.GetFolder(strDir).Files
If Right(LCase(objFile.Name), Len(strExt)) = LCase(strExt) Then
WScript.Echo "Processing " & objFile.Name & "..."
Set objInput = objFSO.OpenTextFile(objFile.Path, 1, False)
If blnHeaderWritten = False Then
objOutput.WriteLine objInput.ReadAll
blnHeaderWritten = True
Else
objInput.SkipLine
objOutput.WriteLine objInput.ReadAll
End If
objInput.Close
End If
Next
objOutput.Close
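' Show the output path (strOutput); objOutput is a TextStream object and cannot be concatenated to a string
MsgBox "Done. Please see " & strOutput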
Now, how much time does it take to run Rob's script?
Yes, that would certainly be interesting. I can't really see that it would be much different... it all depends on the number of files, and the size of each.
Rob.
copy chr1.assoc QT.assoc
for /L %%A in (2,1,23) do type (%%A).assoc | find /v "CHR SNP N_MISS" >> qt.assoc
I put the above code in a .bat file, created 23 files to test it, and it ran correctly.
Please remember, the character after CHR must be a TAB character if your header uses TABs to separate columns.
ASKER
I am new to VBS. Do I need to install something to run VBS?
Copy Rob's script into a text file with a .vbs extension.
Then, just run it by double-clicking on it.
ASKER
One more thing: my copy order is
chr1, chr2, chr3, ... chr9, chr10, ... chr20, chr21, chr22, chr23
However, except for huacat's code, the other scripts process the files in the order chr10, ... chr2, chr3, ... chr9.
I hope that somebody can modify them.
Based on huacat's script, are your files always going to be numbered from 2 to 23, or will that change?
ASKER
Yes, it is always from 1 to 23.
@zhshqzyc
Regardless of the files order, have you had the chance to compare execution speed between my batch file and Rob's VBS?
@RobSampson
Could you please change your script so it reads the files from 1 to 23, like: [for /L %%A in (1,1,23) DO ...]
ASKER
Yes.
@RobSampson
Your code is fast, only took 5 minutes. But the copying order is wild.
10,4,8,5,11,9,12,6,7,13,23,18,14,...
@ReneGe,
Thanks for your help.
@huacat,
Maybe your code is the fastest, but I need to input the header manually.
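The ordering problem described above comes from enumerating the folder, which does not return files in numeric order. A minimal sketch of one way around it in VBScript is to build the file names from a counter instead. It assumes the files are named chr1.txt to chr23.txt, that every file ends with a line break, and it reuses the folder and output paths from Rob's script; the constant names are only illustrative.

```vbscript
Option Explicit

' Assumed names: adjust folder, prefix, extension and range to your data.
Const strDir = "C:\Files"
Const strPrefix = "chr"
Const strExt = ".txt"
Const strOutput = "C:\Output.txt"
Const ForReading = 1

Dim objFSO, objOutput, objInput, strPath, i, blnHeaderWritten

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOutput = objFSO.CreateTextFile(strOutput, True)
blnHeaderWritten = False

' Count from 1 to 23 so that chr2 is processed before chr10.
For i = 1 To 23
    strPath = objFSO.BuildPath(strDir, strPrefix & i & strExt)
    If objFSO.FileExists(strPath) Then
        Set objInput = objFSO.OpenTextFile(strPath, ForReading, False)
        If blnHeaderWritten Then
            ' Every file after the first: drop its header line.
            If Not objInput.AtEndOfStream Then objInput.SkipLine
        End If
        ' ReadAll raises an error at end of stream, so check first.
        If Not objInput.AtEndOfStream Then
            ' Write (not WriteLine) copies the text without adding a line break.
            objOutput.Write objInput.ReadAll
            blnHeaderWritten = True
        End If
        objInput.Close
    End If
Next

objOutput.Close
WScript.Echo "Done. Please see " & strOutput
```

Missing numbers are simply skipped, so the same loop should also work if one of the 23 files is absent.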
ASKER
Okay. Many thanks.
@ReneGe:
Could you combine your code with huacat's?
First get the header with your code, then use find /v, etc.
I doubt that combining my code with huacat's code would increase performance compared with Rob's script.
That is because huacat's script requires more processing than my script. And since you experienced a major improvement with Rob's script, my conclusion is obvious.
So, tell us, how did Rob's script perform?
Cheers,
Rene
Rob's code is great!
Hi Zhshqzyc,
If you don't want to input the header, it's easy:
for /f "delims=" %%i in (chr1.assoc) do (set header=%%i)&(goto :next)
:next
copy chr1.assoc QT.assoc
for /L %%A in (2,1,23) do type chr%%A.assoc | find /v "%header%" >> qt.assoc
Speed of light
(Please make sure you split points with all, and according to their contribution/effort.)
@ECHO OFF
SET Output=Output.txt
FOR /F %%A IN ('dir /b chr1.txt') DO Call :GetHeader "%%~fA"
FOR /L %%A IN (1,1,23) DO IF EXIST chr%%A.txt (
ECHO EXTRACTING: "chr%%A.txt"
FINDSTR -v "%Header%" "chr%%A.txt">>"%Output%"
)
pause
EXIT
:GetHeader
FOR /F "usebackq delims=" %%A IN ("%~1") DO (
SET Header=%%A
ECHO %%A>"%Output%"
EXIT /b
)
ASKER
@huacat
I uploaded three large files to SkyDrive.
I tested them with your code but had no luck. Only two files were merged.
Thanks for help.
ASKER
Thanks.
Found that too.
I also found that the character sets in the header do not appear anywhere else in the files.
Also, since the header is always the same, I just replaced the FINDSTR search string with a single header element.
@echo off
SET Output=Output.txt
for /f "delims=" %%i in (chr1.lmiss) do (set header=%%i)&(goto :next)
:next
ECHO %Header%>"%Output%"
for /L %%A in (1,1,23) do IF EXIST "chr%%A.lmiss" (
ECHO EXTRACTING: "chr%%A.lmiss"
FINDSTR -v "F_MISS" "chr%%A.lmiss">>"%Output%"
)
PAUSE
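Filtering on a single header element works as long as that text never appears in a data row. Note also that FINDSTR treats a quoted search string containing spaces as several alternative search strings unless /C: is used, which is one reason matching on the whole header can misbehave. As a hedged alternative, the sketch below does the same job in VBScript by comparing each line with the full header, so only exact copies of the header are dropped. The file names chr1.lmiss to chr23.lmiss and Output.txt follow the batch file above; the variable names are illustrative, and reading line by line is likely to be slower than a whole-file copy.

```vbscript
Option Explicit

Const ForReading = 1
Dim objFSO, objIn, objOut, strHeader, strLine, strPath, i, blnHaveHeader

Set objFSO = CreateObject("Scripting.FileSystemObject")
' Relative paths resolve against the current directory.
Set objOut = objFSO.CreateTextFile("Output.txt", True)
blnHaveHeader = False

For i = 1 To 23
    strPath = "chr" & i & ".lmiss"
    If objFSO.FileExists(strPath) Then
        WScript.Echo "EXTRACTING: " & strPath
        Set objIn = objFSO.OpenTextFile(strPath, ForReading, False)
        Do Until objIn.AtEndOfStream
            strLine = objIn.ReadLine
            If Not blnHaveHeader Then
                ' The first line read becomes the header and is written once.
                strHeader = strLine
                blnHaveHeader = True
                objOut.WriteLine strLine
            ElseIf strLine <> strHeader Then
                ' Keep every line that is not an exact copy of the header.
                objOut.WriteLine strLine
            End If
        Loop
        objIn.Close
    End If
Next

objOut.Close
```

Running it with cscript from a command prompt keeps the progress messages in the console; under wscript each WScript.Echo opens a message box.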
ASKER
Yes. By the way, does Rob's code have a small error? Did it replace the header with a blank row?
There is a blank row between file 1 and file 2.
Have you tried my last script version?
ASKER
@ReneGe:
I tried it, it is great!
Regards.
Glad I could help.
Thanks for the grade. The blank row between file1 and file2 is most likely caused by a blank line at the end of file1.
Otherwise, to be able to remove that, we'd need to do some more string processing, which would slow it down over such large files....
Rob.
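Another possible cause of the blank row is worth checking: WriteLine always appends a line break, so if the text returned by ReadAll already ends with one, the output gets an empty line. A minimal sketch of a copy step that avoids this is shown below. It uses Write and adds a line break only when the file does not end with one, which needs a check of just the last character. The helper name AppendFile and the example paths are hypothetical.

```vbscript
Option Explicit

Dim objFSO, objOutput
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOutput = objFSO.CreateTextFile("C:\Output.txt", True)

' Example paths are assumptions; the first file keeps its header.
AppendFile objFSO, objOutput, "C:\Files\chr1.txt", False
AppendFile objFSO, objOutput, "C:\Files\chr2.txt", True
objOutput.Close

' Appends the contents of strPath to the open TextStream objOut.
' blnSkipHeader = True drops the first line of the file.
Sub AppendFile(objFSO, objOut, strPath, blnSkipHeader)
    Dim objIn, strText
    Set objIn = objFSO.OpenTextFile(strPath, 1, False) ' 1 = ForReading
    If blnSkipHeader Then
        If Not objIn.AtEndOfStream Then objIn.SkipLine
    End If
    If Not objIn.AtEndOfStream Then
        strText = objIn.ReadAll
        ' Write adds no line break of its own, unlike WriteLine.
        objOut.Write strText
        ' Add one only if the file did not already end with a line break.
        If Right(strText, 1) <> vbLf And Right(strText, 1) <> vbCr Then
            objOut.WriteLine
        End If
    End If
    objIn.Close
End Sub
```

Because only one character per file is inspected, the extra check should cost very little even on large files.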