Solved

Merge files with VBS

Posted on 2011-05-08
Last Modified: 2012-06-27
The original question is here
The files all have the same header (the first line). I want to merge the content of all files, keeping only one header.
ReneGe's code works, but I am worried about its speed.
@ECHO OFF

SET Output=Output.txt

FOR /F %%A IN ('dir /b chr*.txt') DO Call :GetHeader "%%~fA"
FOR /F %%A IN ('dir /b chr*.txt') DO FOR /F "usebackq Skip=1 delims=" %%B IN ("%%A") DO ECHO %%B>>"%Output%"
EXIT

:GetHeader
FOR /F "usebackq delims=" %%A IN ("%~1") DO (
	ECHO %%A>"%Output%"
	EXIT /b
)

huacat's code
copy chr1.assoc QT.assoc
for /L %%A in (2,1,23) do (
  type chr%%A.assoc | find /v "header" >> qt.assoc
)

I tried it, but it only merged two files. I am not sure why.
Thanks.
Question by:zhshqzyc
29 Comments
 
LVL 10

Expert Comment

by:ReneGe
ID: 35716777
How long does my script take to run?
 

Author Comment

by:zhshqzyc
ID: 35716803
After almost 30 minutes it had processed roughly 8210 KB, about 5%.
 
LVL 65

Expert Comment

by:RobSampson
ID: 35717111
Hi, here's a VBS that should do the job, and also show you which file it's currently processing, so at least you know it's doing something.

Regards,

Rob.
' Specify the folder that contains the files
strDir = "C:\Files"
' Specify the extension of the files to be read
strExt = ".txt"
' Specify the output file, which should be in a separate folder
strOutput = "C:\Output.txt"

If LCase(Right(Wscript.FullName, 11)) = "wscript.exe" Then
    strPath = Wscript.ScriptFullName
    strCommand = "%comspec% /c cscript  """ & strPath & """"
    Set objShell = CreateObject("Wscript.Shell")
    objShell.Run(strCommand), 1, True
    Wscript.Quit
End If

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOutput = objFSO.CreateTextFile(strOutput, True)
blnHeaderWritten = False
For Each objFile In objFSO.GetFolder(strDir).Files
	If Right(LCase(objFile.Name), Len(strExt)) = LCase(strExt) Then
		WScript.Echo "Processing " & objFile.Name & "..."
		Set objInput = objFSO.OpenTextFile(objFile.Path, 1, False)
		If blnHeaderWritten = False Then
			objOutput.WriteLine objInput.ReadAll
			blnHeaderWritten = True
		Else
			objInput.SkipLine
			objOutput.WriteLine objInput.ReadAll
		End If
		objInput.Close
	End If
Next
objOutput.Close
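' Show the output path (strOutput); objOutput is a TextStream object and cannot be concatenated into a string
MsgBox "Done. Please see " & strOutput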


 
LVL 10

Expert Comment

by:ReneGe
ID: 35717137
Now, how long does Rob's script take to run?
 
LVL 65

Expert Comment

by:RobSampson
ID: 35717148
Yes, that would certainly be interesting. I can't really see that it would be much different; it all depends on the number of files and the size of each.

Rob.
 
LVL 7

Expert Comment

by:huacat
ID: 35717231
copy chr1.assoc QT.assoc  
for /L %%A in (2,1,23) do type chr%%A.assoc | find /v "CHR      SNP      N_MISS" >> qt.assoc

I put the above code in a .bat file, created 23 files to test it, and it ran correctly.
Please remember, the character after CHR must be a TAB if your header uses TABs to separate columns.
 

Author Comment

by:zhshqzyc
ID: 35721773
I am new to VBS. Do I need to install something to run VBS?
 
LVL 10

Expert Comment

by:ReneGe
ID: 35722015
Copy Rob's script into a text file with a .vbs extension.
Then just run it by double-clicking it.
 

Author Comment

by:zhshqzyc
ID: 35722155
One more thing: I need the files copied in this order:
chr1,chr2,chr3,..chr9,chr10,...chr20,chr21,chr22,chr23

However, apart from huacat's code, the other scripts process them in this order:
chr10,...chr2,chr3,...chr9

I hope that somebody can modify it.
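
A note on why the order comes out that way, as a minimal sketch rather than anything posted in this thread: names are compared as text, character by character, so "chr10" sorts ahead of "chr2". Reading the number out of the name and comparing numerically avoids that. FileNumber is a hypothetical helper, and the "chr" prefix is assumed from the file names above.

```vbscript
' Text comparison goes character by character: "1" sorts before "2",
' so "chr10" compares as less than "chr2" (StrComp returns -1).
WScript.Echo StrComp("chr10", "chr2", vbTextCompare)

' Hypothetical helper (not from the thread): return the number in a name
' such as "chr10.txt", or -1 if the name does not follow that pattern.
Function FileNumber(strName)
    Dim objRegEx, colMatches
    Set objRegEx = New RegExp
    objRegEx.IgnoreCase = True
    objRegEx.Pattern = "^chr(\d+)\."
    Set colMatches = objRegEx.Execute(strName)
    If colMatches.Count > 0 Then
        FileNumber = CLng(colMatches(0).SubMatches(0))
    Else
        FileNumber = -1
    End If
End Function

' Comparing the extracted numbers gives the intended order.
If FileNumber("chr2.txt") < FileNumber("chr10.txt") Then
    WScript.Echo "Numerically, chr2 comes before chr10."
End If
```

When the numbering range is known in advance, simply counting from the first number to the last and building each file name is simpler than sorting.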
 
LVL 10

Expert Comment

by:ReneGe
ID: 35722185
Based on huacat's script, are your files always going to be numbered from 2 to 23, or will that change?
 

Author Comment

by:zhshqzyc
ID: 35722245
Yes, they are always numbered from 1 to 23.
 
LVL 10

Expert Comment

by:ReneGe
ID: 35723292
@zhshqzyc
Regardless of the files order, have you had the chance to compare execution speed between my batch file and Rob's VBS?

@RobSampson
Could you please change your script so it reads files from 1 to 23, like: [for /L %%A in (1,1,23) DO ...]
 

Author Comment

by:zhshqzyc
ID: 35724132
Yes.
@RobSampson
Your code is fast; it only took 5 minutes. But the copy order is wild:
10,4,8,5,11,9,12,6,7,13,23,18,14,...


@ReneGe,
Thanks for your help.

@huacat,
Maybe your code is the fastest, but I have to enter the header manually.
 
LVL 65

Accepted Solution

by:
RobSampson earned 1200 total points
ID: 35725152
Sorry about the file order. Do the files follow the same naming pattern, like
chr1.txt
chr2.txt
....
chr23.txt

If so, that's easy....try this.

Regards,

Rob.
' Specify the folder that contains the files
strDir = "C:\Files"
' Specify the beginning of the file name
strNameStart = "chr"
' Specify the lower bound of the numbering of the file to start at
intStartNum = 1
' Specify the upper bound of the numbering of the file to end at
intEndNum = 23
' Specify the extension of the files to be read
strExt = ".txt"
' Specify the output file, which should be in a separate folder
strOutput = "C:\Output.txt"

If LCase(Right(Wscript.FullName, 11)) = "wscript.exe" Then
    strPath = Wscript.ScriptFullName
    strCommand = "%comspec% /c cscript  """ & strPath & """"
    Set objShell = CreateObject("Wscript.Shell")
    objShell.Run(strCommand), 1, True
    Wscript.Quit
End If

If Right(strDir, 1) <> "\" Then strDir = strDir & "\"
If Left(strExt , 1) <> "." Then strExt = "." & strExt
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOutput = objFSO.CreateTextFile(strOutput, True)
blnHeaderWritten = False
For intNum = intStartNum To intEndNum
	strFilePath = strDir & strNameStart & intNum & strExt
	If objFSO.FileExists(strFilePath) = True Then
		WScript.Echo "Processing " & strFilePath & "..."
		Set objInput = objFSO.OpenTextFile(strFilePath, 1, False)
		If blnHeaderWritten = False Then
			objOutput.WriteLine objInput.ReadAll
			blnHeaderWritten = True
		Else
			objInput.SkipLine
			objOutput.WriteLine objInput.ReadAll
		End If
		objInput.Close
	End If
Next
objOutput.Close
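' Show the output path (strOutput); objOutput is a TextStream object and cannot be concatenated into a string
MsgBox "Done. Please see " & strOutput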

 

Author Comment

by:zhshqzyc
ID: 35725313
Okay. Many thanks.
@ReneGe:
Could you combine your code with huacat's?
First get the header with your code, then use find /v, etc.
 
LVL 10

Expert Comment

by:ReneGe
ID: 35725410
I doubt that combining my code with huacat's would perform better than Rob's script.

That is because huacat's script requires more processing than my script. And since you saw a major improvement with Rob's script, my conclusion is obvious.

So, tell us, how did Rob's script perform?

Cheers,
Rene
 
LVL 7

Expert Comment

by:huacat
ID: 35725700
Rob's code is great!

Hi Zhshqzyc,

If you don't want to type in the header, that's easy:

for /f "delims=" %%i in (chr1.assoc) do (set header=%%i)&(goto :next)
:next
copy chr1.assoc QT.assoc  
for /L %%A in (2,1,23) do type chr%%A.assoc | find /v "%header%" >> qt.assoc

 
LVL 10

Expert Comment

by:ReneGe
ID: 35726101
Speed of light

(Please make sure you split points with all, and according to their contribution/effort.)

@ECHO OFF

SET Output=Output.txt

FOR /F %%A IN ('dir /b chr1.txt') DO Call :GetHeader "%%~fA"
FOR /L %%A IN (1,1,23) DO IF EXIST chr%%A.txt (
	ECHO EXTRACTING: "chr%%A.txt"
	FINDSTR -v "%Header%" "chr%%A.txt">>"%Output%"
)
pause
EXIT

:GetHeader
FOR /F "usebackq delims=" %%A IN ("%~1") DO (
	SET Header=%%A
	ECHO %%A>"%Output%"
	EXIT /b
)

 
LVL 10

Assisted Solution

by:ReneGe
ReneGe earned 200 total points
ID: 35726129
I made a slight modification to huacat's script, which inspired my previous one.

 
@echo off
SET Output=Output.txt
for /f "delims=" %%i in (chr1.txt) do (set header=%%i)&(goto :next)
:next
ECHO %Header%>"%Output%"
for /L %%A in (1,1,23) do IF EXIST "chr%%A.txt" (
	ECHO EXTRACTING: "chr%%A.txt"
	FINDSTR -v "%Header%" "chr%%A.txt">>"%Output%"
)
PAUSE

 

Author Comment

by:zhshqzyc
ID: 35728594
@huacat

I uploaded three large files to SkyDrive.
I tested them with your code but had no luck: only two files were merged.
Thanks for your help.
 
LVL 7

Assisted Solution

by:huacat
huacat earned 600 total points
ID: 35729190
Hi Zhshqzyc,

I tested with the batch code and it works fine; it combined your 3 files.

BUT, I found:
The header of chr2.lmiss is different from the others.
 CHR          SNP   N_MISS   N_GENO   F_MISS               // Chr2's header, looks more space after CHR
 CHR         SNP   N_MISS   N_GENO   F_MISS                // Other's Header

So the result file qt.assoc includes two header lines, because CMD's find only excludes lines that contain the search string exactly.

My code only works when every file has exactly the same header. If the header differs between files, please use Rob's code.
Rob's code should be fast and can handle a different header line in each file.

Good luck.
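
To illustrate the point about spacing, here is a minimal sketch (an illustration, not part of the tested batch code) that compares the two headers after collapsing runs of spaces or tabs. NormalizeHeader is a hypothetical helper name, and the two strings are the header variants quoted above.

```vbscript
' Hypothetical helper (not from the thread): collapse each run of whitespace
' (spaces or tabs) into a single space and trim both ends.
Function NormalizeHeader(strLine)
    Dim objRegEx
    Set objRegEx = New RegExp
    objRegEx.Global = True
    objRegEx.Pattern = "\s+"
    NormalizeHeader = Trim(objRegEx.Replace(strLine, " "))
End Function

' The two header variants differ only in the number of spaces after CHR.
strHeaderChr2  = " CHR          SNP   N_MISS   N_GENO   F_MISS"
strHeaderOther = " CHR         SNP   N_MISS   N_GENO   F_MISS"

If NormalizeHeader(strHeaderChr2) = NormalizeHeader(strHeaderOther) Then
    WScript.Echo "Headers match after normalizing whitespace."
Else
    WScript.Echo "Headers differ."
End If
```

An exact string match treats these two lines as different, which is why matching on a single column name, or skipping the first line of each file regardless of its text, is more forgiving.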
 

Author Closing Comment

by:zhshqzyc
ID: 35729523
Thanks.
 
LVL 10

Expert Comment

by:ReneGe
ID: 35729710
Found that too.

I also found that the character strings from the header do not appear anywhere else in the files.

Also, since the header always has the same columns, I just replaced the FINDSTR search string with a single header element.

 
@echo off
SET Output=Output.txt
for /f "delims=" %%i in (chr1.lmiss) do (set header=%%i)&(goto :next)
:next
ECHO %Header%>"%Output%"
for /L %%A in (1,1,23) do IF EXIST "chr%%A.lmiss" (
	ECHO EXTRACTING: "chr%%A.lmiss"
	FINDSTR -v "F_MISS" "chr%%A.lmiss">>"%Output%"
)
PAUSE

 

Author Comment

by:zhshqzyc
ID: 35729922
Yes. By the way, does Rob's code have a small error? It seems to replace the header with a blank row:
there is a blank row between file 1 and file 2.
 
LVL 10

Expert Comment

by:ReneGe
ID: 35730427
Have you tried my last script version?
 

Author Comment

by:zhshqzyc
ID: 35730513
@ReneGe:
I tried it, it is great!
Regards.
 
LVL 10

Expert Comment

by:ReneGe
ID: 35730765
Glad I could help.
 
LVL 65

Expert Comment

by:RobSampson
ID: 35733952
Thanks for the grade. The blank row between file1 and file2 is most likely caused by a blank line at the end of file1.

Otherwise, to remove it, we'd need to do some more string processing, which would slow the script down over such large files....

Rob.
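
Another possible cause, offered as an assumption rather than something confirmed in this thread: WriteLine adds a line break after whatever it writes, so if the text returned by ReadAll already ends with a line break, an empty line follows it. The sketch below writes the text with Write and adds a line break only when one is missing, which needs a check of just the last character. AppendBody is a hypothetical helper name; the paths are the sample ones from the script above.

```vbscript
' Hypothetical helper (not from the thread): append text to an open
' TextStream so the output ends with exactly one line break, whether or
' not the source text already ended with one.
Sub AppendBody(objOut, strText)
    If Len(strText) = 0 Then Exit Sub
    objOut.Write strText
    ' Text ending in CRLF or LF has vbLf as its last character.
    If Right(strText, 1) <> vbLf Then objOut.Write vbCrLf
End Sub

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOutput = objFSO.CreateTextFile("C:\Output.txt", True)
Set objInput = objFSO.OpenTextFile("C:\Files\chr1.txt", 1, False)

' ReadAll raises an error at end of stream, so check before reading.
If Not objInput.AtEndOfStream Then AppendBody objOutput, objInput.ReadAll

objInput.Close
objOutput.Close
```

For the second and later files, the same call would go after objInput.SkipLine, again guarded by the AtEndOfStream check in case a file contains only the header.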