Brent Guttmann
asked on
Splitting a txt file into multiple by file size
We have a program that generates a text file that can range from 3 MB to 30 MB. I am trying to figure out how to split the file by size. The maximum size of each text file is 4 MB, so a text file that was originally 14 MB would need to be split into 4 text files.
The file name would need "_1", "_2", "_3", etc. appended to it. So, if we had a 14 MB file named "filename.txt",
the resulting text files would be:
filename_1.txt
filename_2.txt
filename_3.txt
filename_4.txt
Is there a batch file that can accomplish this?
ASKER
Hi, I'd like to have the file split at the end of a line when the file size reaches, let's say, 3.9 MB. So I guess it would be split based on both file size and end of line.
The EOL is carriage return, line feed, so <CR><LF> is correct.
Here is a simple PowerShell function which you can use to split the file. Check and see if it works for you.
Function Split-File ($DestPath,$Inputfile,$Size){
    Begin{
        # Create the first output file: <DestPath>\<BaseName>_1<Extension>
        $count = 1
        $FileData = GI $Inputfile
        $Outputname = "$DestPath\$($FileData.BaseName)"
        $NewFile = "$Outputname`_$count$($FileData.Extension)"
        New-Item $NewFile -ItemType File -Force | Out-Null
        Write-host "Writing file $NewFile"
    }
    Process{
        # Copy the input line by line, so every part ends at a line break
        Get-Content $Inputfile | % {
            # Once the current part has reached $Size, start the next part
            If ((GI $NewFile).Length -ge $Size) {
                $count++
                $NewFile = "$Outputname`_$count$($FileData.Extension)"
                Write-host "Writing file $NewFile"
                Add-Content $_ -Path $NewFile
            }Else{
                Add-Content $_ -Path $NewFile
            }
        }
    }
}
# Run the function
Split-File -DestPath C:\temp\testing -Inputfile C:\Temp\Test.txt -Size 4MB
ASKER
Okay, so first question: how can we have it just run on any text file in the folder? I won't know the name of the file until it's there, and I am trying to automate the entire process. Also, I tried running the bat file with the below, but it just opened and closed. I tried adding a couple of different commands at the end to pause the script and view the errors, but none of them worked:
Function Split-File ($DestPath,$Inputfile,$Size){
    Begin{
        $count = 1
        $FileData = GI $Inputfile
        $Outputname = "$DestPath\$($FileData.BaseName)"
        $NewFile = "$Outputname`_$count$($FileData.Extension)"
        New-Item $NewFile -ItemType File -Force | Out-Null
        Write-host "Writing file $NewFile"
    }
    Process{
        Get-Content $Inputfile | % {
            If ((GI $NewFile).Length -ge $Size) {
                $count++
                $NewFile = "$Outputname`_$count$($FileData.Extension)"
                Write-host "Writing file $NewFile"
                Add-Content $_ -Path $NewFile
            }Else{
                Add-Content $_ -Path $NewFile
            }
        }
    }
}
#Run Function..
Split-File -DestPath "\\server-win-sv05\data\divisions\COL\Col\MVP\MVP_Website_Uploads\bat_test" -Inputfile "\\server-win-sv05\data\divisions\COL\Col\MVP\MVPWebsite_Uploads\bat_test\NC13_20161012.txt" -Size 4MB
It's a PowerShell script, so you need to save it as a .ps1 file and run it from the PowerShell console. The following articles will help you:
How to Run a PowerShell script
http://ss64.com/ps/syntax-run.html
Run PowerShell Scripts from Task Scheduler
https://community.spiceworks.com/how_to/17736-run-powershell-scripts-from-task-scheduler
To split all files in a directory, you can change the last line to:
GCI "\\server-win-sv05\data\divisions\COL\Col\MVP\MVPWebsite_Uploads\bat_test\*.Txt" | %{Split-File -DestPath "\\server-win-sv05\data\divisions\COL\Col\MVP\MVP_Website_Uploads\bat_test" -Inputfile $_.FullName -Size 4MB}
ASKER
Okay - so this cannot be run from a bat file?
ASKER
I tried running this and it froze at the line where it's writing the file:
PS Microsoft.PowerShell.Core\FileSystem::\\server-win-fs05\data\divisions\COL\Col\MVP\MVP_Website_Uploads\bat_test> . .\split.ps1
Writing file \\server-win-fs05\data\divisions\COL\Col\MVP\MVP_Website_Uploads\bat_test\NC13_20161012_1.txt
Yes, you can run the PowerShell script from a bat file.
For example, you can save the PowerShell code into a file named Splitfile.ps1 in the C:\Script folder
and use the following code in the bat file to execute it:
@ECHO OFF
Powershell.exe -ExecutionPolicy Bypass -Command C:\Script\Splitfile.ps1
PAUSE
ASKER
Never mind, there was a typo in my path. So that worked, although a lot slower than I thought it would. I guess I could just make a bat file to execute the ps1 file, right?
ASKER
okay, great - thanks
It's slow because it's writing directly to the file share. If that's too much of a burden, we can try saving all the parts in a local folder and copying them to the share once the splitting is complete. That might work a bit faster.
ASKER
Yeah, can we edit it to first copy to a local temp folder and then move the results back afterwards? It's been running 5 min on a 5,000 KB text file and has only written 650 KB.
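One way the "split locally, then move" idea could look is sketched below. This is not the members-only solution from this thread. Split-FileLocal, $TempPath and the 3.9MB default are hypothetical names and values chosen for illustration, and the text encoding is an assumption. Apart from the network share, the original function probably also loses time because Add-Content reopens the output file for every line and Get-Item queries its size for every line, so this sketch keeps one writer open and counts the bytes itself.

```powershell
# Hypothetical helper (an illustration, not the thread's accepted solution):
# split on line ends into a local folder, then move the parts to $DestPath.
function Split-FileLocal {
    param(
        [string]$InputFile,                                   # file to split
        [string]$DestPath,                                    # final folder, e.g. the share
        [string]$TempPath = [System.IO.Path]::GetTempPath(),  # local working folder (assumption)
        [long]$Size = 3.9MB                                   # upper limit per part, in bytes
    )
    $source   = Get-Item -LiteralPath $InputFile
    $encoding = [System.Text.Encoding]::Default   # assumption: adjust to the file's real encoding
    $reader   = New-Object System.IO.StreamReader($source.FullName, $encoding)
    $parts    = @()
    $writer   = $null
    $count    = 0
    $written  = 0
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            $lineBytes = $encoding.GetByteCount($line) + 2    # +2 for <CR><LF>
            # Start a new part when this line would push the current part past $Size.
            if (($null -eq $writer) -or (($written + $lineBytes) -gt $Size)) {
                if ($writer) { $writer.Dispose() }
                $count++
                $name   = "{0}_{1}{2}" -f $source.BaseName, $count, $source.Extension
                $part   = Join-Path $TempPath $name
                $writer = New-Object System.IO.StreamWriter($part, $false, $encoding)
                $writer.NewLine = "`r`n"                      # keep <CR><LF> line ends
                $parts += $part
                $written = 0
            }
            $writer.WriteLine($line)
            $written += $lineBytes
        }
    }
    finally {
        if ($writer) { $writer.Dispose() }
        $reader.Dispose()
    }
    # Move the finished parts to the destination in one pass.
    if (-not (Test-Path -LiteralPath $DestPath)) {
        New-Item -ItemType Directory -Path $DestPath -Force | Out-Null
    }
    foreach ($p in $parts) {
        Move-Item -LiteralPath $p -Destination $DestPath -Force
        Join-Path $DestPath (Split-Path $p -Leaf)             # emit the final path of each part
    }
}
```

A call could look like Split-FileLocal -InputFile C:\Temp\Test.txt -DestPath C:\temp\testing -Size 3.9MB. A single line longer than $Size still ends up in a part of its own that exceeds the limit, and the last part always ends with <CR><LF> even if the source file did not.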
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
So what variables do I need to edit here? Just the PSFILE, to point to the ps1 file location?
I see that the bat file sets the file, but I won't know the file name. Should I change it to .\*.txt?
ASKER
Never mind, I get it: c:\temp is the temporary folder for splitting.
ASKER
appreciate your help!
You're rather sparse with the details required to help you.
Does this file have a completely random name, or are some parts static?
Do you just want to process all files of a specific extension, and/or is the only file of its type in the folder?
Do you want the processed file(s) to end up in the same location as the source, or do you need them in a different directory?
Do you want the file processed/moved (and so renamed with the index) even if it is smaller than the size limit, or can this never happen anyway?
Do you want the file split exactly at the given block size, or rather at the end of a line?
If line based:
- Will the lines always be shorter than your chosen block size?
- What is the program using as EOL - the usual <CR><LF> or something else?
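The two line-based questions can be checked against a sample file before a split strategy is chosen. The sketch below is only an illustration: Get-LineStats and its property names are hypothetical, it reads the whole file into memory (reasonable for files of up to about 30 MB), and it measures line length in characters, which equals bytes only for single-byte encodings.

```powershell
# Hypothetical helper: report which line endings a file uses and how long
# its longest line is, compared with the intended block size.
function Get-LineStats {
    param(
        [string]$Path,            # file to inspect
        [long]$BlockSize = 4MB    # intended maximum part size
    )
    $text = [System.IO.File]::ReadAllText((Get-Item -LiteralPath $Path).FullName)
    # Count <CR><LF> pairs and line feeds that are not preceded by a carriage return.
    $crlf   = [regex]::Matches($text, "`r`n").Count
    $bareLf = [regex]::Matches($text, "(?<!`r)`n").Count
    # Longest line in characters.
    $longest = ($text -split "`r?`n" | Measure-Object -Property Length -Maximum).Maximum
    [pscustomobject]@{
        CrLfCount   = $crlf
        BareLfCount = $bareLf
        LongestLine = $longest
        FitsInBlock = ($longest -lt $BlockSize)
    }
}
```

For example, Get-LineStats -Path C:\Temp\Test.txt -BlockSize 4MB returns a single object. A non-zero BareLfCount suggests the file is not purely <CR><LF>, and FitsInBlock being False means at least one line could not fit into a part on its own.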