Batch to remove commas, qoutes

W.E.B
W.E.B used Ask the Experts™
on
Hello,
can you please help,
I have a folder with couple csv files.
I'm trying to automate this by running a batch to remove the commas & quotes from the files.
The file names are not always the same, but they will always be in the same folder (C:\Users\Wassim\Desktop\POD Updates)

Any help is appreciated,
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®

Commented:
You mean from the file names? or from the content of the file?

Author

Commented:
From the content of the files.
thanks
Most Valuable Expert 2018
Distinguished Expert 2018

Commented:
Please provide a (sanitized) sample, and what's supposed to happen with it.
What's supposed to happen with the commas? A csv with the commas removed isn't a csv anymore, so do you really want all the fields in a line concatenated?
Are there any special characters in the csv file that might mess up batch processing, like &, ^, |, <, >?
Are there any commas inside a field (enclosed in double quotes), and if so, what's supposed to happen with these?
What's the OS you're running this on, and is Powershell an option as well?
OWASP Proactive Controls

Learn the most important control and control categories that every architect and developer should include in their projects.

Author

Commented:
ok,
let me explain what to do now,
I receive the csv files from my clients,
I have to open the csv file (Using Excel), find replace all commas, find replace quotes,
then save again as CSV.

then, I use this file for importing data into My database (SQL).

So basically I'm removing all extra characters found inside the file.

I'm using windows 10.
thanks
Most Valuable Expert 2018
Distinguished Expert 2018

Commented:
OK; so you don't want all commas and quotes removed from the files, you only want them removed from inside the fields?
Though since you seem to be able to import them into Excel, it seems to be a properly formatted csv, so why do you have problems importing them into SQL?
Still, please provide a small (a few lines are enough) sanitized representative sample of the exact format you're getting.
And just in case: what size range are we talking about? KB, MB, GB?

Author

Commented:
Hello,
Please see attached sample,

Note,
Even if the csv file doesn't have any obvious commas, for some reason,  I still need to find replace the commas , save as csv again. (This is the only way I'm able to use the csv file to update my datbase).

Appreciate your help,
Sample.csv
Bill PrewIT / Software Engineering Consultant
Top Expert 2016

Commented:
When you use the process you are currently using, and take the example file you just provided, bring it into excel, do the replaces, and save as a CSV again, can you post what that looks like so we can see the difference?

~bp

Author

Commented:
It's a bit hard to send you the sample file without me having to change all data inside.
and when I change the data, save as csv, it's not going to show properly.

To be clear on my question,
I just need to open the csv file, find replace , save as csv file again.

thank you,
Most Valuable Expert 2018
Distinguished Expert 2018
Commented:
The issue here is that you shouldn't have to edit a well-formed csv before you import it to SQL.
That aside, your sample doesn't contain a line with a comma, other than the ones used as delim, so we can't tell if there's anything special going on with your import format. It doesn't contain a line with double quotes either, but enough single quotes, so I assume these are meant to be replaced.
Now, if you save an Excel sheet as csv, it will by default enclose (only) fields that contain the delim in double quotes, and it will escape a double quote inside a field with a second double quote. The default delim is ";". Is that the case for your export as well?
Can your SQL import handle fields enclosed in double quotes?

For starters, here's a Powershell script; you can pass it a path of a file to process (wildcards accepted), default is all csv files in the current directory. It will add a prefix to the edited files, and will skip any files starting with that prefix.
You can change the $ReplacementList variable to your needs.
Save it as Whatever.ps1 and run it like that:
C:\Temp\Whatever.ps1 -Path "C:\Users\Wassim\Desktop\POD Updates\*.csv"

Open in new window

Or, for example, process only a single file, and use "<TAB>" as the export delimiter:
C:\Temp\Whatever.ps1 -Path "C:\Users\Wassim\Desktop\POD Updates\Sample.csv" -ExportDelim "`t"

Open in new window

[CmdletBinding()]
Param(
	[string]$Path = "$(Get-Location -PSProvider FileSystem).Path)\*.csv",
	[string]$ImportDelim = ",",
	[string]$ExportDelim = ";",
	[string]$ExportPrefix = "_Edited_",
	[string]$Encoding = "UTF8"
)
$ReplacementList = @{
	"'" = "_"
	"," = "_"
}
If (-not (Test-Path -Path $Path)) {
	Throw "No csv files found in '$($Path)'"
}
Get-Item -Path $Path |
	Where-Object {$_.Name -notlike "$($ExportPrefix)*"} |
	ForEach-Object {
		$OutFile = $_.DirectoryName + "\" + $ExportPrefix + $_.Name
		"Processing '$($_.FullName)' --> '$($OutFile)'" | Write-Host
		$FirstLine = Get-Content -Path $_.FullName -TotalCount 1
		$FieldCount = ([regex]::Matches($FirstLine, $ImportDelim + '(?=(?:[^"]*"[^"]*")*[^"]*\Z)')).Count + 1
		$Header = 1..$FieldCount | % {"Col_$($_)"}
		Import-Csv -Path $_.FullName -Delimiter $Importdelim -Header $Header -Encoding $Encoding |
			ForEach-Object {
				ForEach ($Column In $Header) {
					ForEach ($Replacement In $ReplacementList.GetEnumerator()) {
						$_.$Column = $_.$Column.Replace($Replacement.Key, $Replacement.Value)
					}
				}
				$_
			} |
			ConvertTo-Csv -NoTypeInformation -Delimiter $ExportDelim |
			Select-Object -Skip 1 |
			Set-Content -Path $OutFile -Encoding $Encoding
	}

Open in new window

Author

Commented:
Thank you very much.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial