motioneye

PowerShell script to remove extra double quotes (") from every line

I have a text file that needs to be imported into a database. Sometimes the import fails because of an extra double quote in a line. Before performing the import, how can I use a PowerShell script to scan every line and remove the unwanted extra double quotes?

As seen below, between the "$" delimiters there are extra double quotes around "Maxis Segar ". This is only one example; the text file has many lines like this.

"$"Limited  Company "Maxis Segar ""$"
Powershell


8/22/2022 - Mon
Aard Vark

There's probably a bunch of ways to do this. But here's a quickie.

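# $file is assumed to hold the lines of the text file, e.g. from Get-Content.
foreach ($line in $file)
{
	# Keep the delimiters ($1, $5) and the text ($3); drop the stray quotes ($2, $4).
	$line -replace '("\$"[\w\s]*)(")?([\w\s]+)?(")?("\$")','$1$3$5'
}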
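
To see what this pattern does in practice, here is a minimal sketch that runs it over a few sample lines taken from this thread. The variable names ($samples, $pattern, $cleaned) are made up for illustration; only the regex itself comes from the post above.

```powershell
# Sample lines from this thread; the variable names are hypothetical.
$samples = @(
    '"$"Limited  Company "Maxis Segar ""$"'
    '"$"Limited  Company Maxis Segar "$"'
    '"$"Blah Blah "$"'
)

# The regex quoted in the post above, unchanged.
$pattern = '("\$"[\w\s]*)(")?([\w\s]+)?(")?("\$")'

# Apply the replacement to every sample and collect the results.
$cleaned = foreach ($line in $samples) { $line -replace $pattern, '$1$3$5' }

# Print each input next to its result for comparison.
for ($i = 0; $i -lt $samples.Count; $i++) {
    '{0}  ->  {1}' -f $samples[$i], $cleaned[$i]
}
```

Lines that are already clean should pass through unchanged, which makes the loop reasonably safe to run over a whole file before trusting it with an import.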


Note that it only targets patterns like your specific examples, though. E.g.:

"$"Limited  Company "Maxis Segar ""$"
"$"Limited  Company Maxis Segar "$"
"$"Blah Blah "$"
"$"sdfsdfsdf "Blah Blah" "$"
Benjamin Voglar

$test = Get-Content C:\IT\querry.txt

$newtest = $test -replace '""' , '"'

$newtest > c:\it\newquerry.txt
Qlemo

Benjamin, it is not about replacing two successive double quotes only - each double quote inside of the delimiters should get removed.

Another way to do it is to remove the delimiters, remove double quotes, then add back delimiters:
Get-Content file.txt -replace '"\$"', '' -replace '"', '' -replace '(.+)', '"$"$1"$"' | Out-File file2.txt


There are no real pros or cons compared to -replace '("\$"[\w\s]*)(")?([\w\s]+)?(")?("\$")','$1$3$5' or similar regular expressions.
motioneye

ASKER
Hi,

I tried one of the example scripts here, but I got the error below.

Windows PowerShell
Copyright (C) 2014 Microsoft Corporation. All rights reserved.

PS C:\Users\BDXTY> Get-Content D:\Programs\actual\Folder3\Output\SolidFile.csv -replace '"\$"', '' -replace '"', '' -replace '(.+)', '"$"$1"$"' | Out-File D:\Programs\actual\Folder3\Output\SolidFile.csv
Get-Content : A parameter cannot be found that matches parameter name 'replace'.
At line:1 char:47
+ Get-Content D:\Programs\actual\Folder3\Output\SolidFile.csv -replace '"\$"', '' -replace '"',  ...
+                                               ~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [Get-Content], ParameterBindingException
    + FullyQualifiedErrorId : NamedParameterNotFound,Microsoft.PowerShell.Commands.GetContentCommand

PS C:\Users\BDXTY>
Qlemo

Sorry, forgot to copy the parens over to EE:
(Get-Content D:\Programs\actual\Folder3\Output\SolidFile.csv) -replace '"\$"', '' -replace '"', '' -replace '(.+)', '"$"$1"$"' | Out-File D:\Programs\actual\Folder3\Output\SolidFile.csv


Also, be careful: you are overwriting the original file. That is dangerous in a pipeline that reads the file content while it processes (and replaces) it, because it leads to an empty output file. The parentheses take care of that, as the file content is read in completely before it gets processed.
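
The read-then-write behaviour described above can be tried on a throwaway file first. This is only a sketch: the temp-file path and the backup copy are assumptions added for illustration and are not part of the original command.

```powershell
# Hypothetical throwaway file, so the real data is never touched.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'quote-demo.txt'
Set-Content -Path $path -Value '"$"Limited  Company "Maxis Segar ""$"'

# Assumption: keep a backup, because the next command overwrites its own input.
Copy-Item -Path $path -Destination "$path.bak" -Force

# The parentheses make Get-Content read the whole file before the pipeline
# starts, so Out-File can safely reopen the same path for writing.
(Get-Content -Path $path) -replace '"\$"', '' -replace '"', '' -replace '(.+)', '"$"$1"$"' |
    Out-File -FilePath $path

# Read the rewritten file back to check the result.
$result = Get-Content -Path $path
$result
```

Writing to a different output file instead of the input path would avoid the overwrite risk altogether.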
motioneye

ASKER
Hi Qlemo.
The script works, but it somehow removed the double quotes of every column delimiter. Take the example below:

"$"Limited  Company "Maxis Segar ""$"

It should look like this:

"$"Limited  Company Maxis Segar "$"
Qlemo

With the above input, it generates the above output. If it does not for you, can you please post an input file and the output file you get?
motioneye

ASKER
Hi Qlemo
You can try with the input file attached here. What I would like this script to do is preserve every "$" and remove only the extra double quotes between the delimiters.
input.csv
SOLUTION
Qlemo

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
motioneye

ASKER
Hi,
One more issue puzzles me now. I created many entries in the text file by duplicating the first line, 1000 duplicate lines to be exact. When I ran the command again, the size of the text file doubled, as shown in the screenshot. Any reason why this happened?
Capture.PNG
motioneye

ASKER
Hi Qlemo,
I managed to resolve this by adding -Encoding ASCII; the file then has the same size as the original file.
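
A likely explanation for the doubled size: in Windows PowerShell (the version shown in the error output earlier in this thread), Out-File writes UTF-16 LE ("Unicode") by default, which takes two bytes per ASCII character. The sketch below only compares byte counts for a sample line; the line itself is made up, and it assumes the original file was stored in a single-byte encoding.

```powershell
# Hypothetical sample line; any plain-ASCII text behaves the same way.
$line = '"$"Limited  Company Maxis Segar "$"'

# Bytes needed to store the line in ASCII versus UTF-16 LE ("Unicode").
$asciiBytes = [System.Text.Encoding]::ASCII.GetByteCount($line)
$utf16Bytes = [System.Text.Encoding]::Unicode.GetByteCount($line)

"ASCII: $asciiBytes bytes, UTF-16 LE: $utf16Bytes bytes"
```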
By the way, would you mind explaining how each -replace works? What does it actually do here?

-replace '"\$"', '|' -replace '"', '' -replace '\|', '"$"' -replace '(.+)\$', '"$1"$'
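
For readers following along, the chain above can be traced one step at a time. This is only a sketch. The real input.csv is not shown in the thread, so the sample line is a guess at its layout (fields wrapped in ", separated by $, with a trailing $), and the step variables are hypothetical. It also assumes that | never occurs in the data, since the chain uses it as a temporary placeholder.

```powershell
# Hypothetical input line; the actual file layout is an assumption.
$line = '"Limited  Company "Maxis Segar ""$"Singapore"$'

# 1. Swap each "$" delimiter for a placeholder so it survives step 2.
$step1 = $line -replace '"\$"', '|'
# 2. Remove every double quote that is left, including the stray ones.
$step2 = $step1 -replace '"', ''
# 3. Turn the placeholders back into "$" delimiters.
$step3 = $step2 -replace '\|', '"$"'
# 4. Re-add the opening quote and the quote before the trailing $.
$step4 = $step3 -replace '(.+)\$', '"$1"$'

# Show the line after each step.
$step1, $step2, $step3, $step4
```

The single quotes around each pattern and replacement matter: they stop PowerShell from expanding $1 as a variable, so the regex engine sees it as a reference to the first capture group.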
ASKER CERTIFIED SOLUTION
Qlemo

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
motioneye

ASKER
Thanks Qlemo, you are always nice and helpful to us here :)