RichardPWolf
asked on
Remove duplicate value in header of csv
I'm trying to create a monthly report and I've run into a stumbling block.
Setup; each month I get an exported .csv from a security appliance and it has a lengthy csv file.
On the header line I've got 2 cells that are very similar "DestinationLogonID" and "DestinationLogonid". As you can see the "only" difference is the case of the last 2 characters.
import-csv won't work as it sees these 2 header items as identical.
Now the good part is I don't need either of these columns as they are blank so either renaming one of them or deleting one or both columns are an option.
Using the -header option isn't really an option as there are about 147 columns.
My preferred method would be to rename one of the offending column headers.
I can read the file using the Get-Content and that "seems" to be the way to get the data into an array but that's where I'm stuck.
Here's where I'd like to go with this script;
-> Read input file and in this process rename offending header
-> Process specific columns into my report.
-> Write output file for report.
At this point I only need help with the first step as the others are more or less canned scripts that I've already created.
Thanks.
Richard
Setup; each month I get an exported .csv from a security appliance and it has a lengthy csv file.
On the header line I've got 2 cells that are very similar "DestinationLogonID" and "DestinationLogonid". As you can see the "only" difference is the case of the last 2 characters.
import-csv won't work as it sees these 2 header items as identical.
Now the good part is I don't need either of these columns as they are blank so either renaming one of them or deleting one or both columns are an option.
Using the -header option isn't really an option as there are about 147 columns.
My preferred method would be to rename one of the offending column headers.
I can read the file using the Get-Content and that "seems" to be the way to get the data into an array but that's where I'm stuck.
Here's where I'd like to go with this script;
-> Read input file and in this process rename offending header
-> Process specific columns into my report.
-> Write output file for report.
At this point I only need help with the first step as the others are more or less canned scripts that I've already created.
Thanks.
Richard
ASKER
I suppose but I was hoping for a search and replace as there are at least 147 headers and I didn't want to make the code "too" unwieldy. But maybe I'm making a mountain out of a molehill. One difficulty is some of these headers have a rather long name.
Brute-force approach: get the content, replace one of the offending names in the first element, convert from csv to PS objects, and filter out the offending columns:
$Content = Get-Content C:\Temp\test.csv
$Content[0] = $Content[0].Replace("DestinationLogonid", "DestLogonId")
$Csv = $Content | ConvertFrom-Csv | Select-Object -Property * -ExcludeProperty "DestinationLogonID", "DestLogonId"
ASKER
OK, looks good but one question since powershell is generally case insensitive how do I handle the two headers that are identical except for the case?
The .replace method is case sensitive.
PS C:\> $array
Test
Three
PS C:\> $array.replace("Three","One")
Test
One
PS C:\> $array.replace("three","One")
Test
Three
PS C:\>
Unlike the "-replace" operator, the "Replace()" method of a string object is case sensitive, so line 2 in the script above will only replace the "DestinationLogonid" with the lower case "id" with "DestLogonId"; "DestinationLogonID" will remain untouched.
From then on, the header is okay and the content can be converted to an object array.
From then on, the header is okay and the content can be converted to an object array.
ASKER
Cool.
did just run it and this is what I got. I did change one variable to go with my script $filecontent.
ConvertFrom-Csv : The member "DestinationLogonId" is already present.
At E:\Scripts\VPN Report\Get-VPNReport.ps1:6 1 char:24
+ $Csv = $FileContent | ConvertFrom-Csv | Select-Object -Property * -ExcludePrope ...
+ ~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [ConvertFrom-Csv], ExtendedTypeSystemExceptio n
+ FullyQualifiedErrorId : AlreadyPresentPSMemberInfo InternalCo llectionAd d,Microsof t.PowerShe ll.Command s.ConvertF ro
mCsvCommand
did just run it and this is what I got. I did change one variable to go with my script $filecontent.
ConvertFrom-Csv : The member "DestinationLogonId" is already present.
At E:\Scripts\VPN Report\Get-VPNReport.ps1:6
+ $Csv = $FileContent | ConvertFrom-Csv | Select-Object -Property * -ExcludePrope ...
+ ~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [ConvertFrom-Csv], ExtendedTypeSystemExceptio
+ FullyQualifiedErrorId : AlreadyPresentPSMemberInfo
mCsvCommand
Try manually running the 1st 2 lines oDbA provided, then "$content.[0]" and see if the target header was in fact replaced as intended.
The column in the error message is called "DestinationLogonId" with an uppercase "I" and a lowercase "d".
My script has the "DestinationLogonid" with a lowercase "i" and a lowercase "d" as per your first comment.
My script has the "DestinationLogonid" with a lowercase "i" and a lowercase "d" as per your first comment.
$Content = Get-Content C:\Temp\test.csv
$Content[0] = $Content[0].Replace("DestinationLogonId", "DestLogonId")
$Csv = $Content | ConvertFrom-Csv | Select-Object -Property * -ExcludeProperty "DestinationLogonID", "DestLogonId"
ASKER
One thing I just tried was to use the "first" iteration of the duplicate header which is DestinationLogonID and that allowed the script to run.
For testing purposes I added this -> Export-CSV .\vpn-outfile2.csv -notype but didn't get any data in the output file.
For testing purposes I added this -> Export-CSV .\vpn-outfile2.csv -notype but didn't get any data in the output file.
ASKER
Ahhh. Let me try that.
ASKER
OK, that fixed the original script.
Output file only has two elements A1=Length and A2 = 15.
Output file only has two elements A1=Length and A2 = 15.
Everything's running here as expected.
Here's the sophisticated csv I used:
Here's the sophisticated csv I used:
"Header1","Header2","DestinationLogonID","DestinationLogonId","Header5"
"Cola1","Cola2","Cola3","COLa4","Cola5"
"Colb1","Colb2","Colb3","COLb4","Colb5"
"Colc1","Colc2","Colc3","COLc4","Colc5"
Here's the script:$Content = Get-Content C:\Temp\test.csv
$Content[0] = $Content[0].Replace("DestinationLogonId", "DestLogonId")
$Csv = $Content | ConvertFrom-Csv | Select-Object -Property * -ExcludeProperty "DestinationLogonID", "DestLogonId"
$Csv | Export-Csv C:\Temp\Outfile.csv -NoTypeInformation
And here's the content of Outfile.csv:"Header1","Header2","Header5"
"Cola1","Cola2","Cola5"
"Colb1","Colb2","Colb5"
"Colc1","Colc2","Colc5"
I'll be offline for a few hours, so I'm not ignoring you if you don't hear from me ...
ASKER
Hmmmm. Here's my information. As you can see there are a LOT of headers.
Header line from csv -> Event Name EventInfo InsertionIP Manager DetectionIP InsertionTime DetectionTime Severity ToolAlias InferenceRule ProviderSID ExtraneousInfo UniqueID AccessGranted AccessProperties AccessRequested AccessTarget ActionTaken AgentAddress AgentUid AgentUniqueId AgentVersion AlertActivityType ApplicationName ApplicationVersion AuthPackage BIOSVersion BootCleaned BootSectorInfected BootSectorsScanned Category ChangeDetails ChangeType Component ConnectionName ConnectionStatus DN Description DestinationAccount DestinationAccountType DestinationDomain DestinationDomainID DestinationDomainType DestinationHandleID DestinationLogonID DestinationLogonId DestinationMACAddress DestinationMachine DestinationPort DestinationProcessID Detail DisplayName DomainMember DomainName DumpFile ElementName ElementProperty EventMessage FailureCount FailurePoint FailureReason FastPack FeaturePackage FileHandleID FileName FilesCleaned FilesDeleted FilesInfected FilesLeftAlone FilesQuarantined FilesScanned Filter GroupMember GroupName GroupType ImageFile InfectedFile InfoMessage InformationName InstallOperation Interface Interfaces InternalUserName LicenseMaximum LicensedTo LinkName ListName Locale LogonProcess LogonType Machine MachineType ManagerVersion Manufacturer MechanismName MechanismType MemberID MemberName MemoryCleaned MemoryInfected Message ModType Mode Model Modification ModificationType NewPropertyValue NewVersion ObjectHandleID ObjectName ObjectServer ObjectType OldPropertyValue OldVersion OperationID OperationType OriginalAccount OriginalAccountID PackVersion Parameters ParentPID PatchId PeerIdentity PolicyName Port PrimaryActionAttempt PrivilegesExercised PrivilegesUsed ProcessID ProductId ProfileConfiguration ProfileName Protocol QueryCommand References RegistryKeyHandleID RegistryKeyName RiskFactor ScanMessage ScanType SecondaryActionAttempt Service ServiceName ServingProcess SetCommand SignatureName SoftwareDate SoftwarePackage SoftwareSource SourceAccount SourceDomain SourceHandleID SourceLogonID SourceMACAddress SourceMachine SourcePort SourceProcessID StackTrace StartMessage StatusLevel StatusMessage StopCondition StopMessage SuggestedSolution TargetMachineList ToolID TrojanName URL Version VirusDetected VirusName WarningMessage
Code at this point ->
$FileContent = Get-Content .\vpn-infile.csv
$FileContent[0] = $FileContent[0].Replace("D estination Logonid", "DestLogonid")
$Csv = $FileContent | ConvertFrom-Csv | Select-Object -Property * -ExcludeProperty "DestinationLogonID", "DestLogonId"
Export-CSV .\vpn-outfile2.csv -notype
Results as before A1 = Length, A2 = 15
Header line from csv -> Event Name EventInfo InsertionIP Manager DetectionIP InsertionTime DetectionTime Severity ToolAlias InferenceRule ProviderSID ExtraneousInfo UniqueID AccessGranted AccessProperties AccessRequested AccessTarget ActionTaken AgentAddress AgentUid AgentUniqueId AgentVersion AlertActivityType ApplicationName ApplicationVersion AuthPackage BIOSVersion BootCleaned BootSectorInfected BootSectorsScanned Category ChangeDetails ChangeType Component ConnectionName ConnectionStatus DN Description DestinationAccount DestinationAccountType DestinationDomain DestinationDomainID DestinationDomainType DestinationHandleID DestinationLogonID DestinationLogonId DestinationMACAddress DestinationMachine DestinationPort DestinationProcessID Detail DisplayName DomainMember DomainName DumpFile ElementName ElementProperty EventMessage FailureCount FailurePoint FailureReason FastPack FeaturePackage FileHandleID FileName FilesCleaned FilesDeleted FilesInfected FilesLeftAlone FilesQuarantined FilesScanned Filter GroupMember GroupName GroupType ImageFile InfectedFile InfoMessage InformationName InstallOperation Interface Interfaces InternalUserName LicenseMaximum LicensedTo LinkName ListName Locale LogonProcess LogonType Machine MachineType ManagerVersion Manufacturer MechanismName MechanismType MemberID MemberName MemoryCleaned MemoryInfected Message ModType Mode Model Modification ModificationType NewPropertyValue NewVersion ObjectHandleID ObjectName ObjectServer ObjectType OldPropertyValue OldVersion OperationID OperationType OriginalAccount OriginalAccountID PackVersion Parameters ParentPID PatchId PeerIdentity PolicyName Port PrimaryActionAttempt PrivilegesExercised PrivilegesUsed ProcessID ProductId ProfileConfiguration ProfileName Protocol QueryCommand References RegistryKeyHandleID RegistryKeyName RiskFactor ScanMessage ScanType SecondaryActionAttempt Service ServiceName ServingProcess SetCommand SignatureName SoftwareDate SoftwarePackage SoftwareSource SourceAccount SourceDomain SourceHandleID SourceLogonID SourceMACAddress SourceMachine SourcePort SourceProcessID StackTrace StartMessage StatusLevel StatusMessage StopCondition StopMessage SuggestedSolution TargetMachineList ToolID TrojanName URL Version VirusDetected VirusName WarningMessage
Code at this point ->
$FileContent = Get-Content .\vpn-infile.csv
$FileContent[0] = $FileContent[0].Replace("D
$Csv = $FileContent | ConvertFrom-Csv | Select-Object -Property * -ExcludeProperty "DestinationLogonID", "DestLogonId"
Export-CSV .\vpn-outfile2.csv -notype
Results as before A1 = Length, A2 = 15
ASKER
Got it. missing a pipe before export-csv.
ASKER
It actually looks like it deleted both columns. Not a problem but I will need to look more closely at the output. Will save that for tomorrow.
So far Thanks.
So far Thanks.
You excluded both columns with your Select statement.
It may be unnecessary to post this, since oDbA's solution might do everything you need, but I had started it already, so...
Here's something that doesn't require you to know the name of the duplicate headers. It just appends a "0" to each duplicate header it finds. Because it works recursively, it doesn't care how many duplicates there are, it'll just keep appending more "0"s. Not real pretty, but should handle whether your fields in the .CSV are surrounded by quotes or not.
It may be unnecessary to post this, since oDbA's solution might do everything you need, but I had started it already, so...
Here's something that doesn't require you to know the name of the duplicate headers. It just appends a "0" to each duplicate header it finds. Because it works recursively, it doesn't care how many duplicates there are, it'll just keep appending more "0"s. Not real pretty, but should handle whether your fields in the .CSV are surrounded by quotes or not.
function unique-header ($header)
{
if ($headers -contains $header)
{
$header = """$($header -replace '"')$i"""
unique-header $header
}
Else
{
$Global:headers += $header
$header
}
}
$data = Get-Content somefile.csv
$data[0] = ($data[0] -split "," | % -begin {$headers = @(); $i=0 } -process { unique-header $_ }) -join ","
$data | ConvertFrom-Csv
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks to both oBdA and footech. I'll work with both of these and see which will work better.
-oBda - I see now. What I didn't understand was the -ExcludeProperty statement. I've been working with Powershell for about a year now but still have much to learn.
-footech- I like the method you have presented as I don't know if any of these columns will ever have anything that I need.
I'll work with both these ideas and let everyone know later today.
Thanks.
-oBda - I see now. What I didn't understand was the -ExcludeProperty statement. I've been working with Powershell for about a year now but still have much to learn.
-footech- I like the method you have presented as I don't know if any of these columns will ever have anything that I need.
I'll work with both these ideas and let everyone know later today.
Thanks.
ASKER
-footech- Sorry but still same problem;
ConvertFrom-Csv : The member "DestinationLogonId" is already present.
At E:\Scripts\VPN Report\Get-VPNReport.ps1:9 0 char:9
+ $data | ConvertFrom-Csv | Export-CSV .\vpn-outfile2.csv -notype
+ ~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [ConvertFrom-Csv], ExtendedTypeSystemExceptio n
+ FullyQualifiedErrorId : AlreadyPresentPSMemberInfo InternalCo llectionAd d,Microsof t.PowerShe ll.Command s.ConvertF ro
mCsvCommand
ConvertFrom-Csv : The member "DestinationLogonId" is already present.
At E:\Scripts\VPN Report\Get-VPNReport.ps1:9
+ $data | ConvertFrom-Csv | Export-CSV .\vpn-outfile2.csv -notype
+ ~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [ConvertFrom-Csv], ExtendedTypeSystemExceptio
+ FullyQualifiedErrorId : AlreadyPresentPSMemberInfo
mCsvCommand
Hmmm... I can't replicate your error with my code. I even tested with the big list that you have above.
I did make one change to handle quotes better.
One thing it wouldn't handle would be if your .CSV wasn't using a comma as a delimiter, or if there were commas inside a field.
I did make one change to handle quotes better.
function unique-header ($header)
{
if ($headers -contains $header)
{
$noquote = $header -replace '"'
$header = $header -replace "$noquote","$noquote$i"
unique-header $header
}
Else
{
$Global:headers += $header
$header
}
}
One thing it wouldn't handle would be if your .CSV wasn't using a comma as a delimiter, or if there were commas inside a field.
ASKER
Thanks, I'll try it out.
ASKER
Thank you to both of you but in the end I've used oDbA solution as it worked out better for my situation. Definitely opened up possibilities with the .Replace object as I've never seen that before.
Again thank you both.
Richard
Again thank you both.
Richard
$array = "One","Two","Three"
$array2 = $array.replace("Two", "Four")
or to address the specific element,
$array2 = $array[2].replace("Three",