Solved

Process one CSV vs another with reverse wildcard search and numeric calculation and export results to new file

Posted on 2016-07-20
Last Modified: 2016-08-03
I have 2 csv files: SampleFile.csv and priority.csv. The first 2 columns in SampleFile.csv are blank and need to be calculated before the results are exported into a new csv/xls file. Each calculated field needs to return the corresponding SortPriority and SkillPriority value if any keyword listed in priority.csv appears in the Title field of SampleFile.csv (basically a wildcard search against each row in priority.csv's Keyword column). If there are multiple hits, the numbers need to be added together and returned; if there are no hits, the value equals 0.

This calculation will have to be done for both the Title Sort Priority and Skill Sort Priority columns.


Example 1
Title = Vice President Security
The keywords “Vice President” and “Security” are both hits in priority.csv (each would return a positive number with an InString function) and each is valued at 2, so the result returned should be 4.

Example 2
Title = IT Ops Manager
There are no matches, so the value returned will be 0. If the title contained “IT Operations” instead, it would be a hit and the Title Sort Priority would be 1.
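
The rule described in the two examples can be sketched roughly as follows. This only illustrates the arithmetic and is not the accepted solution: the function name `Get-PrioritySum` and the in-memory `$keywords` table are made up for the example, and the three keyword values are the ones quoted in the examples above.

```powershell
# Hypothetical keyword table; in practice these rows would come from priority.csv.
$keywords = @{
    'Vice President' = 2
    'Security'       = 2
    'IT Operations'  = 1
}

function Get-PrioritySum {
    param(
        [string]$Text,
        [hashtable]$Keywords
    )

    $sum = 0
    foreach ($keyword in $Keywords.Keys) {
        # Case-insensitive substring test, similar in spirit to InStr(...) > 0
        if ($Text.IndexOf($keyword, [System.StringComparison]::OrdinalIgnoreCase) -ge 0) {
            $sum += [int]$Keywords[$keyword]
        }
    }
    $sum
}

Get-PrioritySum -Text 'Vice President Security' -Keywords $keywords   # 4
Get-PrioritySum -Text 'IT Ops Manager' -Keywords $keywords            # 0
Get-PrioritySum -Text 'IT Operations' -Keywords $keywords             # 1
```

The search runs "in reverse" compared with a vlookup: every keyword is tested against the title, rather than the title being looked up in the keyword list.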

So far I have tried using Excel functions, vbScript, and PowerShell to do this, but I am either overthinking it or my logic is bad, since I can’t figure out how to do it. Now I am reaching out for assistance. I thought this would be easy with vlookup, but it's essentially a reverse vlookup or reverse wildcard search.

I would prefer the solution be in PowerShell but I would be fine with a vbScript or Excel macro. Here are 2 screenshots of a subset of each of the file contents. The bottom image is the desired output from the script.

Screenshot of SampleFile.csv
Screenshot of priority.csv
Desired output that will be written to processed file
SampleFile.csv
priority.csv
Question by:Stormageddon

Accepted Solution

by:
Chris Dent earned 2000 total points
ID: 41722973
Here you go, it works against the sample :) All it lacks is an output pipeline; piping into Export-Csv after the very last } will deal with that.
# Build a nested hashtable from priorities. If this is the extent of priorities it will be sufficient.
# If the file is a lot bigger something more efficient may be needed.
$priorities = @{}
Import-Csv priority.csv | ForEach-Object {
    if (-not $priorities.Contains($_.KeywordType)) {
        $priorities.Add($_.KeywordType, @{$_.Keyword = $_.SortPriority})
    } else {
        if (-not $priorities[$_.KeywordType].Contains($_.Keyword)) {
            $priorities[$_.KeywordType].Add($_.Keyword, $_.SortPriority)
        }
    }
}

Import-Csv SampleFile.csv | ForEach-Object {
    $entry = $_
    $name = '{0} {1}' -f $entry.'First Name', $entry.'Last Name'

    $priorities.Keys | ForEach-Object {
        $keywordType = $_

        # This gets the value from either the Title or Skills column to test
        $valueToMatch = $entry.$keywordType

        # This returns the priority to set
        $priorityToSet = 0
        $priorities[$keywordType].Keys | ForEach-Object {
            # Match against the defined keyword (regex match)
            if ($valueToMatch -match $_) {
                # This gets the priority value associated with that keyword type and the matched keyword
                $priorityToSet += [Int]$priorities[$keywordType][$_]
            }
        }

        # Set the value
        # Note: This trims "s" off the end of the KeywordType value if it is present.
        $entry."$($keywordType.TrimEnd('s')) Sort Priority" = $priorityToSet
    }

    # Leave the modified entry in the output pipeline
    $entry
}
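
The export step mentioned at the top of this post can be sketched as below. The two rows are made-up stand-ins for the objects the script leaves in the pipeline, and the output path in the temp folder is only an assumption for the illustration; any writable path would do.

```powershell
# Made-up rows standing in for the objects the script above leaves in the pipeline.
$rows = @(
    [pscustomobject]@{ Title = 'Vice President Security'; 'Title Sort Priority' = 4 }
    [pscustomobject]@{ Title = 'IT Ops Manager';          'Title Sort Priority' = 0 }
)

# Hypothetical output location.
$outputFile = Join-Path ([System.IO.Path]::GetTempPath()) 'SampleFile_Processed.csv'

# Equivalent to writing "} | Export-Csv ..." after the script's final closing brace.
$rows | Export-Csv -Path $outputFile -NoTypeInformation

# Read the file back to confirm what was written.
$check = Import-Csv -Path $outputFile
$check | Format-Table
```

Exporting once at the end of the pipeline writes the whole file in a single pass and overwrites any earlier output at the same path.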


Chris

Author Comment

by:Stormageddon
ID: 41741228
Sorry for the delay in response, I was on vacation. Thank you so much for figuring out the solution. This works great for my needs. I made a few edits and am using the below:
# Prompts for a CSV file; defined first so that it exists before it is called.
Function Get-FileName($initialDirectory)
{
    Add-Type -AssemblyName System.Windows.Forms

    $OpenFileDialog = New-Object System.Windows.Forms.OpenFileDialog
    $OpenFileDialog.initialDirectory = $initialDirectory
    $OpenFileDialog.filter = "CSV (*.csv)|*.csv"
    $OpenFileDialog.ShowDialog() | Out-Null
    $OpenFileDialog.filename
}

$DateStr = (Get-Date).ToString('MM-dd-yyyy')
$InputFile = Get-FileName -initialDirectory "C:\Temp"
$OutputFile = $InputFile.Replace(".csv", "_Processed($DateStr).csv")
$FilePath = Split-Path -Path $InputFile

# Build a nested hashtable from priorities. If this is the extent of priorities it will be sufficient.
# If the file is a lot bigger something more efficient may be needed.
$priorities = @{}
Import-Csv (Join-Path $FilePath 'priority.csv') | ForEach-Object {
    if (-not $priorities.Contains($_.KeywordType)) {
        $priorities.Add($_.KeywordType, @{$_.Keyword = $_.SortPriority})
    } else {
        if (-not $priorities[$_.KeywordType].Contains($_.Keyword)) {
            $priorities[$_.KeywordType].Add($_.Keyword, $_.SortPriority)
        }
    }
}

Import-Csv $InputFile | ForEach-Object {
    $entry = $_
    $name = '{0} {1}' -f $entry.'First Name', $entry.'Last Name'

    $priorities.Keys | ForEach-Object {
        $keywordType = $_

        # This gets the value from either the Title or Skills column to test
        $valueToMatch = $entry.$keywordType

        # This returns the priority to set
        $priorityToSet = 0
        $priorities[$keywordType].Keys | ForEach-Object {
            # Match against the defined keyword (regex match)
            if ($valueToMatch -match $_) {
                # This gets the priority value associated with that keyword type and the matched keyword
                $priorityToSet += [Int]$priorities[$keywordType][$_]
            }
        }

        # Set the value
        # Note: This trims "s" off the end of the KeywordType value if it is present.
        $entry."$($keywordType.TrimEnd('s')) Sort Priority" = $priorityToSet
    }

    # Leave the modified entry in the output pipeline
    $entry
} | Export-Csv $OutputFile -NoTypeInformation   # Written once, so a re-run overwrites instead of appending duplicates
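
One caveat applies to the `-match` test in the script above: the keyword is interpreted as a regular expression, so a keyword containing characters such as `+` or `.` may raise an error or match more than intended. Below is a minimal sketch of one way around this, assuming literal substring matching is what is wanted (as the original question describes). The keywords `C++` and `.NET` and the sample text are invented for the illustration and do not come from priority.csv.

```powershell
# Invented keywords that contain regex metacharacters.
$keywords = @{ 'C++' = 2; '.NET' = 1 }
$valueToMatch = 'Senior C++ and .NET developer'

$priorityToSet = 0
foreach ($keyword in $keywords.Keys) {
    # [regex]::Escape turns 'C++' into 'C\+\+' so -match treats it literally.
    if ($valueToMatch -match [regex]::Escape($keyword)) {
        $priorityToSet += [int]$keywords[$keyword]
    }
}
$priorityToSet   # 3

# Unescaped, the dot matches any character, so '.NET' also hits 'Internet'.
$unescapedHit = 'Internet' -match '.NET'                   # True
$escapedHit   = 'Internet' -match [regex]::Escape('.NET')  # False
```

If the keywords in priority.csv are plain words, as in the sample, escaping changes nothing; it only matters once punctuation shows up in the Keyword column.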



Expert Comment

by:Kyle Santos
ID: 41741273
Hi Stormageddon,

Hope vacation was great!  Select Chris Dent's solution as the Best Solution to close the question and award him for his efforts.  =)

Author Closing Comment

by:Stormageddon
ID: 41741338
Excellent work!

Question has a verified solution.
