Solved

Process one CSV vs another with reverse wildcard search and numeric calculation and export results to new file

Posted on 2016-07-20
Last Modified: 2016-08-03
I have 2 csv files: SampleFile.csv and priority.csv, and skills.csv. The first 2 columns in SampleFile.csv are blank and need to be calculated before the results are exported to a new csv/xls file. Each calculated field needs to return the corresponding SortPriority and SkillPriority value if any keyword from priority.csv appears anywhere in the Title field of SampleFile.csv (basically a wildcard search of the Title against each row in priority.csv's Keyword column). If there are multiple hits, the numbers need to be added together and returned; if there are no hits, the value equals 0.

This calculation has to be done for both the Title Sort Priority and Skill Sort Priority columns.


Example 1
Title = Vice President Security
The keywords “Vice President” and “Security” are both hits in priority.csv (each would return a positive number with an InString function) and each is valued at 2, so the result returned should be 4.

Example 2
Title = IT Ops Manager
There are no matches so the value that will be returned will be 0. If it were listed as “IT Operations” then it would be a hit and the Title Sort Priority would be 1.
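
The two examples above can be reproduced with a short PowerShell sketch. The keyword table is a hypothetical stand-in for priority.csv that contains only the three keywords quoted in the examples, and Get-TitlePriority is an illustrative name, not part of any module. It assumes a case-insensitive substring test (the PowerShell equivalent of InStr) is what is wanted.

```powershell
# Hypothetical stand-in for priority.csv: only the keywords quoted in the examples.
$keywords = [ordered]@{
    'Vice President' = 2
    'Security'       = 2
    'IT Operations'  = 1
}

# Illustrative helper: sums the priority of every keyword found inside the title.
function Get-TitlePriority {
    param(
        [string]$Title,
        [System.Collections.IDictionary]$Keywords
    )

    $total = 0
    foreach ($keyword in $Keywords.Keys) {
        # IndexOf returns -1 when the keyword is absent (InStr-style test, case-insensitive).
        if ($Title.IndexOf($keyword, [StringComparison]::OrdinalIgnoreCase) -ge 0) {
            $total += [int]$Keywords[$keyword]
        }
    }
    $total
}

Get-TitlePriority -Title 'Vice President Security' -Keywords $keywords
Get-TitlePriority -Title 'IT Ops Manager' -Keywords $keywords
Get-TitlePriority -Title 'IT Operations Manager' -Keywords $keywords
```

The three calls return 4, 0 and 1, matching Example 1, Example 2, and the "IT Operations" variant of Example 2.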

So far I have tried using Excel functions, vbScript, and PowerShell to do this, but I am either overthinking it or my logic is bad, since I can’t figure out how to do it. Now I am reaching out for assistance. I thought this would be easy with vlookup, but it's essentially a reverse vlookup or reverse wildcard search.

I would prefer the solution be in PowerShell but I would be fine with a vbScript or Excel macro. Here are 2 screenshots of a subset of each of the file contents. The bottom image is the desired output from the script.

[Screenshot of SampleFile.csv]
[Screenshot of priority.csv]
[Screenshot: desired output that will be written to the processed file]
Attached files: SampleFile.csv, priority.csv
Question by:Stormageddon
4 Comments
 

Accepted Solution

by:
Chris Dent earned 500 total points
ID: 41722973
Here you go, it works against the sample :) All it lacks is an output step: piping into Export-Csv after the very last } will deal with that.
# Build a nested hashtable from priorities. If this is the extent of priorities it will be sufficient.
# If the file is a lot bigger something more efficient may be needed.
$priorities = @{}
Import-Csv priority.csv | ForEach-Object {
    if (-not $priorities.Contains($_.KeywordType)) {
        $priorities.Add($_.KeywordType, @{$_.Keyword = $_.SortPriority})
    } else {
        if (-not $priorities[$_.KeywordType].Contains($_.Keyword)) {
            $priorities[$_.KeywordType].Add($_.Keyword, $_.SortPriority)
        }
    }
}

Import-Csv SampleFile.csv | ForEach-Object {
    $entry = $_
    $name = '{0} {1}' -f $entry.'First Name', $entry.'Last Name'

    $priorities.Keys | ForEach-Object {
        $keywordType = $_

        # This gets the value from either the Title or Skills column to test
        $valueToMatch = $entry.$keywordType

        # This returns the priority to set
        $priorityToSet = 0
        $priorities[$keywordType].Keys | ForEach-Object {
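            # Match against the defined keyword (regex match).
            # Note: -match treats the keyword as a regular expression; wrap it in
            # [regex]::Escape() if keywords may contain characters such as + or .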
            if ($valueToMatch -match $_) {
                # This gets the priority value associated with that keyword and the matched keyword
                $priorityToSet += [Int]$priorities[$keywordType][$_]
            }
        }

        # Set the value
        # Note: This trims "s" off the end of the KeywordType value if it is present.
        $entry."$($keywordType.TrimEnd('s')) Sort Priority" = $priorityToSet
    }

    # Leave the modified entry in the output pipeline
    $entry
}
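
The output step mentioned at the top of this answer could look like the sketch below. The two rows are made-up stand-ins for the objects the script emits, and the temporary output path is chosen purely for illustration.

```powershell
# Made-up rows standing in for the objects emitted by the script above.
$rows = @(
    [pscustomobject]@{ 'Title Sort Priority' = 4; Title = 'Vice President Security' }
    [pscustomobject]@{ 'Title Sort Priority' = 0; Title = 'IT Ops Manager' }
)

# Illustrative output path; any writable location works.
$outputFile = Join-Path ([System.IO.Path]::GetTempPath()) 'SampleFile_Processed.csv'

# Piping the whole stream into a single Export-Csv call writes the header once
# and opens the file once, so -Append is not needed.
$rows | ForEach-Object { $_ } | Export-Csv -Path $outputFile -NoTypeInformation

# Read the file back to confirm what was written.
$written = Import-Csv -Path $outputFile
$written | Format-Table -AutoSize
```

In the real script the `$rows | ForEach-Object { $_ }` part is replaced by the `Import-Csv SampleFile.csv | ForEach-Object { ... }` block, with `| Export-Csv ...` added after its closing brace.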


Chris
 

Author Comment

by:Stormageddon
ID: 41741228
Sorry for the delay in response, I was on vacation. Thank you so much for figuring out the solution. This works great for my needs. I made a few edits and am using the below:
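$DateStr = (Get-Date).ToString('MM-dd-yyyy')
# Get-FileName is defined at the end of this script; it must be loaded into the
# session before this line runs (the original call used the undefined name Get-OpenFile).
$InputFile = Get-FileName -initialDirectory "C:\Temp"
$OutputFile = $InputFile.Replace(".csv", "_Processed($DateStr).csv")
$FilePath = Split-Path -Path $InputFile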


# Build a nested hashtable from priorities. If this is the extent of priorities it will be sufficient.
# If the file is a lot bigger something more efficient may be needed.
$priorities = @{}
Import-Csv $FilePath\priority.csv | ForEach-Object {
    if (-not $priorities.Contains($_.KeywordType)) {
        $priorities.Add($_.KeywordType, @{$_.Keyword = $_.SortPriority})
    } else {
        if (-not $priorities[$_.KeywordType].Contains($_.Keyword)) {
            $priorities[$_.KeywordType].Add($_.Keyword, $_.SortPriority)
        }
    }
}

Import-Csv $InputFile | ForEach-Object {
    $entry = $_
    $name = '{0} {1}' -f $entry.'First Name', $entry.'Last Name'

    $priorities.Keys | ForEach-Object {
        $keywordType = $_

        # This gets the value from either the Title or Skills column to test
        $valueToMatch = $entry.$keywordType

        # This returns the priority to set
        $priorityToSet = 0
        $priorities[$keywordType].Keys | ForEach-Object {
            # Match against the defined keyword (regex match)
            if ($valueToMatch -match $_) {
                # This gets the priority value associated with that keyword and the matched keyword
                $priorityToSet += [Int]$priorities[$keywordType][$_]
            }
        }

        # Set the value
        # Note: This trims "s" off the end of the KeywordType value if it is present.
        $entry."$($keywordType.TrimEnd('s')) Sort Priority" = $priorityToSet
    }

    # Leave the modified entry in the output pipeline
    $entry | Export-CSV $OutputFile -Append -NoTypeInformation
}



# File-picker helper. Define (or dot-source) this before the script body above calls it.
Function Get-FileName($initialDirectory)
{
    # Add-Type replaces the deprecated LoadWithPartialName call
    Add-Type -AssemblyName System.Windows.Forms
    
    $OpenFileDialog = New-Object System.Windows.Forms.OpenFileDialog
    $OpenFileDialog.initialDirectory = $initialDirectory
    $OpenFileDialog.filter = "CSV (*.csv)|*.csv"
    $OpenFileDialog.ShowDialog() | Out-Null
    # Return the selected path (empty string if the dialog was cancelled)
    $OpenFileDialog.filename
}
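
One thing to watch in the file name logic above: String.Replace(".csv", ...) is case-sensitive and replaces every occurrence of ".csv" in the path, so a file named Data.CSV would come back unchanged and the output path would equal the input path. A possible alternative, sketched with System.IO.Path; Get-ProcessedPath is a hypothetical helper name and not part of the script above.

```powershell
# Hypothetical helper: builds "<name>_Processed(<date>)<ext>" next to the input file.
function Get-ProcessedPath {
    param(
        [string]$InputFile,
        [string]$DateStr = (Get-Date).ToString('MM-dd-yyyy')
    )

    $directory = [System.IO.Path]::GetDirectoryName($InputFile)
    $baseName  = [System.IO.Path]::GetFileNameWithoutExtension($InputFile)
    $extension = [System.IO.Path]::GetExtension($InputFile)

    # Only the final extension is touched, whatever its case.
    $newName = '{0}_Processed({1}){2}' -f $baseName, $DateStr, $extension
    [System.IO.Path]::Combine($directory, $newName)
}

$processedPath = Get-ProcessedPath -InputFile 'C:\Temp\SampleFile.CSV' -DateStr '08-03-2016'
$processedPath
```

With the sample arguments this yields C:\Temp\SampleFile_Processed(08-03-2016).CSV; in the script it would be called as `$OutputFile = Get-ProcessedPath -InputFile $InputFile`.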



Expert Comment

by:Kyle Santos
ID: 41741273
Hi Stormageddon,

Hope vacation was great!  Select Chris Dent's solution as the Best Solution to close the question and award him for his efforts.  =)
 

Author Closing Comment

by:Stormageddon
ID: 41741338
Excellent work!