Solved

Process one CSV vs another with reverse wildcard search and numeric calculation and export results to new file

Posted on 2016-07-20
Last Modified: 2016-08-03
I have 2 csv files: SampleFile.csv and priority.csv, and skills.csv. The first 2 columns in SampleFile.csv are blank and need to be calculated before the results are exported to a new csv/xls file. Each calculated field needs to return a corresponding SortPriority and SkillPriority value if any part of the Title field in SampleFile.csv matches a keyword in priority.csv (basically a wildcard search of the Title against each row in priority.csv's Keyword column). If there are multiple hits, the numbers need to be added together and returned; if there are no hits, the value equals 0.

This calculation will have to be done for both the Title Sort Priority and Skill Sort Priority columns.


Example 1
Title = Vice President Security
The keywords “Vice President” and “Security” are both hits in priority.csv (each would return a positive number with an InString function) and each is valued at 2, so the result returned should be 4.

Example 2
Title = IT Ops Manager
There are no matches so the value that will be returned will be 0. If it were listed as “IT Operations” then it would be a hit and the Title Sort Priority would be 1.
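
A minimal sketch of the calculation described in the two examples, not the thread's solution. It assumes the keyword values stated above (2 for “Vice President”, 2 for “Security”, 1 for “IT Operations”); the names `$titleKeywords` and `Get-KeywordScore` are hypothetical, and a plain case-insensitive substring test stands in for InString.

```powershell
# Keyword/priority pairs taken from the two examples above; in practice these
# would be read from priority.csv. All names here are hypothetical.
$titleKeywords = [ordered]@{
    'Vice President' = 2
    'Security'       = 2
    'IT Operations'  = 1
}

function Get-KeywordScore {
    param(
        [string]$Text,
        [System.Collections.IDictionary]$Keywords
    )

    $score = 0
    foreach ($keyword in $Keywords.Keys) {
        # Case-insensitive substring test: is the keyword anywhere in the text?
        if ($Text.IndexOf($keyword, [System.StringComparison]::OrdinalIgnoreCase) -ge 0) {
            # Every hit adds its priority to the running total
            $score += [int]$Keywords[$keyword]
        }
    }
    $score
}

Get-KeywordScore -Text 'Vice President Security' -Keywords $titleKeywords   # Example 1
Get-KeywordScore -Text 'IT Ops Manager' -Keywords $titleKeywords            # Example 2
```

With these values the first call returns 4 and the second returns 0, matching the expected results in the examples. The lookup is "reversed" in the sense that each keyword is searched for inside the title, rather than the title being looked up in a table.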

So far I have tried using Excel functions, vbScript, and PowerShell to do this, but I am either overthinking it or my logic is bad, since I can’t figure out how to do it. Now I am reaching out for assistance. I thought this would be easy with vlookup, but it's essentially a reverse vlookup or reverse wildcard search.

I would prefer the solution be in PowerShell but I would be fine with a vbScript or Excel macro. Here are 2 screenshots of a subset of each of the file contents. The bottom image is the desired output from the script.

Screenshot of SampleFile.csv
Screenshot of priority.csv
Desired output that will be written to processed file
SampleFile.csv
priority.csv
Question by:Stormageddon

Accepted Solution

by:
Chris Dent earned 500 total points
Here you go, it works against the sample :) All it lacks is an output pipeline; piping into Export-Csv after the very last } will deal with that.
# Build a nested hashtable from priorities. If this is the extent of priorities it will be sufficient.
# If the file is a lot bigger something more efficient may be needed.
$priorities = @{}
Import-Csv priority.csv | ForEach-Object {
    if (-not $priorities.Contains($_.KeywordType)) {
        $priorities.Add($_.KeywordType, @{$_.Keyword = $_.SortPriority})
    } else {
        if (-not $priorities[$_.KeywordType].Contains($_.Keyword)) {
            $priorities[$_.KeywordType].Add($_.Keyword, $_.SortPriority)
        }
    }
}

Import-Csv SampleFile.csv | ForEach-Object {
    $entry = $_
    $name = '{0} {1}' -f $entry.'First Name', $entry.'Last Name'

    $priorities.Keys | ForEach-Object {
        $keywordType = $_

        # This gets the value from either the Title or Skills column to test
        $valueToMatch = $entry.$keywordType

        # This returns the priority to set
        $priorityToSet = 0
        $priorities[$keywordType].Keys | ForEach-Object {
            # Match against the defined keyword (regex match)
            if ($valueToMatch -match $_) {
                # This gets the priority value associated with that keyword and the matched keyword
                $priorityToSet += [Int]$priorities[$keywordType][$_]
            }
        }

        # Set the value
        # Note: This trims "s" off the end of the KeywordType value if it is present.
        $entry."$($keywordType.TrimEnd('s')) Sort Priority" = $priorityToSet
    }

    # Leave the modified entry in the output pipeline
    $entry
}
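
A hedged sketch of the two points above: where the Export-Csv pipe goes, and how `-match` can be kept to literal text. It is a reduced variant rather than the solution itself. The inline sample rows, the flat `$keywords` table, and the temp-file output path are all assumptions made so the example is self-contained. `[regex]::Escape` is added so that a keyword containing regex characters (for example `C++`) is matched literally.

```powershell
# Hypothetical sample rows standing in for SampleFile.csv.
$sample = @'
"Title Sort Priority","Title"
"","Vice President Security"
"","IT Ops Manager"
'@ | ConvertFrom-Csv

# Hypothetical flat keyword table standing in for the Title rows of priority.csv.
$keywords = @{ 'Vice President' = 2; 'Security' = 2; 'IT Operations' = 1 }

# Hypothetical output path; a temp location keeps the sketch self-contained.
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'SampleFile_Processed.csv'

$sample | ForEach-Object {
    $entry = $_
    $total = 0
    foreach ($keyword in $keywords.Keys) {
        # Escaping makes -match treat the keyword as literal text, not a pattern
        if ($entry.Title -match [regex]::Escape($keyword)) {
            $total += [int]$keywords[$keyword]
        }
    }
    $entry.'Title Sort Priority' = $total

    # Leave the modified entry in the output pipeline
    $entry
} | Export-Csv -Path $outFile -NoTypeInformation   # the pipe after the very last }

# Read the file back to inspect what was written
$result = Import-Csv -Path $outFile
```

Piping once after the closing brace writes the whole file in a single Export-Csv call, so no `-Append` is needed and re-running the script overwrites the previous output instead of adding to it.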


Chris

Author Comment

by:Stormageddon
Sorry for the delay in response; I was on vacation. Thank you so much for figuring out the solution. This works great for my needs. I made a few edits and am using the below:
# Prompt for the input CSV with a standard Open File dialog.
# Defined first so the function exists before it is called when run as a script.
Function Get-FileName($initialDirectory)
{
    Add-Type -AssemblyName System.Windows.Forms

    $OpenFileDialog = New-Object System.Windows.Forms.OpenFileDialog
    $OpenFileDialog.InitialDirectory = $initialDirectory
    $OpenFileDialog.Filter = "CSV (*.csv)|*.csv"
    $OpenFileDialog.ShowDialog() | Out-Null
    $OpenFileDialog.FileName
}

$DateStr = (Get-Date).ToString('MM-dd-yyyy')
$InputFile = Get-FileName -initialDirectory "C:\Temp"
$OutputFile = $InputFile.Replace(".csv", "_Processed($DateStr).csv")
$FilePath = Split-Path -Path $InputFile


# Build a nested hashtable from priorities. If this is the extent of priorities it will be sufficient.
# If the file is a lot bigger something more efficient may be needed.
$priorities = @{}
Import-Csv $FilePath\priority.csv | ForEach-Object {
    if (-not $priorities.Contains($_.KeywordType)) {
        $priorities.Add($_.KeywordType, @{$_.Keyword = $_.SortPriority})
    } else {
        if (-not $priorities[$_.KeywordType].Contains($_.Keyword)) {
            $priorities[$_.KeywordType].Add($_.Keyword, $_.SortPriority)
        }
    }
}

Import-Csv $InputFile | ForEach-Object {
    $entry = $_
    $name = '{0} {1}' -f $entry.'First Name', $entry.'Last Name'

    $priorities.Keys | ForEach-Object {
        $keywordType = $_

        # This gets the value from either the Title or Skills column to test
        $valueToMatch = $entry.$keywordType

        # This returns the priority to set
        $priorityToSet = 0
        $priorities[$keywordType].Keys | ForEach-Object {
            # Match against the defined keyword (regex match)
            if ($valueToMatch -match $_) {
                # Add the priority value associated with the matched keyword
                $priorityToSet += [Int]$priorities[$keywordType][$_]
            }
        }

        # Set the value
        # Note: This trims "s" off the end of the KeywordType value if it is present.
        $entry."$($keywordType.TrimEnd('s')) Sort Priority" = $priorityToSet
    }

    # Append the modified entry to the output file
    $entry | Export-CSV $OutputFile -Append -NoTypeInformation
}


Expert Comment

by:Kyle Santos
Hi Stormageddon,

Hope vacation was great!  Select Chris Dent's solution as the Best Solution to close the question and award him for his efforts.  =)

Author Closing Comment

by:Stormageddon
Excellent work!