Solved

Process one CSV vs another with reverse wildcard search and numeric calculation and export results to new file

Posted on 2016-07-20
Last Modified: 2016-08-03
I have the following CSV files: SampleFile.csv, priority.csv, and skills.csv. The first 2 columns in SampleFile.csv are blank and need to be calculated before the results are exported to a new csv/xls file. Each calculated field needs to return the corresponding SortPriority or SkillPriority value if any part of the text in the Title field of SampleFile.csv appears in priority.csv (basically a wildcard search against each row of priority.csv's Keyword column). If there are multiple hits, the numbers need to be added together and returned; if there are no hits, the value is 0.

This calculation has to be done for both the Title Sort Priority and Skill Sort Priority columns.


Example 1
Title = Vice President Security
The keywords “Vice President” and “Security” are both hits in priority.csv (each would return a positive number with an InString function) and each is valued at 2, so the result returned should be 4.

Example 2
Title = IT Ops Manager
There are no matches so the value that will be returned will be 0. If it were listed as “IT Operations” then it would be a hit and the Title Sort Priority would be 1.
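
The calculation described in the two examples can be sketched in PowerShell as follows. This is only a minimal illustration, not the full solution: the keyword table is typed in by hand from the examples (the real values live in priority.csv), and the function name Get-PrioritySum is hypothetical.

```powershell
# Hand-built keyword table (assumption: values copied from the examples;
# the real table would be loaded from priority.csv).
$keywords = @{
    'Vice President' = 2
    'Security'       = 2
    'IT Operations'  = 1
}

# Hypothetical helper: add up the priority of every keyword found in the text.
function Get-PrioritySum {
    param(
        [string]$Text,
        [hashtable]$Keywords
    )

    $sum = 0
    foreach ($keyword in $Keywords.Keys) {
        # -like with a wildcard on both sides is a case-insensitive "contains" test
        if ($Text -like "*$keyword*") {
            $sum += [int]$Keywords[$keyword]
        }
    }
    $sum
}

Get-PrioritySum -Text 'Vice President Security' -Keywords $keywords   # 4
Get-PrioritySum -Text 'IT Ops Manager' -Keywords $keywords            # 0
Get-PrioritySum -Text 'IT Operations Manager' -Keywords $keywords     # 1
```

The lookup runs in the "reverse" direction compared with vlookup: every keyword is tested against the title, rather than the title being looked up in the keyword list.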

So far I have tried using Excel functions, vbScript, and PowerShell to do this, but I am either overthinking it or my logic is bad, since I can’t figure out how to do it. Now I am reaching out for assistance. I thought this would be easy with vlookup, but it's essentially a reverse vlookup or reverse wildcard search.

I would prefer the solution be in PowerShell but I would be fine with a vbScript or Excel macro. Here are 2 screenshots of a subset of each of the file contents. The bottom image is the desired output from the script.

[Screenshot of SampleFile.csv]
[Screenshot of priority.csv]
[Screenshot: desired output that will be written to the processed file]
Attached files: SampleFile.csv, priority.csv
Question by:Stormageddon
4 Comments
 
LVL 71

Accepted Solution

by:
Chris Dent earned 500 total points
ID: 41722973
Here you go, it works against the sample :) All it lacks is an output step; piping into Export-Csv after the very last } will deal with that.
# Build a nested hashtable from priorities. If this is the extent of priorities it will be sufficient.
# If the file is a lot bigger something more efficient may be needed.
$priorities = @{}
Import-Csv priority.csv | ForEach-Object {
    if (-not $priorities.Contains($_.KeywordType)) {
        $priorities.Add($_.KeywordType, @{$_.Keyword = $_.SortPriority})
    } else {
        if (-not $priorities[$_.KeywordType].Contains($_.Keyword)) {
            $priorities[$_.KeywordType].Add($_.Keyword, $_.SortPriority)
        }
    }
}

Import-Csv SampleFile.csv | ForEach-Object {
    $entry = $_
    $name = '{0} {1}' -f $entry.'First Name', $entry.'Last Name'

    $priorities.Keys | ForEach-Object {
        $keywordType = $_

        # This gets the value from either the Title or Skills column to test
        $valueToMatch = $entry.$keywordType

        # This returns the priority to set
        $priorityToSet = 0
        $priorities[$keywordType].Keys | ForEach-Object {
            # Match against the defined keyword (regex match)
            if ($valueToMatch -match $_) {
                # This gets the priority value associated with that keyword and the matched keyword
                $priorityToSet += [Int]$priorities[$keywordType][$_]
            }
        }

        # Set the value
        # Note: This trims "s" off the end of the KeywordType value if it is present.
        $entry."$($keywordType.TrimEnd('s')) Sort Priority" = $priorityToSet
    }

    # Leave the modified entry in the output pipeline
    $entry
}
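
As a hedged illustration of the output step mentioned at the top of this answer: objects left in the pipeline by ForEach-Object can be piped straight into Export-Csv. The sample rows, the placeholder calculation, and the temp-file path below are stand-ins, not the real SampleFile.csv data.

```powershell
# Stand-in rows (assumption: the real rows come from Import-Csv SampleFile.csv)
$rows = @(
    [pscustomobject]@{ 'Title Sort Priority' = ''; Title = 'Vice President Security' }
    [pscustomobject]@{ 'Title Sort Priority' = ''; Title = 'IT Ops Manager' }
)

# Hypothetical output path in the temp directory
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'SampleFile_Processed.csv'

$rows | ForEach-Object {
    $entry = $_

    # Placeholder calculation; the real script sums all matching keyword priorities
    $priority = 0
    if ($entry.Title -match 'Security') { $priority = 2 }
    $entry.'Title Sort Priority' = $priority

    # Leave the modified entry in the output pipeline
    $entry
} | Export-Csv -Path $outFile -NoTypeInformation

# Read the file back to check the result
Import-Csv $outFile
```

Because Export-Csv receives the whole pipeline once, -Append is not needed here and the file is overwritten on every run.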


Chris
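
One caveat with the -match test used in the script: -match treats each keyword as a regular expression, so a keyword containing characters such as + or ( may fail to parse or match more than intended. A possible safeguard, assuming keywords are meant as literal text, is to escape them with [regex]::Escape first. The keyword 'C++' below is a made-up example, not taken from priority.csv.

```powershell
$title   = 'Senior C++ Developer'
$keyword = 'C++'   # hypothetical keyword containing regex metacharacters

# Escape the keyword so every character is matched literally
$pattern = [regex]::Escape($keyword)

$title -match $pattern                  # True
'Senior C Developer' -match $pattern    # False
```

In the script, this would amount to matching against the escaped keyword instead of the raw $_ value inside the innermost loop.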
 

Author Comment

by:Stormageddon
ID: 41741228
Sorry for the delay in response, I was on vacation. Thank you so much for figuring out the solution. This works great for my needs. I made a few edits and am using the below:
# Show an Open File dialog and return the selected path.
# Defined first so that it exists before it is called below.
Function Get-FileName($initialDirectory)
{
    Add-Type -AssemblyName System.Windows.Forms

    $OpenFileDialog = New-Object System.Windows.Forms.OpenFileDialog
    $OpenFileDialog.initialDirectory = $initialDirectory
    $OpenFileDialog.filter = "CSV (*.csv)| *.csv"
    $OpenFileDialog.ShowDialog() | Out-Null
    $OpenFileDialog.filename
}

$DateStr = (Get-Date).ToString('MM-dd-yyyy')
$InputFile = Get-FileName -initialDirectory "C:\Temp"
$OutputFile = $InputFile.Replace(".csv", "_Processed($DateStr).csv")
$FilePath = Split-Path -Path $InputFile


# Build a nested hashtable from priorities. If this is the extent of priorities it will be sufficient.
# If the file is a lot bigger something more efficient may be needed.
$priorities = @{}
Import-Csv $FilePath\priority.csv | ForEach-Object {
    if (-not $priorities.Contains($_.KeywordType)) {
        $priorities.Add($_.KeywordType, @{$_.Keyword = $_.SortPriority})
    } else {
        if (-not $priorities[$_.KeywordType].Contains($_.Keyword)) {
            $priorities[$_.KeywordType].Add($_.Keyword, $_.SortPriority)
        }
    }
}

Import-Csv $InputFile | ForEach-Object {
    $entry = $_
    $name = '{0} {1}' -f $entry.'First Name', $entry.'Last Name'

    $priorities.Keys | ForEach-Object {
        $keywordType = $_

        # This gets the value from either the Title or Skills column to test
        $valueToMatch = $entry.$keywordType

        # This returns the priority to set
        $priorityToSet = 0
        $priorities[$keywordType].Keys | ForEach-Object {
            # Match against the defined keyword (regex match)
            if ($valueToMatch -match $_) {
                # This gets the priority value associated with that keyword and the matched keyword
                $priorityToSet += [Int]$priorities[$keywordType][$_]
            }
        }

        # Set the value
        # Note: This trims "s" off the end of the KeywordType value if it is present.
        $entry."$($keywordType.TrimEnd('s')) Sort Priority" = $priorityToSet
    }

    # Append the modified entry to the output file
    $entry | Export-CSV $OutputFile -Append -NoTypeInformation
}

 
LVL 16

Expert Comment

by:Kyle Santos
ID: 41741273
Hi Stormageddon,

Hope vacation was great!  Select Chris Dent's solution as the Best Solution to close the question and award him for his efforts.  =)
 

Author Closing Comment

by:Stormageddon
ID: 41741338
Excellent work!