Solved

Word List Pivot table not counting correctly.

Posted on 2013-11-12
11
419 Views
Last Modified: 2013-11-14
Using the following macro to generate word counts of a list of words


Sub MakeWordList(wkbk As Excel.Workbook)
    Dim InputSheet As Worksheet
    Dim WordListSheet As Worksheet
    Dim PuncChars As Variant, x As Variant
    Dim i As Long, r As Long
    Dim txt As String
    Dim wordCnt As Long
    Dim AllWords As Range
    Dim PC As PivotCache
    Dim PT As PivotTable
   
    Application.ScreenUpdating = False
    Set InputSheet = Workbooks("Automated Interview Analysis.xlsm").Sheets("owssvr")
 
    Set WordListSheet = wkbk.Worksheets.Add
    WordListSheet.name = "FreqAll"
    WordListSheet.Range("A1") = "All Words"
    WordListSheet.Range("A1").Font.Bold = True
    InputSheet.Activate
    wordCnt = 2
    PuncChars = Array(".", ",", ";", ":", "'", "!", "#", _
        "$", "%", "&", "(", ")", " - ", "_", "--", "+", _
        "=", "~", "/", "\", "{", "}", "[", "]", """", "?", "*")
    r = 1

'   Loop until blank cell is encountered
    Do While Cells(r, 1) <> ""
'       covert to UPPERCASE
        txt = UCase(Cells(r, 1))
'       Remove punctuation
        For i = 0 To UBound(PuncChars)
            txt = Replace(txt, PuncChars(i), "")
        Next i
'       Remove excess spaces
        txt = WorksheetFunction.Trim(txt)
'       Extract the words
        x = Split(txt)
        For i = 0 To UBound(x)
            WordListSheet.Cells(wordCnt, 1) = x(i)
            wordCnt = wordCnt + 1
        Next i
    r = r + 1
    Loop
   
'   Create pivot table
    WordListSheet.Activate
    Set AllWords = Range("A1").CurrentRegion
    Set PC = ActiveWorkbook.PivotCaches.Add _
        (SourceType:=xlDatabase, _
        SourceData:=AllWords)
    Set PT = PC.CreatePivotTable _
        (TableDestination:=Range("C1"), _
        tableName:="PivotTable1")
    With PT
        .AddDataField .PivotFields("All Words")
        .PivotFields("All Words").Orientation = xlRowField
    End With
End Sub


However, when I create th epivot table.  It puts some words twice with the count and so they need to be further summed.  Any reason this is happening?  Can't figure it out.
0
Comment
Question by:HyperBPP
  • 6
  • 5
11 Comments
 
LVL 85

Accepted Solution

by:
Rory Archibald earned 500 total points
ID: 39644184
Hard to be sure without a sample workbook but I suspect you have some non-printing characters that are causing the 'duplicates'.
0
 
LVL 6

Author Comment

by:HyperBPP
ID: 39644987
The document is a governemnt FOUO document so I cannot post it.  How would I identify these non-printing characters?
0
 
LVL 85

Expert Comment

by:Rory Archibald
ID: 39645008
Start by testing the length of the cells in the pivot table row fields (using the LEN function) and see if they are different.
0
Use Case: Protecting a Hybrid Cloud Infrastructure

Microsoft Azure is rapidly becoming the norm in dynamic IT environments. This document describes the challenges that organizations face when protecting data in a hybrid cloud IT environment and presents a use case to demonstrate how Acronis Backup protects all data.

 
LVL 6

Author Comment

by:HyperBPP
ID: 39645066
Tested.  They are coming up different.  If I click in a cell they don't appear to be different.  So how do I get rid of the non-printing characters?
0
 
LVL 85

Expert Comment

by:Rory Archibald
ID: 39645113
If you edit the cell do there appear to be any trailing or leading spaces? Did the data perhaps come from a web page?
0
 
LVL 6

Author Comment

by:HyperBPP
ID: 39645136
If edit the cell there are no leading or trailing spaces.  The data came from a MS word document.  I tried the following function that I found online:

Public Function TrimComplete(ByVal sValue As String) As _
        String

        Dim sAns As String
        Dim sWkg As String
        Dim sChar As String
        Dim lLen As Long
        Dim lCtr As Long

        sAns = sValue
        lLen = Len(sValue)

        If lLen > 0 Then
            'Ltrim
            For lCtr = 1 To lLen
                sChar = Mid(sAns, lCtr, 1)
                If (Asc(sChar) > 32) and (Asc(sChar) < 127) Then Exit For
            Next

            sAns = Mid(sAns, lCtr)
            lLen = Len(sAns)

            'Rtrim
            If lLen > 0 Then
                For lCtr = lLen To 1 Step -1
                    sChar = Mid(sAns, lCtr, 1)
                    If (Asc(sChar) > 32) and (Asc(sChar) < 127) Then Exit For
                Next
            End If
            sAns = Left$(sAns, lCtr)
        End If

        TrimComplete = sAns

    End Function

It collapsed a couple of the words.   However, there appear to still be a few non-printable characters after this function runs.  It fixed some of though.
0
 
LVL 85

Expert Comment

by:Rory Archibald
ID: 39645336
Can you test this version on a copy of your workbook?

Sub MakeWordList(wkbk As Excel.Workbook)
   Dim strPattern                  As String
   Dim RegExp                      As Object
   Dim InputSheet                  As Worksheet
   Dim WordListSheet               As Worksheet
   Dim x                           As Variant
   Dim i                           As Long
   Dim r                           As Long
   Dim txt                         As String
   Dim wordCnt                     As Long
   Dim AllWords                    As Range
   Dim PC                          As PivotCache
   Dim PT                          As PivotTable

   Application.ScreenUpdating = False
   '    Set InputSheet = Workbooks("Automated Interview Analysis.xlsm").Sheets("owssvr")
   Set InputSheet = ActiveWorkbook.Sheets(1)
   Set WordListSheet = wkbk.Worksheets.Add
   Set RegExp = CreateObject("vbscript.regexp")
   strPattern = "([^A-Za-z0-9 ]+)"
   With RegExp
      .Pattern = strPattern
      .IgnoreCase = False
   End With
   WordListSheet.Name = "FreqAll"
   WordListSheet.Range("A1") = "All Words"
   WordListSheet.Range("A1").Font.Bold = True
   InputSheet.Activate
   r = 1

   '   Loop until blank cell is encountered
   Do While Cells(r, 1) <> ""
      '       covert to UPPERCASE
      txt = UCase(Cells(r, 1).Value)
      '       clean text
      txt = RegExp.Replace(txt, "")
      '       Remove excess spaces
      txt = WorksheetFunction.Trim(txt)
      '       Extract the words
      x = Split(txt)
      For i = 0 To UBound(x)
         WordListSheet.Cells(wordCnt, 1) = x(i)
         wordCnt = wordCnt + 1
      Next i
      r = r + 1
   Loop

   '   Create pivot table
   WordListSheet.Activate
   Set AllWords = Range("A1").CurrentRegion
   Set PC = ActiveWorkbook.PivotCaches.Add _
            (SourceType:=xlDatabase, _
             SourceData:=AllWords)
   Set PT = PC.CreatePivotTable _
            (TableDestination:=Range("C1"), _
             TableName:="PivotTable1")
   With PT
      .AddDataField .PivotFields("All Words")
      .PivotFields("All Words").Orientation = xlRowField
   End With
End Sub

Open in new window

0
 
LVL 6

Author Comment

by:HyperBPP
ID: 39645365
Throws and error at line 42.... Application defined or object defined error
0
 
LVL 85

Expert Comment

by:Rory Archibald
ID: 39645416
Oops - too much hack and slash. Add back:
wordCnt = 2

Open in new window

before line 28
0
 
LVL 6

Author Comment

by:HyperBPP
ID: 39645425
Does only a few words, spits out 1's for those words and then stops.  No errors.  Just very incomplete.
0
 
LVL 6

Author Closing Comment

by:HyperBPP
ID: 39648589
This was the case.  Ended up using the Clean() function to recitfy.
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
SharePoint 2013 URL Duplication Issue 5 24
Excel if statement 3 15
Clear a Text Box 7 23
Add Attendee Macro need not halt for Acknowledgement 12 19
Deploying a Microsoft Access application in a Citrix environment is not difficult but takes a few steps. However, Citrix system people are often of little help, as they typically know next to nothing about Access. The script provided here will take …
Do you use a spreadsheet like Microsoft's Excel?  Have you ever wanted to link out to a non excel file on your computer or network drive?  This is the way I found to do it!
The viewer will learn how to create a normally distributed random variable in Excel, use a normal distribution to simulate the return on an investment over a period of years, Create a Monte Carlo simulation using a normal random variable, and calcul…
The viewer will learn how to create two correlated normally distributed random variables in Excel, use a normal distribution to simulate the return on different levels of investment in each of the two funds over a period of ten years, and, create a …

792 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question