Convert array to a dictionary of counts?

Posted on 2006-07-10
Last Modified: 2012-08-14
Is there a built-in function that converts an array of strings into a dictionary, where each key is a unique element in the array, and each value is the number of times that the corresponding key appeared in the array? I wrote such a function (below), but I would prefer to use either a built-in function, or at least a function from a common library.

def setToMap(array):
    dict = {}
    for element in array:
        if element in dict:
            dict[element] += 1
            dict[element] = 1
    return dict

Question by:BerkeleyJeff
LVL 14

Expert Comment

ID: 17071618
No, I don't believe there's a built-in function to do this.

Most people build up their own module of such functions, and import that module into each of their projects.
LVL 28

Expert Comment

ID: 17072340
In fact, you are implementing the multi-set (or bag). Using the dictionary type, which is very efficient in Python, is fine for that. Even the Sets module--the predecessor of built-in set type--implemented the sets  using the dictionary type. The comment says "This module implements sets using dictionaries whose values are ignored."

I personally would call the function bagFromSeq(seq), because you can iterate through any sequence of whatever the same way. I would also not use the identifier dict for the dictionary, because you are masking the same name of a built-in type.

You can even think about building your class Bag(dict):  if it makes sense in your case. Otherwise, you code looks fine.

Author Comment

ID: 17076100
Thanks for the suggestions. In response, I've rewritten the code and moved it to a seperate module:

def accumulate(hash, key, value):
    if key in hash:
        hash[key] += value
        hash[key] = value

def multisetCounts( multiset ):
    counts = {}
    for element in multiset:
        accumulate(counts, element, 1)
    return counts

Is there anything further that could be improved? I'm pretty new to Python, coming from Perl. Btw/in Perl this type of routine could be written in one short line:

map {counts{$_}++} multiset;

I'm surprised Python (a successor to Perl, known for its elegance) would require 5x as much code.
VMware Disaster Recovery and Data Protection

In this expert guide, you’ll learn about the components of a Modern Data Center. You will use cases for the value-added capabilities of Veeam®, including combining backup and replication for VMware disaster recovery and using replication for data center migration.

LVL 14

Accepted Solution

RichieHindle earned 300 total points
ID: 17076658
For what it's worth, here's how I'd spell it:

def count_items(seq):
    counts = {}
    for s in seq:
        counts[s] = counts.get(s, 0) + 1
    return counts

if __name__ == '__main__':
    TEST = ['red', 'green', 'blue', 'green', 'pink', 'red', 'green']
    print count_items(TEST)

And also for what it's worth, one of the reasons Python is more elegant than Perl is that you can't express four line's worth of code in one line.  8-)
LVL 17

Assisted Solution

ramrom earned 200 total points
ID: 17078565
map(lambda x:counts.update({x:counts.get(x,0)+1}),seq)

Author Comment

ID: 17078591
Thanks! It was the 'get' function that I was looking for.

Featured Post

Optimizing Cloud Backup for Low Bandwidth

With cloud storage prices going down a growing number of SMBs start to use it for backup storage. Unfortunately, business data volume rarely fits the average Internet speed. This article provides an overview of main Internet speed challenges and reveals backup best practices.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Variable is a place holder or reserved memory locations to store any value. Which means whenever we create a variable, indirectly we are reserving some space in the memory. The interpreter assigns or allocates some space in the memory based on the d…
When we want to run, execute or repeat a statement multiple times, a loop is necessary. This article covers the two types of loops in Python: the while loop and the for loop.
Learn the basics of if, else, and elif statements in Python 2.7. Use "if" statements to test a specified condition.: The structure of an if statement is as follows: (CODE) Use "else" statements to allow the execution of an alternative, if the …
Learn the basics of modules and packages in Python. Every Python file is a module, ending in the suffix: .py: Modules are a collection of functions and variables.: Packages are a collection of modules.: Module functions and variables are accessed us…

803 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question