Solved

How do I count sequences in Python with itertools groupby?

Posted on 2016-11-16
Last Modified: 2016-11-17
I'm trying to use the itertools groupby function documented here:
https://docs.python.org/2/library/itertools.html#itertools.groupby

but the documentation has no example of counting the items in each group.  Running the following:
import itertools as itr
In [204]: list(itr.groupby('1222311'))
Out[204]:
[('1', <itertools._grouper at 0x3cae048>),
 ('2', <itertools._grouper at 0x3cae630>),
 ('3', <itertools._grouper at 0x3caedd8>),
 ('1', <itertools._grouper at 0x3cae1d0>)]

yields an indecipherable second element in each tuple.  What I want to get back instead is:
In [204]: list(itr.groupby('1222311'))
Out[204]:
[('1', 1),
 ('2', 3),
 ('3', 1),
 ('1', 2)]

though the order of elements within the tuples is irrelevant.
I tried passing a second argument to groupby (e.g. key=itr.count), but every attempt gives an error.  How do I accomplish this, and how do I use the keyfunc parameter of groupby?
Question by:ugeb

Accepted Solution

by:
Walter Ritzel earned 500 total points
ID: 41891826
The indecipherable grouper object is a lazy iterator over the elements of each group. An iterator has no len(), so to get the count for each group you have to consume it and count the elements as they come out:

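import itertools as itr
# groupby yields (key, group) pairs; group is a lazy iterator over one run
for key, group in itr.groupby('1222311'):
    # count the run by consuming the iterator one element at a time
    print("(%s,%s)" % (key, sum(1 for _ in group)))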


You could also write it like this, building a list from each group and taking its length:
import itertools as itr
for key, group in itr.groupby('1222311'):
    print("(%s,%s)" % (key, len(list(group))))
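
As a rough sketch of why the group has to be counted inside the loop (the variable names here are my own, not from the thread): each group shares the underlying iterator with groupby, so once groupby has moved on to the next key, an earlier group yields nothing.

```python
import itertools as itr

data = '1222311'

# Count each group while it is still the current one.
counts = [(key, sum(1 for _ in group)) for key, group in itr.groupby(data)]
print(counts)  # [('1', 1), ('2', 3), ('3', 1), ('1', 2)]

# Pitfall: building the list of pairs first advances groupby past every
# group, so the first group iterator is already exhausted when counted.
pairs = list(itr.groupby(data))
first_key, first_group = pairs[0]
stale_count = len(list(first_group))
print(first_key, stale_count)  # 1 0
```

This is why list(itr.groupby(...)) on its own is not a useful starting point for counting: the keys survive, but the groups must be consumed as they are produced.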

Author Comment

by:ugeb
ID: 41892272
Thank you, that's helpful.  What exactly is the group iterator iterating over? We already have the key, and then you iterate over something to get a count.  After some testing, it looks like it's just the elements of the group.  I guess that's more useful in other scenarios with a more complicated grouping function.

How do I use the keyfunc parameter?  What functions would work there?  I've tried many different things, and they all give errors.
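
The thread does not answer the keyfunc question, so the following is only a minimal sketch based on the documented behaviour of groupby; the sample strings and variable names are made up for illustration. The key argument should be a function that takes one element and returns the value to group by, and consecutive elements with equal key values end up in the same group. Something like itr.count is an infinite-counter constructor rather than a per-element function, so it is not a suitable key.

```python
import itertools as itr

# key is called once per element; consecutive elements whose key values
# are equal are placed in the same group.
digits_vs_letters = [(is_digit, ''.join(group))
                     for is_digit, group in itr.groupby('12ab345c', key=str.isdigit)]
print(digits_vs_letters)
# [(True, '12'), (False, 'ab'), (True, '345'), (False, 'c')]

# Any one-argument function works, including a lambda.
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
by_initial = [(initial, len(list(group)))
              for initial, group in itr.groupby(words, key=lambda w: w[0])]
print(by_initial)  # [('a', 2), ('b', 2), ('c', 1)]
```

Note that groupby only merges adjacent elements, so if equal keys can be scattered through the input, sorting with the same key function first is usually needed to get one group per key.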

Author Closing Comment

by:ugeb
ID: 41892380
Okay, thanks to your help I figured out how to do it in a single list comprehension.  This is what I wanted to do:

[(key, len(list(group))) for key, group in itr.groupby('1113334')]