Solved

Subclassing a Python Dictionary

Posted on 2010-03-31
Last Modified: 2012-05-09
I am trying to subclass a python dict and I am not sure how to get it to work exactly.

What I am trying to create is basically a cache mechanism tied to the key value in the dict so I can do a kind of lazy loading of the data.  When I create the object I pass it a data source.  By default it will have no keys.  I want to have the Class return the data if it's already loaded or load the data if it's missing.  Hope that makes sense.  I included a rough example of what I am trying to do.
SomeClass(dict):

    __init__(self, data):
        dict.__init__(self)
        self.data = data

    __getitem__(self, key):
        if not self.has_key(key):
            self[key] = self.data.getDataForKey( key )
        return self[key]

Question by:renderbox
8 Comments
 

Expert Comment

by:Superdave
ID: 29294789
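class SomeClass(dict):

    def __init__(self, data):
        dict.__init__(self)
        self.data = data

    def __getitem__(self, key):
        # Load the value from the data source on the first access only.
        if key not in self:
            self[key] = self.data.getDataForKey(key)
        # Use the base-class lookup so this method does not call itself.
        return dict.__getitem__(self, key)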

Expert Comment

by:mish33
ID: 29308163
def __getitem__(self, key):
    return self.setdefault(key, self.data.getDataForKey(key))

def __contains__(self, key):
    if dict.__contains__(self, key):
        return True
    # Check whether the key is in self.data as well.
    return False

Expert Comment

by:pepr
ID: 29319855
As mish33 correctly pointed out, you should also consider implementing __contains__ if you ever want to test whether the key is in the data.  It may depend on your cache behaviour.  Try the following snippet to see what happens if you do not implement it.  Notice the False in the output.  Without implementing it, the "in" operator only tells you whether the key is already in the cache, which may or may not be the wanted behaviour.

Play with the Data class to simulate your source behaviour and modify LazyDict (the cache) as needed.  The output looks like:

...\Q_25634427>a.py
{0: 'data for 0', 1: 'data for 1', 2: 'data for 2', 3: 'data for 3', 4: 'data for 4', 5: 'data for 5', 6: 'data for 6', 7: 'data for 7', 8: 'data for 8', 9: 'data for 9'}
data for 0
data for 9
default value for 100
{}

Asking for 0: False, data for 0
The content after:
    0: data for 0

Asking for 1: False, data for 1
The content after:
    0: data for 0
    1: data for 1

Asking for 100: False, default value for 100
The content after:
    0: data for 0
    1: data for 1
    100: default value for 100

Asking for 1: True, data for 1
The content after:
    0: data for 0
    1: data for 1
    100: default value for 100

Asking for 1: True, data for 1
The content after:
    0: data for 0
    1: data for 1
    100: default value for 100

Asking for 2: False, data for 2
The content after:
    0: data for 0
    1: data for 1
    2: data for 2
    100: default value for 100

[...snip...]
class Data:
    '''Simulate the source of the data.'''
    def __init__(self, n):
        self.d = dict( (k, 'data for %d' % k) for k in range(n) )
   
    def getDataForKey(self, k):
        return self.d.get(k, 'default value for %d' % k)     
 
 
class LazyDict(dict):

    def __init__(self, datasource):
        dict.__init__(self)
        self.datasource = datasource
        
    def __getitem__(self, k):
        return self.setdefault(k, self.datasource.getDataForKey(k))
        
 
if __name__ == '__main__':
    # Simulate the data source.
    data = Data(10)
    print data.d
    print data.getDataForKey(0)
    print data.getDataForKey(9)
    print data.getDataForKey(100)
    
    # Create the lazy dictionary (cache) with the data source.
    ld = LazyDict(data)
    print ld
    
    # Simulate usage of the cache
    for k in (0, 1, 100, 1, 1, 2, 1000, 9):
        print '\nAsking for %s: %s, %s' % (k, k in ld, ld[k])
        print 'The content after:'
        for item in ld.iteritems():
            print '    %s: %s' % item


Expert Comment

by:pepr
ID: 29320115
A side note: you should stop using .has_key(); use the "in" operator instead.

Author Comment

by:renderbox
ID: 29354700
Thanks for the comments.  I am going to try them out.

On the .has_key()/in comment: I was under the impression that 'in' did its check on the values and not on the keys.  I'll give that a try.
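
For what it's worth, a quick check suggests that "in" on a dict tests the keys, and that testing the values needs an explicit .values().  A minimal sketch (the dictionary contents are made up for illustration):

```python
# Illustrative data only; these names are not from the question.
d = {'a': 'apple', 'b': 'banana'}

key_found = 'a' in d                  # membership test against the keys
value_as_key = 'apple' in d           # 'apple' is a value, not a key
value_found = 'apple' in d.values()   # explicit test against the values

print(key_found, value_as_key, value_found)
```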

Author Comment

by:renderbox
ID: 29359854
I ran a test on this and noticed that it seems to always call self.datasource.getDataForKey(k) in:

    def __getitem__(self, k):
        return self.setdefault(k, self.datasource.getDataForKey(k))

Even if it already has the data, it calls "self.datasource.getDataForKey(k)".  How do I get it to make the call only if the data is missing and return the stored results if not?
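
This matches how Python evaluates calls: the arguments to setdefault() are computed before setdefault() itself runs, so the data source is queried whether or not the key is already cached.  A small sketch that counts the calls (CountingSource and EagerDict are hypothetical names used only for this illustration):

```python
class CountingSource(object):
    """Hypothetical data source that counts how often it is queried."""
    def __init__(self):
        self.calls = 0

    def getDataForKey(self, k):
        self.calls += 1
        return 'data for %s' % k


class EagerDict(dict):
    """Uses the setdefault()-based __getitem__ shown above."""
    def __init__(self, datasource):
        dict.__init__(self)
        self.datasource = datasource

    def __getitem__(self, k):
        # The argument expression runs before setdefault() is entered.
        return self.setdefault(k, self.datasource.getDataForKey(k))


source = CountingSource()
ed = EagerDict(source)
ed[1]
ed[1]
ed[1]
calls_made = source.calls   # one call per lookup, although one was enough
```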

Accepted Solution

by:
mish33 earned 500 total points
ID: 29415360
def __getitem__(self, k):
    try:
        return dict.__getitem__(self, k)
    except KeyError:
        return self.setdefault(k, self.datasource.getDataForKey(k))
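
An alternative that is not mentioned in this thread: since Python 2.5, dict.__getitem__ calls a __missing__(key) method on dict subclasses when the key is absent, so the lazy load can live there and __getitem__ needs no override.  A minimal sketch, assuming the data source exposes getDataForKey() as in the question (FakeSource is a made-up stand-in):

```python
class FakeSource(object):
    """Made-up stand-in for the data source; records every query."""
    def __init__(self):
        self.requested = []

    def getDataForKey(self, k):
        self.requested.append(k)
        return 'data for %s' % k


class LazyDict(dict):
    def __init__(self, datasource):
        dict.__init__(self)
        self.datasource = datasource

    def __missing__(self, k):
        # Called by dict.__getitem__ only when k is not stored yet.
        value = self.datasource.getDataForKey(k)
        self[k] = value          # cache it for later lookups
        return value


source = FakeSource()
ld = LazyDict(source)
first = ld[7]     # not cached: goes to the source
second = ld[7]    # cached: the source is not asked again
```

Note that .get() and the "in" operator do not go through __missing__, so they still report only what is already cached.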

Assisted Solution

by:pepr
pepr earned 500 total points
ID: 29423286
As for has_key(), it was deprecated in favor of the "in" operator (see http://docs.python.org/library/stdtypes.html#dict.has_key).  This is probably related to the more unified approach to all containers in newer Pythons.

The "in" operator (like other operators) can be viewed as the syntactic form of a built-in function that internally calls the object's .__contains__() method (see http://docs.python.org/reference/datamodel.html#object.__contains__).  The same word is also a keyword in the for loop.
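
To illustrate the point about __contains__: the sketch below overrides it so that "in" also consults the data source.  The hasDataForKey() method is an assumption made for this example; the question does not say whether the real data source offers such a test.

```python
class Source(object):
    """Hypothetical data source with a membership test (an assumption)."""
    def __init__(self, n):
        self.d = dict((k, 'data for %d' % k) for k in range(n))

    def hasDataForKey(self, k):      # assumed method, not from the thread
        return k in self.d

    def getDataForKey(self, k):
        return self.d[k]


class LazyDict(dict):
    def __init__(self, datasource):
        dict.__init__(self)
        self.datasource = datasource

    def __getitem__(self, k):
        try:
            return dict.__getitem__(self, k)
        except KeyError:
            return self.setdefault(k, self.datasource.getDataForKey(k))

    def __contains__(self, k):
        # "k in ld" ends up here: cached keys first, then the source.
        return dict.__contains__(self, k) or self.datasource.hasDataForKey(k)


ld = LazyDict(Source(3))
in_source = 2 in ld        # not cached yet, but the source has it
unknown = 99 in ld         # neither cached nor in the source
still_empty = len(ld)      # the membership tests loaded nothing
```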