Subclassing a Python Dictionary

I am trying to subclass a Python dict, and I am not sure exactly how to get it to work.

What I am trying to create is basically a cache mechanism tied to the keys of the dict, so I can lazily load the data.  When I create the object I pass it a data source.  By default it will have no keys.  I want the class to return the data if it's already loaded, or load the data if it's missing.  Hope that makes sense.  I included a rough example of what I am trying to do.
SomeClass(dict):

    __init__(self, data):
        dict.__init__(self)
        self.data = data

    __getitem__(self, key):
        if not self.has_key(key):
            self[key] = self.data.getDataForKey( key )
        return self[key]

Asked by renderbox

Superdave commented:
class SomeClass(dict):

    def __init__(self, data):
        dict.__init__(self)
        self.data = data

    def __getitem__(self, key):
        if not self.has_key(key):
            self[key] = self.data.getDataForKey( key )
        # Read through the base class so this method is not re-entered.
        return dict.__getitem__(self, key)
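
The version in the question ends with `return self[key]`, which calls the overridden `__getitem__` again and never terminates. The sketch below is only an illustration, written in Python 3 syntax (so `in` instead of `has_key`), with hypothetical class names and a hard-coded stand-in for the data source. It contrasts that version with one that reads through the base class.

```python
class RecursiveCache(dict):
    """Mirrors the question's code: the final lookup re-enters __getitem__."""

    def __getitem__(self, key):
        if key not in self:
            self[key] = "loaded %s" % key   # stand-in for the data source call
        return self[key]                    # calls this same method again


class DelegatingCache(dict):
    """Same logic, but the final read goes straight to dict.__getitem__."""

    def __getitem__(self, key):
        if key not in self:
            self[key] = "loaded %s" % key
        return dict.__getitem__(self, key)


try:
    RecursiveCache()["a"]
    recursed = False
except RecursionError:
    recursed = True

value = DelegatingCache()["a"]
```

Here `recursed` ends up `True`, and `value` is the freshly stored string.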
mish33 commented:
def __getitem__(self, key):
  return self.setdefault(key, self.data.getDataForKey(key))

def __contains__(self, key):
  if dict.__contains__(self, key): return True
  #Check if key in self.data as well.
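
One possible way to fill in the "check self.data" step is sketched below in Python 3 syntax. The thread does not say how the data source reports whether it knows a key, so `hasDataForKey` and the `Source` class are purely hypothetical stand-ins. Adapt them to whatever the real data source offers.

```python
class Source:
    """Stand-in data source; hasDataForKey is a hypothetical method."""

    def __init__(self, table):
        self.table = dict(table)

    def getDataForKey(self, key):
        return self.table[key]

    def hasDataForKey(self, key):
        return key in self.table


class LazyDict(dict):
    def __init__(self, data):
        dict.__init__(self)
        self.data = data

    def __contains__(self, key):
        # Already cached?
        if dict.__contains__(self, key):
            return True
        # Otherwise ask the data source (assumed API, see above).
        return self.data.hasDataForKey(key)


cache = LazyDict(Source({1: "one"}))
known = 1 in cache        # answered by the data source
unknown = 2 in cache      # neither cached nor in the source
cached_count = len(cache) # the membership test itself loads nothing
```

Whether `in` should report "cached" or "available from the source" is a design choice that depends on how the cache is used.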
pepr commented:
As mish33 correctly pointed out, you should also consider implementing __contains__ if you ever want to test whether a key is in the data.  It may depend on your cache behaviour.  Try the following snippet to see what happens if you do not implement it.  Notice the False in the output.  Without __contains__, the "in" operator only says whether the key is already in the cache, which may or may not be the behaviour you want.

Play with the Data class to simulate your source's behaviour and modify the LazyDict (cache) as needed.  The output looks like:

...\Q_25634427>a.py
{0: 'data for 0', 1: 'data for 1', 2: 'data for 2', 3: 'data for 3', 4: 'data for 4', 5: 'data for 5', 6: 'data for 6', 7: 'data for 7', 8: 'data for 8', 9: 'data for 9'}
data for 0
data for 9
default value for 100
{}

Asking for 0: False, data for 0
The content after:
    0: data for 0

Asking for 1: False, data for 1
The content after:
    0: data for 0
    1: data for 1

Asking for 100: False, default value for 100
The content after:
    0: data for 0
    1: data for 1
    100: default value for 100

Asking for 1: True, data for 1
The content after:
    0: data for 0
    1: data for 1
    100: default value for 100

Asking for 1: True, data for 1
The content after:
    0: data for 0
    1: data for 1
    100: default value for 100

Asking for 2: False, data for 2
The content after:
    0: data for 0
    1: data for 1
    2: data for 2
    100: default value for 100

[...snip...]
class Data:
    '''Simulate the source of the data.'''
    def __init__(self, n):
        self.d = dict( (k, 'data for %d' % k) for k in range(n) )
   
    def getDataForKey(self, k):
        return self.d.get(k, 'default value for %d' % k)     
 
 
class LazyDict(dict):

    def __init__(self, datasource):
        dict.__init__(self)
        self.datasource = datasource
        
    def __getitem__(self, k):
        return self.setdefault(k, self.datasource.getDataForKey(k))
        
 
if __name__ == '__main__':
    # Simulate the data source.
    data = Data(10)
    print data.d
    print data.getDataForKey(0)
    print data.getDataForKey(9)
    print data.getDataForKey(100)
    
    # Create the lazy dictionary (cache) with the data source.
    ld = LazyDict(data)
    print ld
    
    # Simulate usage of the cache
    for k in (0, 1, 100, 1, 1, 2, 1000, 9):
        print '\nAsking for %s: %s, %s' % (k, k in ld, ld[k])
        print 'The content after:'
        for item in ld.iteritems():
            print '    %s: %s' % item

pepr commented:
A side note: you should stop using .has_key(); use the "in" operator instead.
renderbox (author) commented:
Thanks for the comments.  I am going to try them out.

On the .has_key()/in comment: I was under the impression that 'in' did its check on the values and not on the keys.  I'll give that a try.
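
For reference, a quick check on a plain dict (a minimal sketch with made-up contents) shows that `in` tests the keys. Testing the values needs an explicit `.values()`.

```python
d = {"a": 1, "b": 2}

key_is_member = "a" in d             # "a" is a key
value_is_member = 1 in d             # 1 is only a value, not a key
value_in_values = 1 in d.values()    # explicit test against the values
```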
renderbox (author) commented:
I ran a test on this and noticed that it seems to always call self.datasource.getDataForKey(k) in:

    def __getitem__(self, k):
        return self.setdefault(k, self.datasource.getDataForKey(k))

Even if it already has the data, it calls "self.datasource.getDataForKey(k)".  How do I get it to make the call only if the data is missing and return the stored results if not?
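mish33 commented:
def __getitem__(self, k):
    try:
        # Already cached: return it without touching the data source.
        return dict.__getitem__(self, k)
    except KeyError:
        # Not cached yet: load it once, store it, and return it.
        return self.setdefault(k, self.datasource.getDataForKey(k))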
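
The reason for the extra calls is that Python evaluates the arguments of `setdefault` before calling it, so the data source is queried on every lookup. The sketch below (Python 3 syntax) counts the calls to make this visible. `CountingSource`, `SourceBackedDict`, `count_calls` and the three cache class names are hypothetical and exist only for this illustration. The third class uses `__missing__`, a standard hook that `dict.__getitem__` calls on subclasses when a key is absent, as an alternative to the try/except form.

```python
class CountingSource:
    """Hypothetical data source that records how often it is queried."""

    def __init__(self):
        self.calls = 0

    def getDataForKey(self, key):
        self.calls += 1
        return "data for %s" % key


class SourceBackedDict(dict):
    """Shared constructor for the three variants below."""

    def __init__(self, datasource):
        dict.__init__(self)
        self.datasource = datasource


class EagerDict(SourceBackedDict):
    def __getitem__(self, k):
        # The argument is evaluated before setdefault runs, every time.
        return self.setdefault(k, self.datasource.getDataForKey(k))


class TryExceptDict(SourceBackedDict):
    def __getitem__(self, k):
        try:
            return dict.__getitem__(self, k)
        except KeyError:
            return self.setdefault(k, self.datasource.getDataForKey(k))


class MissingDict(SourceBackedDict):
    def __missing__(self, k):
        # Called by dict.__getitem__ only when k is not stored yet.
        value = self.datasource.getDataForKey(k)
        self[k] = value
        return value


def count_calls(cache_class):
    """Look up the same key three times and report the source calls."""
    source = CountingSource()
    cache = cache_class(source)
    for _ in range(3):
        cache[1]
    return source.calls


eager_calls = count_calls(EagerDict)
try_except_calls = count_calls(TryExceptDict)
missing_calls = count_calls(MissingDict)
```

With three lookups of the same key, the eager version queries the source three times, while the other two query it once.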
pepr commented:
As for has_key(), it was deprecated in favor of the "in" operator (see http://docs.python.org/library/stdtypes.html#dict.has_key).  This is likely related to the more unified approach to all containers in newer Pythons.

The "in" operator (like other operators) can be viewed as the syntactic form of a built-in function that internally calls the object's .__contains__() method (see http://docs.python.org/reference/datamodel.html#object.__contains__).  In a for loop, "in" is instead just a keyword.
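
A small sketch (Python 3 syntax, with a hypothetical `RecordingDict` class) can make the dispatch visible: the membership test goes through `__contains__`, while the `in` of a for loop iterates and never calls it.

```python
class RecordingDict(dict):
    """Hypothetical dict subclass that logs every membership test."""

    def __init__(self):
        dict.__init__(self)
        self.checked = []

    def __contains__(self, key):
        self.checked.append(key)
        return dict.__contains__(self, key)


d = RecordingDict()
d["a"] = 1

present = "a" in d      # dispatches to __contains__
absent = "b" in d       # dispatches to __contains__

seen = []
for k in d:             # iteration; __contains__ is not involved
    seen.append(k)
```

After this runs, `d.checked` holds only the two keys from the explicit membership tests.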