List vs. Category: Part 1: Introduction to the problems

Published: 2010-12-05


This article explores the difference between two entities: List and Category. In part one, we'll look at the basic concepts and lay some groundwork. In part two, we'll get to some conclusions.

I know that these two things are different, but I need a method by which my software program can identify the distinction. I am trying to be methodical. After some hours of searching, research, and brainstorming, I have come up with this much:

A List is a collection of items.
Properties of a list:

name of the list

list of items

list of sublists

size

A Category is a class or division in a scheme of classification.
Properties of a category:

name

list of items

list of subcategories

size
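To make the overlap concrete, the two property lists above can be written down as Java classes. This is only a sketch: the class name ItemList (chosen to avoid a clash with java.util.List), the use of String for items, and the assumption that "size" means the number of direct items are all illustrative choices, not part of the original description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "a collection of items" as described above.
class ItemList {
    String name;
    List<String> items = new ArrayList<>();
    List<ItemList> sublists = new ArrayList<>();

    ItemList(String name) { this.name = name; }

    // Assumption: size counts only the direct items.
    int size() { return items.size(); }
}

// Hypothetical model of "a class or division in a scheme of classification".
class Category {
    String name;
    List<String> items = new ArrayList<>();
    List<Category> subcategories = new ArrayList<>();

    Category(String name) { this.name = name; }

    // Assumption: size counts only the direct items.
    int size() { return items.size(); }
}

class ListVsCategoryDemo {
    public static void main(String[] args) {
        ItemList shopping = new ItemList("Shopping");
        shopping.items.add("milk");

        Category dairy = new Category("Dairy");
        dairy.items.add("milk");

        // Apart from the names, the two classes have the same shape.
        System.out.println(shopping.name + ": " + shopping.size());
        System.out.println(dairy.name + ": " + dairy.size());
    }
}
```

Written this way, the two classes differ only in their names and in the name of the nested collection.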

While I am aware that List and Category are two different things or entities, I cannot find a single attribute in either entity that is not also in the other. So, I am unable to see the difference, for one of the following reasons:

My inability to describe the entity in terms of its attributes, or...

It is impossible to describe an entity in terms of only its attributes and something else is required, or...

The difference in entities is not because of the difference in attributes.

I have made two assumptions so far:

It is possible to compare entities.

One way to do so is to compare the properties and behavior first.

Some questions need to be asked first that challenge these assumptions.

How do we compare two entities?

Should we compare the attributes of those entities to describe the difference between them?

If the attributes of the two entities are the same, could they still be differentiated based on the values of those attributes and be grouped into a common class?

Or does it mean that two entities with identical attributes are the same?

Given two entities, I will first explore the possibility of comparing the entities by comparing their attributes and behavior. Comparing the attributes could mean comparing the number of attributes as well as the types of those attributes. If there is one property/attribute which is in one entity and not in the other, that would mean these two entities are different. If the properties of the entities are themselves different, that would also mean that the entities are different.
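The comparison described above can be sketched in Java by describing each entity as a map from attribute name to attribute type. The class AttributeComparison, its method names, and the attribute names used in main (including the neutral name "children" for sublists/subcategories) are assumptions made for illustration only.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AttributeComparison {
    // Names of attributes that appear in exactly one of the two descriptions.
    static Set<String> namesInOnlyOne(Map<String, Class<?>> a, Map<String, Class<?>> b) {
        Set<String> onlyOne = new HashSet<>(a.keySet());
        onlyOne.addAll(b.keySet());       // union of the names
        Set<String> shared = new HashSet<>(a.keySet());
        shared.retainAll(b.keySet());     // intersection of the names
        onlyOne.removeAll(shared);        // what is left is in one side only
        return onlyOne;
    }

    // True when both descriptions have the same names with the same types.
    static boolean sameAttributes(Map<String, Class<?>> a, Map<String, Class<?>> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Map<String, Class<?>> list = Map.of(
            "name", String.class, "items", List.class,
            "children", List.class, "size", Integer.class);
        Map<String, Class<?>> category = Map.of(
            "name", String.class, "items", List.class,
            "children", List.class, "size", Integer.class);
        Map<String, Class<?>> pen = Map.of(
            "brand", String.class, "inkColor", String.class, "price", Double.class);

        System.out.println(sameAttributes(list, category));
        System.out.println(namesInOnlyOne(list, pen));
    }
}
```

Under this purely structural test, List and Category come out as the same entity, which is exactly the problem being explored.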

So, if the attributes are identical, does it mean that the entities are the same? Let's say that I have two instances which have the same properties/attributes; just the values of those attributes are different. Would that imply that these instances belong to the same entity? Could the behavior of those instances also make a difference as to whether they belong to the same entity or to different ones? Let's look at a few examples:

1) Blue vs. Green

Now what could be the properties of an instance 'Blue'? Name and RGB value are all I could think of. It would be the same for 'Green' as well.

It's not hard to make out that both these instances belong to the class 'Color', though the behavior of the two colors is different. Blue and Green also signify separate things and have different usages. So, would this mean that two instances could belong to the same class even if their behavior is not the same? One answer would be that there could be two subclasses of the 'Color' class, one of which will have 'Blue' as its instance and the other will have 'Green' as its instance. Both of these subclasses could add to the default behavior of a color.

So, Blue and Green, while both being part of the Color class, can have different behavior. One thing to note here is that we have created the hierarchy according to the context, keeping in mind the difference in behavior. For example, since Green can be used in a traffic signal and Blue cannot, subclassing could be done in such a way that there are two subclasses of Color: TrafficSignalColor (whose instance would be Green) and NonTrafficSignalColor (whose instance would be Blue). But if the difference is something else, then subclassing would be done differently. We could identify all such usages and argue that we can make subclassing non-contextual, but the success of this exercise would depend entirely on how capable we are of finding all the behavioral aspects and usages of the instances.
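The context-driven hierarchy described above might look like this in Java. Color, TrafficSignalColor, and NonTrafficSignalColor come from the text; the method usableInTrafficSignal and the packed-integer RGB field are assumptions added for the sketch.

```java
class Color {
    final String name;
    final int rgb;   // assumption: RGB value packed as 0xRRGGBB

    Color(String name, int rgb) {
        this.name = name;
        this.rgb = rgb;
    }

    // Default behavior; subclasses refine it for the traffic-signal context.
    boolean usableInTrafficSignal() { return false; }
}

class TrafficSignalColor extends Color {
    TrafficSignalColor(String name, int rgb) { super(name, rgb); }

    @Override
    boolean usableInTrafficSignal() { return true; }
}

class NonTrafficSignalColor extends Color {
    NonTrafficSignalColor(String name, int rgb) { super(name, rgb); }
}

class ColorDemo {
    public static void main(String[] args) {
        Color green = new TrafficSignalColor("Green", 0x00FF00);
        Color blue = new NonTrafficSignalColor("Blue", 0x0000FF);

        // Same attributes (name, rgb), different behavior in this one context.
        System.out.println(green.name + ": " + green.usableInTrafficSignal());
        System.out.println(blue.name + ": " + blue.usableInTrafficSignal());
    }
}
```

A different context (say, print versus screen colors) would call for a different pair of subclasses, which is why this hierarchy is contextual rather than intrinsic to the colors.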

Another point to note is that I am assuming a class's instances exhibit properties, behavior, and usages, rather than the usual notion of just properties and behavior.

2) A 'Harry Potter' Novel vs. A 'Rotomac' ball-point pen

Now this looks pretty straightforward in terms of the difference between them as entities. Attributes of the book would be book name, author name, publication source, price, genre, type, release date, and release media. Attributes of the pen would be pen brand name, pen size, ink color, price, type, release date, and release media. Each of these instances belongs to a different entity. It will be interesting, though, to find out the classes to which these instances belong.

a) An instance of that book could belong to multiple classes (if we allow the addition/removal of a few attributes), such as Book, or Novel (if we remove the attribute 'type'). If we keep just the author name, publication source, type, price, genre, release date, and release media, then the instance could belong to a class called 'Publication'.

If we keep adding or removing attributes, this instance could belong to any number of other classes, such as SaleableItems (if we keep price, type, release date, release media), RentalItems, SoldItems (if we add sold date, buyer name), FavoriteList (if we add buyer name, etc.), and many more. So, there are a few things to notice here:

>> Which class an instance belongs to is determined by which attributes one can think of for that instance. In other words, the answer to 'where does this instance fit in?' is limited by your knowledge of, and ability to describe, that instance.

>> An instance could belong to a set of classes that are mutually exclusive. For example, in the above example: RentalItems, RentedItems, and SoldItems.

b) An instance of that pen could also belong to multiple classes, such as Pen, ball-point pen, etc. It could also belong to a class such as WritingDevice or WritingTool if we look at it from the point of view of its usage. It could also belong to a class such as StationeryItem, GeometryBoxItems, or SchoolBagItems if we look at it from the point of view of where it fits in. As above, this instance could belong to any number of other classes, such as SaleableItems (if we keep price, type, release date, release media), RentalItems, SoldItems (if we add sold date, buyer name), FavoriteList (if we add buyer name, etc.), and many more. So, another thing to note here is:

>> By looking from different points of view, we can figure out many sets of classes an instance can belong to. So, which class an instance belongs to is limited by our imagination and also by the current context in which we are looking at the instance.

c) Both the Pen and Book, as described above in terms of attributes, have some common attributes such as type, price, release date, and release media. So, a few more things to note here:

>> If we ignore the rest of the attributes, then we can make both of these instances belong to the same class. This is a familiar conclusion, since common attributes can generally be refactored into the superclass, with the rest of the attributes going to the subclasses.

>> As we can see (in a), b), and the point above), both instances belong to SaleableItems, RentalItems, SoldItems, FavoriteList, etc.
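The refactoring mentioned in c) can be sketched as follows. The superclass name Product, the field types, and the totalPrice helper are hypothetical; only the attribute names come from the lists above.

```java
// Hypothetical superclass holding the attributes shared by Book and Pen.
abstract class Product {
    final String type;
    final double price;
    final String releaseDate;
    final String releaseMedia;

    Product(String type, double price, String releaseDate, String releaseMedia) {
        this.type = type;
        this.price = price;
        this.releaseDate = releaseDate;
        this.releaseMedia = releaseMedia;
    }
}

class Book extends Product {
    final String authorName;   // book-only attribute

    Book(String authorName, String type, double price, String releaseDate, String releaseMedia) {
        super(type, price, releaseDate, releaseMedia);
        this.authorName = authorName;
    }
}

class Pen extends Product {
    final String inkColor;     // pen-only attribute

    Pen(String inkColor, String type, double price, String releaseDate, String releaseMedia) {
        super(type, price, releaseDate, releaseMedia);
        this.inkColor = inkColor;
    }
}

class SharedAttributesDemo {
    // Code written against the superclass works for both instances.
    static double totalPrice(Product... products) {
        double total = 0.0;
        for (Product p : products) total += p.price;
        return total;
    }

    public static void main(String[] args) {
        Book book = new Book("J. K. Rowling", "novel", 10.0, "1997-06-26", "paperback");
        Pen pen = new Pen("blue", "ball-point", 2.5, "2010-01-01", "retail");
        System.out.println(totalPrice(book, pen));
    }
}
```

The release dates and media values in main are placeholder data, used only so the example runs.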

Now this is where the real trouble begins for me. As per Object-Oriented Programming (OOP) principles, an object must exhibit an inheritance hierarchy. In other words, if an object (an instance, in our case) belongs to multiple classes, these classes should be in a hierarchical order. But the classes SaleableItems, RentalItems, SoldItems, and FavoriteList are not in a hierarchical order; instead, they are either mutually exclusive or they do not fall in the same hierarchy. This could mean two things:

i) Either I have made a mistake in reaching the conclusion in c) above (which I have tried my best to avoid),...

ii) Or the OOP principle of inheritance is erroneous (which is a big conclusion to make, since OOP has been around for a long time, is widely used and appreciated, and I haven't read about this problem before; probably no one has).

As much as I want to think over, revise, and adjust point 'i)' above, it appears to me that 'ii)' is true. It is easy to verify this using any language that claims to support OOP fully, like Java. Say that there are classes called Pen and Book. As already explained, both of them can belong to the SaleableItems, RentalItems, SoldItems, FavoriteList, etc. classes. Now, since these classes are not in a hierarchical order and a Java class can extend only one superclass, you won't be able to up-cast a Pen or a Book object to all of them (unless, of course, you can figure out a way to force a hierarchy here, which would be semantically incorrect).
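A small Java sketch of this limitation, assuming the groupings are modeled as plain classes (the class bodies are empty placeholders and UpcastDemo is a hypothetical name):

```java
// The groupings from the text, modeled as unrelated classes.
class SaleableItems { }
class RentalItems { }
class SoldItems { }

// A Java class may name only one superclass, so Book has to pick one.
// "class Book extends SaleableItems, RentalItems" does not compile.
class Book extends SaleableItems {
    final String title;

    Book(String title) { this.title = title; }
}

class UpcastDemo {
    public static void main(String[] args) {
        Book book = new Book("Harry Potter");

        // Up-casting to the chosen parent works.
        SaleableItems saleable = book;
        System.out.println(saleable instanceof SaleableItems);   // true

        // The other groupings are out of reach for the same object.
        Object asObject = book;
        System.out.println(asObject instanceof RentalItems);     // false
        System.out.println(asObject instanceof SoldItems);       // false
    }
}
```

Note that this holds only while the groupings are classes. Java interfaces are not subject to the single-parent rule, so a class can implement several of them at once.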

Some more questions that can be asked after the conclusion made above are:

How do we represent this in terms of OOP? Is there a work-around?

How do we represent this in terms of an RDBMS or HDBMS at least, since they may not have all the restrictions of OOP?

Going back to the original problem of 'List vs. Category', we have probably found no answers to the questions asked earlier. Having a few extra attributes does not really tell me the difference. It actually suggests (as in the case of List vs. Category) that either List is a superclass of Category or vice versa. In other words, either a list is a special kind of category or a category is a special kind of list.

I feel this is not a satisfactory conclusion, since the semantics of List and Category are entirely different. Even if the attributes are the same and the behaviour varies, objects could still belong to the same class (see the Blue vs. Green example above). Even if I consider the usage of an entity in addition to its properties and behaviour, no real progress is made.

More questions than answers have emerged from this article, and I am more confused than ever. While I am trying to research and explore classes, objects, entities, and attributes, it is becoming increasingly obvious to me that there is more to OOP than I know (or have been told). I guess more limitations will surface as I begin to explore OOP and entity/attribute relationships.


Gurvinder Pal Singh

Computer Scientists

CERTIFIED EXPERT

Programming Theory

Comments (11)

dpearson

CTO

CERTIFIED EXPERT

Commented: 2011-09-19

You said:

ii) Or the OOP principle of inheritance is erroneous (which is a big conclusion to make, since OOP has been around for a long time, is widely used and appreciated, and I haven't read about this problem before; probably no one has).

This is a big conclusion, but I think it's the right one. Inheritance, it turns out, is not a very strong part of OOP. Encapsulation is the really big idea in OOP that remains strong today. The limitations of inheritance are just as you describe - many relationships don't exhibit a total ordering, where A is "higher up" than B and B is "higher up" than C. There seem to be two common approaches to addressing this:

i) You use composition instead of inheritance. The property you wish to expose (e.g. "Sellable") is declared as an interface (public void sell()) and implemented by a class (SellableItem), but an instance of the class that provides the implementation is held as a member of the main class (Book).

So we get something like:

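interface Sellable { void sell(); }
class SellableItem implements Sellable { public void sell() { /* selling logic goes here */ } }

class Book implements Sellable {
    // The composed SellableItem supplies the real implementation.
    private final SellableItem m_SellableItem = new SellableItem();

    // Forward the call to the composed object.
    public void sell() { m_SellableItem.sell(); }
}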


You can see that this pattern extends to multiple properties that can be added onto Book (e.g. Indexable with an IndexableItem to implement it).
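As a rough illustration of how the pattern extends, here is a sketch with two composed properties. Sellable, SellableItem, Indexable, IndexableItem, and Book are the names used in this comment; the method indexKey, the isSold flag, and using the title as the index key are assumptions added for the example.

```java
interface Sellable { void sell(); }
interface Indexable { String indexKey(); }

// Reusable implementation of the "sellable" property.
class SellableItem implements Sellable {
    private boolean sold = false;

    public void sell() { sold = true; }

    boolean isSold() { return sold; }
}

// Reusable implementation of the "indexable" property.
class IndexableItem implements Indexable {
    private final String key;

    IndexableItem(String key) { this.key = key; }

    public String indexKey() { return key; }
}

// Book exposes both properties and delegates each one to a composed member.
class Book implements Sellable, Indexable {
    private final SellableItem sellableItem = new SellableItem();
    private final IndexableItem indexableItem;

    Book(String title) {
        // Assumption: the title doubles as the index key.
        this.indexableItem = new IndexableItem(title);
    }

    public void sell() { sellableItem.sell(); }

    public String indexKey() { return indexableItem.indexKey(); }

    boolean isSold() { return sellableItem.isSold(); }
}

class CompositionDemo {
    public static void main(String[] args) {
        Book book = new Book("Harry Potter");
        Sellable asSellable = book;     // the same object viewed as Sellable
        Indexable asIndexable = book;   // and as Indexable
        asSellable.sell();
        System.out.println(asIndexable.indexKey() + " sold: " + book.isSold());
    }
}
```

Each additional property costs one interface, one implementing class, one member, and one forwarding method per operation.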

ii) However, this is a pretty clumsy solution. We still need to know ahead of time that we'll want to index books when we put them in the library. Also, you need to write a lot of methods like
public void sell() { m_SellableItem.sell() ; }
to pass the call down to the class that's being composed into the main class to provide the implementation.

That's why we now see newer languages exploring other features. For example, Scala has traits, which look a lot like the pattern I described but also look a lot like multiple inheritance. You can add a "trait" to a class and give it additional properties. It's more flexible than classic inheritance and better defined than C++'s multiple inheritance diamonds. What's more, you can define behaviors for objects without knowing the specifics of the class ahead of time, so you can implement a behavior for "IndexableThings" and later decide that books should be indexable.
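Java has no traits, but interfaces with default methods (available since Java 8) give a rough approximation of the idea and remove the forwarding methods. The sketch below is an analogy only, not Scala's trait mechanism; the method names title, indexKey, price, and priceWithTax are hypothetical.

```java
import java.util.Locale;

// Behavior written once, against an abstract title(), before any concrete class exists.
interface Indexable {
    String title();

    default String indexKey() {
        return title().trim().toLowerCase(Locale.ROOT);
    }
}

interface Sellable {
    double price();

    default double priceWithTax(double rate) {
        return price() * (1.0 + rate);
    }
}

// Later we decide that books should be indexable and sellable.
class Book implements Indexable, Sellable {
    private final String title;
    private final double price;

    Book(String title, double price) {
        this.title = title;
        this.price = price;
    }

    public String title() { return title; }

    public double price() { return price; }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        Book book = new Book(" Harry Potter ", 10.0);
        System.out.println(book.indexKey());
        System.out.println(book.priceWithTax(0.5));
    }
}
```

One difference worth keeping in mind: Java interfaces cannot declare instance fields, so any state still has to live in the implementing class.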

Anyway a great discussion article and I think you've come to the right conclusion - that inheritance really isn't that great of a feature and certainly not the key to good OO design.

Doug

Gurvinder Pal Singh

Computer Scientists

CERTIFIED EXPERT

Author

Commented: 2011-09-26

Thanks for your comment and a good suggestion on exploring 'traits' in Scala. I will try to look into the same.

However, I also want to know your thoughts about the main topic of 'List vs Categories'. How, or on what basis, would you decide which language a certain entity belongs to?

Thanks & Regards,
gurvinder372

dpearson

CTO

CERTIFIED EXPERT

Commented: 2011-09-26

For the main question of List vs Categories I would suggest that what you're attempting to do is basically to define a set of rules (properties) that define a particular entity (e.g. List or Category).

While that's certainly possible for specific aspects of an entity (e.g. "Has a name" - can be applied to lists or categories), I think if you're looking at the philosophical question of "can you define the list of rules that completely define an entity" the answer is no.

I believe this because of the way we as humans categorize information. Several years ago the prevailing view was that we categorized information through rules. E.g. A dog is a dog because it has 4 legs, a tail and fur. However, this has been shown to not really be true - we don't really build mental categories like that. If we follow that example of a dog for instance, what about breeds of dogs that have no tails? Aren't they still dogs? What if the dog lost a leg in an accident? Again - it clearly doesn't lose its dog-ness. So what really defines a dog if it's not "4 legs + tail + fur"? Well, it turns out the better definition of a category is all of the instances of the category which we've experienced. In other words, a given dog is a dog because it's similar to the thousands of other dogs we've seen in our lives. The more similar it is to those other dogs, the more we say it's a dog.

Of course that's slightly annoying to us as computer scientists since this question of "similar to a group of other instances" is a bit fuzzy. We obviously need to define similarity - which probably is something like "number of properties in common" or "closeness of properties to the example instances".
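One possible reading of "number of properties in common" is Jaccard similarity over property sets, averaged across the known examples. That choice of measure is an assumption made for this sketch, as are the class name Similarity and the property strings; other definitions of closeness would work just as well.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Similarity {
    // Jaccard similarity: shared properties divided by all properties seen in either set.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 1.0;
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);
        Set<String> all = new HashSet<>(a);
        all.addAll(b);
        return (double) shared.size() / all.size();
    }

    // Average similarity of a candidate to every known example of a category.
    static double similarityToGroup(Set<String> candidate, List<Set<String>> examples) {
        if (examples.isEmpty()) return 0.0;
        double total = 0.0;
        for (Set<String> example : examples) {
            total += jaccard(candidate, example);
        }
        return total / examples.size();
    }

    public static void main(String[] args) {
        List<Set<String>> knownDogs = List.of(
            Set.of("four legs", "tail", "fur", "barks"),
            Set.of("four legs", "fur", "barks"));   // a breed with no tail

        Set<String> threeLeggedDog = Set.of("three legs", "tail", "fur", "barks");
        Set<String> pen = Set.of("ink", "cap");

        // The injured dog still scores far closer to the group than the pen does.
        System.out.println(similarityToGroup(threeLeggedDog, knownDogs));
        System.out.println(similarityToGroup(pen, knownDogs));
    }
}
```

Unlike a rule such as "4 legs + tail + fur", this score degrades gradually when one property is missing instead of rejecting the instance outright.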

So if we believe this is how we categorize all information and how language really works, then what does that say for "List vs Category"? Well, it would suggest that the terms themselves are somewhat loose. To decide if an entity is-a list or if an entity is-a category we should decide how close this particular entity is to the group of other lists (or categories) that we've seen before.

For example - a collection of things I wish to buy looks more like other lists (it has an order, it's typically about 20-30 things long, which a lot of lists I've seen before are, I would expect to add and remove things from it etc.).

While another example - the collection of items priced under $10 at amazon.com looks more like other categories I've seen (it can be represented by a conditional expression, the condition can be changed to create sub and super-categories, enumerating it would be impractical etc.).
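The contrast between these two examples can be sketched in Java: a list as an explicit, ordered, editable collection, and a category as a condition that items either satisfy or not. The Item class, the pricedUnder helper, and the sample prices are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class Item {
    final String name;
    final double price;

    Item(String name, double price) {
        this.name = name;
        this.price = price;
    }
}

class ListVsCategorySketch {
    // A category expressed as a conditional expression over items.
    static Predicate<Item> pricedUnder(double limit) {
        return item -> item.price < limit;
    }

    public static void main(String[] args) {
        // A list: enumerated explicitly, ordered, and edited by adding and removing.
        List<Item> wishList = new ArrayList<>();
        wishList.add(new Item("pen", 2.0));
        wishList.add(new Item("novel", 12.0));
        wishList.remove(0);

        // A category: never enumerated, only tested for membership.
        Predicate<Item> under10 = pricedUnder(10.0);
        Predicate<Item> under5 = pricedUnder(5.0);    // sub-category: tighter condition
        Predicate<Item> under20 = pricedUnder(20.0);  // super-category: looser condition

        Item pen = new Item("pen", 2.0);
        System.out.println(wishList.size());
        System.out.println(under10.test(pen) + " " + under5.test(pen) + " " + under20.test(pen));
    }
}
```

Changing the limit yields sub- and super-categories without touching any stored items, which a stored list cannot do.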

So while Lists and Categories may share a lot of properties in common, I'd suggest we should use a different way to compare the core concepts than listing those properties.

Of course that's not much use for software. In software we must define properties, but then that brings us back to my original point - that inheritance and class hierarchies aren't great at describing these sorts of things and explicit "collections of properties" via composition may be the more useful concept.

Doug

Gurvinder Pal Singh

Computer Scientists

CERTIFIED EXPERT

Author

Commented: 2011-09-27

Thanks for your interesting analysis.
Yes, it definitely is tricky to compare real-world entities in terms of properties, and that was one of the main points I wanted to convey in this article.
We were told that looking at the real world in terms of data (something that OOP claims to do) is the best way forward. It is supposed to give us the clearest picture of real-world entities. But it clearly does not if we have to compare multiple entities with each other, which is often the source of confusion while creating an OOP-based design.

However, if you ask me where this is useful, I would say that if a breakthrough can be achieved here, it could revolutionize soft computing altogether, since we would not have to embed the logic of comparing against an explicit set of entities. Otherwise, as things stand, if a robot is asked to tell whether a particular object is a dog or not, it cannot, since it is most likely to judge based on the properties of that object. Isn't it?

dpearson

CTO

CERTIFIED EXPERT

Commented: 2011-09-27

I think that's just it Gurvinder - if you asked a Robot in 1990 "is this a dog?" it would check the list of properties for a dog and compare it to the properties of the object in front of it.

But if you asked a Robot in 2011 "is this a dog?" it would compare it to the thousands of other instances it had seen before with the label of "dog" and decide if this current object was close enough to those others.

This (in very approximate terms) is how Watson played Jeopardy and did so well.

I'm not sure what that really means for OOP since we are living in the land of rules, but I do think it goes to the core problem you're considering and why OOP doesn't always do as well as we might hope.

Doug
