Link to home
Start Free TrialLog in
Avatar of Ra
Ra

asked on

dynamic array

Hey all, does anyone know how I can make an array in C++ where I can add and remove elements dynamically and keep it sorted at the same time?  This must be able to run in a 16-bit DOS enviroment on a 486 processor.  I'm using MS Visual C++ 1.52c for 16-bit apps.  And before you say it, no, CArray doesn't exist here... :(

Currently I'm just defining a huge array (1500 elements) and parsing through it, moving and sorting when I need to add or removing someting.  This is not very efficent and I would like a better way to do it.

Thanks.
Avatar of Axter
Axter
Flag of United States of America image

I highly recommend that you upgrade to Visual C++ 6.0

You have a lot of limitations with VC++ 1.52
The best way to do this in VC 1.52 is to use a link list.
You could take the C approach:

typedef struct {
   int length;
   int *elements;
}  Array;

void foo( Array & myArray ) {
  // ...
  // Extend myArray
  myArray.length += 10;  // or whatever increment...
  myArray.elements = (int *) realloc(myArray.elements,
                       sizeof(int) * myArray.length );
}

You'd likely want to wrap this operation in a helper
function, assuming that you'd want to do it more than
once:

void assureArraySize( Array & theArray, int length ) {
  if (theArray.length < length) {
    theArray.length = length;
    theArray.elements = (int *) realloc(theArray.elements,
                         sizeof(int) * theArray.length );
  }
}

And, if you are using C++, you could hide all of that
inside a class:

class MyDynArray {
    int  length_;
    int *elements_;
public:
    MyDynArray() {
        length_ = 10;
        elements_ = (int *) malloc(length_ * sizeof(int) );
    }

    ~MyDynArray() {
       free( elements_ );
    }

    // And the usual canonical form for classes that
    // allocate memory:

    MyDynArray( const MyDynArray & other );
    MyDynArray &
    operator = ( const MyDynArray & other );

    //
    // Hide dynamic array management in index operators:
    //
    int operator[]( int index ) const;
    int operator[]( int index );
};

But that can be a lot of typing.  Your choice!
Hi (rtenhove), welcome to EE.

All of the experts here, for the most part have learn from other experts as to the proper etiquette for posting answer.

An answer should not be posted as an answer, if other experts have previously posted possible answers as comments, and/or have already made contributions to the question.

There are many experts who never post answers as answer.  Instead, they post their answers as comments.

If you read the following link, you'll see why this is the preferred method for many of our valued experts, including myself.

https://www.experts-exchange.com/jsp/cmtyQuestAnswer.jsp

Hi (Ra):

Feel free to click the [Reject Answer] button near (Answer-poster's) response, even if it seems like a good answer.
Doing so will increase your chance of obtaining additional input from other experts.  Later, you can click the [Select Comment as Answer] button on any response.
Avatar of Ra
Ra

ASKER

rtenhove, Axter is right about posting stuff as comments.  Your answer looks like it will work, but I don't have time to test it today.  I will try it Monday morning and if it works and there are no additional sugesstions that are any better, I will accept it then.

Axter, please explain what you mean by "link list".  Also, upgrading to 6.0 is not an option.  This program was designed in VC 1.52c and runs in DOS, not in Windows.  VC 6.0 cannot, to my knowledge, compile a program for 16-bit DOS (if it can, tell me how!).  I tried VC 6.0 first and got over 2000 errors and warnings, most because of near and far pointers and other stuff that 6.0 doesn't use any more.

Thanks.
>>Axter, please explain what you mean by "link list".  
I'll post some example code later.
Do you need to be able to travers through the list backwards and forward.
If you only need to move through the list forward, then I'll post code similar to rtenhove, but more C++ like.

If you need to go backwards, then you need to have a double link list, and that's a little more work.

You should note that rtenhove, method will not help you in sorting your array.
If you just needed a dynamic array, and you did not have any sorting requirements, then there's a much better method then the previously proposed.

Example:
foo *IncreaseFoo(foo *pOld, int CurrentSize, int IncFactor = 2)
{
     foo *pNewFooList = new foo[CurrentSize*IncFactor];
     memcpy(pNewFooList,pOld,CurrentSize*sizeof(foo));
     delete [] pOld;
     return pNewFooList;
};
int main(int argc, char* argv[])
{
     foo *pFooList = new foo[100];
     pFooList = IncreaseFoo(pFooList,100);
     delete [] pFooList;
     return 0;
}

The above example is simpler and safer.

When you increase the size of the array, you should use a factor of 2.  Don't use an incremental size like (+10).  This is less ifficient.

By using a factor of 2, you can increase array  of 100 to the size of 25600 just by looping through the function eight times.  Using an incremental function this could take hundreds of loops.
Loop  Size
1    200
2    400
3    800
4    1600
5    3200
6    6400
7    12800
8    25600

Of course the above code will not help your sorting requirements.
The best way to achieve a dynamic array that stays sorted, is to use a link list.

class LinkList
{
public:
  LinkList* PointerToNextItem;
};

With a single link list, every time you add a new item to the list, you create it using new, and then you travers the list to see where the new item belongs within the sorted list.
You then insert the new item by changing the NextPointer in the item setting behind it, to point to the new item.  The new item points to what the item behind it use to point to.

I can do a better job of explaining it by posting some example code.
I order to support efficient insertion and removal of
list items, you'd need to use a double-linked list:

class ListItem {
  ListItem *prev;
  ListItem *next;
  int       value; // or whatever the actual value type is
};

A single-link list is useful for use cases where you
add/delete at the ends of the the list, and traverse the
list in the "forwards" direction only.  The double-linked
list has greater flexibility, at the expense of using
extra memory, and creating somewhat more complex list
operations (insertion and deletion).

You're currently using a fixed-length array, and
presumably inserting or deleting by simply moving the
contents of the "tail" of the array to create or plug
a hole needed for insertion or left by deletion.  This
can be quite costly, especially if the list grows
quite large.

There are a lot of sorting algorithms available, with
various advantages and disadvantages, some of which will
strongly affect your choice of data structures (lists,
arrays, etc.).  What sort of problem are you solving
with this sorted array?
VC 6 cannot produce DOS apps, but are you sure you need a DOS app.   VC 6 will produce win32 consoles.  These look like DOS apps to the user, but are 32 bit windows programs.  They will run on a computer with windows 95 or later.

Are you really using a DOS computer, it is it a windows computer?

If you are running on a windows computer, then converting yoru code from a DOS app to a console app is very easy and will give you lots of memory and lots of tools to make this sort of thing easy  (The ideal solution is to use an STL map or multimap, but that is not an option in your ancient compiler.)
>>Are you really using a DOS computer
I believe he's using 16-bit code that is not compatible with VC 6.0.

Ra,
If your code is not too extensive, maybe the better solution would be to post your code, and the compiler errors you're getting in VC 6.0.
It might be far easier to convert your code, rather then to attempt to reinvent the wheel by using a link list.
For info on link list, see the following link:

http://www.gator.net/~garyg/C/MISC/linklists.html#long

It has some example projects

//A linked list

//----------------------------------
//each item is a Node struct

struct Node
{
 Node *prev;       //ptr to previous item
 
 DataType MyData;  //the actual data for this element

 Node *next;
};

//----------------------------------
//In your code you will need a pointer to the first
//element (at least) and a pointer to the CURRENT element.

Node *current;          //uninitialized

Node *first = new Node; //create the first node
first->prev = NULL;     //No elements before it
first->next = NULL;     //no elements after it
first->MyData = whatever;

//-----------------------------------
//If you want to APPEND an element to end of list

first->next = new Node;     //create new Node
current     = first->next;  //make it current
current->next = NULL         //no next element
current->MyData = whatever;

//----------------------------------
//If you want to insert an element after first

Node *temp;                 //a temp pointer
temp = current;             //point temp at second node

first->next = new Node;     //create new Node
current     = first->next;  //make new Node current
current->next = temp;       //point next ptr at the old second node

//----------------------------------
//to move from beginning to end of list

current = first;          //make first Node current

//while there is a next element, make it current
while(current->next != NULL)
  current = current->next;


You should get the point.  You can (should) write a class to handle linked lists for you.  Your class can then contain pointers to the first node, the last node, the current node, you can keep "bookmark" nodes, you can have a pointer to the "last-current" node, etc.
Avatar of Ra

ASKER

Ok, I sense a bit of confusion out there about where this program is running.  It is running on a DOS based PC.  This is done basically because there is much less that can go wrong on a simple DOS machine then on a Windows system, plus it's much cheaper. ;)  I am compiling with VC 1.52c because it can produce the 16-bit app that I need.  Upgrading the app to run on Windows is not an option, my boss already said no to that.  I am acutally running the compiler on a Windows 2000 machine.  Ironic, isn't it. ^_^

Anyway, the people that originaly designed this program created a structure for the data and made a 1000 element array with that structure.  They had an active flag in the structure so they knew which structures were being used and which ones were not.

Accessing each element must be very fast and easy.  This is a multi-threaded app and up to 16 threads can be accesing the same data at the same time.  Each element has a special number and currently the index in this array is the same as that number.  This makes access extremely fast, no lookups, just direct access.  I can't do this anymore because that number is currently limited to 3 digits (0-999) and I need it to go up to 4 digtis (0-9999).  Making the array 10000 elements long broke the 64K size limit on arrays and it wouldn't compile.

The way I was planning on getting around this, and to keep up the speed, was to use an indexing system based on two dyanamic arrays, an index array and a data array.

The index array would just contain a small structure with just the index to the data array and that special number as members.  This index array could be searched faster then the data array because it would require less memory and time to load each element, but it must be in numerical order based on that number to allow fast searching.  The data array would still have to be dyanamic to allow adding and deleting of elements but could be in any order since it would not have to be searched.  

I will see what I can do with the link list tomorrow morning and let you know.  I have a good method of searching, it's based on an old COBOL method and is extremely fast compaired to just looping through the array, but I need a good way of keeping the index in order, either some good sort method or a way to add elements in the correct position each time.

Ok, that should explain exactly what I am doing.  If I just managed to confuse the heck out of you let me know and I will try to clear it up.

Thanks
> I will see what I can do with the link list tomorrow morning and let you know

A linked list is very unlikely to be what you want...access to linked lists is relatively slow, you must traverse the whole list to find elements - there is no random access.

You probably really wan't a dynamic array of some sort.  This is fairly straightforward to create, basically you need to wrap the array in a class, and then reallocate/move the memory as required.
>>access to linked lists is relatively slow, you
>>must traverse the whole list to find elements - there is
>>no random access.
>>You probably really wan't a dynamic array of some sort.  

The questioner wants a method that would let him have a dynamic array, that stays sorted.
That would be hard to do with just a straightforward dynamic array, and I've already posted an example for this anyway.

A link list is relatively slow, but so is resorting an array everytime you add an item.
> A link list is relatively slow,
> but so is resorting an array everytime you add an item.

it depends of the relative frequencies of insert vs. access.

He says that 'up to 16 threads can be accesing the same data at the same time', so I get the impression that there is a lot of access going on.

With only about 1000 elements in the array, the sort time is going to be fairly small.

> and I've already posted an example for this anyway.

I didn't see that first time around.  I still think that this type of approach is more likely to be correct (although I would probably wrap the array in a class, as I mentioned).
>>it depends of the relative frequencies of insert vs.
>>access.
>>I still think that this type of approach is more likely
>>to be correct

I think you have a good point.
>> This is a multi-threaded app and up to 16 threads
>> can be accesing the same data at the same time
Under DOS?  In VC 1.52?   I don't think that VC 1.52 has thread-safe libraries.  

>> They had an active flag in the structure
Is this being used as a mutex (critical section)?   If you try to do so through C++, like

if (somestructure.Inuse == false)
{
    somestructure.Inuse == true
    // use the somestructure object here.
}

This cannot be made thread safe.   There are ways to do this, but they must be done through assembly (or they can be done with the help of the OS, but DOS does not provide this help.)


>> Accessing each element must be very fast and easy.
That suggests that as jason said you want an dynamic array, like vector<> (although you do't have STL) so you have to write it yourself.  You might want to consider a deque as it is nearly as fast at random access, but is faster for insertion and deletion.  You also might consider a hash table.  This gives you extremly fast random access and extremepy fast insertion and deletion.  However, the data is not kept sorted and cannot be iterated through in any particular order.


>> This index array could be searched faster then the data array because it
>> would require less memory and time to load each element,
Nope.  You are not using virtual memorry under DOS.  You can iterate through an array of small items in just about the same time as an array of large ones.  (There may be a small improvement to the array of small items due to caching, but that is the only difference.

The main questions are:
Do you need to add/delete from the data throuout the program, or is that done only at the start.?
(only at the start, a vector might make sense   Throughout the program then a deque or hash table makes sense).

Do you need to iterate through the data in a sorted order?
(yes, then a vector or deque make sense, a hash table does not.)

>> A link list is relatively slow, but so is resorting an array everytime you add an item.
That is where the dequeue excels over a vector.
Avatar of Ra

ASKER

"You're currently using a fixed-length array, and
presumably inserting or deleting by simply moving the
contents of the "tail" of the array to create or plug
a hole needed for insertion or left by deletion.  This
can be quite costly, especially if the list grows
quite large."
Currently, there are no adds nor deletes, just that active flag.  As I said before, usually only less then 200 of these are active.

">> This is a multi-threaded app and up to 16 threads
>> can be accesing the same data at the same time
Under DOS?  In VC 1.52?   I don't think that VC 1.52 has thread-safe libraries."
The original programers made there own threading model.  Still not sure how they did this, but it works great.

"Do you need to add/delete from the data throuout the program, or is that done only at the start.?"
Add/deletes occur only when the administrative user logs in to the consol and add/deletes.  This is not done too often but we cannot take the machine offline to do it so it can't take all the processor power to search and sort the array.

99% of the time, it is just accessing the array that matters.

"Do you need to iterate through the data in a sorted order?
(yes, then a vector or deque make sense, a hash table does not.)"
Yes, this program takes input in the form of a number and needs to find the element in the array that owns this number.  If it is out of order, it becomes very hard to search.  (You have to search every element to determine if it does not exist rather then stoping when the next element has a number greater then what you are looking for.  Also, outputing to reports must be in order.  If the array is not in order, the reports are messed up.)

">> This index array could be searched faster then the data array because it
>> would require less memory and time to load each element,
Nope.  You are not using virtual memorry under DOS.  You can iterate through an array of small items
in just about the same time as an array of large ones.  (There may be a small improvement to the array
of small items due to caching, but that is the only difference."
Doh!!!  Your right, I'm not used to programming with DOS.  I'm still thinking like a Windows programmer.  But it would still be faster to sort this index array because it is smaller, right?
>> Currently, there are no adds nor deletes,
So you only add at the very start?  In that case a simple dynamic array (vector) is likely to be the best.  It uses the least memory and is fastest for random access.  Creatign the array will be slower than some of the other mechansims, but you "pay" that price only at program start-up.

>> The original programers made there own threading model.
I bet they didn't rewrite the whole run-time library.  They need to.  Without doing so its just a game of russian roulette.  If safety is a factor then you really need to consider this.  The reason you didn't want to use windows was a matter of safety, but you are probably much less safe using a homemade multi-threading system with code that is not thread safe.

>> Yes, this program takes input in the form of a number and needs to
>> find the element in the array that owns this number.  If it is out of
>> order, it becomes very hard to search.
That is not what I asked about though.  There are containers that provide fast searches without having the data sorted.  for eample a hash table provides very fast searches, but does not store the data in a sorted order, so it is impossible to retrieve the data in a sorted order.  I don't think a hash table wouydl be good for you now though since you don't do adds and delates after the container is first filled--right?

>> Also, outputing to reports must be in order.
Then a hash table is out

>>  But it would still be faster to sort this index array
>> because it is smaller, right?
yes, but again that sort process is a one-time cost, right?  You only need to sort the array when it is constructed, then never again.  If that is the case, then its unlikely the difference in speed would ever amatter.  It would probably only matter if you keep adding items to the array and deleting items as the program runs.
> But it would still be faster to sort this index array
> because it is smaller, right?

yes, but I doubt that sort time is going to be a problem anyway...as you say it only happens on add/delete, so it won't happen that often.  

I am assuming, BTW, that because the compiler is so old, then STL is not available either.  If this is the case, then I really would consider implementing your own, simple dynamic array class.

I can provide an example if you need it (as could many of the others, I guess).
Avatar of Ra

ASKER

nietod, currently there are no adds or dletes.  But when I move to a dyanamic array, there will have to be and the array would have to stay sorted.  The original program had a fixed length array of 1000 elements each of which had and active flag.  That variable told the program weather or not the element was in use.  For example, you could have 50, 55, 73, and 80 active and none of the others.  What I want to do is make the array dyanamic so I only have those 4 elements created.  Also, since I now need to move to 4 digits (0-9999) and I cannot make an array of 10000 elements with this structure, I need a way to make them go out of order rather then depending on a direct access method based on the index.  All I need is a way to make this array dyanamic and a way to sort it.

jasonclarke, you're correct.  Please post your example.

Thanks.
> jasonclarke, you're correct.  Please post your example.

#include <string.h>
#include <stdlib.h>

class simple_dynamic_array
{
public:
    simple_dynamic_array(size_t initialSize=100)
        : mMaxSize(initialSize), mSize(0)
    {
        mArray = new int[mMaxSize];
    }

    ~simple_dynamic_array()
    {
        delete [] mArray;
    }
   
    int& operator[](size_t index)
    {
        return mArray[index];
    }

    operator int*()
    {
        return mArray;
    }

    size_t size() const { return mSize; }

    void push_back(int value)
    {
        mArray[mSize] = value;

        ++mSize;

        if (mSize == mMaxSize)
        {
            expand();
        }
    }

    simple_dynamic_array(const simple_dynamic_array& rhs)
        : mSize(rhs.mSize), mMaxSize(rhs.mMaxSize)
    {
        mArray = new int[rhs.mMaxSize];
        ::memcpy(mArray, rhs.mArray, mSize*sizeof(int));
    }

    simple_dynamic_array& operator=(const simple_dynamic_array& rhs)
    {
        int* array = new int[rhs.mMaxSize];
        ::memcpy(array, rhs.mArray, rhs.mSize*sizeof(int));

        delete [] mArray;
        mSize = rhs.mSize;
        mMaxSize = rhs.mMaxSize;
        mArray = array;

        return *this;
    }

private:
    void expand()
    {
        mMaxSize = mMaxSize + mMaxSize;
       
        int* newBuffer = new int[mMaxSize];

        ::memcpy(newBuffer, mArray, mSize*sizeof(int));

        delete [] mArray;
        mArray = newBuffer;
    }

    size_t mSize;
    size_t mMaxSize;
    int*   mArray;
};

int int_compare( const void *arg1, const void *arg2 )
{
    int i1 = *static_cast<const int*>(arg1);
    int i2 = *static_cast<const int*>(arg2);

    if (i1 < i2)
    {
        return -1;
    }
    else if (i1 > i2)
    {
        return 1;
    }
    return 0;
}

int main()
{
    simple_dynamic_array arr(20);

    // do some inserts...
    for (int i=0; i<1000; i++)
    {
        arr.push_back(1000-i);
    }

    // sort it...
    ::qsort(arr,arr.size(),sizeof(int),::int_compare);

    // make some mods...
    for (size_t j=0; j<arr.size(); j++)
    {
        arr[j] *= 2;
    }


    // Test copy/assignment...
    simple_dynamic_array arr2(arr);
    simple_dynamic_array arr3;

    arr3 = arr;

    return 0;
}

Avatar of Ra

ASKER

jasonclarke, it looks good and seems much simpler then a link list.  Unfortunatly I can't test it today.  My boss came in and told me to work on something else today.  It may be a couple of days or a week before I can get back to this.  (He does this to me all the time...)

Thanks.
>> What I want to do is make the array dyanamic so I only
>> have those 4 elements created.
That is called a sparse array.

A good way to impliment that woudl be to write a sparese array container class.  This class would use a linked list to store the data and would use a hash table to "connect" index values (positions) to items in the linked list.  i.e. the hah table would hash based on the index value and would store a pointer (iteraotr) to items the cooresponding item from the linked list.  

jason't code is just for a dynamic array.  That is not really what you need.  (no criticism, it wasn't intended to be.)  That dynamic array has to store all the items, you only want to store some of the items.   You can use that code as part of a sparse array, the part that actually stores data, but actually a linked list is a better way to store data in a sparse array.
Avatar of Ra

ASKER

nietod, that's not exactly correct.  I need to have all the items in the array.  What I was trying to say is that I don't need to have 1000 of them if only 5 are going to be used.  With a dyanamic array I can simply add those 5 and that's it.

This is basically what is going on:
On initialzition of the program, it reads all the elements that have been created from a file.  (lets just say we have 5 elements in that file.)  Then the admin can come in and all or delete elements.  If he wants to add 50 new elements and delete 2 of the original 5, he can, and then 53 elements will be written to the file and the next time the program loads, it will load all 53 of them.  I need a dyanamic array so that he can add those additional 50 elements and delete any that he wants.
But you said

>>For example, you could have 50, 55, 73, and 80 active and none of the others

That means you don't just want a dynamic array, but a sparse array.  ie. you don't store 1,2,3, ... 49 and so on.   a sparse array allows you to store only some arrray alements and not others in a memory-efficient way and still provide speed efficient access.
Avatar of Ra

ASKER

No, if I kept it the was it was originally that would work, but I can't do that anymore.  The original programmers just made a 1000 element array and marked the ones they were using as active.  The other elements would still be there, they were just marked as inactive.

I don't want to do it this way anymore.  For the simple fact that I now need 10,000 elements instead of 1,000.  An array of 10,000 of these structures breaks the 64K limit and the program will not compile.  (1540 elements is the limit before I get this error.)

Anyway, a sparse array will not work for what I want now because all elements that are being used must be loaded and since I am no longer marking specific ones as active all the elements that are created will have to be loaded.  I just want the ability to add and delete elements and to keep them sorted.  jasonclarke's method looks like it will work perfectly for this.
You are not getting it.  In a sparse array, if you add 1 item as postion 10,000 the container stores 1 item.  the memory required is (aproximately) 1*sizeof(item).   But in a dynamioc array, like jason posted,  you must store all items before that 10,000th item.  So in that case the memory used is 10,000*size of item.
Avatar of Ra

ASKER

So your saying that there would be one element in this sparse array with an index of 10000... so then when I want to add, lets just say element 550, I just set sparse_array[550] to the approiate data even if no other elements have been created?

Ok, if that is the case, it would work even better then jason's method but jason's method would still work because I do not need element 10000 at index 10000, it just has to be in numerical order.  For example, I could have a dyanamic array like this:
[0] = 5
[1] = 39
[2] = 10000

Like I said origially, they do not have to be sequential, just in numerical order.  It comes down to which is faster and which takes less memory.  A dyanamic array requires sorting and searching which slows everything down, but if I understand the concept of a sparse array now, it would be much faster since I could still use direct access and only have a small number of elements in memory, rather then all 10000.  Please post an example of this.  BTW, how can you tell if an element exist or not?
Avatar of Ra

ASKER

Another question just came to mind.  If the indexes are not sequential, how to do parse through it?  Like when you want to output to reports as stuff.  A standard loop would not work because it would be adding 1 to the index each time.
>> So your saying that there would be one element in this sparse array with
>> an index of 10000... so then when I want to add, lets just say element 550,
>> I just set sparse_array[550] to the approiate data even if no other elements
>> have been created?
yes.  It is sparesly populated.  The unpopulated elements take up no space.  

>> I do not need element 10000 at index 10000, it just has to
>> be in numerical order.
You don't need it physcially stored in the 10000th position.  But you want to refer to it as if it was. right?  i.e you want

somecontainer[10000]

to return an item at what is considered the 10000th position.  but there might not actually be 99999 items before that.  

>>  A dyanamic array requires sorting and searching which
>> slows everything down
It also requires you to track what item is in what position.  i.e. if you are storing what should be the 10000th item in the dynamic array at item 5, (Because there are only 4 other items stored), then you need some way to track the fact that the 5th item is what was the 10000th item.   the sprase array does this for you.  (It tracks the association between the values supplied to the array index operations and the actual location where the data is stored.)

>>  If the indexes are not sequential, how to do parse through it?
it needs to have a function that takes an index value (1000th) and returns a bool that indicates if that element is actually stored.  If the array will be VERY sparse, then that might be inneficient to use.  In that case you might need to create soem sort of enumaration function that returns the indexes that are used.

I don't have an example written.  its a relatively common container.  jason might have one.  There probalby are ones on the web.  I can help you to write one, but I can't writ it for you--its a decent amount of work.  (Although if you have STL, you can use STL to make it pretty easy, but I suspect you can't use STL in VC 1.5).
Avatar of Ra

ASKER

">> So your saying that there would be one element in this sparse array with an index of 10000... so then when I want to add, lets just say element 550, I just set sparse_array[550] to the approiate data even if no other elements have been created?
yes.  It is sparesly populated.  The unpopulated elements take up no space."
So they are set to null or something like that?

">> I do not need element 10000 at index 10000, it just has to be in numerical order.
You don't need it physcially stored in the 10000th position.  But you want to refer to it as if it was.
right?  i.e you want

somecontainer[10000]

to return an item at what is considered the 10000th position.  but there might not actually be 99999
items before that."

My original plan was to just search through the array until I found the element I wanted.

">>  If the indexes are not sequential, how do you parse through it?
It needs to have a function that takes an index value (1000th) and returns a bool that indicates if
that element is actually stored.  If the array will be VERY sparse, then that might be inneficient to
use.  In that case you might need to create some sort of enumaration function that returns the indexes
that are used."
Most of the time, the elements will be grouped in clusters, like 50 here and 50 there or whatever (sometimes just one cluster).  It is very rare to have any by themselves, but not impossible.  99% of the time it will be using direct access to retrive or write the data, only the reports and the admin screens need to parse through it.

"I don't have an example written.  its a relatively common container.  jason might have one.  There probalby
are ones on the web.  I can help you to write one, but I can't write it for you--its a decent amount
of work.  (Although if you have STL, you can use STL to make it pretty easy, but I suspect you can't
use STL in VC 1.5). "
Nope, STL (not 100% sure as to what that is anyway) is not available in 1.52.  I'll search the web to see if I can find anything.  This looks like this could be the perfect solution here since access would be extremely fast and no searching or sorting would have to be done.

If anybody know of a good example, post a link to it.

Thanks
> its a relatively common container.  jason might have one

no, I don't.  As you say, I would never do anything like this without the STL.

You could use something like the dynamic array to simulate the behaviour of a sparse array, but it would not be very efficient.
I thought it woudl be easy to find code for this on the web.  I searched like over 100 sites now and haven't found any code that did not use STL!!!!

Ra you'll have to search yourself or write it yourself (I can ofter advice only).
I just got an e-mail that said this question was locked with an answer?  did anyone else get this?
Yes, I did, I was just composing a similar message to yours.

I guess this is another site glitch...

(the email was odd, it says the questioner is 24704 and the locker has no name).
Avatar of Ra

ASKER

Yes, I got that same email, wierd.
Avatar of Ra

ASKER

Ok, I looked in MSDN and I understand what STL is now, and I found a few sites that have a sparse array coded that do not use STL (at least I don't think they do).  Could you tell me which one is the best?  I'm still looking for others.  Just found one for VB... but that doesn't help. ;)

http://tomcat.bu.edu/sc447/sarray.htm
http://www.extreme.indiana.edu/hpc++/docs/Examples/global-ptrs/
Neither of them is very good, the first will be very inefficient, since it uses a simple linked list, so it will have poor access times.  The second one is demonstrating something else entirely, so not much thought has been given to the array itself.
Good morning.

I do not see the activity in the history of this question that would have generated the Email you noted here on the locked question.  The Id noted is that of Ra, yet see no evidence of the question being locked.  I've sent this link to Ian from Community Support for further investigation.

Moondancer
Community Support Moderator @ Experts Exchange
>> Neither of them is very good, the first will be very inefficient,
>> since it uses a simple linked list, so it will have poor access times.  
I agree, exept to clarify that it is not the fact that it uses a linked list that makes it poor, but the way in which it uses it.  If it used a hash table to translate from index values to entries in the linked list it would be fine.
Here is an attempt at a sparse array that uses the dynamic array that I defined earlier for it's implementation.

I hope that your compiler can cope with the templates OK.  The array is *very* loosely based on the STL map or set.
Hopefully it is *reasonably* efficient, the lookups are speeded up by keeping the array sorted.

#include <string.h>
#include <stdlib.h>
#include <assert.h>

#include <iostream.h>

template <typename T>
class simple_dynamic_array
{
public:
    simple_dynamic_array(size_t initialSize=100)
        : mMaxSize(initialSize), mSize(0)
    {
        mArray = new T[mMaxSize];
    }

    ~simple_dynamic_array()
    {
        delete [] mArray;
    }
   
    T& operator[](size_t index)
    {
        assert(index >= 0 && index < mSize);
        return mArray[index];
    }

    size_t erase(size_t index)
    {
        assert(index >= 0 && index < mSize);
        if (index < mSize-1)
        {
            // Shuffle Down...
            ::memcpy(mArray+index, mArray+index+1, (mSize-index-1)*sizeof(T));
        }
        --mSize;
        return index;
    }

    operator T*()
    {
        return mArray;
    }

    size_t size() const { return mSize; }

    void push_back(T value)
    {
        mArray[mSize] = value;

        ++mSize;

        if (mSize == mMaxSize)
        {
            expand();
        }
    }

    simple_dynamic_array(const simple_dynamic_array& rhs)
        : mSize(rhs.mSize), mMaxSize(rhs.mMaxSize)
    {
        mArray = new T[rhs.mMaxSize];
        ::memcpy(mArray, rhs.mArray, mSize*sizeof(T));
    }

    simple_dynamic_array& operator=(const simple_dynamic_array& rhs)
    {
        T* array = new T[rhs.mMaxSize];
        ::memcpy(array, rhs.mArray, rhs.mSize*sizeof(T));

        delete [] mArray;
        mSize = rhs.mSize;
        mMaxSize = rhs.mMaxSize;
        mArray = array;

        return *this;
    }

private:
    void expand()
    {
        mMaxSize = mMaxSize + mMaxSize;
       
        T* newBuffer = new T[mMaxSize];

        ::memcpy(newBuffer, mArray, mSize*sizeof(T));

        delete [] mArray;
        mArray = newBuffer;
    }

    size_t mSize;
    size_t mMaxSize;
    T*   mArray;
};

template <class P1, class P2>
struct simplepair
{
    P1 first;
    P2 second;
};

template <class K, class V>
class Compare
{
public:
    static int pair_compare( const void *arg1, const void *arg2 )
    {
        K i1 = static_cast<const simplepair<K,V>* >(arg1)->first;
        K i2 = static_cast<const simplepair<K,V>* >(arg2)->first;

        if (i1 < i2)
        {
            return -1;
        }
        else if (i1 > i2)
        {
            return 1;
        }
        return 0;
    }
};

template <class Value>
class simpleSparseArray
{
public:
    Value& operator[](int key)
    {
        size_t index = find(key);
        if (index == end())
        {
            index = insert(key,0);
        }
        return mData[index].second;
    }

    size_t begin()
    {
        return 0;
    }

    size_t end()
    {
        return mData.size();
    }

    int keyAt(size_t index)
    {
        assert(index >= 0 && index <= mData.size());
        return mData[index].first;
    }

    Value valueAt(size_t index)
    {
        assert(index >= 0 && index <= mData.size());
        return mData[index].second;
    }

    size_t find(const int key)
    {
        size_t index = 0;      
        while (index < mData.size())
        {
            if (key == mData[index].first)
            {
                break;
            }
            if (key < mData[index].first)
            {
                index = end();
                break;
            }
            ++index;
        }
        return index;
    }
    size_t insert(int key, const Value& value)
    {
        simplepair<int,Value> pair;
        pair.first = key;
        pair.second = value;
        mData.push_back(pair);
        sort();
        return find(key);
    }

    size_t erase(size_t index)
    {
        return mData.erase(index);
    }
private:
    void sort()
    {
        typedef int (*sortfunc)(const void *arg1, const void *arg2);
        sortfunc sf = Compare<int,Value>::pair_compare;
        ::qsort(mData, mData.size(), sizeof(simplepair<int,Value>), sf);
    }

    simple_dynamic_array<simplepair<int,Value> >  mData;
};

int main()
{
    simpleSparseArray<const char*> s;

    s[1] = "Hello";
    s[27] = "Array";
    s[4] = "Dense";

    s[4] = "Sparse";

    size_t i = s.begin();
    while (i != s.end())
    {
        cout << s.keyAt(i) << ", " << s.valueAt(i) << endl;
        ++i;
    }

    // Erase...
    size_t index = s.find(27);
    s.erase(index);

    return 0;
}
This sounds too complicated.  How about simply breaking the  array into two (or more) pieces, to stay within the 64K limits?  For example, we could split it in two, and use very simple accessor/mutator functions to hide the split:

#define SIZE 5000
static int array1[SIZE];
static int array2[SIZE];

int get(int idx ) {
   if (idx < SIZE)
      return array1[idx];
   return array2[idx - SIZE];
}

void put(int idx, int value) {
   if (idx < SIZE)
      array1[idx] = value;
   else
      array2[idx] = value;
}

Again, a C++ class could be used to hide this behind some operator[] methods to make things more readable.  Inlining could be used to improve speed.
If there are a 1000 entries (100 times as many as currently) you might have to split it into 100 sections, that would be 640k total.  That would exhaust all of the DOA memory--and then some--and 99% of it would be wasted since most entries are not needed.
Avatar of Ra

ASKER

jason, it's nice, but only one problem.  VC++ 1.52 does not have STL!  I tried compiling it anyway just incase I'm wrong about that, and I get over 100 errors.  After I commented out iostream I only get 30, most of them from the template stuff.  I looked in help and it stated "This product does not support templates."

rtenhove, I didn't think about doing that.  It does solve the problem but uses up a ton of wasted memory.  I can only fit 1500 elements in an array at maximum, so I would need 7 arrays to handle this.  This would really slow things down when it comes to doing reports.  SInce all 10,000 entires would have to be checked for that active flag.  Usually, there are fewer then 200 elements used per system.  That's 8800+ elements that are doing nothing but wasting memory.  I bleieve that a sparse array is the best option here.
Avatar of Ra

ASKER

Here is the correct math.  Rounding down to 1000 elements per array so as to not have any extra.

1 array = 1000 elements = 50K (estimated)
10 arrays = 500 KB

It doesn't quite get to 640 but it gets close.  That doesn't leave much memory for the program to use.
> jason, it's nice, but only one problem.  
> VC++ 1.52 does not have STL!

I knew it didn't have STL (this problem would be trivial if it did!), but I never thought there would be no templates!  I came to VC relatively recently, but I have been using templates in C++ for more than 10 years!

You could instantiate it manually if you are still interested in this as as solution (that is replace template types with actual types).

> This sounds too complicated

there is very little complexity in using the code, just maybe a little in understanding how it works.  But a sparse array is what was asked for, so that is what my code provides.

Nobody (that I noticed), up until your post mentioned DOS memory limits as a constraint - just efficiency.  I never had to program in the pre-Win32 days, something for which I am eternally grateful given how primitive it appears to be. I am not entirely sure how two statically allocated arrays will help over one...but with DOS, who knows.
>>  I am not entirely sure how two statically allocated arrays will help over
>> one...but with DOS, who knows.
The idea is to change to dynamcially allocated arrays because the number of elements that actually need to be stored are low.
> The idea is to change to dynamcially allocated arrays
> because the number of

exactly, hence the sparse and dynamic arrays, but  rtenhove seems to be suggesting using two (or more) statically allocated arrays to simulate one large one.  I am not sure how this is supposed to help...
>> rtenhove seems to be suggesting using two (or more)
Each in different segments.  That way you don't have arrays larger than 64k.  But the problem is you woudl need a 100 such arrays...

Of course, the other option is to switch to the huge or large (I think large) memory model.  These (huge at least) do allow arrays that are larger than 64 k.   But they more slowly and the code is larger.
Avatar of Ra

ASKER

Yeah, there is a _huge type modifier that you can use to make arrays, but like I said, 10,000 elements will take 500KB or more and a _huge array might take more then that.  Then you got to add everything that DOS puts into the first 640KB and you only have less then 100K left for the program.

jason, please post the code with the actuall type.  This still seems like the best method.  VC++ 1.52c is copyrighted 1993.  It's 8 years out of date now.  I believe 2.0 was the first MS C++ compiler to use templates, but not sure on this.  I've only been using C++ for a few years now.  I started when they had VC++ 5.0 out.  I'm still waiting for 7.0... VB with thread support!!!  ;)
ASKER CERTIFIED SOLUTION
Avatar of jasonclarke
jasonclarke

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Ra

ASKER

Ok, thanks.  I'll test this after lunch.
Avatar of Ra

ASKER

>> int i1 = static_cast<const simplepair* >(arg1)->first;
>> int i2 = static_cast<const simplepair* >(arg2)->first;

jason, static_cast is unknown in VC 1.52.  Do you have alternative code for this?
> Do you have alternative code for this?

 int i1 = ((const simplepair*)(arg1))->first;
 int i2 = ((const simplepair*)(arg2))->first;
just use a C cast

int i2 = ((const simplepair*)arg2)->first;

Its hard to travel back in time...
> Its hard to travel back in time...

absolutely...these things come as second nature - it is not easy to spot all the new stuff.
Avatar of Ra

ASKER

> > Its hard to travel back in time...

> absolutely...these things come as second nature - it is
> not easy to spot all the new stuff.

LOL, and if you're like me and never done anything this old before, it's near impossible.

It works now.  Thanks for all the help.  The project is currently in C and I'm gonna try to convert it to C++.  It shouldn't be that hard and it looks as though C++ will still work as a MSDOS app, so everything should be good, if not, I'll just remove the class stuff and use the functions themselves.  I'm increasing the points since this question turned out to be harded then I expected.  Thanks to all of you for you help.
Avatar of Ra

ASKER

Thanks again.