Solved

Sorting, Integers/Int Strings

Posted on 2003-10-29
75
1,434 Views
Last Modified: 2013-12-14
This is basically what im working with.  

So far:
It takes a file name in from the command line.
Inserts it into an array.
Runs the array through various sorts.

End Result:
I'm wanting to be able to count the number of comparisons that take place in each sort function and output them to the user.  Some functions I have tried to implement a comparison counter but I dont think im doing this right??

I also want to output the sorted array to a file.  However im wanting to use the input filename + a suffix.  Could this be done in a seperate output() function by passing the sorted array, I guess i dont know how to keep track of the input filename to use it in the output name??

I am planning on using integer data files but inserting them into an array once as integers and then a second time as integers as strings.  Then performing sort on both, I guess this mean I need to change the sorting algorithms and use compare() where needed??

Here is what I have so far... any design tips, hints, points, anything would be appreciated.  There is no specific question here, just a little collaboration on how I am doing so far. Thanks.

#include <iostream>
#include <string.h>
#include <fstream>
using namespace std;

void compare(int a, int b) {
  // will compare the integers here and return result
}

void compare(char* a, char* b) {
  // will compare here also, except as integer strings
}

template <class T>
void InsertionSort(T A[], int size)
{
  int compares = 0;
  for(int j=1; j<size; j++)
    {
      T key = A[j];
      int i = j - 1;
      while ((i >= 0) && (A[i] > key))
      {
        A[i+1] = A[i];
        i -= 1;
        compares++;
      }
      A[i+1] = key;
    }
}

template <class T>
void mergesort(T A[], int left, int right)
{
  if(left<right)
    {
      int mid=(left + right)/2;
      mergesort(A, left, mid);
      mergesort(A, mid+1, right);
      merge(A,left,right);
    }
}

template <class T>
void merge(T A[],int left, int right)
{
  int size=right-left+1,
  mid=(left+right)/2;

  T *b=new T[size];
  int count=0;
  for(int x=left;x<=mid;x++,count++)
    {
      b[count]=A[x];
    }
  for(int x=right;x>=mid+1;x--,count++)
    {
      b[count]=A[x];
    }
  for(int x=0,y=size-1,put=left;x<=y;put++)
    {
      if(b[x]<=b[y])
      {
        A[put]=b[x];
        x++;
      }
      else
      {
        A[put]=b[y];
        y--;
      }
    }
  delete[] b;
}

template <class T>
void quicksort(T A[], int left, int right, int pivot)
{
  int left_side, right_side;

  left_side = left;
  right_side = right;
 
  do
    {
      while (A[right_side] > pivot)
      right_side--;
      while (A[left_side] < pivot)
      left_side++;
      if (left_side <= right_side)
      {
        swap(A[left_side], A[right_side]);
        left_side++;
        right_side--;
      }
    }
  while (right_arrow >= left_arrow);
 
  if (left < right_arrow)
    quick_sort(A, left, right_arrow);
  if (left_arrow < right)
    quick_sort(A, left_arrow, right);
}  

template <class T>
void Heapsort(T A[], int size)
{
  BuildHeap(A, size);
  for(i=size; i>1; i--)
    {
      int tmp = A[0];
      A[0] = A[i-1];
      A[i-1] = tmp;
      size--;
      Heapify(A, 1, size);
    }
}

template <class T>
void BuildHeap(T A[], int size)
{
  for(i=size/2; i>0; i--)
    Heapify(A, i, size);
}

template <class T>
void Heapify(T A[], int i, int size)
{      
  int  l = 2 * i;
  int  r = 2 * i + 1;
  int largest = 0;

  if(l <= heapsize && A[l-1] > A[i-1])
    largest = l;
  else
    largest = i;

  if(r <= heapsize && A[r-1] > A[largest-1])
    largest = r;
  if(largest != i)
    {
      int tmp = A[i-1];
      A[i-1] = A[largest-1];
      A[largest-1] = tmp;
      Heapify(A, largest, heapsize);
    }
}

int main(int argc, char* argv[])
{
  ifstream input;  // initializes input
  int dataLength;

  if(argc > 1)
    {
      string inData = argv[1];
      input.open(inData);

      if (!input) {
      cerr << "Unable to open file." << endl;
      exit (1);
      }

      input >> itemCount;
      arraySize=itemCount-1;
      int Array[arraySize];
      int arrayCount = 0;

      while(input >> x) {
      Array[arrayCount] = x;
      arrayCount++;
      }

      intertionsort(Array, arraySize);
      mergesort(Array, 0, arraySize);

      int lastPivot = Array[arraySize];
      int middlePivot = Array[arraySize]/2;

      quicksort(Array, 0, arraySize, lastPivot);
      quicksort(Array, 0, arraySize, middlePivot);
      heapsort(Array, arraySize);
    }
}
0
Comment
Question by:killer455
  • 39
  • 33
  • 2
  • +1
75 Comments
 
LVL 11

Expert Comment

by:bcladd
ID: 9646862
(1) Input file name: You have it in inData in the scope of the if statement in main. If you want to add ".out" to the file and use it for output you could use (just before end of if statement body)
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9646885
Ooops.

You could try

    string outData = inData + ".out";
    output(Array, outData);

Assuming that output takes an array and the name of a file and uses an ofstream to print the data.

(2) My architecture hint from your other question pertains here even more so: write the comparison function and replace all your comparisons with calls to that function. It may need to be a templated function so that you can compare arbitrary types. Include inside the function the incrementing of the counter. You will still have to reset the comparison counter between sorts (if it is global). Alternatively you could make the comparison function a comparison functor that has the count contained in it.

(3) Your calculation of middlePivot is wrong. I think you want the /2 INSIDE the square brackets.

-bcl
0
 

Author Comment

by:killer455
ID: 9646888
Won't i need to output the array into the file with the suffix name added?  If so shouldnt this be done in a seperate output().  How would i get the input file name there?
0
 

Author Comment

by:killer455
ID: 9646905
Ok, when u say it may need to be templated.  Are you saying the compare function?  Well I want the program to be able to sort files (containing integers) as either integers or integer strings.  To do this wouldnt i need to use the function overloading (as above).??

0
 
LVL 11

Expert Comment

by:bcladd
ID: 9646908
The signature for the output function I dreamed up in my previous post was

    void output(int A[], const string & ofName);

thus you can open ofName for output.

Note: I just looked at the declaration of Array. C++ does not support dynamically sized arrays declared like that. You must either declare the array on the heap (using new[]) or use some dynamically sized container (might I suggest std::vector).

-bcl
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9646928
Suggestion: You might want to comment out most of your program and see if you can read the input file and write the exact same array to the output file. That gives you a chance to play with C++'s syntax for declaring arrays and leads you to a working program sooner. Then uncomment one sort (one you think works) and work the kinks out of it. That will give you a feel for the syntax of using your comparison functions and making sure your templates work in one case.

I think you will find a templated compare function will make your life easier. Also, don't use char * if you can avoid it: use std::string.

-bcl
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9646934
Overloading should work. I just thought it through and it should work fine. I still suggest trading char * for string. Building the integer strings will be easier, comparing them will be much easier,  populating the arrays will be much, much easier, etc.

-bcl
0
 
LVL 4

Expert Comment

by:n_fortynine
ID: 9646942
killer445, when I suggested that you use function overloading I had something else in mind (that you weren't gonna use templates) and I guess you misunderstood my last comment from the previous post. Here since you're using a function template, my suggestion would be to have 2 different arrays if you were going to do this. An array of type int, and the other one can be of a type you define to hold char*.
class myString {
   char* num;
   public:
   myString() { }
   setVal(int number) {
     //convert from your number into the array num
   }
   bool operator == (const myString& s) const {
      return strcmp(num, s.num) == 0;
   }
   bool operator > (const myString& s) const {
      return strcmp(s.num, num) < 0;
   }
   bool operator < (const myString& s) const {
      return strcmp(s.num, num) > 0;
   }
};

and in your main create an array of this:
  myString  mS[/*Same size as your int array*/];
  and when you get the data from file, make sure you call
  mS[/*index*/].setVal(/*the_int_data*/);

This class overloads your compare operators, so they'll work fine with the sort functions you're having now.
0
 
LVL 4

Expert Comment

by:n_fortynine
ID: 9646953
>>I think you will find a templated compare function will make your life easier. Also, don't use char * if you can
>> avoid it: use std::string.

I totally agree, there are >, <, == operators already defined for std::string so you can just make you compare function a function template and that should work. I think bcladd's got your answer for you there.
0
 

Author Comment

by:killer455
ID: 9647003
Ok those last couple posts kinda got me confused.  Here is some updated code, just to show u what im working on.  

Basically im confused about the whole templates thing now.  By a templated compare function.. do you mean.

template <class T>
void compare(T a, T b) {
  // will compare the integers here and return result
compares++;
}

instead of the two seperate ones I have below?

#include <iostream>
#include <string.h>
#include <fstream>
using namespace std;

void compare(int a, int b) {
  // will compare the integers here and return result
compares++;
}

void compare(string a, string b) {
  // will compare here also, except as integer strings
compares++;
}

template <class T>
void InsertionSort(T A[], int size)
{
  for(int j=1; compare(j, size); j++)     // compare j < size
    {
      T key = A[j];
      int i = j - 1;
      while ((compare(i, 0) && (compare(A[i], key))  // compare i>=0 and then A[i] > key
     {
       A[i+1] = A[i];
       i -= 1;
     }
      A[i+1] = key;
    }
}

template <class T>
void mergesort(T A[], int left, int right)
{
  if(compare(left, right))   // compare left<right
    {
      int mid=(left + right)/2;
      mergesort(A, left, mid);
      mergesort(A, mid+1, right);
      merge(A,left,right);
    }
}

template <class T>
void merge(T A[],int left, int right)
{
  int size=right-left+1,
  mid=(left+right)/2;

  T *b=new T[size];
  int count=0;
  for(int x=left;compare(x,mid);x++,count++)     // compare x<=mid
    {
      b[count]=A[x];
    }
  for(int x=right;compare(x, mid+1);x--,count++)      // compare x>=mid+1
    {
      b[count]=A[x];
    }
  for(int x=0,y=size-1,put=left;compare(x,y);put++)      // compare x<=y
    {
      if(compare(b[x],b[y]))        // compare b[x]<=b[y]
     {
       A[put]=b[x];
       x++;
     }
      else
     {
       A[put]=b[y];
       y--;
     }
    }
  delete[] b;
}

template <class T>
void quicksort(T A[], int left, int right, int pivot)
{
  int left_side, right_side;

  left_side = left;
  right_side = right;
 
  do
    {
      while (compare(A[right_side],pivot))    // compare A[right_side] > pivot
     right_side--;
      while (compare(A[left_side],pivot))     // compare A[left_side] < pivot
     left_side++;
      if (compare(left_side,right_side))      // compare left_side<=right_side
     {
       swap(A[left_side], A[right_side]);
       left_side++;
       right_side--;
     }
    }
  while (compare(right_side,left_side));   // compare right_side>=left_sude
 
  if (compare(left,right_side))                        // compare left < right_side
    quick_sort(A, left, right_side);
  if (compare(left_side,right))                     // compare left_side < right
    quick_sort(A, left_side, right);
}  

template <class T>
void Heapsort(T A[], int size)
{
  BuildHeap(A, size);
  for(i=size; compare(i,1); i--)           // compare i>1
    {
      int tmp = A[0];
      A[0] = A[i-1];
      A[i-1] = tmp;
      size--;
      Heapify(A, 1, size);
    }
}

template <class T>
void BuildHeap(T A[], int size)
{
  for(i=size/2; compare(i,0); i--)                 // compare i > 0
    Heapify(A, i, size);
}

template <class T>
void Heapify(T A[], int i, int size)
{      
  int  l = 2 * i;
  int  r = 2 * i + 1;
  int largest = 0;

  if(compare(l,heapsize) && compare(A[l-1],A[i-1]))  // l <= heapsize and A[l-1] > A[i-1]
    largest = l;
  else
    largest = i;

  if(compare(r,heapsize) && compare(A[r-1],A[largest-1]))   //  (r <= heap : A[r-1] > A
        largest = r;

  if(compare(largest,i)      // compare largest != i
    {
      int tmp = A[i-1];
      A[i-1] = A[largest-1];
      A[largest-1] = tmp;
      Heapify(A, largest, heapsize);
    }
}

int main(int argc, char* argv[])
{
  ifstream input;  // initializes input
  int dataLength;

  if(argc > 1)
    {
      string inData = argv[1];
      input.open(inData);

      if (!input) {
     cerr << "Unable to open file." << endl;
     exit (1);
      }

      input >> itemCount;
      arraySize=itemCount-1;
      int Array[arraySize];
      int arrayCount = 0;

      while(input >> x) {
     Array[arrayCount] = x;
     arrayCount++;
      }

      intertionsort(Array, arraySize);
      mergesort(Array, 0, arraySize);

      int lastPivot = Array[arraySize];
      int middlePivot = Array[arraySize/2];

      quicksort(Array, 0, arraySize, lastPivot);
      quicksort(Array, 0, arraySize, middlePivot);
      heapsort(Array, arraySize);
    }
}
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9647025
(1) Yes, a single templated compare function will suffice.
(2) That way the templated compare function works with any class that supports your method of comparison (since you're using strings, you can use < inside compare and have it work).

-bcl
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9647061
I'm sorry I didn't mention this sooner but "compare" is a terrible name for the function. When it is called, how can you tell when it should be true and when it should be false? Calling it "lessThan" or "less" or even "LT" (Fortran flashbacks for some here) makes it clear what you are comparing in the if statements (and while loops) in the sort functions.

-bcl
0
 

Author Comment

by:killer455
ID: 9647171
bcladd,

> void output(int A[], const string & ofName);
> thus you can open ofName for output.

If this is called from each sort seperately, how do i pass ofName?? since its not used in the sorts but in the main()

> Note: I just looked at the declaration of Array. C++ does not support dynamically sized >arrays declared like that. You must either declare the array on the heap (using new[]) or >use some dynamically sized container (might I suggest std::vector).

Don't quite know what u mean by this?  Is this a problem in another area of my program?
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9647214
(1) Taking your questions in reverse order:

The following code WILL NOT COMPILE IN C++.  It is almost identical to the code you have allocating Array in your main function.

      int someSize;
      someSize = 19;
      int Array[someSize];

So you need to declare Array in some other way. Either declare it to be a huge, huge array that you know you will never overflow (this is a bad choice; ask Microsoft about the concept of "buffer overruns") or, having read the size from the file, allocate space to hold the array on the heap using an int * and new or use std::vector (a templated, dynamically sized array replacement in the STL).

This is a problem you are going to face.

(2) You are planning on each sort printing the array to a file? If you are only printing the results (the sorted array), then you should be able to do the printing in main() AFTER calling the sort. If you are going to print intermediate results during processing, you can pass in the name (outData) as a string parameter to each of the sort routines. They can then open the file and print their results.

An alternative would be to have a global ofstream, fout, that you open in main() before calling any of the sort routines and that all of your sort routines use for output. Note that most people frown on global variables but this one is very, very close to the global cout stream used for standard output.

0
 

Author Comment

by:killer455
ID: 9647242
So you need to declare Array in some other way. Either declare it to be a huge, huge array that you know you will never overflow (this is a bad choice; ask Microsoft about the concept of "buffer overruns") or, having read the size from the file, allocate space to hold the array on the heap using an int * and new or use std::vector (a templated, dynamically sized array replacement in the STL).

Ok i might need some help with this, havent worked with it before.  I could have sworn i did a past program using:

     int someSize;
     someSize = 19;
     int Array[someSize];

But i will take your word for it obviously :)  I dont have my files on this comp so i cant check.

Maybe this will change things, probably not but the first number in each file im going to be used is the size of the array, like this: (file)

3
123
234
234

It will take input from the file and use that as the input, but this still wont work will it?
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9647305
No. The error in Borland's C++ compiler is "Constant expression required in function main(int,char * *)" at the declaration of Array. The problem is that C++ sets aside space in the function activation record for all local variables AT COMPILE TIME so it needs to know the size of every one of them at compile time. The code will work if you declare someSize to be a const int and initialize it on the same line:

    const int someSize = 19;
    int Array[someSize];

This works because the compiler knows the size at compile time and you cannot change the size at run time. Exactly what you don't want.

My suggestion:

#include<vector>

(you already are using namespace std to get string so that means std::vector is just vector)

vector<int> Array;

Then in your loop you can use the push_back method (that puts a new element at the back of the vector). You don't need to include the size in the file (vectors grow as needed) and you don't have to count the nubmer of elements because vectors have a size() method that returns the number of elements in the vector. vectors also have a [] operator that is efficient and works just like an array (okay, almost exactly like an array).

    while(input >> x) {
         Array.push_back(x);
      }

There is a way to dynamically declare an array but I don't want to get into it until you ask.

This means the signature of each of your sort functions will change to be vector<T> A rather than T A[]. You don't have to pass in the size unless you need it explicitly and if you need it from main you can refer to it as Array.size()

-bcl
0
 
LVL 3

Expert Comment

by:EarthQuaker
ID: 9650514
I'm sorry, but why don't you simply time those operations and report the time it tooks ?

It's obvious that what is faster does less or faster operations...

have a look around QueryPerformanceCounter() if interested.
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9650851
EarthQuaker-

Exact operation timing is often worthwhile but it is also the case that understanding the big-O running time of various algorithms can be very useful. When studying big-O running times it is annoying to have to take into account whether the user was watching streaming video or had an IM session in the background or...you get the idea. That sort of noise can be filtered out of the data by simply counting operations of interest (as you suggest with the performance counter) and those operations are typically at a high level of abstraction (is it more important to know the number of comparisons charged to the sort at the C++ level or the number of times the CMP machine instruction was executed? What percentage of the CMP should be charged to the sort and which were set-up/tear-down code emitted by the compiler that would be different on a different compiler/architecture).

Just some thoughts on why you might want to count comparisons (or swaps; even harder to track at the machine language level) rather than time or machine instructions.

-bcl
0
 

Author Comment

by:killer455
ID: 9652037
blcadd,

In some of my sorts... what is the different between doing:
void insertionSort(int A[], int size);
Than doing:
void insertionSort(int *A, int size);

Is one better to use or...???







0
 
LVL 11

Expert Comment

by:bcladd
ID: 9652116
Both are fundamentally the same.

The difference is that you could change the value of A in the second example (You could use ++A or A += 2 or similar), that is, when it is a pointer.
You CANNOT change A in that way when it is passed as an unsized array.

The fact that using the two is all but the same indicates that C/C++ treats arrays as pointers at sequences of objects of the same type (this is under the hood) and explains why address arithmetic works with pointers and with arrays.

Is one better than the other? The [] notation indicates that you really expect to be working with sequences of items rather than a single item. Thus as a documentary device it can be useful. Otherwise no difference in performance so just pick one or the other (to be consistent) and get comfortable with reading it.

-bcl
0
 

Author Comment

by:killer455
ID: 9653181
Also the comparison count is going to differ depending on whether the array is full of integers or integer strings.

The number of comparisons of integers in an array in quicksort will differ from the number of comparisons on the array if the numbers were treated as integer strings.

How can you use only one compare function for both integers and integer strings? doesnt there need to be 2?
0
 
LVL 11

Accepted Solution

by:
bcladd earned 500 total points
ID: 9653261
Well, a tempated lessThan function would be able to compare any type that supported the internal comparison method. I will not write that function for you but I will write a small bit of source to do equals()

#include <string>
#include <iostream>
using namespace std;

template <class T>
bool equals(const T &A, const T &B)
{
  return (A == B);
 
}


int main(int argc, char **argv)
{
  int x = 10, y = 11;
  string a = "Hello", b="Hello", c="Goodbye";
  string Spc=" ";
  string Not="!";
 
 
  cout << (equals(x, x) ? " " : "!") << "equals(x, x)" << endl;
  cout << (equals(x, y) ? " " : "!") << "equals(x, y)" << endl;
  cout << (equals(a, b) ? " " : "!") << "equals(a, b)" << endl;
  cout << (equals(a, c) ? " " : "!") << "equals(a, c)" << endl;
  cout << (equals(x, 13) ? " " : "!") << "equals(x, 13)" << endl;  
} /** End of main **/


Now, your observation that you need two comparison functions makes sense and this program has two equals() functions, one for int and one for std::string. They were written by the compiler when the templated funciton was instantiated. Since your sort functions are templated (there are really two sort functions, one for each of two types), templating the counted comparison function seems to fit in with the same spirit.

-bcl
0
 

Author Comment

by:killer455
ID: 9653905
Ok, but the "comparison" count that is incremented inside that function will not be the same for integers and integer strings.

For example

template <class T>
bool lessThan(int a, int b) {

// Whatever is in here wont work for both integers and integer strings correct?
// The count that is incremented is going to be different also.
// I'm wanting to compare how many comparisons take place for regular integers
// and then for integer strings.

compares++;
}
0
 

Author Comment

by:killer455
ID: 9653949
Here is an outline of what im trying to say i guess:

1.  An array is pulled in from a file, the numbers stored as regular integers.
    - This array is run through all of the sorts.
    - Each different sort produces a comparison count that is displayed.
    - Then the final sorted array it sent to an output file.

2. A second array is then built with the same file, except stored as integer strings.
    - Same as above

This should allow comparison between the two different comparison counts that are produced.  So shouldnt  this mean that integer strings should be compared differently than regular integers?  If not then the count would be the same and there wouldnt be a reason for observation of the difference in comparisons.
0
 

Author Comment

by:killer455
ID: 9654009
Also in BuildHeap

  if(compare(largest,i)      // compare largest != i
    {
      int tmp = A[i-1];
      A[i-1] = A[largest-1];
      A[largest-1] = tmp;
      Heapify(A, largest, heapsize);
    }

I have to compare largest and i   --> largest != i

How can I do this with only one comparison function, I guess I dont see how to implement the function and be able to make all the comparisons possible.
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9654054
I guess I don't understand what you mean by "integer string". It was my assumption that you would read the following file:

23
456
2200
213
3456

and produce the following sorted files:

integers
23
213
456
2200
3456

strings
213
2200
23
3456
456

(the strings are in lexicographic order).

If this is what you mean to do with your sort, then the comparison will be counted correctly if you use < in the your lessThan function. That is, one string comparison will be counted as one comparison. If you want to count how many character comparisons are done when comparing those strings then you will need two different versions of lessThan so you can write your own string compare and hook a count to the character comparisons (could be done using an int/char version of lessThan, actually).

Since the sort orders are different, it stands to reason that the amount of shuffling required to go from the original order to the sorted order will be different in the two scenarios so you will have something to compare.

If I am misunderstanding what you want to do, please let me know.

-bcl
0
 

Author Comment

by:killer455
ID: 9654108
You understand me correct, the sorted lists you have above are what im looking to do with this program.  However wont running the program on an array of ints and array of strings produce the exact same comparison count?  If so this isnt what im looking for.  I'm looking to observe the differences in the comparison count in each situation.  So I guess this means I want to count the character comparisons correct?

Then this leads me to my next question.  If this is the case, then how do I use two seperate functions and still call a compare within my sort functions?

If im getting ahead of myself, sorry, trying to grasp this.
0
 

Author Comment

by:killer455
ID: 9654121
Well nevermind i read your last statement again.  You're right the sort orders will be different.  Yes this is what i want, forget what i said about character comparisons.
0
 

Author Comment

by:killer455
ID: 9654157
I guess I dont understand how it can be done with one function.  How you "tell" the function that the incoming item is an integer or a string, and then performing the correct comparison depending on that.
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9654158
Cool. The big win in using std::string to hold the string versions of the integers is that operator< is defined for std::string and is defined to order the strings lexicographically. So a template that uses < inside it will work with int, string, float, etc. It would not work (without a specialization) for char * because < would compare the addresses of the strings in memory, not the contents of the strings.

-bcl
0
 

Author Comment

by:killer455
ID: 9654201
Does this work?  Trying to get it to work for all types of comparisons.  Obviously if
!(a<b) then it has to be (a>=b).  However some of the comparisons require !=

template <class T>
void compare(T a, T b) {

if(a < b) { return 1; }
if(a > b) { return 0; }
if(a==b) { return 1; }

compares++;
}

0
 

Author Comment

by:killer455
ID: 9654211
If you need/want me to start a new thread let me know.  But your continued help/tips are very appreciated and having it all together helps also.
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9654250
(1) Yes it works.

(2) Does it count as one comparison?

The second question is for you to answer.

Remember that in an ordered set, a != b is equavlent to ((a < b) || (b < a)); two comparisons but only using <

-bcl
0
 

Author Comment

by:killer455
ID: 9654278
(2) Does it count as one comparison?

Well inside the compare functions there may be more than one comparison being made.  But ultimately shouldnt it only be incremented once because the two items are being compared "once" in the sort function.  Since compare i only called once

0
 

Author Comment

by:killer455
ID: 9654295
template <class T>
void compare(T a, T b) {

  if(a < b) {
   compares++;
   return 1;
  }

  if(a > b) {
    compares++;
    return 0;
  }

  if(a==b) {
    compares++;
    return 1;
  }
}
0
 

Author Comment

by:killer455
ID: 9654401
Is that what you meant above  ^?
Also since the there has to be two arrays created (one for ints and one for strings) and then either create output in the sorts, or do it in the main (which would require making more copies of the arrays), isnt there some way to get around such repeated code?? Perhaps writing a function that builds the array and then that function calls the sorts?

int main()
{
...
      buildArray <int> (input);
      buildArray <string> (input);
...
}

template <class T>
buildArray(ifstream input)
{
      T* Array = NULL;  
      int itemCount;      
      input >> itemCount;
      int arraySize = itemCount-1;    
      a = new T[arraySize];
      int arrayCount = 0;

      while(input >> x) {
        Array[arrayCount] = x;
        arrayCount++;
      }

      intertionsort(Array, arraySize);
      mergesort(Array, 0, arraySize);

      int lastPivot = Array[arraySize];
      int middlePivot = Array[arraySize/2];
      quicksort(Array, 0, arraySize, lastPivot);
      quicksort(Array, 0, arraySize, middlePivot);

      heapsort(Array, arraySize);
}
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9654408
Yep. You can make it a little simpler (and closer to your original) with

    template <class T>
    void compare(T a, T b) {    
        compares++;

        if (a < b) { return 1; }
        if (a > b) { return 0; }
        if (a==b) { return 1; }
    }

-bcl
0
How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

 
LVL 11

Expert Comment

by:bcladd
ID: 9654490
Sorry, I didn't see your second posting above. It is a nice templated function.  
    Why is ArraySize itemCount - 1?
    You never declare a variable a so assigning to it is an error.
    How many output files will you have?
    When does compares get set to zero?

Also, you asked (a few back) about setting up another question. If you want to ask a different, specific, question or just get a broader perspective, feel free. If you asked I could even refrain from chiming in. I am not so much of a point-monger to need you to set up another question (not saying I won't accept the points for this one or even more than shown above; I just want to keep some dignity while grovelling...).  I like the teaching so I will give you what guidance you ask for.

-bcl
0
 

Author Comment

by:killer455
ID: 9654845
> Why is ArraySize itemCount - 1?

Because itemcount is equal to the number of items in the array.
Shouldnt the first item be in A[0].  So for 80 items, itemCount = 80.  Making ArraySize=79
Inputing items in A[0]....A[79]  which is 80 items.

> You never declare a variable a so assigning to it is an error.

Oops, thanks for the correction.

> How many output files will you have?

After one run of the program.  There will be 10 output files.  Five, one from each sort, pertaining to the array of ints.  Five more, one from each sort, pertaining to the array of strings.

> When does compares get set to zero?

I was unsure about this one.  I guess it depends on how I handle the output?? I could set compares to 0 at the beginning of each sort and then passing it to the compare function, but it seems to me that setting it up there would kinda be confusing.  I guess I don't know where I want to do the output at, or where is the most logical
0
 

Author Comment

by:killer455
ID: 9654852
Pertaining to the output, one reason I dont know where to do it is because I want to add a suffix to the input filename.  Like DataFile.inths.

Input filename = DataFile
Type of data = int
Type of sort = heapsort or "hs"
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9657314
ArraySize is 80, not 79.  Yes, the indices run from 0 - 79 but the number in the square brackets after a type in new (or in-line when declaring an arry in C/C++) is the NUMBER OF ELEMENTS. So if you want to store 80 elements you had best declare 80 elements, not 79. C/C++ declares ranges (in arrays/vectors/strings) with the first element's index and ONE PAST the last element's index. So, with an implicit start of 0, you use the number of slots to declare. It makes for loops easy to write for counting up:

  for (int i = 0; i != ArraySize; ++i) {
    // do something with Array[i] here
  }

On output: You want output with the number of comparisons and the final sorted array generated for each sort, right? Assuming you don't care about intermediate results, it makes sense to alternate calls to sorts with output statements in main(). Why there? Because it leaves the sorts as general purpose as possible (they just sort their input and are not tied to this particular program except through their need for the lessThan function) and they can be reused. Because main() already has the name of the input file so it can generate related output file names without passing the file name around. Because one global output stream would make sense; ten global output streams makes no sense.

This is also where you should reset compares. That way if you wanted to count two different sorts together you could skip initialization between the two (compares continues counting across the two). Also keeps program specifics out of the sort routines.

-bcl
0
 

Author Comment

by:killer455
ID: 9657965
Yes im wanting to display the comparisons.  The sorted array should not be displayed, only sent to the file as output.  Is this what you meant for the main() dealing with output?

int main(int argc, char* argv[])
{
  ifstream input;
  ofstream output;
  int dataLength;

  if(argc > 1)
    {
      string inData = argv[1];
      input.open(inData);

      if (!input) {
        cerr << "Unable to open file." << endl;
        exit (1);
      }


      int lastPivot = Array[arraySize];
      int middlePivot = Array[arraySize/2];
     
      buildArray <int> (input);

      intertionsort(Array, arraySize);
      output.open(inData+.intis);
      // output sorted array here

      mergesort(Array, 0, arraySize);
      output.open(inData+.intms);
      // output sorted array here

      quicksort(Array, 0, arraySize, lastPivot);
      output.open(inData+.intqsl);
      // output sorted array here

      quicksort(Array, 0, arraySize, middlePivot);
      output.open(inData+.intqsm);
      // output sorted array here

      heapsort(Array, arraySize);
      output.open(inData+.inths);
      // output sorted array here

      buildArray <string> (input);

      intertionsort(Array, arraySize);
      output.open(inData+.stris);
      // output sorted array here

      mergesort(Array, 0, arraySize);
      output.open(inData+.strms);
      // output sorted array here

      quicksort(Array, 0, arraySize, lastPivot);
      output.open(inData+.strqsl);
      // output sorted array here

      quicksort(Array, 0, arraySize, middlePivot);
      output.open(inData+.strqsm);
      // output sorted array here

      heapsort(Array, arraySize);
      output.open(inData+.strhs);
      // output sorted array here

    }
}
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9658012
Something like that. You need to put the extentions in quotes and make sure you close output when you are done writing the array.

0
 

Author Comment

by:killer455
ID: 9658033
To me that seems like a lot of code, but i guess its necessary right?  I guess I can always clean it up later after i get the general program working.
0
 

Author Comment

by:killer455
ID: 9658122
Also if i use the buildArray function, how do i pass the array back to the main for use in the sort function calls?
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9658681
(1) How are you going to pass the Array into the sort routines? It needs to be modified inside the sort function, right? That makes it very similar to a newly created array. I suggest you use a reference to a pointer of the appropriate type (for readability you should declare a new type for the pointers so you can pass a reference to the new type).

(2) A lot of code? I act surprised because I wonder where you planned on having less code. You can simplify main() if you want to use function pointers (and you make all of your sort functions have exactly the same signature), construct an array, and cycle through the array. You should probably write an output routine (similar to the input routine) that prints an entire array. That way you just pass the file name, the array, and the length of the array and it prints it all out. Makes each call to sort look something like this:

    init_sort(); // initializes any counters and stuff like that (set compares to 0)
    intArrayType heapArray ;
    copy(heapArray, Array, arraySize);
    heapsort(heapArray, arraySize);
    saveArray(inData+".ihs", heapArray, arraySize);
    show_sort_count(); // displays the statistics

Notice that I used a bunch of functions you actually haven't written but the should all be short and easy to write.

-bcl
0
 

Author Comment

by:killer455
ID: 9659196
(1) How are you going to pass the Array into the sort routines? It needs to be modified inside the sort function, right? That makes it very similar to a newly created array. I suggest you use a reference to a pointer of the appropriate type (for readability you should declare a new type for the pointers so you can pass a reference to the new type).

Yes it needs to be modified inside the sort function.  Do you mean something like this?


template <class T>
void InsertionSort(T* A, int size)
{
  A = new T[]
  for(int j=1; compare(j, size); j++)     // compare j < size
    {
      T key = A[j];
      int i = j - 1;
      while ((compare(i, 0) && (compare(A[i], key))  // compare i>=0 and then A[i] > key
           {
             A[i+1] = A[i];
             i -= 1;
           }
      A[i+1] = key;
    }
}
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9659224
template <class T>
buildArray(ifstream input)
{
      T* Array = NULL;  
      int itemCount;      
      input >> itemCount;
      int arraySize = itemCount-1;    
      a = new T[arraySize];
      int arrayCount = 0;

      while(input >> x) {
        Array[arrayCount] = x;
        arrayCount++;
      }
}

Pasted from one of your earlier posts. This will not work for several reasons: you cannot pass ifstream by value (MUST be by reference; there is no public copy constructor for ifstreams); you called this function twice with the same stream (that will read the file the first time and then be at the end of the file when the second call BEGINS); you don't (as you realized above) pass any information back to main().

Let's address these in reverse order:

(1) Passing information back: what information do we generate in the function? The array and the size of the array. Multiple data items going back implies using reference parameters:

    template <class T>
    buildArray(ifstream input, T*& Array, int & arraySize)
    { ... }

You don't want to declare them as local variables anymore, either.

(2) Actually, we'll address both of the other problems by NOT passing an ifstream at all. Instead the function will open (and close) and ifstream internally and we will just provide the name of the file as the first parameter to the function:

    template <class T>
    bool buildArray(const string & inFName, T*& Array, int & arraySize)
    {
        ifstream input(inFName);
        if (!input) {
             arraySize = 0;
             return false;
        }
        input >> arraySize;
        Array = new T[arraySize];
        int arrayCount = 0;
        T x;
        while(input >> x) {
            Array[arrayCount] = x;
            ++arrayCount;
         }
        input.close();
        return true;
    } // BuildArray

   
0
 

Author Comment

by:killer455
ID: 9659229
Your saying this should be in the main? or in an output function?

    init_sort(); // initializes any counters and stuff like that (set compares to 0)
    intArrayType heapArray ;
    copy(heapArray, Array, arraySize);
    heapsort(heapArray, arraySize);
    saveArray(inData+".ihs", heapArray, arraySize);
    show_sort_count(); // displays the statistics
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9659234
The grouping of code is repeated over and over in main(), once for each kind of sort.
0
 

Author Comment

by:killer455
ID: 9659285
In some of your code above, what is:    
Dont understand where this is coming from?

intArrayType heapArray ;
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9659338
Assume there is a typedef for the intArrayType.

It is equivalent to int * in your work. I was just trying to make it easier for me to understand by assuming a typedef:

typedef int * intArrayType;

-bcl
0
 

Author Comment

by:killer455
ID: 9659401
I think im pretty lost.  I'll work on what you have said and then make another post later.  Thanks for the continued help.
0
 

Author Comment

by:killer455
ID: 9659562
Ok instead of posting the code here.
Its here: http://www.geocities.com/bjohnson045/program.txt

Just to show you how im coming along on the coding.
Thanks again for your help.
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9659593
Relax. Take a deep breath and think about what you want to do. If I understand you correctly you want to do the following:

Load an array of integers.
For each type of sort
    Convert an array of integers in the order of the original file to sorted order using the sort, counting comparisons
    Save the sorted array to a specially named file
Then do it with strings (I am ignoring that for the moment)

Okay. There are several "gotcha" moments in the above:
    Each sort should start with the SAME ELEMENTS in the SAME ORDER. This means repeatedly reading the file or sorting copies of the original array.
    The comparison count must be reset (if done using a global variable) between sorts
    The output file name needs to be known at the same time as there is a sorted array...either passed into sorts or handled in main()

Do these "high level" concerns make sense? The code I entered earlier, for heapsorting an array of integers, was designed to address these problems:

    init_sort();  // resets compares
    int* heapArray ;
    copy(heapArray, Array, arraySize); // copies the original data (you need to write this routine)
    heapsort(heapArray, arraySize);
    saveArray(inData+".ihs", heapArray, arraySize); // save the sorted contents of heapArray in the named file
    show_sort_count(); // displays the statistics

    init_sort(); // reset compares
    int* insertionArray ;
    copy(insertionArray, Array, arraySize);
    insertionsort(insertionArray, arraySize);
    saveArray(inData+".iis", insertionheapArray, arraySize); // save the sorted contents of insertionArray in the named file
    show_sort_count(); // displays the statistics

Hope this makes what I was trying to say clearer.

-bcl
0
 

Author Comment

by:killer455
ID: 9660702
Ok thats helps, updated here: http://www.geocities.com/bjohnson045/program.txt

Im a little unsure if I am passing everything right.
Also how with the sequence above differ with strings? if i should even ask that yet :)

0
 
LVL 11

Expert Comment

by:bcladd
ID: 9660724
(1) I'll take a look at the code after work (in a couple of hours).

(2) With templated functions the only thing that really changes with strings is the declaration of the "local" array. Of course you want to use different extensions for the names of the files and all of that, too, but the functions themselves will be instantiated and called by the compiler.

-bcl
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9661566
(1) Your comparison counter, comparisons, is static in init_sort and is initialized to zero. There are two problems here:
    (a) A variable declared inside of a function is only in scope within that function. Thus no other function can see comparisons so you wil not be able to increment it and you will not be able to print it out.
    (b) A static variable is intialized only once. No matter how many times you call init_sort, the value of comparisons will only be set to 0 once (unless the code itself sets it to something else).
(1.5) You have an extra ; at the end of the function declaration

(2) copy needs to take T*&, not T&[], at least for the first parameter. You also need to allocate space (just like in BuildArray) for the new array.

(3) missing parameter type in the declaration of saveArray.

There are a whole lot of additional errors when those are fixed. I think most of them come from comarisons being static and locally scoped. Try moving it out to file scope (if you want you can declare is static there, though static at the file scope is a linkage directive and doesn't mean exactly the same thing as it does inside of a function).

Make sure you declare your variables before using them (in main there are problems).

Let me know if you have any more questions.

-bcl
0
 

Author Comment

by:killer455
ID: 9662079
In regards to the static comparison count.  This variable should be global right?  I guess i thought static meant "global", am i wrong in this?  If not, should it be declared static in the main and then set to 0 inside the init() function??


Note: its not letting me increment points anymore? because its at 500 or something?
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9662576
(1) I know nothing about points. I would guess the max is 500 (since that is the highest I have seen on any single question). I think the standard is to post another question with the title "Points for bcladd" and then award them when I respond to that question. Haven't seen so much of that recently so I don't really know the etiquette.
 
(2)  You have aroused the lecturer within me. Be warned.

A variable declared outside of all functions in a C/C++ source file is a _global_ variable. This means that it is _in scope_ (visible) from the point of declaration to the end of the file.  This means that any function can refer to it by name. It also means that the variable is global to the _linker_ as well as the compiler and that if the variable is properly declared in another C/C++ source file (using the extern declaration modifier) it can even be used in other source files. To prevent such external linking a global variable declaration can be modifed to be static which means that it is not externally linkable. Since the variable is still global in scope in the current source file, nothing in the current file is changed by modifying a global variable to have static linkage.

Now, you knew most of that (and probably had little interest in the strange, fine points of C/C++ linkage). It is unfortunate but the term static is used when discussing the lifetime of variables as well as their linkage. You know what scope is, right? A given variable is only "visible" from some fraction of the program and the further nested in {} a declaration is, the smaller that fraction (typically) is. Have you ever considered the _duration_ of a given variable? That is, how long does the memory holding the value remain valid and contain the last value stored in it. In C/C++ there are three storage durations: _static_, _automatic_, and _allocated_.

Static storage duration means the storage is allocated by the program and last the duration of the execution. Variables declared outside of any block (global variables) and variables declared within a block using the static keyword have static duration. (Completely separate from the other use of static: global variables have static duration with or without static linkage.) Note that a variable declared inside a block has limited scope but "unlimited" lifetime.

Automatic storage duration applies to any variable declared inside a block without the static keyword and means that the variable's lifetime is approximately equal to the time when the enclosing block is executing. That is, variables declared within a function "come into being" when the function is entered and "go out of being" when the flow of control leaves the function's block.

Allocated storage duration applies to explicitly allocated storage (dynamically allocated storage is another way to say that; storage returned from new, malloc(), etc.) and means that the lifetime of the variable continues until it is explicitly released (delete, free()) or the program terminates.

Hope this helps.
-bcl
0
 

Author Comment

by:killer455
ID: 9666794
Ok. Well I updated the code again.
Another question:

I'm passing "Array" to buildArray in the main().  So this needs to be declared first. However, do i need to declare it twice for both buildArray <int> () and buildArray <string> ().  Such as:

int intArray[];
string strArray[];

buildArray <int> (inData, intArray, arraySize);
buildArray <string> (inData, strArray, arraySize);
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9666947
Declare two of them but declare them to be pointers, not []. Also, there should be no need to use buildArray<int>; the compiler should be able to determine the template types from the types of the parameters.

-bcl
0
 

Author Comment

by:killer455
ID: 9668736
Do I need to delete anything throughout my program or is it done automatically once execution of the program has ceased?
0
 

Author Comment

by:killer455
ID: 9668997
Also how am i suppose to pass the correct, LAST and MIDDLE pivot values to the two different quicksorts.  Because the array isnt constructed until the buildArray function.  Simply place them after this function call then correct?

0
 
LVL 11

Expert Comment

by:bcladd
ID: 9670263
(1) You SHOULD clean up after yourself. After printing the local copy of the sorted array you can call delete[] on the pointer. Or you can let the system recover the memory when your program finishes.

(2) You seem to have a problem with calling quicksort. You have 4 parameters in the formal parameter list but your recursive calls only pass 3 parameters. Why do you pass the pivot in? And why do you only pass one pivot in? Quicksort typically calculates the pivot from the input range (either taking the first element, taking a random element, or calculating some approximation of the median value in the range (there is an efficient way to pick the median of 5 values so 5 values are chosen from the range and the median of the 5 values is used as the pivot)). That way you only need to pass in 3 parameters and the pivot is calculated inside EACH recursive call.

-bcl
0
 

Author Comment

by:killer455
ID: 9671546
I'm wanting to calculate the pivot because I am doing two different versions of quicksort.  One which uses the last element as the pivot and one that used the middle element as the pivot.

0
 
LVL 11

Expert Comment

by:bcladd
ID: 9671921
But you don't want to calculate the pivot once; the pivot location will be different for each recursive call (whether it is the middle or last element). You may want to pass a flag to quicksort indicating HOW to calculate the pivot but each recursive call will have to calculate its own pivot index.,

-bcl
0
 

Author Comment

by:killer455
ID: 9672754
So the actual quicksort function needs to be recoded to calculate the pivot?

0
 

Author Comment

by:killer455
ID: 9672778
Might it be better to use a version of quicksort that uses the partition function?

template <class T>
void quickSort(T *a, int first, int last) {
   if(first < last) {
      int loc = partition(a, first, last);
      quickSort(a, first, loc - 1);
      quickSort(a, loc + 1, last);
   }
}

template <class T>
int partition(T *a, int first, int last) {
   int i      = first;
   int loc    = last + 1;
   T pivot    = a[first];

   while(i < loc) {
      do {
         ++i;
      } while((a[i] < pivot) && (i < last));
      do {
         --loc;
      } while(a[loc] > pivot);

      if(i < loc)
         swap(a[i], a[loc]);
   }
   swap(a[first], a[loc]);
   return(loc);
}
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9672821
Sure, use the partition function. Note that the pivot is ALWAYS the first element in the range in the version of partition you show.

If you want the pivot to differ, you will need to have some way of indicating the pivot to the partition function. That can be as a parameter (which would have to be passed in through quickSort) or a global flag (set in main() before each call to quickSort).

Only important if you really want to select different pivots on different calls to THE SAME function (yep, a third solution is to have quickSortFirst and quickSortMiddle (and partitionFirst and partitionMiddle) and call one or the other from main()).

-bcl
0
 

Author Comment

by:killer455
ID: 9673148
If i use this version that means i am going to pass the first and last values also.  I guess im unsure how I attain these values considering the array is built inside the buildArray function.
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9673170
You tell buildArray the size of array you want. In C++, all array indices begin with 0 so first (to the top-most call of quickSort) is 0, last is sizeOfArray-1 (the value passed back from buildArray).

-bcl
0
 

Author Comment

by:killer455
ID: 9673619
buildArray determines the size of the array on its own by taking in the first value from the file.  Do I have to declare arraySize=0; in main() and then pass this to buildArray, and then determine the first and last values, and then call the sorts with those values?

After this question ill post a new thread, this one is huge, plus i can do the points easier.
0
 
LVL 11

Expert Comment

by:bcladd
ID: 9673952
You pass in a reference to an integer. That means that you get the size of the array BACK from buildArray. So you know how big it is.

-bcl
0
 

Author Comment

by:killer455
ID: 9674599
0

Featured Post

What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

Join & Write a Comment

IntroductionThis article is the second in a three part article series on the Visual Studio 2008 Debugger.  It provides tips in setting and using breakpoints. If not familiar with this debugger, you can find a basic introduction in the EE article loc…
Update (December 2011): Since this article was published, the things have changed for good for Android native developers. The Sequoyah Project (http://www.eclipse.org/sequoyah/) automates most of the tasks discussed in this article. You can even fin…
THe viewer will learn how to use NetBeans IDE 8.0 for Windows to perform CRUD operations on a MySql database.
The viewer will learn how to pass data into a function in C++. This is one step further in using functions. Instead of only printing text onto the console, the function will be able to perform calculations with argumentents given by the user.

758 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

21 Experts available now in Live!

Get 1:1 Help Now