wills1300
asked on
Sorting a flat text file using MSVS C++ #1
What's the easiest way to get C++ to sort a text file that is delimited by a tilde ("~")? There are 15 columns of chars. These are regular text files of about 200+ MB. I want to sort by columns. Any ideas?
Here is a bubble sort routine I wrote some time ago in Turbo C++. You can convert it easily to any compiler. It also includes a test program, which simply reads the names, sorts them, and displays them in sorted order. The file is a bit messy, but I hope everything is OK.
Hope this helps...
==================
#include <cstring>
#include <iostream>
using namespace std;

#define num 3    // number of names to read
#define len 80   // maximum length of a name, including the terminating '\0'

void SwapNames(char *, char *);
void ShowNames(char [][len], const int);
void SortNames(char [][len], const int);

int main()
{
    char names[num][len];
    for (int i = 0; i < num; i++) {
        cout << "Please enter a name ==> ";
        cin.getline(names[i], len);   // bounded read, unlike gets()
    }
    cout << "\nUnsorted...\n\n";
    ShowNames(names, num);
    SortNames(names, num);
    cout << "\nSorted...\n\n";
    ShowNames(names, num);
    return 0;
}

// Bubble sort: keep sweeping the array, swapping neighbours that are
// out of order, until a full pass makes no swap.
void SortNames(char names[][len], const int size)
{
    int done;
    do {
        done = 1;
        for (int i = 0; i < size-1; i++) {
            if (strcmp(names[i], names[i+1]) > 0) {
                SwapNames(names[i], names[i+1]);
                done = 0;
            }
        }
    } while (!done);
}

void ShowNames(char names[][len], const int size)
{
    for (int i = 0; i < size; i++)
        cout << names[i] << endl;
}

// Exchange the contents of two name buffers (each len bytes long).
void SwapNames(char *a, char *b)
{
    char tmp[len];
    strcpy(tmp, a);
    strcpy(a, b);
    strcpy(b, tmp);
}
============
-Viktor
--Ivanov
Oh, you do the reading from the file and the rest yourself... I just wanted to show you how to do the sorting of the strings... you use strcmp()
-Viktor
NEVER use bubble sort. A decent external sort algorithm is MergeSort. Combined with in-memory sorting (e.g., quicksort), as Alex suggested, it will get you what you need.
ASKER
Bubble sort does not seem feasible because I need to have the data sorted by rows, and from the sample code I could not see an easy way to maintain the rows, unless I took the entire row as a string. But I need to sort by columns also. Basically, the data file I am working with is like a spreadsheet. The data can be thought of as a huge address book, where I have to sort first by last name, then by first name, then by address, and so on. I was hoping for anything that could do this. Some easy routine? What is qsort? MergeSort? Alexo and AlexVirochovsky, could you expand on your answers, please? (P.S. I am fairly new to C++ development.)
OK, this is not C++ but basic algorithm stuff.
In general you have two kinds of sort algorithms: internal sort (the data is small enough to fit in memory) and external sort (the data is too large to fit in memory).
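The best-known internal sort is Quicksort. It is generally considered the fastest general-purpose sort on average. In fact the C/C++ standard library includes a general-purpose sort function named after it (qsort), although the standard does not require it to be implemented as Quicksort.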
Mergesort is another algorithm that is suitable for both internal and external sorting. Its basic principle is taking 2 sorted sets (or files) and merging them into 1 sorted set (or file).
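To make the merge principle concrete, here is a minimal sketch of merging two already-sorted lists of strings into one. The function name mergeSorted is made up for illustration, and the lists are held in memory; with files you would read the next line from each file instead of indexing into a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Merge two already-sorted lists into one sorted list.
// mergeSorted is a hypothetical helper name used only for this sketch.
std::vector<std::string> mergeSorted(const std::vector<std::string>& a,
                                     const std::vector<std::string>& b)
{
    // The merge only works if both inputs are sorted to begin with.
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    std::vector<std::string> out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        // Take the smaller front element; on a tie, take from a first.
        if (b[j] < a[i]) out.push_back(b[j++]);
        else             out.push_back(a[i++]);
    }
    // One input is exhausted; copy whatever is left of the other.
    while (i < a.size()) out.push_back(a[i++]);
    while (j < b.size()) out.push_back(b[j++]);
    return out;
}
```

Each element is looked at once, so merging is linear in the total number of lines, which is what makes it practical for data that does not fit in memory.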
ASKER
OK, QSORT sounds good, but how do I use it? Is it a function call?
wills1300, here is an example of qsort:
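#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int sort_function(const void *a, const void *b);

/* Five strings of up to 3 characters each, plus the terminating '\0'. */
char list[5][4] = { "cat", "car", "cab", "cap", "can" };

int main(void)
{
    int x;

    /* Arguments: array, number of elements, size of one element, comparison function. */
    qsort((void *)list, 5, sizeof(list[0]), sort_function);
    for (x = 0; x < 5; x++)
        printf("%s\n", list[x]);
    return 0;
}

/* qsort passes pointers to two elements; here each element is a char[4]. */
int sort_function(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}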
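For the tilde-delimited rows in the question, the comparison has to look at columns rather than at the whole string. The following is only a sketch under assumptions: it uses std::sort-style sorting (std::stable_sort) with std::string rather than qsort, the names splitFields and sortByColumns are invented for illustration, and fields are compared as plain byte strings.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Split one line into its fields at each delimiter ('~' by default).
std::vector<std::string> splitFields(const std::string& line, char delim = '~')
{
    std::vector<std::string> fields;
    std::string::size_type start = 0, pos;
    while ((pos = line.find(delim, start)) != std::string::npos) {
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    fields.push_back(line.substr(start));  // text after the last delimiter
    return fields;
}

// Sort whole rows by the given column indices, in priority order.
// keys = {0, 1} means: by column 0, with ties broken by column 1.
void sortByColumns(std::vector<std::string>& lines,
                   const std::vector<std::size_t>& keys)
{
    typedef std::pair<std::vector<std::string>, std::string> Row;  // (fields, original line)

    // Split every line once up front instead of on every comparison.
    std::vector<Row> rows;
    for (const std::string& l : lines) rows.push_back(Row(splitFields(l), l));

    static const std::string empty;
    std::stable_sort(rows.begin(), rows.end(),
        [&keys](const Row& a, const Row& b) {
            for (std::size_t k : keys) {
                // A row with too few columns is treated as having an empty field.
                const std::string& fa = k < a.first.size() ? a.first[k] : empty;
                const std::string& fb = k < b.first.size() ? b.first[k] : empty;
                int c = fa.compare(fb);
                if (c != 0) return c < 0;
            }
            return false;  // equal on every key column
        });

    // Write the rows back in sorted order; each row stays intact.
    for (std::size_t i = 0; i < rows.size(); ++i) lines[i] = rows[i].second;
}
```

With an address book laid out as last name, first name, address, calling sortByColumns(lines, {0, 1, 2}) would sort by last name, then first name, then address. This only covers data that fits in memory.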
If you want the full text of the Sorting Large Files program,
you must:
1. Increase the points to 2500-300.
2. Accept the answer with an A grade.
BTW: the full code reads lines of text (you must change this part)
and has a sort function for Hebrew (you must change this part, too!).
Regards, Alex
ASKER
Yeah, sure Alex. Just put the full answer in 'answer form.'
ASKER CERTIFIED SOLUTION
>> (C) SHARLIN HAGAI, MOSHE SHAPIRO 43, NATANIA
I suggest you take this line seriously. People from Netania have an, er... "unconventional" way of enforcing copyright issues.
Oops, sorry, I deleted all the lines about copyright (there were many!)
and forgot this one. You can use it freely!
ASKER
OK Alex, I haven't tried the program yet, but I'll take your word for it. Thanks. Will.
1. Split the file into sorted parts:
read a part of the file (~60 KB), qsort it, save it in a temp file
read the next part, qsort it, save it in another temp file
...
until end of file.
2. Merge the temp files:
read one line from the 1st temp file
...
read one line from the last temp file
repeat until every temp file is exhausted
{
qsort the lines held in memory
save the smallest line in the result file
read the next line from the temp file that the saved line came from
}
There is a small problem with the number of files that can be open
simultaneously (the FILES parameter in config.sys),
but this can be solved by iterating:
merge 20 files, then 20 more, ..., then merge the results.
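The two steps above can be sketched roughly as follows. This is not the program from the accepted answer, just an illustration under assumptions: the names writeSortedRuns, mergeRuns, externalSort and tmpPrefix are made up, parts are sized by line count rather than ~60 KB, whole lines are compared as plain strings, and a std::priority_queue stands in for re-running qsort on the pending lines at every step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <fstream>
#include <functional>
#include <memory>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Step 1: cut the input into sorted parts of at most maxLines lines,
// each written to its own temp file. Returns the temp file names.
std::vector<std::string> writeSortedRuns(const std::string& inPath,
                                         std::size_t maxLines,
                                         const std::string& tmpPrefix)
{
    assert(maxLines > 0);
    std::ifstream in(inPath);
    std::vector<std::string> runNames, chunk;
    std::string line;
    while (in) {
        chunk.clear();
        while (chunk.size() < maxLines && std::getline(in, line))
            chunk.push_back(line);
        if (chunk.empty()) break;
        std::sort(chunk.begin(), chunk.end());
        std::string name = tmpPrefix + std::to_string(runNames.size()) + ".tmp";
        std::ofstream out(name);
        for (const std::string& s : chunk) out << s << '\n';
        runNames.push_back(name);
    }
    return runNames;
}

// Step 2: hold one pending line per temp file; repeatedly write the smallest
// pending line and read the next line from the file it came from.
void mergeRuns(const std::vector<std::string>& runNames, const std::string& outPath)
{
    std::vector<std::unique_ptr<std::ifstream>> runs;
    for (const std::string& n : runNames)
        runs.push_back(std::unique_ptr<std::ifstream>(new std::ifstream(n)));

    typedef std::pair<std::string, std::size_t> Item;  // (line, index of its temp file)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pending;  // smallest on top

    std::string line;
    for (std::size_t i = 0; i < runs.size(); ++i)
        if (std::getline(*runs[i], line)) pending.push(Item(line, i));

    std::ofstream out(outPath);
    while (!pending.empty()) {
        Item smallest = pending.top();
        pending.pop();
        out << smallest.first << '\n';
        if (std::getline(*runs[smallest.second], line))
            pending.push(Item(line, smallest.second));
    }
}

// Sort inPath into outPath, then delete the temp files.
void externalSort(const std::string& inPath, const std::string& outPath,
                  std::size_t maxLines, const std::string& tmpPrefix = "run_")
{
    std::vector<std::string> runNames = writeSortedRuns(inPath, maxLines, tmpPrefix);
    mergeRuns(runNames, outPath);
    for (const std::string& n : runNames) std::remove(n.c_str());
}
```

A call such as externalSort("data.txt", "sorted.txt", 100000) would sort 100000 lines at a time. Note that this sketch opens every temp file at once, so it does not handle the open-file limit mentioned above; merging in batches of 20 would need an extra loop around mergeRuns.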