Solved

Need a powerful multi-purpose sort utility

Posted on 2013-05-13
12
453 Views
Last Modified: 2013-09-07
Hi gurus, I currently call a 3rd party sort utility within my Delphi app called "Opt-tech Sort". It's very good and does *almost* everything I need, including:

- handling files of unlimited size
- has a facility to Include/Omit records based on user-specified criteria
- has a facility to modify/rearrange the format of the output record any way you like with regards to field order.

The only drawback to this, and it's a big one for me, is that Opt-tech cannot handle fields from a CSV file that have multiple quotes (""") back to back like that. For instance, here is a legit field from a CSV file according to CSV convention:

,"""+971502347269"" <0971502347269>",

Opt-Tech sort will see those first two back-to-back quotes and then disregard the whole rest of the field - thinking it has encountered an "empty" field with back-to-back quotes. This situation, naturally, will have disastrous results in my application. Does anyone know of a sort utility (commercial or free) that will handle the above case and tick all my boxes?

Thanks!
    Shawn
0
Comment
Question by:shawn857
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
12 Comments
 
LVL 27

Expert Comment

by:Sinisa Vuk
ID: 39163188
Try to use kbmMemTable + kbmMemCSVStreamFormat component. There should be working this and it is quite powerful.
Go to http://www.torry.net/pages.php?id=572 and find TkbmMem Table v.5.52.
0
 

Author Comment

by:shawn857
ID: 39163229
Thanks Sinisav... this component will be able to handle and sort a file if many gigabytes? I don't see how it can, if it has to bring all data into memory...

Thanks
    Shawn
0
 
LVL 45

Accepted Solution

by:
aikimark earned 500 total points
ID: 39166209
Have you looked at Syncsort?

Work-arounds
* you might use your Delphi application as a pre-processor for the Opt-Tech code, calling in Record mode, rather than File mode.  Your Delphi code could remove the doubled quotes that hang-up the sorting code (replace three consecutive quote characters with a single quote character).

* Similar to above, except you read/write each record before calling the Opt-Tech code in File mode as you currently do.

* Ask the Opt-Tech folks if they support exit routines.  If so, you might create a small program that cleans up the record before Opt-Tech code begins its processing.

* Look at PowerShell

* It might be possible to use ADO with a defining schema.ini file to accomplish the filtering, sorting, column reordering.

===========
You might use TSMImportFromText library to treat the file as a data source and use the data routines to perform the functions of the Opt-Tech code.
http://www.scalabium.com/smi/

Co-sort
http://www.iri.com/solutions/Data_Transformation/Sort_Merge

Sort Solution
http://www.mwlabs.de/psortsol.htm

If you have SAS on your system, you might use its Proc Sort
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 45

Expert Comment

by:aikimark
ID: 39166389
If you have a SQL server instance, you might kick off an SSIS task to import the data and then save the sorted/massaged data.
0
 

Author Comment

by:shawn857
ID: 39175979
Hi Aikimark, thanks for all your suggestions... here's my thoughts:


Have you looked at Syncsort?

>> Mega-expensive!! Same with Co-sort


Work-arounds
* you might use your Delphi application as a pre-processor for the Opt-Tech code, calling in Record mode, rather than File mode.  Your Delphi code could remove the doubled quotes that hang-up the sorting code (replace three consecutive quote characters with a single quote character).

* Similar to above, except you read/write each record before calling the Opt-Tech code in File mode as you currently do.


>> Do-able... but this is kind of my last resort as this would require 3 passes through the >>data instead of just one. Speed of processing is going to be a big selling point for my >>application and this would pretty much defeat that.


* Ask the Opt-Tech folks if they support exit routines.  If so, you might create a small program that cleans up the record before Opt-Tech code begins its processing.


>> Again, this would require 3 complete passes through the data.


* Look at PowerShell


>> I had a brief look at it, embarrassed to say I had never heard of it before! it seems like it would be similar to calling a DOS routine from my application? Not so sure if I like this idea...


* It might be possible to use ADO with a defining schema.ini file to accomplish the filtering, sorting, column reordering.

>> Not familiar at all with ADO Mark... what is it?


You might use TSMImportFromText library to treat the file as a data source and use the data routines to perform the functions of the Opt-Tech code.
http://www.scalabium.com/smi/


>> Would I be able to import - into memory - files of many many gigabytes??? Doesn't sound feasible, or possible.


Co-sort
http://www.iri.com/solutions/Data_Transformation/Sort_Merge

Sort Solution
http://www.mwlabs.de/psortsol.htm


>> I tried Sort Solution in the past, it doesn't have the features I require.


If you have SAS on your system, you might use its Proc Sort


>> Once again, I have to say I'm not familiar with SAS Mark. What does that stand for?


If you have a SQL server instance, you might kick off an SSIS task to import the data and then save the sorted/massaged data.

>> I know nothing about SQL, sounds like it would be a whole new can of works... plus can >>it handle gigs and gigs of data in one shot?
>> The way I look at it is I have 2 basic alternatives: (1) Find another sort routine that will do everything I need, or (2) Stick with Opt-tech sort and if I encounter the problematic double-quote situation, then do the pre and post processing as you recommend.
  I have sent out emails to a few other sort utility firms - Ahlsort (but they seem expensive too, and are not royalty-free for distribution), and XSM sort (www.hhns.fr). I'm still waiting to hear back from the XSM sort people.

Thanks
    Shawn
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39176229
What are your budgetary constraints?  You didn't mention that money was a consideration in your question text.

>>3 passes
If you can use record mode, then it only requires two passes.

>>speed of processing as a selling point
Then you had better describe what you are doing in more detail.  What kind of data is in these files?  What does your application do?

>>Again, (exit routines) would require 3 passes
No.  Opt-Tech is doing the same reading/writing, but would pass the record to your code (in memory) for processing.

>>PS -- similar to calling a DOS routine
How are you calling Opt-Tech sort now?
What don't you like about this idea?  Requirements and problem constraint details should be supplied to experts.

>>ADO
ActiveX Data Object -- It is a (DLL) library of Microsoft database components moved forward from the Jet/Access database.

>>files of many gigabytes?
Your processing space in a 32-bit environment is limited to 2GB.  I thought you might use the import to roll-your-own external sort routine, processing chunks of the huge file into sorted temp files that are then merged.

>>SAS
Statistical Analysis System from a company named SAS in Cary, NC

>>SQL server
Yes.  It is one of the large scale relational databases.
0
 

Author Comment

by:shawn857
ID: 39232008
Hi Aikimark... sorry for the long absence. To reply to your comments:


>What are your budgetary constraints?  You didn't mention that money was a >consideration in your question text.


Well, ideally I'd like to find a free solution, but if I had to spend, I would say a few hundred dollars.



>>3 passes
>If you can use record mode, then it only requires two passes.

>>speed of processing as a selling point
>Then you had better describe what you are doing in more detail.  What kind of data is in >these files?  What does your application do?



My data files are basically CSV files containing information about VOIP telephone calls. My app reads these files and prepares reports. A typical record might look like this:

10548509,9.71502E+11,4.43452E+11,routing,"""+971502347269"" <0971502347269>",SIP/217.168.44.132-000907e4,Local/10548509@voice-6ed4;1,Dial,"Local/10548509@voice,45,gL(3599000)",10/5/2012 11:25,10/5/2012 11:25,10/5/2012 11:25,5,5,ANSWERED,1349436313



>>Again, (exit routines) would require 3 passes
>No.  Opt-Tech is doing the same reading/writing, but would pass the record to your code >(in memory) for processing.



This could be the best option. I am currently just now discovering and investigating the "Record Sort" option of OptTech
sort and have emailed the author about it - it could be the answer for me. I've tried finding an alternative sort routine
and it doesn't look like one exists that will do all I need. I've talked with the people at Co-Sort, Ahlsort, XSM Sort, etc.
The answer *could* be right under my nose woth the Record Sort option of Opttech sort.



>>PS -- similar to calling a DOS routine
>How are you calling Opt-Tech sort now?


Programmatically from my Delphi app via DLL... nice and seamless.



>What don't you like about this idea?  Requirements and problem constraint details >should be supplied to experts.


Well, some years ago I made an app that called a DOS-based sort called "RPsort". It was pretty cumbersome and "ugly" - a big
black DOS window would appear on top of my app when the sort was being called. Not very seamless.


Will update you when I learn more about the OptTech "Record Sort" option... thanks!


Cheers
  Shawn
0
 
LVL 37

Expert Comment

by:Geert Gruwez
ID: 39472188
I've requested that this question be deleted for the following reason:

The question has either no comments or not enough useful information to be called an "answer".
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39472225
@shawn857

Thanks for the points.  What utility/software/configuration did you finally use to solve your problem?
0
 

Author Comment

by:shawn857
ID: 39473784
Hi Aikimark, I stuck with my Opt-Tech sort and decided to write a pre-processing routine in my app to check and modify problematic records.

Thanks!
   Shawn
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Working with hours 3 86
Find and Replace Stream with 0s 8 86
Tidtcpserver listening on multiports? 1 54
How to Get Images From Server using App Tethering 11 51
In this tutorial I will show you how to use the Windows Speech API in Delphi. I will only cover basic functions such as text to speech and controlling the speed of the speech. SAPI Installation First you need to install the SAPI type library, th…
Creating an auto free TStringList The TStringList is a basic and frequently used object in Delphi. On many occasions, you may want to create a temporary list, process some items in the list and be done with the list. In such cases, you have to…
Attackers love to prey on accounts that have privileges. Reducing privileged accounts and protecting privileged accounts therefore is paramount. Users, groups, and service accounts need to be protected to help protect the entire Active Directory …

739 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question