Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 513
  • Last Modified:

Need a powerful multi-purpose sort utility

Hi gurus, I currently call a 3rd party sort utility within my Delphi app called "Opt-tech Sort". It's very good and does *almost* everything I need, including:

- handling files of unlimited size
- has a facility to Include/Omit records based on user-specified criteria
- has a facility to modify/rearrange the format of the output record any way you like with regards to field order.

The only drawback to this, and it's a big one for me, is that Opt-tech cannot handle fields from a CSV file that have multiple quotes (""") back to back like that. For instance, here is a legit field from a CSV file according to CSV convention:

,"""+971502347269"" <0971502347269>",

Opt-Tech sort will see those first two back-to-back quotes and then disregard the whole rest of the field - thinking it has encountered an "empty" field with back-to-back quotes. This situation, naturally, will have disastrous results in my application. Does anyone know of a sort utility (commercial or free) that will handle the above case and tick all my boxes?

Thanks!
    Shawn
0
shawn857
Asked:
shawn857
1 Solution
 
Sinisa VukCommented:
Try to use kbmMemTable + kbmMemCSVStreamFormat component. There should be working this and it is quite powerful.
Go to http://www.torry.net/pages.php?id=572 and find TkbmMem Table v.5.52.
0
 
shawn857Author Commented:
Thanks Sinisav... this component will be able to handle and sort a file if many gigabytes? I don't see how it can, if it has to bring all data into memory...

Thanks
    Shawn
0
 
aikimarkCommented:
Have you looked at Syncsort?

Work-arounds
* you might use your Delphi application as a pre-processor for the Opt-Tech code, calling in Record mode, rather than File mode.  Your Delphi code could remove the doubled quotes that hang-up the sorting code (replace three consecutive quote characters with a single quote character).

* Similar to above, except you read/write each record before calling the Opt-Tech code in File mode as you currently do.

* Ask the Opt-Tech folks if they support exit routines.  If so, you might create a small program that cleans up the record before Opt-Tech code begins its processing.

* Look at PowerShell

* It might be possible to use ADO with a defining schema.ini file to accomplish the filtering, sorting, column reordering.

===========
You might use TSMImportFromText library to treat the file as a data source and use the data routines to perform the functions of the Opt-Tech code.
http://www.scalabium.com/smi/

Co-sort
http://www.iri.com/solutions/Data_Transformation/Sort_Merge

Sort Solution
http://www.mwlabs.de/psortsol.htm

If you have SAS on your system, you might use its Proc Sort
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
aikimarkCommented:
If you have a SQL server instance, you might kick off an SSIS task to import the data and then save the sorted/massaged data.
0
 
shawn857Author Commented:
Hi Aikimark, thanks for all your suggestions... here's my thoughts:


Have you looked at Syncsort?

>> Mega-expensive!! Same with Co-sort


Work-arounds
* you might use your Delphi application as a pre-processor for the Opt-Tech code, calling in Record mode, rather than File mode.  Your Delphi code could remove the doubled quotes that hang-up the sorting code (replace three consecutive quote characters with a single quote character).

* Similar to above, except you read/write each record before calling the Opt-Tech code in File mode as you currently do.


>> Do-able... but this is kind of my last resort as this would require 3 passes through the >>data instead of just one. Speed of processing is going to be a big selling point for my >>application and this would pretty much defeat that.


* Ask the Opt-Tech folks if they support exit routines.  If so, you might create a small program that cleans up the record before Opt-Tech code begins its processing.


>> Again, this would require 3 complete passes through the data.


* Look at PowerShell


>> I had a brief look at it, embarrassed to say I had never heard of it before! it seems like it would be similar to calling a DOS routine from my application? Not so sure if I like this idea...


* It might be possible to use ADO with a defining schema.ini file to accomplish the filtering, sorting, column reordering.

>> Not familiar at all with ADO Mark... what is it?


You might use TSMImportFromText library to treat the file as a data source and use the data routines to perform the functions of the Opt-Tech code.
http://www.scalabium.com/smi/


>> Would I be able to import - into memory - files of many many gigabytes??? Doesn't sound feasible, or possible.


Co-sort
http://www.iri.com/solutions/Data_Transformation/Sort_Merge

Sort Solution
http://www.mwlabs.de/psortsol.htm


>> I tried Sort Solution in the past, it doesn't have the features I require.


If you have SAS on your system, you might use its Proc Sort


>> Once again, I have to say I'm not familiar with SAS Mark. What does that stand for?


If you have a SQL server instance, you might kick off an SSIS task to import the data and then save the sorted/massaged data.

>> I know nothing about SQL, sounds like it would be a whole new can of works... plus can >>it handle gigs and gigs of data in one shot?
>> The way I look at it is I have 2 basic alternatives: (1) Find another sort routine that will do everything I need, or (2) Stick with Opt-tech sort and if I encounter the problematic double-quote situation, then do the pre and post processing as you recommend.
  I have sent out emails to a few other sort utility firms - Ahlsort (but they seem expensive too, and are not royalty-free for distribution), and XSM sort (www.hhns.fr). I'm still waiting to hear back from the XSM sort people.

Thanks
    Shawn
0
 
aikimarkCommented:
What are your budgetary constraints?  You didn't mention that money was a consideration in your question text.

>>3 passes
If you can use record mode, then it only requires two passes.

>>speed of processing as a selling point
Then you had better describe what you are doing in more detail.  What kind of data is in these files?  What does your application do?

>>Again, (exit routines) would require 3 passes
No.  Opt-Tech is doing the same reading/writing, but would pass the record to your code (in memory) for processing.

>>PS -- similar to calling a DOS routine
How are you calling Opt-Tech sort now?
What don't you like about this idea?  Requirements and problem constraint details should be supplied to experts.

>>ADO
ActiveX Data Object -- It is a (DLL) library of Microsoft database components moved forward from the Jet/Access database.

>>files of many gigabytes?
Your processing space in a 32-bit environment is limited to 2GB.  I thought you might use the import to roll-your-own external sort routine, processing chunks of the huge file into sorted temp files that are then merged.

>>SAS
Statistical Analysis System from a company named SAS in Cary, NC

>>SQL server
Yes.  It is one of the large scale relational databases.
0
 
shawn857Author Commented:
Hi Aikimark... sorry for the long absence. To reply to your comments:


>What are your budgetary constraints?  You didn't mention that money was a >consideration in your question text.


Well, ideally I'd like to find a free solution, but if I had to spend, I would say a few hundred dollars.



>>3 passes
>If you can use record mode, then it only requires two passes.

>>speed of processing as a selling point
>Then you had better describe what you are doing in more detail.  What kind of data is in >these files?  What does your application do?



My data files are basically CSV files containing information about VOIP telephone calls. My app reads these files and prepares reports. A typical record might look like this:

10548509,9.71502E+11,4.43452E+11,routing,"""+971502347269"" <0971502347269>",SIP/217.168.44.132-000907e4,Local/10548509@voice-6ed4;1,Dial,"Local/10548509@voice,45,gL(3599000)",10/5/2012 11:25,10/5/2012 11:25,10/5/2012 11:25,5,5,ANSWERED,1349436313



>>Again, (exit routines) would require 3 passes
>No.  Opt-Tech is doing the same reading/writing, but would pass the record to your code >(in memory) for processing.



This could be the best option. I am currently just now discovering and investigating the "Record Sort" option of OptTech
sort and have emailed the author about it - it could be the answer for me. I've tried finding an alternative sort routine
and it doesn't look like one exists that will do all I need. I've talked with the people at Co-Sort, Ahlsort, XSM Sort, etc.
The answer *could* be right under my nose woth the Record Sort option of Opttech sort.



>>PS -- similar to calling a DOS routine
>How are you calling Opt-Tech sort now?


Programmatically from my Delphi app via DLL... nice and seamless.



>What don't you like about this idea?  Requirements and problem constraint details >should be supplied to experts.


Well, some years ago I made an app that called a DOS-based sort called "RPsort". It was pretty cumbersome and "ugly" - a big
black DOS window would appear on top of my app when the sort was being called. Not very seamless.


Will update you when I learn more about the OptTech "Record Sort" option... thanks!


Cheers
  Shawn
0
 
Geert GruwezOracle dbaCommented:
I've requested that this question be deleted for the following reason:

The question has either no comments or not enough useful information to be called an "answer".
0
 
aikimarkCommented:
@shawn857

Thanks for the points.  What utility/software/configuration did you finally use to solve your problem?
0
 
shawn857Author Commented:
Hi Aikimark, I stuck with my Opt-Tech sort and decided to write a pre-processing routine in my app to check and modify problematic records.

Thanks!
   Shawn
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now