Tariff Type (Call Class) Classification Technique

I have to write a program to modify a payphone call class recorded in call detail records. It is about 60,000 records a day.
So I think performance is important. I have to parse a log file using TjvCsvDataSet (JEDI) and create CSV file ready for MS SQL bulk insert. The original call class are:-
1 - Local Call
2 - STD (National Call)
4 - Bordering Call (Nearby Countries)
5 - Mobile Call

Payphone System is sometime classify incorrect class like 0,3,6 call class. I have to correct it. As the same time, I have to do a deeper classification for class 2.
  21 - Other call (begin with Non zero digit except 1234 )
  22 - STD, to fixed line (begin with these 02, 032, 032, 034, 035, 036, 037, 053, 054, 056, 074, 075, 076, 077
  222- STD, VoIP to fixed line (begin with 1234+prefix in 22 class)
  225 - STD, VoIP to mobile (begin with 1234 + prefix of 01, 06, 09, 033, 03, 055, 057, 071, 078, 079) all prefixes which are not for fixed line will be assigned to Mobile.

  4- Bordering call start with 007+Country Code+Destination number

I know that to get a best performance I have to reduce a number of checking. May be I have to sort these digit in length and start comparation from one digit .... if found then I look for a call class. As I have to copy substring from Dialed Number in Call Detail Record, Any Idea? For example

Dialed number = '1234015558950'  (original call class = 2, new = 225)

Normally Dialed number length = 9, Length <= 9 should be 21, > 9 should be 222, 225 or 4

If any one know the technique used in payphone or exchange please recommend. I want to know algorithm for best speed not size of program. Only suggestion or Idea is ok.
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

soapsiamAuthor Commented:
In telecommunication term, it is called as digit analysis. When you dial destination telephone number, an exchange should analyze digits and route to suitable route and classify that call.
As it seems, you need only the first 7 digits to analyze the class (this is the worst case - 1234 + 3 digits code). Right? So the first thing is to cut the first 7 digits.

Then you have to start checking for the most frequent activity. For example if you have 70% of local calls, start checking for local codes first. At the end should be the most rare activity. Thus you will process each record better.

If I had to do it, I would prepare the file for bulk insert in TEMPORARY tables and process everything inside the database using stored procedures. If the database is running on Dual Xeon machine ... the speed will be increased (if the stored procs are OK).
soapsiamAuthor Commented:

I think then I should check for length of dialed number first, then checking based on length of possibility
9 digit -> Local Call, STD to fixed line (22) and Mobile call shall be check first.
13 digit -> STD for VoIP to fixed Line or to Mobile should be check first, then Bordering call should be check later.

I will choose your approach for one of candidate in profiler.

You would get some points.

soapsiamAuthor Commented:
One more thing Ivanov_G,

My front end PC is One Xeon with 1GB RAM (Compaq) and Server is 2 Xeon CPU with 2 GB RAM (SQL Server).
After midnigth my program should prepare the text files for bulk insert to SQL server. Now, I have 7 clients and one server.

-> invalid Chip Card cdr file for insert in InvalidChipCardCDRs for manual processing.
-> valid Chip Card cdr file for insert in ValidChipCardCDRs
-> invalid Coin Card cdr file for insert in InvalidCoinCDRs for manual processing.
-> valid Coin Card cdr file for insert in ValidCoinCDRs

All vailid cdr should be re-classify call class as the rule above. (or if possible for invalid cdr like to mobile, VOIP and bordering call).
About checking the lenght first - yes, you are right, I think...

The previous company I worked for was developing billing system for telecoms, so I can tell you how it processed the records:
The CDRs are copied to a specified folder and "Mediation" scripts insert them in the database. I speak about 14 billable events every day. The CDRs are inserted in temporary tables. Once they are inside the database, the Rating module takes the data and process it, depending the price plan, time zone, destination number, etc.

So I really think your logic should be inside the database. This way you can have tables with dial codes, etc. Beside that, it is obvious that your DB server is better than the front end PC, so the processing time will be less. T-SQL have rich syntax according to non-procedural programming, so all these checks can be made easily.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.