07.06.2008 at 08:29AM PDT, ID: 23541731

Count fields within a field using AWK

Asked by derdle in Shell Scripting, IBM AIX Unix, Scripting Languages

I'm converting credit card data from one system to another. As the data sits today, the card holder name is a single column in my Sybase DB. To properly extract and format it for loading into the new system, I have to break the name into two parts (firstname, lastname). The problem is that, due to user creativity (LOL), the single field does not consistently contain just a first name and a last name: sometimes there are middle initials, and sometimes there are "notes" (e.g. do not use) in the field. So we decided to act on the data as follows:
 
If there is only one element in the field, we assume it to be a last name and want to return it as "VERIFY, Lastname" (where VERIFY would end up in the first name field on the new system).
If there are two elements to the field, we'll place the first into first name and second into last name.
If there are three elements, we want to check the length of the second. If the length of the second is 1, we'll assume a middle initial (e.g. "J"), and we'll place element one into the first name and element three into the last name (thus leaving the middle initial, element two, behind).
If there are more than three elements in the field, or there are three elements and element two has a length greater than 1, then we want to place the first into first name, and all the remaining into the last name field.

I think I am very close, and I can get this to work if the card holder name is the only field in the input file (since then NF applies to the whole line as well as to the field to be modified, because they are one and the same). The problem is that this field is the fourth field in the delimited record, so I need to be able to count the number of elements within that fourth field in order to act on it as described above.

Currently, I am trying to accomplish this via the following (but I am open to other suggestions if there are easier ways):

#! /bin/awk -f

BEGIN { FS = "|" }

{
    # getline reads the next record, so the record that triggered this rule is never printed
    while ((getline) > 0)
        print $1 "," $2 "," $3 "," chk_data($4) "," $5 "," $6 "," $7 "," $8 "," $9 "," $10
}

function chk_data(input, result, words, n, i, r)
{
    n = split(input, words, " ")
    i = NF    # NF is the field count of the whole record, not of input
    if (i > 3) {
        for (r = 2; r <= i; r++) {
            result = result " " words[r]
        }
        result = words[1] "," result
        return result
    }
    if (i == 3 && length(words[2]) == 1) {
        result = words[1] "," words[3]
        return result
    }
    if (i == 3 && length(words[2]) > 1) {
        for (r = 2; r <= i; r++) {
            result = result " " words[r]
        }
        result = words[1] "," result
        return result
    }
    if (i == 2) {
        result = words[1] "," words[2]
        return result
    }
    if (i == 1) {
        result = "VERIFY," words[1]
        return result
    }
}
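
For reference, split() itself returns the number of pieces it stored in the array, so the word count of a single field is available without NF. A minimal sketch, assuming the same pipe-delimited layout; the wrapper name count_name_words is hypothetical and only there to make the snippet callable:

```shell
# Hypothetical helper: print the number of space-separated words found in
# field 4 of each pipe-delimited record read from stdin.
count_name_words() {
    awk -F'|' '{
        # split() returns how many pieces it stored in the array.
        # A single-space separator splits on runs of blanks and ignores
        # leading and trailing blanks, so the padding in $4 is harmless.
        n = split($4, words, " ")
        print n
    }'
}
```

Feeding it a record whose fourth field is "JOHN X DOE" followed by padding should report three words, independent of how many fields the record has.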

Examples of each scenario:

File being read in is pipe delimited.  Final file needed is comma delimited.

READ FROM FILE:
304|VISACREDIT|4094123456789012|DOE        |9|2000|N|||250.00
  Expected Out---> 304,VISACREDIT,4094123456789012,VERIFY,DOE,9,2000,N,,,250.00

304|VISACREDIT|4094123456789012|JOHN DOE        |9|2000|N|||250.00
  Expected Out---> 304,VISACREDIT,4094123456789012,JOHN,DOE,9,2000,N,,,250.00

304|VISACREDIT|4094123456789012|JOHN X DOE        |9|2000|N|||250.00
  Expected Out---> 304,VISACREDIT,4094123456789012,JOHN,DOE,9,2000,N,,,250.00

304|VISACREDIT|4094123456789012|JOHN EDWARD DOE        |9|2000|N|||250.00
  Expected Out---> 304,VISACREDIT,4094123456789012,JOHN,EDWARD DOE,9,2000,N,,,250.00

304|VISACREDIT|4094123456789012|DO NOT USE THIS CARD       |9|2000|N|||250.00
  Expected Out---> 304,VISACREDIT,4094123456789012,DO,NOT USE THIS CARD,9,2000,N,,,250.00

The problem I am having in getting these results is that NF, even inside the function (which is passed $4), still refers to the total number of fields in the entire line, not just those within field #4. Within the function I need to know the count of elements in the $4 that was passed in.
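
One possible way to get that count is to use the return value of split() in place of NF. The sketch below is an illustration under stated assumptions, not necessarily the accepted solution: the names convert_cardholders and split_name are hypothetical, an empty name returning "VERIFY," is an assumption the rules above do not cover, and the getline loop is replaced by an ordinary per-record rule so that the first line is processed too. It also prints every field of the record rather than only the first ten.

```shell
# Hypothetical wrapper: read pipe-delimited records on stdin and write
# comma-delimited records with field 4 split into first name, last name.
convert_cardholders() {
    awk '
    BEGIN { FS = "|"; OFS = "," }

    # Return "first,last" for one card holder name.
    # words, n, i and last are local variables (extra parameters).
    function split_name(name,    words, n, i, last) {
        n = split(name, words, " ")    # word count of this field only
        if (n == 0) return "VERIFY,"   # assumption: empty name, not in the stated rules
        if (n == 1) return "VERIFY," words[1]
        if (n == 3 && length(words[2]) == 1)
            return words[1] "," words[3]    # drop the middle initial
        # Two words, three words with a longer second word, or more than
        # three words: the first word is the first name, the rest is the
        # last name.
        last = words[2]
        for (i = 3; i <= n; i++)
            last = last " " words[i]
        return words[1] "," last
    }

    {
        $4 = split_name($4)    # assigning to a field rebuilds $0 using OFS
        print
    }
    '
}
```

Usage would be something like convert_cardholders < input_file > output_file, where both file names are placeholders. Because local variables in awk are declared as extra function parameters, words, n, i and last do not leak between calls.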

Hopefully this makes sense. =)

Any and all help is appreciated.
 
Loading Advertisement...
 
07.06.2008 at 02:27PM PDT, ID: 21941696

About this solution

Zones: Shell Scripting, IBM AIX Unix, Scripting Languages
Tags: AWK
Solution Provided By: ahoffmann
Participating Experts: 2
Solution Grade: A