Why Experts Exchange?

Experts Exchange always has the answer, or at the least points me in the correct direction! It is like having another employee that is extremely experienced.

Jim Murphy
Programmer at Smart IT Solutions

When asked, what has been your best career decision?

Deciding to stick with EE.

Mohamed Asif
Technical Department Head

Being involved with EE helped me to grow personally and professionally.

Carl Webster
CTP, Sr Infrastructure Consultant
Ask ANY Question

Connect with Certified Experts to gain insight and support on specific technology challenges including:

Troubleshooting
Research
Professional Opinions
Ask a Question
Did You Know?

We've partnered with two important charities to provide clean water and computer science education to those who need it most. READ MORE

troubleshooting Question

AWK: Pytthagoras bp script Part REVISITED

Avatar of Anthony Mellor
Anthony MellorFlag for United Kingdom of Great Britain and Northern Ireland asked on
Shell Scripting* AWK
3 Comments1 Solution146 ViewsLast Modified:
edit to add: the attached files are not extracts, they are the actual files I used to produce the output file "out.txt"
That said, the acc.csv is a very small sample of the main file.
 

This is the unconstrained script.
The input files sites.csv is being appended to the end of the output file

Code is here with line 18 fixed (I had a strange double quote):

# ACC.csv
#CODE,ACC_EC,ACC_NC,Long,Lat,POL,Severity,No_Vehicles,No_Casualties,Date,DayofWeek,Time,LA_District,LA_Highway,1st_Road_Class,1st_Road_Number,Road_Type,LIMIT,Junction_Detail,Junction_Control,2nd_Road_Class,2nd_Road_Number,Zebra-Human,Zebra_Machine,Visibility,Weather_Conditions,Surface,Special_Conditions,Road_Hazards,Urban_Rural,Police_Attended,LSOAofLocation,ACCREF,Month,Minute,ACCDAY,Hour,YEAR,WHEN,ACC_EC5,ACC_NC5
#   a = ACC_EC ($2)
#   b = ACC_NC ($3)
#
# SITES.csv
#  OLD  SORT,POL,REF,CAM ETG,CAM NTG,INSTALL,MONTH,WHEN,TYPE,CAM_ETGkm,CAM_NTGkm
# SORT,POL,REF,SITE_ ETG,SITE_NTG,INSTALL,MONTH,WHEN,TYPE,BEFORE,AFTER
#   r = REF ($3)
#   c = SITE_ ETG ($4)
#   d = SITE_NTG ($5)

# One time logic before records are read from input file
BEGIN {
   # Change field separator from space to comma for our file
   FS = ","
   file2 ="SITES.CSV"
}

# Main processing loop handling each line of the file
{
   # Check if first line (header)
   if (NR == 1) {

      # If this is header just write it to output adding new column headers
      print $0 ",REF,CAM ETG,CAM NTG,Distance"

   } else {

      # skip blank lines
      if ($0 != "") {

         # Save input record from file1
         source = $0

         # Save needed fields for calcs below, remove double quotes if found DOUBLE QUOTES PREVIOUSLY STRIPPED FROM FILE
         a = $2         # ACC_EC 
         b = $3         # ACC_NC
         gsub("\"", "", a)
         gsub("\"", "", b)

         Bestx = 999999999
         # Add new columns at the end
         while ((getline < file2 ) > 0 ) {
            if ($1 != "SORT") {
               r = $3
               c = $4      # SITE_ETG
               d = $5      # SITE_NTG
               x = sqrt(((a-c)^2)+((b-d)^2))
               if (x < Bestx) {
                  Bestx = x
                  Bestr = r
                  Bestc = c
                  Bestd = d
               }
            }
         }
         printf("%s,%s,%s,%s,%f\n", source, Bestr, Bestc, Bestd, Bestx)
         close(file2)
      }
   }
}

I ran the above script like this:

MacPro:5-CombineBOTH ADM$ awk -f unlimited.awk ACC.CSV sites.csv > out.txt

The attached out.txt file shows the output, which (erroneously) includes the entire sites.csv file and some odd headers right at the end.

(Reconstruction is complete) 01:36
acc.csv
Sites.csv
out.txt