Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 369
  • Last Modified:

Perform Group By Calculation Using Shell Script

I need help in developing a script.
A flat file count.dat is created everyday through some processes
 and it contains values like this--

NAME         DEPT      CITY       COUNT

SCOTT        MRKT        NEW YORK       100
SCOTT        MRKT        NEW JERS       150
TIGER        ACCT        ATLANTA        200
PETER        IS        ATLANTA        100
MIKE        HR        NEW YORK        50
SAM        HR        PORT           300
SAM        IS        PORT           200
SCOTT        MRKT        PORT           100




I need a Shell script that reads this file and create a new file name CALCL.DAT after
performing following calculations
 
SUM(COUNT) group by NAME,DEPT,CITY,COUNT.

Your help is higly appreciated.
Thanks
srn




0
srpendyala
Asked:
srpendyala
  • 3
  • 2
1 Solution
 
Kent OlsenData Warehouse Architect / DBACommented:

A couple of things to think about.

*  Some cities are one word, some two.  This means you may have 4 parameters per line or 5.  You should start by changing all city names to single words, hence all lines will have 4 parameters.

*  This is a bad example.  All lines are unique.  I'm assuming that that isn't true.

*  There is no "clean" way to do this that I'm aware of.  Still, it can be done.

Here's a script that will print a total line every time one of the keys changes.  It assumes that the file is already sorted as you want it.

#!/bin/ksh

export SUM=0
export NAME=''
export DEPT=''
export CITY=''

PrintSum()
{
  print "Totals:  $NAME   $DEPT   $CITY  $SUM"
}

SumByCity()
{
  if [ "$CITY" = "$1" ]; then
    export SUM=$(($SUM+$2))
  else
    PrintSum
    CITY=$1
    SUM=$2
  fi
}

SumByDept()
{
  if [ "$DEPT" = "$1" ]; then
    SumByCity $2 $3
  else
    PrintSum
    export DEPT="$1"
    export CITY="$2"
    export SUM="$3"
  fi
}

SumByName()
{
  if [ "" = "$NAME" ];then
    export NAME="$1"
    export DEPT="$2"
    export CITY="$3"
    export SUM="$4"
  elif [ "$NAME" = "$1" ];then
    SumByDept $2 $3 $4
  else
    PrintSum
    export NAME="$1"
    export DEPT="$2"
    export CITY="$3"
    export SUM="$4"
  fi
}

cat d | while read f
do
  SumByName $f
done

PrintSum



It's ugly, and needs work.  But it's a great start!

Good Luck,
Kent
0
 
ahoffmannCommented:
awk '{if(NF==4){c=$3}else{c=$3" "$4}}{s[$1" "$2" "c]+=$NF;}END{for(v in s){printf "%-20s\t%s\n", v,s[v]}}' count.dat > CALCL.DAT

KISS - keep it simple stupid
0
 
Kent OlsenData Warehouse Architect / DBACommented:

Yeah, right.  ;)

20+ years in the unix game I've still managed to avoid awk.....


0
 
ahoffmannCommented:
20+ years in the unix game and awk is still my favorite for quick solutions ;-)
0
 
Kent OlsenData Warehouse Architect / DBACommented:


I keep thinking that I need to break down and learn how to use it, but then good judgement overtakes me again.  ;)

0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

  • 3
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now