gram77 asked:

Unix shell script to identify and delete duplicates in data

I want to sort the data given below on field #41, i.e. ID_BB_UNIQUE, identify the duplicates, and delete them, all in one Unix script.

Example script to identify duplicates: sort dummy.txt | uniq -c | awk '{if ($1 > 1) print $0}'
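
Note that the one-liner above compares whole lines, so two records that share an ID but differ in any other column would not be reported. A minimal sketch of counting on a single pipe-delimited column instead follows; the helper name `dup_keys` and its column argument are illustrative, not part of any existing tool.

```shell
#!/bin/sh
# dup_keys COL [FILE...]: print "COUNT VALUE" for every value of the
# pipe-delimited column COL that occurs more than once.
# (dup_keys is a hypothetical helper name chosen for this sketch.)
dup_keys() {
    col=$1
    shift
    awk -F'|' -v c="$col" '
        { count[$c]++ }                      # tally each key value
        END {
            for (k in count)
                if (count[k] > 1) print count[k], k
        }
    ' "$@" | sort -k2                        # order the report by key value
}

# Small demonstration: column 2 holds the key and "B" appears twice.
printf 'x|A|1\ny|B|2\nz|B|3\n' | dup_keys 2
```

With no file argument the function reads standard input, so it can sit at the end of a pipeline in the same way as `uniq -c`.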
gram77

ASKER


Here is the data:
Columns:
START-OF-FIELDS
1  TICKER
2  NAME
3  UNDERLYING_CUSIP
4  OPT_PUT_CALL
5  OPT_UNDL_PX
6  OPT_STRIKE_PX
7  PX_BID
8  PX_MID
9  PX_ASK
10 PX_LAST
11 PX_OPEN
12 PX_HIGH
13 PX_LOW
14 PX_VOLUME
15 OPT_OPEN_INT
16 OPT_PX
17 SETTLE_DT
18 MARKET_SECTOR_DES
19 SECURITY_TYP
20 COUNTRY_ISO
21 EXCH_CODE
22 OPT_UNDL_TICKER
23 OPT_CTD
24 OPT_IMPLIED_VOLATILITY_BID
25 OPT_IMPLIED_VOLATILITY_ASK
26 OPT_IMPLIED_VOLATILITY_MID
27 OPT_FINANCE_RT
28 OPT_EXPIRE_DT
29 OPT_EXER_TYP
30 OPT_UNDL_CRNCY
31 OPT_UNDL_ISIN
32 OPT_FIRST_TRADE_DT
33 OPT_TICK_VAL
34 HIGH_52WEEK
35 LOW_52WEEK
36 HIGH_DT_52WEEK
37 LOW_DT_52WEEK
38 PX_EVAL
39 LAST_UPDATE
40 LAST_UPDATE_DT
41 ID_BB_UNIQUE  <--delete records with duplicates here
42 ID_BB_COMPANY
43 ID_BB_SECURITY
44 ID_ISIN
45 CRNCY
46 PRICING_SOURCE
47 CNTRY_ISSUE_ISO
48 LONG_COMP_NAME
49 CASH_SETTLED
50 OPT_CONT_SIZE_REAL
51 FUTURES_CATEGORY
52 PX_SETTLE_LAST_DT
53 PX_SETTLE
54 UNDL_ID_BB_UNIQUE
55 OPT_CRNCY_FOREIGN
56 FUT_PX_SESSION
57 SECURITY_DES
58 FUT_TICK_SIZE
59 UNIQUE_ID_FUT_OPT
60 FUT_LAST_TRADE_DT
61 OPT_IMPLIED_VOLATILITY_LAST
62 HIST_CALL_IMP_VOL
63 HIST_PUT_IMP_VOL
64 PX_SCALING_FACTOR
65 FUT_VAL_PT
66 QUOTED_CRNCY
67 OPTION_ROOT_TICKER
68 OPRA_SYMBOL
69 OCC_SYMBOL
END-OF-FIELDS
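
The position of a field in this list is not the same as its pipe-delimited column in the data rows: each sample row starts with three extra columns (the security, `0`, and `69`) before the first listed field, so field 41 appears to land in column 44. A rough sketch of looking that column up from the header follows. The helper name `field_col` is hypothetical, and the offset of 3 is an assumption read off the sample rows.

```shell
#!/bin/sh
# field_col NAME [FILE]: print the pipe-delimited column in which field NAME
# should appear, assuming three leading columns precede the listed fields.
# (field_col and the offset of 3 are assumptions made for this sketch.)
field_col() {
    name=$1
    shift
    awk -v want="$name" '
        /START-OF-FIELDS/ { in_fields = 1; next }
        /END-OF-FIELDS/   { exit }
        in_fields && NF {
            n++
            # Accept both "41 ID_BB_UNIQUE" and a bare "ID_BB_UNIQUE".
            f = ($1 ~ /^[0-9]+$/) ? $2 : $1
            if (f == want) { print n + 3; exit }
        }
    ' "$@"
}

# Small demonstration with a three-field header.
printf 'START-OF-FIELDS\n1  TICKER\n2  NAME\nID_BB_UNIQUE\nEND-OF-FIELDS\n' |
    field_col ID_BB_UNIQUE
```

Looking the column up by name avoids hard-coding 44, which would silently break if the field list changed.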

TIMESTARTED=Tue Apr 20 17:03:49 EDT 2010
START-OF-DATA
Data:
KCH3P 50.00 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P    50|KCH3|P|146.800000|50.000000|N.A.| |N.A.|0.330000|N.A.|N.A.|N.A.|N.A.|0|0.330000|20100420|Comdty|
Physical commodity option.|US|NYB|KCH3| |31.443|31.443|31.443|.120|20130208|American|USD| |20100211|3.750|0.330000|0.250000|20100420|20100406|---> identify and delete duplicates here 0.330000|08:00:
00|20100420|IX10047966-0-0640| | | |USD|EX| | |N|37500.00|Foodstuff|20100420|0.330000|IX10192353-0| |P|KCH3P    50 PIT|.01000000|KCH3P 50.00 Comdty|20130208|
31.443|N.A.|31.412|1.000|375.00|USd| | | |
 
KCH3P 52.50 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P  52.5|KCH3|P|146.800000|52.500000|N.A.| |N.A.|0.430000|N.A.|N.A.|N.A.|N.A.|0|0.430000|20100420|Comdty|
Physical commodity option.|US|NYB|KCH3| |31.402|31.402|31.402|.120|20130208|American|USD| |20100211|3.750|0.430000|0.330000|20100420|20100406|---> identify and delete duplicates here 0.430000|08:00:
00|20100420|IX10047966-0-0690| | | |USD|EX| | |N|37500.00|Foodstuff|20100420|0.430000|IX10192353-0| |P|KCH3P    52.5 PIT|.01000000|KCH3P 52.50 Comdty|2013020
8|31.402|N.A.|31.370|1.000|375.00|USd| | | |
 

KCH3P 55.00 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P    55|KCH3|P|146.800000|55.000000|N.A.| |N.A.|0.550000|N.A.|N.A.|N.A.|N.A.|0|0.550000|20100420|Comdty|
Physical commodity option.|US|NYB|KCH3| |31.359|31.359|31.359|.120|20130208|American|USD| |20100211|3.750|0.560000|0.430000|20100419|20100406|---> identify and delete duplicates here 0.550000|08:00:
00|20100420|IX10047966-0-06E0| | | |USD|EX| | |N|37500.00|Foodstuff|20100420|0.550000|IX10192353-0| |P|KCH3P    55 PIT|.01000000|KCH3P 55.00 Comdty|20130208|
31.359|N.A.|31.426|1.000|375.00|USd| | | |
 
KCH3P 57.50 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P  57.5|KCH3|P|146.800000|57.500000|N.A.| |N.A.|0.700000|N.A.|N.A.|N.A.|N.A.|0|0.700000|20100420|Comdty|
Physical commodity option.|US|NYB|KCH3| |31.382|31.382|31.382|.120|20130208|American|USD| |20100211|3.750|0.710000|0.550000|20100419|20100406|---> identify and delete duplicates here 0.700000|08:00:
00|20100420|IX10047966-0-0730| | | |USD|EX| | |N|37500.00|Foodstuff|20100420|0.700000|IX10192353-0| |P|KCH3P    57.5 PIT|.01000000|KCH3P 57.50 Comdty|2013020
8|31.382|N.A.|31.432|1.000|375.00|USd| | | |
 
KCH3P 60.00 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P    60|KCH3|P|146.800000|60.000000|N.A.| |N.A.|0.870000|N.A.|N.A.|N.A.|N.A.|0|0.870000|20100420|Comdty|
Physical commodity option.|US|NYB|KCH3| |31.357|31.357|31.357|.120|20130208|American|USD| |20100211|3.750|0.880000|0.690000|20100419|20100406|---> identify and delete duplicates here 0.870000|08:00:
00|20100420|IX10047966-0-0780| | | |USD|EX| | |N|37500.00|Foodstuff|20100420|0.870000|IX10192353-0| |P|KCH3P    60 PIT|.01000000|KCH3P 60.00 Comdty|20130208|
31.357|N.A.|31.394|1.000|375.00|USd| | | |
 
KCH3P 62.50 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P  62.5|KCH3|P|146.800000|62.500000|N.A.| |N.A.|1.070000|N.A.|N.A.|N.A.|N.A.|0|1.070000|20100420|Comdty|
Physical commodity option.|US|NYB|KCH3| |31.351|31.351|31.351|.120|20130208|American|USD| |20100211|3.750|1.080000|0.850000|20100419|20100406|---> identify and delete duplicates here 1.070000|08:00:
00|20100420|IX10047966-0-07D0| | | |USD|EX| | |N|37500.00|Foodstuff|20100420|1.070000|IX10192353-0| |P|KCH3P    62.5 PIT|.01000000|KCH3P 62.50 Comdty|2013020
8|31.351|N.A.|31.378|1.000|375.00|USd| | | |
 
KCH3P 65.00 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P    65|KCH3|P|146.800000|65.000000|N.A.| |N.A.|1.300000|N.A.|N.A.|N.A.|N.A.|0|1.300000|20100420|Comdty|
Physical commodity option.|US|NYB|KCH3| |31.347|31.347|31.347|.120|20130208|American|USD| |20100211|3.750|1.310000|1.030000|20100419|20100405|---> identify and delete duplicates here 1.300000|08:00:
00|20100420|IX10047966-0-0820| | | |USD|EX| | |N|37500.00|Foodstuff|20100420|1.300000|IX10192353-0| |P|KCH3P    65 PIT|.01000000|KCH3P 65.00 Comdty|20130208|
31.347|N.A.|31.366|1.000|375.00|USd| | | |
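
For reference, one possible way to drop duplicate records on ID_BB_UNIQUE is sketched below. It rests on several assumptions: each record occupies a single physical line in the real file (the sample above is wrapped for display), the ID sits in pipe-delimited column 44 (field 41 plus the three leading columns seen in the sample rows), and the data section is closed by an `END-OF-DATA` line, which is not shown in the post. The function name `dedupe_by_col` is hypothetical.

```shell
#!/bin/sh
# dedupe_by_col COL [FILE]: between START-OF-DATA and END-OF-DATA, keep only
# the first record for each value of pipe-delimited column COL. Header lines,
# footer lines, and lines too short to contain COL pass through unchanged.
# Dropped records are reported on standard error for review.
# (dedupe_by_col is a hypothetical helper name chosen for this sketch.)
dedupe_by_col() {
    col=$1
    shift
    awk -F'|' -v c="$col" '
        /^START-OF-DATA/   { in_data = 1; print; next }
        /^END-OF-DATA/     { in_data = 0; print; next }
        !in_data || NF < c { print; next }            # not a data record
        seen[$c]++         { print "duplicate: " $0 | "cat 1>&2"; next }
                           { print }                  # first occurrence
    ' "$@"
}

# Small demonstration with the key in column 2 (the real file would use 44).
printf 'START-OF-DATA\na|ID1|x\nb|ID2|y\nc|ID1|z\nEND-OF-DATA\n' |
    dedupe_by_col 2
```

On the real file this might be run as `dedupe_by_col 44 dummy.txt > deduped.txt`. The original record order is preserved; if the rows also need to be ordered by the ID, the data rows alone could be passed through `sort -t'|' -k44,44` first.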
ASKER CERTIFIED SOLUTION
amit_g
This solution is only available to members.
gram77

ASKER

Partially answered
What is partially answered in this?