gram77
Unix shell script to identify and delete duplicates in data
I want to sort the data given below on field #41, i.e. ID_BB_UNIQUE, identify the duplicates, and delete them, all in one Unix script.
Example script to identify duplicates: sort dummy.txt | uniq -c | awk '{if ($1 > 1) print $0}'
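The pipeline above counts whole lines that repeat, so it will not catch two records that share an ID_BB_UNIQUE but differ elsewhere. A minimal sketch for counting repeats of a single field follows. It assumes '|' as the field separator and one record per line; the function name dup_keys is made up for illustration, and the field number defaults to 41 as stated in the question.

```shell
#!/bin/sh
# dup_keys FILE [FIELD]
# Print "count key" for every key value that occurs more than once.
# Assumptions (not confirmed by the post): '|' separates fields and
# each record sits on a single line. dup_keys is a hypothetical name.
dup_keys() {
    file=$1
    field=${2:-41}
    awk -F'|' -v f="$field" '
        NF >= f { count[$f]++ }              # tally each key value
        END {
            for (k in count)
                if (count[k] > 1) print count[k], k
        }
    ' "$file" | sort -k2                     # order the report by key
}
```

Called as dup_keys dummy.txt it reports on field 41; pass a second argument (for example dup_keys dummy.txt 44) if the key sits in a different column of the data rows.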
ASKER
Partially answered
What is partially answered in this?
ASKER
Here is the data:
Columns:
START-OF-FIELDS
1 TICKER
2 NAME
3 UNDERLYING_CUSIP
4 OPT_PUT_CALL
5 OPT_UNDL_PX
6 OPT_STRIKE_PX
7 PX_BID
8 PX_MID
9 PX_ASK
10 PX_LAST
11 PX_OPEN
12 PX_HIGH
13 PX_LOW
14 PX_VOLUME
15 OPT_OPEN_INT
16 OPT_PX
17 SETTLE_DT
18 MARKET_SECTOR_DES
19 SECURITY_TYP
20 COUNTRY_ISO
21 EXCH_CODE
22 OPT_UNDL_TICKER
23 OPT_CTD
24 OPT_IMPLIED_VOLATILITY_BID
25 OPT_IMPLIED_VOLATILITY_ASK
26 OPT_IMPLIED_VOLATILITY_MID
27 OPT_FINANCE_RT
28 OPT_EXPIRE_DT
29 OPT_EXER_TYP
30 OPT_UNDL_CRNCY
31 OPT_UNDL_ISIN
32 OPT_FIRST_TRADE_DT
33 OPT_TICK_VAL
34 HIGH_52WEEK
35 LOW_52WEEK
36 HIGH_DT_52WEEK
37 LOW_DT_52WEEK
38 PX_EVAL
39 LAST_UPDATE
40 LAST_UPDATE_DT
41 ID_BB_UNIQUE <--delete records with duplicates here
42 ID_BB_COMPANY
43 ID_BB_SECURITY
44 ID_ISIN
45 CRNCY
46 PRICING_SOURCE
47 CNTRY_ISSUE_ISO
48 LONG_COMP_NAME
49 CASH_SETTLED
50 OPT_CONT_SIZE_REAL
51 FUTURES_CATEGORY
52 PX_SETTLE_LAST_DT
53 PX_SETTLE
54 UNDL_ID_BB_UNIQUE
55 OPT_CRNCY_FOREIGN
56 FUT_PX_SESSION
57 SECURITY_DES
58 FUT_TICK_SIZE
59 UNIQUE_ID_FUT_OPT
60 FUT_LAST_TRADE_DT
61 OPT_IMPLIED_VOLATILITY_LAS
62 HIST_CALL_IMP_VOL
63 HIST_PUT_IMP_VOL
64 PX_SCALING_FACTOR
65 FUT_VAL_PT
66 QUOTED_CRNCY
67 OPTION_ROOT_TICKER
68 OPRA_SYMBOL
69 OCC_SYMBOL
END-OF-FIELDS
TIMESTARTED=Tue Apr 20 17:03:49 EDT 2010
START-OF-DATA
Data:
KCH3P 50.00 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P 50|KCH3|P|146.800000|50.00
Physical commodity option.|US|NYB|KCH3| |31.443|31.443|31.443|.120
00|20100420|IX10047966-0-0
31.443|N.A.|31.412|1.000|3
KCH3P 52.50 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P 52.5|KCH3|P|146.800000|52.
Physical commodity option.|US|NYB|KCH3| |31.402|31.402|31.402|.120
00|20100420|IX10047966-0-0
8|31.402|N.A.|31.370|1.000
KCH3P 55.00 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P 55|KCH3|P|146.800000|55.00
Physical commodity option.|US|NYB|KCH3| |31.359|31.359|31.359|.120
00|20100420|IX10047966-0-0
31.359|N.A.|31.426|1.000|3
KCH3P 57.50 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P 57.5|KCH3|P|146.800000|57.
Physical commodity option.|US|NYB|KCH3| |31.382|31.382|31.382|.120
00|20100420|IX10047966-0-0
8|31.382|N.A.|31.432|1.000
KCH3P 60.00 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P 60|KCH3|P|146.800000|60.00
Physical commodity option.|US|NYB|KCH3| |31.357|31.357|31.357|.120
00|20100420|IX10047966-0-0
31.357|N.A.|31.394|1.000|3
KCH3P 62.50 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P 62.5|KCH3|P|146.800000|62.
Physical commodity option.|US|NYB|KCH3| |31.351|31.351|31.351|.120
00|20100420|IX10047966-0-0
8|31.351|N.A.|31.378|1.000
KCH3P 65.00 Comdty|0|69|KCH3P|COFFEE 'C' FUT OP Mar13P 65|KCH3|P|146.800000|65.00
Physical commodity option.|US|NYB|KCH3| |31.347|31.347|31.347|.120
00|20100420|IX10047966-0-0
31.347|N.A.|31.366|1.000|3
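
One possible way to do the delete step in a single script is sketched below. It is a sketch under stated assumptions, not a tested solution for this exact feed: records are assumed to be '|'-delimited with one record per line (the rows as pasted above look wrapped and cut off), the data is assumed to follow a START-OF-DATA line, and an END-OF-DATA trailer is assumed even though the post does not show one. The first record seen for each key is kept and later ones are dropped. The name dedupe_on_field is made up for illustration. Also note that each sample row seems to begin with three extra values (a security name, 0, and 69, which matches the number of listed fields), so ID_BB_UNIQUE may actually be column 44 in the data rows rather than 41; check this against the real file before relying on the default.

```shell
#!/bin/sh
# dedupe_on_field FILE [FIELD]
# Print FILE with duplicate data records removed, keeping the first
# record seen for each key value. Lines outside the data section are
# passed through unchanged.
# Assumptions: '|' separator, one record per line, data follows a
# START-OF-DATA line and ends at an END-OF-DATA line (the trailer is
# not shown in the post). dedupe_on_field is a hypothetical name.
dedupe_on_field() {
    file=$1
    field=${2:-41}
    awk -F'|' -v f="$field" '
        /^START-OF-DATA/ { in_data = 1; print; next }
        /^END-OF-DATA/   { in_data = 0 }
        !in_data         { print; next }   # header and trailer lines
        NF < f           { print; next }   # no key field on this line: keep it
        !seen[$f]++                        # print only the first record per key
    ' "$file"
}
```

To overwrite the original, write to a temporary file first, e.g. dedupe_on_field dummy.txt 44 > dummy.dedup && mv dummy.dedup dummy.txt. awk keeps the input order, so no sort is needed for the delete itself; pipe the result through sort -t'|' -k44,44 afterwards if sorted output is wanted. For a file that contains only data rows with no START-OF-DATA marker, the one-liner awk -F'|' '!seen[$44]++' dummy.txt does the same job.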