[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

grap the max value

Posted on 2011-04-28
10
Medium Priority
?
494 Views
Last Modified: 2012-05-11
Hi, I have a text file that has millions lines. The format likes
1	bs1_533	        0	533	
1	bs1_41342	0	41342	
1	bs1_41791	0	41791	
1	bs1_44449	0	44449	
1	rs2462492	0	44539	
1	bs1_44571	0	44571	
1	rs10399749	0	45162
2           rs45677888  0             12345
2           rs45646770  0              23456	

Open in new window

Suppose the first column is a key, the fouth column is a value
You see that the number of the last column is in the asceder order for the the correspond key. I want to
1) grab the max value of it here is 45162 for key 1.
2) We can create <key,value> pair such as <1,45162>
<2,55555> etc..
I want to get the sum of all  values.

Thanks for help. Linux shell script or dos batch are all welcome.
0
Comment
Question by:zhshqzyc
  • 5
  • 2
  • 2
  • +1
10 Comments
 
LVL 48

Accepted Solution

by:
Tintin earned 1600 total points
ID: 35487159
Here's a shell script solution

sort -rn -k1 -k4 file.dat | sed "s/\t/     /"| uniq -w6 | awk '{t+=$4} END {print t}'
0
 
LVL 85

Expert Comment

by:oBdA
ID: 35487250
Just for the fun of it, here's a plain batch script. Probably not very fast, and depending on the number of lines, there might be an overflow, the NT shell can only calculate from -2147483648 to 2147483647. The script will exit with errorlevel 1 in this case (there might be workarounds, but a ballpark figure of the highest expected sum would be nice).
@echo off
setlocal enabledelayedexpansion
set /a Sum = 0
REM Must be set to the first key in the file:
set /a OldKey = 1
for /f "tokens=1,4" %%a in ('type test.txt') do (
  set NewKey=%%a&set Value=%%b
  if not !NewKey!==!OldKey! (
    set /a Sum += Value
    echo ^<!OldKey!,!Value!^> - !Sum!
    if %Sum% LSS 0 (echo Overflow!&exit /b 1)
    set OldKey=!NewKey!
  )
)
set /a Sum += Value
echo ^<%NewKey%,%Value%^> - %Sum%

Open in new window

0
 

Author Comment

by:zhshqzyc
ID: 35487310
The sum is 3021633417 by using other programming code(c#).
0
NFR key for Veeam Agent for Linux

Veeam is happy to provide a free NFR license for one year.  It allows for the non‑production use and valid for five workstations and two servers. Veeam Agent for Linux is a simple backup tool for your Linux installations, both on‑premises and in the public cloud.

 

Author Comment

by:zhshqzyc
ID: 35487329
And the delimter is a tab key.
0
 
LVL 85

Assisted Solution

by:oBdA
oBdA earned 400 total points
ID: 35488168
Scratch that first version, it doesn't even work with sums less than 2147483647, sorry.
This now finally has the same results as Excel, and can work up to 21474836470000. Got stuck with some inexplicably incorrect results which turned out to be caused by the NT shell's specialty of treating strings with a leading 0 as octal numbers ...
@echo off
setlocal enabledelayedexpansion
set /a HiSum = 0
set /a LoSum = 0
REM Must be set to the first key in the file:
set /a OldKey = 1
set /a OldValue = 0
for /f "tokens=1,4" %%a in ('type test.txt') do (
  set NewKey=%%a
  if !NewKey!==!OldKey! (
    set OldValue=%%b
  ) else (
    set /a LoSum = LoSum + OldValue
    set /a HiSum = HiSum + LoSum/10000
    set /a LoSum = LoSum %% 10000
    echo ^<!OldKey!,!OldValue!^> - !HiSum! !LoSum!
    set OldKey=!NewKey!
    set OldValue=%%b
  )
)
set /a LoSum = LoSum + OldValue
set /a HiSum = HiSum + LoSum/10000
set /a LoSum = LoSum %% 10000
echo ^<!OldKey!,!OldValue!^> - !HiSum! !LoSum!
set LoSumLZ=0000!LoSum!
set LoSumLZ=!LoSumLZ:~-4!
echo Sum: !HiSum!!LoSumLZ!

Open in new window

0
 
LVL 4

Expert Comment

by:aavictor
ID: 35488940
The biggest bubble floats to the top first.
So, after looping you can find the max.
During recording the new Max in comparison, record the key temporarily.
after looping, you will get only one row.  The potential error is
that there are duplicated values you will miss. You get the Max value but not
all row positions. Perhaps, you find the first or the last  
0
 

Author Comment

by:zhshqzyc
ID: 35491301
Dos batch command is much slow than the other way. One hour passed still no result, using shell script or c# only takes about 2 minutes. But I really appreciate your effort.
0
 

Author Comment

by:zhshqzyc
ID: 35491388
@Tintin,

Could you please explain a little bit about
uniq -w6 

Open in new window

I searched online but no lucky.
0
 

Author Closing Comment

by:zhshqzyc
ID: 35493411
Thanks.
0
 
LVL 48

Expert Comment

by:Tintin
ID: 35494979
-w6 means only compare the first 6 characters when determining if a line is unique or not.
0

Featured Post

What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

VALIDATING DATES One method of validating dates is to jam the date into the DATE command and see if it accepts it by examining the system's errorlevel value. A non-zero result indicates failure. A typical example might look something like the fol…
Introduction: Recently, I got a requirement to zip all files individually with batch file script in Windows OS. I don't know much about scripting, but I searched Google and found a lot of examples and websites to complete my task. Finally, I was ab…
Learn several ways to interact with files and get file information from the bash shell. ls lists the contents of a directory: Using the -a flag displays hidden files: Using the -l flag formats the output in a long list: The file command gives us mor…
In a recent question (https://www.experts-exchange.com/questions/29004105/Run-AutoHotkey-script-directly-from-Notepad.html) here at Experts Exchange, a member asked how to run an AutoHotkey script (.AHK) directly from Notepad++ (aka NPP). This video…

834 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question