Python : find smallest difference

Hi Experts,
Could you please help me resolve the following problem?
----------------------------------------------------------------------------------------------------------------
The file football.txt contains the results from the English Premier League for 2001/2. The columns labeled ‘F’ and ‘A’ contain the total number of goals scored for and against each team in that season (so Arsenal scored 79 goals against opponents, and had 36 goals scored against them). Need a java program to print the name of the team with the smallest difference in ‘for’ and ‘against’ goals.
-------------------------------------------------------------------------------------------------------------
My main objective with this question is to see how you would write the code and to demonstrate some of the practices and techniques that count as best practice in industry. I'm mainly looking for the thought process and how the task can be completed. Points go to the first/best post.

Thanks a lot Experts.

pepr

The first step shows a spike solution, just proving that the idea works:

import re

#    Team            P     W    L   D    F      A     Pts
# 1. Arsenal         38    26   9   3    79  -  36    87 

rex = re.compile(r'''^\s*\d+\.
                     \s+(?P<Team>.+?)
                     \s+(?P<P>\d+)
                     \s+(?P<W>\d+)
                     \s+(?P<L>\d+)
                     \s+(?P<D>\d+)
                     \s+(?P<F>\d+)
                     \s+-
                     \s+(?P<A>\d+)
                     \s+(?P<Pts>\d+)
                     \s*$''', re.VERBOSE)


fname = 'football.txt'
diff = -10000  # init -- impossible result in reality
winner = None  # init

with open(fname) as fin:            # open in text mode for reading
    for line in fin:                # loop through the text file lines
        m = rex.match(line)
        if m:                       # if it is a line with results
            f = int(m.group('F'))   # convert the wanted numbers to int
            a = int(m.group('A'))

            if f - a > diff:               # if better
                diff = f - a               # remember the result
                winner = m.group('Team')   # and the team

# The file object is closed automatically because the with construct was used.
# (The file handle is named fin so that f for the goals does not rebind it.)

print('%s is the winner with the result %d' % (winner, diff))

It prints the following on my console:

c:\tmp\_Python\Dushan911\Q_27842934>python a.py
Arsenal is the winner with the result 43
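
To see what the regular expression actually accepts, here is a minimal sketch that runs the same pattern against three sample lines copied from the table (the header, one result row, and the separator). The names sample_lines and matched are only illustrative and are not part of the solution above.

```python
import re

# Same pattern as in the spike solution.
rex = re.compile(r'''^\s*\d+\.
                     \s+(?P<Team>.+?)
                     \s+(?P<P>\d+)
                     \s+(?P<W>\d+)
                     \s+(?P<L>\d+)
                     \s+(?P<D>\d+)
                     \s+(?P<F>\d+)
                     \s+-
                     \s+(?P<A>\d+)
                     \s+(?P<Pts>\d+)
                     \s*$''', re.VERBOSE)

# Illustrative input: header, one result row, separator.
sample_lines = [
    '       Team            P     W    L   D    F      A     Pts',
    '    1. Arsenal         38    26   9   3    79  -  36    87',
    '   -------------------------------------------------------',
]

# Keep the named groups of every line that matches; other lines are skipped.
matched = [m.groupdict() for m in map(rex.match, sample_lines) if m]

for record in matched:
    print('%s: F=%s, A=%s' % (record['Team'], record['F'], record['A']))
```

Only the result row matches, so the header and the separator line are skipped without any special handling.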

pepr

The above solution uses a regular expression to detect the lines of interest. However, it is better to wrap the functionality in a function so that the way the data is extracted is abstracted away. The reason is that the input format can change in the future, and we would like to be able to fix unrelated parts separately:

import re

def getTheBestTeam(fname, rex=re.compile(r'''^\s*\d+\.
                     \s+(?P<Team>.+?)
                     \s+(?P<P>\d+)
                     \s+(?P<W>\d+)
                     \s+(?P<L>\d+)
                     \s+(?P<D>\d+)
                     \s+(?P<F>\d+)
                     \s+-
                     \s+(?P<A>\d+)
                     \s+(?P<Pts>\d+)
                     \s*$''', re.VERBOSE)):

    diff = -10000  # init -- impossible result in reality
    winner = None  # init

    with open(fname) as fin:            # open in text mode for reading
        for line in fin:                # loop through the text file lines
            m = rex.match(line)
            if m:                       # if it is a line with results
                f = int(m.group('F'))   # convert the wanted numbers to int
                a = int(m.group('A'))

                if f - a > diff:               # if better
                    diff = f - a               # remember the result
                    winner = m.group('Team')   # and the team
    return winner, diff


winner, result = getTheBestTeam('football.txt')

print('%s is the winner with the result %d' % (winner, result))

Here the rex has become an implementation detail, and yet it can still be passed as an optional argument if needed. The main goal was to hide that detail.

There are other things that can be improved to make the code more robust in future.
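
One possible next step, sketched here under my own assumptions, is to separate parsing from selection: a generator turns lines into records, and a second function picks the best record. TeamResult, parse_results and best_team are hypothetical names introduced only for this sketch. The sample rows are copied from the table.

```python
import re
from collections import namedtuple

# Hypothetical record type for this sketch.
TeamResult = namedtuple('TeamResult', 'team goals_for goals_against')

ROW_REX = re.compile(r'''^\s*\d+\.
                         \s+(?P<Team>.+?)
                         \s+(?P<P>\d+)
                         \s+(?P<W>\d+)
                         \s+(?P<L>\d+)
                         \s+(?P<D>\d+)
                         \s+(?P<F>\d+)
                         \s+-
                         \s+(?P<A>\d+)
                         \s+(?P<Pts>\d+)
                         \s*$''', re.VERBOSE)


def parse_results(lines, rex=ROW_REX):
    """Yield one TeamResult for every line that looks like a table row."""
    for line in lines:
        m = rex.match(line)
        if m:
            yield TeamResult(m.group('Team'),
                             int(m.group('F')),
                             int(m.group('A')))


def best_team(results):
    """Return the record with the largest F - A, or None if there are none."""
    results = list(results)
    if not results:
        return None
    return max(results, key=lambda r: r.goals_for - r.goals_against)


# Illustrative input; any iterable of lines works, including an open file.
sample = [
    '    1. Arsenal         38    26   9   3    79  -  36    87',
    '    2. Liverpool       38    24   8   6    67  -  30    80',
    '    3. Manchester_U    38    24   5   9    87  -  45    77',
]

best = best_team(parse_results(sample))
print('%s is the winner with the result %d'
      % (best.team, best.goals_for - best.goals_against))
```

Because parse_results accepts any iterable of lines, it can be fed an open file object directly and tested with a plain list. Using max also removes the need for the -10000 sentinel.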

pepr

Sorry. Stupid me. The solutions above are probably totally off: they search for the biggest difference. Anyway, what exactly do you want to find? The version below looks for the smallest value of F - A instead:

import re

def getTheWorseTeam(fname, rex=re.compile(r'''^\s*\d+\.
                     \s+(?P<Team>.+?)
                     \s+(?P<P>\d+)
                     \s+(?P<W>\d+)
                     \s+(?P<L>\d+)
                     \s+(?P<D>\d+)
                     \s+(?P<F>\d+)
                     \s+-
                     \s+(?P<A>\d+)
                     \s+(?P<Pts>\d+)
                     \s*$''', re.VERBOSE)):

    diff =  10000  # init -- impossible result in reality
    looser = None  # init

    with open(fname) as fin:            # open in text mode for reading
        for line in fin:                # loop through the text file lines
            print(line.rstrip())        # echo the input line
            m = rex.match(line)
            if m:                       # if it is a line with results
                f = int(m.group('F'))   # convert the wanted numbers to int
                a = int(m.group('A'))

                if f - a < diff:               # if smaller difference
                    diff = f - a               # remember the result
                    looser = m.group('Team')   # and the team
    return looser, diff


looser, result = getTheWorseTeam('football.txt')

print('%s is the looser with the result %d' % (looser, result))

It echoes the input lines and prints the result on the last line:

c:\tmp\_Python\Dushan911\Q_27842934>python c.py
Source <a
href="http://sunsite.tut.fi/rec/riku/soccer_data/tab/93_94/table.eng0.01_02.html
">sunsite.tut.fi/rec/riku/soccer_data/tab/93_94/table.eng0.01_02.html</a>

<pre>
       Team            P     W    L   D    F      A     Pts
    1. Arsenal         38    26   9   3    79  -  36    87
    2. Liverpool       38    24   8   6    67  -  30    80
    3. Manchester_U    38    24   5   9    87  -  45    77
    4. Newcastle       38    21   8   9    74  -  52    71
    5. Leeds           38    18  12   8    53  -  37    66
    6. Chelsea         38    17  13   8    66  -  38    64
    7. West_Ham        38    15   8  15    48  -  57    53
    8. Aston_Villa     38    12  14  12    46  -  47    50
    9. Tottenham       38    14   8  16    49  -  53    50
   10. Blackburn       38    12  10  16    55  -  51    46
   11. Southampton     38    12   9  17    46  -  54    45
   12. Middlesbrough   38    12   9  17    35  -  47    45
   13. Fulham          38    10  14  14    36  -  44    44
   14. Charlton        38    10  14  14    38  -  49    44
   15. Everton         38    11  10  17    45  -  57    43
   16. Bolton          38     9  13  16    44  -  62    40
   17. Sunderland      38    10  10  18    29  -  51    40
   -------------------------------------------------------
   18. Ipswich         38     9   9  20    41  -  64    36
   19. Derby           38     8   6  24    33  -  63    30
   20. Leicester       38     5  13  20    30  -  64    28
</pre>
Leicester is the looser with the result -34
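
If "smallest difference" is instead meant as the smallest gap between F and A regardless of sign, the comparison needs abs(). The following is only a sketch under that assumption. closest_team and the shortened pattern ROW are hypothetical names, and the pattern assumes that team names contain no spaces, which holds for the table above.

```python
import re

# Shortened variant of the row pattern (assumption: no spaces in team names).
ROW = re.compile(r'^\s*\d+\.\s+(?P<Team>\S+)(?:\s+\d+){4}'
                 r'\s+(?P<F>\d+)\s+-\s+(?P<A>\d+)\s+\d+\s*$')

# The table as echoed in the console output above.
TABLE = '''
       Team            P     W    L   D    F      A     Pts
    1. Arsenal         38    26   9   3    79  -  36    87
    2. Liverpool       38    24   8   6    67  -  30    80
    3. Manchester_U    38    24   5   9    87  -  45    77
    4. Newcastle       38    21   8   9    74  -  52    71
    5. Leeds           38    18  12   8    53  -  37    66
    6. Chelsea         38    17  13   8    66  -  38    64
    7. West_Ham        38    15   8  15    48  -  57    53
    8. Aston_Villa     38    12  14  12    46  -  47    50
    9. Tottenham       38    14   8  16    49  -  53    50
   10. Blackburn       38    12  10  16    55  -  51    46
   11. Southampton     38    12   9  17    46  -  54    45
   12. Middlesbrough   38    12   9  17    35  -  47    45
   13. Fulham          38    10  14  14    36  -  44    44
   14. Charlton        38    10  14  14    38  -  49    44
   15. Everton         38    11  10  17    45  -  57    43
   16. Bolton          38     9  13  16    44  -  62    40
   17. Sunderland      38    10  10  18    29  -  51    40
   -------------------------------------------------------
   18. Ipswich         38     9   9  20    41  -  64    36
   19. Derby           38     8   6  24    33  -  63    30
   20. Leicester       38     5  13  20    30  -  64    28
'''


def closest_team(lines, rex=ROW):
    """Return (team, gap) with the smallest abs(F - A), or (None, None)."""
    rows = []
    for line in lines:
        m = rex.match(line)
        if m:
            gap = abs(int(m.group('F')) - int(m.group('A')))
            rows.append((gap, m.group('Team')))
    if not rows:
        return None, None
    gap, team = min(rows, key=lambda row: row[0])  # first row wins on a tie
    return team, gap


team, gap = closest_team(TABLE.splitlines())
print('%s has the smallest difference: %d' % (team, gap))
```

With this reading, the table above gives Aston_Villa (46 for, 47 against, a gap of 1) rather than Leicester.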

ASKER CERTIFIED SOLUTION

pepr

Dushan Silva

ASKER

Thanks a lot pepr!
You are always a genius.