projects

Did programmer complete the task as required?

I am trying to understand the code that this programmer wrote. He says it is exactly as I wanted and therefore the job is complete.

The job was described as follows;

***SNIP***
Need help re-writing a function in a current script.
The function will use MTR so you must be familiar with MTR otherwise you will waste both our time.

"Very simply... I need this function to continuously run MTR in report mode while there is a problem, then spit out it's report so that it can be sent to php/mysql"

***SNIP***

I never once said that the networks were static because I would be wasting his time and mine. The networks certainly can and do change. I repeatedly said that I didn't want to use single cycles and was looking to run mtr continuously.

I repeatedly showed a basic description of what I needed as follows;

-Start of outage - timestamp

-Which hop is missing - obviously, the one after which ever is last reached
-Did any other hops go missing and if so, timestamp those changes
-Don't log anything else if nothing much has changed so don't use single cycle checking because that simply creates tons of useless logs.

-End of outage - timestamp

The following is the code he wrote for the function which he insists is what I asked for. I need the experts to tell me if you believe I received what I asked for.



function mtr_report() {
      mtr --no-dns --report --report-cycles=1 smartpet.com -p | awk '{printf "%s,%s,%s\n", $1,$2,$3}'
}
function outage()
{
    echo "Server is down looking for the problem"
    #while the network error persists check where is the error
    #this needs to be accurate in terms of timestamps

    # clean report file
    > $OUTAGE_REPORT

    while [ -z $(primary_check) ]; do
        NOW="$(date +"%F %T")#"
        mtr_report > ${FILE_TMP}
        if ! cmp -s $FILE_TMP $FILE_TMP_OLD  ; then
          echo -n "$NOW" >> ${OUTAGE_REPORT}
          # hop down
          grep -vf ${FILE_STATIC_TRACE} ${FILE_TMP} |head -n 1 | tr '\n' '#' | sed 's/#/,0#/'  >> ${OUTAGE_REPORT}
          # hop up
          grep -vf ${FILE_TMP} ${FILE_TMP_OLD} |head -n 1 | tr '\n' '#' | sed 's/#/,1#/'  >> ${OUTAGE_REPORT}
          #cat ${FILE_TMP} | tr '\n' '#' >> ${OUTAGE_REPORT}
          echo >> ${OUTAGE_REPORT}
        fi
        mv $FILE_TMP $FILE_TMP_OLD
    done
    rm "$FILE_TMP_OLD"

    # when server is back send a report
    $CURL -F function=outage  -F "level=@${OUTAGE_REPORT}" $SERVER_URL/app.php
}

I am also asking for help in the following question;
https://www.experts-exchange.com/questions/28478579/bash-script-logging-missing-hops-during-outage.html
projects

ASKER

Dang, I keep forgetting to do that. Thanks.
. . .
I never once said that the networks were static because I would be wasting his time and mine. The networks certainly can and do change. I repeatedly said that I didn't want to use single cycles and was looking to run mtr continuously.

I repeatedly showed a basic description of what I needed as follows;

-Start of outage - timestamp

-Which hop is missing - obviously, the one after which ever is last reached
-Did any other hops go missing and if so, timestamp those changes
-Don't log anything else if nothing much has changed so don't use single cycle checking because that simply creates tons of useless logs.

-End of outage - timestamp
If you did not put the above in writing, the programmer can deliver whatever original requirements correspond to the text between the "***SNIP***" tags regardless.

Any "verbal" requirements/instructions added afterwards should be considered "out of scope" and therefore billable.

Therefore if you did not provide clear requirements, the programmer delivered what was asked.
Sorry.
hypercube
I've written a few of these with pretty good results as far as I'm concerned.  Only the code context was a bit different but the objectives were the same.

I've read both threads and there is a lack of precision in the specification.  One specification is:
Which hop is missing - obviously, the one after which ever is last reached
Unless one knows what you're thinking, this specification could be labeled "impossible".  To avoid that, one needs to be more specific.  Such as:

Trace the path and record it.  Do this periodically so the "latest good path" is recorded, discarding the previous one.
When an error occurs, trace the path / or save the trace of the truncated path when an error occurs.
Compare the "good path" to the "truncated path".
Identify the "likely next hop that went missing".
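
The comparison in the steps above can be sketched in shell. This is only an illustration under assumptions: each path is stored as a plain text file with one hop per line and no blank lines, and the function name `likely_missing_hop` is hypothetical, not something taken from the programmer's script.

```shell
#!/bin/sh
# Hypothetical helper: compare the latest good path with a truncated path.
# Assumption: both files hold one hop name per line, in order.
likely_missing_hop() {
    good_file=$1
    truncated_file=$2

    # Number of hops that still answered in the truncated trace.
    reached=$(awk 'END { print NR }' "$truncated_file")

    # Last hop that responded, or a placeholder if nothing answered.
    last=$(tail -n 1 "$truncated_file")
    [ -n "$last" ] || last="(none)"

    # The hop that followed in the good path is only a *likely* candidate,
    # because the real path may have changed since the good trace.
    next=$(sed -n "$((reached + 1))p" "$good_file")

    echo "last responding: $last"
    echo "likely missing: ${next:-unknown}"
}
```

With a good path of A through F and a truncated path of A, B, C, this reports C as the last responding hop and D as the likely missing one.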

Now, when I write my own tools for this I'm more interested in the last responding hop than the "missing" hop.
That's for the obvious reason that the missing hop is, well....., "missing" and all one can do for sure is to imply what went missing.
But you can surely define your requirements as I have above.
Even so, there still remains a problem:

Let us say that the "normal" path looks like this:
A-B-C-D-E-F
or
A-B-Q-Y-E-F
and, of course, there are other possibilities but I list 2 just as a short example.

Let us say that the last "normal" traced path is:
A-B-C-D-E-F
and the path fails directly thereafter.
The possibilities and possible "answers" which are implied are now:
A-B-C-D-E-*    >>   F has failed
A-B-C-D-*-*    >>   E has failed
A-B-C-*-*-*    >>   D or Y or ???? has failed
A-B-*-*-*-*    >>   C or Q or ???? has failed
A-*-*-*-*-*    >>   B has failed
*-*-*-*-*-*    >>   A has failed

Of course, this goes beyond the original specification in terms of what needs to be saved, for how long, etc.

I don't know if this helps much but I hope it may suggest possible items leading to confusion re: the specification.
Your post seems - unhappy.  Do you feel you didn't get what you specified?  

I see this all the time - people give a programmer sketchy information, the programmer just jumps in and makes a bunch of assumptions, and then the requester is not happy with the results.

Blame probably goes to both of you, if you ask me.

When you hire a programmer, you need to make sure they are crystal-clear on every one of your requirements.  You need to take a LOT more time to write up what you want done, and explain it in detail.  If you are automating a manual process, you need to show them the manual process.  You need to specify how the project will be tested, how the results will be validated, and exactly what the program needs to produce in order to consider the job done satisfactorily.

If you want a report, then sit down, take ten extra minutes, and type up a sample of EXACTLY what you want the report to look like - don't generalize.

Post your test results.  Does it do what you want it to do when you test it?  If not, you make a list of defects, give the defect list to the programmer,  and tell the programmer to fix the defects before he gets paid.  And do a good job of explaining the defect, and exactly what you need fixed.  Programmers aren't mind-readers.  This cycle continues until you have what you want.  You get it fastest if you do a great job of describing what you want up front.

That's just how you do a programming project.

I think these comments are indicative of the problem:

I never once said that the networks were static because I would be wasting his time and mine. The networks certainly can and do change

How is the developer supposed to know that is a requirement if you don't tell him?  You're the one who is specifying how this should work and what it should do.  Not all programmers are experienced network techs

I repeatedly said that I didn't want to use single cycles and was looking to run mtr continuously.

Looking at what you asked for, this looks about right.

I assume something calls outage() when an outage is originally detected.  Then it goes into a loop, running MTR over and over in single-cycle mode dumping output to a temp file, pulling some data out to a file on each pass, and deleting the temp file.  When the outage is over (primary_check succeeds), the resulting summary goes to the

I repeatedly showed a basic description of what I needed as follows;

-Start of outage - timestamp

-Which hop is missing - obviously, the one after which ever is last reached
-Did any other hops go missing and if so, timestamp those changes
-Don't log anything else if nothing much has changed so don't use single cycle checking because that simply creates tons of useless logs.

-End of outage - timestamp

I don't know what the endpoints are here, but some of your requirements don't make sense to me.  In a large network, there are multiple paths.  So on one pass through you might go one way, and on another you may go a different way.  Nothing wrong with that.  So the idea of a hop "going missing" is a mystery to me.  If you are trying to do outage detection, then you want to see the hop where packets stop going through.  What happens after that is probably moot.  So "did any other hops go missing" doesn't make any sense to me.

Run mtr, grab some output, edit it up the way you want, and show him exactly what you want:   Given this input, I want this output.

I don't know what agreement you had with the programmer.  I don't know what you actually gave him as a design document.  I don't know how you defined how the program would be tested.  I don't know how you defined the acceptance criteria.  Hopefully you wrote all that up on your contract.
Don't let the title skew your thinking of the message in this video  https://www.youtube.com/watch?v=6h3RJhoqgK8
My specs are usually just an outline of what I am looking for.
However, I spend a great deal of time communicating with the programmer about my requirements once someone has become interested.

When I communicated with the programmer, I very specifically explained what I was after, detailing exactly what I wanted to achieve which is pretty much what I explain in my original post. It's not that complicated.

His side is that he gave me exactly what I was after but I would never have said I wanted something static. If you look at the short description alone, it never says anything about statics.

The programmer threatened to post MY code publicly, asking for opinions so I told him that he might as well just keep the money and that I not only didn't really care if he did but that it was highly immature of him to act this way, very unprofessional.  

As for why I am asking this. I already took the loss but I wanted to better understand what went wrong which is why I posted the code he gave me along with explaining my side.

I just want to avoid nonsense like this in the future.
@fmarshall; Maybe I need to use your description to explain my task next time I post it.

Going missing simply means it is no longer reachable.
Well, "going missing" is probably more a British English term but I find it useful in a case like this.  
What I meant to convey was that a hop, that HAD responded earlier, was no longer responding - thus "missing" (along with all those that follow of course).
This is not to be confused with a hop that's working but still doesn't respond in any case.  And, of course, there are those.

Whereas "no longer reachable" could connote the end destination.  It's probably a good idea to be more definite about this and the objectives related to it.
That is: I don't normally think of the hops being "reached", only the end destination.
But neither would I say that it's incorrect to say that "a hop is reached" but it could be confusing if not in context.
@Gary Patterson
You are assuming that we didn't communicate but as I said, I spent a lot of time explaining and describing what I needed and asked repeatedly if he had any questions.
He misunderstood the requirements so built what you see which even if my short description above doesn't explain it well, doesn't seem to match up with what he wrote.

And yes, we did test the code, over and over again and I kept saying that this isn't what I am looking for as a result. I kept saying that single cycle will not work and that I am looking more for a way to continuously run mtr, killing it when the outage is over, without losing its reports.

I've also been asking on this site over and over again if that can even be done. Running mtr continuously during a problem, then being able to kill it while still getting the reports.

@Scott Fell (padas). Can you just make the point. I don't have time to watch a 30 minute video to try and decipher what you mean? :)
@fmarshall; I agree that things can always be defined better but in the end, if you are talking with someone who understands what you are talking about, it should not have to be so overly defined.

I do not believe the programmer really understood what I was asking, even though I repeatedly kept asking if he understood.
The purpose of EE is to give answers and advice.  The 30 minute video cited is very much worth the time.

I've written a lot of contracts, statements of work, etc. etc.  You are describing a situation where there was a lack of understanding.  That calls for better definition so how can better and "overly" be reconciled?  It's your opinion that the other person "understands what you are talking about".  What did you do to ascertain if that was reasonably correct?

It's concerning that you "repeatedly asked if he understood".  A natural answer to this kind of question is "yes" .. period.  And that comes often enough without the person trying to be elusive.  What's important is to learn if understanding is there.  

Stephen Covey advises: "Seek first to understand and then to be understood".  He advises this because people often focus on the latter at the expense of the former.
It wasn't really a lack of understanding, more like the programmer was finding a way to justify that he did what was wanted when he knew full well that he didn't.

Either way, what I am really asking everyone is what does the code do?

Yes, it could have been better described and yes I did communicate my requirements but aside from all of that, what does the code do?

Does it even come close to what I asked for based on what I have described?
It's a good reminder to get what you want done in writing before you start.  Breaks down why things go off.  It is mainly aimed at developers/designers.

It sounds like you had a lot of communication but it was after the fact.  I think 50% of getting what you expect is for both parties to interpret the problem in the same way before you start.  It is common for both parties to hear what they want to hear.  

I agree with what is already stated, sounds like both sides are to blame.
No I communicated quite a lot before he started and during the task.
Anyhow, that isn't my question. My question is based on what I was asking for, did the programmer supply the required code?

His code seems to be overly complicated for what I wanted, let alone that it just kept sending single cycles to the log.
I've also been asking on this site over and over again if that can even be done. Running mtr continuously during a problem, then being able to kill it while still getting the reports.

MTR has two modes:  interactive mode and report mode.  In interactive mode, you run mtr interactively, and it runs continuously.  In report mode, it dumps a report after "x" number of cycles.  You can't use interactive mode in a script.  I'd need to see the actual program output to tell you much more.  Why don't you post the results of your last round of testing?
I no longer have the log but it was simply single cycle results, testing over and over again then quitting once the outage was over. No time stamps, just repeatedly logging the results of one cycle.
I could easily do that and fill my drive with logs but I asked specifically that we only log differences, not every single cycle.
I use a Windows command prompt script / .bat file which does this:

Runs and logs a traceroute one time for reference purposes.
Pings at set intervals.
If "n" sequential pings are missed, run and log a traceroute.
Loop / Go back to pinging.

So this approach logs numerous traceroute results IF the outage remains for a long period of time.
Yet, this sort of log is pretty useful and not really too voluminous.
The script could be modified to discard all but the first trace for each outage / followed by the first "good" trace thereafter for reference purposes.

This sounds like the sort of thing you want......
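
The "n sequential pings missed" rule from that loop can be sketched in shell instead of a .bat file. This is a minimal sketch under assumptions: probe results arrive on stdin, one per line, as the words `ok` or `miss`, and the function name `detect_outage` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "trace" once when $1 consecutive misses are seen.
# Assumption: stdin carries one probe result per line, either "ok" or "miss".
detect_outage() {
    threshold=$1
    misses=0
    while read -r result; do
        if [ "$result" = "miss" ]; then
            misses=$((misses + 1))
            # Fire exactly once per outage, when the threshold is first hit.
            if [ "$misses" -eq "$threshold" ]; then
                echo "trace"
            fi
        else
            misses=0   # any successful probe resets the streak
        fi
    done
}

# Possible wiring (not run here): feed it from ping's exit status, e.g.
#   while :; do
#       if ping -c 1 "$host" >/dev/null 2>&1; then echo ok; else echo miss; fi
#       sleep 5
#   done | detect_outage 3
```

Because the counter only fires when the streak first reaches the threshold, a long outage produces one trace request rather than one per ping.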
"I just want to avoid nonsense like this in the future. "

Suggestions:

1) Hire a well-regarded professional (or firm).  

2) Consider hiring an experienced analyst the next few projects to turn your spoken requirements into professional written specifications.  Watch what they do so that you can learn to develop good specs yourself.  Good specifications always have report samples as part of the report specifications.

3) Spell out the deal in a good contract.  Pay particular attention to the testing, defect resolution, and acceptance criteria that you spell out.  Make sure there is a nondisclosure clause and a confidentiality clause, and contract with someone in a country where you can enforce your agreement if they do something foolish like "publish your code".  Professionals don't pull that kind of thing - I don't know where you found this person, but I wouldn't go back there again.

4) Don't pay up front - certainly not on small jobs like this.  On larger jobs, you'll typically pay a retainer and then for certain milestones, but save a significant payment for the end.
I do have a log of existing hops which is why I'm not too worried about knowing which is missing. More importantly for me is knowing if there are changes during an outage.

When looking at networks, I never see them as static no matter what because there is always the chance that something has changed so I simply make that assumption.

In the script, all I want to do is test to see which hop I cannot reach when something is down. However, sometimes, one hop might come back and another might go down. That second part is the only reason I want to log changed hops as well as the usual traceroute showing which hop is down.

My script will already tell me when the outage occurred and when it ended, I have time stamps for that. What I don't have is time stamps showing when/if hops changed condition during the outage.

That is why mtr is such a great tool. Because interactively at least, you can practically watch as networks go up and down but non interactively, I know of no way, hence, I was hoping to find some way of coming up with that data.
@Gary Patterson

In this case, I just got this person from one of those programmer bidding sites. The kind where 99% of the programmers tell you they are experts but just like in this case, I could tell this guy was looking things up on the net while we were chatting.

I figured it was a simple job, it's not worth much, what could go wrong. I also figured since they escrow the funds, I don't have to pay until the task is completed but I decided to be nice and pay anyhow and I got taken. Not the first time on those sites, plenty of bad experiences. However, some good ones also if you can get through the baloney because there really are some great people out there and it's worth looking for them using small jobs.

I try to use these small jobs to find people who can really show their skills then I keep sending them work but in this case, this guy decided that he would play head games with me. While he can make a good argument all he wants, the bottom line is that he knows full well he didn't deliver what he was supposed to but doesn't seem to care.

After everything was said and done, I just wanted to get a little clarification by asking a real experts community what they felt in terms of requirement and solution given.

Yes, I could have explained it much better but I also felt it was a pretty small task and I really did try to express it so he understood, even using 'missing' as a term :)
If that is an accurate description of the output you got, then I'd say "no, the programmer didn't give you what you asked for."  

I'd also qualify that with - "but, what you asked for wasn't reasonable based on the constraints you provided, since you can't run mtr in interactive mode in a script."

Your programmer should have told you that.

Personally, I'd just do something similar to what fmarshall suggests.  When you detect an outage, run mtr for 60 cycles or something - that should be more than enough information to diagnose the cause of an outage.  There are hundreds - thousands of tools, many of them free, that do uptime monitoring - why not just use one of them?

Just one example: https://uptimerobot.com/

Free, checks every 5 minutes.  Notifies you various ways, and offers a simple API so you can interface with it (to get that mtr, for example).
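
Running mtr for a fixed number of cycles once an outage is detected could look roughly like the sketch below. The mtr flags are the ones already used in the programmer's `mtr_report` function; the `MTR` variable, the `outage_report` name, and the default of 60 cycles are assumptions for illustration.

```shell
#!/bin/sh
# Assumption: MTR can be overridden so the command is easy to swap or stub.
MTR=${MTR:-mtr}

# Hypothetical helper: one multi-cycle report instead of many single cycles.
outage_report() {
    host=$1
    cycles=${2:-60}   # 60 is the figure suggested above, not a tuned value
    "$MTR" --no-dns --report --report-cycles="$cycles" "$host"
}
```

Calling `outage_report smartpet.com` would then produce a single report covering 60 cycles, which trades per-second detail for a much smaller log.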
OK, say you have this during an outage:

                           My traceroute  [v0.71]
            example.lan                           Sun Mar 25 00:07:50 2007

                                       Packets                Pings
Hostname                            %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. example.lan                        0%   11   11     1    1    1      2
 2. ae-31-51.ebr1.Chicago1.Level3.n   19%    9   11     3    1    7     14
 3. ae-1.ebr2.Chicago1.Level3.net      0%   11   11     7    1    7     14
 4. ae-2.ebr2.Washington1.Level3.ne   19%    9   11    19   18   23     31
 5. ae-1.ebr1.Washington1.Level3.ne   28%    8   11    22   18   24     30
*** bad line here

Then a minute later, you have this

                           My traceroute  [v0.71]
            example.lan                           Sun Mar 25 00:07:50 2007

                                       Packets                Pings
Hostname                            %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. example.lan                        0%   11   11     1    1    1      2
 2. ae-31-51.ebr1.Chicago1.Level3.n   19%    9   11     3    1    7     14
 3. ae-1.ebr2.Chicago1.Level3.net      0%   11   11     7    1    7     14
 4. ae-2.ebr2.Washington1.Level3.ne   19%    9   11    19   18   23     31
 5. alt-1.ebr1.Washington1.Level3.ne   28%    8   11    22   18   24     30
***bad line here

Then this:

                           My traceroute  [v0.71]
            example.lan                           Sun Mar 25 00:07:50 2007

                                       Packets                Pings
Hostname                            %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. example.lan                        0%   11   11     1    1    1      2
 2. ae-31-51.ebr1.Chicago1.Level3.n   19%    9   11     3    1    7     14
 3. ae-1.ebr2.Chicago1.Level3.net      0%   11   11     7    1    7     14
 4. ae-2.ebr2.Washington1.Level3.ne   19%    9   11    19   18   23     31
***bad line here

Then this

                           My traceroute  [v0.71]
            example.lan                           Sun Mar 25 00:07:50 2007

                                       Packets                Pings
Hostname                            %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. example.lan                        0%   11   11     1    1    1      2
 2. ae-31-51.ebr1.Chicago1.Level3.n   19%    9   11     3    1    7     14
 3. ae-1.ebr2.Chicago1.Level3.net      0%   11   11     7    1    7     14
 4. ae-2.ebr2.Washington1.Level3.ne   19%    9   11    19   18   23     31
 5. ae-1.ebr1.Washington1.Level3.ne   28%    8   11    22   18   24     30
 6. ge-3-0-0-53.gar1.Washington1.Le    0%   11   11    18   18   20     36
 *** bad line here

How should the report look?
I also kept telling the programmer that I was counting on him to ask me the proper questions if he was truly an mtr expert so there was an amount of trust I had to put in that person.

Yes, there are countless tools out there and we use many of the larger packages. I simply need a smaller version which is my own, where I can easily customize and add as I need.

Since mtr obviously doesn't do what I need, then I am still going to be hunting for something that does. Again, it will amount to a script but perhaps it will use traceroute or something else instead but achieve the same results I am after.
This looks like a new specification in contrast to what's been said heretofore:
In the script, all I want to do is test to see which hop I cannot reach when something is down. However, sometimes, one hop might come back and another might go down. That second part is the only reason I want to log changed hops as well as the usual traceroute showing which hop is down.
So, the flow logic of this is:
Trace for path reference
LOOP Test for outage
   IF an outage
   THEN trace and log
      IF trace path is different than reference path
      THEN log path difference
           update path reference
      END IF
   END IF
LOOP

Is that it?
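
The "log only when the path differs from the reference" step of that flow might be sketched like this in shell. It assumes the traced path has already been reduced to hop names only, since timing and loss columns would differ on every run and defeat the comparison; the function name `log_if_changed` and the file arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append a timestamped entry only when the path changed.
# Assumption: $2 contains hop names only, one per line.
log_if_changed() {
    reference=$1   # file holding the last logged path
    current=$2     # file holding the path just traced
    log=$3         # log file to append to

    # Same path as the reference: log nothing.
    if [ -f "$reference" ] && cmp -s "$reference" "$current"; then
        return 0
    fi

    {
        date '+%F %T'
        cat "$current"
        echo
    } >> "$log"

    # The path just logged becomes the new reference.
    cp "$current" "$reference"
}
```

Called once per loop pass during an outage, this writes an entry for the first trace and then only for passes where the path differs from the one last logged.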
@Gary Patterson; great example.

You have four tests and exactly as I've described, hops have changed. The logging I am wanting to accomplish would look like this;

outage - timestamp

timestamp - 5. ae-1.ebr1.Washington1.Level3.ne (last hop reachable)
or timestamp - hop 6 is down

timestamp - 4. ae-2.ebr2.Washington1.Level3.ne (last hop reachable)
or timestamp - hop 5 is down

timestamp - 6. ge-3-0-0-53.gar1.Washington1.Le (last hop reachable)
or timestamp hop 7 is down

end of outage - timestamp

My quest is to find out which hop was down. When it came back would be the 'end of outage' timestamp. If another hop changed condition, then I would have that timestamp. Etc.

Unfortunately, I have to rush out the door for an hour or two but I hope I got this right.
Yeah, I did that on purpose since it is the complex case - path changed.

OK - I understand what you want.  I don't think the report you've described is really very useful, since the "last hop responding" is only really meaningful in the context of how you got there.  As a network guy, or as your ISP, I'd want to see the whole trace, up to the last hop responding.  So I'd recommend this:

OUTAGE: timestamp

Last hop responding: Sun Mar 25 00:07:50 2007

                                       Packets                Pings
Hostname                            %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. example.lan                        0%   11   11     1    1    1      2
 2. ae-31-51.ebr1.Chicago1.Level3.n   19%    9   11     3    1    7     14
 3. ae-1.ebr2.Chicago1.Level3.net      0%   11   11     7    1    7     14
 4. ae-2.ebr2.Washington1.Level3.ne   19%    9   11    19   18   23     31
 5. ae-1.ebr1.Washington1.Level3.ne   28%    8   11    22   18   24     30

Last hop responding: Sun Mar 25 00:07:51 2007

                                       Packets                Pings
Hostname                            %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. example.lan                        0%   11   11     1    1    1      2
 2. ae-31-51.ebr1.Chicago1.Level3.n   19%    9   11     3    1    7     14
 3. ae-1.ebr2.Chicago1.Level3.net      0%   11   11     7    1    7     14
 4. ae-2.ebr2.Washington1.Level3.ne   19%    9   11    19   18   23     31
 5. alt-1.ebr1.Washington1.Level3.ne   28%    8   11    22   18   24     30

Last hop responding: Sun Mar 25 00:07:52 2007

                                       Packets                Pings
Hostname                            %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. example.lan                        0%   11   11     1    1    1      2
 2. ae-31-51.ebr1.Chicago1.Level3.n   19%    9   11     3    1    7     14
 3. ae-1.ebr2.Chicago1.Level3.net      0%   11   11     7    1    7     14
 4. ae-2.ebr2.Washington1.Level3.ne   19%    9   11    19   18   23     31

Last hop responding:  Sun Mar 25 00:07:50 2007

                                       Packets                Pings
Hostname                            %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. example.lan                        0%   11   11     1    1    1      2
 2. ae-31-51.ebr1.Chicago1.Level3.n   19%    9   11     3    1    7     14
 3. ae-1.ebr2.Chicago1.Level3.net      0%   11   11     7    1    7     14
 4. ae-2.ebr2.Washington1.Level3.ne   19%    9   11    19   18   23     31
 5. ae-1.ebr1.Washington1.Level3.ne   28%    8   11    22   18   24     30
 6. ge-3-0-0-53.gar1.Washington1.Le    0%   11   11    18   18   20     36

OUTAGE ENDED - TIMESTAMP

Of course you can trim headings and such to make it more compact, but I like them in there.  After all, how many outages do you have?
Many many many outages when new circuits are put in. Can't tell you why but that's partly why I want my own tool to figure it out.

@fmarshall; Not sure what you mean about a spec change. I would have no reason to change the spec but it's possible I'm trying to describe it better and maybe that makes it sound different?
In the script, all I want to do is test to see which hop I cannot reach when something is down.
I took this to be the original specification.

However, sometimes, one hop might come back and another might go down. That second part is the only reason I want to log changed hops as well as the usual traceroute showing which hop is down.
I took this as a new requirement, thus a change (add) to the specification.  As far as implementation is concerned this addition is a big deal.  

One really needs to try to be a literalist when writing specifications.  Phrases like: "obvious to one skilled in the art" can be used at cocktail parties but not in specifications.  If you *can* say it better then that means there's new and important information.  If added relative to an established specification then it's a change.  I think the quotes above are a great example of this.

Respectfully, If I might be so bold:  It seems that you may be removing yourself a bit from the issue.  *You* know what you want in some ethereal sense but may not have explained it to others very well.  That's not uncommon nor a criticism.  Conveying good specifications can be hard work.  (I have found that writing clearly and concisely about something you know a LOT about is harder than writing about something for which you have fewer things floating about in your brain as you write.  Assumptions and context are part of that difficulty.) My point here is that your contractors and the EE experts aren't the only responsible players.  You are one of those responsible players as well.  We or they can't be the only ones responsible for the end result.  So, if folks don't understand the specification then it's not axiomatic that they are at fault....

Why have you not responded to the questions that we've asked above?  They were intended to gain understanding.
Many many many outages when new circuits are put in. Can't tell you why but that's partly why I want my own tool to figure it out.

That's certainly a problem.  You ought to open a question about that - that's not normal in a well-managed environment.
I agree but it is actually a development environment so it's normal.

Still, the code I need is to do exactly what you posted as an example and my reply.
It doesn't have to be mtr, I just hoped that there might be a creative way of using mtr in a script which would allow for this kind of output. Running single cycles generates more logs than I can use and I need only what the examples you and I just did as output.

Thanks to that example, I can now better show what I am looking for. Based on the outage you showed, this is the output I would like to end up with using a script stripping everything else away.

id1 timestamp - outage
id2 timestamp - 5. ae-1.ebr1.Washington1.Level3.ne (last hop reachable) or hop 6 down
id3 timestamp - 4. ae-2.ebr2.Washington1.Level3.ne (last hop reachable) or hop 5 down
id4 timestamp - 6. ge-3-0-0-53.gar1.Washington1.Le (last hop reachable) or hop 7 down
id5 timestamp -end of outage

If only one hop went down, then it would look like;

id1 timestamp - outage
id2 timestamp - 4. ae-2.ebr2.Washington1.Level3.ne (last hop reachable) or hop 5 down
id3 timestamp -end of outage
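
Reducing one trace to a line in that format could be sketched as follows. This assumes the input looks like the sample reports earlier in the thread, where hop rows start with a number and a dot and unreachable hops simply do not appear; real mtr output may list unanswered hops differently, so the pattern would need adjusting. The function name `last_hop_reachable` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a trace on stdin, print the last responding hop.
# Assumption: hop rows look like " 5. host  28%  8  11 ..." as in the samples.
last_hop_reachable() {
    awk '
        # Remember the number and name of every hop row seen.
        $1 ~ /^[0-9]+\.$/ { n = $1 + 0; host = $2 }
        END {
            if (n > 0)
                printf "%d. %s (last hop reachable) or hop %d down\n", n, host, n + 1
        }
    '
}
```

A log line could then be built with something like `echo "$(date '+%F %T') - $(last_hop_reachable < report.txt)"`, where `report.txt` is a placeholder for wherever the trace was saved.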
So what about high packet loss as opposed to a completely unreachable hop?  

A few % packet loss is all it takes to interfere with some applications, and high packet loss will make most applications inaccessible, resulting in a "down" condition from the perspective of web site visitors.

And seeing packet loss on a "middle-of-the-list" server isn't very useful as a diagnostic tool without seeing the packet loss above it, too.
I already have that being logged. Mind you, that's also why I was hoping to use mtr, because of the way it does both ping and traceroute simultaneously.

Basically, I was trying to find a way to write my own version of mtr, one which can be killed as a process but still spit out its log. That's why when I posted jobs looking for help, I always said mtr EXPERT :)
I will open yet one more question based on ID: 40220034.
However, I would like to close this one.

Based on what you all have read, what is the consensus? Did the programmer provide the correct code to do anything even remotely close to what I was asking for?
ASKER CERTIFIED SOLUTION
Gary Patterson, CISSP
I think that after going back and forth, considering the things I might have done wrong as well, the answer really is what I picked. The fact is that the programmer should not have pretended to understand what I needed and should have asked a lot more questions because I would have been very happy to answer everything I possibly could.

Lesson learned as you said.

Thanks.
I am going to post my question one more time, this time, not using mtr but any other means to accomplish what we have described in this question.