Solved

Find random points from a list out side a range

Posted on 2011-09-13
10
233 Views
Last Modified: 2012-05-12
Hi
I’m trying to find the random point from a set of Longitude Latitude coordinates so I can create a co-ordinate string from the cluster

The Longitude values can be between -180 + 180  and Latitude between -90 and +90 degrees
each point needs to be with +or- 10 degrees of the last point

The code I’ve come up with works unless the first point is the one that is wrong causing every other point to be in error and  $co-ord-String ony equels the firsst point.

The code is copied from a standalone so ignore obvious typos
my $co-ord-String;
For my $pNum (sort keys %{$hash{$key}}
  {
     my @points = (sort{$a<=> $b} keys %{$hash{$key}));
     my ($long_H,$long_L,$Lat_H,$Lat_L);
     Foreach my $pt (@points)
       {
          my $long = $hash{$key}{$pNum}{Long};
          my $lat = $hash{$key}{$pNum}{Lat};

           if(($long_H eq "") and ($long_L eq ""))
              {
                 #set higher & lower linits
                 $long_H = $long + 10;
                 $long_L = $long – 10;

                    Repeat for $Lat
                }
             else
                {
                 # check if $long is within 10 of last piont
                  if($long >=$long_L && $long <= $long_H)
                    {     
                      $long_H = $long + 10;
                       $long_L = $long – 10;
                      }
                   else
                     {
                         # long is in error
                         $long = ‘E’;
                      }
                    Repeat for $Lat

                 }
     
      Unless (($long eq ‘E’) or ($lat = ‘E’)
         {
            $co-ord-String = $co-ord-String .  $long . " " . $lat . ",";
          }
      }
  }
$co-ord-String =~ s/,$//g

Open in new window

0
Comment
Question by:trevor1940
  • 5
  • 5
10 Comments
 
LVL 9

Accepted Solution

by:
oheil earned 500 total points
ID: 36534964
I have to put everything into code because it refuses to accept the i in paranthesis:

What you are checking is:

Starting with the second point:
p[i+1] >= p[i]-10  &&  p[i+1] <= && p[i]+10 
That is, the next point must be in a range of +-10 of the point before.
If this is true, p[i+1] is not an error.

The problem is clear with the first point being the real error.

A solution would be:

Starting with the first point:
Check for
p[i+1]+10 > p[i] && p[i+1]-10 < p[i]

In this case the check logic is something else. Imagin the second point in the middle of the range:
i                 o
i+1  |--------o--------|
From the left to right every position is allowed.
If the position is the most left one which is p[i]-10 = p[i+1]
than p[i] not to be an error 
p[i] < p[i+1]+10 must be true.
And if the position is the most right one which is p[i]+10 = p[i+1]
than p[i] not to be an error 
p[i] > p[i+1]-10 must be true.

All together:
p[i] is not an error if
p[i+1]+10 > p[i] && p[i+1]-10 < p[i]
is true.

But thats still not all, because now the second last point can be an error, but the real error is the 
last one. A new problem.

Solution:
For all points except the first one and the last one, check both conditions:

1 is the index of the first point, m is the index of the last point, k is the current point to check:

for 1<k<m
conditions:

( p[k] <= p[k-1]+10 and p[k] >= p[k-1]-10 )
or
( p[k] < p[k+1]+10 and p[k] > p[k+1]-10 )

must be true, for p[k] not to be an error.

for k=1 (the first point):
( p[k] < p[k+1]+10 and p[k] > p[k+1]-10 )
must be true, for p[k] not to be an error.

for k=m  (the last point):
( p[k] <= p[k-1]+10 and p[k] >= p[k-1]-10 )
must be true, for p[k] not to be an error.

Implementing it now should be straight forward.

There is still a problem: say some point in the middle is an error in comparison
to the points before but not in comparison to the points after it. Which means you
have to lists where all points would be ok if the two lists would be separated.
Checking like described will not find an error. But this condition is not defined
in your explanation. Its an undefined condition of the list of points in your
scenario. Add another condition, called "break of lists" = b.
p[k] is of type b, if 
( p[k] <= p[k-1]+10 and p[k] >= p[k-1]-10 )
XOR
( p[k] < p[k+1]+10 and p[k] > p[k+1]-10 )
is true.

Regards,

Oli

PS: long explanation but I developed it during writing :-)

Short version: check in both directions for each points, for the first point check against next one only, for the last point check against the second last only. Scenarion "break of list" maybe unpossible in your case or use the XOR rule to find out.

Open in new window










0
 
LVL 9

Assisted Solution

by:oheil
oheil earned 500 total points
ID: 36534967
Sorry for the bad layout of my answer.
0
 
LVL 1

Author Comment

by:trevor1940
ID: 36538411
Thats kind of it but I've expanded a little

so when testing the first point I'm testing it agents the last and second and the last point against first & second last

although i am testing the points in numerical order they may not be nearest to each other if plotted on a map but this is dealt with later while inserting into my database

BUT

say point 3 is in error 2 & 4 are ok then all 3 points are erroring because I'm testing

if (last point +- 10 of Point && Next point +- 10 of Point) on both the x and y axis (Long and Latitude) rather like a 10 degree circle around each point both it's neighbors are within it's circle.

So each point can be in error 0 or 1 time but not 2

so some how i need to test this

0
 
LVL 9

Assisted Solution

by:oheil
oheil earned 500 total points
ID: 36541185
so when testing the first point I'm testing it agents the last and second and the last point against first & second last

No, I don't think so. You test the first point (in the sorted list) againts its successor, the last one against its predecessor. All others are tested against its predecessor and successor.

although i am testing the points in numerical order they may not be nearest to each other if plotted on a map but this is dealt with later while inserting into my database

Totally clear because you have longitude and latitude.

say point 3 is in error 2 & 4 are ok then all 3 points are erroring because I'm testing

Why is that? Or is this a new definition of yours now?
This would be my suggested test with k = 3:
for 1<k<m
conditions:

( p[k] <= p[k-1]+10 and p[k] >= p[k-1]-10 )
or
( p[k] < p[k+1]+10 and p[k] > p[k+1]-10 )

must be true, for p[k] not to be an error.

Open in new window

For k=3 this results in false, because k-1=2 and k+1=4 are outside the +-10 range, 0 or 0 = 0.
For k=2 and 4 this test would result in true, because of the OR logic:
For k=2 it is false for k+1 (because 3 is an error) but true for k-1, together 1 or 0 = 1.
For k=4 it is false for k-1 (because 3 is an error) but true for k+1, together 0 or 1 = 1.

if (last point +- 10 of Point && Next point +- 10 of Point) on both the x and y axis (Long and Latitude) rather like a 10 degree circle around each point both it's neighbors are within it's circle.

No, thats not my suggestion, my logic would be:
if( point +- 10 of last point || point +- 10 of next point ) ... at least one neighbor is within its circle.
Mind the OR.

ok?

Oli




0
 
LVL 1

Author Comment

by:trevor1940
ID: 36541571
Oli thanx
May be i need to explain the situation a little more

If for instance you had a square you need a minimum of 4 points to make a square

Point   co-ords X,Y
1      1,1
2      1,2
3      2,1
4      1,2
5      1,1

(You actually need 5 because the first and last points need to be the same to close the square)

In my data I’ve maybe 50 unique points at a seemingly random pattern and order within that same square they are numbered 1 to 50  but point 5 may not be adjacent geographically to points 4 or 6 hence my 10 degree error margin (The UK is about 6.5 degrees at it widest point)  some of these points are not within this square. That’s why I need to test all points in both directions (as they are numbered) ie point 1 is with a 10d radius of both point 2 and 50 thus reducing the probability of point 1 being in error.

If I could I’d test every point against every other point but that’s not going to happen


0
What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

 
LVL 9

Assisted Solution

by:oheil
oheil earned 500 total points
ID: 36541650
I still miss the point it seems.

You want to test if points are inside a given square?
Or is the square somehow spanned by the points in a beforehand unknown manner and you like to eliminate outliers?
Actually a square is mathematically defined by two points.

Maybe it would help for my comprehension, if you give an example of, say, 5 points and the exact question you have regarding these points.

Some more ideas:
Maybe you can calculate the centroid as an average of all points and check the distance of each point to the centroid.

Or you check for outliers statistically, e.g. by median absolute deviation larger than some value, (search for "median absolute deviation").

Oli


0
 
LVL 1

Author Comment

by:trevor1940
ID: 36541733
I used a square to illustrate the issue

Maybe if you look at http://en.wikipedia.org/wiki/Convex_hull it may help you understand my problem.

My database is PostGis it has a Convex_hull it creates a polygon from a cluster of points like in the diagram on the wiki page.

In my data set some of these points are in error creating spikes in the polygon that covers half the world unfortunately there isn't a postGis command that identifies these points so I’m attempting it in Perl first

I appreciate your help on this as it is making me focus on the problem so I can explain it.


0
 
LVL 9

Assisted Solution

by:oheil
oheil earned 500 total points
ID: 36541783
That helps a lot in understanding the problem.

If some points produce spikes than not the points are "in" error but the postGis (never heard) command Convex_hull has a bug and does not produce the convex hull as defined.

I try to draw an example with one "error" point:
points 1-4, no error point, the convex hull is trivial a polygon through all points

                   +               +
+                                                   +

the same with an additional "error" point called e, the others 1,2,3,4 from left to right:

                            +



                   +               +
+                                                   +

The convex hull would be a triangle defined by the first 1, the last 4 and the new "error" point e.
The two other points (2,3) are inside the convex hull.

Open in new window


There should never be a spike.

The lacking of an command in postGis is propably because there is no clear definition of an error point.

Oli
0
 
LVL 1

Author Comment

by:trevor1940
ID: 36545142
Oli

FYI postGis  is a bolt on to PostGresSQL giving it Geo (maps) analytical capabilities

using a insert command with Convex_hull will create what is known as a Geo blob of all its given points usually a polygon or if all points are straight a line.

unfortunately you can't tell it to ignore the spikes.

i think I've come up with a solution

loop through each point

if the point is within the last point +-10d on both Long & Lat axis add 1 to $count

repeat for it's next point

if the $count >=1 it must be good

the only issue is if two wrong points are together (points 3 & 4) they both have a count of 1 so pass the test

I need to investigate my data to see if this actually occurs or not.

thanx
0
 
LVL 1

Author Closing Comment

by:trevor1940
ID: 36567696
Oli
thanx for your hellp not solved the issue yet but ta any way
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

There are many situations when we need to display the data in sorted order. For example: Student details by name or by rank or by total marks etc. If you are working on data driven based projects then you will use sorting techniques very frequently.…
In the distant past (last year) I hacked together a little toy that would allow a couple of Manager types to query, preview, and extract data from a number of MongoDB instances, to their tool of choice: Excel (http://dilbert.com/strips/comic/2007-08…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
In this tutorial you'll learn about bandwidth monitoring with flows and packet sniffing with our network monitoring solution PRTG Network Monitor (https://www.paessler.com/prtg). If you're interested in additional methods for monitoring bandwidt…

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now