Link to home
Start Free TrialLog in
Avatar of justearth
justearthFlag for United States of America

asked on

Procedure / Script / Macro for Creation of Lists from Data Matrix

Hello,
I have data contained in square matrix. The data is the distance in meters between two locations.  I'd like to create/output lists from this data matrix based on the following scheme:

- All of the locations (the numbers in the matrix) are ranked based on "Priority Level", the .1, .2 or .3 following the number indicates priority level. Any selection between any points will need to have a distance greater than 16100.  
- I would like the lists to be created "randomly" I say I have placed randomly in quotes because I want a preference given to priority level  (i.e. 1>2>3 ).
- I don't know if this is possible, but I would like to start out by "randomly" choosing a priority 1 location and then adding as many priority 1 locations, then priority 2, and finally priority 3 locations, given that all points are greater than 16100 apart, for like 50 runs.

I don't care how the data is stored ( text file, excel, csv, etc )

Does this sound plausible to do in VB macro in excel, python or VBscript. I'd be greatly appreciative of any annotated code so that I might learn what you did.

Thank you kindly,
JE
N-x-N-2.xls
SOLUTION
Avatar of riteshparakh
riteshparakh
Flag of India image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of justearth

ASKER

I don't know how to write a script.

For example, Locations 1 - 23 are priority one. Randomly picking location "Four" would yield  the locations of 7.1, 100.3, 105.3 and 106.3 unsuitable because they are closer than 16100 meters from location 4.

I guess I was thinking something like this:

 I know I can't use all entries because not all of the locations are greater than 16100 meters apart from each other.  I know there is more than one solution to the problem of "creating a list of locations that are greater 16100 meters apart". For my future analysis the locations with the highest priority are most important to me and want to include as many of these first, then followed by the other priorities. I thought randnomness would allow me to 'honestly' create many unique solutions. I am open to other ideas about how to complete this task of location selection based on the greater than 16100 rule.

Thanks,
JE




I still don't understand sufficiently to provide any help. Is there any significance to the data being stored in a matrix or is this really just a list of data?

How does the word "Four" relate to the above? Is it just another way of saying 4 or is the name significant.

Please show us a subset of the matrix.
Sorry for the lack of clarity. I'll try to explain more.

No the data does not have to be in a matrix. I have a attached another spreadsheet. Which show the distances with no duplication between each location. This and the previous are the only two formats I have for this data set. It is a result of the program used to calculate these distances.

"Four" = 4. Sorry for the confusion. Just another way of saying it.

I hope this helps you help me,
Thanks again,
JE
1x1-axb.xls
Location-Priority-Number.xls
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ramrom:

>>I am embarrassed to report that I failed to notice the attachment in your question. For some weird reason I tend to skip over some details. This has led as you may notice to extra frustration on my part.

No worries.  :)

>>To answer part of your question - "Does this sound plausible to do in VB macro in excel, python or VBscript" - Yes.

Good.

>>I'm still not 100% clear on what you want. "Randomly picking location "Four" would yield  the locations of 7.1, 100.3, 105.3 and 106.3". By what magic?

What I did to arrive at these values was that I looked at the matrix and all the distance from location 4. There are only four other locations that are closer to 16,100 meters. Which I wrote above. The magic was just me reading the data matrix.

>>Let's choose location 1 as the starting point. You would take row 1 in the matrix, examine each of the other 225 columns, extract those which are > 16100 into a list.

This sounds good, I think.

>>Maybe I'm missing something, but doesn't that just amount to examining each cell in the matrix, picking those > 16100 and putting them in a list along with their coordinates? That is a total of 25312 comparisons, takes almost no compute time to rearrange the entire matrix.

>>If that is all you want that is trivial. Why all the concern about random and 50 runs?

Perhaps, I over thought this question and failed to recognize how the data format (the matrix) might allow for such a parsimonious and sophomoric solution.  In my thinking there are numerous solutions. Given that I wanted to do "50 runs" and the randomness and hence 'fairly' choose the possible solutions.  When I read about your solution for outputting all the >16,100 for each row, it seems reasonable. But does this maximize the locations in an outputted list based on priority levels. Again, based on the priority ranking 1>2>3 do the solutions for each row, that you proposed above, account for this? If so then I would appreciate a method to compute your proposed solution. I am missing something, will the row by row analysis provided all the possible solutions?

Thank you kindly,
JE  
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I am doing a spatial analysis on the area within the 16,100 m radius around a given location.  I will be comparing spatial attributes among or between the locations.   I'd like to use all points, but I can't if they are not spatially independent (i.e. <16,100 M, occupying the same space on the earth) because this would interfere with future statistical analyses. The priority levels are based on importance to numerous goals and I would therefore like to include the higher priorities preferentially.


Thanks again,
Cheers,
JE
That is a good overview, but not specific enough for me to create a script.

What do you want the output of the script to be?
Hello,

I thought the output would be a list(s) of the locations that are greater than 16,100 m apart from each other.

A list such as a text file, or a new sheet in the workbook would work for me. I am not picky about the output  format.


HTH,
Thanks,
JE
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
In the list I envision, each iteration would have no repeats and the list would contain the entire "set" of numbers for that run. Does that make sense? Like a string of numbers:

1.1, 2.1, 3.1, .... 42.2, 43.2, ... 100.3, etc  

or


1.1
2.1
3.1
...
42.2
43.2
...
100.3


Thanks,
JE  
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I am sorry for all the circles and confusion. I truly appreciate your effort and patience. Sorry for introduction of new ill-defined jargon. Since I am unsure about how this script will run or what you think the output  might be, I guess I was unclear about the inputs and outputs. Perhaps all of this should be considered an answer to a question and I should ask a new and more clear one.

Input:
This is the data matrix I provided right?

Output:
I guess iteration and run were actually describing the same thing. A run/iteration would be each unique combination of locations: a list.  I thought there was going to be more than one list. So each list would be unique (right?) and therefore more than one list equals more than one "run / iteration".

I'd appreciate the script you think I want because even if its not perfect it'll be much much closer than I am now.

Thank you kindly,
JE
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
>>A pair of points has 2 priority levels. That suggests that pairs with levels (1,1) are first and (3,3) last.

Yes, priority one are first and then down the list.

>>I do not understand "each unique combination of locations". What determines what goes into a list?

Each list will be the locations that are greater than 16,100 m apart from each other.

- first the script would look at location 1 and location 2, and it would see that that is fine (the distance is greater than 16,100).
-Then it would look at distances between location 1 and 3, and 2 and 3. Here 1 and 3 are fine, but 2 and 3 are not (they are closer than 16,100) since we already chose 1:2 we will not select location 3.  
-Then we examine  1:4 and 2:4. Both of these can be included as they are greater than 16,100 m.           - Moving on to the next location we would examine 1:5, 2:5 and 4:5 all are greater than 16,100 m and can be included.
-Then we would look at the distance between 1:6, 2:6, 4:6 and 5:6. The distances between 2:6 and 5:6 will not work (<16,100), so will not included location 6.

From above we would end up with locations 1, 2, 4, and 5. After looking at locations 1 - 6.  There could have been more than one solution though, for example if I chose to use location 3 instead of 2, and then subsequent selections would also then be affected.  I guess, because of these different possible outcomes is why I was thinking "random", "unique" and "50 runs" above.  And then when there is a choice between priorities I would want to preferentially choose ( 1>2>3)

Do you see a solution to this question as described above?  If this can't be answered here in entirety, could you give me suggestions on future questions to ask so that I might assemble this script through multiple questions / answers / partial scripts?

Thank you kindly,
JE
Thanks for the help on the confusing question on my part. I think I know how to ask a new better one now.