Link to home
Start Free TrialLog in
Avatar of krakatoa
krakatoaFlag for United Kingdom of Great Britain and Northern Ireland

asked on

Best container for a pair of values

I want to store pairs of values (two ints actually) in something that can be searched for any of those pairs. So, for e.g., say the list is 12,567; 1400,925; 0,47; then how to store them, and what would a search look like when looking for 1400,925? I'm having a bit of a blank-out trying to figure out what the best option here is. Any ideas ? Thanks if so.
Avatar of Tomas Helgi Johannsson
Tomas Helgi Johannsson
Flag of Iceland image

Hi,
There are many ways to do this.
Simple key-value-pair storage would satisfy this.
A Key-Value-Pair class object, such as HashMap, would be a good choice.
An example of Map is
Map<Integer, Integer> map = new HashMap<Integer, Integer>();
map.put(1, 10);
map.put(2, 20);
map.put(3, 30);

Open in new window

Then to store it permanently to disk and retrieve it back, you issue Serialize/Deserialize methods.

Retrieving a value by key would look like this
int myvalue = map.get(2)

Open in new window

this example would return 20 in myvalue.
Then
map.ContainsKey(2) 

Open in new window

and 
map.ContainsValue(30)

Open in new window

are methods to search for keys or values in the Map.

https://www.java8net.com/2020/03/serialize-hashmap-in-java.html
https://beginnersbook.com/2014/06/difference-between-hashmap-and-hashtable/
https://www.codejava.net/java-core/collections/java-map-collection-tutorial-and-examples
https://www.geeksforgeeks.org/hashmap-get-method-in-java/?ref=lbp
https://www.geeksforgeeks.org/hashmap-containsvalue-method-in-java/?ref=lbp

Regards,
   Tomas Helgi

What is the possible range of each value?
That will let you calculate the possible number of values.
How do want to prioritize storage space vs search speed?

You might want to make a binary index of all possible values, a collisionless hash table.
Avatar of krakatoa

ASKER

Tomas
Yes . . . I had considered that, but I'm not sure I'm so happy treating a key as one of the values in the pair. It might cause a lot of collisions. But suggest away if you think there's no issue.

@ d-glitch
range in the order of zero to couple of thousand. I'll have to talk about the storage a bit later maybe, but let's say I'm dealing with several million pairs . . . I want, as you hint at, the speed to be as fast as possible. The idea being that any given pair is only used once, so a binary search of the set seems a good idea.
Hi,

When you want to store and work with millions and more of key-value pairs, then Apache Spark may also be a good choice, as it is a lightning-fast environment.
https://spark.apache.org/
https://intellipaat.com/blog/tutorial/spark-tutorial/working-with-keyvalue-pairs/

Also using HashMaps you could store the frequency of a given value in a key by simply creating a hashmap like this
Map<Integer,IntegerA> map = new HashMap<Integer,Integer>; // storage of value and frequency
Map<Integer,HashMap> keymap = new HashMap<Integer, HashMap>

Open in new window

and add them together something like
keymap.put(value,freq)
map.put(key, keymap)

Open in new window


Or even if keys have more than one value then you store the values as ArrayLists in the Value Section.
Map<Integer,ArrayList> keymap = new HashMap<Integer, ArrayList>

Open in new window

Note that, you would need to override the ContainsValue method to search in the ArrayList for a specific value.
Regards,
    Tomas Helgi
A binary search is SLOW, compared to a hash table or a complete index.

IF (n, m) exists  THEN  index(n, m) = TRUE

You have to build the table, but searches become trivial and fast.

Hash tables have similar performance with a different trade-off between storage space and overhead.
The problem with the HashMap is that you can't store both (12,567) and (12,43), as 12 is the key and only the last pair would remain in the map.

Ideally, you should try and keep it simple and clean, create a class that stores your 2 ints, implement the "equals" and "hashcode" methods and then store objects of this class in a Set<> (since you said a given pair is only used once).

If you find that you really DO need to optimise this, only then would start to look into it further. In this case, and given the range constraint that you mention, I would pack the pair of ints into one int...

int packed = x * 10000 + y;       // where x & y < 10000

And then you can store these this packed int in a simple HashSet<Integer>. The hash based get/put should work reasonably well here. But note, that you can't store primitive ints in a Set so they will get wrapped in Integer objects, which adds more to the memory requirements.

Further optimisations may be possible but it would depend on your specifics... you mention that you want "the speed to be as fast as possible", but are you talking about the speed to create this data structure or to search it or what? Also, it depends on the dynamics of how you are using this... are you building this data structure once and then searching it multiple times (in which case you could look at storing these packed ints in an array, sorting them and then using a binary search), or are you searching as you are building it/making a number of modifications to the data which would make the constant sorting an expensive operation. If you can give more detail here, I could be more specific.

There's a lot to think about there from your three latest comments. I'm really appreciative of your contributions, and want to measure up in my response to them before blurting out something naive here. Be back.
Maybe it's better if I say what I'm aiming at, which is to produce random x and y coordinates for a Canvas on which a single pixel will be painted. I don't want the spot to be overdrawn with another pixel, so my reckoning was to store the randomly generated coords somewhere that allows checking as to whether the location is already taken, and ignore a subsequent paint operation if it is. Obviously the locations can be generated and checked before any drawing takes place if this is faster than painting them in inline code.

I need to store the coords because the requirement thereafter will be to obliterate each pixel in turn, once all the required number of pixels have been painted. I don't want the pixels painted serially (making it look like it's happening in straight lines ). . . they should be painted and erased randomly using the set of coords. Make sense ?

Maybe it's better if I say what I'm aiming at

Always ;)
...which is to produce random x and y coordinates for a Canvas on which a single pixel will be painted.
What are the dimensions of the canvas?
If every time you want to look something up you are using both values, why not a new class that has two data values in it as member variables. Then just use a standard hash set to store those. Assuming you have defined the equals method of that new object as something unique with those two values. I am assuming that you wanted to look up by the combination of the values.
I would use
TreeMap<Integer,TreeSet<Integer>>

Open in new window

for the data structure.
ASKER CERTIFIED SOLUTION
Avatar of mccarl
mccarl
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Please clarify your requirements.  As quoted by mccarl you posted
obliterate each pixel in turn
 Which led mccarl to post
meaning that you need to also maintain the order that you generate the pairs
You also posted
they should be painted and erased randomly using the set of coords
Which would lead us to believe that order does not matter.
Which would lead us to believe that order does not matter. 

Correct.

@CEHJ
What are the dimensions of the canvas? 

new Dimension(1200,700)
The first thing that springs to mind is that if you have a matrix of w=1200 by h=700 is to have two List instances of length w*h. You can randomly remove from one, and record what was removed in the other. You can derive values for w and h arithmetically.

I'm not sure if that accords well with your problem domain though and i haven't thought it through in a big way ;)
This is what I have so far. Pls ignore the Control class. Forget the Listeners for now also.
This code paints pixels on the canvas randomly. My aim is to do that but without overwriting one pixel with another due to randomly hitting the same x y coords. Once the image is rendered, I want to go back over the pixels *randomly* across the complete set, and reset them (delete iow) back to the background color.


import java.io.*;
import java.util.*;
import javax.swing.*;
import java.awt.*;
import java.awt.event.*;
import javax.swing.plaf.metal.*;


 class Identifier_GUI implements MouseListener, MouseMotionListener, ComponentListener{


 protected JFrame mainFrame;
 private Identifier_GUI idg;
 private MCanvas canvas;
 
 private Control control;
 
 private int xcoord;
 private int ycoord;
 
 private volatile boolean exited;
 
 private final int ULIMIT;
 
 private Entity[] entA = new Entity[100000];
 private Entity ent;

    public static void main(String[] args){

        new Identifier_GUI();
        
    }
    
    private class MCanvas extends Canvas{
    
        public void paint(Graphics g){ 
    
            if(!exited){return;}
            if(exited){return;}
                   
            Graphics2D g2d = (Graphics2D)g;
  
            g2d.setColor(Color.RED);
            int x = 125; int y = 132;
            g2d.drawArc(xcoord,ycoord,4,4,10,360);
            g2d.setColor(Color.GREEN);
            g2d.fillArc(xcoord,ycoord,4,4,10,360);
                
        }
        
        public void myPaint (Graphics g, int x, int y){
        
            Graphics2D g2d = (Graphics2D)g;
  
            g2d.setColor(Color.WHITE);
            g2d.drawArc(x,y,1,1,10,360);
            g2d.setColor(Color.BLACK);
            g2d.fillArc(x,y,1,1,10,360);
        
        }
        
    }
    
    
    public Identifier_GUI (){
    
        this.idg = this;
        
        ULIMIT = 60000000; // THIS IS THE ULTIMATE AIM, BUT FOR NOW, USE LOWER NUMBER DUE TO RENDERING TIME (.5 hour for 60m)
                
        canvas = new MCanvas();
        canvas.setVisible(true);
        canvas.addMouseListener(this);
        canvas.addMouseMotionListener(this);
        
        mainFrame = new JFrame();
        mainFrame.setSize(new Dimension(1200,700));
        mainFrame.setLocationRelativeTo(null);
        mainFrame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        mainFrame.add(new JPanel(),BorderLayout.NORTH);mainFrame.add(new JPanel(),BorderLayout.SOUTH);mainFrame.add(new JPanel(),BorderLayout.WEST);mainFrame.add(new JPanel(),BorderLayout.EAST);
        mainFrame.add(canvas,BorderLayout.CENTER);
        mainFrame.setVisible(true);
        
        control = new Control();
        control.window.setVisible(true);
        mainFrame.addComponentListener(this);
        
        exited = false;
        
        Random rand = new Random();
        
        for(int ra = 0;ra<100000;ra++){entA[ra] = new Entity((Math.abs(rand.nextInt()) % canvas.getWidth()),(Math.abs(rand.nextInt()) % canvas.getHeight()));entA[ra].doPaint();mainFrame.setTitle(String.valueOf(ra));}
        
        /*int largestx =0;int largesty =0;//DEBUG
        for(Entity e : entA){System.out.println(e.xcoor + " "+ e.ycoor);if(e.xcoor>largestx){largestx=e.xcoor;}if(e.ycoor>largesty){largesty=e.ycoor;}} 
        System.out.println(largestx+ " "+largesty);*/     
        
                           
    }
    
    public void mouseExited(MouseEvent mEx){}
                  
    public void mouseEntered(MouseEvent mEx){}
    
    public void componentHidden(ComponentEvent e){}

    public void   componentMoved(ComponentEvent e){
    
        idg.control.window.setVisible(true);
        idg.control.window.setLocation(mainFrame.getX()+mainFrame.getWidth()+10,mainFrame.getY());
    }

    public void   componentResized(ComponentEvent e){}

    public void   componentShown(ComponentEvent e){}
    
    public void mouseReleased(MouseEvent mEx){
    
    }

    public void mousePressed(MouseEvent mEx){}
    
    public void mouseClicked(MouseEvent mEx){}
    
    public void mouseDragged(MouseEvent mEx){}
    
    public void mouseMoved(MouseEvent mEx){}
    
    
    
    class Control{
    
    Window window;
    
        Control (){
    
                window = new Window(mainFrame);window.setSize(new Dimension(300,400));window.setLocation(mainFrame.getX()+mainFrame.getWidth()+10,mainFrame.getY());window.setVisible(true);
                
                window.setOpacity(.85f);
                window.setBackground(Color.PINK);
                                               
                GridLayout grid = new GridLayout(0,1);
                window.setLayout(grid);
                
                JTextArea jA = new JTextArea();
                jA.setLineWrap(true);
                jA.setBackground(Color.GRAY);
                JScrollPane js = (new JScrollPane(jA));
                js.setBorder(new MetalBorders.TextFieldBorder());
                js.setVisible(true);
                
                window.add(js);
                window.add(new JButton("Set Number"));
                window.add(new JLabel("Person"));
                String[] o = {"Duration"};
                JComboBox<String> jcb = new JComboBox<String>(o);
                window.add(jcb);
                JSlider jslider = new JSlider(0,100,50);
                jslider.setName("ggg");
                jslider.setPaintTrack(true);
                jslider.setMajorTickSpacing(25);
                jslider.setLabelTable(jslider.createStandardLabels(25));
                jslider.setPaintLabels(true);
                jslider.setSnapToTicks(true);
                jslider.setPaintTicks(true);
                window.add(jslider);
                window.add(new JButton("Another jbutton"));
                window.add(new JButton("Another jbutton"));
                
    
        
        }
    }
    
    
    class Entity {
    
        private Color color;
        private int xcoor;
        private int ycoor;
        
    
      public Entity (int x, int y){
    
         this.xcoor = x;
         this.ycoor = y;
    
      }
    
      public void doPaint (){
    
         canvas.myPaint(canvas.getGraphics(),this.xcoor,this.ycoor);
    
      }
    
    
    }
       
 }

Open in new window

Now, my suggestion is to use
HashSet<ColoredPoint> points = new HashSet<ColoredPoint>();

Open in new window

where a ColoredPoint object extends java.awt.Point and decorates it with a color.
@rrz
Using Point knocked it out of the park for speed. But I didn't use a HashSet because there's a class, which is needed t hold other vars as well. I upped the heap to 6g, and so far, that was enough to render all the points, and hold the 60 million objects of the above Entity class. So that's progress.
I didn't use a HashSet because there's a class, which is needed t hold other vars as well.
So, what class did you use?
This - 
class Entity {
    
        private Color color;
        private Point point;
           
      
      public Entity (Point p){
    
           this.point = p;
             
      }
    
      public void doPaint (){
    
           canvas.myPaint(canvas.getGraphics(),this.point.x,this.point.y);
    
      }
    
    
    }

Open in new window

Avatar of dpearson
dpearson

Re-using Point seems like an excellent idea here.

But here's a class that I consider one of my "essential" classes in Java.  It's just a Pair of values (any two types).  I use these all over the place - e.g. when you have a method that needs to return 2 values.

Not as optimal for performance though as Point.  But more general purpose.

FYI, I also have Tuple3, Tuple4 etc. which are its natural cousins since Java lacks built in tuple support.

public class Pair<T1, T2> {
   private final T1 m_Value1;
   private final T2 m_Value2;

   public Pair(T1 value1, T2 value2) {
      m_Value1 = value1;
      m_Value2 = value2;
   }

   public T1 getFirst() {
      return m_Value1;
   }

   public T2 getSecond() {
      return m_Value2;
   }

   @Override public String toString() {
      return m_Value1 + "," + m_Value2 ;
   }

   @Override
   public boolean equals(Object o) {
      if (this == o) return true;
      if (o == null || getClass() != o.getClass()) return false;
      Pair<?, ?> pair = (Pair<?, ?>) o;
      return Objects.equals(m_Value1, pair.m_Value1) &&
            Objects.equals(m_Value2, pair.m_Value2);
   }

   @Override
   public int hashCode() {
      return Objects.hash(m_Value1, m_Value2);
   }
}

Open in new window

Thanks Doug - generous share there. :)
I'm going to develop things a bit now and be back.
I just want to check on the wisdom of ignoring a random number generator to generate the Point coords, and instead using a HashSet to accept objects containing the coordinates to be painted, and relying on the unpredictability of iterating over a Set to achieve the effect of the random painting from the linearly-produced Set ?

The code - which seems to work ok - that I have for this is now :

        int w = canvas.getWidth();
        int h = canvas.getHeight();
                
        choices = new HashSet<>();   
                       
        for(int wloop=0; wloop<w; wloop++){
        
            for(int hloop=0; hloop<h; hloop++){
         
                Point p = new Point(wloop,hloop);
                choices.add(new Entity(p));
                
            }
            
        }
            
        choices.forEach(entity -> entity.doPaint(Color.BLACK));

Open in new window

(I can post the whole code if full context is needed).
I am confused. Isn't the following code a proper test?
import java.util.*;
public class Test {
    public static void main(String[] args) {
		HashSet<Integer> choices = new HashSet<Integer>();
		for(int i = 0; i < 500; i++){
			choices.add(i);   
		}
        choices.forEach(number -> System.out.print(number + " "));
    }
}

Open in new window

 According to the HashSet docs:
It makes no guarantees as to the iteration order of the set
 I don't see it ever being random.
@rrz

The HashSet contains Entity instances, as given in earlier code not Integers.

What you are seeing is an effect of the Integer class returning proper order.
It makes no guarantees as to the iteration order of the set 
That will be fine - I want the pixels painted non-linearly, and even though it might not be true random, the effect is as good from what I can see.

import java.util.*;
import java.awt.*;

public class Test {
    public static void main(String[] args) {
      HashSet<Point> choices = new HashSet<Point>();
      for(int i = 0; i < 500; i++){
            Point p = new Point(i,i*2);
            //choices.add(i);  
            choices.add(p);
      }
        choices.forEach(number -> System.out.print(number.x + " "+ number.y+" "));
    }
}

Open in new window


I think (from what little i know about what you're doing) that's probably enough randomness, Quite clever approach.

Since your problem space (and therefore number of Point) is known, you can give your app a little more efficiency by providing the number of entries in the ctor to the Set
Yep I agree.  I think that's a solid approach.  It's unpredictable rather than random.  And should be fine for this problem.
Righty, thanks both. A couple of touches more to the code, and that'll be it for this Q - although one or two more waiting in the wings. ; )
relying on the unpredictability of iterating over a Set 

So, I know this might be pedantic, but to make things clear, Set is an interface and doesn't specify an iteration order. It is up to the implementation, and so you were probably referring to HashSet's iteration order.

Now, the iteration order of HashSet is neither random nor unpredictable. No, it isn't in any sort of linear or natural ordering, but it definitely is predictable and definitely repeatable (for the same input conditions, in this case, the same width & height). It will depend on both the implementation of HashSet and how the hashcode method of the Point class is implemented as well.

As to which way you should go, well that is entirely up to what your requirements are. If what you have at the moment is sufficient for your needs, there's no reason to go any further. If you are ok with the pixels being drawn in the pattern that they are, given that it will always draw in the same pattern, then all good. Otherwise, you will need something like we have mentioned earlier to make it random and different on each run.

Last final note, the fact that the HashSet make "no guarantee" to the iteration order of the set, doesn't mean that you will always get the elements back in a non-naturally ordered fashion. It is entirely possible that some JVM out there (or some in the future) *may* decide to return the elements to you in a natural order and then you won't get the desired effect at all. Yes, it's probably not likely but it's also not impossible. Just so that you know!
Hi mccarl -

you were probably referring to HashSet's iteration order. 
Yes. HashSet's the impl, so I'm guided by that. But . . .

Now, the iteration order of HashSet is neither random nor unpredictable. No, it isn't in any sort of linear or natural ordering, but it definitely is predictable and definitely repeatable (for the same input conditions, in this case, the same width & height). 
Can I take the last clause of that first : yep, for the same params (width and height in this case as you say) it would be repeatable and in that way predictable. I do see this as a drawback in a sense, but the relevance of that is at this moment not fully visible to me (not because of what you say, but because I'm not quite at ease with the paradigm I have in mind.
The trouble I have with the statement is however that it's only predictable once you have run the disclosing code - the code that either prints out the Points. You'd have a job on your hands seeing determinant process by just watching the pixels appear on the screen.
This takes me to the first part of what you say : that the iteration is not unpredictable. In my interpretation of what's 'going on' inside Java, according to the lit., is that " It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.  " So, as you rightly later say, the iterator could come up with a perfectly linear response and 'print out' the elements in total ordering. But that would still be a (pseudo-)random set, as it would be just one set among many - even though it looks like it has been dealt from a predictable hand. In any case, it worries me that the lit. is clear that an initial returned order could not be predicted in advance. If it doesn't imply that, then either I or the lit is not making itself clear.

I think you last paragraph is covered in what I've re-questioned just here above - the repeated returned values thesis, on which I agree. More difficult is the notion that a developer would come along with a Set theory that ignores the mathematical foundation that it relies on, and re-invent an unneeded wheel by giving us a Set impl which would amount to an ordered List, which we've already got of course.

But look, there's no doubt that my fusky code isn't the best deal in town, and I am infinitely open to suggestions for improvement or rectification. My aim is to have an 'agent' piece of code roam amongst the values *truly randomly* however, picking "Entities" with those coords as members, and eliciting boolean field setting from these objects, and then measuring how long that random search takes take to locate a pre-set number of Entities with the boolean values the user chooses. There, I can see, that if the entire Points field is vulnerable to being predictable, that the search will always return the same results, which would be of course useless.

I don't know how much of that you or anyone else understands, but that's as close as I can make it at the moment. ; )