Solved

Help needed to make this collections method much efficient

Posted on 2004-08-21
125
355 Views
Last Modified: 2010-03-31
 private java.util.List[] getURLList(java.util.List list) {
    Hashtable ht = new Hashtable();
    for (int i = 0; i < list.size(); i++) {
      String url = list.get(i).toString();
      String array[] = url.split("/");

      String domain = array[0];
      if (array[0].toLowerCase().startsWith("http")) {
        domain = array[2];
        domain = domain + array[4];
      }
      else {
        domain = domain + array[2];
      }

      ArrayList sublist = (ArrayList) ht.get(domain);
      if (sublist == null) {
        sublist = new ArrayList();
      }
      sublist.add(url);
      ht.put(domain, sublist);
    }
    return (List[]) Collections.list(ht.elements()).toArray(new ArrayList[0]);
  }


Thanks
Koundinya
0
Comment
Question by:sudhakar_koundinya
  • 62
  • 39
  • 16
  • +2
125 Comments
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 375 total points
Comment Utility
Probably lots you could do to optimise this (possibly at the expense of extensibility) but i'd approach it like this:

      private java.util.List[] getUrlList(java.util.List list) {
            Hashtable hostToUrls = new Hashtable();
            try {
                  for (int i = 0; i < list.size(); i++) {
                        URL url = new URL((String) list.get(i));
                        String hostKey = url.getHost();
                        // Could be a Set if no duplicates required
                        ArrayList urlsForHost = (ArrayList)hostToUrls.get(hostKey);
                        if (urlsForHost == null) {
                              urlsForHost = new ArrayList();
                              urlsForHost.add(url);
                              hostToUrls.put(hostKey, urlsForHost);
                        }
                        else {
                              urlsForHost.add(url);
                        }
                  }
            }
            catch (MalformedURLException e) {
                  return null;
            }
            int numberOfLists = hostToUrls.size();
            return (java.util.List[]) Collections.list(hostToUrls.elements()).toArray(new ArrayList[numberOfLists]);
      }
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
I used Lists of type URL as they're more versatile, but you may want to use String
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
suppose if the list contains both http and https based urls then how can we handle them??
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
Not quite sure what you mean. They'd appear together under the same host list
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
list will be some thing like this


            al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0101.EML");
          al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0102.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0103.EML");
      al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0104.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Deleted Items/TEST Message 0056.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Deleted Items/TEST Message 0055.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0105.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0106.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0107.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0108.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0109.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0110.EML");
        al.add("http://192.12.0.26/exchange/t1/Inbox/testmailfromAD.EML");
        al.add("http://192.12.0.3/exchange/vijaya/Inbox/testattachment.EML");
        al.add("https://192.12.0.38:403/exchange/Administrator/Inbox/TEST Message 01.EML");
        al.add("https://192.12.0.3/exchange/Administrator/Inbox/TEST Message 011.EML");
        al.add("https://192.12.0.3/exchange/Administrator/Inbox/TEST Message 012.EML");
        al.add("https://192.12.0.38:403/exchange/Administrator/Inbox/TEST Message 013.EML");
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
(slightly unprettily printed, but you'll get the idea ;-))
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
following is the TestCase created with my method and your method by adding url.getProtocol() as extra

Now your  results are much faster

My Case : above 7 Seconds
Your Case : above 5 Seconds

Is it possible to reduce still??

Thanks
Sudha



import junit.framework.*;
import java.util.*;
import java.net.*;
import junit.extensions.*;

/**
 * A sample test case, testing <code>java.util.ArrayList</code>.
 *
 */
public class TestURLList extends TestCase {
      
      ArrayList vect=new ArrayList();
      public TestURLList(String name) {
            super(name);
      }

      public static void main (String[] args) {
            junit.textui.TestRunner.run (suite());
      }
      protected void setUp() {
      }

      public static Test suite() {
            return new TestSuite(TestURLList.class);
      }
      public void testURLS()
      {
            for(int i=0;i<100;i++)
            {
                  for(int j=0;j<1000;j++)
                  {
                        vect.add("http://Test"+i+"/"+j+"/"+(i+j));
                  }
            }
            //List[] array=getUrlList(vect); //7501 seconds
            List[] array=getURLList(vect); // 5258 seconds
      }

//7501
        private java.util.List[] getUrlList(java.util.List list) {
    Hashtable ht = new Hashtable();
    for (int i = 0; i < list.size(); i++) {
      String url = list.get(i).toString();
      String array[] = url.split("/");

      String domain = array[0];
      if (array[0].toLowerCase().startsWith("http")) {
        domain = array[2];
        domain = domain + array[4];
      }
      else {
        domain = domain + array[2];
      }

      ArrayList sublist = (ArrayList) ht.get(domain);
      if (sublist == null) {
        sublist = new ArrayList();
      }
      sublist.add(url);
      ht.put(domain, sublist);
    }
    return (List[]) Collections.list(ht.elements()).toArray(new ArrayList[0]);
  }

// 5258 milli seconds
   private static java.util.List[] getURLList(java.util.List list) {
          Hashtable hostToUrls = new Hashtable();
          try {
               for (int i = 0; i < list.size(); i++) {
                    URL url = new URL((String) list.get(i));
                    String hostKey = url.getHost();
                              String protocol=url.getProtocol();
                        //      System.err.println(url.getProtocol() );
                    // Could be a Set if no duplicates required
                    ArrayList urlsForHost = (ArrayList)hostToUrls.get(protocol+hostKey);
                    if (urlsForHost == null) {
                         urlsForHost = new ArrayList();
                         urlsForHost.add(url);
                         hostToUrls.put(protocol+hostKey, urlsForHost);
                    }
                    else {
                         urlsForHost.add(url);
                    }
               }
          }
          catch (MalformedURLException e) {
               return null;
          }
          int numberOfLists = hostToUrls.size();
          return (java.util.List[]) Collections.list(hostToUrls.elements()).toArray(new ArrayList[numberOfLists]);
     }
}
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
I'll think about it. So you want to keep http and https separate?

>>domain = domain + array[4];

What makes you think the above will always be in range?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
there is a possibility of giving urls some thing like this

192.12.0.39/exchange/MailBox/Mail.eml or
http://192.12.0.39/exchange/MailBox/Mail.eml
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>> So you want to keep http and https separate?

Yes including domain
0
 
LVL 21

Assisted Solution

by:MogalManic
MogalManic earned 80 total points
Comment Utility
You can squeeze a few more milliseconds by removing the call to list.size() in the for loop.  The function is called every iteration in your loop, in your example that is 100,000 times (every time it is the same answer).

Change the for loop to the following:
final int listSize=list.size();
for (int i = 0; i < listSize; i++) {
...

You could try making all of the string delcarations 'final'.  This would allow the java compiler to make better optomization choices using temporary variables or registers.

Change the return type from List[] to Collection.  This would save the final iteration through your results to convert it to an array.  The results would be a Collection of Array lists.  Like this:
            return hostToUrls.values();  //Return the Set of values in Map
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
>>The function is called every iteration in your loop

That's not the case in fact
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility

>> Change the return type from List[] to Collection.

Friend,
If that is the case why I should write this method?? I could somply use either ArrayList or Vector. Thanks for helping me

How Efficiently I could change this line??
 return (java.util.List[]) Collections.list(hostToUrls.elements()).toArray(new ArrayList[numberOfLists]);

Thanks and Best Regards
Sudhakar

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>>results would be a Collection of Array lists.

My apologies, I didn't read entire comment
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Let me try your Idea
0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 375 total points
Comment Utility
Or just return a List:

      public final static java.util.List getUrlList2(java.util.List list) {
            Hashtable hostToUrls = new Hashtable();
            try {
                  for (int i = 0; i < list.size(); i++) {
                        URL url = new URL((String) list.get(i));
                        final String key = url.getHost() + url.getProtocol();
                        // Could be a Set if no duplicates required
                        ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
                        if (urlsForHost == null) {
                              urlsForHost = new ArrayList();
                              urlsForHost.add(url);
                              hostToUrls.put(key, urlsForHost);
                        }
                        else {
                              urlsForHost.add(url);
                        }
                  }
            }
            catch (MalformedURLException e) {
                  return null;
            }
            return Collections.list(hostToUrls.elements());
      }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
I compqred your idea with my logic (modified version from CEHJ)

And here are the sample results

My code : 1484 milli seconds
Your  code : 1485 milli seconds

And I tested the codes  on  Pentium 4/256 MB Ram. running On multiple tests what I have identified is some times ur idea is fast and some times my idea is fast and that too a difference of few milli seconds. Maximum difference I have identified is +/- 25 milli seconds. This may be because of other processes that are running.  So I don't think returning as collection is right solution.

More or less we are doing the same in both codes. My idea converts the list into list array by calee  whereas ur idea converts the list into list array by caller

Here is the test case

import java.util.*;
import java.net.*;
public class ArrayTest
{
      


   private static Collection getURLList(java.util.List list) {
          Hashtable hostToUrls = new Hashtable();
          try {
               for (int i = 0; i < list.size(); i++) {
                    URL url = new URL((String) list.get(i));
                    String hostKey = url.getHost();
                         String protocol=url.getProtocol();
                    //     System.err.println(url.getProtocol() );
                    // Could be a Set if no duplicates required
                    ArrayList urlsForHost = (ArrayList)hostToUrls.get(protocol+hostKey);
                    if (urlsForHost == null) {
                         urlsForHost = new ArrayList();
                         urlsForHost.add(url);
                         hostToUrls.put(protocol+hostKey, urlsForHost);
                    }
                    else {
                         urlsForHost.add(url);
                    }
               }
          }
          catch (MalformedURLException e) {
               return null;
          }
          int numberOfLists = hostToUrls.size();
       
      return hostToUrls.values();

     }


  private static java.util.List[] getUrlList(java.util.List list) {
          Hashtable hostToUrls = new Hashtable();
          try {
               for (int i = 0; i < list.size(); i++) {
                    URL url = new URL((String) list.get(i));
                    String hostKey = url.getHost();
                         String protocol=url.getProtocol();
                    //     System.err.println(url.getProtocol() );
                    // Could be a Set if no duplicates required
                    ArrayList urlsForHost = (ArrayList)hostToUrls.get(protocol+hostKey);
                    if (urlsForHost == null) {
                         urlsForHost = new ArrayList();
                         urlsForHost.add(url);
                         hostToUrls.put(protocol+hostKey, urlsForHost);
                    }
                    else {
                         urlsForHost.add(url);
                    }
               }
          }
          catch (MalformedURLException e) {
               return null;
          }
          int numberOfLists = hostToUrls.size();
          return (java.util.List[]) Collections.list(hostToUrls.elements()).toArray(new ArrayList[numberOfLists]);
     }


public static void main(String s[])
{
   
      java.util.Date start=new java.util.Date();
ArrayList al=new ArrayList();

  for(int i=0;i<100;i++)
          {
               for(int j=0;j<1000;j++)
               {
                    al.add("http://Test"+i+"/"+j+"/"+(i+j));
               }
          }

/**
   Following is your Idea
*/
Collection c=getURLList(al);
Object[] array=c.toArray();
for(int i=0;i<array.length;i++)
{

      List array1=(List)array[i];
//      System.err.println("Array "+i+"\n");
      for(int j=0;j<array1.size();j++)
      {
            //System.err.println(array1.get(j));
      }
      
} // This had taken 1485 milli seconds


/*
    Following is My Idea
*/

/*List array[]=getUrlList(al);
for(int i=0;i<array.length;i++)
{

      List array1=(List)array[i];
//      System.err.println("Array "+i+"\n");
      for(int j=0;j<array1.size();j++)
      {
            //System.err.println(array1.get(j));
      }
      
}*/ //this had taken 1484 milli seconds

java.util.Date end=new java.util.Date();
System.err.println(end.getTime()-start.getTime());

}
}
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
i.e. return

List<List<URL>>

(per my example)

or

List<List<String>>
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
CEHJ,

I think even ur second idea also gives the same results more or less.
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
Obviously try running these things repeatedly and take averages
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
How Efficiently I could change this line??
 return (java.util.List[]) Collections.list(hostToUrls.elements()).toArray(new ArrayList[numberOfLists]);
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>>List<List<URL>>

The current JDK Version is 1.4.2, So that Idea may not be helpful for me
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
>>The current JDK Version is 1.4.2, So that Idea may not be helpful for me

No - i mean conceptually - not necessarily literally in code ;-)

What are your ideal return types?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>>What are your ideal return types?

I did not get you
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
What types do you want returned from the method, i.e. what would you like to work with ideally, or does it not matter?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Basically I need to return List[]

If this is not possible, I am open to know other solutions. Main thing is I need to have optimized solution

At runtime, method  need to work with 10000 - 1000000 mail objects(URLs) depending on conditions.

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
It's time to sleep now (1.55 AM). I will discuss with you tomorrow from my office PC.

Good Night

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
BTW,

What is your full name and what is your Native Place?? Just Curious

I am from INDIA
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Hi,

Back again
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
Were you asking me that q?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
nope just intimating
0
 
LVL 21

Assisted Solution

by:MogalManic
MogalManic earned 80 total points
Comment Utility
If you change your return type from List[] to Collection, you need to change how you parse the results.  Instead of:
  Collection c=getURLList(al);
  Object[] array=c.toArray();
  for(int i=0;i<array.length;i++)
  {...
do the following:
  Collection c=getURLList(al);
  for(Iterator it=c.Iterator();it.hasNext();)
  {
     ArrayList eachList=(ArrayList) it.next();
     ...Process eachList...


The iterator will process the collection w/o moving things around in memory (In a sense you are still iterating through the original HashTable).  If you use toArray() you are effectively using the same code when you were just returning List[].

0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
:-)
0
 
LVL 86

Accepted Solution

by:
CEHJ earned 375 total points
Comment Utility
Can you confirm this is returning the right results?

      public final static java.util.List getUrlList4(final java.util.List list) {
            Hashtable hostToUrls = new Hashtable(list.size());
            try {
                  for (int i = 0; i < list.size(); i++) {
                        URL url = new URL((String) list.get(i));
                        final String key = url.getHost() + url.getProtocol() + url.getFile().split("/")[1];
                        // Could be a Set if no duplicates required
                        ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
                        if (urlsForHost == null) {
                              urlsForHost = new ArrayList();
                              urlsForHost.add(url);
                              hostToUrls.put(key, urlsForHost);
                        }
                        else {
                              urlsForHost.add(url);
                        }
                  }
            }
            catch (MalformedURLException e) {
                  return null;
            }
            return Collections.list(hostToUrls.elements());
      }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
url.getFile().split("/")[2]; do the job

But I believe it also reduces the performance as discussed  already here
http://www.experts-exchange.com/Programming/Programming_Languages/Java/Q_21109363.html

Thanks
Sudhakar
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
As mentioned in another q., could you post the link to a sizeable chunk of test data?
0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 375 total points
Comment Utility
OK, with the (i guess smallish) sample of data you posted earlier, this seems to return the correct result and is about 5 times faster than the fastest one i tested so far (yours actually):

      public final static List getUrlList6(java.util.List urls) {
            Hashtable hostToUrls = new Hashtable();
            StringBuffer accountBuffer = new StringBuffer(32);
            for (int i = 0; i < urls.size(); i++) {
                  accountBuffer.setLength(0);
                  String url = (String) urls.get(i);
                  int forwardSlashCount = 0;
                  for (int ixChar = 4; forwardSlashCount < 5 && ixChar < url.length(); ixChar++) {
                        if (url.charAt(ixChar) == '/') {
                              forwardSlashCount++;
                        }
                        if (forwardSlashCount == 4) {
                              accountBuffer.append(url.charAt(ixChar));
                        }
                  }
                  // end for
                  accountBuffer.append(url.charAt(4));
                  final String key = accountBuffer.toString();
                  ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
                  if (urlsForHost == null) {
                        urlsForHost = new ArrayList();
                        urlsForHost.add(url);
                        hostToUrls.put(key, urlsForHost);
                  }
                  else {
                        urlsForHost.add(url);
                  }
            }
            return Collections.list(hostToUrls.elements());
      }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Thanks CEHJ,

didn't see the comment.

I will test that and let you know
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
I am seeing the changes you have done, trying to understand the code
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
Sorry - it's almost completely uncommented. More or less the same as before apart from counting slashes by String index
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Although the change  improved the performance

The code is failed for following test case  :(

array.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0237.EML");
array.add("http://192.12.0.39/exchange/Administrator/Inbox/TEST Message 0237.EML");

0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
LOL sorry - i've forgotten in the meantime that the host is part of the key too!
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Just wrote this code

I beg your help to check if this is giving fast results than previous methods at your end

public static final List getUrlList7(List l)
    {
          //http://192.12.0.62/exchange/Administrator
          Hashtable ht=new Hashtable();
          for(int i=0;i<l.size();i++)
          {
                String url=(String)l.get(i);
                int hashCount=0;
                StringBuffer key=new StringBuffer();
                for(int j=0;j<url.length();j++)
                {
                      char c=url.charAt(j);
                      key.append(c);
                      if(c=='/')
                            hashCount++;
                      if(hashCount==5)
                            break;
                      
                }
                List array=(List)ht.get(key.toString());
                if(array==null)
                {
                      array=new ArrayList();
                      ht.put(key.toString(),array);
                }
                array.add(url);
                
          }
          return Collections.list(ht.elements());
   
          
    }

thanks,
sudhakar
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
nope that is  bad code i think :( giving poor results
0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 375 total points
Comment Utility
I'll try it. In the meantime - try this:

      public final static List getUrlList8(java.util.List urls) {
            Hashtable hostToUrls = new Hashtable(urls.size());
            StringBuffer accountBuffer = new StringBuffer(32);
            for (int i = 0; i < urls.size(); i++) {
                  accountBuffer.setLength(0);
                  String url = (String) urls.get(i);
                  int forwardSlashCount = 0;
                  int hostHashCode = 0;
                  for (int ixChar = 4; forwardSlashCount < 5 && ixChar < url.length(); ixChar++) {
                        if (url.charAt(ixChar) == '/') {
                              forwardSlashCount++;
                        }
                        if (forwardSlashCount == 2) {
                              // Compute hashcode of host
                              hostHashCode = 31 * hostHashCode + url.charAt(ixChar);
                        }
                        else if (forwardSlashCount == 4) {
                              accountBuffer.append(url.charAt(ixChar));
                        }
                  }
                  // end for
                  accountBuffer.append(url.charAt(4)).append(hostHashCode);
                  final String key = accountBuffer.toString();
                  ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
                  if (urlsForHost == null) {
                        urlsForHost = new ArrayList();
                        urlsForHost.add(url);
                        hostToUrls.put(key, urlsForHost);
                  }
                  else {
                        urlsForHost.add(url);
                  }
            }
            return Collections.list(hostToUrls.elements());
      }
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
>>nope that is  bad code i think :( giving poor results

Well it's (getUrlListSud2) about 3 times faster than your original one (getURLList):
Following results in milliseconds for 1000 invocations against the test data. If you're wondering what the '3' is it's just me checking that my method is returning the right number of lists ;-)


C:\java\tiger\lang>jr TestUrlLists 1000
Executing getURLList...
40208.0
Executing getUrlListSud2...
12739.0
Executing getUrlList8...
8481.0
3

C:\java\tiger\lang>jr TestUrlLists 1000
Executing getURLList...
39727.0
Executing getUrlListSud2...
13149.0
Executing getUrlList8...
8671.0
3

C:\java\tiger\lang>jr TestUrlLists 1000
Executing getUrlList8...
8572.0
Executing getUrlListSud2...
12437.0
Executing getURLList...
40459.0
3
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Ok I willtry that

And I  have done a small change in above code

If my assumtion is correct, this should work fast. I need to check with Exchange administrator for this.

I am assuming exchange is common.

public static final List getUrlList7(List l)
    {
          //http://192.12.0.62/exchange/Administrator
          Hashtable ht=new Hashtable();
          for(int i=0;i<l.size();i++)
          {
                String url=(String)l.get(i);
                int hashCount=0;
                StringBuffer key=new StringBuffer();
                for(int j=4;j<url.length();j++)
                {
                      //char c=url.charAt(j);
                      key.append(url.charAt(j));
                      if(key.charAt(key.length()-1)=='/')
                            hashCount++;
                      if(hashCount==3)
                            j=j+8;
                      if(hashCount==5)
                            break;
                      
                }
                List array=(List)ht.get(key.toString());
                if(array==null)
                {
                      array=new ArrayList();
                      ht.put(key.toString(),array);
                }
                array.add(url);
                
          }
            Enumeration e= ht.keys();
         while(e.hasMoreElements())
          System.err.println(e.nextElement());
          return Collections.list(ht.elements());
   
          
    }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
it  improved performance by 300 milli seconds at my side

for folllowing test case

 public Vector getURLS() {
        Vector vect=new Vector();
        for(int i=0;i<100;i++) {
            for(int j=0;j<1000;j++) {
                vect.add("http://Test"+i+"/"+j+"/"+(i+j));
            }
        }
        //List[] array=getUrlList(vect); //7501 seconds
       return vect;
}
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
i mean this

public Vector getURLS() {
        Vector vect=new Vector();
        for(int i=0;i<100;i++) {
            for(int j=0;j<1000;j++) {
                vect.add("http://Test"+i+"/exchange/"+(i+j));
            }
        }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
looking at ur code

if (forwardSlashCount == 2) {
                       // Compute hashcode of host
                       hostHashCode = 31 * hostHashCode + url.charAt(ixChar);
                  }
                  else if (forwardSlashCount == 4) {
                       accountBuffer.append(url.charAt(ixChar));
                  }


what you are trying to do on above snippet??

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>>hostHashCode = 31 * hostHashCode + url.charAt(ixChar);

will this be unique for both http and https based ??
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
Your last is actually slower than your previous:

C:\java\tiger\lang>jr TestUrlLists 10000
Executing getUrlListSud3...
130687.0
Executing getUrlList8...
84761.0
Executing getUrlListSud2...
126582.0
Executing getURLList...
394578.0

>>what you are trying to do on above snippet??

First block takes the hashcode of the host name
Second block distinguishes between http & https

The variable named 'accountBuffer' should better be called 'keyBuffer'
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
could you please post your whole test code??

I wanted to test the same at my end also

thanks
Sudhakar
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>>hostHashCode = 31 * hostHashCode + url.charAt(ixChar);

Why specifically this one?? OR can we use any other logic also?? in identifying the hashcode
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
>>Why specifically this one??

That's actually the central code of String's hashCode()

>>could you please post your whole test code??

Well there's quite a lot of it, but here are the essentials:

http://www.cehjohnson.uklinux.net/aire/MethodTimer.java

, which is 'work in progress' the obvious thing that's missing (great if you can put it in) is that it should be able to return the function's return value to the caller, as well as the time (simple custom class required) and this is being called from main thus:

            Class[] methodTypes = new Class[]{java.util.List.class};
            Object[] methodArgs = new Object[]{al};
            // Put test method names here
            final String[] METHOD_NAMES = {"getUrlList8", "getUrlListSud9"};
            //final String[] METHOD_NAMES = {"testStringTokenizer", "testSplit"};
            Stack methodStack = new Stack();
            methodStack.addAll(Arrays.asList(METHOD_NAMES));
            Collections.shuffle((java.util.List) methodStack);
            double aggregatedTime = 0.0;
            // Number of invocations passed in on the command line
            final int NUMBER_OF_INVOCATIONS = Integer.parseInt(args[0]);
            while (!methodStack.isEmpty()) {
                  String currentMethodName = (String) methodStack.pop();
                  System.out.println("Executing " + currentMethodName + "...");
                  for (int i = 0; i < NUMBER_OF_INVOCATIONS; i++) {
                        aggregatedTime += MethodTimer.timeStaticMethod("TestUrlLists", currentMethodName, methodTypes, methodArgs);
                  }
                  System.out.println(aggregatedTime);
                  aggregatedTime = 0.0;
            }



0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
it seems ur last code is only the final way. Anyhow I will  wait for some other experts inputs for some time. If nobody  participate here then the points will be ur's ;-)

thanks
sudhakar

 
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
I'll have a look when I get in the office tomorrow, whats the current state?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
CEHJ gave me this code


     public final static List getUrlList8(java.util.List urls) {
          Hashtable hostToUrls = new Hashtable(urls.size());
          StringBuffer accountBuffer = new StringBuffer(32);
          for (int i = 0; i < urls.size(); i++) {
               accountBuffer.setLength(0);
               String url = (String) urls.get(i);
               int forwardSlashCount = 0;
               int hostHashCode = 0;
               for (int ixChar = 4; forwardSlashCount < 5 && ixChar < url.length(); ixChar++) {
                    if (url.charAt(ixChar) == '/') {
                         forwardSlashCount++;
                    }
                    if (forwardSlashCount == 2) {
                         // Compute hashcode of host
                         hostHashCode = 31 * hostHashCode + url.charAt(ixChar);
                    }
                    else if (forwardSlashCount == 4) {
                         accountBuffer.append(url.charAt(ixChar));
                    }
               }
               // end for
               accountBuffer.append(url.charAt(4)).append(hostHashCode);
               final String key = accountBuffer.toString();
               ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
               if (urlsForHost == null) {
                    urlsForHost = new ArrayList();
                    urlsForHost.add(url);
                    hostToUrls.put(key, urlsForHost);
               }
               else {
                    urlsForHost.add(url);
               }
          }
         

After discussing the different test cases. At this moment it seems to better to me
0
Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

 
LVL 92

Expert Comment

by:objects
Comment Utility
> At this moment it seems to better to me

how much improvement is that giving you?
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
Get rid of that Hashtable for a start and replace it with a HashMap.
And give your ArrayList an initial size (you could make it as big as the number of elements remaining in urls).
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
My first posting took  7  seconds

where as CEHJ's took 1.2 seconds currently for 20 Milliion URLs  grouping

Thanks
Sudhakar
0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 375 total points
Comment Utility
Almost twice as fast as the last one, but needs more testing for accuracy. We really need more data than the test data in the link:

public final static List getUrlList10(java.util.List urls) {
            Hashtable hostToUrls = new Hashtable(urls.size());
            for (int i = 0; i < urls.size(); i++) {
                  String url = (String) urls.get(i);
                  int forwardSlashCount = 0;
                  int hostPlusAccountPlusProtocolHashcode = 0;
                  for (int ixChar = 4; forwardSlashCount < 5 && ixChar < url.length(); ixChar++) {
                        if (url.charAt(ixChar) == '/') {
                              forwardSlashCount++;
                        }
                        if (forwardSlashCount == 2 || forwardSlashCount == 4) {
                              // Compute hashcode of host
                              hostPlusAccountPlusProtocolHashcode = 31 * hostPlusAccountPlusProtocolHashcode + url.charAt(ixChar);
                        }
                  }
                  // end for
                  if (url.charAt(4) == 's') {
                        hostPlusAccountPlusProtocolHashcode = ~hostPlusAccountPlusProtocolHashcode;
                  }
                  Integer key = new Integer(hostPlusAccountPlusProtocolHashcode);
                  ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
                  if (urlsForHost == null) {
                        urlsForHost = new ArrayList();
                        urlsForHost.add(url);
                        hostToUrls.put(key, urlsForHost);
                  }
                  else {
                        urlsForHost.add(url);
                  }
            }
            return Collections.list(hostToUrls.elements());
      }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
On higher end, no doubt that will help me.

curently I am working on exchange server project in which the current task is to download the 25 mails per minute using slide

Now currently I am able to reach the target of 15/min.

So I am currently searching for the code optimizations

thanks
sudhakar

 
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>> Get rid of that Hashtable for a start and replace it with a HashMap.

Just discussing the same here
http://www.experts-exchange.com/Programming/Programming_Languages/Java/Q_21111028.html
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
>>where as CEHJ's took 1.2 seconds currently for 20 Milliion URLs  grouping

Can you try the last one on the 20,000,000 and see if the result is accurate?
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
>> Get rid of that Hashtable for a start and replace it with a HashMap.
> Just discussing the same here

It will be faster.

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
yeah that's true took 985 milli secs
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
specifying an initial capacity for the list will give you further improvements.
Give it as much as you can afford (<= number of elements left in list)
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
wheres the list coming from anyway?
and do you have any analysis on the typical breakdown of the data?

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
CEHJ on several test runs

What I have identified is ur last  code is almost double the spped.

>>wheres the list coming from anyway?

Acutally we are developing the project along with the client. client's programmers are developing their own API and we need to integrate the API with our's. And we did not get their API also yet for testing. So it is difficult to get the actual data.

So what i did is, I have created test environment on 10 machines and    testing my API

>> specifying an initial capacity for the list will give you further improvements.
 I will do that.  assuming 10,000 to 50,1000 urls for single list. How much initial capacity I can give to that??

Suppose if the list contains only 10 urls and if set the size to 100, will it occupy extra  memory??



Thanks
sudhakar
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
How and where are you planning to set the capacity of the List?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Your last code is really a good Idea :-)
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
> Suppose if the list contains only 10 urls and if set the size to 100, will it occupy extra  memory??

Sure but your goal is to make it go as fast as you can isn't it, not to use less memory :)
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
that code does not seem to behave exactly the same as the requirements were specified earlier.
Exactly what is the possible formats that the code would need to handle, and how should they be broken down.

>  client's programmers are developing their own API and we need to integrate the API with
> our's. And we did not get their API also yet for testing. So it is difficult to get the actual data.

If you really need to squeeze your msec out of this, then you should be discussing with them to ensure they pass it to you in a form that is as ideal as possible for processing.

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
Earlier you said you could also recieve the format:
192.12.0.39/exchange/MailBox/Mail.eml

Is this still the case?

What is the definition of possible formats?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>>192.12.0.39/exchange/MailBox/Mail.eml

Sorry objects,  I forgot to mention that. This case was removed
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
>>Your last code is really a good Idea :-)

Was that directed at me sudhakar? (sometimes i don't know which of us you're addressing ;-))
0
 
LVL 21

Assisted Solution

by:MogalManic
MogalManic earned 80 total points
Comment Utility
No matter how you tweek the same routine It is still going to bo O(n) (Since the routine is just a for loop of each item in the list, the speed processing the list is some factor times the size of the list).

If your server is on a multi CPU box, you could get factor of O((n/m)lg(n/m)) by spliting the list into a number of M sublists.  Where M is the number of CPU's.  Here is the pseudocode:

int listSize=list.size();
for (m=0; m<#ofCPU's;m++) {
  BuildThreadWorker worker[m]=BuildThreadWorker(list, listSize*m, listSize*m+listSize/m);
  Thread t= new Thread(worker[m]);
  t.start();
}

//wait for workers to complete
int done=false
while(!done) {
  done=true;
  for(int m=0;m<#ofCPU's;m++) {
    if (worker[m].isDone()==false) {
         done=false;
        Thread.yield();  //Allow threads to work
        break;//check again
   }
}
MergeResults(list, listSize/m);

class BuildThreadWorker implements Runnable
{
 
  BuildThreadWorkder(List list, int start, int end)
  {
     this.list=list;
     this.start=start;
     this.end;
  }

  public void run()
  {
      for(int i=start;i<end;i++) {
          //Modified merge sort or quicksort of sublist
      }
  }
}

For large lists, it might be faster doing some sort of divide and conquer method.
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>>Was that directed at me sudhakar?

Yes :-)
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
Actually, the last code i posted should not be used. It's late here - more later ;-)
0
 
LVL 92

Assisted Solution

by:objects
objects earned 45 total points
Comment Utility
Also make a best guess for the initial capacity and load factor used by your Map.
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>> Actually, the last code i posted should not be used. It's late here - more later ;-)


why??
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>> load factor used by your Map.

Sorry, I didn't get you
0
 
LVL 92

Assisted Solution

by:objects
objects earned 45 total points
Comment Utility
A lower load factor should improve the lookup times.
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Objects,
As earlier said, at following link
http://www.experts-exchange.com/Programming/Programming_Languages/Java/Q_21111028.html

other experts are getting negative results only regarding Hashmap. So I am not yet/able decide which to use
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
Both Maps use a load factor so you'd get an improvement whichever you use.
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
Can you email me a large set of test data and I'll do some benchmarking.
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>> Can you email me a large set of test data and I'll do some benchmarking

I will try to do that tomorrow. As today is Festival (Rakhee) Holiday in India

thanks
Sudhakar
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
>>why??

Let's carry on as normal and i'll explain this later:

>>Can you email me a large set of test data

Please don't email anything, but post the link instead.
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
>>other experts are getting negative results only regarding Hashmap

HashMap will give you (in theory) only slightly better results, as the cost of synchronization is not as high as is commonly imagined
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
> Let's carry on as normal and i'll explain this later

????

> Please don't email anything

email is fine for me
0
 
LVL 30

Expert Comment

by:mayankeagle
Comment Utility
Moderators generally don't like seeing e-mail IDs or questions being solved through e-mail.
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
not solving it thru email, just need the test data.
And its too much to post here so email is the easiest, and its not directly relevant to the solution anyway.
0
 
LVL 30

Expert Comment

by:mayankeagle
Comment Utility
Yeah, I agree to that :)
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
There's no problem with posting a link to test data, and absolutely no need for emails, quite apart from it being discouraged by the moderators.
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
ok guys,

I have uploaded the testurls zip file at http://www.prithvisolutions.com/urllist.zip  I may remove this with in one or two days as I have uploaded at my comany's official site. I just get 2 days permission to place that zip file there.

OK coming to the scenario,

With the mail client that sends the multiple emails at a time I was able to send the mails to different exchange servers. All these are just test servers. So I am sorry because I am not able to give realtime Email URLS for this.

And using Slide API, I got the urllist that I have placed in a text file. It contains arround 16 lacs of urls list

Sridhar,
Welcome to this thread participation :-). And I hope u understand what exactly I am looking for by seeing the previous discussions.


Thanks,
Sudhakar
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
I am increasing the points to 500 as you guys are too patience in helping me :-)
0
 
LVL 92

Expert Comment

by:objects
Comment Utility
> There's no problem with posting a link to test data

And no difference :-D Except some inconvenience on suds.
No more of your petty complaints please, noone else had a problem so why should you.
It not like you even asked for any test data.
0
 
LVL 30

Expert Comment

by:mayankeagle
Comment Utility
>> And I hope u understand what exactly I am looking for by seeing the previous discussions

I've been listening to this but haven't been able to take out time for bench-marking something. BTW, I tried clicking on the link you posted (the ZIP file). It said Page cannot be displayed.
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
sudhakar - it would be useful if you could post how many lists are meant to be produced by the test data
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
>> It said Page cannot be displayed.
???. It is working fine at my side. I just tested on both IE and NS
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility

>>Let's carry on as normal and i'll explain this later:

CEHJ, will that won't work for some conditions??
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
>>CEHJ, will that won't work for some conditions??

I'm not going to prejudice you against it at the moment - keep testing it ;-)

a. does it produce the right number of lists for the test data?
b. what *is* the right number of lists?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Please let me know if you have the copy of that zip file. I am revoing that by this evening

thanks
Sudhakar
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
I've got it
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
objects shall I remove that file??
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Ok CEHJ,

for the contents of that zip file

here is the key information

http://www.geocities.com/sudhakar_koundinya/urls.html
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
Did my code produce the right number of lists? (i haven't tested)
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
CEHJ,

for 1823794 URLS your method get 1860 lists
where as actuals are 1984

Still trying to identify the problem
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
CEHJ,

here are the results http://www.geocities.com/sudhakar_koundinya/results.htm

for the following test case

import java.io.*;
import java.util.*;
class ReadURLS
{
static long urls=0;
      public static void main(String[] args)
      {
      try {
        BufferedReader in = new BufferedReader(new FileReader("c:\\urllist.txt"));
        String str;
        while ((str = in.readLine()) != null) {
                    getUrlList10(str);    //cehj          
                             process(str); //sudhakar
        }
                        System.err.println(ht.size());
                        Enumeration e=ht.keys();
                        while(e.hasMoreElements())
            {
                              String str1=(String)e.nextElement();
                              System.out.println("Key  : "+str1+"   Value : "+ht.get(str1));
            }
            
                        System.err.println("Current Size :"+ht.size()+" -> No of URLS "+urls);
        in.close();
            List l=Collections.list(hostToUrls.keys());
            long value=0;
            for(int i=0;i<l.size();i++)
            {
                  Long l1=(Long)l.get(i);
                              System.out.println("Key  : "+l1+"   Value : "+hostToUrls.get(l1));
            }
            System.out.println("No of Lists :"+l.size());


    } catch (IOException e) {
    }
      }
      public static void process(String str)
      {
            if(str==null) return;
            if(str.trim().length()<=0) return;
            StringBuffer sb=new StringBuffer();
            int counter=0;
            for(int i=0;i<str.length();i+=1)
            {
                  sb.append(str.charAt(i));
                  if(sb.charAt(sb.length()-1)=='/')
                  {
                        counter++;
                        if(counter==5)
                        {
                              Long i1=(Long)ht.get(sb.toString());
                              if(i1==null)
                              {
                                    i1=new Long(1);
                              }
                              else
                              {
                                    i1=new Long(i1.longValue()+1);
                              }
                              ht.put(sb.toString(),i1);
                              break;
                        }
                  }
            }
            urls++;
            if(urls%50000==0)
            System.err.println("Current Size :"+ht.size()+" -> No of URLS "+urls);
      }
      public final static void getUrlList10(String str) {

          //for (int i = 0; i < urls.size(); i++)
                    {
               String url = (String) str;
                                 if(str==null) return;
            if(str.trim().length()<=0) return;
               int forwardSlashCount = 0;
               int hostPlusAccountPlusProtocolHashcode = 0;
               for (int ixChar = 4; forwardSlashCount < 5 && ixChar < url.length(); ixChar++) {
                    if (url.charAt(ixChar) == '/') {
                         forwardSlashCount++;
                    }
                    if (forwardSlashCount == 2 || forwardSlashCount == 4) {
                         // Compute hashcode of host
                         hostPlusAccountPlusProtocolHashcode = 3 * hostPlusAccountPlusProtocolHashcode + url.charAt(ixChar);
                    }
               }
               // end for
               if (url.charAt(4) == 's') {
                    hostPlusAccountPlusProtocolHashcode = ~hostPlusAccountPlusProtocolHashcode;
               }
               Long key = new Long(hostPlusAccountPlusProtocolHashcode);
               Long urlsForHost = (Long) hostToUrls.get(key);
               if (urlsForHost == null) {
                    urlsForHost = new Long(1);
                    //urlsForHost.add(url);
                    hostToUrls.put(key, urlsForHost);
               }
               else {
                    urlsForHost=new Long(urlsForHost.longValue()+1);
                                                  hostToUrls.put(key, urlsForHost);
               }
                     _urls++;
                     if(_urls%100000==0)
                     System.err.println("Key :"+key+" Value :"+urlsForHost+"Urls Count :"+_urls);
          }


     }

    static      Hashtable hostToUrls = new Hashtable();
          static      Hashtable ht= new Hashtable();
            static long _urls=0;

}
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
CEHJ,

any updates??
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
Hello objects,

Are you able to find some benchmark solution??
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
>>
for 1823794 URLS your method get 1860 lists
where as actuals are 1984
>>

Is that the one using String as the hash key or Integer?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
that is Integer
0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 375 total points
Comment Utility
What did i tell you? ;-) Try the one using String(Buffer)
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
But the later one is faster know??. Whay it is failing to get the right hashcode??
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
I know it's faster, but there's an  problem inherent to it: Integer's hashCode is essentially 'synonymous' with its equals method. That means that if a hashcode collision *does* occur, and this is more likely with such large amounts of data as you have [too large for me to handle here unfortunately], then the equals method will not 'save' it from replacing the previous entry in the table. This is not the case with String, as the String needs to be equal as well as returning the same hashcode before the previous entry will get kicked out of the table.

Of course, i'm not certain that this is why the anomaly occurs - it could be an error in my code ;-)
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
Comment Utility
OK CEHJ,

I end up with ur StringBuffer related code

thanks
sudhakar
0
 
LVL 86

Expert Comment

by:CEHJ
Comment Utility
:-)
0

Featured Post

How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

Join & Write a Comment

Suggested Solutions

Java contains several comparison operators (e.g., <, <=, >, >=, ==, !=) that allow you to compare primitive values. However, these operators cannot be used to compare the contents of objects. Interface Comparable is used to allow objects of a cl…
By the end of 1980s, object oriented programming using languages like C++, Simula69 and ObjectPascal gained momentum. It looked like programmers finally found the perfect language. C++ successfully combined the object oriented principles of Simula w…
Viewers learn about the “for” loop and how it works in Java. By comparing it to the while loop learned before, viewers can make the transition easily. You will learn about the formatting of the for loop as we write a program that prints even numbers…
Viewers will learn about arithmetic and Boolean expressions in Java and the logical operators used to create Boolean expressions. We will cover the symbols used for arithmetic expressions and define each logical operator and how to use them in Boole…

763 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now