Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Help needed to make this collections method much efficient

Posted on 2004-08-21
125
Medium Priority
?
365 Views
Last Modified: 2010-03-31
 private java.util.List[] getURLList(java.util.List list) {
    Hashtable ht = new Hashtable();
    for (int i = 0; i < list.size(); i++) {
      String url = list.get(i).toString();
      String array[] = url.split("/");

      String domain = array[0];
      if (array[0].toLowerCase().startsWith("http")) {
        domain = array[2];
        domain = domain + array[4];
      }
      else {
        domain = domain + array[2];
      }

      ArrayList sublist = (ArrayList) ht.get(domain);
      if (sublist == null) {
        sublist = new ArrayList();
      }
      sublist.add(url);
      ht.put(domain, sublist);
    }
    return (List[]) Collections.list(ht.elements()).toArray(new ArrayList[0]);
  }


Thanks
Koundinya
0
Comment
Question by:sudhakar_koundinya
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 62
  • 39
  • 16
  • +2
125 Comments
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 1500 total points
ID: 11858447
Probably lots you could do to optimise this (possibly at the expense of extensibility) but i'd approach it like this:

      private java.util.List[] getUrlList(java.util.List list) {
            Hashtable hostToUrls = new Hashtable();
            try {
                  for (int i = 0; i < list.size(); i++) {
                        URL url = new URL((String) list.get(i));
                        String hostKey = url.getHost();
                        // Could be a Set if no duplicates required
                        ArrayList urlsForHost = (ArrayList)hostToUrls.get(hostKey);
                        if (urlsForHost == null) {
                              urlsForHost = new ArrayList();
                              urlsForHost.add(url);
                              hostToUrls.put(hostKey, urlsForHost);
                        }
                        else {
                              urlsForHost.add(url);
                        }
                  }
            }
            catch (MalformedURLException e) {
                  return null;
            }
            int numberOfLists = hostToUrls.size();
            return (java.util.List[]) Collections.list(hostToUrls.elements()).toArray(new ArrayList[numberOfLists]);
      }
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11858458
I used Lists of type URL as they're more versatile, but you may want to use String
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11858484
suppose if the list contains both http and https based urls then how can we handle them??
0
Build and deliver software with DevOps

A digital transformation requires faster time to market, shorter software development lifecycles, and the ability to adapt rapidly to changing customer demands. DevOps provides the solution.

 
LVL 86

Expert Comment

by:CEHJ
ID: 11858490
Not quite sure what you mean. They'd appear together under the same host list
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11858497
list will be some thing like this


            al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0101.EML");
          al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0102.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0103.EML");
      al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0104.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Deleted Items/TEST Message 0056.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Deleted Items/TEST Message 0055.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0105.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0106.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0107.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0108.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0109.EML");
        al.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0110.EML");
        al.add("http://192.12.0.26/exchange/t1/Inbox/testmailfromAD.EML");
        al.add("http://192.12.0.3/exchange/vijaya/Inbox/testattachment.EML");
        al.add("https://192.12.0.38:403/exchange/Administrator/Inbox/TEST Message 01.EML");
        al.add("https://192.12.0.3/exchange/Administrator/Inbox/TEST Message 011.EML");
        al.add("https://192.12.0.3/exchange/Administrator/Inbox/TEST Message 012.EML");
        al.add("https://192.12.0.38:403/exchange/Administrator/Inbox/TEST Message 013.EML");
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11858507
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11858510
(slightly unprettily printed, but you'll get the idea ;-))
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11858514
following is the TestCase created with my method and your method by adding url.getProtocol() as extra

Now your  results are much faster

My Case : above 7 Seconds
Your Case : above 5 Seconds

Is it possible to reduce still??

Thanks
Sudha



import junit.framework.*;
import java.util.*;
import java.net.*;
import junit.extensions.*;

/**
 * A sample test case, testing <code>java.util.ArrayList</code>.
 *
 */
public class TestURLList extends TestCase {
      
      ArrayList vect=new ArrayList();
      public TestURLList(String name) {
            super(name);
      }

      public static void main (String[] args) {
            junit.textui.TestRunner.run (suite());
      }
      protected void setUp() {
      }

      public static Test suite() {
            return new TestSuite(TestURLList.class);
      }
      public void testURLS()
      {
            for(int i=0;i<100;i++)
            {
                  for(int j=0;j<1000;j++)
                  {
                        vect.add("http://Test"+i+"/"+j+"/"+(i+j));
                  }
            }
            //List[] array=getUrlList(vect); //7501 seconds
            List[] array=getURLList(vect); // 5258 seconds
      }

//7501
        private java.util.List[] getUrlList(java.util.List list) {
    Hashtable ht = new Hashtable();
    for (int i = 0; i < list.size(); i++) {
      String url = list.get(i).toString();
      String array[] = url.split("/");

      String domain = array[0];
      if (array[0].toLowerCase().startsWith("http")) {
        domain = array[2];
        domain = domain + array[4];
      }
      else {
        domain = domain + array[2];
      }

      ArrayList sublist = (ArrayList) ht.get(domain);
      if (sublist == null) {
        sublist = new ArrayList();
      }
      sublist.add(url);
      ht.put(domain, sublist);
    }
    return (List[]) Collections.list(ht.elements()).toArray(new ArrayList[0]);
  }

// 5258 milli seconds
   private static java.util.List[] getURLList(java.util.List list) {
          Hashtable hostToUrls = new Hashtable();
          try {
               for (int i = 0; i < list.size(); i++) {
                    URL url = new URL((String) list.get(i));
                    String hostKey = url.getHost();
                              String protocol=url.getProtocol();
                        //      System.err.println(url.getProtocol() );
                    // Could be a Set if no duplicates required
                    ArrayList urlsForHost = (ArrayList)hostToUrls.get(protocol+hostKey);
                    if (urlsForHost == null) {
                         urlsForHost = new ArrayList();
                         urlsForHost.add(url);
                         hostToUrls.put(protocol+hostKey, urlsForHost);
                    }
                    else {
                         urlsForHost.add(url);
                    }
               }
          }
          catch (MalformedURLException e) {
               return null;
          }
          int numberOfLists = hostToUrls.size();
          return (java.util.List[]) Collections.list(hostToUrls.elements()).toArray(new ArrayList[numberOfLists]);
     }
}
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11858556
I'll think about it. So you want to keep http and https separate?

>>domain = domain + array[4];

What makes you think the above will always be in range?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11858570
there is a possibility of giving urls some thing like this

192.12.0.39/exchange/MailBox/Mail.eml or
http://192.12.0.39/exchange/MailBox/Mail.eml
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11858579
>> So you want to keep http and https separate?

Yes including domain
0
 
LVL 21

Assisted Solution

by:MogalManic
MogalManic earned 320 total points
ID: 11858622
You can squeeze a few more milliseconds by removing the call to list.size() in the for loop.  The function is called every iteration in your loop, in your example that is 100,000 times (every time it is the same answer).

Change the for loop to the following:
final int listSize=list.size();
for (int i = 0; i < listSize; i++) {
...

You could try making all of the string delcarations 'final'.  This would allow the java compiler to make better optomization choices using temporary variables or registers.

Change the return type from List[] to Collection.  This would save the final iteration through your results to convert it to an array.  The results would be a Collection of Array lists.  Like this:
            return hostToUrls.values();  //Return the Set of values in Map
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11858656
>>The function is called every iteration in your loop

That's not the case in fact
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865316

>> Change the return type from List[] to Collection.

Friend,
If that is the case why I should write this method?? I could somply use either ArrayList or Vector. Thanks for helping me

How Efficiently I could change this line??
 return (java.util.List[]) Collections.list(hostToUrls.elements()).toArray(new ArrayList[numberOfLists]);

Thanks and Best Regards
Sudhakar

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865334
>>results would be a Collection of Array lists.

My apologies, I didn't read entire comment
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865337
Let me try your Idea
0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 1500 total points
ID: 11865425
Or just return a List:

      public final static java.util.List getUrlList2(java.util.List list) {
            Hashtable hostToUrls = new Hashtable();
            try {
                  for (int i = 0; i < list.size(); i++) {
                        URL url = new URL((String) list.get(i));
                        final String key = url.getHost() + url.getProtocol();
                        // Could be a Set if no duplicates required
                        ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
                        if (urlsForHost == null) {
                              urlsForHost = new ArrayList();
                              urlsForHost.add(url);
                              hostToUrls.put(key, urlsForHost);
                        }
                        else {
                              urlsForHost.add(url);
                        }
                  }
            }
            catch (MalformedURLException e) {
                  return null;
            }
            return Collections.list(hostToUrls.elements());
      }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865433
I compqred your idea with my logic (modified version from CEHJ)

And here are the sample results

My code : 1484 milli seconds
Your  code : 1485 milli seconds

And I tested the codes  on  Pentium 4/256 MB Ram. running On multiple tests what I have identified is some times ur idea is fast and some times my idea is fast and that too a difference of few milli seconds. Maximum difference I have identified is +/- 25 milli seconds. This may be because of other processes that are running.  So I don't think returning as collection is right solution.

More or less we are doing the same in both codes. My idea converts the list into list array by calee  whereas ur idea converts the list into list array by caller

Here is the test case

import java.util.*;
import java.net.*;
public class ArrayTest
{
      


   private static Collection getURLList(java.util.List list) {
          Hashtable hostToUrls = new Hashtable();
          try {
               for (int i = 0; i < list.size(); i++) {
                    URL url = new URL((String) list.get(i));
                    String hostKey = url.getHost();
                         String protocol=url.getProtocol();
                    //     System.err.println(url.getProtocol() );
                    // Could be a Set if no duplicates required
                    ArrayList urlsForHost = (ArrayList)hostToUrls.get(protocol+hostKey);
                    if (urlsForHost == null) {
                         urlsForHost = new ArrayList();
                         urlsForHost.add(url);
                         hostToUrls.put(protocol+hostKey, urlsForHost);
                    }
                    else {
                         urlsForHost.add(url);
                    }
               }
          }
          catch (MalformedURLException e) {
               return null;
          }
          int numberOfLists = hostToUrls.size();
       
      return hostToUrls.values();

     }


  private static java.util.List[] getUrlList(java.util.List list) {
          Hashtable hostToUrls = new Hashtable();
          try {
               for (int i = 0; i < list.size(); i++) {
                    URL url = new URL((String) list.get(i));
                    String hostKey = url.getHost();
                         String protocol=url.getProtocol();
                    //     System.err.println(url.getProtocol() );
                    // Could be a Set if no duplicates required
                    ArrayList urlsForHost = (ArrayList)hostToUrls.get(protocol+hostKey);
                    if (urlsForHost == null) {
                         urlsForHost = new ArrayList();
                         urlsForHost.add(url);
                         hostToUrls.put(protocol+hostKey, urlsForHost);
                    }
                    else {
                         urlsForHost.add(url);
                    }
               }
          }
          catch (MalformedURLException e) {
               return null;
          }
          int numberOfLists = hostToUrls.size();
          return (java.util.List[]) Collections.list(hostToUrls.elements()).toArray(new ArrayList[numberOfLists]);
     }


public static void main(String s[])
{
   
      java.util.Date start=new java.util.Date();
ArrayList al=new ArrayList();

  for(int i=0;i<100;i++)
          {
               for(int j=0;j<1000;j++)
               {
                    al.add("http://Test"+i+"/"+j+"/"+(i+j));
               }
          }

/**
   Following is your Idea
*/
Collection c=getURLList(al);
Object[] array=c.toArray();
for(int i=0;i<array.length;i++)
{

      List array1=(List)array[i];
//      System.err.println("Array "+i+"\n");
      for(int j=0;j<array1.size();j++)
      {
            //System.err.println(array1.get(j));
      }
      
} // This had taken 1485 milli seconds


/*
    Following is My Idea
*/

/*List array[]=getUrlList(al);
for(int i=0;i<array.length;i++)
{

      List array1=(List)array[i];
//      System.err.println("Array "+i+"\n");
      for(int j=0;j<array1.size();j++)
      {
            //System.err.println(array1.get(j));
      }
      
}*/ //this had taken 1484 milli seconds

java.util.Date end=new java.util.Date();
System.err.println(end.getTime()-start.getTime());

}
}
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11865437
i.e. return

List<List<URL>>

(per my example)

or

List<List<String>>
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865443
CEHJ,

I think even ur second idea also gives the same results more or less.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11865484
Obviously try running these things repeatedly and take averages
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865501
How Efficiently I could change this line??
 return (java.util.List[]) Collections.list(hostToUrls.elements()).toArray(new ArrayList[numberOfLists]);
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865518
>>List<List<URL>>

The current JDK Version is 1.4.2, So that Idea may not be helpful for me
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11865525
>>The current JDK Version is 1.4.2, So that Idea may not be helpful for me

No - i mean conceptually - not necessarily literally in code ;-)

What are your ideal return types?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865542
>>What are your ideal return types?

I did not get you
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11865548
What types do you want returned from the method, i.e. what would you like to work with ideally, or does it not matter?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865577
Basically I need to return List[]

If this is not possible, I am open to know other solutions. Main thing is I need to have optimized solution

At runtime, method  need to work with 10000 - 1000000 mail objects(URLs) depending on conditions.

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865591
It's time to sleep now (1.55 AM). I will discuss with you tomorrow from my office PC.

Good Night

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11865601
BTW,

What is your full name and what is your Native Place?? Just Curious

I am from INDIA
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11867264
Hi,

Back again
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11867735
Were you asking me that q?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11868495
nope just intimating
0
 
LVL 21

Assisted Solution

by:MogalManic
MogalManic earned 320 total points
ID: 11868611
If you change your return type from List[] to Collection, you need to change how you parse the results.  Instead of:
  Collection c=getURLList(al);
  Object[] array=c.toArray();
  for(int i=0;i<array.length;i++)
  {...
do the following:
  Collection c=getURLList(al);
  for(Iterator it=c.Iterator();it.hasNext();)
  {
     ArrayList eachList=(ArrayList) it.next();
     ...Process eachList...


The iterator will process the collection w/o moving things around in memory (In a sense you are still iterating through the original HashTable).  If you use toArray() you are effectively using the same code when you were just returning List[].

0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11896632
:-)
0
 
LVL 86

Accepted Solution

by:
CEHJ earned 1500 total points
ID: 11910923
Can you confirm this is returning the right results?

      public final static java.util.List getUrlList4(final java.util.List list) {
            Hashtable hostToUrls = new Hashtable(list.size());
            try {
                  for (int i = 0; i < list.size(); i++) {
                        URL url = new URL((String) list.get(i));
                        final String key = url.getHost() + url.getProtocol() + url.getFile().split("/")[1];
                        // Could be a Set if no duplicates required
                        ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
                        if (urlsForHost == null) {
                              urlsForHost = new ArrayList();
                              urlsForHost.add(url);
                              hostToUrls.put(key, urlsForHost);
                        }
                        else {
                              urlsForHost.add(url);
                        }
                  }
            }
            catch (MalformedURLException e) {
                  return null;
            }
            return Collections.list(hostToUrls.elements());
      }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11911009
url.getFile().split("/")[2]; do the job

But I believe it also reduces the performance as discussed  already here
http://www.experts-exchange.com/Programming/Programming_Languages/Java/Q_21109363.html

Thanks
Sudhakar
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11915019
As mentioned in another q., could you post the link to a sizeable chunk of test data?
0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 1500 total points
ID: 11922477
OK, with the (i guess smallish) sample of data you posted earlier, this seems to return the correct result and is about 5 times faster than the fastest one i tested so far (yours actually):

      public final static List getUrlList6(java.util.List urls) {
            Hashtable hostToUrls = new Hashtable();
            StringBuffer accountBuffer = new StringBuffer(32);
            for (int i = 0; i < urls.size(); i++) {
                  accountBuffer.setLength(0);
                  String url = (String) urls.get(i);
                  int forwardSlashCount = 0;
                  for (int ixChar = 4; forwardSlashCount < 5 && ixChar < url.length(); ixChar++) {
                        if (url.charAt(ixChar) == '/') {
                              forwardSlashCount++;
                        }
                        if (forwardSlashCount == 4) {
                              accountBuffer.append(url.charAt(ixChar));
                        }
                  }
                  // end for
                  accountBuffer.append(url.charAt(4));
                  final String key = accountBuffer.toString();
                  ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
                  if (urlsForHost == null) {
                        urlsForHost = new ArrayList();
                        urlsForHost.add(url);
                        hostToUrls.put(key, urlsForHost);
                  }
                  else {
                        urlsForHost.add(url);
                  }
            }
            return Collections.list(hostToUrls.elements());
      }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11922512
Thanks CEHJ,

didn't see the comment.

I will test that and let you know
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11922597
I am seeing the changes you have done, trying to understand the code
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11922608
Sorry - it's almost completely uncommented. More or less the same as before apart from counting slashes by String index
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11922631
Although the change  improved the performance

The code is failed for following test case  :(

array.add("http://192.12.0.38/exchange/Administrator/Inbox/TEST Message 0237.EML");
array.add("http://192.12.0.39/exchange/Administrator/Inbox/TEST Message 0237.EML");

0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11922685
LOL sorry - i've forgotten in the meantime that the host is part of the key too!
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11922735
Just wrote this code

I beg your help to check if this is giving fast results than previous methods at your end

public static final List getUrlList7(List l)
    {
          //http://192.12.0.62/exchange/Administrator
          Hashtable ht=new Hashtable();
          for(int i=0;i<l.size();i++)
          {
                String url=(String)l.get(i);
                int hashCount=0;
                StringBuffer key=new StringBuffer();
                for(int j=0;j<url.length();j++)
                {
                      char c=url.charAt(j);
                      key.append(c);
                      if(c=='/')
                            hashCount++;
                      if(hashCount==5)
                            break;
                      
                }
                List array=(List)ht.get(key.toString());
                if(array==null)
                {
                      array=new ArrayList();
                      ht.put(key.toString(),array);
                }
                array.add(url);
                
          }
          return Collections.list(ht.elements());
   
          
    }

thanks,
sudhakar
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11922752
nope that is  bad code i think :( giving poor results
0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 1500 total points
ID: 11922758
I'll try it. In the meantime - try this:

      public final static List getUrlList8(java.util.List urls) {
            Hashtable hostToUrls = new Hashtable(urls.size());
            StringBuffer accountBuffer = new StringBuffer(32);
            for (int i = 0; i < urls.size(); i++) {
                  accountBuffer.setLength(0);
                  String url = (String) urls.get(i);
                  int forwardSlashCount = 0;
                  int hostHashCode = 0;
                  for (int ixChar = 4; forwardSlashCount < 5 && ixChar < url.length(); ixChar++) {
                        if (url.charAt(ixChar) == '/') {
                              forwardSlashCount++;
                        }
                        if (forwardSlashCount == 2) {
                              // Compute hashcode of host
                              hostHashCode = 31 * hostHashCode + url.charAt(ixChar);
                        }
                        else if (forwardSlashCount == 4) {
                              accountBuffer.append(url.charAt(ixChar));
                        }
                  }
                  // end for
                  accountBuffer.append(url.charAt(4)).append(hostHashCode);
                  final String key = accountBuffer.toString();
                  ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
                  if (urlsForHost == null) {
                        urlsForHost = new ArrayList();
                        urlsForHost.add(url);
                        hostToUrls.put(key, urlsForHost);
                  }
                  else {
                        urlsForHost.add(url);
                  }
            }
            return Collections.list(hostToUrls.elements());
      }
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11922843
>>nope that is  bad code i think :( giving poor results

Well it's (getUrlListSud2) about 3 times faster than your original one (getURLList):
Following results in milliseconds for 1000 invocations against the test data. If you're wondering what the '3' is it's just me checking that my method is returning the right number of lists ;-)


C:\java\tiger\lang>jr TestUrlLists 1000
Executing getURLList...
40208.0
Executing getUrlListSud2...
12739.0
Executing getUrlList8...
8481.0
3

C:\java\tiger\lang>jr TestUrlLists 1000
Executing getURLList...
39727.0
Executing getUrlListSud2...
13149.0
Executing getUrlList8...
8671.0
3

C:\java\tiger\lang>jr TestUrlLists 1000
Executing getUrlList8...
8572.0
Executing getUrlListSud2...
12437.0
Executing getURLList...
40459.0
3
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11922851
Ok I willtry that

And I  have done a small change in above code

If my assumtion is correct, this should work fast. I need to check with Exchange administrator for this.

I am assuming exchange is common.

public static final List getUrlList7(List l)
    {
          //http://192.12.0.62/exchange/Administrator
          Hashtable ht=new Hashtable();
          for(int i=0;i<l.size();i++)
          {
                String url=(String)l.get(i);
                int hashCount=0;
                StringBuffer key=new StringBuffer();
                for(int j=4;j<url.length();j++)
                {
                      //char c=url.charAt(j);
                      key.append(url.charAt(j));
                      if(key.charAt(key.length()-1)=='/')
                            hashCount++;
                      if(hashCount==3)
                            j=j+8;
                      if(hashCount==5)
                            break;
                      
                }
                List array=(List)ht.get(key.toString());
                if(array==null)
                {
                      array=new ArrayList();
                      ht.put(key.toString(),array);
                }
                array.add(url);
                
          }
            Enumeration e= ht.keys();
         while(e.hasMoreElements())
          System.err.println(e.nextElement());
          return Collections.list(ht.elements());
   
          
    }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11922893
it  improved performance by 300 milli seconds at my side

for folllowing test case

 public Vector getURLS() {
        Vector vect=new Vector();
        for(int i=0;i<100;i++) {
            for(int j=0;j<1000;j++) {
                vect.add("http://Test"+i+"/"+j+"/"+(i+j));
            }
        }
        //List[] array=getUrlList(vect); //7501 seconds
       return vect;
}
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11922954
i mean this

public Vector getURLS() {
        Vector vect=new Vector();
        for(int i=0;i<100;i++) {
            for(int j=0;j<1000;j++) {
                vect.add("http://Test"+i+"/exchange/"+(i+j));
            }
        }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923005
looking at ur code

if (forwardSlashCount == 2) {
                       // Compute hashcode of host
                       hostHashCode = 31 * hostHashCode + url.charAt(ixChar);
                  }
                  else if (forwardSlashCount == 4) {
                       accountBuffer.append(url.charAt(ixChar));
                  }


what you are trying to do on above snippet??

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923017
>>hostHashCode = 31 * hostHashCode + url.charAt(ixChar);

will this be unique for both http and https based ??
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11923020
Your last is actually slower than your previous:

C:\java\tiger\lang>jr TestUrlLists 10000
Executing getUrlListSud3...
130687.0
Executing getUrlList8...
84761.0
Executing getUrlListSud2...
126582.0
Executing getURLList...
394578.0

>>what you are trying to do on above snippet??

First block takes the hashcode of the host name
Second block distinguishes between http & https

The variable named 'accountBuffer' should better be called 'keyBuffer'
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923056
could you please post your whole test code??

I wanted to test the same at my end also

thanks
Sudhakar
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923067
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923095
>>hostHashCode = 31 * hostHashCode + url.charAt(ixChar);

Why specifically this one?? OR can we use any other logic also?? in identifying the hashcode
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11923162
>>Why specifically this one??

That's actually the central code of String's hashCode()

>>could you please post your whole test code??

Well there's quite a lot of it, but here are the essentials:

http://www.cehjohnson.uklinux.net/aire/MethodTimer.java

, which is 'work in progress' the obvious thing that's missing (great if you can put it in) is that it should be able to return the function's return value to the caller, as well as the time (simple custom class required) and this is being called from main thus:

            Class[] methodTypes = new Class[]{java.util.List.class};
            Object[] methodArgs = new Object[]{al};
            // Put test method names here
            final String[] METHOD_NAMES = {"getUrlList8", "getUrlListSud9"};
            //final String[] METHOD_NAMES = {"testStringTokenizer", "testSplit"};
            Stack methodStack = new Stack();
            methodStack.addAll(Arrays.asList(METHOD_NAMES));
            Collections.shuffle((java.util.List) methodStack);
            double aggregatedTime = 0.0;
            // Number of invocations passed in on the command line
            final int NUMBER_OF_INVOCATIONS = Integer.parseInt(args[0]);
            while (!methodStack.isEmpty()) {
                  String currentMethodName = (String) methodStack.pop();
                  System.out.println("Executing " + currentMethodName + "...");
                  for (int i = 0; i < NUMBER_OF_INVOCATIONS; i++) {
                        aggregatedTime += MethodTimer.timeStaticMethod("TestUrlLists", currentMethodName, methodTypes, methodArgs);
                  }
                  System.out.println(aggregatedTime);
                  aggregatedTime = 0.0;
            }



0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923361
it seems ur last code is only the final way. Anyhow I will  wait for some other experts inputs for some time. If nobody  participate here then the points will be ur's ;-)

thanks
sudhakar

 
0
 
LVL 92

Expert Comment

by:objects
ID: 11923485
I'll have a look when I get in the office tomorrow, whats the current state?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923556
CEHJ gave me this code


     public final static List getUrlList8(java.util.List urls) {
          Hashtable hostToUrls = new Hashtable(urls.size());
          StringBuffer accountBuffer = new StringBuffer(32);
          for (int i = 0; i < urls.size(); i++) {
               accountBuffer.setLength(0);
               String url = (String) urls.get(i);
               int forwardSlashCount = 0;
               int hostHashCode = 0;
               for (int ixChar = 4; forwardSlashCount < 5 && ixChar < url.length(); ixChar++) {
                    if (url.charAt(ixChar) == '/') {
                         forwardSlashCount++;
                    }
                    if (forwardSlashCount == 2) {
                         // Compute hashcode of host
                         hostHashCode = 31 * hostHashCode + url.charAt(ixChar);
                    }
                    else if (forwardSlashCount == 4) {
                         accountBuffer.append(url.charAt(ixChar));
                    }
               }
               // end for
               accountBuffer.append(url.charAt(4)).append(hostHashCode);
               final String key = accountBuffer.toString();
               ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
               if (urlsForHost == null) {
                    urlsForHost = new ArrayList();
                    urlsForHost.add(url);
                    hostToUrls.put(key, urlsForHost);
               }
               else {
                    urlsForHost.add(url);
               }
          }
         

After discussing the different test cases. At this moment it seems to better to me
0
 
LVL 92

Expert Comment

by:objects
ID: 11923583
> At this moment it seems to better to me

how much improvement is that giving you?
0
 
LVL 92

Expert Comment

by:objects
ID: 11923602
Get rid of that Hashtable for a start and replace it with a HashMap.
And give your ArrayList an initial size (you could make it as big as the number of elements remaining in urls).
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923632
My first posting took  7  seconds

where as CEHJ's took 1.2 seconds currently for 20 Milliion URLs  grouping

Thanks
Sudhakar
0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 1500 total points
ID: 11923644
Almost twice as fast as the last one, but needs more testing for accuracy. We really need more data than the test data in the link:

public final static List getUrlList10(java.util.List urls) {
            Hashtable hostToUrls = new Hashtable(urls.size());
            for (int i = 0; i < urls.size(); i++) {
                  String url = (String) urls.get(i);
                  int forwardSlashCount = 0;
                  int hostPlusAccountPlusProtocolHashcode = 0;
                  for (int ixChar = 4; forwardSlashCount < 5 && ixChar < url.length(); ixChar++) {
                        if (url.charAt(ixChar) == '/') {
                              forwardSlashCount++;
                        }
                        if (forwardSlashCount == 2 || forwardSlashCount == 4) {
                              // Compute hashcode of host
                              hostPlusAccountPlusProtocolHashcode = 31 * hostPlusAccountPlusProtocolHashcode + url.charAt(ixChar);
                        }
                  }
                  // end for
                  if (url.charAt(4) == 's') {
                        hostPlusAccountPlusProtocolHashcode = ~hostPlusAccountPlusProtocolHashcode;
                  }
                  Integer key = new Integer(hostPlusAccountPlusProtocolHashcode);
                  ArrayList urlsForHost = (ArrayList) hostToUrls.get(key);
                  if (urlsForHost == null) {
                        urlsForHost = new ArrayList();
                        urlsForHost.add(url);
                        hostToUrls.put(key, urlsForHost);
                  }
                  else {
                        urlsForHost.add(url);
                  }
            }
            return Collections.list(hostToUrls.elements());
      }
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923645
On higher end, no doubt that will help me.

curently I am working on exchange server project in which the current task is to download the 25 mails per minute using slide

Now currently I am able to reach the target of 15/min.

So I am currently searching for the code optimizations

thanks
sudhakar

 
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923656
>> Get rid of that Hashtable for a start and replace it with a HashMap.

Just discussing the same here
http://www.experts-exchange.com/Programming/Programming_Languages/Java/Q_21111028.html
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11923663
>>where as CEHJ's took 1.2 seconds currently for 20 Milliion URLs  grouping

Can you try the last one on the 20,000,000 and see if the result is accurate?
0
 
LVL 92

Expert Comment

by:objects
ID: 11923674
>> Get rid of that Hashtable for a start and replace it with a HashMap.
> Just discussing the same here

It will be faster.

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923689
yeah that's true took 985 milli secs
0
 
LVL 92

Expert Comment

by:objects
ID: 11923699
specifying an initial capacity for the list will give you further improvements.
Give it as much as you can afford (<= number of elements left in list)
0
 
LVL 92

Expert Comment

by:objects
ID: 11923703
wheres the list coming from anyway?
and do you have any analysis on the typical breakdown of the data?

0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923751
CEHJ on several test runs

What I have identified is ur last  code is almost double the spped.

>>wheres the list coming from anyway?

Acutally we are developing the project along with the client. client's programmers are developing their own API and we need to integrate the API with our's. And we did not get their API also yet for testing. So it is difficult to get the actual data.

So what i did is, I have created test environment on 10 machines and    testing my API

>> specifying an initial capacity for the list will give you further improvements.
 I will do that.  assuming 10,000 to 50,1000 urls for single list. How much initial capacity I can give to that??

Suppose if the list contains only 10 urls and if set the size to 100, will it occupy extra  memory??



Thanks
sudhakar
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11923841
How and where are you planning to set the capacity of the List?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11923880
Your last code is really a good Idea :-)
0
 
LVL 92

Expert Comment

by:objects
ID: 11923978
> Suppose if the list contains only 10 urls and if set the size to 100, will it occupy extra  memory??

Sure but your goal is to make it go as fast as you can isn't it, not to use less memory :)
0
 
LVL 92

Expert Comment

by:objects
ID: 11924026
that code does not seem to behave exactly the same as the requirements were specified earlier.
Exactly what is the possible formats that the code would need to handle, and how should they be broken down.

>  client's programmers are developing their own API and we need to integrate the API with
> our's. And we did not get their API also yet for testing. So it is difficult to get the actual data.

If you really need to squeeze your msec out of this, then you should be discussing with them to ensure they pass it to you in a form that is as ideal as possible for processing.

0
 
LVL 92

Expert Comment

by:objects
ID: 11924478
Earlier you said you could also recieve the format:
192.12.0.39/exchange/MailBox/Mail.eml

Is this still the case?

What is the definition of possible formats?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11924483
>>192.12.0.39/exchange/MailBox/Mail.eml

Sorry objects,  I forgot to mention that. This case was removed
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11924928
>>Your last code is really a good Idea :-)

Was that directed at me sudhakar? (sometimes i don't know which of us you're addressing ;-))
0
 
LVL 21

Assisted Solution

by:MogalManic
MogalManic earned 320 total points
ID: 11925349
No matter how you tweek the same routine It is still going to bo O(n) (Since the routine is just a for loop of each item in the list, the speed processing the list is some factor times the size of the list).

If your server is on a multi CPU box, you could get factor of O((n/m)lg(n/m)) by spliting the list into a number of M sublists.  Where M is the number of CPU's.  Here is the pseudocode:

int listSize=list.size();
for (m=0; m<#ofCPU's;m++) {
  BuildThreadWorker worker[m]=BuildThreadWorker(list, listSize*m, listSize*m+listSize/m);
  Thread t= new Thread(worker[m]);
  t.start();
}

//wait for workers to complete
int done=false
while(!done) {
  done=true;
  for(int m=0;m<#ofCPU's;m++) {
    if (worker[m].isDone()==false) {
         done=false;
        Thread.yield();  //Allow threads to work
        break;//check again
   }
}
MergeResults(list, listSize/m);

class BuildThreadWorker implements Runnable
{
 
  BuildThreadWorkder(List list, int start, int end)
  {
     this.list=list;
     this.start=start;
     this.end;
  }

  public void run()
  {
      for(int i=start;i<end;i++) {
          //Modified merge sort or quicksort of sublist
      }
  }
}

For large lists, it might be faster doing some sort of divide and conquer method.
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11926776
>>Was that directed at me sudhakar?

Yes :-)
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11927700
Actually, the last code i posted should not be used. It's late here - more later ;-)
0
 
LVL 92

Assisted Solution

by:objects
objects earned 180 total points
ID: 11927768
Also make a best guess for the initial capacity and load factor used by your Map.
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11927802
>> Actually, the last code i posted should not be used. It's late here - more later ;-)


why??
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11927807
>> load factor used by your Map.

Sorry, I didn't get you
0
 
LVL 92

Assisted Solution

by:objects
objects earned 180 total points
ID: 11927813
A lower load factor should improve the lookup times.
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11927834
Objects,
As earlier said, at following link
http://www.experts-exchange.com/Programming/Programming_Languages/Java/Q_21111028.html

other experts are getting negative results only regarding Hashmap. So I am not yet/able decide which to use
0
 
LVL 92

Expert Comment

by:objects
ID: 11927914
Both Maps use a load factor so you'd get an improvement whichever you use.
0
 
LVL 92

Expert Comment

by:objects
ID: 11927931
Can you email me a large set of test data and I'll do some benchmarking.
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11929410
>> Can you email me a large set of test data and I'll do some benchmarking

I will try to do that tomorrow. As today is Festival (Rakhee) Holiday in India

thanks
Sudhakar
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11931009
>>why??

Let's carry on as normal and i'll explain this later:

>>Can you email me a large set of test data

Please don't email anything, but post the link instead.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11931151
>>other experts are getting negative results only regarding Hashmap

HashMap will give you (in theory) only slightly better results, as the cost of synchronization is not as high as is commonly imagined
0
 
LVL 92

Expert Comment

by:objects
ID: 11937616
> Let's carry on as normal and i'll explain this later

????

> Please don't email anything

email is fine for me
0
 
LVL 30

Expert Comment

by:Mayank S
ID: 11939243
Moderators generally don't like seeing e-mail IDs or questions being solved through e-mail.
0
 
LVL 92

Expert Comment

by:objects
ID: 11939271
not solving it thru email, just need the test data.
And its too much to post here so email is the easiest, and its not directly relevant to the solution anyway.
0
 
LVL 30

Expert Comment

by:Mayank S
ID: 11939343
Yeah, I agree to that :)
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11939718
There's no problem with posting a link to test data, and absolutely no need for emails, quite apart from it being discouraged by the moderators.
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11939743
ok guys,

I have uploaded the testurls zip file at http://www.prithvisolutions.com/urllist.zip  I may remove this with in one or two days as I have uploaded at my comany's official site. I just get 2 days permission to place that zip file there.

OK coming to the scenario,

With the mail client that sends the multiple emails at a time I was able to send the mails to different exchange servers. All these are just test servers. So I am sorry because I am not able to give realtime Email URLS for this.

And using Slide API, I got the urllist that I have placed in a text file. It contains arround 16 lacs of urls list

Sridhar,
Welcome to this thread participation :-). And I hope u understand what exactly I am looking for by seeing the previous discussions.


Thanks,
Sudhakar
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11939771
I am increasing the points to 500 as you guys are too patience in helping me :-)
0
 
LVL 92

Expert Comment

by:objects
ID: 11939804
> There's no problem with posting a link to test data

And no difference :-D Except some inconvenience on suds.
No more of your petty complaints please, noone else had a problem so why should you.
It not like you even asked for any test data.
0
 
LVL 30

Expert Comment

by:Mayank S
ID: 11939821
>> And I hope u understand what exactly I am looking for by seeing the previous discussions

I've been listening to this but haven't been able to take out time for bench-marking something. BTW, I tried clicking on the link you posted (the ZIP file). It said Page cannot be displayed.
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11939861
sudhakar - it would be useful if you could post how many lists are meant to be produced by the test data
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11940008
>> It said Page cannot be displayed.
???. It is working fine at my side. I just tested on both IE and NS
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11940106

>>Let's carry on as normal and i'll explain this later:

CEHJ, will that won't work for some conditions??
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11940129
>>CEHJ, will that won't work for some conditions??

I'm not going to prejudice you against it at the moment - keep testing it ;-)

a. does it produce the right number of lists for the test data?
b. what *is* the right number of lists?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11950144
Please let me know if you have the copy of that zip file. I am revoing that by this evening

thanks
Sudhakar
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11950440
I've got it
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11951160
objects shall I remove that file??
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11962483
Ok CEHJ,

for the contents of that zip file

here is the key information

http://www.geocities.com/sudhakar_koundinya/urls.html
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11968546
Did my code produce the right number of lists? (i haven't tested)
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11970476
CEHJ,

for 1823794 URLS your method get 1860 lists
where as actuals are 1984

Still trying to identify the problem
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11970575
CEHJ,

here are the results http://www.geocities.com/sudhakar_koundinya/results.htm

for the following test case

import java.io.*;
import java.util.*;
class ReadURLS
{
static long urls=0;
      public static void main(String[] args)
      {
      try {
        BufferedReader in = new BufferedReader(new FileReader("c:\\urllist.txt"));
        String str;
        while ((str = in.readLine()) != null) {
                    getUrlList10(str);    //cehj          
                             process(str); //sudhakar
        }
                        System.err.println(ht.size());
                        Enumeration e=ht.keys();
                        while(e.hasMoreElements())
            {
                              String str1=(String)e.nextElement();
                              System.out.println("Key  : "+str1+"   Value : "+ht.get(str1));
            }
            
                        System.err.println("Current Size :"+ht.size()+" -> No of URLS "+urls);
        in.close();
            List l=Collections.list(hostToUrls.keys());
            long value=0;
            for(int i=0;i<l.size();i++)
            {
                  Long l1=(Long)l.get(i);
                              System.out.println("Key  : "+l1+"   Value : "+hostToUrls.get(l1));
            }
            System.out.println("No of Lists :"+l.size());


    } catch (IOException e) {
    }
      }
      public static void process(String str)
      {
            if(str==null) return;
            if(str.trim().length()<=0) return;
            StringBuffer sb=new StringBuffer();
            int counter=0;
            for(int i=0;i<str.length();i+=1)
            {
                  sb.append(str.charAt(i));
                  if(sb.charAt(sb.length()-1)=='/')
                  {
                        counter++;
                        if(counter==5)
                        {
                              Long i1=(Long)ht.get(sb.toString());
                              if(i1==null)
                              {
                                    i1=new Long(1);
                              }
                              else
                              {
                                    i1=new Long(i1.longValue()+1);
                              }
                              ht.put(sb.toString(),i1);
                              break;
                        }
                  }
            }
            urls++;
            if(urls%50000==0)
            System.err.println("Current Size :"+ht.size()+" -> No of URLS "+urls);
      }
      public final static void getUrlList10(String str) {

          //for (int i = 0; i < urls.size(); i++)
                    {
               String url = (String) str;
                                 if(str==null) return;
            if(str.trim().length()<=0) return;
               int forwardSlashCount = 0;
               int hostPlusAccountPlusProtocolHashcode = 0;
               for (int ixChar = 4; forwardSlashCount < 5 && ixChar < url.length(); ixChar++) {
                    if (url.charAt(ixChar) == '/') {
                         forwardSlashCount++;
                    }
                    if (forwardSlashCount == 2 || forwardSlashCount == 4) {
                         // Compute hashcode of host
                         hostPlusAccountPlusProtocolHashcode = 3 * hostPlusAccountPlusProtocolHashcode + url.charAt(ixChar);
                    }
               }
               // end for
               if (url.charAt(4) == 's') {
                    hostPlusAccountPlusProtocolHashcode = ~hostPlusAccountPlusProtocolHashcode;
               }
               Long key = new Long(hostPlusAccountPlusProtocolHashcode);
               Long urlsForHost = (Long) hostToUrls.get(key);
               if (urlsForHost == null) {
                    urlsForHost = new Long(1);
                    //urlsForHost.add(url);
                    hostToUrls.put(key, urlsForHost);
               }
               else {
                    urlsForHost=new Long(urlsForHost.longValue()+1);
                                                  hostToUrls.put(key, urlsForHost);
               }
                     _urls++;
                     if(_urls%100000==0)
                     System.err.println("Key :"+key+" Value :"+urlsForHost+"Urls Count :"+_urls);
          }


     }

    static      Hashtable hostToUrls = new Hashtable();
          static      Hashtable ht= new Hashtable();
            static long _urls=0;

}
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11979085
CEHJ,

any updates??
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11979088
Hello objects,

Are you able to find some benchmark solution??
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11979759
>>
for 1823794 URLS your method get 1860 lists
where as actuals are 1984
>>

Is that the one using String as the hash key or Integer?
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11979760
that is Integer
0
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 1500 total points
ID: 11979773
What did i tell you? ;-) Try the one using String(Buffer)
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11979783
But the later one is faster know??. Whay it is failing to get the right hashcode??
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11979805
I know it's faster, but there's an  problem inherent to it: Integer's hashCode is essentially 'synonymous' with its equals method. That means that if a hashcode collision *does* occur, and this is more likely with such large amounts of data as you have [too large for me to handle here unfortunately], then the equals method will not 'save' it from replacing the previous entry in the table. This is not the case with String, as the String needs to be equal as well as returning the same hashcode before the previous entry will get kicked out of the table.

Of course, i'm not certain that this is why the anomaly occurs - it could be an error in my code ;-)
0
 
LVL 14

Author Comment

by:sudhakar_koundinya
ID: 11980351
OK CEHJ,

I end up with ur StringBuffer related code

thanks
sudhakar
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 11981955
:-)
0

Featured Post

The top UI technologies you need to be aware of

An important part of the job as a front-end developer is to stay up to date and in contact with new tools, trends and workflows. That’s why you cannot miss this upcoming webinar to explore the latest trends in UI technologies!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

By the end of 1980s, object oriented programming using languages like C++, Simula69 and ObjectPascal gained momentum. It looked like programmers finally found the perfect language. C++ successfully combined the object oriented principles of Simula w…
Are you developing a Java application and want to create Excel Spreadsheets? You have come to the right place, this article will describe how you can create Excel Spreadsheets from a Java Application. For the purposes of this article, I will be u…
Viewers will learn about if statements in Java and their use The if statement: The condition required to create an if statement: Variations of if statements: An example using if statements:
This video teaches viewers about errors in exception handling.
Suggested Courses

721 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question