We help IT Professionals succeed at work.

Check out this week's podcast, "Dairy Farms to Databases: Community's Hand in Technology"Listen Now

x

i have an array with randomly placed numbers from 1 to 1000000 ,there is an integer duplicated in the array what is the easiest , perf. efficient way to find that value

359 Views
Last Modified: 2016-02-18
Hi,

I have been recently asked a question by someone and it drove me crazy as i could not give an efficient answer to a question.

so the trick is; there is an array with numbers  placed from 1 to 1 million randomly but one of the integers inside the array has been duplicated and i need to find this value in the most efficient way. what could this be?

Obvious answer is to get the first value of the array and compare this with the rest and then move to the next one etc.. but this is not what is wanted, there needs to be a way to very efficiently retrieve this but how?

could you help me on this puzzle like algorithm question? thanks
Comment
Watch Question

ozo
CERTIFIED EXPERT
Most Valuable Expert 2014
Top Expert 2015

Commented:
If the array contains 1000001 entries, with one each of all integer values from 1 to 1000000, except for one value that is duplicated, then you can take the sum of all elements and subtract 500000500000
Dave BaldwinFixer of Problems
CERTIFIED EXPERT
Most Valuable Expert 2014

Commented:
In databases, the 'standard' method is to sort the rows on the field you want to find the duplicates in.  Then duplicates will be in consecutive rows.  If you can sort the array, you can look for the occurrence of the same value in two consecutive positions.  I don't know C# but PHP has an array_sort function that I can use.  Sort functions are usually much more efficient than going thru the array and making comparisons to each and every element.  Then I can use 'foreach' to go thru the sorted array to find items that are duplicated.
ozo
CERTIFIED EXPERT
Most Valuable Expert 2014
Top Expert 2015

Commented:
If the array can contain some missing values as well as a duplicate value, then you might use a hash table or a bit vector to record numbers that you have seen and detect a duplicate when you find a number you have seen before
CERTIFIED EXPERT
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION
Michael FowlerSolutions Consultant

Commented:
You could move the array to a dictionary and either stop when you get an error or even get a count of the number of duplicates eg

Dictionary dict = new Dictionary<int, int>();    
foreach (int i in array)
{
   if (dict.ContainsKey(i))
      dict[i]++;
    else
        dict.Add (i, 1)
 }

Open in new window

Dictionary dict = new Dictionary<int, int>(); 
int dup = 0   
foreach (int i in array)
{
     if (dict.ContainsKey(i))
         dup = i;
         break;
}

Open in new window

ozo
CERTIFIED EXPERT
Most Valuable Expert 2014
Top Expert 2015
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION
Michael FowlerSolutions Consultant

Commented:
Second code block from my post is wrong. 2nd time lucky

Dictionary dict = new Dictionary<int, int>();
int dup = 0 
foreach (int i in array)
{
   if (dict.ContainsKey(i))
      dup = i;
      break;
    else
       dict.Add (i, 1)
 }

Open in new window

Hello

You can find the duplicate values using the following algorithm.

int my_array[] = { 50, 51, 32, 50, 7, 3, 7 };
	int size = sizeof(my_array) / sizeof(my_array[0]);

	for (int i = 0; i < size; i++)
	{
		for (int j = i + 1; j < size; j++)
		{
			if (my_array[i] == my_array[j])
				printf(" %d ", my_array[i]);
		}
	}

Open in new window


But also you can use 'Set' Associative containers which will ignore the duplicate values
AndyAinscowFreelance programmer / Consultant
CERTIFIED EXPERT

Commented:
I'd have suggested a dictionary - lot of the work already done for you - and check as you add to the dictionary if that key is in use.  BUT, Michael74 has already suggested that.
I note you request efficient.
You also talk of 1 to 1 million as the range of numbers being stored in the array but it is no clear how many slots there are in your array.  10, 100, 1000, 100,000,000 (extreme but not excluded from your wording as I understand it)
If the array has relatively few values then the 'double' looping is easy to code and understand and probably pretty much as efficient as the dictionary.  However with 100 million entries in the array (that still leaves a reasonable possibility of some numbers only appearing once) then that is very poor on performance.
Richard LeeSoftware Enthusiast
CERTIFIED EXPERT

Commented:
Hi,

Here is an approach in C# that is more commonly used in an RDBMS. Its effective and quite fast. Simply using Grouping.

var Duplicates = Numbers
	.GroupBy(x => x)
	.Where(x => x.Count() > 1)
	.Select(x => x.Key)
	.ToArray();

Open in new window


Any numbers found in the resulting array are duplicates. If you want to know how often each number of duplication then you can try the following query.

var Duplicates = Numbers
	.GroupBy(x => x)
	.Where(x => x.Count() > 1)
	.Select(x => new { Number = x.Key, Count = x.Count() })
	.ToArray();

Open in new window


A different approach from the previous answers but very fast.

Hope this helps.
CERTIFIED EXPERT
Most Valuable Expert 2011
Top Expert 2015
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION
Richard LeeSoftware Enthusiast
CERTIFIED EXPERT

Commented:
Bringing the figures to the table might clarify matters a bit. Using an example of 1 million randomly generated numbers such as the following yielded an elapsed time of 531.0978ms.

class Program
{
	static void Main(string[] args)
	{
		var rand = new Random((int) DateTime.Now.Ticks);
		var sw = new Stopwatch();

		var array = new int[1000000];

		for (int i = 0; i < array.Length; i++)
			array[i] = rand.Next(1000000);

		sw.Start();
			
		var duplicates = array
			.GroupBy(x => x)
			.Where(x => x.Count() > 1)
			.Select(x => x.Key)
			.ToArray();

		sw.Stop();

		Console.WriteLine("Elapsed Time: {0}", sw.Elapsed.TotalMilliseconds);
		Console.ReadKey();
	}
}

Open in new window


Hope this helps clarify matters.
CERTIFIED EXPERT
Most Valuable Expert 2011
Top Expert 2015

Commented:
@DaTribe

Oh, it's on like Donkey Kong now...    ; )

My code is slower because of the printing to the console. Take that bit out, and you're looking at around 15 ms (on my i7-3770).
CERTIFIED EXPERT

Commented:
1/2 second is a very long time, IMO.
Richard LeeSoftware Enthusiast
CERTIFIED EXPERT

Commented:
Okay I concede. :)
ozo
CERTIFIED EXPERT
Most Valuable Expert 2014
Top Expert 2015

Commented:
I'm still wondering if there is a reason the question author said "randomly placed numbers from 1 to 1000000" instead of "random numbers from 1 to 1000000"

Author

Commented:
@kaufman ,thank you for your great help and code of course !! it was really helpful
@ozo ,thanks for your help
@phoffric, thank you for your support

Gain unlimited access to on-demand training courses with an Experts Exchange subscription.

Get Access
Why Experts Exchange?

Experts Exchange always has the answer, or at the least points me in the correct direction! It is like having another employee that is extremely experienced.

Jim Murphy
Programmer at Smart IT Solutions

When asked, what has been your best career decision?

Deciding to stick with EE.

Mohamed Asif
Technical Department Head

Being involved with EE helped me to grow personally and professionally.

Carl Webster
CTP, Sr Infrastructure Consultant
Empower Your Career
Did You Know?

We've partnered with two important charities to provide clean water and computer science education to those who need it most. READ MORE

Ask ANY Question

Connect with Certified Experts to gain insight and support on specific technology challenges including:

  • Troubleshooting
  • Research
  • Professional Opinions
Unlock the solution to this question.
Join our community and discover your potential

Experts Exchange is the only place where you can interact directly with leading experts in the technology field. Become a member today and access the collective knowledge of thousands of technology experts.

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.