macro to delete repeated records

hi, i need macro to delete repeated records in excel file. Thanks
reoromanAsked:
Who is Participating?
 
gowflowConnect With a Mentor Commented:
The following file should answer your question about duplicates.

To make it work, move your real data in that workbook (sheet should be called Training) then run the button remove duplicates and depending how large the data is it will take time it will give you  a trace on what it is doing step by step. As an indication for data of 500000 rows it should take almost 1/2 hour to complete giving you full trace.

The way it works it will not perform removing records in your live sheet buyt it will create a Temp sheet and put the new unique rows there.

It will take into concideration the last Column that has Y also as part of the duplicates if you do not want it to be affected then before runnning is just Clear all the Y that are in that column.

Try it and let me know.
gowflow
RemoveDuplicates.xlsm
0
 
gowflowCommented:
well depending on the file you may use the removeduplicates propert and it will only keep unique values like following


activesheet.range("C:C").Removeduplicates

Will in fact only keep unique values in Col C.

But As I have been following your specific case and as you are dealing with high volume of records RemoveDuplicates will not work in your case.

Can you pls put the specific sample table you need rows to be unique /
gowflow
0
 
reoromanAuthor Commented:
ok, before running the code, I have two questions:

i have rows from A to AQ, i did not create empty row AV-1 to AV-44, should i do this first?

what about if the last column is AP and it has values from 0 to 4, or from 1 to 5. (like in the old work) will the code work too?

thanks
0
Cloud Class® Course: Python 3 Fundamentals

This course will teach participants about installing and configuring Python, syntax, importing, statements, types, strings, booleans, files, lists, tuples, comprehensions, functions, and classes.

 
gowflowCommented:
Please create the first row AV1 etc .... and here do not create the last column Y and regardless if AP or AZ or 1 or Normal .... it does not matter as it is cheking the ENTIREROW and see if another ENTIREROW is similar then it will only keep 1

Isn't this what you want ??? the ENTIRE ROW to be checked ???
gowflow
0
 
reoromanAuthor Commented:
yes boss, that is what i need. you got it right... i will do it now.. cheers
0
 
gowflowCommented:
Your welcome. Anctiously waiting for your feedback on this.
gowflow
0
 
reoromanAuthor Commented:
my feed back is known ...
0
 
reoromanAuthor Commented:
amazing, surprising, astounding, Thanks a lot
0
 
gowflowCommented:
Your welcome gret to know it worked for you. Do not spare me if you need more help you can post the link in here.
Rgds/gowflow
0
 
reoromanAuthor Commented:
yes, i need something new.. i will put the link here after i post it.

the thing is. i need to apply mathematical formula in the data set.. i can implement it manulay... and this takes time.. I will get back to you with full detials.. Thanks
0
 
gowflowCommented:
ok fine.
gowflow
0
 
gowflowCommented:
Did you post the question you needed help with ?
gowflow
0
 
reoromanAuthor Commented:
hi... this is the formula, but i need to explain it well in one example to you.. so you might implement it well.. i need some times for that.
equation.JPG
0
 
reoromanAuthor Commented:
i did not post the question yet, before that i will inform you boss
0
 
gowflowCommented:
WOW it reminds me of my University Math courses !!!
ok just one question this is a formula but it applies to what columns ? as you have 43 !!!

I prefer we initiate conversation here till the idea is pretty clear then you can post .. or else we may go chaos !!

gowflow
0
 
reoromanAuthor Commented:
I think we can go for easier formula like this:
x=x-min(Xi) / max (x) - min (x)
=(D11-55)/(90-55)  in my case the 77 number was in D11 cell.
before    after
77            0.6286
55            0
84            0.8286
90            1

so as you can see above the 77 has become 0.6286 after applying the formula.
* I changed the decimal number places into 4 from format cells.

this formula should be implemented in all columns from (1 to 41)... 42 and 43 is not included.

so you have to calculate the mean value for every column, the maximum and minimum value, so that you can implement it in the formula.

if you have more question please ask. Thanks
0
 
gowflowCommented:
so let me get this right.

You mean to say you would have a worksheet that has 41 columns and X rows and they would all be in hte format
77 82 36 90 45 ....
All the worksheet would be this way ? not the present worksheet that has values like 0 0.2334  1 0.3453 etc ...

1) Is that what you want to apply the formula ?
2) If my previous assumption is correct then you would need this run to be done ONLY once right ? to convert the numbers into fractions using the formula ?
3) Again if my previous assumption is correct then let me get the essence of your formula then
Say we have following Columns and rows
A          B                 C   ....
20        30               70
12        06               43
25        92               67
36        45               28
22        29               34  ....

then your formula starting Col A for A1 whcih is 20 would be as follows:
MIN of all values in Col A = 12
MAX of all values in Col A = 36
So the A1 that is 20 would be (20-12)/(36-12) = 8/24 = 0.3333

and with the same logic for B1 it would be
(30-6)/(92-06) = 24/86 = 0.2791

etc ...

ok now if my previous assumptions are all correct then when we get to row2 we have something like this:

A                 B                   C   ....
0.3333        0.2791          1
12               06                  43
25               92                  67
36               45                  28
22               29                  34  ....

Now A2 = 12 the formula would be:
(12-0.3333)/(36-0.3333) = 11.6667/35.6667 = 0.3271

Is this what you are trying to acheive ?

Finally if all my previous assumptions are correct then I suggest you do not 'go for an easier formula' as you stated but go for the final formula like you posted. Here I would only need 1 clarification which is

formula is
Xi = New min + ( New max - New min) / ((Xi - Xmin)/(Xmax - Xmin))

what is meant by New min (I presume it is the new values that we have converted like in the above example it would be for col A 0.3333 ?? and case we had more rows produced to the New max would be the new Maximum of these values converted ? is that what is meant by this ? and Xmax Xmin would be the Max and Min of the values prior to conversion ???

please advise and sorry for this long explanation as need to have the details cleared.
gowflow
0
 
reoromanAuthor Commented:
Not like that … let me be more clear.
Say we have following Columns and rows
     A          B            C   .... AO.  (AP and AQ is we do not touche them)
1- 20        30          70
2- 12        6            43
3- 25        92          67
4- 36        45          28
5- 22        29          34  ....

X=x-min(Xi) / max (x) - min (x)

Then the formula starting on Col A for A1 = 20 would be as follows:
MIN of all values in Col A = 12
MAX of all values in Col A = 36
So the A1 that is 20 would be (A1-12)/(36-12) = 8/24 = 0.3333
and with the same logic for A2 and A3 and A4 ……etc it would be
(A2-12)/(36-12) = 24/86 = 0.2791
etc ... for A3, A4, A5 till the end of column A, I usually use hold and drag the output of A1, which will automatically give me the output of A2, A3, A4,….. etc like this
20      0.333333
12      0
25      0.541667
36      1
22      0.416667


Then for column B
MIN of all values in Col A = 6
MAX of all values in Col A = 92
So: X=x-min(Xi) / max (x) - min (x)
= (B1-6)/(92-6)= 0.27907  and here I will hold and drag the output down it will automatically give me the B2, B3, B4, B5….etc
30      0.27907
6      0
92      1
45      0.453488
29      0.267442

the last two columns i need them to be transferred in the new output sheet untouched.
Please tell me if it is not clear. Thanks
0
 
reoromanAuthor Commented:
regarding the first formula, i still do research about it. but for more information about it, please check the attached picture read here:
http://wiki.answers.com/Q/What_is_min-max_normalization

but for now we go for the formula:

X=x-min(Xi) / max (x) - min (x)

Thanks
min-max.JPG
0
 
gowflowCommented:
ok for
but for now we go for the formula:

X=x-min(Xi) / max (x) - min (x)

But you did not comment on my reasoning is it correct ???
If yes then I will need a sample data to be able to build the macro so I can test it correctly.
gowflow
0
 
reoromanAuthor Commented:
hi... i did comment on your reasoning... check the above comments..
by: reoromanPosted on 2013-02-05 at 19:36:49ID: 38856207
0
 
reoromanAuthor Commented:
sample data ... oh... yes... sure ... i will upload it now.. just give me 5 to 10 minutes please
0
 
gowflowCommented:
Yes sorry just saw it I thought it was my post as just looked the couple of last lines. So you do it by column not by row ?
gowflow
0
 
reoromanAuthor Commented:
yes by column.

here is the file. it took me long time until i prepare it, that is why i am late.

http://www53.zippyshare.com/v/70231260/file.html
0
 
gowflowCommented:
Ok I worked on it and just to make sure I got this right you want the values after the formula to be decimal 4 digits right ?
gowflow
0
 
reoromanAuthor Commented:
yes. thank you.
0
 
gowflowCommented:
ok fine you can shoot the question then,
gowflow
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.