I need excel macro to implement the flowing formula
X=x-min(Xi) / max (x) - min (x)

Say we have following Columns and rows
A B C .... AO.
1- 20 30 70
2- 12 6 43
3- 25 92 67
4- 36 45 28
5- 22 29 34 ....

X=x-min(Xi) / max (x) - min (x)

Then the formula starting on Col A for A1 = 20 would be as follows:
MIN of all values in Col A = 12
MAX of all values in Col A = 36
So the A1 that is 20 would be (A1-12)/(36-12) = 8/24 = 0.3333
and with the same logic for A2 and A3 and A4 ……etc it would be
(A2-12)/(36-12) = 24/86 = 0.2791
etc ... for A3, A4, A5 till the end of column A, I usually use hold and drag the output of A1, which will automatically give me the output of A2, A3, A4,….. etc like this
20 0.333333
12 0
25 0.541667
36 1
22 0.416667

Then for column B
MIN of all values in Col A = 6
MAX of all values in Col A = 92
So: X=x-min(Xi) / max (x) - min (x)
= (B1-6)/(92-6)= 0.27907 and here I will hold and drag the output down it will automatically give me the B2, B3, B4, B5….etc
30 0.27907
6 0
92 1
45 0.453488
29 0.267442

Is that what you want ?
Install in here your data file and activate the button Formula I warn you it will take al lot of time as you hv lots of data. I tried by all means to have formula being populated automatically but without success.

Let it run for some 1000+ records and check the results in Col A if it is ok the just press again on the F5 in the code and it will continue running it will go thru each Column till it hit all the columns except the last 2.

To stop it when it is processing press on CTRL and BREAK at the same time.

If you're using Excel 2007/2010/2013.... =IFERROR((A1-MIN(A$1:A$5))/(MAX(A$1:A$5)-MIN(A$1:A$5)),"")
...or, to support earlier releases,... =IF(MAX(A$1:A$5)-MIN(A$1:A$5)=0,"",IFERROR((A1-MIN(A$1:A$5))/(MAX(A$1:A$5)-MIN(A$1:A$5)),""))

Are these formulas not fast enough? How many columns do you have? Are there other formulas? Conditional Formatting? Which version of Excel? How old is the PC?

Edit: I tested the second (slower) formula against 16,384 columns (and 5 rows) - as near as I could tell, the paste and calculation were instantaneous. My PC is a two year-old business laptop.

Regards,
Brian.

0

reoromanAuthor Commented:

MY is PC is new, PC is desktop (HP) Core i7 with 3.4GHZ and 8GB RAM.

I have excel 2010 64bit installed on windows 7 pro 64 bit.

I have two data set one with 494021 records and the other with 311029 records.

Apologies, but it's not obvious how you want the formula modified for the extra rows. Currently it's taking a block of 5 rows and analysing them over five rows, 20 rows down. Perhaps you could give me one column - it's the formulas I want, so by all means clear the data cells.

Past my bedtime, so I'll be back to this in the morning.

That link ends up trying to download an executable, which I'm not willing to run. Probably no great loss as I'm running 32bit Windows and Excel. If I had to process this data, I'd probably do it a chunk at a time, immediately replacing each chunk's formulas by their values. I'd also store the results in a separate file - that way I could analyse the results without having to drag the original data along. (This also gives you the possibility of merging the two datasets.)

Regards,
Brian.

0

reoromanAuthor Commented:

dear gowflow. for column A only, the code has only process 67959 out of 494021 in 15 minutes before i stop it. that means it needs whole day for the all 41 columns.

Yes I know I tried so loong to get it to work by formula but did not succeed so this is why I need from you the following:

Tell me about hte formula and how you do it:
You put in A2 this ?
=(A2-Min("A2:A496202"))/(Max("A2:A496202")-Min("A2:A496202"))
and drag it down ?
When you do this manually I mean I need the formula that you put when you do it manually.

As if it is like this then you are generating a Circular Reference error as the cell refrence itself.

Pls let me know.
gowflow

0

reoromanAuthor Commented:

I do it like this in SPSS. First i make SPSS calculate the maximum and minimum values, then i copy the results to this SPSS syntax and run it. it will do the command fast, but i have to deal with divide on zero error after that. here is the command.

Well I looked again at your dataset that you posted and this is not how you explained it at first !! You are coing each row all the columns where before you mentioned you do a Col A all rows then Col B all rows ....
It looks diffrent now
is this how you do it ? Row1 all columns then row2 all columns etc .... ?
gowflow

redmondb there are 1/2 a million row with 41 columns and ranges evaluation on the entire column MAX and MIN .... When I put the formula my PC SCREAMMSSSS WOOOOOOWWWW I am hurt in pain it dies like mad.

Hv tried to get the values instead of formula ... no success all turns to zeroes.

The best I had acheive is what I posted earlier is a turtle type of walk slow ... but sure will get there after 24 hours .!!!!

Good luck if you find a better way, I spend till 4 AM ystday on this with no luck ! My last question was if possibly this is a built in formula in Excel then it should be no sweat. Not too familiar with these probability/financial functions will wait for reoroman reply to my last post.

OK, I tried out the ideas I mentioned earlier. Please do the following...
- Create a new file and add the code below to it. Edit the Chunk_III macro to reflect your source book, sheet and range.
- Open your source book.
- Run the Chunk_III macro.

The macro adds a sheet and, column by column, writes the formulas and replaces them by their values. Even with 32bit Windows and Excel, this works fine (with a slight workaround), so it should be a walk in the park for reoroman's PC. (For 500,000 rows, my PC takes 7 or 8 seconds for each of the 39 columns.)

Edit: Gowflow, pending an answer from reoroman, I'm still using the original formulas. BTW, if Min and Max should be using the entire range then my approach would be even faster - I'd only need to do 39 of each!

Edit2: FWIW, my output file is a 149MB xlsb - or a 210MB xlsm.

Option ExplicitSub Chunk_III()Dim i As LongDim xRange As RangeDim xSheet_List As StringDim xRange_List As StringDim xBook_List As StringDim xSource As WorksheetxBook_List = "Lots_of_Columns.xlsb"xSheet_List = "Sheet1"xRange_List = "$A$2:$A$500001"If Not Book_Exists(xBook_List) Then MsgBox ("Please open """ & xBook_List & """ and retry - run cancelled.") Exit SubEnd IfIf Not Sheet_Exists(xSheet_List, xBook_List) Then MsgBox ("Sheet """ & xSheet_List & """ not found in """ & xBook_List & """ - run cancelled.") Exit SubEnd IfIf Range(xRange_List).Rows.Count Mod 5 <> 0 Then MsgBox ("The source range (" & xRange_List & ") must be a multiple of 5 rows - run cancelled.") Exit SubEnd IfSet xSource = Sheets.AddxSheet_List = "[" & xBook_List & "]" & xSheet_List & "!"Range("A2") = "=IFERROR((" & xSheet_List & "A1-MIN(" & xSheet_List & "A1:A5))/(MAX(" & xSheet_List & "A1:A5)-MIN(" & xSheet_List & "A1:A5)),"""")"Range("A3") = "=IFERROR((" & xSheet_List & "A2-MIN(" & xSheet_List & "A1:A5))/(MAX(" & xSheet_List & "A1:A5)-MIN(" & xSheet_List & "A1:A5)),"""")"Range("A4") = "=IFERROR((" & xSheet_List & "A3-MIN(" & xSheet_List & "A1:A5))/(MAX(" & xSheet_List & "A1:A5)-MIN(" & xSheet_List & "A1:A5)),"""")"Range("A5") = "=IFERROR((" & xSheet_List & "A4-MIN(" & xSheet_List & "A1:A5))/(MAX(" & xSheet_List & "A1:A5)-MIN(" & xSheet_List & "A1:A5)),"""")"Range("A6") = "=IFERROR((" & xSheet_List & "A5-MIN(" & xSheet_List & "A1:A5))/(MAX(" & xSheet_List & "A1:A5)-MIN(" & xSheet_List & "A1:A5)),"""")"Sheets.AddApplication.ScreenUpdating = False For i = 0 To 38 Debug.Print i & " - " & Now() xSource.Range("A2:A6").Copy Destination:=Range(xRange_List).Offset(0, i) Range(xRange_List).Offset(0, i).Formula = Range(xRange_List).Offset(0, i).Value Next Application.DisplayAlerts = False xSource.Delete Application.DisplayAlerts = TrueApplication.ScreenUpdating = TrueEnd SubFunction Book_Exists(xBook_Name As String) As BooleanBook_Exists = FalseOn Error Resume Next Book_Exists = (Workbooks(xBook_Name).Name = xBook_Name)On Error GoTo 0End FunctionFunction Sheet_Exists(xSheet_Name As String, Optional xBook As String) As BooleanIf xBook = "" Then xBook = ActiveWorkbook.NameSheet_Exists = FalseOn Error Resume Next Sheet_Exists = (Workbooks(xBook).Sheets(xSheet_Name).Name = xSheet_Name)On Error Resume NextEnd Function

Dear Brian, could you please send me the excel sheet ready with your code, so i can implemented directly. i made mistakes when i do as you mentioned above. thanks

0

reoromanAuthor Commented:

Hello Gentlemen, Sorry I was sleeping
Quotes 1 [gowflow]: Well I looked again at your dataset that you posted and this is not how you explained it at first !! You are coing each row all the columns where before you mentioned you do a Col A all rows then Col B all rows ....
It looks diffrent now
is this how you do it ? Row1 all columns then row2 all columns etc .... ?
gowflow
Answer: No, Row1 all columns then row2 all columns etc... , is not the way that formula does.
This should work on the columns now rows.
SPSS has columns from v1 to v41, and the bellow command will create beside those columns other columns from nv1 to nv41 by implementing the following formula:
Computer nv1 = (v1-(min value between data from 1 to 494021) which is 0.) / (The maximum value between data from 1 to 494021) which is 58329) - (min value between data from 1 to 494021) which is 0.)
so the minimum is 0 and the maximum is 58329 is static in V1 and it will be implemented in all data from 1 to 494021 on the V1 and will be saved in new column named NV1 see …
COMPUTE NV1 = ( V1 - 0 ) / ( 58329 - 0 )

Quotes 2 [gowflow]: One mor question what is the name of this function
A2 - MIN(Xi)/MAX(Xi)-MIN(Xi) ?? Maybe it is a preset function in Excel ??
gowflow
I do not think the formula is preseted in excel. It is used for min-max normalization is as equation:
Xn = (X0 – Xmin) / (Xmax – Xmin)
Where,
Xn = new value for variable X
X0 = current value for variable X
Xmin = minimum value in data set
Xmax = maximum value in data set

Please post the name of the file, the sheet and the range containing the data.

Edit: And please see earlier posts about exactly what the formula should be for 500,000 records v. 5. If my request isn't clear, please let me know.

Thanks,
Brian.

0

reoromanAuthor Commented:

The name of the file is: train
here is the data http://www74.zippyshare.com/v/46259397/file.html
data range is from A to AO column
the formula should be for 494021 records in every column from A to AO, did i got you right?

(1) What is the extension of the file? Edit: And the sheet name?
(2) As I mentioned, your link requires downloading and running an exe, which I'm not willing to do.
(3) Again, I need to know what formula you want! The one already provided worked a range of 5 - is that what you want for the 500,000 rows? Ideally, I'd like you to post the first column of the formulas, but if you at least gave me a half-dozen rows that might help.

I do not know where you click on the previous link so you got to download an executable file.
check the attached picture where you should click, as i tried it and it works fine.. any way the file .csv and i put it in rar as it is big file. zip.JPG

Thanks, reoroman, but I still need answers to my questions. Are they unclear?

(Edit: This has to run on your PC - I'm not uploading a 150MB file!)

0

reoromanAuthor Commented:

what do you mean by the sheet name? i name it train

quote [Again, I need to know what formula you want! The one already provided worked a range of 5 - is that what you want for the 500,000 rows?]

I wrote that .. it is Xn = (X0 – Xmin) / (Xmax – Xmin)
Where,
Xn = new value for variable X
X0 = current value for variable X
Xmin = minimum value in data set
Xmax = maximum value in data set

0

reoromanAuthor Commented:

please upload the new formula with your code here as gowflow did, and i will attach the data to it

0

reoromanAuthor Commented:

you do not need to upload any 150MB files, i sent you the data so you can check it as you want.

The name of the file is "train.csv", the name of the sheet is "train" and the range is "$A$1:$AO$494021".

Please post the formulas you'd expect to see in A494022 and A494023. Is that a problem?

Edit: Apologies, I just saw your previous post. What checking can I do with your data? The file gave me the file and sheet names plus the range, but what else does it give me?

I'm faitly certain that the formulas in the attached are wrong - the only thing going for them is that they produced the same answers as the examples in your original post. (A side effect is that the total no. of rows must be a multiple of 5 so the last row is ignored.)

Something I didn't make clear is that the range specified should just be for column A - the code assumes that the last column is AO.

There are a lot of blank cells in the output - Divide by Zero errors.

Finally, just open your csv file, open this file and press the blue button.

I should expect answers for all records in column A in new records say for example name AR.

example:
A494020 which its value is 0 would be after formula like this: =(A494020-0)/(58329-0) = 0
and
A494021 which its value is 0 would be =(A494021-0)/(58329-0) = 0

Xn = (X0 – Xmin) / (Xmax – Xmin)Where, Xn = new value for variable XX0 = current value for variable XXmin = minimum value in data setXmax = maximum value in data set

Now to go back to the data you posted say we take Col A and you have following values:
and suppose we only have 8 rows but in your case it would go to the maximum rows you have say 496000+

Col A
Row 2 20
3 0
4 60
5 25
6 1
7 18
8 0

When we want to convert A2 it is = 20 - Min(Row 2 to Row 8) / Max(Row 2 to Row 8) - Min(Row2 to Row 8)
A2 = 20 - 0 / 60 - 0 = 20 / 60 = 0.3333

so after running A2 we have now
Col A
Row 2 0.3333
3 0
4 60
5 25
6 1
7 18
8 0

Couple of things...
(1) It may be a 32bit thing, but it's probably a good idea to recycle Excel before starting.
(2) On my PC, at least, it appears that one big Max and Min from VBA is appreciably slower than 500,000 smaller ones from formulas. I may be able to fix that by simply moving the Max and Min's to the sheet.

More crossing posts - do you now have your own results?

Brian.

0

reoromanAuthor Commented:

God·damn excel. your excel code is good, it works like a hero.. thanks.. but i think after this experiment i will use excel for X O games next time. it took 12 minutes until it finished just column A. so that means it would take 12*41=492 minutes / 60 = 8.2 hours to finish the whole work, which SPSS finish it in 2 seconds.

yes it did as you mentioned. But maybe he is not seeing the data flashing coz no screen updating but we see trace in the debug. One issue though h has plenty of zeroes that you onverted to "" and in this case you hv less data and it look akward blank zillilines then some figures then wapo blank .... I think he will not like it much.

If suddenly u hv zeroes there it will be trumendously slow.
gowflow

Hey reoroman
Just noticed you closed the question ! tks v much but feel here the credit goes to redmondb he gave you a good fast solution !!!!

Shows ur a gentlman !!! not like an other question I spend most of overnite on it .... to get a nice donut !!! not that am after points or anything but it tell u much about people and atitude wow !

Anyway most important is to hv fun !
cheers
gowflow

Sorry, I don't understand - the only blanks in IV are from an error. The error would almost certainly be a Divide by Zero - and that would only happen when all the values in a column were the same.

Thanks. If you're comfortable with the VBE, the macro outputs timings into the Immediate Pane before processing each column.

Brian.

0

reoromanAuthor Commented:

no I am not good with VB. i use primitive way to check the time.. the stop watch .. lol
but till now 7 minutes has passed and no results.
maybe i have something wrong with my desktop. I will try it later on my laptop

0

reoromanAuthor Commented:

redmondb .. do you have knowledge with SPSS

0

reoromanAuthor Commented:

no, it works with long time... no worries.. thanks anyway.

but till now 7 minutes has passed and no results.
Updating the screen slows down Excel, so lots of big runs (including this one) tuen off screen updates until the run is finished. If you're happy with the results and will be running this job repeatedly then let me know and I'll show you how to see progress.

Edit: BTW, my run was less than 20 minutes.

do you have knowledge with SPSS
I'm sfraid not, but I'm sure you'll find lots of experts who are!

Brian.

0

reoromanAuthor Commented:

thanks ... i will stick with SPSS, really it takes maximum2 seconds... wow

The formula for A20 is...

=(A1-MIN(A$1:A$5))/(MAX(A$

... Drag & Fill or Copy & Paste as required.

Please note that I've not included code to avoid Divide by Zero errors. Would you like something for that?

Regards,

Brian.

.