Link to home
Start Free TrialLog in
Avatar of Student_101
Student_101Flag for Canada

asked on

Auto Correlation Function, Output Range and Regressiona Analysis data to Excel

Hi Experts,

I asked this question about 2 weeks ago, but I couldn't get the help I was looking for, so I will ask my question again.

So I am working on a frequency converter in VB6, and this is the part I need help with.

I have data for about 10 years, and in my flex grid, I have the Date, Value 1, and Value 2.

Each month has a certain number of data sets, and all the data is static, so no changes.

No from my flex grid, I need to get the ranges of the R-squared values between Value 1 and Value 2, such that, without skipping any date, and the values follow a pattern according to the dates, the function should be able to calculate the R-squared values closest to one, and then output the information in this format.

When the user clicks the command button, the command should output the results to excel, yes this is not the challenging part, but this is the problem

The user wants the results in a specified format.

Firstly the auto correlation function needs to work, so that we can get all the ranges of data from the beginning until the end with R-squared values closest to one. The data is sorted by date, and the program cant jump up and down to get the best R-squared values, it has to follow the date order, and within that order, as soon as it reaches a R-squared value closest to one, it should store the start and the end point, and export that to excel. This should go on until the whole data is analyzed, but a loop will take care of that.

Now the format, it should be something like this.

Start Date ----- End Date ----- Plot Date ( this is the middle date from the start and finish, and I need to show the count as well, how many points were between the start and finish) --- R-squared value for that Range with standard Error ----- Slope of that range with the standard Error ------ The Y intercept With the Standard Error and the other in built results that the Regression analysis within the Data analysis gives, such as the upper and lower 95% and all.

I also need a graph, that shows that Plot date VS. The Y intercepts, for the complete data set. This can be obtained from our columns above, like the Plot date column vs the Y-Intercept column.

The part that I urgently need help with is the auto correlation, and reporting those date ranges in excel, with the specified format.

I know how to export data to excel and draw the chart, I should be able to do that.

Now to provide a logical explanation to this scenario,  
As I tried to think of it in my mind, I realized that if the loop goes from the first to the 2nd value, keeping in mind that the values follow a pattern, then obviously we will have a R-squared value very close to 1, but this is not correct.

So to make the problem more simpler, the loop should look up to the first 8 values from the point it begins the loop, as every month has 4 data values, so we can look at 2 months data.

A temp variable can be created that checks how the R-squared value is affected by adding the first 6 values, the first 7 values, or the first 8 values, if the R-squared values keeps increasing as n increases, then we should keep increasing the loop, but if the R-squared value decreases as n increases, then we should keep the value of n that resulted in the highest R-squared value.

This is just a logical explanation to solve this problem,
now I need to put it into code.

So I will need a loop, a temp variable to keep track of n, and the R-squared value.
Might need If case.

I was thinking of doing this, but not sure

Implementing the Max function in excel, along with the correlation to find the max correlation and then report it.

I have the code for implementing the correl function, the code for formatting as well as the chart, but it gives me the R-squared value for the whole grid. I updated it a little bit, and I was able to get it for every row, but the only edition that needs to be made is to get the closest value to 1, and then report the required things.

I have also attached a sample output as well as input file.

If my question is still confusing, then please refer to my previous deleted question
https://www.experts-exchange.com/questions/24109514/Finding-R-squared-automatically-through-searching-range.html

Your help is much awaited and appreciated.
Kind regards.



''' Code for the correl Function
Public Function Corell()
   Dim objApp As Excel.Application
   Dim objBook As Workbook
   Dim objSheet As Worksheet
   On Error Resume Next
   Set objApp = GetObject("excel.application")
   If Err.Number <> 0 Then
      Set objApp = CreateObject("excel.application")
   End If
   Set objBook = objApp.Workbooks.Add
   Set objSheet = objApp.Sheets(1)
 
   Dim j As Integer, high As Integer
   
     
   'This would be the number of data elements
   If keeling_Select = True Then
   high = therowsel
   'High = cbo_RowSelected.List(cbo_RowSelected.ListCount - 1) 'cbo_RowSelected.ListCount - 1
   'For j = cbo_RowSelected.List(0) To cbo_RowSelected.List(cbo_RowSelected.ListCount - 1)
   For j = therow To high
           objSheet.Cells(j, 1).value = grd_result.TextMatrix(j, 2)
           objSheet.Cells(j, 2).value = grd_result.TextMatrix(j, 3)
             For thecol = 1 To grd_result.Cols - 1  'RS_find.Fields.Count
                    grd_result.Col = thecol
                    grd_result.ColSel = thecol
                    grd_result.Row = j
                    grd_result.CellBackColor = &HFFFF&
                    Next thecol
            Next
   
   Else
   
   
   Dim i As Integer
   high = i  ' grd_result.Rows - 1
   i = 2
   For j = 1 To grd_result.Rows - 1
   high = i
           objSheet.Cells(j, 1).value = grd_result.TextMatrix(j, 2)
           objSheet.Cells(j, 2).value = grd_result.TextMatrix(j, 3)
           i = i + 1
            objSheet.Range("a" & high + 1 & ":a" & high + 1).formula = "=Correl(a1:a" & high & ", b1:b" & high & ")^2"
            result = FormatNumber(CDbl(objSheet.Range("a" & high + 1 & ":A" & high + 1).value), 5, vbTrue)
            Debug.Print (result)
        Next
   End If
   
'   objSheet.Range("a" & high + 1 & ":a" & high + 1).formula = "=Correl(a1:a" & high & ", b1:b" & high & ")^2"
'            result = FormatNumber(CDbl(objSheet.Range("a" & high + 1 & ":A" & high + 1).value), 5, vbTrue)
'
   'MsgBox result
   
   objApp.DisplayAlerts = False
   Set objSheet = Nothing
   Set objBook = Nothing
   objApp.Quit
   Set objApp = Nothing
End Function
 
 
''' Code for Max function
Sub MaxValue()
Dim rg As Range
Dim MaxVal As Double
Dim MaxRow As Long, MaxCol As Long
Dim sName As String
Set rg = Selection
MaxVal = Application.Max(rg)
MaxRow = rg.Find(MaxVal, lookat:=xlWhole, LookIn:=xlValues).Row
MaxCol = rg.Find(MaxVal, lookat:=xlWhole, LookIn:=xlValues).Column
sName = rg.Cells(MaxRow - rg.Row + 1, 1)
[AA1:AA4].Value = Application.Transpose(Array(MaxVal, MaxRow, MaxCol, sName))
End Sub
 
 
''' Code for Exporting format and Chart to excel
Private Sub cmd_OutputKeelingPlot_Click()
   Dim objApp As Excel.Application
   Dim objBook As Workbook
   Dim objSheet As Worksheet
   Dim objExcelCI As Excel.Chart
   
   On Error Resume Next
   Set objApp = GetObject("excel.application")
   If Err.Number <> 0 Then
      Set objApp = CreateObject("excel.application")
   End If
   Set objBook = objApp.Workbooks.Add
   Set objSheet = objApp.Sheets(1)
   
   objApp.Visible = True
   objApp.UserControl = True
   objApp.WindowState = xlMaximized
   objApp.DisplayAlerts = False
   
   With objSheet
   .Cells(1, 1).value = " Starting Date "
   .Cells(1, 2).value = " Ending Date "
   .Cells(1, 3).value = " Plot Date "
   .Cells(1, 4).value = " n "
   .Cells(1, 5).value = " R-squared "
   .Cells(1, 6).value = " Standard Error "
   .Cells(1, 7).value = " Slope "
   .Cells(1, 8).value = " Standard Error "
   .Cells(1, 9).value = " Y-intercept "
   .Cells(1, 10).value = " Standard Error "
   
   ''' Have the code for the loop  that gives all the values to these columns and defiens the range
   
   .Range("A1:k1").Select
   
   With Selection
   
  .HorizontalAlignment = xlCenter
   .Range.VerticalAlignment = xlBottom
   .Range.WrapText = False
   .Range.Orientation = 0
   .Range.AddIndent = False
   .Range.IndentLevel = 0
   .Range.ShrinkToFit = True
   .Range.ReadingOrder = xlContext
   .Range.MergeCells = False
   .Range.Borders(xlDiagonalDown).LineStyle = xlNone
   .Range.Borders(xlDiagonalUp).LineStyle = xlNone
   
   With .Range.Borders(xlEdgeLeft)
        .LineStyle = xlContinuous
        .Weight = xlMedium
        .ColorIndex = xlAutomatic
    End With
    
    With .Range.Borders(xlEdgeTop)
        .LineStyle = xlContinuous
        .Weight = xlMedium
        .ColorIndex = xlAutomatic
    End With
    
    With .Range.Borders(xlEdgeBottom)
        .LineStyle = xlContinuous
        .Weight = xlMedium
        .ColorIndex = xlAutomatic
    End With
    
    With .Range.Borders(xlEdgeRight)
        .LineStyle = xlContinuous
        .Weight = xlMedium
        .ColorIndex = xlAutomatic
    End With
    
    .Range.Borders(xlInsideVertical).LineStyle = xlNone
    .Range.Borders(xlInsideHorizontal).LineStyle = xlNone
    
    End With
    
    .Range("A1:K1").Select
    
    With Selection
    
    .Range.Font.Bold = True
    .Range.Borders(xlDiagonalDown).LineStyle = xlNone
    .Range.Borders(xlDiagonalUp).LineStyle = xlNone
    
    With .Range.Borders(xlEdgeLeft)
        .LineStyle = xlContinuous
        .Weight = xlMedium
        .ColorIndex = xlAutomatic
    End With
    
    With .Range.Borders(xlEdgeTop)
        .LineStyle = xlContinuous
        .Weight = xlMedium
        .ColorIndex = xlAutomatic
    End With
    
    With .Range.Borders(xlEdgeBottom)
        .LineStyle = xlContinuous
        .Weight = xlMedium
        .ColorIndex = xlAutomatic
    End With
    
    With .Range.Borders(xlEdgeRight)
        .LineStyle = xlContinuous
        .Weight = xlMedium
        .ColorIndex = xlAutomatic
    End With
    Selection.Borders(xlInsideVertical).LineStyle = xlNone
   End With
   End With
 
   Set objExcelCI = objExcelW.Charts.Add
   With objExcelCI
          
    .ChartType = xlXYScatter
    .SetSourceData Source:=Sheets("Sheet1").Range("D57:D66,J57:J66"), _
        PlotBy:=xlColumns
        
    .Location Where:=xlLocationAsNewSheet, Name:="Keeling Plot"
    
    .HasTitle = True
    .ChartTitle.Characters.Text = "Keeling Plot"
    .Axes(xlCategory, xlPrimary).HasTitle = True
    .Axes(xlCategory, xlPrimary).AxisTitle.Characters.Text = "Y-Intercept"
    .Axes(xlValue, xlPrimary).HasTitle = True
    .Axes(xlValue, xlPrimary).AxisTitle.Characters.Text = "Plot Date"
    .HasLegend = False
    
    .PlotArea.Select
 
    With Selection.Border
        .ColorIndex = 16
        .Weight = xlThin
        .LineStyle = xlContinuous
    End With
    
    Selection.Fill.OneColorGradient Style:=msoGradientDiagonalUp, Variant:=3, _
        Degree:=0.231372549019608
        
    With Selection
        .Fill.Visible = True
        .Fill.ForeColor.SchemeColor = 43
    End With
    
    .SeriesCollection(1).Select
    
    With Selection.Border
        .ColorIndex = 1
        .Weight = xlHairline
        .LineStyle = xlDot
    End With
    
    With Selection
        .MarkerBackgroundColorIndex = 2
        .MarkerForegroundColorIndex = 1
        .MarkerStyle = xlCircle
        .Smooth = False
        .MarkerSize = 5
        .Shadow = False
    End With
    
    End With
       
       
   Set objSheet = Nothing
   Set objBook = Nothing
   Set objApp = Nothing
   Set objExcelCI = Nothing
   
End Sub

Open in new window

Sample-Output-For-keeling-Plot.xls
Keeling-Plot-data.xls
Avatar of VK
VK
Flag of Germany image

hello student 101,

you have written a lot of text...
Unfortunately it doesn't make it easier to understand your question.
I tried out your code, but it isn't complete: e.g. keeling_Select isn't defined
Avatar of Student_101

ASKER

Hi Vk,

I apologize for being unclear, but do you get the idea of what I am looking for?

I just need at auto correlation function to give me the ranges (start-date and end date) of R-squared values closest to 1 for my data.

I have updated the code and sent it to you, keeling_select is a boolean statement that becomes true when the user selects some rows from the grid.

We can ignore it for now.

Here is the code.

II was using this code to give me the R-squared values one by one,  as it reads in even row, but the function should calculate the R-squared value for a minimum of 6-8 points, and then within a range of about 8-12 points, it should check for the closest R-squared value to 1, and export its Start and End date.

Really appreciate your help !!

Public Function Corell()
   Dim objApp As Excel.Application
   Dim objBook As Workbook
   Dim objSheet As Worksheet
   On Error Resume Next
   Set objApp = GetObject("excel.application")
   If Err.Number <> 0 Then
      Set objApp = CreateObject("excel.application")
   End If
   Set objBook = objApp.Workbooks.Add
   Set objSheet = objApp.Sheets(1)
 
   Dim j As Integer, high As Integer 
   Dim i As Integer
   high = i  ' grd_result.Rows - 1
   i = 2
   For j = 1 To grd_result.Rows - 1
   high = i
           objSheet.Cells(j, 1).value = grd_result.TextMatrix(j, 2)
           objSheet.Cells(j, 2).value = grd_result.TextMatrix(j, 3)
           i = i + 1
            objSheet.Range("a" & high + 1 & ":a" & high + 1).formula = "=Correl(a1:a" & high & ", b1:b" & high & ")^2"
            result = FormatNumber(CDbl(objSheet.Range("a" & high + 1 & ":A" & high + 1).value), 5, vbTrue)
            Debug.Print (result)
        Next
   End If
   
'   objSheet.Range("a" & high + 1 & ":a" & high + 1).formula = "=Correl(a1:a" & high & ", b1:b" & high & ")^2"
'            result = FormatNumber(CDbl(objSheet.Range("a" & high + 1 & ":A" & high + 1).value), 5, vbTrue)
'
   'MsgBox result
   
   objApp.DisplayAlerts = False
   Set objSheet = Nothing
   Set objBook = Nothing
   objApp.Quit
   Set objApp = Nothing

Open in new window

OK.

As i see the code above copies the content of 2 columns of your >>flex grid<< into a excel sheet.
Now it comes to the interesting:
objSheet.Range("a" & high + 1 & ":a" & high + 1).formula = "=Correl(a1:a" & high & ", b1:b" & high & ")^2"

when we start at rownumber 1 then high is 3 and the result is:
objSheet.Range("a4:a4").formula = "=Correl(a1:a3, b1:b3)^2"

That means that for each row you are calculating the square correl for ALL rows before.
The results are written 3 rows below the current row, right?

>>
II was using this code to give me the R-squared values one by one,  
as it reads in even row, but the function should calculate the R-squared value for a minimum of 6-8 points,
and then within a range of about 8-12 points,
it should check for the closest R-squared value to 1,
and export its Start and End date.
<<

>>a minimum of 6-8 points<<
Your function already calculates the R-squared values
for all points which are already written in the excel sheet.
But before row 7 you havent enough points to do that

>>and then within a range of about 8-12 points<<
Do you mean looking back 8-12 points?
You have to decide if 8,9,10,11 or 12.
Looking back is easy in changing:

=Correl(a1:a" & high & ", b1:b" & high & ")^2

I can post you a function which you can insert into the string above.

We can continue with >>check for the closest R-squared value to 1<<
if you can explain me the obscurities.
Hi VK,

Glad to hear back from you.

I actually posted this question again, as you said my question wasn't very clear
(https://www.experts-exchange.com/questions/24141217/Auto-Correlation-Function-Output-Range-and-Regression-Analysis-data-to-Excel.html)
I wanted to just edit it, but it was not possible.

I updated my input file, as the one I attached before had data for 1998 which was not very consistent and not giving good R-square values.

Can we continue this question in that thread, or should I upload those files here?

and my description in that question is very concise.

Appreciate your help :)
To be very clear and concise.

I want the closest R-square value to 1 between points 1- 15, with a minimum of 6-8 points, and a maximum of 15 points :)

And then the function just starts doing this again from the last row that it left off.
From the macro you gave me,

and after some thought, how about we make the function like this.

It should look at 6 days of data, meaning 6 rows, and then gives us the highest RSQ value before the pattern starts to decrease, with its range (start/End Date)

I think this should be very easy to achieve, as we now have concrete number of days, and as soon as we find the pattern of RSQ going down, we report the RSQ value with its range.

I would like to provide this functionality just as a second choice to the user, but I am currently working on the manual part to get it from from the rows selected in flexgrid and from the dates in the combo box, but I would appreciate if you could help me with this approach, using 6 days, and reporting the max RSQ before the pattern starts to decrease.

Regards,
Macro (From Patrick)
Sub ranger()
Dim startrow As Long
Dim i As Long
Dim n As Long
 
i = 1
startrow = 2
For n = 1 To 400
    Cells(startrow + i, 7).FormulaR1C1 = _
"=RSQ(R" & startrow & "C3:R" & startrow + i & "C3" & ",R" & startrow & "C4:R" & startrow + i & "C4)"
    If Cells(startrow + i, 7) < Cells(startrow + i - 1, 7) And i >= 6 And i <= 15 And Cells(startrow + i - 1, 7) > 0.984 Then
'        Cells(startrow + i - 1, 7).Font.Bold = True
        Cells(startrow + i, 7).ClearContents
        startrow = startrow + i
        i = 0
    End If
    i = i + 1
'MsgBox "i = " & i
Next n
 
End Sub

Open in new window

File attached - press the button, make the entries requested and press 'Submit'

Sample-Data-for-keeling-Plot-06.xls
Hi Patrick,

I tried running it from my laptop at home, but i think my edition of excel doesn't support it.

I will check from work tomorrow morning and get back to you then.

Thanks once again.
Sorry, the macros were disabled in my laptop.

Patrick, I still dont see the Start/End dates for those RSQ values
Patrick, all this does is gives us the corresponding m,c, RSQ

not what we were looking for in the function, that once it reaches the max RSQ, it starts over again
Patrick,

after playing around with the function, I realized that by setting the max threshhold to 0, we get the desired ouput,

But shouldn't the max be 1?
It only seems to work until row 110

Please see attached file
Copy-of-Sample-Data-for-keeling-.xls
ASKER CERTIFIED SOLUTION
Avatar of patrickab
patrickab
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial