Link to home
Start Free TrialLog in
Avatar of cosmicgazer
cosmicgazer

asked on

Convert VBA Module to T-SQL

I am trying to convert the below VBA sub procedure to T-SQL. I am new to MS SQL Server and would appreciate any assistance. Ideally I would like to create a stored procedure.

VBA Script -

Sub CreateCentroidTablesNew1()

    Dim tblRegion As String
    Dim numRegion As Integer
    Dim regionID As Integer
    Dim numVertex As Long
    Dim vertexID As Long
    'Dim openDynaset As String
    Dim x As Double
    Dim y As Double
   
    'tblRegion = "dbo_westminster_const_region6" '' adapt for user input
    tblRegion = "dbo_scotland_and_wales_const_region5"
   
   
''define recordset
Dim dbDatabaseToRead As DAO.Database
Dim boundaryLine As Recordset

Set dbDatabaseToRead = CurrentDb
Set boundaryLine = dbDatabaseToRead.OpenRecordset(tblRegion, dbOpenDynaset)
''
   
    x = 0
    y = 0
   
   
    numRegion = DMax("id", tblRegion) - DMin("id", tblRegion) + 1 ' later subtract DMin
'    MsgBox numRegion & " regions found"
   
    regionID = DMin("id", tblRegion) ' later adapt to DMin
       
    For regionID = 87 To DMax("id", tblRegion)

        numVertex = DMax("RunningCounterModID", tblRegion, "id =" & regionID) - DMin("RunningCounterModID", tblRegion, "id =" & regionID) + 1
'        MsgBox numVertex & " vertices found for region " & regionID
       
        vertexID = DMin("RunningCounterModID", tblRegion, "id =" & regionID)
       
        While vertexID < DMax("RunningCounterModID", tblRegion, "id =" & regionID)
       
            boundaryLine.FindFirst "[RunningCounterModID]=" & vertexID & " AND [id] =" & regionID
            x = x + boundaryLine![X-Coordinate]
            y = y + boundaryLine![Y-Coordinate]
       
       
'            x = x + DLookup("[X-Coordinate]", tblRegion, "id =" & regionID & " AND RunningCounterModID =" & vertexID)
'            MsgBox "X Centroid of region " & vertexID & " = " & x
'            y = y + DLookup("[Y-Coordinate]", tblRegion, "id =" & regionID & " AND RunningCounterModID =" & vertexID)
'            MsgBox "Y Centroid of region " & vertexID & " = " & y
       
        Debug.Print "Lookup - VertexID: (" & vertexID & "), X-Coordintate: (" & x & "), Y-Coordinate: (" & y & ")"
       
        vertexID = vertexID + numVertex / 1000

      Wend

        x = x / numVertex
        y = y / numVertex

        Dim dbDatabaseToWrite As DAO.Recordset
        Set dbDatabaseToWrite = dbDatabaseToRead.OpenRecordset("centroid_table_devolved_const")

    '' write to table
   
        dbDatabaseToWrite.AddNew
        dbDatabaseToWrite("id").Value = regionID
        dbDatabaseToWrite("X-Centroid").Value = x
        dbDatabaseToWrite("Y-Centroid").Value = y
        dbDatabaseToWrite.Update
       
        Debug.Print "Writing to table - RegionID: (" & regionID & "), XCoordintate: (" & x & "), YCoordinate: (" & y & ")"

    Next regionID

End Sub

Many Thanks,

CG
Avatar of Aaron Tomosky
Aaron Tomosky
Flag of United States of America image

My problem is that I don't know Vba well enough to follow this.
Avatar of cosmicgazer
cosmicgazer

ASKER



This script basically takes (x, y) parameters from a source table and processes the coordinates into the below spatial algorithm. The output is then written to destination table.

As the source table contains several millions rows of data, the processing rate is an issue !

Ideally if the script can be adopted into a stored procedure this could then be automated.
You are going to have to provide a schema of all the tables involved.  A SQL script populating those tables with some sample data and a clear description of what you would expect to see once the update is done.

Included two attachments.

doc_rptSampleDataObjects - Sample data (Source and Destination)
doc_rptObjects - Table schema (Source and Destination)

Hope this helps !



I mean the schema for your tables on SQL Server not MS Access. As in CREATE TABLE YourTableName(bla, bla, bla)

Sure, enclosed the tSQL of the source and destination tables.

Hope this helps !

Thanks tSQL-DestinationTable.txt tSQL-SourceTable.txt
Ok.  Now I need some sample data and your expected result in your destination table.

Enclosed a MS Excel workbook. The first worksheet contains sample data and the second tab has the expected output.

In addition have I also included the same sample data as a comma separated CSV file.

MS Excel work

 doc-rptSampleDataObjects-2011042.xls


CSV file

 doc-rptSampleDataObjects-2011042.csv
 doc-rptSampleDataObjects2-201104.csv
Ok, I think I see what you are trying to do and it should be possible to write a single SQL Statement to output the results.  However, for clarity I will try and translate literally your code to T-SQL using some (ugh!) CURSORs or some other looping structure.  This will not be as performant, but should be easier for you to follow.

This is a tad complicated, so if you do not have a solution I will endeavor to work on it tomorrow night.

Incidentally, I was unable to open or download the file doc-rptSampleDataObjects-2011042.csv, however I suspect it is just a repeat of what I extracted from the spreadsheet.
Great !

The main issue is the processing rate. This is principally due to the time taken to look up the (x, y) parameters in the source table.

The algorthim/method within the script only takes seconds to calculate the position aswell as writing to the destination table.

At present I do not have a solution but rewriting the VBA to T-SQL will signficantly increase the processing rate as MS Access only uses 10 - 20 of the overall processor/cores.

Also, your assumption is correct, the CSV files were copies of the Excel spreadsheets. There should be around 26,000 is sample data.

Thanks very much for your assistance, much apprciated !


 

ASKER CERTIFIED SOLUTION
Avatar of Anthony Perkins
Anthony Perkins
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial

Thank you for the literal translation of the MS Access VBA script.

With regards to the caveats please see below my thoughts.

1). I agree that by the spilting the data into subsets it becomes more manageable. With that in mind could you guide me into adapting your script into more manangeable chunks, say every 10k records so it reduces the processing time from the current 30 seconds ?

2). From the research I have being undertaking I need an approximation not a precise value. This is then used in the second phase of the project to determine which constituencies hold which zipcodes/postcodes. (See below for second phase of the VBA - TSQL conversion.)

3). The initaliation of the (x, y) parameters is purely used to reset the counter. This is just a delimiter, but I was glad logically speaking the script reset the parameters on every new region.

In the second part of the conversion, it takes the centriod value of the region (x, y) and matches to see where the zipcodes/postcodes can be found. If it cannot find a region in then moves onto the next region until a match is found. If a match is found it brings backs the constituency name and writes to a separate table 'WriteTable'.

Please see below VBA found in the GeoSpatial6 module.

This module consists of the following

Sub postcodeLoop - here the postcode is tested to determine which region it belongs too. This is based on locating all surrounding regions within a 200 km radius.

Function regionLoop - as the name suggests, this is a further loop to determine which constituency the region belongs too.

Function pnpoly - is the second loop, kind of like.... a loop within a loop. Here it finally works out the brings backs the name of the constituency the postcode belongs too.

Could I therefore ask for assistance to convert the remaining VBA scripts to T-SQL.

Enclosed attachment of zipped MS Access 2007 database plus text files containing the VBA script and the T-SQL of the table objects.



 GeoSpatial-db-centriod-64-bit.zip VBA-GeoSpatial6.txt
>>1) I agree that by the spilting the data into subsets it becomes more manageable.<<
I was not suggesting you split the data into subset, but rather your ultimate goal is to do the whole thing in a single SQL Statement rather than going RBAR (Row By Agonizing Row)

2) It is not a question of aproximation, my results don't even come close to yours.  See for yourself:
Your results:
TableID      id      X-Centroid      Y-Centroid
604      87      4455.21            5385.34
605      88      30021.43            56343.65
606      89      25132.76            50780.82
607      90      1071.15            5273.96

My output:
TableID      id      X-Coordinate      Y-Coordinate
1      87      224865.23      171055.68
2      88      39790.30            18455.97
3      89      192108.23      215826.23
4      90      51576.90            41659.06

3) >> I was glad logically speaking the script reset the parameters on every new region.<<
That is just it, I am reseting them.  I was trying to keep my code as close as possible to yours.


>>Could I therefore ask for assistance to convert the remaining VBA scripts to T-SQL<<
We need to get the first one resolved first.
Apologises for not being in contact, I was away on business !

In any case, I was looking for approximation ! My main concern was to find a more productive way to query the paramters, rather than RBAR approach.

Any other alternatives ?
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.