Convert VBA Module to T-SQL

Posted on 2011-04-22
Last Modified: 2012-05-11
I am trying to convert the below VBA sub procedure to T-SQL. I am new to MS SQL Server and would appreciate any assistance. Ideally I would like to create a stored procedure.

VBA Script -

Sub CreateCentroidTablesNew1()

    Dim tblRegion As String
    Dim numRegion As Integer
    Dim regionID As Integer
    Dim numVertex As Long
    Dim vertexID As Long
    'Dim openDynaset As String
    Dim x As Double
    Dim y As Double
    'tblRegion = "dbo_westminster_const_region6" '' adapt for user input
    tblRegion = "dbo_scotland_and_wales_const_region5"
''define recordset
Dim dbDatabaseToRead As DAO.Database
Dim boundaryLine As Recordset

Set dbDatabaseToRead = CurrentDb
Set boundaryLine = dbDatabaseToRead.OpenRecordset(tblRegion, dbOpenDynaset)
    x = 0
    y = 0
    numRegion = DMax("id", tblRegion) - DMin("id", tblRegion) + 1 ' later subtract DMin
'    MsgBox numRegion & " regions found"
    regionID = DMin("id", tblRegion) ' later adapt to DMin
    For regionID = 87 To DMax("id", tblRegion)

        numVertex = DMax("RunningCounterModID", tblRegion, "id =" & regionID) - DMin("RunningCounterModID", tblRegion, "id =" & regionID) + 1
'        MsgBox numVertex & " vertices found for region " & regionID
        vertexID = DMin("RunningCounterModID", tblRegion, "id =" & regionID)
        While vertexID < DMax("RunningCounterModID", tblRegion, "id =" & regionID)
            boundaryLine.FindFirst "[RunningCounterModID]=" & vertexID & " AND [id] =" & regionID
            x = x + boundaryLine![X-Coordinate]
            y = y + boundaryLine![Y-Coordinate]
'            x = x + DLookup("[X-Coordinate]", tblRegion, "id =" & regionID & " AND RunningCounterModID =" & vertexID)
'            MsgBox "X Centroid of region " & vertexID & " = " & x
'            y = y + DLookup("[Y-Coordinate]", tblRegion, "id =" & regionID & " AND RunningCounterModID =" & vertexID)
'            MsgBox "Y Centroid of region " & vertexID & " = " & y
        Debug.Print "Lookup - VertexID: (" & vertexID & "), X-Coordintate: (" & x & "), Y-Coordinate: (" & y & ")"
        vertexID = vertexID + numVertex / 1000


        x = x / numVertex
        y = y / numVertex

        Dim dbDatabaseToWrite As DAO.Recordset
        Set dbDatabaseToWrite = dbDatabaseToRead.OpenRecordset("centroid_table_devolved_const")

    '' write to table
        dbDatabaseToWrite("id").Value = regionID
        dbDatabaseToWrite("X-Centroid").Value = x
        dbDatabaseToWrite("Y-Centroid").Value = y
        Debug.Print "Writing to table - RegionID: (" & regionID & "), XCoordintate: (" & x & "), YCoordinate: (" & y & ")"

    Next regionID

End Sub

Many Thanks,

Question by:cosmicgazer
    LVL 38

    Expert Comment

    by:Aaron Tomosky
    My problem is that I don't know Vba well enough to follow this.

    Author Comment


    This script basically takes (x, y) parameters from a source table and processes the coordinates into the below spatial algorithm. The output is then written to destination table.

    As the source table contains several millions rows of data, the processing rate is an issue !

    Ideally if the script can be adopted into a stored procedure this could then be automated.
    LVL 75

    Expert Comment

    by:Anthony Perkins
    You are going to have to provide a schema of all the tables involved.  A SQL script populating those tables with some sample data and a clear description of what you would expect to see once the update is done.

    Author Comment


    Included two attachments.

    doc_rptSampleDataObjects - Sample data (Source and Destination)
    doc_rptObjects - Table schema (Source and Destination)

    Hope this helps !


    Author Comment

    LVL 75

    Expert Comment

    by:Anthony Perkins
    I mean the schema for your tables on SQL Server not MS Access. As in CREATE TABLE YourTableName(bla, bla, bla)

    Author Comment


    Sure, enclosed the tSQL of the source and destination tables.

    Hope this helps !

    Thanks tSQL-DestinationTable.txt tSQL-SourceTable.txt
    LVL 75

    Expert Comment

    by:Anthony Perkins
    Ok.  Now I need some sample data and your expected result in your destination table.

    Author Comment


    Enclosed a MS Excel workbook. The first worksheet contains sample data and the second tab has the expected output.

    In addition have I also included the same sample data as a comma separated CSV file.

    MS Excel work


    CSV file

    LVL 75

    Expert Comment

    by:Anthony Perkins
    Ok, I think I see what you are trying to do and it should be possible to write a single SQL Statement to output the results.  However, for clarity I will try and translate literally your code to T-SQL using some (ugh!) CURSORs or some other looping structure.  This will not be as performant, but should be easier for you to follow.

    This is a tad complicated, so if you do not have a solution I will endeavor to work on it tomorrow night.

    Incidentally, I was unable to open or download the file doc-rptSampleDataObjects-2011042.csv, however I suspect it is just a repeat of what I extracted from the spreadsheet.

    Author Comment

    Great !

    The main issue is the processing rate. This is principally due to the time taken to look up the (x, y) parameters in the source table.

    The algorthim/method within the script only takes seconds to calculate the position aswell as writing to the destination table.

    At present I do not have a solution but rewriting the VBA to T-SQL will signficantly increase the processing rate as MS Access only uses 10 - 20 of the overall processor/cores.

    Also, your assumption is correct, the CSV files were copies of the Excel spreadsheets. There should be around 26,000 is sample data.

    Thanks very much for your assistance, much apprciated !


    LVL 75

    Accepted Solution

    Please find below a literal translation of your VBA code. But first the caveats:
    1. It is a terrible way to execute code in SQL Server.  You need to start thinking in sets of data instead of the prehistoric row navigation as employed by many still using MS Access.  This is not a nicety, but more of a necessity as you can tell by the time it takes to process these few rows.  In my case and admitedly without any type of indexing it took over 30 seconds.

    2. More importantly for the immediate problem it returns the wrong results (see below) as compared to the results you posted.  However, it should be pretty easy to debug, as I have included the same Print statements you have, so this will allow you to easily compare.  If I was to guess, I suspect it has to do with how T-SQL rounds integers when dividing two values.

    3. I notice that you only initialize X and Y at the start and not after every region.  I have duplicated that, however I am not sure if that is correct.
    DECLARE	@X decimal(18,2), 
    	@Y decimal(18,2),
    	@NumRegion integer,
    	@RegionID integer,
    	@MaxRegionID integer,
    	@NumVertex integer,
    	@VertexID integer,
    	@MaxVertexID integer
    SELECT	@X = 0,
    	@Y = 0,
    	@NumRegion = MAX(id) - MIN(id) + 1,
    	@RegionID = MIN(id),
    	@MaxRegionID = MAX(id)
    FROM	dbo.westminster_const_region6
    WHILE @RegionID <= @MaxRegionID 
    		SELECT	@NumVertex = MAX(RunningCounterModID) - MIN(RunningCounterModID) + 1,
    			@VertexID = MIN(RunningCounterModID),
    			@MaxVertexID = MAX(RunningCounterModID)
    		FROM	dbo.westminster_const_region6
    		WHERE	id = @RegionID
    		WHILE @VertexID < @MaxVertexID
    				SELECT	@X = @X + [X-Coordinate],
    					@Y = @Y + [Y-Coordinate]
    				FROM	dbo.westminster_const_region6
    				WHERE	id = @RegionID
    					AND RunningCounterModID = @VertexID
    				PRINT 'Lookup - VertexID: (' + CAST(@VertexID AS varchar(10)) + '),  
    					X-Coordinate: (' + CAST(@X AS varchar(20)) + '), 
    					Y-Coordinate: (' + CAST(@Y AS varchar(20)) + ')'
    				SET @VertexID = @VertexID + @NumVertex / 1000
    		SELECT	@X = @X / @NumVertex,
    			@Y = @Y / @NumVertex
    		INSERT	dbo.centroid_table_westminster(
    			id, [X-Coordinate], [Y-Coordinate])
    		VALUES  (@RegionID, @X, @Y)
    		Print 'Writing to table - RegionID: (' + CAST(@RegionID AS varchar(10)) + '), 
    			XCoordinate: (' + CAST(@X AS varchar(20)) + '), 
    			YCoordinate: (' + CAST(@Y AS varchar(20)) + ')'
    		SET @RegionID = @RegionID + 1
    TableID	id	X-Coordinate	Y-Coordinate
    1	87	224865.23		171055.68
    2	88	39790.30		18455.97
    3	89	192108.23		215826.23
    4	90	51576.90		41659.06

    Open in new window


    Author Comment


    Thank you for the literal translation of the MS Access VBA script.

    With regards to the caveats please see below my thoughts.

    1). I agree that by the spilting the data into subsets it becomes more manageable. With that in mind could you guide me into adapting your script into more manangeable chunks, say every 10k records so it reduces the processing time from the current 30 seconds ?

    2). From the research I have being undertaking I need an approximation not a precise value. This is then used in the second phase of the project to determine which constituencies hold which zipcodes/postcodes. (See below for second phase of the VBA - TSQL conversion.)

    3). The initaliation of the (x, y) parameters is purely used to reset the counter. This is just a delimiter, but I was glad logically speaking the script reset the parameters on every new region.


    Author Comment

    In the second part of the conversion, it takes the centriod value of the region (x, y) and matches to see where the zipcodes/postcodes can be found. If it cannot find a region in then moves onto the next region until a match is found. If a match is found it brings backs the constituency name and writes to a separate table 'WriteTable'.

    Please see below VBA found in the GeoSpatial6 module.

    This module consists of the following

    Sub postcodeLoop - here the postcode is tested to determine which region it belongs too. This is based on locating all surrounding regions within a 200 km radius.

    Function regionLoop - as the name suggests, this is a further loop to determine which constituency the region belongs too.

    Function pnpoly - is the second loop, kind of like.... a loop within a loop. Here it finally works out the brings backs the name of the constituency the postcode belongs too.

    Could I therefore ask for assistance to convert the remaining VBA scripts to T-SQL.

    Enclosed attachment of zipped MS Access 2007 database plus text files containing the VBA script and the T-SQL of the table objects. VBA-GeoSpatial6.txt

    Author Comment

    LVL 75

    Expert Comment

    by:Anthony Perkins
    >>1) I agree that by the spilting the data into subsets it becomes more manageable.<<
    I was not suggesting you split the data into subset, but rather your ultimate goal is to do the whole thing in a single SQL Statement rather than going RBAR (Row By Agonizing Row)

    2) It is not a question of aproximation, my results don't even come close to yours.  See for yourself:
    Your results:
    TableID      id      X-Centroid      Y-Centroid
    604      87      4455.21            5385.34
    605      88      30021.43            56343.65
    606      89      25132.76            50780.82
    607      90      1071.15            5273.96

    My output:
    TableID      id      X-Coordinate      Y-Coordinate
    1      87      224865.23      171055.68
    2      88      39790.30            18455.97
    3      89      192108.23      215826.23
    4      90      51576.90            41659.06

    3) >> I was glad logically speaking the script reset the parameters on every new region.<<
    That is just it, I am reseting them.  I was trying to keep my code as close as possible to yours.

    LVL 75

    Expert Comment

    by:Anthony Perkins
    >>Could I therefore ask for assistance to convert the remaining VBA scripts to T-SQL<<
    We need to get the first one resolved first.

    Author Comment

    Apologises for not being in contact, I was away on business !

    In any case, I was looking for approximation ! My main concern was to find a more productive way to query the paramters, rather than RBAR approach.

    Any other alternatives ?
    LVL 10

    Expert Comment

    This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    How your wiki can always stay up-to-date

    Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
    - Increase transparency
    - Onboard new hires faster
    - Access from mobile/offline

    A little background as to how I came to I design this code: Around 5 years ago I designed an add-in that formatted Excel files to a corporate standard, applying different cell colours and font type depending on whether the cells contained inputs,…
    This article will guide you to convert a grid from a picture into Excel format using Microsoft OneNote and no other 3rd party application.
    Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…
    This Micro Tutorial demonstrates how to create Excel charts: column, area, line, bar, and scatter charts. Formatting tips are provided as well.

    761 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    12 Experts available now in Live!

    Get 1:1 Help Now