• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 576
  • Last Modified:

Write Once to a DataRange instead of Looping

I have a spreadsheet (Table) with about 80,000 rows and 50 columns.  One of the requirements is that if data is pulled from the database, any empty cells should have the word "Null" written in them, and if a Date is less than XXX, that, again, the word "Null" be placed within the cell.  There are a couple of other items, but those two consume the most time within the formatting portion of the Data pull code.

Currently a For..If...Next is being used for each.  So the code has to cycle twice through all the cells. Since the ability to perform a write once, after pulling the data from the database is possible, is there a way to select a range from the Table, oTableName.DataBodyRange, and perform a write once, or some other method which limits the amount of looping?

I did try the:
oSQLDataTbl.DataBodyRange.Select
 With Selection
.SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "NULL"
.SpecialCells(xlCellTypeBlanks).HorizontalAlignment = xlCenter
End with

Open in new window

But that only works for about 10 out of the several hundred blank cells that I can see, why still remains a mystery.

The  If .Cells = "" or Date < XXX Then, doesn't work either, as Dates have proven to be problematic.  I need to check the format of cell, to see if has been formatted as a Date AND that the Date is < 01/01/2011. Trying to just use If Date is < 01/01/2011 doesn't work alone.  Excel converts the date to a numerical string, so a cell value less than 39000 will be overwritten with Null. Therefore, the cell needs checked first to see if it has been formatted as a date, then the date number of 01/01/2011 is less than anything this year.  So it needs a seperate IF..Then loop.

Given all of this, is there way to reduce the formatting portion of the code?
0
Cook09
Asked:
Cook09
  • 12
  • 12
  • 4
  • +1
1 Solution
 
NorieCommented:
Why are you looping twice?

What are you doing differently in each loop?

Can you post the code you are using?
0
 
SteveCommented:
Write a better method for outputting from Access...

If you do not want Nulls in your data, fix that in SQL rather than VBA:

eg... SELECT iif(isnull(Column1),"NULL",Column1) as Column1,

This will save a lot of time and effort.

So a good question has to be... how are you outputting to Excel?
0
 
NorieCommented:
The Barman

Where is Access mentioned in the post?

Did I miss something?
0
 [eBook] Windows Nano Server

Download this FREE eBook and learn all you need to get started with Windows Nano Server, including deployment options, remote management
and troubleshooting tips and tricks

 
SteveCommented:
Ah, I seem to have imagined that I read Access in there. Doh!
I still stand by formatting the data during extraction rather than in VBA afterwards.
0
 
andrewssd3Commented:
You really do need to post a sample, but a couple of things in the meantime:

- the code you post should work ok if the cells are actually blank, except that you need to move line 4 before 3: if you fill all the blank cells with the word 'NULL', they won't be blank when you come to try to format them.  

- if they don't get filled with 'NULL', maybe they aren't really blank as Excel sees it - cells that contain just space(s) or formulas that result in a blank string are not really blank.

- the best way to deal with large ranges is by reading into a variant array in one go using the Value property - see this code for an example.  What you can't do in this way is apply formatting - it only works with formulas or values.

Option Explicit

Sub Sample()

Dim oSQLDataTbl As ListObject
Dim aData As Variant
Dim r As Long
Dim c As Long

Set oSQLDataTbl = ActiveSheet.ListObjects(1)

'oSQLDataTbl.DataBodyRange.Select
' With Selection
'.SpecialCells(xlCellTypeBlanks).HorizontalAlignment = xlCenter
'.SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "NULL"
'End With

aData = oSQLDataTbl.DataBodyRange.Value

For r = 1 To UBound(aData, 1)
    For c = 1 To UBound(aData, 2)
        If Len(Trim$(aData(r, c))) = 0 Then
            ' the cell is blank
            aData(r, c) = "Null"
        Else
            If IsDate(aData(r, c)) Then
                If aData(r, c) < DateSerial(2011, 1, 1) Then
                    aData(r, c) = "Null"
                End If
            End If
            
        End If
    Next c
Next r

' write it back
oSQLDataTbl.DataBodyRange.Value = aData

End Sub

Open in new window

0
 
Cook09Author Commented:
I like about being about to write back in one shot (adata), however, The Barman, if I did that in a query with 52 columns, would that have to be written for each one and what would it do to the extract performance?
Cook
0
 
SteveCommented:
You would need to mod each column in the extract (if they all have this issue). The loss in extract time will be minimal compared to the equivalent time in VBA. Good data out of the database is always best imo and SQL is not that slow.
0
 
andrewssd3Commented:
I'm still confused about what you want: does it involve a SQL query, and if so where? I agree with the Barman that IF you have a query you may as well try to return the correct data - that's nearly always more efficient,  but it's difficult to suggest a solution when we don't really know the problem.
0
 
SteveCommented:
As the table is called oSQLDataTbl I think it safe to assume that this is an SQL source.
So even if the table is extracted in full that would be a SELECT *
There may not be a complex SQL Query currently, but it is likely there somewhere.
0
 
andrewssd3Commented:
So if you modify your query so the exact data you want is returned, that just leaves the question of the formatting, and you could use conditional formatting to the whole table range to achieve that to save having to search through all cells.
0
 
Cook09Author Commented:
For the most part, the Excel Code that I am using now is below, as you can see, some of it is very familiar to what was suggested above. Additionally, I have to perform some other housekeeping or formatting items.  Below the SQL code is an example of one of these. In excecuting it, to retrieve up to 10,000 rows ("Select (Top 1000)* from Database") , the VBA code was fine, everything came back as it should and was formatted properly.  

As I ran the entire query: "Select * from Database", it took about 30 minutes to return 82,000 rows with 52 columns. The 82,000 rows was downloaded, but it seems, as it began the Read and Write Once portion, which is the next step, I got a VBA 7 Memory Error.  The spreadsheet looks like the raw data placed within their cells.  No formatting was applied.

Does the Download Data stay within Memory and if so, how could that be cleared out?  I could probably use the:
    SELECT iif(isnull(Column1),"NULL",Column1) as Column1
but I would have to write that 52 times for each of the columns, and if the number of columns increases, which is likely, then some maintenance needs to be performed.  Yesterday, I ran a similar piece of code, without the read and write once, and queried it on 50,000 lines and it worked fine, except for the issues that I wrote about. Plus, on the 10,000 rows performed today, the perfomance was significantly improved, with the new code. Could I be passing a certain memory threshold?  

Would welcome your input...

Cook


Dim oSQLDataTbl As ListObject
Dim con As ADODB.Connection
Dim cmd As ADODB.Command   
Dim rst As ADODB.Recordset, sConnect As String, query As String
Dim aData As Variant, r As Long, c As Long

    Set con = CreateObject("ADODB.Connection")
    Set cmd = CreateObject("ADODB.Command")

    Set wks = Workbooks("Reporting.xlsm").Worksheets("Exceptions")
    sConnect = "Connecton String"
     con.Open sConnect
    query = "SELECT * from RAW_DATA"

    Set rst = New ADODB.Recordset
    With rst
          .ActiveConnection = con 
          .Open query 
 
      With Sheets("Exceptions")
           For Column = 0 To rst.Fields.Count - 1
         .Cells(1, Column + 1).Value = rst.Fields(Column).Name
      Next
      .Range(.Cells(1, 1), .Cells(1, rst.Fields.Count)).Font.Bold = True
     .Cells(2, 1).CopyFromRecordset rst
    End With

    .Close
  End With

    Set rst = Nothing
    Set con = Nothing
    Set cmd = Nothing

   Set oSQLDataTbl = wks.ListObjects("Table__RAW_DATA1")
   aData = oSQLDataTbl.DataBodyRange.Value
For r = 1 To UBound(aData, 1)
    For c = 1 To UBound(aData, 2)
        If Len(Trim$(aData(r, c))) = 0 Then
            ' the cell is blank
            aData(r, c) = "Null"
        Else
            If IsDate(aData(r, c)) Then
                If aData(r, c) < DateSerial(2011, 1, 1) Then
                    aData(r, c) = "Null"
                End If
            End If
            
        End If
    Next c
Next r

' write it back
oSQLDataTbl.DataBodyRange.Value = aData

Open in new window


    Range("Z:Z,AD:AD,AH:AM,AV:AV,BA:BA,BC:BC,BF:BF").Select
      With Selection
      .NumberFormat = "[$-409]m/d/yy h:mm AM/PM;@"
      .HorizontalAlignment = xlCenter
      .EntireColumn.AutoFit
     End With

Open in new window

0
 
Cook09Author Commented:
As an update, I just tried to run it with the same 10000 rows that I did this morning, and received the same Memory Error message.  I need to clear cache, virtual, or could this be programs related?

Cook
0
 
SteveCommented:
I would use MS Query to extract into Excel.

From Data > Other Data Sources > Microsoft Query.

This will run far far faster than 30 min for 82k records (I get that in less than 30 sec)
From this you can record yourself setting the Query and then use VBA to change the SQL.

I would untick "use the query wizard to create / edit query"
And turn off automatic refresh when in MS Query its self.

Give it a go, I would expect you to be able to improve on 30 minutes considerably.
0
 
SteveCommented:
Also when building a query, I use the following method for building SQL Syntax

query = "SELECT "
query = query & "Field1 as name1, "
query = query & "Field2 as name2, "
query = query & "Field3, "
query = query & "Field4, "
query = query & "Fieldn"
query = query & vbcrlf
query = query & "FROM RAW_DATA"
'query = query & vbcrlf 'use if doing a Where
'query = query & "WHERE" ' where statement if needed

Open in new window


This will allow you to handle and follow easily what you are doing with each Field in the SQL for additions and formula such as null handling.
0
 
Cook09Author Commented:
The issue seems to be with my system.  After downloading the data, I stopped the process, closed down Excel, ran the process through the read and write once and the aData went out of memory.  I'm a a little confused as to my system and why 82,000 / 52 columns, would cause me to go out of memory.
0
 
andrewssd3Commented:
One possible problem is that your Table__RAW_DATA1 listobject is not tied to the query results by the way you retrieve the data, so it's possible it is much larger than the result set you are returning.  When you use MS Query as the barman suggested, excel will create a table for you, but that does not happen with native adodb queries, so the listobject will not have adjusted itself to the number of rows and columns returned.  On my system your code only takes a few seconds with 85000 records.

Try typing
?ActiveSheet.ListObjects("Table__RAW_DATA1").databodyrange.rows.count

Open in new window

into the Immediate window and see what you get - it may be lots more than 85000, which would explain your problem.  If not try ...columns.count as well.  Try returning the listobject to a normal range.  Then instead of
oSQLDataTbl.DataBodyRange.Value

Open in new window

, you could use
activesheet.cells(1).currentregion.value

Open in new window

to get the range you have returned.
0
 
Cook09Author Commented:
I did both a Total Rows at 82195 and Total Columns at 62, which is right, it goes to "BJ."  The reason mine takes so long is that I may not be the only one accessing the data, and I'm on a corporate LAN, and even in SSMS is takes about that long.

The_Barman:  I'm assuming that where you have listed the word name1, name2, etc, that you are referring to manually inputting what the column name actually is, that is being returned.  Is that correct?  Since this needs to be dynamic, is there a way of just using Column1, Column2, etc.  And then possibly having the query assign the column Headers after the download.

andrewssd3: How could I use conditional formatting to achieve some of the results that you have seen that I need?  

Is there a way to use the Trim with SpecialCellsBlank, so that all of the Blank Cells are now truly blank and then that one line of code can be run?  Or, how could the r,c code be modified to work as a loop and write to each cell.

At the moment, I have the raw data returned in a spreadsheet.  So, I can focus just on the inserting and formatting portion, without having to spend the time to perform a download each time, since we have narrowed the issue to after the data has been returned.
0
 
SteveCommented:
The query syntax of Field1 as name1, refers to Field as the original name from the datatable being renamed to Name as part of the SQL.
As you seem quite patient then VBA time formatting after extraction is going to be a fraction of 30 min... though I would focus on a faster process.
So just cycling through each cell will probably do you.
I am certain Andrew can facilitate you on this.
0
 
Cook09Author Commented:
The_Barman:
The initial aspect of this thread was to try and find a faster method, yet keep it simple enough so there is not a lot of maintenance that needs to done if a change is made.  Right now a Select * from ,is pretty simple, but there is a certain reluctance to do more as those using queries don't really know Excel (they use SSMS), so it's a little outside of their comfort zone.  I'm familiar with Excel, but a little weak in queries, but willing to learn if the process can be improved.  Especially if we have 300,000 rows and not 80,000.

I'm patient, in trying to find the best method. Andrews original code cut the formatting down sigificanatly, but after about 40,000-50,000 row, errors developed.  What now is the next best alternative, given my current knowledge base.

From what I read in your posts, is where to put the:
SELECT iif(isnull(Column1),"NULL",Column1) as Column1
after all of the Select fieldname statements.  Would that be on your line 9?
Thanks...
0
 
SteveCommented:
OK, first have you tried MS Query to connect to the data source?
It is great at getting data into excel, and can be recorded by the VBA editor too.
So life gets very simple when you start with MS Query, you will ofc move on from that as you get more proficient. But it is a good start.

As for the query, with 56 columns it will be fun to write the SQL.
Again - Using MS query will help as it will build the SQL with all the field names that can be "grabbed" from the Connection String in Excel.

To test your data take the column with the nulls in (or one of them) and..
swap the '*' for Fieldname and see that it returns all the values.
Then test the iif(isnull(Fieldname),"NULL",Fieldname) as Fieldname this may not work depending on the datasource. or Trim(Fieldname) as this may help too.
0
 
Cook09Author Commented:
Okay, I'll give it a go...
I attempted to use the Query Editor (Data - Existing Connections), Definition, then changed the Command Type from Table to SQL and revised the Command text to:
 Select Fieldname From  
    Server.Database.Table.etc.

It came back with a "Invalid column name 'Fieldname."
0
 
SteveCommented:
You will need to replace Fieldname with the actual name of a field from the table.
If you post the first row (Header row) in an excel file, can create the select query for you.
But using MS Query is far far simpler.
0
 
Cook09Author Commented:
I've attached a worksheet with a few of the headings, once I see how you've done these, I can do the rest.

For some reason, I'm having some difficulty with MS Query, not sure exactly what the issue is, but now that the code is finished, it can be investigated more tomorrow.

Thanks,
Cook
Headings.xlsx
0
 
Cook09Author Commented:
After I sent that, it occured to me that you may have written a macro that would take the headings and generate a query, and save a lot of time.  If not then don't worry about the rest.
Headings1.xlsx
0
 
SteveCommented:
OK, try (as a start) replacing the
query = "SELECT * from RAW_DATA"

Open in new window

with this built 'select query'...
query = "SELECT USI, "
query = query & "LO_COND_ID, "
query = query & "INSTANCE_ID, "
query = query & "SEGMENT, "
query = query & "SAMPLEDEXCLUSION, "
query = query & "PRTFL, "
query = query & "PWKGP, "
query = query & "SWKGP, "
query = query & "SWKGP_OVRD_FLG, "
query = query & "SWKGP_ASSGN, "
query = query & "SWKGP_RESP_NM, "
query = query & "SWKGP_CMNT, "
query = query & "INV, "
query = query & "COND_ID, "
query = query & "COND_TXT, "
query = query & "RULE_ID, "
query = query & "RULE_TXT, "
query = query & "MSP_CLNT_NUM, "
query = query & "SUPREME, "
query = query & "LOAN_NUM, "
query = query & "ST, "
query = query & "RVW_TYPE, "
query = query & "CURR_STG, "
query = query & "CMNT, "
query = query & "GUID_QUES_RESP, "
query = query & "MAX_CMNT_DTTM, "
query = query & "MAX_RVWR_ROLE, "
query = query & "MAX_CMNT, "
query = query & "RVWR_NM, "
query = query & "FILE_DT, "
query = query & "DAYS_IN_PROCESS, "
query = query & "DAYS_REMAINING_IN_SLA, "
query = query & "EXCEEDED_SLA, "
query = query & "INITRVW_STRT_TM, "
query = query & "INITRVW_FNSH_TM, "
query = query & "FRSTRVW_STRT_TM, "
query = query & "FRSTRVW_FNSH_TM, "
query = query & "SECRVW_STRT_TM, "
query = query & "SECRVW_FNSH_TM, "
query = query & "INIT_AGREE_PRI, "
query = query & "FRST_AGREE_PRI, "
query = query & "SEC_AGREE_PRI, "
query = query & "AGREEWITHFINDINGS, "
query = query & "RESP_CMNT, "
query = query & "AGREE_WITH_HARM, "
query = query & "HARM_RESP_CMNT, "
query = query & "REM_AMT, "
query = query & "OCC_CAT, "
query = query & "PFG_HARM, "
query = query & "PFG_MONETARY_ERR, "
query = query & "SNT_STRT_TM, "
query = query & "STAT, "
query = query & "CLRNC_DT, "
query = query & "CLRNC_CMNT, "
query = query & "WV_DT, "
query = query & "WV_CMNT, "
query = query & "TERM, "
query = query & "TERMDATE, "
query = query & "DEF_TYPE, "
query = query & "SCH_SALE_DT, "
query = query & "HOLD_FLG, "
query = query & "HOLD_RSN, "
query = query & vbcrlf
query = query & "FROM RAW_DATA"

Open in new window

0
 
Cook09Author Commented:
The_Barman,
Sorry for the delay, but something came up that needed some immediate attention, and I wanted to connect through MS Query, which I did.  The code worked well, the only issue was removing the last " ," after HOLD_RSN.  Once I did that, then it went very smooth.

However, when I try to change one of the columns to:
Query = Query & "iif(isnull(SWKGP),"NULL",SWKGP) As SWKGP, "
I get all kinds of errors, depending upon what I try, from, to many arguments, to syntax errors around a ",", to SQL erroring that it doesn't accept IIf as a builtin function.  I may try COAlESCE or maybe Replace.  Do you have any ideas?
0
 
SteveCommented:
depending upon your data source the language can change slightly.
For MS Access IIF can be used, for SQL Server COALESCE or Nz.
so with IIF not working, try query = query & "Nz (SWKGP,"NULL") as SWKGP, "
It can get a little bit "trial and error" but is all good experience and will enhance your understanding and abillities for the future.
0
 
Cook09Author Commented:
The nz doesn't work and it also doean't like the "Null", it seems SQL prefers ' Null ',   I did try a Case statement and it seemed to work.

query = query & "CASE SWKGP WHEN ' ' THEN ' NULL ' ELSE SWKGP END AS SWKGP, "
I didn't try COALESCE, but I may need to look into it as another solution.

One final question, with there being so many of these, is there a way that something like this could be set up as a template, with the Column Names stored in Name Manager?

Example:
 sQuery = sQuery & "CASE COLUMN1 WHEN ' ' THEN 'strADD' ELSE COLUMN1 END AS COLUMN1, "

Then in Name Manager COLUMN1 = SWKGP & strADD = ' Null ' ?  Is there a downside to doing this?
0
 
SteveCommented:
I thend to transpose the fields into an excel sheet and then work with them in the sheet.
This is why I tend to "build" the statement the way I have shown.
I do not use name manager, but I rarely have more than 15 columns.
I would suppose whatever you are comfortable with is best.
But having all the column names in place of * is considered better.
0
 
Cook09Author Commented:
Thanks for your help and sorry it took so long, but I did learn a lot through it.
0

Featured Post

Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

  • 12
  • 12
  • 4
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now