• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 488
  • Last Modified:

Revised (Can my code run as a .vbs file? I need to run DBEngine.CompactDatabase, after each file it imports)

I have some import VBA code to import arround 3,500 csv files in to an Access database and that works fine. I'm sourcing 18G of log data. However Access has a 2GB limit and I don't need all the duplicates that are imported. Filtering out many of the columns helps. But in the end instead of the 60K records I imported I only need to see the unique records that will sure to be only around 400 or so.

I want to filter out duplicates for a record and not import them.
For example if I am importing 5 columns I want to identify a duplicate when all five columns have the same data as any previous imported record compared the next row to import. And if there is a duplicate, skip it and move to the next line in the csv file.

so:
1|1|1|1|1 - Imports
1|1|1|1|1 - Dose not import, duplicate across all values for record
1|2|1|1|1 - Imports
0
Dooglave
Asked:
Dooglave
  • 8
1 Solution
 
DooglaveAuthor Commented:
This is my import Code:

Sub DoImport()

    Dim fso As Object
    Dim fld As Object
    Dim fil As Object
    Dim FldPath As String
   
    Const DestTable As String = "FABLogs"
   

    FldPath = "C:\Syslogs\"
   
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set fld = fso.GetFolder(FldPath)
   
    With DoCmd
        .SetWarnings True
        For Each fil In fld.Files
            If UCase(Right(fil.Name, 3)) = "CSV" Then
           ' MsgBox "fil.Name is " & fil.Name
           ' MsgBox "fil.path is " & fil.path
                .TransferText acImportDelim, "FABImportLogs", DestTable, fil.path, False
            End If
        Next
        .SetWarnings True
    End With

    Set fil = Nothing
    Set fld = Nothing
    Set fso = Nothing
   
    MsgBox "done"

End Sub

Open in new window

0
 
DooglaveAuthor Commented:
Well I think I just did it.  Thinking, make this a function and call it in my other loop between imports.

Public Sub subDeleteLogDups()

   Dim strSQL As String
   Dim objLogRS As DAO.Recordset2
   Dim strPrevRuleName, strPrevAppID, strPrevPort, strPrevProtocol, srtPrevAction

   'Construct SQL statement
   strSQL = "SELECT * FROM logs ORDER BY RuleName, AppID, Port, Protocol, Action"

   'Instantiate objLogRS
   Set objLogRS = CurrentDb.OpenRecordset(strSQL, dbOpenDynaset, dbSeeChanges)

   'First Record
   objLogRS.MoveFirst

   'Loop through Logs
   Do While objLogRS.EOF = False
        If (objLogRS.Fields("RuleName") = strPrevRuleName) And (objLogRS.Fields("AppID") = strPrevAppID) And (objLogRS.Fields("Port") = strPrevPort) And (objLogRS.Fields("Protocol") = strPrevProtocol) And (objLogRS.Fields("Action") = strPrevAction) Then

         'Same Log as Prev Record... delete the record
     objLogRS.Delete

      Else

         'New Log... store new values for Previous Row
         strPrevRuleName = objLogRS.Fields("RuleName")
         strPrevAppID = objLogRS.Fields("AppID")
         strPrevPort = objLogRS.Fields("Port")
         strPrevProtocol = objLogRS.Fields("Protocol")
         strPrevAction = objLogRS.Fields("Action")
         

      End If
      
      'Next Record
      objLogRS.MoveNext

   Loop
   
   'Close RS
   objLogRS.Close

   'Clean Up
   Set objLogRS = Nothing

End Sub

Open in new window

0
 
DooglaveAuthor Commented:
Not sure I should modify the registry, seems like I'm just doing something wrong.

hmm getting error when I call the sub with line:
Call subDeleteLogDups()
 
at this line: objLogRS.Delete

Run-time '3052':
FileSharing lock count exceeded. Increase MaxLocksPerFile Registry entry.


    With DoCmd
        .SetWarnings True
        For Each fil In fld.Files
            If UCase(Right(fil.Name, 3)) = "CSV" Then
           ' MsgBox "fil.Name is " & fil.Name
            MsgBox "fil.path is " & fil.path
                .TransferText acImportDelim, "AppIDOnly", DestTable, fil.path, False
            End If
        Call DeleteLogDups
        Next
        .SetWarnings True
    End With

Open in new window

0
What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

 
DooglaveAuthor Commented:
Same thing when I put it all in one Sub, same error, same line:

Sub DoImport()

    Dim fso As Object
    Dim fld As Object
    Dim fil As Object
    Dim FldPath As String
    Dim strSQL As String
    Dim objLogRS As DAO.Recordset2
    Dim strPrevRuleName, strPrevAppID, strPrevPort, strPrevProtocol, srtPrevAction
    
   
    Const DestTable As String = "FABLogs"
   

    FldPath = "C:\Syslog\"
   
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set fld = fso.GetFolder(FldPath)
   
    With DoCmd
        .SetWarnings True
        For Each fil In fld.Files
            If UCase(Right(fil.Name, 3)) = "CSV" Then
           ' MsgBox "fil.Name is " & fil.Name
            MsgBox "fil.path is " & fil.path
                .TransferText acImportDelim, "AppIDOnly", DestTable, fil.path, False
            End If
            
            strSQL = "SELECT * FROM FABLogs ORDER BY RuleName, AppID, Port, Protocol, Action"
            Set objLogRS = CurrentDb.OpenRecordset(strSQL, dbOpenDynaset, dbSeeChanges)
            objLogRS.MoveFirst

              Do While objLogRS.EOF = False
                If (objLogRS.Fields("RuleName") = strPrevRuleName) And (objLogRS.Fields("AppID") = strPrevAppID) And (objLogRS.Fields("Port") = strPrevPort) And (objLogRS.Fields("Protocol") = strPrevProtocol) And (objLogRS.Fields("Action") = strPrevAction) Then
                  objLogRS.Delete
                Else
                  strPrevRuleName = objLogRS.Fields("RuleName")
                  strPrevAppID = objLogRS.Fields("AppID")
                  strPrevPort = objLogRS.Fields("Port")
                  strPrevProtocol = objLogRS.Fields("Protocol")
                  strPrevAction = objLogRS.Fields("Action")
                End If
                
               objLogRS.MoveNext
              Loop
   
            objLogRS.Close
            Set objLogRS = Nothing
    Next
        .SetWarnings True
    End With

    Set fil = Nothing
    Set fld = Nothing
    Set fso = Nothing
   
    MsgBox "done"

End Sub

Open in new window

0
 
DooglaveAuthor Commented:
Well I set:
DAO.DBEngine.SetOption dbmaxlocksperfile, 25000

Deleted the Table, Repaired the Database
Closed Access and re-opened.

I get through three files now it seems before running into the error.

I need a way to clear the locks if I'm going to get through 3500 files.
0
 
DooglaveAuthor Commented:
Set to 250000 and now, it's running.  Lets see how far it goes :)
0
 
DooglaveAuthor Commented:
I'm starting to hate MS Access.   File grows to 2G, I repair and compact, goes back down to 1.5mb.  Run another batch, over and over and over again.

Is there any better open source databases I can use.  Like MySQL Lite, can I script with that like this?
0
 
DooglaveAuthor Commented:
I wonder if I can make my code run as a .vbs file. Then I could do a CompactDatabase command in the loop..... maybe
0
 
PatHartmanCommented:
There is a no code solution:
If you set a unique index on the fields that make the record unique (Access supports up to 10), the TransferText Method will import the unique rows and discard the duplicates.

To create a multi-column index.
1. Open the table in design view.
2. Open the indexes dialog
3. In the first completely empty row.
    3a. Add an index name.
    3b. Select the first column
    3c. Set the unique property to yes
4. In the second and subsequent rows.
    4a. Skip the index name
    4b. Select the next column for the index.

Leaving the index name blank tells Access that all the columns belong to a single index so they are used together.  That means the combination of a+b+c must be unique rather than the individual values of a, b, c.

Another method is to use a query.  This works even if you have more than ten columns.  Link to the file instead of appending it.

Make a totals query.  This will automatically add Group By to the Total row.  For the columns you are not grouping by, change the Total value to First, Min, etc., whatever makes sense.  Usually first would be best.  That way all the values will come from the same record.

The final step is to change the Totals query into an append query.
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

  • 8
Tackle projects and never again get stuck behind a technical roadblock.
Join Now