?
Solved

Select a specific Excel file to import into SQL table in SSIS, HOW?!!!

Posted on 2011-10-21
37
Medium Priority
?
1,245 Views
Last Modified: 2013-11-26
Hello,

What I'm trying to do is insert data into a SQL table from an excel spreadsheet. The challenge is being able to grab the most recent excel spreadsheet as the data source.

The file has a naming convention similar to:
File(1).xls, File(2).xls, File(3), File(4) etc.,

Can someone tell me how I can grab the latest file? The issue is the person who is placing the file in the directory only increments the file each day, so if today was Monday, the file will be named File(1).xls, and Tuesday would be File(2).xls. To make things more complicated, we cannot used "Date modified" becasue the file is swept up by a an auto-sync program which makes all of the dates the same because it all gets downloaded and synced at the same time.

Also, the data has been placed on its own individual cell in the excel spreadsheet.

So the logic would be grabbing the file with the greatest number in the parenthesis and then pulling the data off the spreadsheet from specific cells.

Right now, the data is being inserted correctly if I preselect an excel spreadsheet, however, not pulling the most recent file in the folder.

The data is being inserted using Excel Source --> OLE DB Destination. I'm using SQL command and doing a select * from [SHEET1$A2:A2] as name, [SHEET1$D3:D3] as Address, [SHEET1$F5:F5] as Invoice  

(Yes the data I need are all on separate lines).

Does anyone have any idea how to accomplish this?  Thanks so much in advance!
0
Comment
Question by:TeknikDev
  • 19
  • 14
34 Comments
 
LVL 14

Accepted Solution

by:
sentner earned 1000 total points
ID: 37006056
Here is a batch file script that can copy the file with the highest number in the (), and copy it to a fixed name, that you can then use to import.  

@echo off
set /a _biggest=0
Setlocal EnableDelayedExpansion
 
for /f "tokens=2delims=()" %%i in ( 'dir /b "file(*).xls"' ) do (
   if %%i GTR !_biggest! (set /a _biggest=%%i)
)
echo "Latest: file(%_biggest%).xls"
copy "file(%_biggest%).xls" masterfile.xls

Open in new window

0
 

Author Comment

by:TeknikDev
ID: 37006074
So I assume I drag a "Script Task" into the Control flow prior to the Import so it can grab the largest file?
0
 
LVL 14

Expert Comment

by:sentner
ID: 37006362
From looking at the SSIS documentation, it looks like you can set up an "Execute Process" task.  This can be used to execute a batch file.  You should of course modify the code to first change to the proper directory where the files are stored.  

http://msdn.microsoft.com/en-us/library/ms141166.aspx
http://www.sqlmonster.com/Uwe/Forum.aspx/sql-server-ssis/1725/run-bat-file-from-SSIS

0
What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

 

Author Comment

by:TeknikDev
ID: 37006781
Thank you sentner, do you think of another solution without having to use the batch? I rather keep everything inside SSIS. Only because we are doing something simliar for 3 other jobs but its using a flat file text file which is obviously easier.
0
 
LVL 14

Expert Comment

by:sentner
ID: 37006947
Yes, you can probably do the whole thing within SSIS as a script task, by essentially porting the batch script to that language.  If you do that you probably don't need to copy the latest file, but rather just find the name and pass it back to the task that does the import.

There are several threads on MSDN doing a similar thing.  See:  
http://social.msdn.microsoft.com/Forums/en/sqlintegrationservices/thread/9ec4e837-6188-4eb2-aac0-f5f38a89bc91
0
 

Author Comment

by:TeknikDev
ID: 37006981
Hmm....i need to find the code that will find the files and grab the largest number in parenthesis. Then take the values in specific cells and import in SQL.
0
 
LVL 14

Expert Comment

by:sentner
ID: 37007056
Yep, don't know that you can just "find" the code existing.  Your best bet is to find something "similar" and modify it.  I'd suggest looping through all the files in the directory, using a regular expression to match the pattern and pull out the number enclosed within parens (regex search format would be something like ".*\(([0-9]+)\).xls"), and compare against the biggest one found so far like I did in the batch file.  You then reconstruct the filename from that, or you could save that name as a separate variable inside the loop.

0
 

Author Comment

by:TeknikDev
ID: 37007222
thats the thing, im new to this., i wouldnt know how to start coding.
0
 

Author Comment

by:TeknikDev
ID: 37007917
Looks like I might be able to get away with using the last modified date as the critieria. If ideally only 1 .xls file gets put in the directory. I can try to pick up the newest file as long as the modified date is today's date.

Can someone pleaseeee provide help!
0
 
LVL 14

Expert Comment

by:sentner
ID: 37008444
It can still be done using a script task.  As I said, I'm not familiar with SSIS, though I have found several threads on this type of thing out there from googling.  

This link has some sample code that can be tweaked... I'll take a blind stab at doing some of that in the code box.  I'm sure there are plenty of bugs in there, since I have nothing to test on, but it should be a place to start.  I'm not sure how you get the "useFiles" variable into something useful to SSIS, but that's where the filename to use should end up.

Public Class ScriptMain    
  Inherits UserComponent     
  Public Overrides Sub FindNewestFile()        
    Dim fileName As String         
    Dim useFile As String = String.Empty
    Dim number As Integer
    Dim biggest As Integer = 0
    Dim pattern As String = String.Empty
    Dim r As Regex = Nothing
    pattern = “.*\(([0-9])\).*”
    r = New Regex(pattern, RegexOptions.Compiled)

    For Each fileName In Directory.GetFiles(Me.Connections.ImportFilesDir.AcquireConnection(Nothing).ToString())            
      number = Integer.parse(Regex.Replace(fileName, pattern, “$1”))
      if number > biggest
        biggest = number
        useFile = fileName
      End If
    Next         
    
  End Sub 
End Class

Open in new window

0
 

Author Comment

by:TeknikDev
ID: 37008960
blah, where does the inherit goes? it doesnt like it when i copy and paste, could I just not use a function and put the code in the main section?
0
 
LVL 14

Expert Comment

by:sentner
ID: 37009324
I suppose so.. As I said, I'm not familiar with SSIS specifically, but I am versed in other scripting and programming languages.  I'm just trying to help steer you in the right direction.
0
 

Author Comment

by:TeknikDev
ID: 37017323
Thank you!

Can someone help me write a vb script to find the largest of number in parenthesis and the modified date equals today and the select a few cells in Ms excel spreadsheet and enter into SQL table? I need help ASAP! My boss needs this done by end of the week!
0
 
LVL 21

Expert Comment

by:Jason Yousef, MS
ID: 37021561
Hi,
You can use that as a reference to get the last creation date of the file and import only the daily file

http://asqlb.blogspot.com/2011/08/ssis-process-files-that-has-been.html

Share  the excel file and let me know which cells you need to import
0
 

Author Comment

by:TeknikDev
ID: 37033903
Thanks, but I manage to complete this process with the help of batch files!  

Huslayer, thanks for that, it looked like it would have helped, but I found an alternative solution.

0
 

Author Comment

by:TeknikDev
ID: 37033912
This was the answer to selecting the largest value in the parenthesis for excel file name.
0
 

Author Comment

by:TeknikDev
ID: 37045215
So i ran a test today and the import and everything worked fine up until today. For some reason, I think its once it hits the number 10 or double digits in the parenthesis.

For file 1-9, it worked file until the 10th. Any idea how I can expand the check to grab the largest to 6 or 7 decimal places instead of 1?
0
 
LVL 14

Expert Comment

by:sentner
ID: 37045376
Are you using the VB script, or the batch file?   I had tested the batch file with double-digits to ensure it worked before I pasted it.  Can you paste the script you put into place, so I can see where it may be failing?
0
 

Author Comment

by:TeknikDev
ID: 37045396
@echo off
set /a _biggest=1
Setlocal EnableDelayedExpansion
 
for /f "tokens=2delims=()" %%i in ( 'dir /b "TEST1 (*).xls"' ) do (
   if %%i GTR !_biggest! (set /a _biggest=%%i)
)

echo "Latest: TEST1 (%_biggest%).xls"

echo copy /Y "E:\File\TEST1 (%_biggest%).xls"

copy "E:\File\TEST1 (%_biggest%).xls"   "E:\File Main (0).xls"
 
0
 

Author Comment

by:TeknikDev
ID: 37045397
batch file
0
 

Author Comment

by:TeknikDev
ID: 37045461
I thought of another thing, the batch file is not in the same directory. It cannot be because it syncs with the server. Not sure if thats an issue.
0
 
LVL 14

Expert Comment

by:sentner
ID: 37045477
Just to be clear, you have "TEST1" listed as the filename, so the next files are not "TEST2", "TEST3", etc, right?  It has to have the number within the parentheses.

I just tested again, creating files name "TEST1 (1).xls" through "TEST1 (13).xls" and each time the script picks the biggest number.  I even then made a couple that were several digits long, and it still worked as expected.

Can you try removing the "@echo off" line, and run it, so we can see what it's doing?

0
 

Author Comment

by:TeknikDev
ID: 37045493
ok something really funny occured. so i ran the batch directly and it copied the largest file over, but the ssis package does not and copies the wrong file. WHy is that?
0
 
LVL 14

Expert Comment

by:sentner
ID: 37045496
It shouldn't matter whether it's in the same directory, but you should add a line before the "for" loop, that changes to the correct directory where the files are located.  Just a line that reads something like "cd c:\datadir"
0
 
LVL 14

Expert Comment

by:sentner
ID: 37045507
Could the SSIS package be running the script from the wrong directory, and thus grabbing a file from whatever directory it runs it from?

Add that "CD" line to the script, and try again.
0
 

Author Comment

by:TeknikDev
ID: 37045508
hmmmmmm....in ssis for the "execute process task" I set the working directory to the path which contains all the files to compare and it works.

strange...................................ill contineu to test.
0
 
LVL 14

Expert Comment

by:sentner
ID: 37045518
Yes, if you set the working directory from SSIS it should work as well.  The working directory is the place where it will run the script from.  The other way to do it is to set the directory within the script itself (which is what I was suggesting).
0
 

Author Comment

by:TeknikDev
ID: 37045605
wait, the workign directory has been set to where the files are located where it does the compare not where the batch file is. Should it be where the batch file should be?
0
 
LVL 14

Expert Comment

by:sentner
ID: 37045616
No, the working directory should be where the files are.  But if you add the "CD" line to the batch file, you won't have to worry about where the working directory is set.
0
 

Author Comment

by:TeknikDev
ID: 37045626
can u provide the syntax and where it should go?
0
 
LVL 14

Expert Comment

by:sentner
ID: 37045704
Sure.  replace "C:\datadir" in the code below with the actual location where the excel files are stored.

the "/d" flag to "cd" tells it to also change the to that drive if necessary.

@echo off

cd /d "C:\temp"

set /a _biggest=1
Setlocal EnableDelayedExpansion
 
for /f "tokens=2delims=()" %%i in ( 'dir /b "TEST1 (*).xls"' ) do (
   if %%i GTR !_biggest! (set /a _biggest=%%i)
)

echo "Latest: TEST1 (%_biggest%).xls"

echo copy /Y "E:\File\TEST1 (%_biggest%).xls" 

copy "E:\File\TEST1 (%_biggest%).xls"   "E:\File Main (0).xls"

Open in new window

0
 

Author Comment

by:TeknikDev
ID: 37045706
Ok I added the CD before the FOR loop and added the working directory path and the issue is fixed.

Seems like the batch file still couldnt find the right file until I put CD before the FOR loop and SSIS couldnt execute the batch file correctly until the working directory was added.

Thanks for your help!
0
 
LVL 14

Expert Comment

by:sentner
ID: 37045708
Sorry... I called it "C:\temp" in that script instead of "C:\datadir", so THAT is what you should replace with the actual location of the excel files.
0
 

Author Comment

by:TeknikDev
ID: 37045719
sorry my latest comments were prior to you posting your code with the CD stuff. But either way, it works. I am too lazy to proceed to test without clearing out the working directory in SSIS. But as long as it works, im good!

Thanks!
0

Featured Post

Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Creating a Cordova application which allow user to save to/load from his Dropbox account the application database.
This shares a stored procedure to retrieve permissions for a given user on the current database or across all databases on a server.
Via a live example, show how to setup several different housekeeping processes for a SQL Server.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.
Suggested Courses

864 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question