Importing files in a directory and subdirectories

Hello guys,

I'm trying to import a bunch of HTML files that reside in a directory and its subdirectories into a Notes database.
The reason is to make the former HTML documents easy to search with Domino's full-text search engine, and to automatically archive documents that expire depending on their creation date.

The problem I am facing is with the subdirectories. I can't get LotusScript to recursively read subdirectories. Apparently Dir and Dir$ always return regular files and not directories, no matter what I do.

Here's the code I use to perform the import of files (I've cut out the actual parsing of the files and replaced it with a sort of log command):

Function importHTMLFiles(dbWebImport As NotesDatabase, strHTMLPath As String, strLogTitle As String) As Integer
      REM ///////////////////////////////////////////////////////////////////////////////////////////////////////
      REM This function will recursively walk through all files in the passed path and import them in the passed database
      REM ///////////////////////////////////////////////////////////////////////////////////////////////////////
      
      REM =====================================================================================
      REM Initialize local error routine
      REM =====================================================================================
      importHTMLFiles = False
      On Error Goto ErrorHandler
      
      REM =====================================================================================
      REM Initialize local variables
      REM =====================================================================================
      Dim subdir As String
      Dim filename As String
      
      REM =====================================================================================
      REM Recursively walk through the passed HTML path
      REM =====================================================================================
      'process files in the directory
      Call LogAction(strLogTitle, "processing files in " & strHTMLPath)
      filename = Dir(strHTMLPath & "\*.*", 0)
      While filename <> ""
            Call LogAction(strLogTitle, "processing file: " & strHTMLPATH & "\" & filename)
            filename = Dir
      Wend
      'process directories, call this function recursively per found directory
      subdir = Dir(strHTMLPath & "\*.*", 16)
      While subdir <> ""
            If subdir <> "." And subdir <> ".." Then
                  Call LogAction(strLogTitle, "move to subdirectory: " & strHTMLPath & "\" & subdir)
                  importHTMLFiles = importHTMLFiles(dbWebImport, strHTMLPath & "\" & subdir, strLogTitle)
                  subdir = Dir(strHTMLPath & "\*.*", 16)
            Else
                  subdir = Dir
            End If
      Wend
      
      REM =====================================================================================
      REM Exit function
      REM =====================================================================================
      importHTMLFiles = True
      Exit Function
      
      REM =====================================================================================
      REM Error Handler
      REM =====================================================================================
ErrorHandler:
      Call LogError(strLogTitle, Err, Error$ & " in line " & Erl)
      importHTMLFiles = False
      Exit Function
End Function

The script runs fine as long as it's reading files, but when it should pick up the directories and call itself again with the extended path, it fails: it reads the first file again and passes that to itself.

Does anybody have any idea whether I'm doing something wrong? Is this a bug in LotusScript? If so, is there a workaround?
I am working with R5.0.12.
An upgrade to R6.x is out of the question, by the way.

Regards,
JM
Asked by: Jean Marie Geeraerts (Application Engineer)

Sjef Bosman (Groupware Consultant) commented:
jerrith,

I suppose you have to call Dir repeatedly at the beginning of your function, to get all the names present, and store them in an array. Then walk through the array.

Sjef
 
Sjef Bosman (Groupware Consultant) commented:
Hi jerrith,

I'd always do a Dir to get the full directory listing, files as well as directories, then walk through the entries and call GetFileAttr on each one to find out what type of entry you have got.

Cheers!
   Sjef
 
Jean Marie Geeraerts (Application Engineer), author, commented:
Okay. This might work.
Any idea whether a subsequent call to Dir$ will return the next file when the function returns from its recursive call?
 
Jean Marie Geeraerts (Application Engineer), author, commented:
I feared as much :-)
I'll get cracking and let you know how I go.

Do you have any idea how to import .jpg, .gif, .svg, .css and .js files as image resources using script or LS2API?
I've been searching the LDD but haven't found a good solution yet. I'd post a 500-point question for this, as it doesn't seem to be too easy.
 
Sjef Bosman (Groupware Consultant) commented:
jerrith,

It won't be easy; in any case, you need Designer privileges for this. I could ask someone with a lot of LS2API experience. I don't have the answer myself: I've looked for the same thing in the past but ended up solving everything with normal file attachments. On the Web, use docname/$File/picture.jpg

Sjef
 
Jean Marie Geeraerts (Application Engineer), author, commented:
The problem is that I am importing a complete web site into a Notes database and I would like to keep the HTML code as is.
The easiest way would be to import all images, .css, .js, ... files as image resources, each named with its path relative to the root directory plus its file name.
This way I wouldn't have to convert any links in the original HTML code, which is stored in a Notes document with the option to display the document contents as HTML. That's why I'd like to use image resources.
Anyway, that's only part two of the problem. First I need to get the HTML files imported :-)

TTYL
 
Sjef Bosman (Groupware Consultant) commented:
Quick-and-dirty solution: dump all images etc. into one document, so you'd have db.nsf/view/doc/$File as a BASE
 
Jean Marie Geeraerts (Application Engineer), author, commented:
Hello Hemanth,

I've written my own version for importing the files now and it's a lot shorter :-)

Here's the code that does the trick for me at the moment (just log the file names for the time being):

      REM =====================================================================================
      REM Recursively walk through the passed HTML path
      REM =====================================================================================
      'Build list of filenames and directories and put this list in an array
      strFileList = ""
      filename = Dir$(strHTMLPath & "\*.*", 16)
      Do While (Not filename = "")
            If filename <> "." And filename <> ".." Then
                  strFileList = strFileList & filename & ":"
            End If
            filename = Dir$
      Loop
      If strFileList <> "" Then
            varFileList = StrExplode(strFileList, ":", False)
            'Walk through files and call the function recursively for directories
            Forall f In varFileList
                  intAttributes = Getfileattr(strHTMLPath & "\" & f)
                  If intAttributes And 16 Then
                        rc = importHTMLFiles(dbWebImport, strHTMLPath & "\" & f, strLogTitle)
                  Else
                        Call LogAction(strLogTitle, "Processing file: " & strHTMLPath & "\" & f)
                  End If
            End Forall
      End If

StrExplode is a function I wrote myself to mimic the behaviour of @Explode; it returns a variant holding the list of files.
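
The StrExplode source is not shown here, so what follows is only a minimal sketch of what such a helper might look like, not the function used above. The parameter names are made up for illustration, and the meaning of the third argument (keep or drop empty elements) is an assumption based on the call StrExplode(strFileList, ":", False). It uses Integer rather than Boolean for that flag so it should also compile on R5.

```lotusscript
Function StrExplode(strSource As String, strDelim As String, intKeepEmpty As Integer) As Variant
      ' Hypothetical sketch of an @Explode-like helper.
      ' ASSUMPTION: intKeepEmpty = False drops empty elements, e.g. the one
      ' produced by the trailing ":" in the file list built above.
      Dim result() As String
      Dim count As Integer
      Dim startPos As Long
      Dim delimPos As Long
      Dim piece As String
      
      count = 0
      startPos = 1
      Redim result(0)      ' fallback: one empty element if nothing is found
      
      If strDelim = "" Then
            ' No delimiter: return the source string as the only element
            result(0) = strSource
            StrExplode = result
            Exit Function
      End If
      
      Do
            delimPos = Instr(startPos, strSource, strDelim)
            If delimPos = 0 Then
                  ' Last piece: everything from startPos to the end
                  piece = Mid$(strSource, startPos)
            Else
                  piece = Mid$(strSource, startPos, delimPos - startPos)
            End If
            If piece <> "" Or intKeepEmpty Then
                  If count > 0 Then Redim Preserve result(count)
                  result(count) = piece
                  count = count + 1
            End If
            startPos = delimPos + Len(strDelim)
      Loop While delimPos > 0
      
      StrExplode = result
End Function
```

With this sketch, StrExplode("a.html:sub:", ":", False) would give a two-element array holding "a.html" and "sub", which Forall can then walk as in the code above.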

Since Sjef put me on the right track I'll award points to him.

Now for the import of graphic and other files into image resources, that's something completely different and I'll post a new question for that.
 
CRAK commented:
Done that before...
See http://oldlook.experts-exchange.com:8080/Applications/Email/Lotus_Notes/Q_11758718.html

It provides a working recursive solution.
2nd option (same PAQ):
Launch MS DOS and run
    DIR /B/S > C:\MYDIR.TXT
All you need now is the info in C:\MYDIR.TXT

;-))
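
A minimal sketch of how the second option could be consumed from LotusScript, assuming the listing has already been written to C:\MYDIR.TXT by the DIR command above. LogFilesFromListing is a hypothetical name; LogAction is the logging routine from the question. Since DIR /B/S lists directories as well as files, the sketch checks each entry with GetFileAttr and skips directories.

```lotusscript
Sub LogFilesFromListing(strListingPath As String, strLogTitle As String)
      ' Hypothetical helper: reads a DIR /B/S listing line by line.
      ' ASSUMPTION: every non-empty line is a full path that still exists.
      Dim fileNum As Integer
      Dim strLine As String
      
      fileNum = Freefile()
      Open strListingPath For Input As fileNum
      Do While Not Eof(fileNum)
            Line Input #fileNum, strLine
            If strLine <> "" Then
                  ' 16 is the directory attribute bit; only log regular files
                  If (Getfileattr(strLine) And 16) = 0 Then
                        Call LogAction(strLogTitle, "Processing file: " & strLine)
                  End If
            End If
      Loop
      Close fileNum
End Sub
```

It would be called as LogFilesFromListing("C:\MYDIR.TXT", strLogTitle). No recursion is needed, because the listing already contains the full path of every entry.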
 
Jean Marie Geeraerts (Application Engineer), author, commented:
Your code looks very similar :-)
Must be that great minds think alike ;->
 
CRAK commented:
Beter goed gejat dan slecht gebouwd
(= better well stolen than badly built)

I'm pretty sure that mine was an original!
 
Jean Marie Geeraerts (Application Engineer), author, commented:
LOL
Very true, especially if you're working against the clock \:-D