Jean Marie Geeraerts
asked on
Importing files in a directory and subdirectories
Hello guys,
I'm trying to import a bunch of HTML files that reside in a directory and its subdirectories into a Notes database.
The reason for doing this is to be able to search the former HTML documents easily using Domino's full-text search engine, and to automatically archive documents that expire depending on their creation date.
The problem I am facing is with the subdirectories. I can't get LotusScript to recursively read subdirectories. Apparently Dir and Dir$ always return regular files and not directories, no matter what I do.
Here's the code I use to perform the import of files (I've cut out the actual parsing of the files and replaced it with a sort of log command):
Function importHTMLFiles(dbWebImport As NotesDatabase, strHTMLPath As String, strLogTitle As String) As Integer
    REM ///////////////////////////////////////////////////////////////////////////////////
    REM This function will recursively walk through all files in the passed path and import them in the passed database
    REM ///////////////////////////////////////////////////////////////////////////////////

    REM ===================================================================================
    REM Initialize local error routine
    REM ===================================================================================
    importHTMLFiles = False
    On Error Goto ErrorHandler

    REM ===================================================================================
    REM Initialize local variables
    REM ===================================================================================
    Dim subdir As String
    Dim filename As String

    REM ===================================================================================
    REM Recursively walk through the passed HTML path
    REM ===================================================================================
    'process files in the directory
    Call LogAction(strLogTitle, "processing files in " & strHTMLPath)
    filename = Dir(strHTMLPath & "\*.*", 0)
    While filename <> ""
        Call LogAction(strLogTitle, "processing file: " & strHTMLPath & "\" & filename)
        filename = Dir
    Wend

    'process directories, call this function recursively per found directory
    subdir = Dir(strHTMLPath & "\*.*", 16)
    While subdir <> ""
        If subdir <> "." And subdir <> ".." Then
            Call LogAction(strLogTitle, "move to subdirectory: " & strHTMLPath & "\" & subdir)
            importHTMLFiles = importHTMLFiles(dbWebImport, strHTMLPath & "\" & subdir, strLogTitle)
            subdir = Dir(strHTMLPath & "\*.*", 16)
        Else
            subdir = Dir
        End If
    Wend

    REM ===================================================================================
    REM Exit function
    REM ===================================================================================
    importHTMLFiles = True
    Exit Function

    REM ===================================================================================
    REM Error Handler
    REM ===================================================================================
ErrorHandler:
    Call LogError(strLogTitle, Err, Error$ & " in line " & Erl)
    importHTMLFiles = False
    Exit Function
End Function
The script runs fine as long as it's reading files, but when it should pick up the directories and call itself again with the extended path, it fails: it reads the first file again and passes that to itself.
Any ideas whether I'm doing anything wrong? Is this a bug in LotusScript? If so, is there a workaround?
I am working with R5.0.12.
An upgrade to R6.x is out of the question, by the way.
Regards,
JM
ASKER
Okay. This might work.
Any idea whether a subsequent call to Dir$ will return the next file when the function returns from its recursive call?
ASKER
I feared as much :-)
I'll get cracking and let you know how I go.
Do you have any idea of how to import .jpg, .gif, .svg, .css and .js files as image resources using script or LS2API?
I've been searching the LDD but haven't found a good solution yet. I'd post a 500 point question for this as this doesn't seem to be too easy.
jerrith,
It won't be easy, and in any case you need Designer privileges for this. Shall I ask someone with a lot of LS2API experience? I don't have the answer myself: I've looked for the same thing in the past, but ended up solving everything with normal file attachments. On the web, use docname/$File/picture.jpg
Sjef
ASKER
The problem is I am importing a complete web site in a notes database and I would like to keep the HTML code as is.
The easiest way would be to import all images, .css, .js, ... as image resources, each named after its path relative to the root directory plus its file name.
This way I wouldn't have to convert any links in the original HTML code, which is stored in a Notes document with the option to display the document contents as HTML. That's why I'd like to use image resources.
Anyway, that's only part two of the problem. First I need to get the HTML files imported :-)
TTYL
Q&D solution: Dump all images etc. in one document, so you'd have db.nsf/view/doc/$File as a BASE
ASKER
Hello Hemanth,
I've written my own version for importing the files now and it's a lot shorter :-)
Here's the code that does the trick for me at the moment (just log the file names for the time being):
REM ===================================================================================
REM Recursively walk through the passed HTML path
REM ===================================================================================
'Build list of filenames and directories and put this list in an array
strFileList = ""
filename = Dir$(strHTMLPath & "\*.*", 16)
Do While (Not filename = "")
    If filename <> "." And filename <> ".." Then
        strFileList = strFileList & filename & ":"
    End If
    filename = Dir$
Loop
If strFileList <> "" Then
    varFileList = StrExplode(strFileList, ":", False)
    'Walk through files and call the function recursively for directories
    Forall f In varFileList
        intAttributes = Getfileattr(strHTMLPath & "\" & f)
        If intAttributes And 16 Then
            rc = importHTMLFiles(dbWebImport, strHTMLPath & "\" & f, strLogTitle)
        Else
            Call LogAction(strLogTitle, "Processing file: " & strHTMLPath & "\" & f)
        End If
    End Forall
End If
StrExplode is a function I wrote myself to mimic the behaviour of @Explode; it returns a variant holding the list of files.
Since Sjef put me on the right track I'll award points to him.
Now for the import of graphic and other files into image resources, that's something completely different and I'll post a new question for that.
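The StrExplode helper itself is not shown in the thread. As a rough idea of what such a helper might look like, here is a minimal sketch that relies only on Instr, Left$ and Mid$. The name ExplodeSketch and its two-argument signature are made up for illustration; the meaning of the third argument in the call above is not known, so it is left out, and this is not the author's actual function.

```lotusscript
' Hypothetical helper (not the StrExplode from the post): splits strSource
' on strDelim and returns the non-empty pieces as a string array in a Variant.
Function ExplodeSketch(strSource As String, strDelim As String) As Variant
    Dim result() As String
    Dim strRest As String
    Dim lngPos As Long
    Dim intCount As Integer

    strRest = strSource
    intCount = 0
    lngPos = Instr(strRest, strDelim)
    Do While lngPos > 0
        If lngPos > 1 Then
            ' Text before the delimiter is one element; empty elements are skipped
            Redim Preserve result(intCount)
            result(intCount) = Left$(strRest, lngPos - 1)
            intCount = intCount + 1
        End If
        ' Continue with whatever follows the delimiter
        strRest = Mid$(strRest, lngPos + Len(strDelim))
        lngPos = Instr(strRest, strDelim)
    Loop

    ' Anything left after the last delimiter is the final element
    If strRest <> "" Then
        Redim Preserve result(intCount)
        result(intCount) = strRest
        intCount = intCount + 1
    End If

    ' Always hand back an allocated array, even for empty input
    If intCount = 0 Then
        Redim result(0)
    End If
    ExplodeSketch = result
End Function
```

With the trailing ":" that the loop above appends to every name, a call such as ExplodeSketch(strFileList, ":") would yield one element per file or directory name.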
Done that before...
See http://oldlook.experts-exchange.com/questions/11758718/Recursive-routine.html
It provides a working recursive solution.
2nd option (same PAQ):
Launch MS DOS and run
DIR /B/S > C:\MYDIR.TXT
All you need now is the info in C:\MYDIR.TXT
;-))
ASKER
Your code looks very similar :-)
Must be that great minds think alike ;->
Beter goed gejat dan slecht gebouwd
(= better well stolen than badly built)
I'm pretty sure that mine was an original!
ASKER
LOL
Very true, especially if you're working against the clock \:-D
I'd always do a Dir to get the full directory, not only the directories but also the files, then walk through the files and do a GetFileAttr to find out what type of file you have got.
Cheers!
Sjef
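For reference, the two-pass idea described here (list everything first, then classify with GetFileAttr) could be sketched roughly as follows. This is only an illustration under stated assumptions: WalkTreeSketch is a made-up name, LogAction is the logging routine from the question and is assumed to exist, and the names are collected in a dynamic array, so no string-splitting helper is needed.

```lotusscript
' Hypothetical sketch of the "Dir first, GetFileAttr afterwards" approach.
Function WalkTreeSketch(strPath As String, strLogTitle As String) As Integer
    Dim entries() As String
    Dim entry As String
    Dim n As Integer
    Dim i As Integer

    ' Pass 1: finish the whole Dir$ scan before recursing, because a new
    ' Dir$ call with a file spec restarts the scan and the old position is lost.
    n = 0
    entry = Dir$(strPath & "\*.*", 16)   ' 16 = directories in addition to normal files
    Do While entry <> ""
        If entry <> "." And entry <> ".." Then
            Redim Preserve entries(n)
            entries(n) = entry
            n = n + 1
        End If
        entry = Dir$
    Loop

    ' Pass 2: classify each saved name; recurse into directories, log files.
    For i = 0 To n - 1
        If Getfileattr(strPath & "\" & entries(i)) And 16 Then
            Call WalkTreeSketch(strPath & "\" & entries(i), strLogTitle)
        Else
            Call LogAction(strLogTitle, "Processing file: " & strPath & "\" & entries(i))
        End If
    Next

    WalkTreeSketch = True
End Function
```

Because each call keeps its own entries array, the recursive calls cannot disturb the list that the caller is still walking through.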