VERY HARD!!!  Huge Strings and Manipulating Them

vblogic asked:
I thought I was done with having to manually alter the RTF codes of the data in a RichTextBox control in my project.  I was wrong...

I need to alter the margins so that the file is formatted correctly when viewed in an external application such as WordPad or MS Word.  I need to insert the "\margl", "\margr", etc. codes manually.  At first, I tried altering the TextRTF property, but the change doesn't persist (re: http://support.microsoft.com/support/kb/articles/Q184/1/98.asp)

If I alter the TextRTF property and set it back, the change is gone as soon as it was set (if it ever was set at all).  So why not just store it in a string variable, you ask?  For a couple of reasons:

1) The final document is over 6MB in size, and storing the data in a string variable and then manipulating that string is a huge memory and CPU hog.
2) Writing that string to disk using the Print# statement performs terribly compared with the RichTextBox's intrinsic SaveFile method.


So I need to alter the rtf codes, which only need to be inserted near the beginning of the document.  I am hoping there is some slick CopyMemory routine to insert the codes I need or something similar.  Please help!!

Commented:
If you have the text in RTF format on disk, you can read the data in chunks. There is no need to read all of it at once. You could read, say, 50 KB from the source file, add the "\margl" or "\margr" strings where needed, and write the chunks to the destination file.

Author

Commented:
VBmaster,
I am not reading an RTF file from disk.  I am generating a report using the RichTextBox control, and then, hoping to save this generated report to disk with the proper margins.  The problem is inserting the margin rtf codes before saving to a file given the performance issues above.
The SendMessage API call can set the margins in a text box, but I am not sure that would help you.
Ber

Commented:
Just a suggestion...
Save it to disk. Open the file as text and add the RTF tag in the relevant place. (As far as I know you would only have to write about 20 characters to the file in total, no matter what the size, as the margin affects the entire document.) Once this is written in the correct place, it should have the desired effect when you open the file as RTF again.

I know this may slow you down a bit, but nowhere near as much as using huge strings would, and you avoid attempting to mess around with memory.

HTH
Later
Ber...

Commented:
Hi!

You can format it in the RichTextBox itself; there is no need for the variable.


Dim x&
RichTextBox1.Visible = False
x = RichTextBox1.SelStart
RichTextBox1.SelStart = 0
RichTextBox1.SelLength = Len(RichTextBox1.Text) 'use constant to limit the size, it will take about 10-15sec to format whole control if over 6meg is loaded
RichTextBox1.SelIndent = 2300
RichTextBox1.SelHangingIndent = -300
RichTextBox1.SelStart = Len(RichTextBox1.Text)
RichTextBox1.SelStart = x
RichTextBox1.SelLength = 0
RichTextBox1.Visible = True
RichTextBox1.SetFocus




Matti

Author

Commented:
Ber,
If you know of a routine to open a file, insert some text, and save it back w/o having to read in the whole file just to write it back out again, please let me know.  Otherwise, I will still be stuck with the problem of writing 6MB of data back to disk.

Matti,
I really need more flexibility to set margins for each side, but even still, note the link I mentioned in my original question.  Indent properties set via code do not persist to a file or variable, and become lost.


All,
I really feel my only solution is to add these rtf margin codes manually...the real question is, how to do it in an efficient manner.  No need to display anything on my form...this is strictly saving the report generated to a file which can be then opened later in WordPad or MS Word.

Commented:
>note the link I mentioned

That link doesn't say "indent properties are not saved to file."

"when the textRTF property is saved in a variable or in a file and then restored, the SelHangingIndent property is set to 0 ..."

So, you can use Matti's code.

--
Here is some code to compare speed.  Note that "direct save to file" method will work OK only if TextRTF does not contain previous paragraph formatting.

Option Explicit
Private Declare Function GetTickCount Lib "kernel32" () As Long

Private Sub Form_Click()
    Dim tim0 As Long, res As VbMsgBoxResult
    Dim pardpos As Long, ff As Integer
    Dim filename As String

    filename = "c:\windows\desktop\tst.rtf"     ' CHANGE DIRECTORY
   
    res = MsgBox("Update RTF control?", vbYesNoCancel Or vbQuestion, "Save RTF")
   
    tim0 = GetTickCount
    Dim s As String
   
    If res = vbYes Then
        ' this will also update control
        With Me.RichTextBox1
            .Visible = False
           
            .SelStart = 0   ' SelStart is zero-based; 0 selects from the first character
            .SelLength = Len(.Text)
           
            .SelIndent = 555
            .SelRightIndent = 555
            .SelHangingIndent = 300
           
            .SaveFile filename
           
            .SelLength = 0
            .Visible = True
        End With
    ElseIf res = vbNo Then
        ' save without updating control
        s = RichTextBox1.TextRTF
        pardpos = InStr(s, "\pard\")
       
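        On Error Resume Next    ' ignore "file not found" from Kill
        Kill filename
        On Error GoTo 0         ' do not mask errors from the file writes below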
       
        ff = FreeFile
        Open filename For Binary Access Write Lock Read As #ff
        Put #ff, , Left$(s, pardpos - 1)
        Put #ff, , "\pard\li855\ri555\fi-300\"
        Put #ff, , Mid$(s, pardpos + 6)
        Close #ff
       
    Else
        Caption = "File not saved"
        Exit Sub
    End If
    Caption = (GetTickCount - tim0) / 1000 & " seconds to save"
End Sub

Private Sub Form_Load()
    Me.RichTextBox1.Text = String(6000000, "a")
    Caption = "Click form to Save RTF contents"
End Sub

Author

Commented:
ameba,
Granted, perhaps the Indent properties are persisted in the context of what I was requesting, but perhaps I need to clarify that it's not really indenting I am interested in...it is the margins.  Or maybe they are the same...do the "li", "ri", "fi" RTF codes perform the same function as the "margl", "margr", etc. codes?  In essence, I need to alter the margins such that the report takes up nearly a full printed page (i.e., small margins).

I use the RichTextBox to generate my formatted report, with font coloring, bolding, underlining, and inserting graphics.  The control is invisible, and does not need to be displayed, just saved.  So, with that...

1) Why is the SaveFile method so much faster than any of VB's intrinsic file I/O functions?
2) Is there any way to alter the TextRTF property to include the margin settings I desire (or perhaps the Indent properties do offer what I need)

ameba, your two methods above were much too slow...both taking over a minute on my machine to complete.  I really need to get this down to as little time as possible.  Currently, if I could assume that margins were acceptable, simply calling the SaveFile method executes in 2 seconds.  Surely, there must be something we can do to get somewhat remotely close to this efficiency.

Commented:
> do the "li", "ri", "fi" rtf codes perform the same functionality as the "margl", "margr", etc. codes?

I don't know - maybe you can check RTF specifications, or inspect TextRTF property, or create RTF via MS Word and check file contents.

> your two methods above were much too slow...both taking over a minute

Here, it was 3.9 and 1.2 seconds.

1) Why is the SaveFile method so much faster than any of VB's intrinsic file I/O functions?

SaveFile takes 0.4 seconds (it's 2 seconds on your PC)
To save 6 MB string to file in VB it takes THE SAME time (0.35-0.4 seconds)

This line takes 0.66 seconds
     s = RichTextBox1.TextRTF

That is the most critical line.  This simple code takes about 1.05 seconds:
     Open filename For Binary Access Write Lock Read Write As #ff
     Put #ff, , RichTextBox1.TextRTF
     Close #ff
If we add a few lines to insert RTF formatting, it will total 1.2 seconds.
I think that is good enough.


How to make it faster than 1.2 seconds?

It is possible to avoid that critical line (s = RichTextBox1.TextRTF):  you'll have to use SaveFile (0.4 seconds) and then modify the file in less than 0.8 seconds.

I was able to modify the file in 1.0 second - maybe you can do it somewhat faster if you use a byte array, or if you do the reading/writing without opening the file twice, or if you use a RAM disk or something similar for the temporary file. But I am not sure it is worth doing this when the given code is so simple.

I'll repeat the code - I added one line (RichTextBox1.Text = vbNullString) to save some memory (?), I moved file from desktop to avoid refreshing desktop, and richtextbox is hidden.

' 1.2 seconds (Celeron on 700, 128 MB), should be less than 6 seconds on your PC, which is
' far from 1 minute (maybe you had swapping to disk, too many applications open?)

' Form1, add RichTextBox
Option Explicit
Private Declare Function GetTickCount Lib "kernel32" () As Long

Private Sub Form_Click()
    Dim tim0 As Long, pardpos As Long, ff As Integer
    Dim filename As String, s As String
    On Error Resume Next
   
    filename = "c:\tst.rtf"
    Kill filename
    On Error GoTo 0
   
    tim0 = GetTickCount ' start timer
   
    s = RichTextBox1.TextRTF
    RichTextBox1.Text = vbNullString
    pardpos = InStr(s, "\pard\")
           
    ff = FreeFile
    Open filename For Binary Access Write Lock Read Write As #ff
    Put #ff, , Left$(s, pardpos - 1)
    Put #ff, , "\pard\li855\ri555\fi-300\"   ' or some other formatting
    Put #ff, , Mid$(s, pardpos + 6)
    Close #ff
   
    Caption = (GetTickCount - tim0) / 1000 & " seconds"
End Sub

Private Sub Form_Load()
    RichTextBox1.Visible = False
    RichTextBox1.Text = String(6000000, "a")
    Caption = "Click form to Save RTF contents"
End Sub

Commented:
Other options you have:

Print directly from RichTextBox - there is some code to render RTF to printer and set margins.

If you are using MS Word for printing RTF files, consider using Word Automation - I use it for most of my reports - code to create report can be generated by using Macro Recorder, users can modify templates...
Valliappan AN, Senior Tech Consultant

Commented:
listening...

Author

Commented:
>>or create RTF via MS Word and check file contents

That is what I did initially...and I'm pretty sure that the "margl", "margr", etc. codes are the ones I am interested in.


>>Open filename For Binary Access Write Lock Read Write As #ff
>>Put #ff, , RichTextBox1.TextRTF
>>Close #ff

This method takes approx. 40 seconds on my machine, whereas the SaveFile method of the RTB takes approx. 0.9 seconds.  I am not sure why it takes so long on my machine.  At home I have a 400MHz machine with 196MB memory, and here at work (where I just got the results I mentioned), I have a 600MHz machine with 128MB memory.  These aren't the latest and greatest machines, but they are good...and probably better than a lot of the machines our clients will be using.  What could be causing such a drastic difference in my performance?


I am simply generating these reports to be saved to disk, for printing or emailing later.  That is up to the user.  We are generating RTF files since we want clients who do not have MS Word to be able to view and print these reports as well.


Commented:
' I don't know what or how you are testing.  Please test the provided code - start a new VB
'         project, triple-click here, and copy the code to a VB form.
'
' Form1, add RichTextBox
Option Explicit
Private Declare Function GetTickCount Lib "kernel32" () As Long

Private Sub Form_Click()
   Dim tim0 As Long, ff As Integer, filename As String, s As String
   On Error Resume Next
   filename = "c:\tst.rtf"
   Kill filename
   On Error GoTo 0
'----------------------------
   tim0 = GetTickCount ' start timer
   s = RichTextBox1.TextRTF
   Caption = (GetTickCount - tim0) / 1000 & " to create s."
'----------------------------
   tim0 = GetTickCount ' start timer
   ff = FreeFile
   Open filename For Binary Access Write Lock Read Write As #ff
   Put #ff, , s
   Close #ff
   
   Caption = Caption & " Put: " & (GetTickCount - tim0) / 1000 & " seconds"
   Debug.Print Caption
End Sub

Private Sub Form_Load()
   RichTextBox1.Visible = False
   RichTextBox1.Text = String(6000000, "a")
   Caption = "Click form to Save RTF contents"
End Sub

' results:
' 0.676 to create s. Put: 0.385 seconds  (Celeron 700/128MB, the same is in IDE or compiled)
' 5.4 to create s. Put: 2.7 seconds  (P120/64MB, MS Word running)

Commented:
Hi!

Those "margl", "margr" codes are not supported by the VB RTF control, so if you want to "leave" it, then the whole syntax needs to be parsed, huh! And I don't think it can be done very fast; more likely you need to take care of this when you generate the report and add the RTF codes there, if you do not want to use those RICHTX32.OCX "li", "ri", etc. codes.

Does the data come from a database query, as text?

Also, a 6 MB report is a bit of a thud on any POP3 account.

Consider an HTML report.


Matti

Author

Commented:
ameba,
Results:
41 seconds to create s
.78 seconds to put

For some reason, my computer is having much trouble allocating storage for the string.



Matti,
>>more likely need to take care this when you generate the report

I would love to do that, but the problem is, I have tried doing this early on in the report creation process.  Since these codes need only appear near the beginning of the file, I tried manipulating the RTF string early on, but once I have done so and set the altered value back to the TextRTF property, it is lost.  I will generate a string with the "\margl" and "\margr" codes, but if I ever try to set it back to the RichTextBox, it's gone...likely because, as you said, it doesn't support those codes.

>>Also a 6M report is a bit of thud on any pop3 account

Yes, I understand.  The length of the report is mainly due to graphic pictures that must be embedded in the reports.  The text itself is not lengthy.



A thought crosses my mind as I type this...perhaps I can insert pictures at the end of the report processing?  For example, I can embed flags in the text like:

[embedpic]

where a graphic must be embedded.  At the end of creating the report, load the text into a string, add my margin rtf codes, and then go back and do a search and replace to embed the images. Of course, I am back to going and replacing and searching this entire report again...which will probably just introduce new performance problems.

ameba, if we can figure out why my computer is taking so long to create the string in memory, we can move on from here.

Commented:
Hi!

About those pictures: do you have them all already, or do you generate them in the process of creating the report?

You will need a JPG library if the pictures are not ready-made.

The data is much larger when a bitmap is saved as RTF.
It's more like two pictures there. This is a very bad problem when you send it via mail!

>The length of the report is mainly due to graphic pictures that must be embedded
>in the reports.  The text itself is not lengthy.

Also, free zip libraries are available, and you could send the reports in self-extracting EXE packages with folder info.

Should the pictures be JPG to save some bandwidth, as JPG is more of a web format?

Paste a picture, get the TextRTF, and compare to see how it changes when you paste one or two pictures, etc.
You may save these to object tags as strings; then they do not consume GDI resources, since you do not like variables, but in terms of memory there is not much difference.
A variable or array can be emptied or ReDim'ed as easily as an object or form can be unloaded.

Those "Supported RTF Codes" are listed in MSDN.
If you want to use the VB RTF box, then stay with these.

 
Matti

Commented:
vblogic,
Do you hear your hard disk working very hard during these 41 seconds?

If yes, then maybe your swap file is on shared drive, or your disk, where swap file is, has less than 5% free space.
Check your disk for errors, clear \temp and \tmp directories.

If not, then you have some other big apps working, or you are not using my code.  You are using *exact* code, aren't you?  The one with 6 mil. of "a" characters.

Of course, it is possible to avoid spending these 12 MB for 6MB string (you can write file to disk using SaveFile, then read and process files in chunks) - and I would use such technique for 486 machine, but this should not be necessary for your 128MB Pentium machine.
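The "12 MB for a 6MB string" figure follows from VB6 strings holding two bytes per character in memory. A quick check, offered as a minimal sketch (the Sub name is only illustrative; paste it into any form or module and call it from the Immediate window):

```vb
' Illustrative check: character count vs. in-memory byte count of a VB6 string
Public Sub ShowStringFootprint()
    Dim s As String
    s = String$(6000000, "a")
    Debug.Print Len(s)      ' 6000000  (characters)
    Debug.Print LenB(s)     ' 12000000 (bytes, two per character)
End Sub
```

When the string is written with Put to a file opened For Binary, VB converts it back to one byte per character, so the file on disk is about 6 MB again.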

Author

Commented:
Matti,
I generate the pictures at run-time using a Graphics Server component.  I have been using bitmaps because that is what the OLEObjects collection of the RichTextBox allows me to add.  I looked at using jpegs, but when I paste a jpg into the RichTextBox control, only an icon appears, which I guess, is just some sort of link that will open up the user's default jpeg viewer.  I would very much prefer (and my boss may demand) that the pictures be directly embedded and visible within the report itself.


ameba,
I did notice that the hard disk was not working very hard during that time.  I tried your *exact* code on my machine at home and at work and they produced almost identical results (e.g., 40+ seconds to load the TextRTF property into a string variable).

About your "save-open-alter-restore" method using chunks...do you think it is practical?  How slow could that be?

>>and I would use such technique for 486 machine

That may not be too far off from what some of our clients may use...so maybe this is an idea.  Perhaps you can help with good "chunking" code so we can test performance.

Commented:
>using chunks...do you think it is practical?

Instead of a few simple lines of code, we have one extra (temporary) file and complex code - imagine the maintenance costs.  :-)

It works a bit faster here, it doesn't use much memory, and I hope it'll work OK for you.  I'll post the code in a separate comment, so you can use triple-click to select the code.
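For the extra temporary file, a unique name can be requested from Windows rather than hard-coding a path. The following is only a sketch: the two Declare statements are the usual ANSI kernel32 declares, while MakeTempFileName is a hypothetical helper name, not something from this thread.

```vb
' Standard kernel32 declares (ANSI versions); place at the top of the form or module
Private Declare Function GetTempPath Lib "kernel32" Alias "GetTempPathA" _
    (ByVal nBufferLength As Long, ByVal lpBuffer As String) As Long
Private Declare Function GetTempFileName Lib "kernel32" Alias "GetTempFileNameA" _
    (ByVal lpszPath As String, ByVal lpPrefixString As String, _
     ByVal wUnique As Long, ByVal lpTempFileName As String) As Long

' Hypothetical helper: returns a unique temp file path, or "" on failure
Private Function MakeTempFileName() As String
    Dim tempDir As String, tempFile As String, n As Long

    tempDir = String$(260, vbNullChar)          ' MAX_PATH-sized buffer
    n = GetTempPath(Len(tempDir), tempDir)      ' length copied, 0 on failure
    If n = 0 Or n > Len(tempDir) Then Exit Function
    tempDir = Left$(tempDir, n)

    tempFile = String$(260, vbNullChar)
    ' wUnique = 0 lets Windows choose the name; it also creates an empty file
    If GetTempFileName(tempDir, "rtf", 0, tempFile) = 0 Then Exit Function

    ' trim the buffer at the terminating null character
    MakeTempFileName = Left$(tempFile, InStr(tempFile, vbNullChar) - 1)
End Function
```

The returned name ends in .tmp and the file already exists (empty), so it can be overwritten by SaveFile or removed with Kill first; remember to Kill it once the final file has been written.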
Commented:
' Form1, add RichTextBox
Option Explicit
Private Declare Function GetTickCount Lib "kernel32" () As Long

Private Sub Form_Click()
    Dim tim0 As Long
    Dim filenameTmp As String, ffsrc As Integer, filesize As Long
    Dim filename As String, ffdest As Integer
    Const chunksize As Long = 4096  ' or: 2048, 4096, 8192
    Dim lastchunksize As Long, numchunks As Long, i As Long
    Dim buffer As String
    On Error Resume Next
   
    filenameTmp = "c:\tstTmp.rtf" ' or use gettempfilename API
    filename = "c:\tst.rtf"
    Kill filenameTmp
    Kill filename
    On Error GoTo 0
   
' Step 1 - fast save -----------------------------------
    tim0 = GetTickCount ' start timer
    RichTextBox1.SaveFile filenameTmp
    Caption = "SaveFast: " & (GetTickCount - tim0) / 1000 & " s. "
   
' Step 2 - work with two files -------------------------
    tim0 = GetTickCount ' start timer
   
    ' open source file
    ffsrc = FreeFile
    Open filenameTmp For Binary Access Read Lock Read Write As #ffsrc
    filesize = LOF(ffsrc)
    Print "Reading File, Size=" & FormatNumber(filesize, 0)
   
    ' calculate number of chunks
    numchunks = Int(filesize \ chunksize) + 1
    lastchunksize = filesize Mod chunksize
    If lastchunksize = 0 Then
        numchunks = numchunks - 1
        lastchunksize = chunksize
    End If
    Print "Chunks " & numchunks - 1 & " * " & chunksize & " + " & lastchunksize
   
    ' open destination file
    ffdest = FreeFile
    Open filename For Binary Access Write Lock Read Write As #ffdest
   
    ' prepare buffer
    buffer = Space(chunksize)
   
    ' read & write chunks
    For i = 1 To numchunks
        If i = numchunks Then
            buffer = Space(lastchunksize)
        End If
        Get #ffsrc, , buffer
       
        If i = 1 Then
            Put #ffdest, , ModifyHdr(buffer)
        Else
            Put #ffdest, , buffer
        End If
    Next
   
    ' close files
    Close #ffsrc
    Close #ffdest
   
    Caption = Caption & " Read/Write chunks: " & (GetTickCount - tim0) / 1000 & " s."
End Sub

' modify RTF header
Function ModifyHdr(FirstChunk As String) As String
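    ' replace only the first "\pard\" (start at 1, count 1), as in the earlier InStr-based versions
    ModifyHdr = Replace(FirstChunk, "\pard\", "\pard\li855\ri555\fi-300\", 1, 1)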
End Function

Private Sub Form_Load()
   RichTextBox1.Visible = False
   RichTextBox1.Text = String(6000000, "a")
   Caption = "Click form to Save RTF contents"
End Sub
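ModifyHdr above adds paragraph indents (\li, \ri, \fi). In the RTF specification those are paragraph-formatting codes, whereas \margl, \margr, \margt and \margb are document-level page margins, measured in twips (1440 per inch). A variant that inserts page margins instead might look like the sketch below. InsertPageMargins is a hypothetical name, and the sketch assumes, as the code above does, that the first \pard falls inside the first chunk and after the header tables.

```vb
' Hypothetical variant of ModifyHdr: inserts page-margin codes (in twips)
' just before the first \pard instead of adding paragraph indents.
Function InsertPageMargins(FirstChunk As String, ByVal marginTwips As Long) As String
    Dim pos As Long, codes As String

    pos = InStr(FirstChunk, "\pard")
    If pos = 0 Then
        InsertPageMargins = FirstChunk      ' no \pard found: leave chunk unchanged
        Exit Function
    End If

    ' the same margin on all four sides; & converts the Long without a leading space
    codes = "\margl" & marginTwips & "\margr" & marginTwips & _
            "\margt" & marginTwips & "\margb" & marginTwips

    InsertPageMargins = Left$(FirstChunk, pos - 1) & codes & Mid$(FirstChunk, pos)
End Function
```

To try it, the line Put #ffdest, , ModifyHdr(buffer) would become Put #ffdest, , InsertPageMargins(buffer, 720) for half-inch margins. Whether a given viewer honors these codes is worth checking by opening the result in WordPad and MS Word.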

Author

Commented:
ameba,
Initial testing on my home computer is superb!  Total time is approx. 3 seconds.  I will test this with my actual project code first thing tomorrow and let you know how it goes...but I think we have found a solution.  Great work!

Author

Commented:
Thanks to all who commented.  I know some of you mentioned the method that I ended up utilizing in the end, but I am going to award ameba for his time, dedication, and hard work put in on this question.  Again, thanks to everyone.

Commented:
Thanks!