Rich Edit - extract formatted text

I need to be able to get the contents of a rich edit control. Specifically, I need all of the text and to know where any bold and/or underlined text starts and ends. I can use EM_STREAMOUT to get the contents, but how do I parse out just the text, bold, and underline data?
Thank You
Asked by: marvinm
 
AlexVirochovsky commented:
You need to use the EM_GETCHARFORMAT message
and the CHARFORMAT structure.

The EM_GETCHARFORMAT message determines the current character formatting in a rich edit control.

EM_GETCHARFORMAT
wParam = (WPARAM) (BOOL) fSelection;
lParam = (LPARAM) (CHARFORMAT FAR *) lpFmt;

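On return, the dwMask field of the CHARFORMAT structure may or may not have the
CFM_BOLD/CFM_UNDERLINE bits set. A set bit means the attribute is the same across the whole selection, and its state is then given by CFE_BOLD/CFE_UNDERLINE in the dwEffects field.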
 
robpitt commented:
I think you'll have to write a simple RTF to plain text parser that ignores all RTF control sequences apart from bold and underline.

 
marvinm (author) commented:
Do you have any examples of doing this? (non-MFC)
Thank You
 
marvinm (author) commented:
Do I need to set the selection to one character and do this for each character position?
Thank You
 
robpitt commented:
Whilst you could call EM_GETCHARFORMAT iteratively for each character, this would obviously not be very efficient. Then again, it would work and would save you writing a lot of code.
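
For anyone wanting a starting point, here is a minimal sketch of that per-character approach in plain C (no MFC). It is only an illustration under stated assumptions: `FormatRun`, `collect_runs`, `flags_from_array` and `richedit_char_flags` are hypothetical names, not part of the Win32 API, and the caller is assumed to supply the character count. The Windows-only query is guarded by `_WIN32`; note that it moves the selection, so save and restore the selection if that matters.

```c
#include <assert.h>
#include <stddef.h>

enum { FMT_BOLD = 1, FMT_UNDERLINE = 2 };

/* One run of consecutive characters that share the same formatting. */
typedef struct {
    size_t start;    /* index of the first character in the run */
    size_t end;      /* one past the last character in the run */
    unsigned flags;  /* combination of FMT_BOLD and FMT_UNDERLINE */
} FormatRun;

/* Hypothetical callback: returns the FMT_* bits of the character at pos. */
typedef unsigned (*CharFlagsFn)(void *ctx, size_t pos);

/* Queries every position in 0..length-1 and merges neighbours with equal
   flags. Returns the number of runs stored; stops early once max_runs
   entries have been filled. */
size_t collect_runs(CharFlagsFn get_flags, void *ctx, size_t length,
                    FormatRun *runs, size_t max_runs)
{
    size_t count = 0;
    assert(get_flags != NULL);
    for (size_t pos = 0; pos < length; pos++) {
        unsigned flags = get_flags(ctx, pos);
        if (count > 0 && runs[count - 1].flags == flags) {
            runs[count - 1].end = pos + 1;   /* extend the current run */
        } else {
            if (count == max_runs)
                break;                       /* caller's buffer is full */
            runs[count].start = pos;
            runs[count].end = pos + 1;
            runs[count].flags = flags;
            count++;
        }
    }
    return count;
}

/* Portable stand-in for trying the logic without a window:
   ctx points to an array holding one flag byte per character. */
unsigned flags_from_array(void *ctx, size_t pos)
{
    return ((const unsigned char *)ctx)[pos];
}

#ifdef _WIN32
#include <windows.h>
#include <richedit.h>

/* ctx is the HWND of the rich edit control. Selects one character and
   reads its format; this changes the control's current selection. */
unsigned richedit_char_flags(void *ctx, size_t pos)
{
    HWND edit = (HWND)ctx;
    CHARFORMAT cf;
    unsigned flags = 0;

    ZeroMemory(&cf, sizeof cf);
    cf.cbSize = sizeof cf;
    SendMessage(edit, EM_SETSEL, (WPARAM)pos, (LPARAM)(pos + 1));
    SendMessage(edit, EM_GETCHARFORMAT, SCF_SELECTION, (LPARAM)&cf);
    if ((cf.dwMask & CFM_BOLD) && (cf.dwEffects & CFE_BOLD))
        flags |= FMT_BOLD;
    if ((cf.dwMask & CFM_UNDERLINE) && (cf.dwEffects & CFE_UNDERLINE))
        flags |= FMT_UNDERLINE;
    return flags;
}
#endif
```

On Windows the call would look something like `collect_runs(richedit_char_flags, hwndEdit, length, runs, max_runs)`, where `hwndEdit` is your control's handle. Each resulting run tells you where a stretch of bold and/or underlined text starts and ends.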


If you wanted to parse the RTF data, then you should read the section of MSDN entitled "Rich Text Format (RTF) Specification, version 1.6".
If you don't have MSDN, I can mail you the section.


Try opening an RTF file in a text editor and you'll see the format is very simple. It's a header followed by text with embedded control codes.

All control codes start with a "\".
Bold on/off is "\b" and "\b0" respectively.
Underline on/off is "\ul" and "\ulnone" respectively.
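A control code sequence is terminated by a space " " (which is not part of the text) or by any other character that cannot belong to the control word or its numeric parameter.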
The \ character is represented by "\\"
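
The rules above can be turned into a small scanner. What follows is a rough sketch in plain C, not a full RTF parser: `rtf_extract` is a hypothetical helper name, and the handling of groups (`{` and `}`), of `\plain`, `\par`, `\line` and `\tab`, and the skipping of header groups such as `\fonttbl` go beyond the list above and are assumptions based on the general RTF syntax. Hex escapes (`\'hh`) are skipped rather than decoded, and Unicode escapes are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { RTF_BOLD = 1, RTF_UNDERLINE = 2 };
#define RTF_MAX_DEPTH 32

/* Turns one formatting bit on or off. */
static unsigned char set_bit(unsigned char cur, unsigned char bit, int on)
{
    return on ? (unsigned char)(cur | bit) : (unsigned char)(cur & ~bit);
}

/* Hypothetical helper: copies the plain text of rtf into text and stores one
   RTF_* flag byte per character in flags. Both buffers hold cap bytes.
   Returns the number of characters written; text is NUL-terminated. */
size_t rtf_extract(const char *rtf, char *text, unsigned char *flags, size_t cap)
{
    unsigned char stack[RTF_MAX_DEPTH]; /* formatting saved at each '{' */
    unsigned char cur = 0;              /* formatting of the next character */
    int depth = 0;                      /* current group nesting level */
    int skip_depth = 0;                 /* nonzero while inside a header group */
    size_t n = 0;
    const char *p = rtf;

    assert(rtf != NULL && text != NULL && flags != NULL && cap > 0);
    while (*p != '\0' && n + 1 < cap) {
        char out = '\0';                /* character to emit, if any */
        if (*p == '{') {
            if (depth < RTF_MAX_DEPTH) stack[depth] = cur;
            depth++;
            p++;
        } else if (*p == '}') {
            if (depth > 0) {
                depth--;
                if (depth < RTF_MAX_DEPTH) cur = stack[depth];
            }
            if (skip_depth > depth) skip_depth = 0;
            p++;
        } else if (*p == '\\' && isalpha((unsigned char)p[1])) {
            char word[16];
            size_t w = 0;
            int off = 0;                /* set by a zero parameter, e.g. \b0 */
            for (p++; isalpha((unsigned char)*p); p++)
                if (w + 1 < sizeof word) word[w++] = *p;
            word[w] = '\0';
            if (*p == '-' || isdigit((unsigned char)*p)) {
                char *end;
                long param = strtol(p, &end, 10);
                off = (end != p && param == 0);
                p = end;
            }
            if (*p == ' ') p++;         /* the delimiter space is not text */
            if (strcmp(word, "b") == 0) cur = set_bit(cur, RTF_BOLD, !off);
            else if (strcmp(word, "ul") == 0) cur = set_bit(cur, RTF_UNDERLINE, !off);
            else if (strcmp(word, "ulnone") == 0) cur = set_bit(cur, RTF_UNDERLINE, 0);
            else if (strcmp(word, "plain") == 0) cur = 0;
            else if (strcmp(word, "par") == 0 || strcmp(word, "line") == 0) out = '\n';
            else if (strcmp(word, "tab") == 0) out = '\t';
            else if (skip_depth == 0 &&
                     (strcmp(word, "fonttbl") == 0 || strcmp(word, "colortbl") == 0 ||
                      strcmp(word, "stylesheet") == 0 || strcmp(word, "info") == 0))
                skip_depth = depth;     /* ignore text until this group closes */
        } else if (*p == '\\') {
            char c = p[1];
            if (c == '\\' || c == '{' || c == '}') out = c;  /* escaped literal */
            else if (c == '*' && skip_depth == 0) skip_depth = depth;
            if (c == '\'' && p[2] != '\0' && p[3] != '\0') p += 4; /* skip \'hh */
            else p += (c != '\0') ? 2 : 1;
        } else {
            if (*p != '\r' && *p != '\n') out = *p;  /* raw line breaks are not text */
            p++;
        }
        if (out != '\0' && skip_depth == 0) {
            text[n] = out;
            flags[n] = cur;
            n++;
        }
    }
    text[n] = '\0';
    return n;
}
```

The `flags` array lines up with `text` index for index, so a bold or underlined stretch starts wherever the corresponding bit turns on and ends where it turns off. The input would be whatever you collected through your EM_STREAMOUT callback.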

Rob
 
AlexVirochovsky commented:
I don't recommend writing an RTF parser. I've written
one and it is a real headache.
>> Do I need to set the selection to one character and do this for each character position?

I don't see anything wrong with this. It may be a bit slow, but you can write it in 10 minutes. A parser is a MINIMUM of 2 weeks of work,
and after that you will find text that uses some additional tags and
have to start all over again...
But if you still want them, here are the RTF docs:
http://www.wotsit.org/wtext/rtf15.zip (Rich Text File Format v1.5)
http://www.wotsit.org/wtext/rtfadd97.zip (Word 97 Addendum)
 
robpitt commented:
It really depends on exactly what Marvin wants to do.

If he has text in a control and then needs to establish the formatting of a particular character, then yes, EM_GETCHARFORMAT is the thing for him.

Using EM_GETCHARFORMAT for a more complex situation may work, or it may be prohibitively expensive in terms of performance.

Writing a full RTF parser would indeed be a task and a half, but remember, he only needs to extract bold and underline information.

Rob
 
chandas commented:
Rob, would he really have to call EM_GETCHARFORMAT for each letter? I think maybe he could select a word at a time using SetSel(...). Maybe he could just look for space characters to break at each word and then select a word at a time.

Just my $0.02

Chandas
 
marvinm (author) commented:
This should be the easiest way to accomplish my goal.  There will not be much text to get, so efficiency is not a major concern.
Thanks to all - mm
 
chandas commented:
It would help if you awarded points for the comment that helped you out
 
marvinm (author) commented:
I DID! AlexVirochovsky suggested using EM_GETCHARFORMAT, which is easier than writing an RTF parser. robpitt's comments were informative, and certainly the more thorough approach, but that is not the direction I am taking. I have not implemented this yet as I have been pulled into a different project, but this will be my approach when I return to it.
Question has a verified solution.
