How to find sections and subsections in a Word document

Example:

Assume a document with the following sections and subsections:

1   Example
      1.1      Example
      1.2      Example
            1.2.1      Example
2      Example
      2.1      Example
            2.1.1      Example
                  •      Level 1
                        o      Level 2
                              ▪      Level 3
                                    ▪      Level 4
            2.1.2       Example
                  2.1.2.1      Example
                        2.1.2.1. a      Example
                              2.1.2.1. a.1 Example



How can I count how many subsections each section has, using C# or ASP.NET?
mannevenu26 Asked:

Dr. Klahn, Principal Software Engineer, Commented:
It is difficult to parse Microsoft Word format, as there have been many releases and versions of it.  What works in one release may not work in the next.

However, if you open the Word document and save it in Rich Text Format (RTF), parsing becomes far easier, because RTF is a human-readable text markup format.  Search for the section separator tags (which one you need depends on how the sections were created, so I can't be specific here), count them, and you're done.

In RTF, the file looks like this:

{\rtf1\ansi\ansicpg1252\uc1 \deff0\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f16\froman\fcharset238\fprq2 Times New Roman CE;}{\f17\froman\fcharset204\fprq2 Times New Roman Cyr;}
{\f19\froman\fcharset161\fprq2 Times New Roman Greek;}{\f20\froman\fcharset162\fprq2 Times New Roman Tur;}{\f21\froman\fcharset186\fprq2 Times New Roman Baltic;}}{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;
\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;
\red128\green128\blue128;\red192\green192\blue192;}{\stylesheet{\widctlpar\adjustright \fs20\cgrid \snext0 Normal;}{\*\cs10 \additive Default Paragraph Font;}}{\*\listtable{\list\listtemplateid67698703\listsimple{\listlevel\levelnfc0\leveljc0\levelfollow0
\levelstartat1\levelspace0\levelindent0{\leveltext\'02\'00.;}{\levelnumbers\'01;}\fi-360\li360\jclisttab\tx360 }{\listname ;}\listid888301111}}{\*\listoverridetable{\listoverride\listid888301111\listoverridecount0\ls1}}{\info
{\title Page \{ PAGE \} of Section \{ SECTION \}}{\author Windows User}{\operator Windows User}{\creatim\yr2014\mo10\dy14\hr21\min24}{\revtim\yr2014\mo10\dy14\hr21\min24}{\version2}{\edmins0}{\nofpages2}{\nofwords16}{\nofchars94}{\*\company  }
{\nofcharsws115}{\vern59}}\widowctrl\ftnbj\aenddoc\formshade\viewkind4\viewscale75\pgbrdrhead\pgbrdrfoot \fet0\sectd \linex0\endnhere\sectdefaultcl {\*\pnseclvl1\pnucrm\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl2
\pnucltr\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl3\pndec\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl4\pnlcltr\pnstart1\pnindent720\pnhang{\pntxta )}}{\*\pnseclvl5\pndec\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl6
\pnlcltr\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl7\pnlcrm\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl8\pnlcltr\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl9\pnlcrm\pnstart1\pnindent720\pnhang
{\pntxtb (}{\pntxta )}}\pard\plain \widctlpar\adjustright \fs20\cgrid {Page \{ PAGE \} of Section \{ SECTION \}
\par \sect }\sectd \sbknone\linex0\endnhere\sectdefaultcl \pard\plain \widctlpar\adjustright \fs20\cgrid {
\par Page \{ PAGE \} of Section \{ SECTION \}
\par \sect }\sectd \linex0\endnhere\sectdefaultcl \pard\plain \widctlpar\adjustright \fs20\cgrid {
\par Page \{ PAGE \} of Section \{ SECTION \}
\par }}


In this example, search for occurrences of "\par \sect" to count the section breaks.  The sample contains two of them, which separate its three sections.
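As a rough illustration of that search in C#, here is a minimal sketch. The class and method names (RtfSectionCounter, CountSectionBreaks, CountSections) are made up for this example. It looks for the control word \sect on its own rather than the exact text "\par \sect", so it does not depend on the spacing Word happened to write. It also assumes, as in the sample above, that the last section is not followed by \sect, so the section count is the number of breaks plus one. It does not handle an escaped backslash followed by the letters "sect" in body text, and it counts Word section breaks, not numbered headings.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of any library.
public static class RtfSectionCounter
{
    // Matches the control word \sect by itself. The negative lookahead
    // rejects longer control words such as \sectd and \sectdefaultcl.
    private static readonly Regex SectionBreak =
        new Regex(@"\\sect(?![a-z])", RegexOptions.Compiled);

    // Number of \sect control words (section breaks) in the RTF text.
    public static int CountSectionBreaks(string rtf)
    {
        if (rtf == null) throw new ArgumentNullException(nameof(rtf));
        return SectionBreak.Matches(rtf).Count;
    }

    // Assumption: the final section has no trailing \sect, so the number
    // of sections is one more than the number of breaks.
    public static int CountSections(string rtf)
    {
        return CountSectionBreaks(rtf) + 1;
    }
}
```

To try it, read the saved file into a string, for example with File.ReadAllText from System.IO, and pass the result to RtfSectionCounter.CountSections.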