Solved

How to find sections and Subsections present in Word Document

Posted on 2014-10-14
Last Modified: 2014-10-31

Ex:

Assume a document with the following sections and subsections:

1   Example
      1.1      Example
      1.2      Example
            1.2.1      Example
2      Example
      2.1      Example
            2.1.1      Example
                  •      Level 1
                        o      Level 2
                                    Level 3
                                          Level 4
            2.1.2       Example
                  2.1.2.1      Example
                        2.1.2.1. a      Example
                              2.1.2.1. a.1 Example



How can I count how many subsections each section has, using C# or ASP.NET?
Question by:mannevenu26
LVL 27

Accepted Solution

by:
Dr. Klahn earned 500 total points
ID: 40381371
It is difficult to parse Microsoft Word format, as there have been many releases and versions of it.  What works in one release may not work in the next.

However, if you open the Word document and save it in Rich Text Format, parsing the resulting RTF file is far easier.  RTF is a human-readable text markup format.  Search for the section separator tags (which one you need to search for depends on how the sections were created, so I can't be specific here), count them, and you're done.
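
Since the right separator tag depends on how the document was built, one way to find candidates is to tally every control word in the saved RTF file and look at what actually occurs. The following is only a rough sketch in Python, not a full RTF parser. It assumes a control word is a backslash followed by lowercase letters, and the names `tally_control_words` and `sample` are made up for illustration.

```python
import re
from collections import Counter

# An escaped backslash (two backslashes) is matched first so that it is
# skipped; otherwise match a backslash followed by lowercase letters.
# Assumption: this simple rule is enough to spot control words.
_TOKEN = re.compile(r"\\\\|\\([a-z]+)")


def tally_control_words(rtf_text):
    """Return a Counter mapping each control word name to its occurrences."""
    counts = Counter()
    for match in _TOKEN.finditer(rtf_text):
        word = match.group(1)  # None when the match was an escaped backslash
        if word:
            counts[word] += 1
    return counts


# Hypothetical, heavily shortened RTF fragment used only for demonstration.
sample = r"{\rtf1\ansi\pard\plain {Intro\par \sect }\sectd \pard\plain {Body\par }}"
counts = tally_control_words(sample)

for name, n in counts.most_common():
    print(name, n)
```

Reading through the tally of a real file should show which separator tags are present and how often, which narrows down what to search for.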

In RTF, the file looks like this:

{\rtf1\ansi\ansicpg1252\uc1 \deff0\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f16\froman\fcharset238\fprq2 Times New Roman CE;}{\f17\froman\fcharset204\fprq2 Times New Roman Cyr;}
{\f19\froman\fcharset161\fprq2 Times New Roman Greek;}{\f20\froman\fcharset162\fprq2 Times New Roman Tur;}{\f21\froman\fcharset186\fprq2 Times New Roman Baltic;}}{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;
\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;
\red128\green128\blue128;\red192\green192\blue192;}{\stylesheet{\widctlpar\adjustright \fs20\cgrid \snext0 Normal;}{\*\cs10 \additive Default Paragraph Font;}}{\*\listtable{\list\listtemplateid67698703\listsimple{\listlevel\levelnfc0\leveljc0\levelfollow0
\levelstartat1\levelspace0\levelindent0{\leveltext\'02\'00.;}{\levelnumbers\'01;}\fi-360\li360\jclisttab\tx360 }{\listname ;}\listid888301111}}{\*\listoverridetable{\listoverride\listid888301111\listoverridecount0\ls1}}{\info
{\title Page \{ PAGE \} of Section \{ SECTION \}}{\author Windows User}{\operator Windows User}{\creatim\yr2014\mo10\dy14\hr21\min24}{\revtim\yr2014\mo10\dy14\hr21\min24}{\version2}{\edmins0}{\nofpages2}{\nofwords16}{\nofchars94}{\*\company  }
{\nofcharsws115}{\vern59}}\widowctrl\ftnbj\aenddoc\formshade\viewkind4\viewscale75\pgbrdrhead\pgbrdrfoot \fet0\sectd \linex0\endnhere\sectdefaultcl {\*\pnseclvl1\pnucrm\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl2
\pnucltr\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl3\pndec\pnstart1\pnindent720\pnhang{\pntxta .}}{\*\pnseclvl4\pnlcltr\pnstart1\pnindent720\pnhang{\pntxta )}}{\*\pnseclvl5\pndec\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl6
\pnlcltr\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl7\pnlcrm\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl8\pnlcltr\pnstart1\pnindent720\pnhang{\pntxtb (}{\pntxta )}}{\*\pnseclvl9\pnlcrm\pnstart1\pnindent720\pnhang
{\pntxtb (}{\pntxta )}}\pard\plain \widctlpar\adjustright \fs20\cgrid {Page \{ PAGE \} of Section \{ SECTION \}
\par \sect }\sectd \sbknone\linex0\endnhere\sectdefaultcl \pard\plain \widctlpar\adjustright \fs20\cgrid {
\par Page \{ PAGE \} of Section \{ SECTION \}
\par \sect }\sectd \linex0\endnhere\sectdefaultcl \pard\plain \widctlpar\adjustright \fs20\cgrid {
\par Page \{ PAGE \} of Section \{ SECTION \}
\par }}

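In this example, search for occurrences of "\par \sect" to count the section breaks.  The sample contains two of them separating three sections, so the number of sections is the count plus one.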
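
The counting step can be sketched as below. This is a minimal illustration in Python rather than C#, under the assumption that the file was saved as RTF and that every section break appears as a `\sect` control word, as in the sample above. The function names and the shortened `sample` text are hypothetical. Note that `\sect` must be matched as a whole word, because the sample also contains `\sectd` and `\sectdefaultcl`, which are different tags.

```python
import re

# Match an escaped backslash first so it is skipped, otherwise match \sect
# only when no further letter follows (so \sectd and \sectdefaultcl are
# not counted).
_SECT = re.compile(r"\\\\|\\(sect)(?![a-z])")


def count_section_breaks(rtf_text):
    """Count \\sect control words in the RTF text."""
    return sum(1 for m in _SECT.finditer(rtf_text) if m.group(1))


def count_sections(rtf_text):
    """The last section has no trailing \\sect, so sections = breaks + 1."""
    return count_section_breaks(rtf_text) + 1


# Shortened stand-in for the tail of the RTF sample (hypothetical text).
sample = (
    r"\sectd \linex0\endnhere\sectdefaultcl \pard\plain {Page 1" "\n"
    r"\par \sect }\sectd \sbknone\linex0\endnhere\sectdefaultcl \pard\plain {" "\n"
    r"\par Page 2" "\n"
    r"\par \sect }\sectd \linex0\endnhere\sectdefaultcl \pard\plain {" "\n"
    r"\par Page 3" "\n"
    r"\par }}"
)

print(count_section_breaks(sample), count_sections(sample))
```

The pattern uses only ordinary regex syntax (an alternation and a negative lookahead), so it should carry over to a C# regex with little change. Keep in mind that this counts Word section breaks; numbered headings such as 1.1 or 2.1.1 are stored differently and would need a different tag.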