Formatting and layout in Word or PDF

Shailesh Shinde
Hi All,

We have a requirement to adjust the formatting and layout (e.g., fonts, line breaks in tables, columns, and margins) of Word or PDF files using a script (Perl, Python, or Ruby).

Can you please suggest any references or sample code for this, or recommend which scripting language would suit such requirements?

Thanks,
Shail
Walter Ritzel, Senior Software Engineer

Commented:
The best starting point I know is this: https://automatetheboringstuff.com/chapter13/
It uses Python and gives pointers to Python libraries that can handle PDF and Word documents.
Colleen Kayter, 4D Assets

Commented:
Why are you scripting vs. applying a theme that sets all that on the fly? Just curious. With security restrictions becoming more prevalent, I would think that scripting might not work well everywhere.
Shailesh Shinde, Localization Engineering & Automation

Author

Commented:
Hi Colleen Kayter,

The reason for scripting is to include the script in an existing automated processing workflow.
The script will read a config file that contains settings such as
font-size=##
font-name=###
....
and then modify the input Word or PDF files accordingly.

Thanks,
Shail
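As a rough illustration of the config-driven approach described above, a minimal Python sketch could read the settings like this. The function name `parse_format_config` and the sample values are hypothetical, and the `key=value` format is assumed from the `font-size=##` and `font-name=###` lines in the post; lines without an `=` are simply skipped.

```python
def parse_format_config(text):
    """Parse 'key=value' lines into a dict, skipping blank and malformed lines."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Ignore empty lines, comment lines, and anything without '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical sample values, for illustration only.
sample = "font-size=11\nfont-name=Calibri\n"
settings = parse_format_config(sample)
font_size_pt = float(settings["font-size"])
print(settings["font-name"], font_size_pt)
```

The resulting dictionary can then be handed to whatever code edits the document, so the workflow only needs to change the config file, not the script.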

Shailesh Shinde, Localization Engineering & Automation

Author

Commented:
Hi All,

Can formatting be applied at the text level and at the page level, for example to a specific paragraph, a set of paragraphs, or a range of pages? Is this possible using Perl or Python scripts?

Thanks,
Shail
4D Assets
Commented:
Shailesh, different types of formatting are applied to and stored on different elements, depending on the type of formatting.

Fonts/font attributes (everything you see in the Fonts group on the Home tab) are applied at text level, with a default being stored at paragraph level.

Bullets, justification, line spacing, keep together, keep with next, tabs, etc. (everything in the Paragraph dialog) are stored at paragraph level.

Nothing is stored at the page level, but SECTIONS... Sections can force page breaks or they can be continuous, allowing you to format different parts of the same page in different ways. Margins, orientation, columns (everything in the Page Setup group of the Layout tab) PLUS headers and footers, page numbering, and backgrounds are stored at SECTION level.

Off the top of my head, I think the only things stored at document level are the theme, available styles, tables of contents, and bookmarks/cross-references.
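To make the text / paragraph / section split concrete, here is a minimal Python sketch using only the standard library. It assumes the usual WordprocessingML layout: run properties in `w:rPr` (font size `w:sz` is in half-points), paragraph properties in `w:pPr`, and section properties in `w:sectPr` (margins in `w:pgMar` are in twentieths of a point, 1440 per inch). The helper names and the tiny `SAMPLE` document are hypothetical, and the sketch does only minimal handling of element ordering, which real documents may require.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def q(tag):
    """Return the namespace-qualified name for a WordprocessingML tag."""
    return "{%s}%s" % (W, tag)


# Hypothetical stand-in for the contents of document.xml.
SAMPLE = (
    '<w:document xmlns:w="%s"><w:body>'
    '<w:p><w:pPr><w:jc w:val="left"/></w:pPr>'
    '<w:r><w:rPr><w:sz w:val="20"/></w:rPr><w:t>Hello</w:t></w:r></w:p>'
    '<w:sectPr><w:pgMar w:top="1440" w:right="1440" '
    'w:bottom="1440" w:left="1440"/></w:sectPr>'
    '</w:body></w:document>' % W
)


def set_run_font(root, name, size_pt):
    """Text level: set font name and size on every run."""
    for run in root.iter(q("r")):
        rpr = run.find(q("rPr"))
        if rpr is None:
            rpr = ET.Element(q("rPr"))
            run.insert(0, rpr)
        fonts = rpr.find(q("rFonts"))
        if fonts is None:
            fonts = ET.Element(q("rFonts"))
            rpr.insert(0, fonts)
        fonts.set(q("ascii"), name)
        fonts.set(q("hAnsi"), name)
        size = rpr.find(q("sz"))
        if size is None:
            size = ET.SubElement(rpr, q("sz"))
        size.set(q("val"), str(int(round(size_pt * 2))))  # half-points


def set_paragraph_alignment(root, value):
    """Paragraph level: set justification ('left', 'center', 'both', ...)."""
    for para in root.iter(q("p")):
        ppr = para.find(q("pPr"))
        if ppr is None:
            ppr = ET.Element(q("pPr"))
            para.insert(0, ppr)
        jc = ppr.find(q("jc"))
        if jc is None:
            jc = ET.SubElement(ppr, q("jc"))
        jc.set(q("val"), value)


def set_section_margins(root, inches):
    """Section level: set all four page margins."""
    twips = str(int(round(inches * 1440)))
    for sect in root.iter(q("sectPr")):
        margins = sect.find(q("pgMar"))
        if margins is None:
            margins = ET.SubElement(sect, q("pgMar"))
        for side in ("top", "right", "bottom", "left"):
            margins.set(q(side), twips)


root = ET.fromstring(SAMPLE)
set_run_font(root, "Arial", 12)
set_paragraph_alignment(root, "both")
set_section_margins(root, 0.75)
print(ET.tostring(root, encoding="unicode"))
```

Restricting any of these loops to a subset of paragraphs or sections is how a script would target "a specific paragraph" rather than the whole document; a range of pages has no direct equivalent, since pages are not stored.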

If you want to see all the parts, make a copy of one of your documents, replace the .docx extension with .zip, and view the contents of the zip file. The file named document.xml (in the word folder) contains your text.
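The same inspection can be done from a script without renaming anything, since Python's standard `zipfile` module opens a .docx directly. This is only a sketch: the helper names are hypothetical, and to stay self-contained it builds a tiny in-memory stand-in archive with placeholder content instead of reading a real document.

```python
import io
import zipfile


def list_parts(docx):
    """Return the names of all parts inside a .docx (path or file object)."""
    with zipfile.ZipFile(docx) as package:
        return package.namelist()


def read_main_part(docx):
    """Return the raw XML text of the main document part."""
    with zipfile.ZipFile(docx) as package:
        return package.read("word/document.xml").decode("utf-8")


def make_stand_in():
    """Build an in-memory archive with placeholder parts, for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<document/>")
        package.writestr("word/styles.xml", "<styles/>")
    buffer.seek(0)
    return buffer


demo = make_stand_in()
print(list_parts(demo))
demo.seek(0)
print(read_main_part(demo))
```

With a real file, passing its path (for example a hypothetical `input.docx`) to `list_parts` would show the full set of parts, including the styles and theme.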

As for programming with Python or Perl, I'll leave that answer to one of the coder experts.
Shailesh Shinde, Localization Engineering & Automation

Author

Commented:
Hi,

Waiting for comments from the coder experts.

Thanks,
Shail
Most Valuable Expert 2011
Top Expert 2016

Commented:
This looks at things from a PHP perspective, so it may or may not fit your environment, but since PHP is free and open-source it could be worth considering.

PHP has two well-supported libraries for building PDF documents: FPDF and TCPDF. Both are self-contained, object-oriented libraries. The documentation is pretty good, and they have online examples. I have never used them to import and adjust pre-existing PDF files, but others in the E-E forums claim this can be done. Most of my work has been to take external inputs (forms, databases, API data) and build PDF documents. For this kind of work, either library will work well, giving you access to a variety of fonts, colors, layouts, and image placements.
