# Calculate the cost for word wrapping algorithm in Python using dynamic programming

I am trying to implement word wrapping in Python using dynamic programming.

Given a sequence of words from a file, and a limit on the number of characters that can be put in one line (line width), put line breaks in the given sequence such that the lines are printed neatly. Assume that the length of each word is smaller than the line width This problem must be solved using dynamic programming.

This function should take in a list of strings (each string in the list is a word from the file) and an integer M (which represents the max number of characters per line, including spaces). This function should return the cost (an integer that represents the optimal value of the function, as seen in dynamic programming) and a string that contains the entire text from the file as one string with newline characters. It should not end with a blank line.

I have not implemented the word wrapping functionality yet, so the function currently only returns the cost, not the string of text contents.

I implemented an algorithm to calculate the cost of the function based on the pseudocode on page 15-38 of this document (http://www-bcf.usc.edu/~shanghua/teaching/Spring2010/public_html/files/HW3_Solutions_A.pdf) as well as an implementation in C++ (http://www.geeksforgeeks.org/dynamic-programming-set-18-word-wrap/).

My main issue is that I cannot get my function to compute the cost correctly using dynamic programming. I believe I have implemented the function correctly using dynamic programming, but I still cannot arrive to the correct answer. When I run printingneatly.py, it prints out the cost. I am not sure what specifically is wrong with my algorithm.

When you run the print_test_neatly.py file, the correct output for the cost of the function for each particular sample text file should be generated into a log file. The output is shown below (this can also be viewed in the output.log file that I have attached):

kubla_kahn.txt
cost =  1545
true cost =  1545

magna_carta.txt
cost =  13910
true cost =  13910

How can I optimize my word wrapping/cost algorithm so that it calculates the cost both correctly and efficiently (so that it takes only a couple of minutes to calculate it rather than an hour)?

printingneatly.py
kubla-kahn.txt
magna-carta.txt
output.log
print-neatly-test.py
###### Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Commented:
Do you have to use that algorithm?

I transformed the text into CrLf delimited strings, replacing the existing (CrLf) line breaks with space characters.  After applying the following regular expression, I get the word wrap results in the captured groups.
``````(.{1,50})(?: |\r\n|\$)
``````
0
Commented:
You can even use the replace method to insert your linefeed characters
0
Commented:
An even simpler solution would be to use the textwrap.py module.

Example:
``````import textwrap

f.close
for l in textwrap.wrap(sData,40):
print l
``````

Of course, you would package this into a form where you would pass the desired width to the wrap() method.
0
Author Commented:
I have to make the word wrap function from scratch, though. I can't use a built-in library.
0
Commented:
Sounds like coursework to me.  There is limited help for homework problems due to academic integrity issues.
0
Author Commented:
I got it working using Java I have much more experience with and and managed to cross-code into Python.
Now it works. At least I found a solution.
0

Experts Exchange Solution brought to you by