I am trying to implement word wrapping in Python using dynamic programming.
Given a sequence of words from a file, and a limit on the number of characters that can be put in one line (line width), put line breaks in the given sequence such that the lines are printed neatly. Assume that the length of each word is smaller than the line width. This problem must be solved using dynamic programming.
This function should take in a list of strings (each string in the list is a word from the file) and an integer M (which represents the max number of characters per line, including spaces). This function should return the cost (an integer that represents the optimal value of the function, as seen in dynamic programming) and a string that contains the entire text from the file as one string with newline characters. It should not end with a blank line.
I have not implemented the word wrapping functionality yet, so the function currently only returns the cost, not the string of text contents.
I implemented an algorithm to calculate the cost of the function based on the pseudocode on page 15-38 of this document (http://www-bcf.usc.edu/~shanghua/teaching/Spring2010/public_html/files/HW3_Solutions_A.pdf) as well as an implementation in C++ (http://www.geeksforgeeks.org/dynamic-programming-set-18-word-wrap/).
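For concreteness, here is a minimal sketch of the kind of recurrence I understand those references to describe. It is not my actual code. The function name `print_neatly` is made up for illustration, and the cost model is an assumption: each line except the last is charged the cube of its unused trailing characters, and the last line is free. The references may use a different exponent or treat the last line differently.

```python
def print_neatly(words, M):
    """Return (cost, text) for a width-M layout of `words`.

    Assumed cost model (not confirmed by the assignment): every line
    except the last costs (unused trailing characters) ** 3.
    """
    n = len(words)
    INF = float("inf")
    # best[j] = minimal cost of laying out the first j words.
    best = [0] + [INF] * n
    # start[j] = index of the first word on the line that ends with word j-1.
    start = [0] * (n + 1)

    for j in range(1, n + 1):
        width = -1  # width of words[i-1:j] joined by single spaces
        # Grow the last line backwards one word at a time.
        for i in range(j, 0, -1):
            width += len(words[i - 1]) + 1
            if width > M:
                break  # adding more words can only make the line longer
            line_cost = 0 if j == n else (M - width) ** 3
            if best[i - 1] + line_cost < best[j]:
                best[j] = best[i - 1] + line_cost
                start[j] = i - 1

    # Walk the start[] links backwards to rebuild the lines.
    lines = []
    j = n
    while j > 0:
        i = start[j]
        lines.append(" ".join(words[i:j]))
        j = i
    return best[n], "\n".join(reversed(lines))


cost, text = print_neatly(["aaa", "bb", "cc", "ddddd"], 6)
print(cost)
print(text)
```

Because the inner loop stops as soon as the candidate line exceeds M characters, it examines at most about M/2 words per position, so the whole table is filled in roughly O(n·M) steps rather than O(n²). Joining the lines with `"\n"` also means the returned string does not end with a blank line.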
My main issue is that my function does not compute the cost correctly. I believe I have followed the dynamic programming recurrence, but I still cannot arrive at the correct answer. When I run printingneatly.py, it prints out the cost, but I am not sure what specifically is wrong with my algorithm.
When you run print_test_neatly.py, it should write the correct cost for each sample text file to a log file. The expected output is shown below (it can also be viewed in the output.log file that I have attached):
cost = 1545
true cost = 1545
bad lines = 0
cost = 13910
true cost = 13910
bad lines = 0
How can I optimize my word wrapping/cost algorithm so that it calculates the cost both correctly and efficiently (so that it takes only a couple of minutes to calculate it rather than an hour)?
Thank you for your help.