  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 507

Splitting a long string in Java

Hello Experts,

I need to split a VERY long string in Java at every 132 characters. But here's the trick: I need to split the String at the last whitespace within those 132 characters, so it's possible that a new String may be shorter than 132 characters. You see, if I have a long sentence and the 132nd character is in the middle of a word, then I don't want to chop that word. Rather, I'd like to keep the whole word together. I can't go over 132 characters, but I can go under.
Ex:
String x = "..... impossible to realize."
Let's say the 132nd character falls right after the 't' in the word "to". Now I don't want to split the 't' and 'o'. I'd rather stop after the word "impossible" and begin the next line with "to".

Any expert out there that can help?
Asked by: Greengiants15

Kamaraj Subramanian (Application Support Analyst) commented:
Try this.

Not tested, because I am at the office and do not have a Java compiler.
import java.util.StringTokenizer;

public class test
{
  public static void main (String args[])
  {
    String s[] = new String[5];
	int i =0;
    String str = "Hello Experts,I need to split a VERY long string in Java at every 132 characters.  But here's the trick:  I need to split the String at the last instance of a whitespace of the 132 characters.  So it's possible that the new String may be less than 132 characters.  You see, if I have a long sentence and the 132nd character is in the middle";
    StringTokenizer token = new StringTokenizer(str, " ");
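    while(token.hasMoreTokens())
      // Bug: the elements of s are never initialized, so s[i].length()
      // throws a NullPointerException on the first pass.
      if(s[i].length() < 132)
        s[i] = s[i] + token.nextToken();
      else
        i = i+1;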
  }
}

 
Kamaraj Subramanian (Application Support Analyst) commented:
It needs some more conditions after the concatenation.

I will work on it and let you know.
 
Greengiants15 (Author) commented:
Thanks, I'm not sure why we need the String array but will try your suggestion.
 
Greengiants15 (Author) commented:
Tried your suggestion but nothing is working. Getting NullPointerExceptions. I'll investigate further.
 
Kamaraj Subramanian (Application Support Analyst) commented:
Here it is:
import java.util.StringTokenizer;

public class test
{
  public static void main (String args[])
  {
    String temp = "", pre_temp = "", next_token = "";
    String str = "Hello Experts,I need to split a VERY long string in Java at every 132 characters.  But here's the trick:  I need to split the String at the last instance of a whitespace of the 132 characters.  So it's possible that the new String may be less than 132 characters.  You see, if I have a long sentence and the 132nd character is in the middle Hello Experts,I need to split a VERY long string in Java at every 132 characters.  But here's the trick:  I need to split the String at the last instance of a whitespace of the 132 characters.  So it's possible that the new String may be less than 132 characters.  You see, if I have a long sentence and the 132nd character is in the middle";
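    System.out.println(str.length());
    StringTokenizer token = new StringTokenizer(str, " ");
    while(token.hasMoreTokens())
    {
      if(temp.length() < 132)
      {
        // Still under the limit: remember the line so far, then append the next word.
        pre_temp = temp;
        next_token = token.nextToken();
        temp = temp + next_token + " ";
      }
      else
      {
        if(temp.length() > 132)
        {
          // Over the limit: print the line without the last word and carry that word over.
          System.out.println(pre_temp + "\n\n");
          System.out.println(pre_temp.length() + "\n\n");
          temp = next_token + " ";
        }
        else
        {
          // Exactly 132 characters, counting the trailing space.
          // Note: this prints pre_temp, so the last appended word is dropped.
          System.out.println(temp.length() + "\n\n");
          System.out.println(pre_temp + "\n\n");
          temp = "";
        }
      }
    }
    // Note: whatever is left in temp when the tokens run out is never printed.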
  }
}

 
Kamaraj Subramanian (Application Support Analyst) commented:
Try this:
import java.util.StringTokenizer;

public class test
{
  public static void main (String args[])
  {
    String temp = "", pre_temp = "", next_token = "";
    String str = "Hello Experts,I need to split a VERY long string in Java at every 132 characters.  But here's the trick:  I need to split the String at the last instance of a whitespace of the 132 characters.  So it's possible that the new String may be less than 132 characters.  You see, if I have a long sentence and the 132nd character is in the middle Hello Experts,I need to split a VERY long string in Java at every 132 characters.  But here's the trick:  I need to split the String at the last instance of a whitespace of the 132 characters.  So it's possible that the new String may be less than 132 characters.  You see, if I have a long sentence and the 132nd character is in the middle";
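    System.out.println(str.length());
    StringTokenizer token = new StringTokenizer(str, " ");
    while(token.hasMoreTokens())
    {
      if(temp.length() < 132)
      {
        // Still under the limit: remember the line so far, then append the next word.
        pre_temp = temp;
        next_token = token.nextToken();
        temp = temp + next_token + " ";
      }
      else
      {
        if(temp.length() > 132)
        {
          // Over the limit: print the line without the last word and carry that word over.
          System.out.println(pre_temp + "\n\n");
          System.out.println(pre_temp.length() + "\n\n");
          temp = next_token + " ";
        }
        else
        {
          // Exactly 132 characters, counting the trailing space: print the full line.
          System.out.println(temp.length() + "\n\n");
          System.out.println(temp + "\n\n");
          temp = "";
        }
      }
    }
    // Note: whatever is left in temp when the tokens run out is never printed.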
  }
}

 
softwarepearls_com commented:
The approach shown so far is a garbage collector's nightmare: it creates a huge number of temporary strings on each iteration. Given that you talked about a HUGE input string, this could be a problem.

If it is, then I'd opt for an approach that uses a simple character index, with logic which tests whether the index is a valid break (or cut) position, and if not, scans backwards one character at a time, until it hits a valid break position. This approach doesn't involve step-by-step building up of the cut string, and thus avoids the garbage problem.

In pseudocode, something like

start = 0

forever {
  index = start + 132
  if (index >= end of string) {
    emit string fragment from start to end of string
    break
  }
  while ( ! indexIsValid(index)) {
    index--
  }
  emit string fragment from start to index
  start = index
}
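
As a rough illustration, the pseudocode above might be fleshed out in Java along these lines. The class name IndexSplitter, the method split, and the choice of a plain space as the only valid break position are assumptions made for this sketch, not part of the original post. It also emits the final fragment and falls back to a hard cut when a single word is longer than the limit.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexSplitter {

    // Hypothetical helper: a position is a valid break if it holds a space.
    static boolean indexIsValid(String text, int index) {
        return text.charAt(index) == ' ';
    }

    // Splits text into fragments of at most max characters, cutting at spaces.
    static List<String> split(String text, int max) {
        List<String> fragments = new ArrayList<>();
        int start = 0;
        while (text.length() - start > max) {
            int index = start + max;
            // Scan backwards one character at a time for a break position.
            while (index > start && !indexIsValid(text, index)) {
                index--;
            }
            if (index == start) {
                // No space found in this window: hard cut at the limit.
                index = start + max;
                fragments.add(text.substring(start, index));
                start = index;
            } else {
                fragments.add(text.substring(start, index));
                start = index + 1; // skip the space we broke on
            }
        }
        // Emit whatever is left after the last cut.
        if (start < text.length()) {
            fragments.add(text.substring(start));
        }
        return fragments;
    }

    public static void main(String[] args) {
        for (String fragment : split("the quick brown fox jumps", 10)) {
            System.out.println("[" + fragment + "]");
        }
    }
}
```

With a limit of 10, the sample call prints [the quick], [brown fox] and [jumps]; for the question as asked, the limit would be 132. Only the emitted fragments are allocated, since the scan itself works on character indices.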
 
Kamaraj Subramanian (Application Support Analyst) commented:
@Greengiants15:

Any update?
 
Greengiants15 (Author) commented:
Hi, thanks everyone for their input.  I found a much more elegant approach and will be posting the solution shortly after I run through some more tests.
Please stand by for an update.
 
Kamaraj Subramanian (Application Support Analyst) commented:
Waiting :)
 
Greengiants15 (Author) commented:
Here is the solution I came up with. It has been tested and works great (so far).

import java.util.ArrayList;
import java.util.List;

    /**
     * Splits a String into pieces of at most 132 characters, breaking at the
     * last space within each 132-character window so that words stay whole.
     * A single word longer than 132 characters is cut at the limit.
     * @param toSplit the text to split
     * @return the pieces, in order, with surrounding whitespace trimmed
     */
    public static String[] splitAt132(final String toSplit) {

        final int max = 132;
        final int length = toSplit.length();
        List<String> filler = new ArrayList<String>();
        int start = 0;

        while (start < length) {
            int end = Math.min(start + max, length);

            // Only look for a break if more text follows this window.
            if (end < length) {
                int lastSpace = toSplit.lastIndexOf(' ', end);
                if (lastSpace > start) {
                    end = lastSpace;
                }
                // No usable space in the window: keep the hard cut at 'end'.
            }

            String piece = toSplit.substring(start, end).trim();
            if (piece.length() > 0) {
                filler.add(piece);
            }
            start = end;
        }
        return toStringArray(filler.toArray());
    }


    /**
     * Converts an object array to a String array.
     * @param array
     * @return
     */
    public static String[] toStringArray(Object[] array) {
        if (array == null) {
            return null;
        }
        final int length = array.length;
        final String[] returnValue = new String[length];
        for (int i=0; i<length; i++) {
            Object value = array[i];
            returnValue[i] = (value == null ? null : value.toString());
        }
        return returnValue;
    }
 
Kamaraj Subramanian (Application Support Analyst) commented:
What about my code?

Is it not working?
 
phoffric commented:
itkamaraj:
I ran your program to see what would happen with multiple spaces. In C or C++, strtok treats multiple consecutive delimiters as one (whether you like it or not). To test this, I added extra spaces next to some of the existing spaces. Your version consolidated the multiple consecutive spaces into a single space.

One other minor point is that the last lines that printed out were:
characters. So it's possible that the new String may be less than 132 characters. You see, if I have a long sentence and the 132nd


131

So, it appeared that the last sentence got truncated.
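
The space-collapsing behaviour mentioned above can be reproduced with a small sketch, assuming the standard java.util.StringTokenizer with a single space as the delimiter. The class name TokenizerDemo and the method rejoin are made up for this illustration. Because the tokenizer returns no empty tokens for consecutive delimiters, rebuilding the text with single spaces loses the original spacing.

```java
import java.util.StringTokenizer;

public class TokenizerDemo {

    // Rebuilds the text from its tokens, joined by single spaces.
    static String rejoin(String text) {
        StringTokenizer tokens = new StringTokenizer(text, " ");
        StringBuilder out = new StringBuilder();
        while (tokens.hasMoreTokens()) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(tokens.nextToken());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input has runs of two and three spaces between the letters.
        System.out.println(rejoin("a  b   c"));
    }
}
```

The sample call prints "a b c". If the original spacing has to survive, an index-based split on the source string avoids the problem because it never rebuilds the text from tokens.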