Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


regex - Split a large string into lines with a maximum length in Java

String input = "THESE TERMS AND CONDITIONS OF SERVICE (the Terms) ARE A LEGAL AND BINDING AGREEMENT BETWEEN YOU AND NATIONAL GEOGRAPHIC governing your use of this site, www.nationalgeographic.com, which includes but is not limited to products, software and services offered by way of the website such as the Video Player, Uploader, and other applications that link to these Terms (the Site). Please review the Terms fully before you continue to use the Site. By using the Site, you agree to be bound by the Terms. You shall also be subject to any additional terms posted with respect to individual sections of the Site. Please review our Privacy Policy, which also governs your use of the Site, to understand our practices. If you do not agree, please discontinue using the Site. National Geographic reserves the right to change the Terms at any time without prior notice. Your continued access or use of the Site after such changes indicates your acceptance of the Terms as modified. It is your responsibility to review the Terms regularly. The Terms were last updated on 18 July 2011.";

//text copied from http://www.nationalgeographic.com/community/terms/

I want to split this large string into lines, and no line should contain more than MAX_LINE_LENGTH characters.

What I tried so far

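int MAX_LINE_LENGTH = 20; // maximum length of a line: 20 characters
// needs java.util.Arrays; the limit has to be concatenated into the regex, and \G must be escaped as \\G
System.out.print(Arrays.toString(input.split("(?<=\\G.{" + MAX_LINE_LENGTH + "})")));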

Output :

[THESE TERMS AND COND, ITIONS OF SERVICE (t, he Terms) ARE A LEGA, L AND B ...

This breaks words in the middle, which I don't want. Instead, I want output like this:

[THESE TERMS AND , CONDITIONS OF , SERVICE (the Terms) , ARE A LEGAL AND B ...

One more condition: if a word is longer than MAX_LINE_LENGTH, that word should be split.

The solution should not depend on external JARs.


1 Reply


Just iterate through the string word by word and break whenever a word passes the limit.

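import java.util.StringTokenizer;

public String addLinebreaks(String input, int maxLineLength) {
    StringTokenizer tok = new StringTokenizer(input, " ");
    StringBuilder output = new StringBuilder(input.length());
    int lineLen = 0; // length of the line currently being built
    while (tok.hasMoreTokens()) {
        String word = tok.nextToken();

        // Start a new line if the word plus a separating space would not fit.
        if (lineLen > 0 && lineLen + 1 + word.length() > maxLineLength) {
            output.append("\n");
            lineLen = 0;
        }
        // The tokenizer drops the spaces, so put one back between words on the same line.
        if (lineLen > 0) {
            output.append(' ');
            lineLen++;
        }
        output.append(word);
        lineLen += word.length();
    }
    return output.toString();
}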

I just typed that in freehand, you may have to push and prod a bit to make it compile.

Known limitation: a word longer than maxLineLength is not split; it ends up on a too-long line of its own. I assume your line length is something like 80 or 120 characters, in which case this is unlikely to be a problem.
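
If you do need the extra condition from the question (split words longer than the limit), here is a minimal sketch of one way to do it with only the JDK. The class name LineWrapper and the method wrap are my own names for illustration, not part of the code above. It returns a list of lines without trailing spaces, which is an assumption on my part; the sample output in the question keeps them.

```java
import java.util.ArrayList;
import java.util.List;

public class LineWrapper {

    // Hypothetical helper (not from the original answer): wraps text into lines
    // of at most maxLineLength characters, hard-splitting over-long words.
    public static List<String> wrap(String input, int maxLineLength) {
        if (maxLineLength < 1) {
            throw new IllegalArgumentException("maxLineLength must be positive");
        }
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : input.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens for blank input
            }
            // Flush the current line if the word plus a separating space does not fit.
            if (line.length() > 0 && line.length() + 1 + word.length() > maxLineLength) {
                lines.add(line.toString());
                line.setLength(0);
            }
            // The line is always empty here when the word exceeds the limit,
            // so full-width chunks can be emitted directly.
            while (word.length() > maxLineLength) {
                lines.add(word.substring(0, maxLineLength));
                word = word.substring(maxLineLength);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}
```

Calling LineWrapper.wrap(input, 20) on the text from the question should start with the lines "THESE TERMS AND", "CONDITIONS OF" and "SERVICE (the Terms)". The remainder of a split word shares its line with the following words if they fit; whether that is what you want depends on your layout, so treat it as a design choice rather than the only option.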

