ansaurus

Question

Algorithm: Split a string into N parts using whitespaces so all parts have nearly the same length

Answer 1

+5 A:

If you're talking about line-breaking, take a look at Dynamic Line Breaking, which gives a Dynamic Programming solution to divide words into lines.
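
For illustration, here is a minimal dynamic-programming sketch in Python. It is not the code from the linked article: the function name `split_balanced` and the cost (sum of squared part lengths, which penalizes uneven parts) are assumptions made for this example. It assumes words are separated by whitespace and that n is small enough for plain recursion.

```python
from functools import lru_cache


def split_balanced(text, n):
    """Split text into n parts at whitespace, minimizing the sum of squared part lengths."""
    words = text.split()
    if not 1 <= n <= len(words):
        raise ValueError("n must be between 1 and the number of words")

    # prefix[i] = total number of characters in words[:i] (spaces excluded)
    prefix = [0]
    for w in words:
        prefix.append(prefix[-1] + len(w))

    def part_len(i, j):
        # Length of words[i:j] when joined by single spaces.
        return prefix[j] - prefix[i] + (j - i - 1)

    @lru_cache(maxsize=None)
    def best(i, k):
        # (minimum cost of splitting words[i:] into k parts, end index of the first part)
        if k == 1:
            return part_len(i, len(words)) ** 2, len(words)
        options = []
        # Leave at least k - 1 words for the remaining k - 1 parts.
        for j in range(i + 1, len(words) - k + 2):
            options.append((part_len(i, j) ** 2 + best(j, k - 1)[0], j))
        return min(options)

    # Walk the stored break points to rebuild the parts.
    parts, i = [], 0
    for k in range(n, 0, -1):
        j = best(i, k)[1]
        parts.append(" ".join(words[i:j]))
        i = j
    return parts
```

With W words there are about n * W subproblems and each tries up to W break points, so this sketch does roughly O(n * W^2) work. Other costs (for example, the length of the longest part) can be swapped in by changing how the part lengths are combined.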

Larry 2010-03-04 18:00:05

Maybe have a look at how LaTeX does it?

jk 2010-03-04 18:18:26

Answer 2

A:

Partitioning into equal sizes is NP-complete.

Steve B. 2010-03-04 18:03:44

How would you reduce this problem to the Partitioning problem? I don't think this problem is NPC.

Jacob 2010-03-04 19:24:09

-1 I agree with Jacob. There are at most three lines in the problem as stated, so if the length of the string is N, there are O(N^2) possible ways to split the string into three substrings, regardless of the amount of whitespace. You can iterate through all of them in polynomial time, so there is nothing NP-complete in this problem. Even in the general case, you can first choose a split point (O(N) possibilities) and then recursively split the two parts, yielding a worst-case quadratic algorithm and, in practice, O(N log N).
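
To make the three-part enumeration concrete, here is a rough Python sketch. The function name `split_three_brute_force` and the scoring rule (keep the split whose longest part is shortest) are assumptions for illustration, not something stated in the comment. It only considers split points at spaces and assumes words are separated by single spaces.

```python
from itertools import combinations


def split_three_brute_force(text):
    """Try every pair of space positions; keep the split whose longest part is shortest."""
    spaces = [i for i, ch in enumerate(text) if ch == " "]
    if len(spaces) < 2:
        raise ValueError("need at least two spaces to make three parts")

    def parts_for(a, b):
        # Cut at the spaces at indices a < b, dropping the two spaces themselves.
        return [text[:a], text[a + 1:b], text[b + 1:]]

    best = min(
        combinations(spaces, 2),
        key=lambda ab: max(len(p) for p in parts_for(*ab)),
    )
    return parts_for(*best)
```

The number of candidate pairs is quadratic in the number of spaces, so the search stays polynomial, which is the point the comment makes.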

antti.huima 2010-03-04 20:32:06

Answer 3

A:

Working Python code

Code by David Eppstein.

TheMachineCharmer 2010-03-04 18:19:42

Answer 4

+1 A:

I don't know about proven, but it seems like the simplest and most efficient solution would be to divide the length of the string by N then find the closest white space to the split locations (you'll want to search both forward and back).

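The code below seems to work, though there are plenty of error conditions that it doesn't handle. It performs one nearest-space search per split point, so the number of searches is O(n), where n is the number of parts you want; each search may scan a large portion of the string in the worst case.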

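using System;

class Program
{
    static void Main(string[] args)
    {
        var s = "This is a string for testing purposes. It will be split into 3 parts";
        var p = s.Length / 3;                    // ideal length of each part
        var w1 = 0;
        var w2 = FindClosestWordIndex(s, p);     // space nearest the 1/3 mark
        var w3 = FindClosestWordIndex(s, p * 2); // space nearest the 2/3 mark
        Console.WriteLine(string.Format("1: {0}", s.Substring(w1, w2 - w1).Trim()));
        Console.WriteLine(string.Format("2: {0}", s.Substring(w2, w3 - w2).Trim()));
        Console.WriteLine(string.Format("3: {0}", s.Substring(w3).Trim()));
        Console.ReadKey();
    }

    // Returns the index of the space closest to startIndex,
    // or -1 if s contains no spaces (Main does not handle that case).
    public static int FindClosestWordIndex(string s, int startIndex)
    {
        int wordAfterIndex = -1;
        int wordBeforeIndex = -1;
        // Scan forward from startIndex for the next space.
        for (int i = startIndex; i < s.Length; i++)
        {
            if (s[i] == ' ')
            {
                wordAfterIndex = i;
                break;
            }
        }
        // Scan backward from startIndex for the previous space.
        for (int i = startIndex; i >= 0; i--)
        {
            if (s[i] == ' ')
            {
                wordBeforeIndex = i;
                break;
            }
        }

        // If a space exists on only one side, use it; comparing distances
        // against a -1 sentinel would pick the wrong index.
        if (wordAfterIndex == -1)
            return wordBeforeIndex;
        if (wordBeforeIndex == -1)
            return wordAfterIndex;

        // Ties go to the space after startIndex.
        if (wordAfterIndex - startIndex <= startIndex - wordBeforeIndex)
            return wordAfterIndex;
        else
            return wordBeforeIndex;
    }
}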

The output for this is:

1: This is a string for
2: testing purposes. It will
3: be split into 3 parts
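
The same idea can be generalized from 3 parts to n. The following is only a sketch in Python, not a port of the C# above: the name `split_near_targets` is made up for this example, and it returns fewer than n parts when it runs out of usable spaces.

```python
def split_near_targets(text, n):
    """Cut text at the space closest to each ideal boundary i * (len(text) // n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    step = len(text) // n  # ideal length of each part
    cuts = [0]
    for i in range(1, n):
        target = i * step
        after = text.find(" ", target)           # first space at or after target
        before = text.rfind(" ", 0, target + 1)  # last space at or before target
        # Keep only spaces that lie beyond the previous cut (this also drops -1).
        candidates = [c for c in (after, before) if c > cuts[-1]]
        if not candidates:
            break  # no usable space left; the rest becomes the last part
        # Nearest space wins; on a tie, prefer the one after the target.
        cuts.append(min(candidates, key=lambda c: (abs(c - target), c != after)))
    cuts.append(len(text))
    return [text[a:b].strip() for a, b in zip(cuts, cuts[1:])]
```

For the test string above and n = 3 it produces the same three parts as the C# program. Because each cut is chosen independently, the result is not guaranteed to be the most balanced split possible.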

Brian 2010-03-04 18:35:28

Answer 5

A:

The way word wrap is usually implemented is to place as many words as possible onto one line, and break to the next line when there is no more room. This assumes, of course, that you have a maximum width in mind.

Regardless of what algorithm you use, keep in mind that unless you are working with a fixed-width font, you want to work with the physical width of the word, not the number of letters.
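
As a rough Python sketch of both points: a greedy wrap that takes the measuring function as a parameter, so character count can be swapped for a physical width. The names `greedy_wrap` and `toy_pixel_width`, and the pixel values 3 and 7, are invented for illustration; a real program would ask its font or rendering library for the widths.

```python
def greedy_wrap(text, max_width, measure=len):
    """Fill each line with as many words as fit within max_width, as judged by measure."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if current and measure(candidate) > max_width:
            # The word does not fit: close the current line and start a new one.
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


# Hypothetical metrics standing in for a real font: narrow glyphs are 3 units wide,
# everything else is 7 units wide.
NARROW = set("ijl.,;:'!| ")


def toy_pixel_width(s):
    return sum(3 if ch in NARROW else 7 for ch in s)
```

Calling `greedy_wrap(text, 40)` wraps by character count, while `greedy_wrap(text, 200, toy_pixel_width)` wraps by the toy width. A single word wider than `max_width` still gets a line of its own and overflows it.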

BlueRaja - Danny Pflughoeft 2010-03-04 23:44:40
