Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
226 views
in Technique[技术] by (71.8m points)

How to iteratively generate k elements subsets from a set of size n in java?

I'm working on a puzzle that involves analyzing all size k subsets and figuring out which one is optimal. I wrote a solution that works when the number of subsets is small, but it runs out of memory for larger problems. Now I'm trying to translate an iterative function written in python to java so that I can analyze each subset as it's created and get only the value that represents how optimized it is and not the entire set so that I won't run out of memory. Here is what I have so far and it doesn't seem to finish even for very small problems:

public static LinkedList<LinkedList<Integer>> getSets(int k, LinkedList<Integer> set)
{
    int N = set.size();
    int maxsets = nCr(N, k);
    LinkedList<LinkedList<Integer>> toRet = new LinkedList<LinkedList<Integer>>();

    int remains, thresh;
    LinkedList<Integer> newset; 
    for (int i=0; i<maxsets; i++)
    {
        remains = k;
        newset = new LinkedList<Integer>();
        for (int val=1; val<=N; val++)
        {
            if (remains==0)
                break;
            thresh = nCr(N-val, remains-1);
            if (i < thresh)
            {
                newset.add(set.get(val-1));
                remains --;
            }
            else 
            {
                i -= thresh;
            }
        }
        toRet.add(newset);
    }

    return toRet;

}

Can anybody help me debug this function or suggest another algorithm for iteratively generating size k subsets?

EDIT: I finally got this function working, I had to create a new variable that was the same as i to do the i and thresh comparison because python handles for loop indexes differently.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

First, if you intend to do random access on a list, you should pick a list implementation that supports that efficiently. From the javadoc on LinkedList:

All of the operations perform as could be expected for a doubly-linked list. Operations that index into the list will traverse the list from the beginning or the end, whichever is closer to the specified index.

An ArrayList is both more space efficient and much faster for random access. Actually, since you know the length beforehand, you can even use a plain array.

To algorithms: Let's start simple: How would you generate all subsets of size 1? Probably like this:

for (int i = 0; i < set.length; i++) {
    int[] subset = {i};
    process(subset);
}

Where process is a method that does something with the set, such as checking whether it is "better" than all subsets processed so far.

Now, how would you extend that to work for subsets of size 2? What is the relationship between subsets of size 2 and subsets of size 1? Well, any subset of size 2 can be turned into a subset of size 1 by removing its largest element. Put differently, each subset of size 2 can be generated by taking a subset of size 1 and adding a new element larger than all other elements in the set. In code:

processSubset(int[] set) {
    int subset = new int[2];
    for (int i = 0; i < set.length; i++) {
        subset[0] = set[i];
        processLargerSets(set, subset, i);
    }
}

void processLargerSets(int[] set, int[] subset, int i) {
    for (int j = i + 1; j < set.length; j++) {
        subset[1] = set[j];
        process(subset);
    }
}

For subsets of arbitrary size k, observe that any subset of size k can be turned into a subset of size k-1 by chopping of the largest element. That is, all subsets of size k can be generated by generating all subsets of size k - 1, and for each of these, and each value larger than the largest in the subset, add that value to the set. In code:

static void processSubsets(int[] set, int k) {
    int[] subset = new int[k];
    processLargerSubsets(set, subset, 0, 0);
}

static void processLargerSubsets(int[] set, int[] subset, int subsetSize, int nextIndex) {
    if (subsetSize == subset.length) {
        process(subset);
    } else {
        for (int j = nextIndex; j < set.length; j++) {
            subset[subsetSize] = set[j];
            processLargerSubsets(set, subset, subsetSize + 1, j + 1);
        }
    }
}

Test code:

static void process(int[] subset) {
    System.out.println(Arrays.toString(subset));
}


public static void main(String[] args) throws Exception {
    int[] set = {1,2,3,4,5};
    processSubsets(set, 3);
}

But before you invoke this on huge sets remember that the number of subsets can grow rather quickly.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...