Problem Detail: Background: I need to find a largest set of non-overlapping, axis-parallel squares out of a given collection of candidate squares. This problem is NP-hard. Many papers suggest approximation algorithms (see Maximum Disjoint Set on Wikipedia), but I need an exact algorithm. My current solution uses the following divide-and-conquer strategy:
- Calculate all horizontal and vertical lines that pass through corners of the candidate squares. Each such line separates the candidates into three groups: candidates entirely on one side of the line, candidates entirely on the other side, and candidates intersected by the line. Now there are two cases:
- Easy Case: There is a separator line $L$ that does not intersect any candidate square. Then, recursively calculate the maximum-disjoint-set among the squares on one side of $L$, recursively calculate the maximum-disjoint-set among the squares on the other side of $L$, and return the union of these two sets. The separator line guarantees that the union is still a disjoint set.
- Hard Case: Every separator line intersects one or more candidate squares. Select one of the separator lines, $L$; suppose that $L$ intersects $k$ squares. Enumerate all $2^k$ subsets of these intersected squares. For each subset $X$ that is in itself a disjoint set, calculate the maximum-disjoint-set recursively as in the Easy Case, under the assumption that $X$ is in the set. That is, recursively calculate the maximum-disjoint-set among the squares on one side of $L$ that do not intersect $X$, recursively calculate the maximum-disjoint-set among the squares on the other side of $L$ that do not intersect $X$, and take the union of these two sets with $X$. Out of all the resulting unions, return the largest one. (A code sketch of this recursion appears right after this list.)
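The following is a minimal runnable sketch of this recursion in Python, assuming each square is given as an `(x, y, side)` triple (lower-left corner plus side length). The helper names (`overlaps`, `candidate_lines`, `split`, `pick_separator`) are illustrative rather than from any library, and `pick_separator` is a deliberately naive placeholder, since the selection heuristic is exactly what the question below asks about:

```python
from itertools import combinations

def overlaps(a, b):
    """True if two axis-parallel squares (x, y, side) overlap with positive area."""
    ax, ay, asz = a
    bx, by, bsz = b
    return ax < bx + bsz and bx < ax + asz and ay < by + bsz and by < ay + asz

def candidate_lines(squares):
    """All vertical ('v', c) and horizontal ('h', c) lines through square corners."""
    xs = {s[0] for s in squares} | {s[0] + s[2] for s in squares}
    ys = {s[1] for s in squares} | {s[1] + s[2] for s in squares}
    return [('v', c) for c in xs] + [('h', c) for c in ys]

def split(squares, line):
    """Partition squares into (entirely on the low side, entirely on the high side, crossed)."""
    i = 0 if line[0] == 'v' else 1
    c = line[1]
    low = [s for s in squares if s[i] + s[2] <= c]
    high = [s for s in squares if s[i] >= c]
    crossed = [s for s in squares if s[i] < c < s[i] + s[2]]
    return low, high, crossed

def pick_separator(squares):
    """Placeholder: return any corner line that strictly shrinks both sides, or None."""
    n = len(squares)
    for line in candidate_lines(squares):
        low, high, _ = split(squares, line)
        if len(low) < n and len(high) < n:
            return line
    return None

def max_disjoint_set(squares):
    """Exact maximum disjoint set via the separator-line recursion described above."""
    if len(squares) <= 1:
        return list(squares)
    line = pick_separator(squares)
    if line is None:          # degenerate case: every pair of candidates overlaps
        return [squares[0]]
    low, high, crossed = split(squares, line)
    best = []
    # Hard case: enumerate every subset X of the crossed squares (X = {} covers the easy case).
    for r in range(len(crossed) + 1):
        for X in combinations(crossed, r):
            if any(overlaps(a, b) for a, b in combinations(X, 2)):
                continue      # X itself is not a disjoint set
            lo = [s for s in low if not any(overlaps(s, x) for x in X)]
            hi = [s for s in high if not any(overlaps(s, x) for x in X)]
            cand = list(X) + max_disjoint_set(lo) + max_disjoint_set(hi)
            if len(cand) > len(best):
                best = cand
    return best
```

If no corner line leaves strictly fewer squares on both closed sides, then no two candidates are disjoint (any disjoint pair is separated by a line through one of their edges), so returning a single square in the `line is None` branch is safe.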
Question: What is the best way to select the separator line $L$? There are two conflicting considerations. On one hand, we want $L$ to intersect as few squares as possible, so that the power set is not too large. On the other hand, we want $L$ to separate the candidate squares into subsets of balanced size, preferably equal size, so that the recursion ends as fast as possible. What is the best way to balance these conflicting considerations?

EDIT: Additional details: My current heuristic is to pick the separator line that intersects the fewest squares. This heuristic allows the algorithm to process input sets with up to $n=30$ candidates in several seconds. The optimal solution in these cases has about 10 squares; in general, the number of squares in the optimal solution is near $2\cdot\sqrt{n}$. When the input grows beyond 30 candidates, the running time becomes much slower (several minutes and more). My goal is to find a heuristic that will allow me to process larger sets of candidates.
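For concreteness, here is a sketch of this fewest-crossings heuristic, reusing the square representation and the `candidate_lines`/`split` helpers from the sketch above; the imbalance term in the sort key is an optional tie-breaker, not part of the heuristic as stated:

```python
def pick_separator(squares):
    """Pick the corner line crossing the fewest squares; break ties by how unbalanced the split is."""
    n = len(squares)
    best, best_key = None, None
    for line in candidate_lines(squares):
        low, high, crossed = split(squares, line)
        if len(low) == n or len(high) == n:
            continue                          # line would not shrink both subproblems
        key = (len(crossed), abs(len(low) - len(high)))
        if best_key is None or key < best_key:
            best, best_key = line, key
    return best
```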
Answered By: D.W.
Rather than your approach, I suggest you formulate this as an integer linear program and feed it to an off-the-shelf ILP solver. Alternatively, formulate it as a SAT problem and feed it to a SAT solver: you’ll probably need to take the decision-problem version, where you ask whether there exists a subset of $k$ non-overlapping squares, and then use binary search on $k$. Those would be the first approaches I would try, personally.

If you definitely want to try your approach based upon a “separating line”, then I think the best way to answer your question is to pick a representative set of problem instances and try some different heuristics on them to see which works best. My intuition suggests that the best way to select the separator line may be to pick the line $L$ that intersects as few squares as possible, without worrying about how balanced the division is (though if there is a tie among multiple lines that each intersect the same number of squares, you could always use “how balanced the division is” as a tie-breaker). The reason is that you get an exponential multiplicative increase in the running time each time you enumerate all subsets of the squares that intersect the line $L$; I think your prime consideration is going to be keeping that blowup down. But that’s just my intuition, and my intuition might be wrong. I think you need to do the experiment to find out empirically what works best.

If you do apply your separating-line approach, you might consider using a branch-and-bound approach. Keep track of the best solution you’ve found so far (i.e., the largest set of non-overlapping squares you’ve been able to find so far); say that it is of size $s$ at any point in time. Now, any time your search enters a subtree where you can prove that all solutions below the subtree will have size $\le s$, there is no need to explore that subtree. For instance, if you have a line $L$ and a subset $X$ where you know that the size of $X$, plus the size of the largest subcollection of non-intersecting squares above $L$, plus the total number of squares below $L$, is $\le s$, then there is no need to recursively compute the largest subcollection of non-intersecting squares below $L$, which saves you one recursive call. But if you use an off-the-shelf ILP solver, it will already implement this sort of branch-and-bound heuristic for you; hence my advice to start by formulating this as an ILP problem and applying an off-the-shelf ILP solver.

Finally, the following paper apparently describes an $O(2^{\sqrt{n}})$-time algorithm to compute the exact solution to your problem. This is an improvement over the obvious algorithm that enumerates all possible subsets of squares, which takes $O(2^n)$ time.

An application of the planar separator theorem to counting problems. S. S. Ravi and H. B. Hunt III. Information Processing Letters, Volume 25, Issue 5, 10 July 1987, Pages 317–321.
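To make the ILP suggestion concrete, here is a minimal sketch using the PuLP modeling front-end (an assumed, off-the-shelf choice; the names are illustrative). There is one binary variable per candidate square, the objective counts the chosen squares, and a pairwise constraint forbids choosing two overlapping squares together:

```python
from itertools import combinations
import pulp

def overlaps(a, b):
    """True if two axis-parallel squares (x, y, side) overlap with positive area."""
    ax, ay, asz = a
    bx, by, bsz = b
    return ax < bx + bsz and bx < ax + asz and ay < by + bsz and by < ay + asz

def max_disjoint_set_ilp(squares):
    """Maximum disjoint set as a 0/1 ILP: maximize the number of chosen squares."""
    prob = pulp.LpProblem("max_disjoint_squares", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(len(squares))]
    prob += pulp.lpSum(x)                      # objective: number of squares picked
    for i, j in combinations(range(len(squares)), 2):
        if overlaps(squares[i], squares[j]):
            prob += x[i] + x[j] <= 1           # at most one of each overlapping pair
    prob.solve(pulp.PULP_CBC_CMD(msg=False))   # CBC is PuLP's bundled default solver
    return [squares[i] for i in range(len(squares)) if x[i].value() > 0.5]

# Example: the middle square overlaps the other two, so the optimum has size 2.
print(max_disjoint_set_ilp([(0, 0, 2), (1, 1, 2), (2.5, 0, 2)]))
```

The number of pairwise constraints grows quadratically in the number of candidates, which is negligible at the scales mentioned in the question; the solver’s own branch-and-bound then performs the kind of pruning described above.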
http://www.sciencedirect.com/science/article/pii/0020019087902067
Question Source: http://cs.stackexchange.com/questions/20126