[Solved]: What is wrong with this seeming contradiction with a paper about AND-compression of SAT?

Problem Detail: Got a simple construction seemingly contradicting a paper assuming plausible conjecture. Since it is unlikely the conjecture to be false, what is wrong with the argument? From a paper

An AND-compression is a deterministic polynomial-time algorithm that maps a set of SAT-instances $x_1,dots,x_t$ to a single SAT-instance $y$ of size $text{poly}(max_i |x_i|)$ such that $y$ is satisfiable if and only if all $x_i$ are satisfiable. … Unless the unlikely complexity-theoretic collapse $sf coNP subseteq NP/poly$ occurs, there is no AND-compression for SAT.

Construction: If $x_i$ are not in CNF, convert them to CNF possibly adding new variables. This is polynomial. In CNF one can encode AND gate $ C := A land B $ and OR gate $ C := A lor B $. The AND and OR gates have the property that for all satisfying assignments of their CNFs we have $C iff (A land B)$ and $C iff (A lor B)$. Let the $j$-th clause in $x_i$ be $x_{i,j} = [y_1, ldots, y_k]$ for literals $y_m$. Using the OR gate and new variables, compute variable $O_{i,j} := y_1 lor y_2 cdots lor y_k$. For all clauses in $x_i$ ($O_{i,j}$) and the AND gate compute variable $V_i := O_{i,1} land O_{i,2} land cdots land O_{i,m}$. By construction $x_i iff V_i$. For all $V_i$, using the AND gate compute $F := V_1 land cdots land V_t$. $F iff x_1 land cdots land x_t$. So the final formula $y$ is the union of the CNFs for $O_{i,j}$,$V_i$,$F$, and a unit clause $[F]$. $|y|$ is linear in the number of all literals, $t$ is polynomial in $max |x_i|$, which makes $|y|$ polynomial in $max |x_i|$. This appears to contradict the claim in the paper, unless the certain collapse happens.

What is wrong with this argument that appears to contradict the claim in the paper?

Similar construction works for OR-compression, when at least one $x_i$ must be satisfiable. The newly introduced variable are uniquely determined by the original variables.

OR gate in  CNF 3 := 1 / 2 : [[1 2 -3],[-1 3],[-2 3]] AND gate in CNF 3 := 1 / 2 : [[-1 -2 3],[1 -3],[2 -3]] 
Asked By : elluser

Answered By : Kyle Jones

The confusion arises from a misunderstanding of what being polynomial in the size of the largest instance means. It does not mean that polynomial growth of the compressor’s output is allowed as the number of instances ($t$) increases. Rather it means the compressor’s output is allowed to grow only as the maximum instance size grows, independent of $t$, and then only within a fixed polynomial of that value. Your compressor concatenates the CNF instances, converts them to circuits, and then converts the circuits back to CNF. Its output grows linearly with the number of instances ($t$) independent of the instance sizes and is therefore bound to exceed any polynomial function of the maximum instance size. Because of this, indeed because it does not do any compression at all, your compressor does not cause any hierarchy collapse. There also seems to be a misunderstanding as to what the AND-compressor function is supposed to do. It does not have to output an instance that preserves all satisfying assignments for all the input instances, nor does there need to be a parsimonious counting reduction from the input instances to the output instance. A conforming implementation could at the extreme output an empty instance if all the input instances are satisfiable and a single empty clause otherwise. A more plausible AND-compressor might adapt standard data compression techniques such as the use of a dictionary. A table of unsatisfiable subformulas could be precomputed and used to search against the input SAT instances. A match would imply a forbidden variable assignment, which when instantiated as a learned clause might subsume (eliminate) several other clauses. Another technique involves taking advantage the different sizes of the input instances. If the input instances are of different sizes, the larger instances could be searched for subformulas equivalent to one of the smaller instances. Such a match implies the existence of a variable assignment that could produce the subformula, and therefore the larger instance is satisfiable if the smaller instance is satisfiable. If the smaller instance is unsatisfiable, the larger instance might still be satisfiable but it doesn’t matter because compressor output should still produce an unsatisfiable formula. So the output instance is effectively decoupled from the satisfiability of the larger instance and thus the larger instance no longer has to be represented in the output at all, eliminating a large number of clauses. There are many other techniques. The theorem in the paper suggests that efficiently achieving high levels of compression is unlikely given the implied partial collapse of the polynomial hierarchy.
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/26361

Leave a Reply