[Solved]: Regular expressions and semi-linear sets

Problem Detail: In proving Parikh’s Theorem, my Theory of Computer Science textbook defines a linear set as: $u_0 + langle u_1, dots, u_m rangle = {u_0 + a_1u_1 + dots + a_mu_m mid a_1, dots, a_m in mathbb{N}}$ where $u_i$ are vectors of natural numbers. and a semi-linear set as a union of finitely many linear sets. It goes on to say ”For every semilinear set $S subset N^k$, it is not hard to construct a regular set $R subset Sigma^*$ such that $psi(R) = S$” (where $psi$ is the Parikh map, taking strings over an alphabet $Sigma$ to vectors where the first entry is the number of the first letter, the second entry is the number of the second letter, etc. So $psi({a, ab, ba, aaa})) = {(1), (1, 1), (3)}$.) I was trying to think why regular languages would be semi-linear instead of just linear, and it seems like the + (or) operation in regular expressions is to blame. Is this correct: are languages described by regular expressions which use only concatenation and $^*$ linear?

Asked By : Eli Rose

Answered By : Yuval Filmus

Your claim is correct for a one letter alphabet $Sigma = {a}$. We prove by induction that $psi(r)$ is linear for every regular expression without addition. When $r = a$, $psi(r) = 1$. When $r = r_1 r_2$, by the induction hypothesis $psi(r_1) = u_0 + sum_i mathbb{N} u_i$ and $psi(r_2) = v_0 + sum_j mathbb{N} v_j$, and so $psi(r) = u_0 + v_0 + sum_i mathbb{N} u_i + sum_j mathbb{N} v_j$. When $r = r_1^*$, $psi(r)$ is a subset of $mathbb{N}$ closed under addition. Let $g = operatorname{gcd}(psi(r))$. Then $psi(r)/g$ is a numerical semigroup, and so finitely generated, say $psi(r)/g = sum_i mathbb{N} u_i$. So $psi(r) = sum_i mathbb{N} gu_i$. This argument doesn’t quite work when $|Sigma| geq 2$, although there are some notions of GCD for multidimensional vectors, see for example this interesting presentation by Vadim Ponomarenko. Perhaps the argument above can be pushed through, or perhaps a different argument, which uses the fact that $psi(r_1)$ is linear, exists. It’s also possible that the claim is wrong in this case.
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/23768

Leave a Reply