[Solved]: Adding a node between two others, minimizing its maximum distance to any other node

Problem Detail: We are given an undirected graph weighted with positive arc lengths and a distinguished edge $(a,b)$ in the graph. The problem is to replace this edge by two edges $(a,c)$ and $(c,b)$ where $c$ is a new node, such that the length of the path $(a,c,b)$ is equal to the inital length of $(a,b)$, and such that the choice of the length of $(a,c)$ minimizes the maximum distance of node $c$ to any other node of the graph. Is there any graph-theory based algorithm that can solve such problem rather than brute force? Actually, the existence of the original edge $(a,b)$ is not essential. Only the end nodes $a$ and $b$, and the length matter. For example we have a graph:

1 2 10 2 3 10 3 4 1 4 1 5

In every line the first two values indicate the node number, and the 3rd one is the corresponding edge value. Now I want to find a new node $c$ between nodes 1 and 2. The answer gives a node $c$ at distance 2 to node 1 (and distance 8 to node 2). The distance of node $c$ is 2 to node 1, 8 to node 2, 1+5+2=8 to node 3, and 5+2=7 to node 4. The maximum distance is 8, which is minimal for all possible choices of the length of $(a,c)$.

Asked By : user3153970

Answered By : babou

Introduction

This answer is in two parts. The first is an analysis of the problem mixed with a sketch of the algorithm to solve it. As it is my first version, it is detailed, but results in an algorithm that is a bit more complex than needed. It is followed by a pseudo-code version of the algorithm, written sometime later, as the algorithm was clearer. This pseudo-code is more direct, and somewhat simpler, but probably easier to understand after reading the first part. In retrospect, the more important ideas are:

to see that the maximum distance of a node to the new node C, as a function of the length of $(A,C)$, is given by the superposition of roof shaped (inverted V) curves, characterized by their top, associated to nodes of the graph.
to see that irrelevant curves can be eliminated efficiently by a scan in monotonic order of abscissa of tops (with minimum backtrack).
then minimal points can be enumerated simply by intersecting the remaining curves two by two, again in monotonic order of abscissa of tops.

Sketch of an algorithm

NOTE: I changed the initial presentation of the question to get a more standard view of graphs, by considering that the edge $(A,B)$ is replaced by a new node $C$ and two edges $(A,C)$ and $(C,B)$ with the same total length as $(A,B)$. Let $(A,B)$ be the distinguished edge, and let $l$ be its length. Let $e$ and $n$ be respectively the number of edges and the number of nodes in the graph. First you remove the edge $(A,B)$. Then, for every node $U$ of the graph, you compute the distance $alpha(U)$ between $A$ and $U$, and the distance $beta(U)$ between $B$ and $U$. This can be done with Dijkstra’s shortest path algorithm, in time $O(e+nlog n)$. Then you add a new node $C$ with an edge $(A,C)$ and an edge $(C,B)$, of length respectively $a$ and $b$, such that $a+b=l$. The length $a$ is not known, and is to be found so as to minimize the distance of the most remote node to node $C$. The distance of $C$ to $A$ or $B$ may be 0. The difficulty of the problem is that the node most distant from $C$ via $A$ may be close to $C$ via $B$, and thus does not matter. Also, a node closer to $C$ via $A$ than via $B$ may inverse that situation as $C$ moves between $A$ and $B$, i.e. as $a$ varies in $[0,l]$. We consider three sets of nodes in the graph, forming a partition of all nodes:

the set $N_A$ of nodes that are always closer to $C$ via $A$ independently of the value of $a$: $N_A={U|alpha(U)+lleqbeta(U)}$
the set $N_B$ of nodes that are always closer to $C$ via $B$ independently of the value of $a$: $N_B={U|beta(U)+lleqalpha(U)}$
and $N_{AB}$ the set of nodes that can be closer to $C$ via $A$ or via $B$, depending on the value of $a$, actually closer via $A$ for smaller values of $a$, and closer via $B$ for larger values of $a$ in $[0,l]$: $N_{AB}={Umidalpha(U)+l>beta(U) wedge beta(U)+l>alpha(U)}$

Let $U_A$ the node in $N_A$ most remote from $A$, and $U_B$ the node in $N_B$ most remote from $B$. No other node in $N_Acup N_B$ is ever more distant from $C$ than the more distant of these two nodes, whatever the distance $a$ of $(A,C)$. Hence all other nodes in $N_Acup N_B$ can be discarded for the computation of the optimal value of $a$. Note that it may happen that $N_A$, or $N_B$, or both, are empty. In which case, the corresponding nodes $U_A$ and/or $U_B$ do not exist. We will build a distance curve on a plane with coordinate $(x,y)$, which will ultimately represent the longest distance $y$ of a node from $C$ as a function of the distance $x=a$ of $C$ from $A$. Hence $x=ain[0,l]$ This distance curve is actually a zig-zag composed of segments angled $pm 45^circ$. Once it is built, the point minimizing the longest distance is to be found in the minimal points of the curve. There may be several such points. All these points have to be computed, and then compared. We describe below how to compute them. Actually, we consider a collection of simpler distance curves, with only one or two segments in the interval $[0,l]$, such that the longest distance of a node from $C$ is always on or above the curve. The final curve is obtained by taking for every value of $x$ the maximum value of $y$ given by one of the distance curves. This can be done formally by computing the intersection of the segments in an ordered fashion. First we have (at most) two distance curves composed of only one segment. These are the curves corresponding to the distance of $U_A$ and $U_B$ to $C$. The first is upward at a $45^circ$ angle, and the second is downward at a $45^circ$ angle. Assuming no other node is ever more distant from $C$, the optimal value of $a$ is the abcissa of their intersection. At that abscissa, that value for $a$, the nodes $U_A$ and $U_B$ are at the same distance from $C$. Hence the distance is minimum when $a$ satisfies the equation $alpha(U_A)+a=beta(U_B)+b$. Since $a+b=l$, this gives $alpha(U_A)+a=beta(U_B)+l-a$, and finally $a=(beta(U_B)-alpha(U_A)+l)/2$. This corresponds to a distance $y=(alpha(U_A)+beta(U_B)+l)/2$ of both nodes to $C$. Together, the two curves give a V shaped curve, with a minimum at the point just computed. But we also have to account for all the nodes in $N_{AB}$. The distance of a node $Uin N_{AB}$ to $C$ first increases with $x$ while the shortest path goes through $A$, them it decreases with $x$ when the shortest path goes through $B$. Hence, the corresponding distance curve is formed of two segments, roof shaped, with a maximum at an abscissa such that the distance to $C$ is the same through $A$ or through $B$. This gives the equation $alpha(U)+a=beta(U)+b$. Resolving it as above, we get $x=a=(beta(U)-alpha(U)+l)/2$, and $y=alpha(U)+beta(U)+l)/2$. We note $T_U$ (for Top of $U$) the point with these coordinate. The top $T_U$ characterizes the roof shaped distance curve for node $U$ since the slope of both side is $pm 45^circ$. Now all the difficulty of the problem is that several nodes $Uin N_{AB}$ may be such that $T_U$ is above our first V-shaped curve, which will lead to the final zig-zag line. But, we first can eliminate all nodes $Uin N_{AB}$ such that $T_U$ is below the V-shaped distance curve, as they will not ever contribute to the longest distance. This is easily done and is skipped here for clarity. Then we want also to eliminates all nodes $Uin N_{AB}$ such that their roof-shaped curve is always below the curve of another node $Vin N_{AB}$. This is determined by the position of their tops, and we say that $T_V$ dominates $T_U$, or that $V$ dominates $U$. Actually $T_V$ and $T_U$ are in a domination relation iff the absolute value of the abcissa difference is less than the absolute value of the ordinate difference. The dominating node/top is the one with the highest ordinate. For this purpose, we first sort all tops of the remanining nodes of $N_{AB}$ in ascending order of increasing abscissa. This is done in time $O(nlog n)$. Then we remark that if top $T_U$ dominate a non adjacent top $T_V$, then any intermediary top $T_W$ dominate $T_V$, or is dominated by $T_U$, or both. Hence, to eliminate all nodes/tops that are dominated, on can simply proceed from left to right (by increasing abscissa), comparing only adjacent nodes, and removing any that is dominated. If the node on the right is dominated, it is removed and the node on the left is compared again to the next right. If the node on the left is dominated, it is removed and the node on the right is then compared with the previous one on its left. If none dominates the other, then the node on left is kept, and the node on right is then compared with the next right. This is repeated until the rightmost node/top has been thus processed, in time $O(n)$. Whatever nodes have been kept do contribute to the final curve, in left to right order of their tops, with the line for $U_A$ at the left and the line for $U_B$ at the right, if they exist. The minima are easy to compute by considering the intersections of the contributing curves from left to right, in time $O(n)$. Finally, the abcissa for the smallest distance of all nodes to $C$ may be found by a linear scan of all minima for the lowest ones. They may actually be several such abscissa, i.e. values for the length $a$ of edge $(A,C)$. If the distances have to be integers, the abscissa of the minima can be rounded up or down. The time complexity of the algorithm is: $O(e+nlog n)$. With thanks to user3153970 who remarked that my initial solution was not quite correct.

Pseudo-code version

The procedure described in the analysis of the problem is modified or improved in some ways.

the length l is changed to d to avoid confusion with the number 1.
for simplification, the V curve is replaced by two tops of roof curves placed on both end of the interval [0,d].
the sets $N_A$ and $N_B$ need not be created. Only $d_A$ and $d_B$ are needed, briefly, to crate the two extra tops.
the set $N_AB$ is not created either, only the corresponding set (actually a multiset, or bag, that may be implemented as a list) of tops, as two distinct nodes may produce the same top, if at the same distance of A and B.

(bugs are provided free of charge)

d := distance (A,B)   none := -1     % programming trick - none is for distances to be ignored % Remove the edge (A,B) from the graph.   Using Dijkstra's shortest path algorithm, for every node $U$ compute its shortest     distances α(U) and β(U) to respectively A and B.   d_A := none d_B := none   % We use lists of pairs, representing the [x,y] coordinate of points, where the %   % abcissa x stands for the choice of length (A,C), and the ordinate y stands is the %   % distance of some node to C. %   % The call first(L) returns a pointer to the first pair of list L. %   % Given a pointer t to a pair, t.x and t.y give the two components of the pair, %   % and the functions prev and next return a pointer to the previous or the next pair, or %   % NIL if it does not exist. The call add([a, b], L) adds the pair [a, b] to the list L. %   % The call remove(t) removes the element pointed by t from its list. % For every node U do    T := []     % T is the list of tops of curves for individual nodes %       if    α(U)+d ≤ β(U) then d_A := max(d_A, α(U))        elsif β(U)+d ≤ α(U) then d_B := max(d_B, β(U))        else add( [(β(U)-α(U)+d)/2, (β(U)+α(U)+d)/2], T) % Instead of V curve, create dummy tops at ends of [0,d] interval % if d_A ≠none then add( [d, d_A+d], T) if d_B ≠none then add( [0, d_B+d], T) Sort T in increasng order of abscissa % Remove from T all tops that are dominated by another top % t1 := first(T) t2 := next(t1) Repeat     if abs(t1.x-t2.x) ≤abs(t1.y-t2.y)     then  % one top dominates the other %         if t1.y≥t2.y             then  % t1 dominates %             remove(t2)             t2 := next(t1)             if t2=NIL then exit loop         else % t2 dominates %             remove(t1)             t1 := pred(t2)             if t1=NIL then                 t1 := t2;                  t2 := next(t1)                 if t2=NIL then exit loop      else % neither top dominate the other %          t1 := t2          t2 := next(t1)          if t2=NIL then exit loop % Compute all local minima % M := []  % M is the list of minima, where roof shaped curves intersect, %          % or intersect the boundaries of the interval [0.d]. % t2 := first(T); if t2.x≠0 then add( [0, t2.y-t2.x], M) repeat     t1 := t2     t2 := next(t1)     if t2=NIL then exit loop     add( [(t1.y-t2.y+t1.x+t2.x)/2, (t1.y+t2.y+t1.x-t2.x)/2], M) if t2.x≠d then add( [d, t2.y-d+t2.x], M) Select in M the pairs with the smallest ordinates. Their abcissas are all     the possible answers to the problem.

If the solutions must be in integer values, it is possible to round the coordinates of the points in M to the nearest integer(s), before selecting the pairs with smallest ordinates. However some more analysis is needed to check whether it can involve passing over a top, and whether abcissas and ordinates are always integer simultaneously. This is left to the reader as an exercise. All bug reports are welcome.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/28520