A Migration Approach to Optimal Control of Wireless Networks – Communications and Networking Research Group

(NSF Grant CNS-1524317)

PI: Eytan Modiano

Recent growth in mobile and media-rich applications has greatly increased the demand for wireless capacity, straining wireless networks. This dramatic increase in demand poses a challenge for current wireless networks, and calls for new algorithms that make better use of scarce wireless resources. Recently developed algorithms for optimally managing wireless resources hold the promise of significant performance improvement, but require all of the nodes in the network to be upgraded with new functionality – which is both costly and impractical. This project introduces a novel architectural paradigm for wireless networks, whereby optimal algorithms are designed to operate in networks with both new and legacy nodes. This new paradigm allows optimal algorithms to be incrementally deployed alongside existing schemes, thus providing a migration path for new control algorithms, and the promise of dramatic improvement in network performance at incremental cost.

This project develops a novel overlay architecture for implementing optimal network control algorithms over legacy networks. New nodes, capable of implementing sophisticated network control algorithms, will be connected in a virtual network overlay that operates on top of the legacy network. The research will answer fundamental questions about which nodes must be upgraded with new functionality and the tradeoff between the number of new upgraded nodes and network performance. The project will develop new routing algorithms that send packets from their sources to their destinations in the overlay network, and transmission scheduling algorithms for mitigating the effect of wireless interference. These new algorithms will be designed to operate efficiently in a network with a mix of new and legacy nodes, by taking interoperability into account. Thus, this project will answer fundamental questions about the introduction of new control techniques into legacy networks, and provide a promising approach to bridging the gap between new techniques developed for universal deployment and the reality of the networks in operation today.

Our research goals include:

development of an overlay architecture for implementing optimal network control algorithms over a legacy network, using a limited number of overlay nodes.
development ofnew routing and flow control algorithms that operate over the overlay network.
development of optimal scheduling algorithms for a wireless network with a mix of controllable and uncontrollable nodes.
validation of our algorithms using a combination of simulation, emulation, and testbed implementation.

Participants:
Prof. Eytan Modiano, PI

Bai Liu, Graduate student
Qingkai Liang, former graduate student
Thomas Stahlbuhk, former graduate student
Abhishek Sinha, former graduate student

SIgnificant Results:

In [1] we developed an Overlay Architecture for Throughput Optimal Multipath Routing. Legacy networks are often designed to operate with simple single-path routing, like shortest-path, which is known to be throughput suboptimal. On the other hand, previously proposed throughput optimal policies (i.e., backpressure) require every device in the network to make dynamic routing decisions. We developed an overlay architecture for dynamic routing such that only a subset of devices (overlay nodes) need to make dynamic routing decisions. We determine the essential collection of nodes that must bifurcate traffic for achieving the maximum multicommodity network throughput. We apply our optimal node placement algorithm to several graphs and the results show that a small fraction of overlay nodes is sufficient for achieving maximum throughput.

In [2] we address the problem of optimal routing in overlay networks. The overlay nodes are capable of implementing any dynamic routing policy, however, the legacy underlay has a fixed, single path routing scheme and uses a simple work-conserving forwarding policy. Moreover, the underlay routes are pre-determined and unknown to the overlay network. The overlay network can increase the achievable throughput of the legacy network by using multiple routes, which consist of direct routes and indirect routes through other overlay nodes. We develop an optimal dynamic routing algorithm for such overlay networks called the Optimal Overlay Routing Policy (OORP). OORP is derived using the classical dual subgradient descent method, and it can be implemented in a distributed manner. We show that the queue-lengths can be used as a substitute for the dual variables in the algorithm. However, the underlay queuelengths are unknown to the overlay, so we propose two regression based schemes that learn simplified models of the backlog in the underlay using historical data and use them to estimate the queue-lengths in real time. Simulation results show that nearoptimal performance can be achieved without any knowledge of the underlay.

In [3] we consider the problem of throughput-optimal packet dissemination, in the presence of an arbitrary mix of unicast, broadcast, multicast and anycast traffic in general wireless networks. We propose an online dynamic policy, called Universal Max-Weight (UMW), which solves the above problem efficiently. To the best of our knowledge, UMW is the first throughput-optimal algorithm of such versatility in the context of generalized network flow problems. Conceptually, the UMW policy is derived by relaxing the precedence constraints associated with multi-hop routing, and then solving a min-cost routing and max-weight scheduling problem on a virtual network of queues. When specialized to the unicast setting, the UMW policy yields a throughput-optimal cycle-free routing and link scheduling policy. This is in contrast to the well-known throughput-optimal Back- Pressure (BP) policy which allows for packet cycling, resulting in excessive delay. Extensive simulation results show that the proposed policy incurs a substantially lower delay as compared to the BP policy.

In [4] We consider a wireless system consisting of both legacy (uncontrollable) and new (controllable) nodes. The Legacy network is uncooperative and its users transmit on their channels whenever their buffers are nonempty. The new (controllable) nodes cannot directly observe the legacy network’s transmissions and must rely upon receiver feedback to avoid collisions. Our objective is to devise scheduling policies for the controllable network, so that its users may opportunistically transmit over the channels without degrading the legacy network’s throughput. To this end, we first analyze a network consisting of a single legacy and single new user communicating over one shared channel. We characterize the achievable throughput of the new user by formulating the problem as a partially observable Markov decision process and derive upper and lower bounds on the maximum performance. We then use these results to characterize the capacity region of larger networks. We formulate a queue length-based scheduling policy, Longest Queue First, that adaptively allocates channel resources to the secondary users and relate its performance to the capacity region.

In [5] we consider the problem of minimizing queue length regret in adversarial settings. Stochastic models have been dominant in network optimization theory for over two decades, due to their analytical tractability. However, these models fail to capture non-stationary or even adversarial network dynamics which are of increasing importance for modeling the behavior of networks under malicious attacks or characterizing short-term transient behavior. In this paper, we focus on minimizing queue length regret under adversarial network models, which measures the finite time queue length difference between a causal policy and an “oracle” that knows the future. Two adversarial network models are developed to characterize the adversary’s behavior. We provide lower bounds on queue length regret under these adversary models and analyze the performance of two control policies (i.e., the MaxWeight policy and the Tracking Algorithm). We further characterize the stability region under adversarial network models, and show that both the MaxWeight policy and the Tracking Algorithm are throughput-optimal even in adversarial settings.

In [6] we develop a new reinforcement learning method for overlay networks, where the dynamics of the underlay are unknown. In particular, we consider using model-based reinforcement learning (RL) to learn the optimal control policy of queueing networks so that the average job delay (or equivalently the average queue backlog) is minimized. Existing RL techniques, however, cannot handle the unbounded state spaces of the network control problem. To overcome this difficulty, we propose a new algorithm, called Piecewise Decaying -Greedy Reinforcement Learning (PDGRL), which applies model-based RL methods over a finite subset of the state space. We establish that the average queue backlog under PDGRL with an appropriately constructed subset can be arbitrarily close to the optimal result. We evaluate PDGRL in dynamic server allocation and routing problems. Simulations show that PDGRL minimizes average queue backlog effectively.

Related Publications:

[1] Nathaniel M. Jones, Georgios S. Paschos, Brooke Shrader, and Eytan Modiano, “An Overlay Architecture for Throughput Optimal Multipath Routing,” IEEE/ACM Transactions on Networking, 2018.

[2] Anurag Rai, Rahul Singh and Eytan Modiano, “A Distributed Algorithm for Throughput Optimal Routing in Overlay Networks,” IFIP Networking 2019, Warsaw, Poland, May 2019.

[3] Abhishek Sinha and Eytan Modiano, “Optimal Control for Generalized Network-Flow Problems,” IEEE Infocom, 2017.

[4] Thomas Stahlbuhk, Brooke Shrader, Eytan Modiano, “Throughput Maximization in Uncooperative Spectrum sharing networks,” IEEE International Symposium on Information Theory, 2016.

[5] Qingkai Liang and Eytan Modiano, “Minimizing Queue Length Regret Under Adversarial Network Models,” ACM Sigmetrics, 2018.

[6] Bai Liu, Qiaomin Xiey, and Eytan Modiano, “Reinforcement Learning for Optimal Control of Queueing Systems,” Allerton Conference on Communication, Control, and Computing, September 2019.