July 31, 2015

Multipath TCP
Internet-Draft
Intended status: Standards Track
Expires: July 31, 2015
A. Walid
Bell Labs
Q. Peng
Caltech
J. Hwang
Bell Labs
S. Low
Caltech
January 27, 2015
Balanced Linked Adaptation Congestion Control Algorithm for MPTCP
draft-walid-mptcp-congestion-control-02
Abstract
This document describes the mechanism of Balia, the "Balanced linked
adaptation", which is a congestion control algorithm for Multipath
TCP (MPTCP). The recent proposals, LIA and OLIA, suffer from either
unfriendliness to Single Path TCP (SPTCP) or unresponsiveness to
network changes under certain conditions. The tradeoff between
friendliness and responsiveness is inevitable, but Balia judiciously
balances this tradeoff based on a new design framework that allows
one to systematically explore the design space. Balia has been
implemented in the Linux kernel and also included in the UCLouvain’s
MPTCP implementation.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current InternetDrafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on July 31, 2015.
Walid, et al.
Expires July 31, 2015
[Page 1]
Internet-Draft
MPTCP Congestion Control
January 2015
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust’s Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1.
Introduction . . . . . . . . . . . .
1.1. Requirements Language . . . . . .
1.2. Terminology . . . . . . . . . . .
2. Balanced Linked Adaptation Algorithm
3. Theoretical justification . . . . . .
4. Implementation considerations . . . .
5. Experimental results . . . . . . . .
5.1. Khalili’s scenario . . . . . . .
5.2. Responsiveness . . . . . . . . .
5.3. NorNet experiment . . . . . . . .
6. Conclusion . . . . . . . . . . . . .
7. References . . . . . . . . . . . . .
7.1. Normative References . . . . . .
7.2. Informative References . . . . .
Authors’ Addresses . . . . . . . . . . .
1.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
2
3
3
4
4
5
6
6
8
9
9
10
10
10
11
Introduction
Various congestion control algorithms have been proposed as
extensions of TCP NewReno to MPTCP. A straightforward extension is
to run TCP NewReno on each subpath, e.g., [HONDA09]. This algorithm,
however, can be highly unfriendly when it shares a path with a SPTCP
user. This motivates the Coupled algorithm which is fair because it
has the same underlying utility function as TCP NewReno, e.g.,
[KELLY05], [HAN04]. It is found in [RFC6356], however, that the
Coupled algorithm responds slowly in a dynamic network environment.
The current default congestion control algorithm for MPTCP, called
LIA (Linked-Increases Algorithm), is more responsive than the Coupled
algorithm. However, it has been reported that LIA can sometimes be
excessively aggressive toward SPTCP users without any benefit to
Walid, et al.
Expires July 31, 2015
[Page 2]
Internet-Draft
MPTCP Congestion Control
January 2015
multipath users [KHALILI12]. Recently, OLIA (Opportunistic LinkedIncreases Algorithm) [KHALILI12] was proposed as a variant of Coupled
algorithm [KELLY05] which is as friendly as the Coupled algorithm.
We have found, however, that OLIA can be unresponsive to changes in
network conditions in some scenarios (e.g., when the paths used by a
user have similar round trip times (RTTs)) [PENG14].
In this draft, we introduce Balia, the "Balanced linked adaptation",
which is a window-based congestion control algorithm for MPTCP. The
main design goal of Balia is to systematically tradeoff different
properties such as TCP friendliness and responsiveness by developing
structural understanding of MPTCP algorithms in a new design
framework. For instance, it is widely suspected that there is a
tradeoff between friendliness and responsiveness and it is proved in
this framework that this tradeoff is indeed inevitable. By
parameterizing different structural properties, Balia generalizes
existing algorithms and explicitly balances the tradeoff. We also
prove mathematically that Balia has a unique equilibrium point, and
that it is asymptotically stable. Therefore, Balia can provide
balanced performance in terms of friendliness and responsiveness.
In [PENG14], we compare the performance of several MPTCP algorithms
over a testbed, including Balia, OLIA and LIA. Our experimental
results show that Balia is friendlier than LIA and more responsive
than OLIA. It also solves LIA’s problem identified by [KHALILI12].
Balia has been implemented in the Linux kernel and also included in
the UCLouvain’s MPTCP implementation.
1.1.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
1.2.
Terminology
Regular/Singlepath TCP (SPTCP): The standard version of TCP [RFC5681]
that uses a single pair of IP address and ports per connection.
Multipath TCP (MPTCP): A modified version of the regular TCP that
simultaneously uses multiple paths between hosts.
LIA: The Linked-Increases Algorithm for MPTCP [RFC6356].
OLIA: The Opportunistic Linked-Increases Algorithm for MPTCP
[KHALILI12].
Balia: The Balanced linked adaptation algorithm for MPTCP [PENG14].
Walid, et al.
Expires July 31, 2015
[Page 3]
Internet-Draft
MPTCP Congestion Control
January 2015
AIMD: The Additive Increase Multiplicative Decrease algorithm used in
TCP congestion avoidance.
w_r: The congestion window on a path r.
rtt_r: The Round-Trip Time (RTT) on a path r.
2.
Balanced Linked Adaptation Algorithm
Balia is a generalized MPTCP algorithm that strikes a good balance
between friendliness and responsiveness. The algorithm only applies
to the AIMD part of the congestion avoidance phase. The other parts
such as slow start, fast retransmit/recovery algorithms are the same
as in TCP [RFC5681]. The minimum ssthresh is set to 1 MSS instead of
2 when more than 1 path is available.
Each
be a
Each
trip
source s has a set of paths r. As a special case, the set can
singleton in which case Balia reduces to TCP Reno (see below).
path r maintains a congestion window w_r and measures its roundtime rtt_r. The window adaptation of Balia is as follows:
- For each ACK on path r, increase w_r by:
x_r
1 + alpha_r
4 + alpha_r
-------------------- * ( ----------- ) * ( ----------- )
rtt_r * (SUM(x_k))^2
2
5
- For each packet loss on path r, decrease w_r by:
w_r
----2
*
min { alpha_r,
1.5 }
where x_r = w_r / rtt_r and alpha_r = max { x_k } / x_r.
Note that Balia’s decrement algorithm multiplies the MD algorithm of
TCP Reno by a factor in the range of [1, 1.5].
If a Balia user uses only a single path, then alpha_r = 1, in which
case both the increment and the decrement algorithms of Balia reduce
to those of TCP Reno. Hence Balia reduces to TCP Reno on single
paths.
3.
Theoretical justification
In [PENG14], we have developed a unified model of MPTCP algorithms
and characterized the design space. This provides a framework to
systematically design MPTCP algorithms and analyze their behavior in
Walid, et al.
Expires July 31, 2015
[Page 4]
Internet-Draft
MPTCP Congestion Control
January 2015
a large network at design time. For instance, it has allowed us to
identify designs that guarantee the existence, uniqueness and
stability of network equilibrium. Balia is designed using this
framework to achieve specific design goals. In this section, we will
focus on how Balia balances three often conflicting design goals, TCP
friendliness, responsiveness and window oscillation.
TCP friendliness characterizes how much more throughput a MPTCP flow
will get when it competes with an SPTCP flow. A MPTCP flow is said
to be "TCP friendly" if it does not dominate the available bandwidth
when it shares the same network with a SPTCP flow.
Responsiveness characterizes how fast the MPTCP algorithm reacts to
changes in network conditions.
The window oscillation property characterizes how severely the window
size fluctuates around the equilibrium point. It is an inherent
property of AIMD-like algorithms.
In [PENG14], it is proved mathematically that there is an inevitable
tradeoff between TCP friendliness and responsiveness, and between
responsiveness and window oscillation. Thus, it is theoretically
impossible to maximize the performance in all three metrics
simultaneously.
Our design philosophy is to allow window oscillation up to an
acceptable level in order to improve both friendliness and
responsiveness. This is achieved by explicitly parameterizing these
properties and systematically choosing these parameters.
4.
Implementation considerations
To enable Balia to operate in a wide spectrum of applications
scenarios, i.e., with wide range of w_r and rtt_r, we need to rewrite the Balia’s additive increase (AI) formula in an equivalent
form which allows easier implementation in the Linux kernel with
fixed point operations, and avoids integer-overflow problems. Note
that in an extreme case, the sending rate on a path, x_r, may
increase to 2^30 (w_r/rtt_r) or more. In such a case, the term
(SUM(x_k))^2 in the current formula can easily cause 64-bit integer
overflow. In addition, there can be also a significant rounding
error when we do a fixed-point division by a large number.
Therefore, to mitigate the above issues, we rewrite Balia’s additive
increase (AI) formula as follows:
Walid, et al.
Expires July 31, 2015
[Page 5]
Internet-Draft
MPTCP Congestion Control
January 2015
x_r
x_r + max{x_r}
4 * x_r + max{x_r}
-------------------- * ( -------------- ) * ( ------------------ )
rtt_r * (SUM(x_k))^2
2 * x_r
5 * x_r
which can be simplified to:
(x_r + max{x_k}) * (4 * x_r + max{x_k})
--------------------------------------w_r * (SUM(x_k))^2 * 10
Now the term x_r is computed in bytes/sec and may increase up to
about 2^52. Thus we need to scale the sending rate through bit-shift
operations so that max{x_k} does not exceed a cerntain value, e.g.,
2^25, to safely calculate the term (SUM(x_k))^2. At this point, if
max{x_k} is a very large number, a small x_r can be sometimes 0 after
the right shift operation. This may hurt the accuracy of the
calculation. But we have seen that the overall rounding error is not
significant since max{x_k} is the dominant term in the formula while
x_r would be neglible in such cases.
5.
Experimental results
In this section, we summarize our experimental results that
illustrate the weaknesses of the current algorithms (LIA and OLIA).
We evaluate the MPTCP algorithms using the UCLouvain’s MPTCP
implementation [MPLKI]. The network parameters such as network
bandwidth and one-way delay are implemented by Dummynet [DUMMYNET].
Iperf is used to generate traffic and measure the throughput.
5.1.
Khalili’s scenario
In [KHALILI12], it has been revealed that LIA can be unfriendly to
SPTCP users even when its own MPTCP throughput is saturated. That
is, the throughputs of SPTCP flows are significantly degraded without
any benefit to MPTCP flows. To reproduce this scenario, we create a
testbed as shown in Figure 1. In this scenario, N1 type1 users can
be either single-path or multipath while N2 type2 users are always
single-path.
Walid, et al.
Expires July 31, 2015
[Page 6]
Internet-Draft
N1
Type1
flows
N2
Type2
flows
MPTCP Congestion Control
January 2015
+----+
|
|
Server
---------------------------------| C1 |-- for type1
--\
/--|
|
flows
\
/
+----+
\
/
Router1
\
+----+
+----+
/
\--|
|
|
|--/
| C2 |---|
|
Server
---------|
|
|
|------------------ for type2
+----+
+----+
flows
Router2 Router3
Figure 1: Testbed topology for Khalili’s scenario. The Router1
emulates the server-side bottleneck for type1 users and the Router2
emulates the shared bottleneck.
The aggregate throughputs of these users are shown in Table 1 for the
case when all users are SPTCP and the case when all type1 users are
upgraded to MPTCP users using different algorithms. We observe that
upgrading type1 users to MPTCP decreases type2 users’ throughput
without any benefit to type1 users if LIA is used; the type2 users
are worse off by 19% when N1=N2=5 and by 25% when N1=15 and N2=5.
Both OLIA and Balia are more friendly than LIA to SPTCP (type2)
users.
C1=C2=10Mbps
+-------------+----------------------------------+
| Type1 users |
Type1 users are multipath
|
|
are
+-----------+----------+-----------+
| single-path |
LIA
|
OLIA
|
Balia
|
+------+-------------+-----------+----------+-----------+
N1=5 |type1 |
9.47
|
9.26
|
9.25
|
9.25
|
+------+-------------+-----------+----------+-----------+
N2=5 |type2 |
9.29
|
7.55
|
8.13
|
8.32
|
------+------+-------------+-----------+----------+-----------+
N1=15 |type1 |
9.39
|
8.96
|
8.93
|
9.02
|
+------+-------------+-----------+----------+-----------+
N2=5 |type2 |
9.29
|
6.94
|
7.41
|
7.98
|
+------+-------------+-----------+----------+-----------+
Values are in Mbps.
Table 1: Throughput obtained by type1 and type2 users: Upgrading
type1 users to MPTCP decreases type2 users’ throughput without any
benefit to type1 users.
Walid, et al.
Expires July 31, 2015
[Page 7]
Internet-Draft
5.2.
MPTCP Congestion Control
January 2015
Responsiveness
To demonstrate the dynamic performance of MPTCP algorithms, we
implement a testbed topology as shown in Figure 2. One-way delay of
each single-path is about 10ms. In this scenario, a MPTCP flow is
long lived while 5 SPTCP flows start at 40s and end at 80s. Table 2
shows the convergence time, which is defined as the first time the
congestion window on the second path via Router2 reaches the average
congestion window after the SPTCP users have left.
Router1
Router3
+-------+ 20Mbps,10ms +-------+ 40Mbps
1 MPTCP ----------|
|-------------|
|-------- Server
flow --\
|
|
/-----|
|
(0-200s)
\
+-------+
/
+-------+
\
/
\
20Mbps,10ms
\
+-------+
/
\--|
|--/
5 SPTCP ----------|
|
flows
+-------+
(40-80s)
Router2
Figure 2: Testbed topology for the responsiveness scenario.
+------------------+-----------+----------+-----------+-----------+
|
| Coupled |
OLIA
|
LIA
|
Balia
|
+------------------+-----------+----------+-----------+-----------+
| Convergence time |
94.36
|
58.5
|
17.75
|
14.73
|
+------------------+-----------+----------+-----------+-----------+
Values are in seconds.
Table 2: Responsiveness: Convergence time of MPTCP user after SPTCP
users have left the network.
We observe that in this scenario Balia and LIA are quite responsive
while both Coupled and OLIA algorithms take an excessively long time
to recover. Note that in this scenario, the increment/decrement
algorithms of Coupled and those of OLIA are similar, and therefore
they behave in a similar way. For both algorithms, the excessively
slow recovery of the congestion window on the second path is due to
the design that increases the window roughly by w_r / (SUM(w_k))^2 on
each ACK assuming the RTTs are similar. After the SPTCP users have
left, w_2 is small while w_1 is large, so that w_2 / (w_1 + w_2)^2 is
very small. It therefore takes a long time for w_2 to increase to
its steady state value. In general, under the Coupled algorithm, a
route with a large throughput can greatly suppress the throughput on
another route even though the other route is underutilized.
Walid, et al.
Expires July 31, 2015
[Page 8]
Internet-Draft
5.3.
MPTCP Congestion Control
January 2015
NorNet experiment
To show that Balia works well on real Internet environments, we
create two virtual machine hosts A and B over the NorNet Core, a
country-wide (Norway) multi-homed research testbed [NORNET]. Host A
is connected to Internet via ISP(Internet service provider)-1 while
host B is connected via ISP-2 and ISP-3.
Considering a scenario where host B downloads a file from host A via
two interfaces, we measure the througputs of both SPTCP and MPTCP
with Reno and Balia respectively, as shown in Table 3. There are two
logical paths between the hosts, (ISP-1 to ISP-2) and (ISP-1 to ISP3), so we measure the bandwidth of each single-path with SPTCP and
both of the two paths with MPTCP. The measurement is repeated 30
times for each case. In Table 3, it is observed that MPTCP with
Balia aggregates the bandwidths of the two paths well.
+------------------+-------------+-------------+--------------+
|
| SPTCP(Reno) | SPTCP(Reno) | MPTCP(Balia) |
|
| A(1)->B(2) | A(1)->B(3) | A(1)->B(2,3) |
+------------------+-------------+-------------+--------------+
| Avg. throughput |
3.976
|
3.823
|
7.508
|
+------------------+-------------+-------------+--------------+
| Max. throughput |
4.08
|
3.83
|
7.69
|
+------------------+-------------+-------------+--------------+
| Min. throughput |
3.93
|
3.82
|
7.2
|
+------------------+-------------+-------------+--------------+
Values are in Mbps.
Table 3: Throughputs of SPTCP and MPTCP over the NorNet Core.
Numbers in parenthesis refer to the ISP number.
6.
Conclusion
In [PENG14], we have developed a model for MPTCP and identified
designs that guarantee the existence, uniqueness and stability of the
network equilibrium. We also characterize the design space and study
the tradeoff among TCP friendliness, responsiveness, and window
oscillation. Base on better understanding of the design space, our
new congestion control algorithm for MPTCP, Balia, generalizes prior
algorithms and strikes a good balance between friendliness and
responsiveness. Balia has been implemented in the Linux kernel and
tested in various scenarios.
Walid, et al.
Expires July 31, 2015
[Page 9]
Internet-Draft
7.
MPTCP Congestion Control
January 2015
References
7.1.
Normative References
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC5681]
Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, September 2009.
[RFC6356]
Raiciu, C., Handley, M., and D. Wischik, "Coupled
Congestion Control for Multipath Transport Protocols", RFC
6356, October 2011.
7.2.
Informative References
[HONDA09]
Honda, M., Nishida, Y., Eggert, L., Sarolahti, P., and H.
Tokuda, "Multipath congestion control for shared
bottleneck", PFLDNeT Workshop, 2009.
[KELLY05]
Kelly, F. and T. Voice, "Stability of end-to-end
algorithms for joint routing and rate control", ACM
SIGCOMM Computer Communication Review, vol. 35, no. 2, pp.
5-12, 2005.
[HAN04]
Han, H., Shakkottai, S., Hollot, C., Srikant, R., and D.
Towsley, "Overlay tcp for multi-path routing and
congestion control", IMA Workshop on Measurements and
Modeling of the Internet, 2004.
[KHALILI12]
Khalili, R., Gast, N., Popovic, M., Upadhyay, U., and J.
Le Boudec, "MPTCP is not Pareto-optimality: Performance
issues and a possible solution", ACM CoNext, 2012.
[MPLKI]
UCL, Louvain-la-Neuve, Belgium, "MultiPath TCP-Linux
kernel implementation", 2014, <http://multipath-tcp.org/>.
[PENG14]
Peng, Q., Walid, A., Hwang, J., and S. Low, "Multipath
TCP: Analysis, Design and Implementation", 2014,
<http://arxiv.org/abs/1308.3119v2>.
[DUMMYNET]
Carbone, M. and L. Rizzo, "Dummynet revisited", ACM
SIGCOMM Computer Communication Review, vol. 40, no. 2, pp.
12-20, 2010.
Walid, et al.
Expires July 31, 2015
[Page 10]
Internet-Draft
[NORNET]
MPTCP Congestion Control
January 2015
Gran, E., Dreibholz, T., and A. Kvalbein, "NorNet Core - A
multi-homed research testbed", Elsevier Computer Networks,
vol. 61, pp. 75-87, 2014.
Authors’ Addresses
Anwar Walid
Bell Labs
600 Mountain Ave
New Providence, NJ, USA
Email: [email protected]
Qiuyu Peng
Caltech
Department of Electrical Engineering
Pasadena, CA, USA
Email: [email protected]
Jaehyun Hwang
Bell Labs
Seoul, Republic of Korea
Email: [email protected]
Steven H. Low
Caltech
Department of Computing + Mathematical Sciences
Department of Electrical Engineering
Pasadena, CA, USA
Email: [email protected]
Walid, et al.
Expires July 31, 2015
[Page 11]