The cost of peer-to-peer distribution

Some ISPs such as PlusNet have expressed concerns about the impact that the BBC's iPlayer will have on their networks. Other ISPs such as Comcast have found themselves in trouble for aggressively throttling BitTorrent, the popular peer-to-peer application. In this note I will outline the problem.
  1. Central versus p2p distribution of streamed content
  2. The physical network, and its bottlenecks
  3. The logical network as seen by the IP layer
  4. The financial network
  5. The cost of streaming
  6. The cost of p2p
  7. Mitigating the cost of p2p

1. Central versus p2p distribution of streamed content

Broadly speaking we can classify content as streamed or stored. By streamed I mean content which is delivered in a steady stream to the user who views/uses it as it arrives. By stored I mean content which may be delivered out-of-order to the user, who views/uses it once he has received the entire file. For example, live video has to be streamed, whereas old TV programs can be either streamed for immediate viewing, or stored and watched afterwards, or (ideally) some hybrid.

Content may be distributed centrally or via p2p. By central I mean that each user receives content directly from a central server. By p2p I mean that most users receive content from other users. For example, a classic web server is for central distribution of stored content, and BitTorrent is for p2p distribution of stored content.

            central             p2p
stored      classic httpd       iPlayer download; BitTorrent
streamed    iPlayer streamed    RawFlow

This note is really about central versus p2p distribution for streamed content. However, much of it also applies to stored content, at least as far as identifying bottlenecks and constraints is concerned.

2. The physical network, and its bottlenecks

Most UK broadband users use ADSL. The typical datapath is shown in Figure 1. The points of contention, at which capacity is scarce and must be paid for, are (i) the ATM links from the DSLAMs at the local exchanges, and (ii) the transit link from the ISP to the wider Internet.
Figure 1. Architecture of BT's IPStream service
BT is moving to its 21st Century Network (21CN), and should be done around 2012. The overall datapath will be much as it is now, except that the first IP hop seen by the end-user is a little closer to the end-user. Also, the links from the DSLAM into the core network will run Ethernet, so they will likely be cheaper and have an easier upgrade path. It is likely that any network operator pursuing local-loop unbundling (i.e. running its own DSLAMs) will build/lease a network with similar backhaul capacity.
Figure 2. Architecture of BT's 21st Century Network
Cable broadband is illustrated in Figure 3. The head-end corresponds to a local exchange, and the backhaul from the head-end will face much the same issues as ADSL backhaul. There are two notable differences: (i) there is significant contention on the upstream link from end-user to CMTS, since the link is a shared medium, and the only point where policies may be applied is within the cable modems; (ii) traffic may be routed by an IP router at the head-end, whereas in DSL the first IP hop is deeper in the network.
Figure 3. Architecture of cable broadband network
There are plans for DSLAMs which will present an IP interface to the backhaul network, i.e. integrating the BRAS and the DSLAM to some extent. On one hand, integrated IP-DSLAMs will mean that IP-level traffic management can happen close to the end-user, which will make it easier to provide quality of service. On the other hand, IP-DSLAMs will not be able to do any clever routing such as BGP, since that would be unscalable.

3. The logical network as seen by the IP layer

The end-user application is only aware of IP-layer routers, i.e. everything at the PPP or ATM level is inaccessible to it.

Figure 4 shows the network as seen by the IP/BGP layer. It shows several networks, made up of IP routers. Each router belongs to some autonomous system (AS), e.g. an ISP. Within an AS, traffic routes are typically calculated by some interior protocol specific to that AS, e.g. OSPF. Routes between ASs are calculated by BGP. BGP is only concerned with AS-level routing, that is, all the routers within an AS will know rules like "If I want to forward a packet to destination IP address x, then it must first go to AS number y, and to reach y I must first forward it to the egress router z within my AS."

Figure 4. The network, as seen at the IP layer
BGP chooses routes based on transit and peering relationships between ASs. These relationships reflect financial contracts (see Section 4). BGP will prefer to use a peering link if possible. In the UK, a big ISP will probably peer with the BBC, and with other major ISPs. The main peering point is LINX in London.
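
The forwarding rule quoted above ("destination x goes via AS y, reached through egress router z"), together with the preference for peering links, can be sketched in a few lines of Python. This is only an illustration: the prefixes are documentation address ranges, the AS numbers are private-use values, and the router names are made up. The real BGP decision process has many more steps than the two-level preference used here.

```python
import ipaddress

# Hypothetical routing table for one AS. None of this is real routing data.
ROUTES = [
    # (destination prefix, next AS, relationship, egress router in our AS)
    ("192.0.2.0/24", 64512, "transit", "egress-a"),
    ("192.0.2.0/24", 64513, "peer", "egress-b"),
    ("198.51.100.0/24", 64514, "transit", "egress-a"),
]


def lookup(dest):
    """Return (next AS, egress router) for dest, or None if no route matches."""
    addr = ipaddress.ip_address(dest)
    matches = []
    for prefix, next_as, relationship, egress in ROUTES:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network, next_as, relationship, egress))
    if not matches:
        return None
    # Longest prefix wins; among equal prefixes, a peering link beats transit.
    best = max(matches, key=lambda m: (m[0].prefixlen, m[2] == "peer"))
    return best[1], best[3]
```

With this table, a packet for 192.0.2.7 has both a transit route and a peering route available, and the sketch picks the peering route.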

4. The financial network

5. The cost of streaming

Suppose a content provider is serving a peak of N end-users, each at rate r Mbps for one hour at peak hour. In the UK, ISPs have made very public their disquiet about BBC iPlayer, most of whose traffic is streaming. Presumably the last bullet point doesn't actually hold, at least not at current broadband tariffs. PlusNet says that it's their downstream costs, not their transit costs, which hurt.

6. The cost of p2p

a. Uprate and downrate per user for a simple p2p structure

Suppose there is a p2p network, seen as a tree rooted at the content provider, with fanout f, i.e. each end-user sends out up to f copies of every packet he receives. Then roughly N/(f-1) users will be transmitting at rate fr, and the others will not be transmitting at all. To even things out, suppose the traffic is sliced into t slices, each of rate r/t and each on its own independent distribution tree. Then each end-user transmits at rate Bin(t,1/(f-1))fr/t, which means that 97.5% of users are transmitting at less than or equal to rf/(f-1) + 2r(f/t)^(1/2). Of course, each user receives at rate r.
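
To get a feel for these numbers, here is a small Python sketch that evaluates the mean uprate rf/(f-1), the approximate bound rf/(f-1) + 2r(f/t)^(1/2), and the exact fraction of users at or below that bound under the Bin(t,1/(f-1)) model. The function name and the example values are my own choices for illustration.

```python
from math import comb, sqrt


def uprate_summary(r, f, t):
    """Return (mean uprate, approximate 97.5% bound, exact share of users at or below the bound)."""
    p = 1 / (f - 1)           # chance that a user forwards a given slice
    per_slice = f * r / t     # upload rate spent forwarding one slice to f children
    mean = t * p * per_slice  # equals r*f/(f-1)
    bound = mean + 2 * r * sqrt(f / t)
    # Exact binomial probability that a user's uprate does not exceed the bound.
    coverage = sum(
        comb(t, k) * p**k * (1 - p) ** (t - k)
        for k in range(t + 1)
        if k * per_slice <= bound
    )
    return mean, bound, coverage
```

For example, `uprate_summary(1.0, 4, 16)` describes a 1 Mbps stream with fanout 4 split into 16 slices. Increasing t tightens the bound towards the mean, which is the point of slicing.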

b. The bottleneck upstream of the DSLAM, for ADSL

Assume for now that all traffic has to go to the ISP central office before being rerouted. (We will consider other possibilities in Section 7.) The downstream rate to each user is r, exactly as for the streaming case. The average upstream rate from a user is rf/(f-1). By the nature of ADSL, this has to be much smaller than the downstream rate, by a factor of at least 10. The ATM links from DSLAMs, and the central pipes, are all symmetric. Therefore the upstream component should easily be sustainable at no extra cost. Presumably, backhaul upstream of a cable head-end costs a similar amount.

c. The bottleneck on the fibre, for cable

Any data transmitted from an end-user's cable modem needs to use the shared upstream fibre. This is a shared access medium (i.e. when one user sends, he blocks all other users), and there is a contention protocol for deciding who gets to send in which timeslot. What's more, when the offered load exceeds roughly 25% of capacity, the total throughput actually decreases. Long TCP connections are especially unfriendly, since TCP continually pushes for extra throughput. If the upstream path is congested, then the downstream path gets stifled too, because downstream traffic needs upstream ACKs. Cable was designed under the assumption that upstream traffic consists principally of short http requests, and p2p messes it up. Cable modems could in principle use DPI to throttle p2p traffic, but they do not, and it is infeasible to build such intelligence into widespread hardware.

According to Cisco, a sensible number of subscribers per upstream line is around 220, based on a peak demand of around 22 active users. For an upstream capacity of 3 Mbps, this means that each active user can upload at around 0.13 Mbps. This limits r, f, t as calculated above. The true economic cost of uploading is: for every extra user who switches from streaming to p2p, and who uploads at 0.13 Mbps at peak hour, the cable company needs to install one user's worth of extra capacity. Virgin (formerly NTL) charges £18 per month for their base cable package; if we take the infrastructure cost to be just £5 per user per month, i.e. roughly £37 per Mbps of upload per month, then the cost of uploading is roughly £37*Nrf/(f-1) per month.
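
The arithmetic in this paragraph can be written out as a short Python sketch. It takes the average uprate per user to be rf/(f-1) from part (a), and uses the figures quoted above (3 Mbps upstream, 22 active users at peak, £5 infrastructure cost per user per month) as defaults. The function name and the example inputs are assumptions for illustration only.

```python
def cable_upload_cost(n_users, r, f,
                      upstream_mbps=3.0, peak_active=22,
                      infra_cost_per_user=5.0):
    """Return (Mbps per active user, pounds per Mbps per month, total pounds per month)."""
    per_user_mbps = upstream_mbps / peak_active          # about 0.136 Mbps
    cost_per_mbps = infra_cost_per_user / per_user_mbps  # about 37 pounds per Mbps
    total_upload_mbps = n_users * r * f / (f - 1)        # N times the average uprate
    return per_user_mbps, cost_per_mbps, cost_per_mbps * total_upload_mbps
```

For instance, `cable_upload_cost(1000, 0.5, 4)` gives the monthly cost for 1,000 cable users sharing a 0.5 Mbps stream with fanout 4.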

d. Transit costs

A p2p client might choose its peers anywhere. Assuming that peers are chosen at random, among UK end-users, and given this distribution of ISP sizes, measured in millions of subscribers

ISP                   subscribers (millions)   share
BT                    4.25                     27%
Virgin                3.70                     23%
Carphone Warehouse    2.70                     17%
Pipex                 2.00                     12%
Sky                   1.20                     7%
Orange                1.14                     7%
Kingston              0.20                     1%
THUS                  0.13                     0.8%
O2/Be                 0.10                     0.6%
Entanet               0.09                     0.6%
other                 <0.5                     <3%

the chance that a peer is in the same ISP is roughly 20%. But remember: major UK ISPs peer with each other, so transit costs are unlikely to be very large for content served to a UK audience. Maybe at most 5% of p2p links will be uploading via a transit link. International content distribution networks could well involve more transit, if peers are chosen at random, but it should be easy to mitigate this cost by intelligent selection of peers.
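
The "roughly 20%" figure is just the sum of squared market shares: if both ends of a p2p link are picked at random, the chance that both land in ISP i is (share of i)^2. A minimal Python check, using the subscriber figures listed above and assuming that the "other" category is made up of many small ISPs whose own contribution is negligible:

```python
# Subscribers in millions, copied from the figures above.
SUBSCRIBERS = {
    "BT": 4.25, "Virgin": 3.70, "Carphone Warehouse": 2.70, "Pipex": 2.00,
    "Sky": 1.20, "Orange": 1.14, "Kingston": 0.20, "THUS": 0.13,
    "O2/Be": 0.10, "Entanet": 0.09,
}
OTHER = 0.5  # upper bound quoted for "other"; assumed to be many tiny ISPs


def same_isp_probability(subscribers, other=0.0):
    """Chance that two randomly chosen end-users belong to the same named ISP."""
    total = sum(subscribers.values()) + other
    return sum((s / total) ** 2 for s in subscribers.values())
```

Calling `same_isp_probability(SUBSCRIBERS, OTHER)` gives a value a little under 0.2, in line with the estimate in the text.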

e. Summary, compared to streaming

7. Mitigating the cost of p2p

a. Two ground rules

b. Some existing solutions

Sandvine claims that the only sensible way to mitigate the cost of p2p is to limit the number of upstream flows that a user generates [pdf whitepaper]. Their technology sits in the middle of the network and interferes with BitTorrent. Comcast apparently used Sandvine's technology.

A project called P4P [news, paper] aims to achieve similar ends, but assuming cooperation between the end-user application and the network. They suggest that each network could run an iTracker, and end-users could query this device to learn about preferred routes. They found it could reduce IP hop count. Personally, I think P4P misses the point. It can save transit costs, but I argued above that transit costs are a small component. It can't possibly cut the upstream costs for cable broadband, since those costs are incurred whatever route the traffic takes.

c. Choosing peers more effectively

I will suppose that the content server has the job of passing new end-users on to peers.

For cable broadband, the only solution is to prevent end-users from uploading. They could download from end-users on ADSL (a bit unfair!), or they could download from a proxy run by the ISP, or they could stream the traffic from the content-server. I don't know how to tell if a given computer is connected via a cable modem.

Transit costs could easily be reduced, if we made sure that a pair of end-users only peers when their ISPs peer. It is likely that major national ISPs all peer. So, if the content server knows the IP addresses for major ISPs, or even if it can identify the country in which an IP address is located, it can easily mitigate most of the transit costs. This is much the same as what P4P tries to do, but centralized, less general, and simpler.
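
As a rough sketch of what the content server might do, the Python below maps each end-user's IP address to an ISP and only offers peers whose ISP is the same or is known to peer with the new user's ISP. Everything in the tables is hypothetical: the ISP names, the address blocks (documentation ranges) and the peering set are placeholders, and a real server would need real prefix and peering data.

```python
import ipaddress

# Hypothetical address blocks and peering agreements, for illustration only.
ISP_PREFIXES = {
    "isp-a": ["192.0.2.0/24"],
    "isp-b": ["198.51.100.0/24"],
    "isp-c": ["203.0.113.0/24"],
}
PEERINGS = {frozenset(("isp-a", "isp-b"))}


def isp_of(ip):
    """Return the ISP whose address block contains ip, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for isp, prefixes in ISP_PREFIXES.items():
        if any(addr in ipaddress.ip_network(p) for p in prefixes):
            return isp
    return None


def avoids_transit(ip1, ip2):
    """True if both users are in the same ISP or in two ISPs that peer."""
    a, b = isp_of(ip1), isp_of(ip2)
    if a is None or b is None:
        return False
    return a == b or frozenset((a, b)) in PEERINGS


def candidate_peers(new_user, existing_users):
    """Existing users that the new user can reach without a transit link."""
    return [u for u in existing_users if avoids_transit(new_user, u)]
```

Users whose ISP cannot be identified are treated conservatively here and never offered as peers; a server could instead fall back to a country-level lookup, as suggested above.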

Between the first IP hop and the backbone is another place where costs might be mitigated, assuming that this is indeed a bottleneck. This is not the case for ADSL in the UK today, nor is it the case for BT's 21CN. Other network operators seem to be planning IP-DSLAMs, and cable broadband has IP at the head-end, so there may be some potential for savings here.

How much potential for savings? It is highly unlikely that such IP-DSLAMs will be fully-fledged routers running BGP, which means they will not know how to route packets from one ISP to another within the same local exchange. (For cable broadband in the US, each head-end belongs entirely to one ISP anyway.) So, consider an ISP who has a fraction m of the market, suppose the total number of end-users receiving content is N, and suppose there are E local exchanges. Then the expected number of local exchanges to which this ISP is serving data, assuming its mN end-users are spread uniformly at random over the exchanges, is E[1-(1-1/E)^(mN)]. For m=20% and E=5000, if we could reduce the cost from mN to the number of local exchanges then the cost savings would be

N         1,000   5,000   10,000   50,000   100,000   500,000
savings   2%      9%      18%      57%      75%       95%
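
The savings figures follow directly from the formula: the cost drops from mN (one copy per end-user) to E[1-(1-1/E)^(mN)] (one copy per occupied exchange). A short Python sketch that recomputes them, with m=20% and E=5000 as defaults and assuming end-users are spread uniformly at random over exchanges:

```python
def exchange_savings(n_users, market_share=0.2, exchanges=5000):
    """Fraction of backhaul cost saved by sending one copy per occupied exchange."""
    served = market_share * n_users  # mN end-users of this ISP
    occupied = exchanges * (1 - (1 - 1 / exchanges) ** served)
    return 1 - occupied / served


for n in (1_000, 5_000, 10_000, 50_000, 100_000, 500_000):
    print(f"N={n:>7,}  savings={exchange_savings(n):.0%}")
```

The savings only become substantial once mN is comparable to E, i.e. once a typical exchange has several viewers on the same ISP.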

How plausible is it that these savings could be realized? It would require that the content server (who, we assume, is directing end-users in their choice of peer) should know when two end-users have the same first IP hop. This does not seem unreasonable. End-users can after all run traceroute to learn their full route to the server. The only cooperation needed from the ISP is that they should allow the first IP hop to respond to traceroute queries.
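
A minimal sketch of how the content server might use such traceroute reports, assuming each end-user sends it the list of IP hops on the way to the server, nearest hop first. The report format and the function names are assumptions made for this example.

```python
from collections import defaultdict


def group_by_first_hop(reports):
    """Group user ids by first IP hop. reports: iterable of (user_id, hops)."""
    groups = defaultdict(list)
    for user_id, hops in reports:
        if hops:  # skip users whose first hop did not answer traceroute
            groups[hops[0]].append(user_id)
    return dict(groups)


def local_peers(user_id, reports):
    """Other users sharing this user's first IP hop, or [] if none are known."""
    for members in group_by_first_hop(reports).values():
        if user_id in members:
            return [m for m in members if m != user_id]
    return []
```

A user whose first hop does not respond simply gets no local peers under this sketch, which is why the ISP's cooperation on traceroute matters.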