QoS Requirements for Internet2 (draft)

Ben Teitelbaum and Ted Hanss

Abstract

The purpose of this document is to specify the quality-of-service (QoS) requirements for Internet2 and to describe the dimensions of the solution space. This document is a product of the Internet2 QoS Working Group and is made available to the Internet2 community at large to encourage feedback on the working group's direction. The expected audience includes campus network planners, gigapop operators, application developers, vendor partners, carriers, and the networking research and standards communities in general.

This document is organized as follows: Section 1 discusses the goals of Internet2, the need for QoS functionality, and requirements as dictated by the advanced applications that Internet2 aspires to support; Section 2 discusses the design space of possible QoS approaches; Section 3 provides a detailed analysis of operational requirements and engineering imperatives; and finally, Section 4 explains the short-term requirement for exploring a small number of QoS approaches in a test bed.

1. Problem Statement

The principal goal of Internet2 is to enable advanced network applications that will help universities fulfill their research and education missions in the next decade. Consider the highly-specialized researcher who is physically separated from research peers at other institutions, or the graduate student who needs to control a remote scientific instrument and reliably retrieve experiment data, or the working parent trying to gain the skills to move into a second career through distance learning classes. All would benefit from the availability of highly interactive networked applications that bridge the distance between researchers, students, instruments, and data.

Specific application components and attributes for Internet2-style environments include:

  • two-way interactive audio/video
  • collaborative virtual environments
  • video and audio streaming at high fidelity
  • remote control of devices (e.g., telescopes and microscopes)
  • large data transfers

Unfortunately, many valuable networked applications are simply not feasible today because the current internet is unable to provide them with the performance they require. Many of these applications rely on the ability to exchange rich content or interact across the network in close to real time and, to function correctly, require certain minimum bandwidth and maximum latencies from the network. Some classes of applications require further assurances from the network, such as bounds on jitter (variability in delay).

"Quality of service" (QoS) is an often overloaded term that is used here to refer to both the performance of a network relative to application needs and the set of technologies that enable a network to make performance assurances. In a shared networking environment, QoS is necessarily about resource reservation.

Internet2 researchers are looking for end-to-end, wide-area QoS functionality for inter-institutional links running across separately managed network clouds. Any approach to the end-to-end, wide-area problem should also work for connections within a campus network.

1.1 Bandwidth and QoS

Users and developers are likely to desire QoS even as backbone network capacities increase rapidly. Since QoS needs are end-to-end, bandwidth can only eliminate the need for QoS if increasing bandwidth eliminates all congestion. The assertion that bandwidth will become so cheap and so plentiful as to remove all congestion from the more than 150 networks that comprise Internet2 should be regarded with appropriate skepticism.

Another contributing factor to continuing congestion will likely be ever more demanding applications. There is an important reciprocal relationship between applications, whose existence and popularity motivate improvements in the network, and the network itself, whose advance enables and inspires new applications. While it is impossible to foresee exactly what applications may evolve in the future, it is safe to assume that ever faster networks will foster ever more demanding applications.

Figure 1. The relationship between application and network development: new applications motivate network advances, which in turn enable further classes of application

Thus, the truism: bandwidth does not eliminate the need for QoS. The converse also holds: QoS does not create bandwidth. QoS should be seen as a means for reserving and allocating existing bandwidth, not as a substitute for adequate capacity planning.

1.2 QoS -- a new model for internet service

Once a network has made resource commitments, it may be unable to make further commitments. Thus, to the user, a computer network with QoS functionality begins to resemble the telephone system. There is first some sort of call setup, where the user attempts to initiate a connection. Then, assuming that the call goes through, the user is assured a clear channel on which to communicate (at least as long as there are no equipment failures). Alternatively, at call setup time, the user may instead get a busy signal and be denied the privilege of connecting at the desired level of QoS.
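The call-setup analogy can be made concrete with a minimal admission-control sketch. The class name `Link`, its methods, and the capacity figures are hypothetical and chosen only for illustration. The sketch assumes a single link with a fixed pool of reservable bandwidth.

```python
class Link:
    """A single link with a fixed pool of reservable bandwidth (hypothetical model)."""

    def __init__(self, reservable_mbps):
        self.reservable_mbps = reservable_mbps
        self.committed_mbps = 0.0

    def request(self, mbps):
        """Return True if the reservation is admitted, False for a 'busy signal'."""
        if self.committed_mbps + mbps > self.reservable_mbps:
            return False  # busy signal: existing commitments are left untouched
        self.committed_mbps += mbps
        return True

    def release(self, mbps):
        """Return bandwidth to the pool when a call ends."""
        self.committed_mbps = max(0.0, self.committed_mbps - mbps)


link = Link(reservable_mbps=10.0)
first = link.request(6.0)    # admitted
second = link.request(6.0)   # denied: only 4 Mbps remain
link.release(6.0)            # the first call ends
third = link.request(6.0)    # admitted now that capacity is free again
```

Note that the denied request does not degrade the admitted one, which is the essential difference from best-effort sharing.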

This service model is radically different from the "best-effort" service model on which the Internet has built its success. In today's internet, each network element along an IP packet's path makes nothing more than a good faith effort to forward the packet towards its destination. If a queue is overloaded, packets are dropped with little or no distinction between unimportant traffic and urgent traffic. The primary internet transport protocols (most importantly TCP) have been very carefully designed and tuned over the past 25 years to support a service model of graceful degradation, where connections are essentially never denied and every connection's performance simply degrades as the network load increases.

The service model for which Internet2 strives is neither strictly the circuit switched model of the telephone system, nor the best-efforts model of the current Internet, but rather an integrated approach where QoS traffic coexists with best-effort traffic. The primary design goal behind this approach is to share network resources in such a way that one can simultaneously achieve the benefits of circuit-switched networks (performance guarantees) and the benefits of best-efforts networks (low cost due to resource sharing).

Providing a production QoS service will not be easy. Both users and network engineers should expect an incremental approach, where features, stability, and scope improve over time, but where there exists a clear road map laying out the overall QoS direction. Additionally, certain key applications should derive value from deployed QoS features from the start. To push subsequent iterations and improvements of the Internet2 QoS service forward, users, developers, administrators, and network engineers need to engage in an ongoing dialogue and exchange. Only through such iteration and improvement can we advance the network services environment.

2. Understanding the QoS Design Space

At a very macroscopic level, the design space of possible QoS approaches may be thought of as having three dimensions: scope, control model, and transmission guarantee. This model was developed in a recent IETF Internet Draft and is reproduced here with some embellishment.

2.1 Scope

The scope defines the boundaries of the QoS service. For example, an end-to-end scope is accessible to applications on end systems. An example of end-to-end scope is an RSVP reservation between hosts to deliver a pre-specified level of QoS. An intermediate scope service, on the other hand, does not necessarily allow end systems access to the service interface. An example of intermediate scope would be a gigapop-to-gigapop ATM CBR PVC.

2.2 Control Model

A QoS Control Model describes the granularity, duration, and locus of control of QoS requests. For example, QoS requests may be made from either endpoints or intermediate locations (proxy control). In addition, the effects of such requests may vary in both granularity and duration. The granularity of a request may extend from a single flow between hosts to site level aggregation, while the duration may extend from less than the lifetime of a single flow to several months.

2.3 Transmission Guarantee

A transmission guarantee is characterized by a granularity, a set of transmission parameters, and corresponding assurances about what the network will offer with respect to each. Like the control model, the granularity of a transmission guarantee may range from a single flow between hosts to site level aggregation.

Transmission parameters describe the definable and configurable metrics of a QoS model. Typical parameters are packet loss rate, bandwidth, delay average and variation, reliability, and maximum transmission unit (MTU).

Transmission assurances are themselves characterized by two factors: 1) frame of reference; and 2) rigidity. The frame of reference characterizes whether the assurance is relative to those given to other types of flows or can be made in an absolute fashion, independent of other traffic. The rigidity of an assurance relates to whether it must be characterized in a soft or probabilistic way (e.g. "in the limit, no more than 5% of your packets will be delayed more than 30 msec") or whether it may be characterized by a hard limit (e.g. "barring a network failure, your packets will never be delayed more than 30 msec").
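The difference between a soft and a hard assurance can be illustrated by checking a set of measured packet delays against each. This is a minimal sketch: the function names and the sample data are hypothetical, and a finite sample can only approximate the "in the limit" wording of a probabilistic assurance.

```python
def meets_hard_bound(delays_ms, bound_ms):
    """Hard assurance: no observed packet delay may exceed the bound."""
    return all(d <= bound_ms for d in delays_ms)


def meets_soft_bound(delays_ms, bound_ms, max_fraction_late):
    """Soft assurance: at most max_fraction_late of packets may exceed the bound."""
    if not delays_ms:
        return True
    late = sum(1 for d in delays_ms if d > bound_ms)
    return late / len(delays_ms) <= max_fraction_late


# Hypothetical sample: 100 packets, 4 of which are delayed beyond 30 msec.
sample = [12.0] * 96 + [45.0] * 4
hard_ok = meets_hard_bound(sample, 30.0)        # violated by the 4 late packets
soft_ok = meets_soft_bound(sample, 30.0, 0.05)  # satisfied: 4% late is within 5%
```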

3. Requirements for Internet2 QoS

This section attempts to specify the requirements for Internet2 quality-of-service. These requirements are high-level and long-term. In the short term, any realistic approach to Internet2 QoS will be incremental in both scope and functionality, but should be conceived so as to converge eventually on a production QoS infrastructure that meets the following requirements:

  • enables advanced applications
  • admits multiple, concatenatable implementations of packet forwarding equipment and network clouds
  • scales well
  • is administratable
  • provides a measurable service
  • works with end host operating systems and middleware
  • deploys incrementally starting in 1998

Short-term requirements for initial testbed deployment of QoS are discussed in Section 4.

3.1 Must Enable Advanced Applications

Advanced applications are the raison d'etre for Internet2. Yet, it makes little sense to engineer network functionality for a specific set of applications. Rather, the most successful networking technologies are those that offer generic low-level services (e.g. packet forwarding, routing, and transport) that work for any application. Twenty years ago, the key networked application was remote job entry (RJE). Ten years ago, the "killer apps" were email, telnet, and FTP. Today, the most important networked application is clearly the web browser. The key applications 5-10 years from now are nearly impossible to predict.

Therefore, in adopting a QoS plan, Internet2 should strive for an approach that is as general purpose and as durable as other core internet technologies (e.g. IP, UDP, TCP, BGP). Any successful QoS approach will almost surely require significant investments in protocol development and standardization, additional router functionalities, and host operating system software -- not to mention the human adjustments required by end users, network administrators, and applications developers -- and is therefore likely to leave a legacy that will long outlive the hegemony of any particular application.

Though it does not make sense to design a network around the specific needs of today's applications, it does make sense for network engineers to maintain a dialogue with advanced applications developers to ensure that the needs of these applications are being addressed in general and that the applications can cast their needs explicitly in terms that correspond to viable models of QoS. Through workshops, informal discussions, and a lengthy survey of application needs, the Internet2 engineering and applications teams and the QoS working group have attempted to maintain just such a dialogue.

3.1.1 Characterizing Application Requirements

Most applications designers, when pressed as to what they need from the network, may smile and say something like: "as much bandwidth as you can give me with as little latency, jitter, and loss!" This is not a completely tongue-in-cheek answer; the reality of working with a best-efforts internet service has compelled many applications developers to become expert at writing adaptive applications. These applications essentially always work regardless of how congested the network is -- they just work a lot better when the network is better.

However, not all advanced networked applications have such vague requirements. A number of important applications have very specific requirements based on human factors or hard real-time control needs. Generally these requirements amount to bandwidth requirements of some small number of megabits per second (<10 Mbps) and maximum latency bounds between 30 msec and 500 msec. Bandwidth and latency needs are far more common than the need for assurances characterized by other transmission parameters. See Section 3.1.5 for a detailed example of application transmission guarantee requirements.

One family of transmission guarantee currently getting a lot of attention is the set of approaches that offer relative assurances for different classes of best-effort traffic. These so-called class-of-service (CoS) approaches segregate traffic into multiple classes (e.g. "gold", "silver", and "bronze"), each of which is assured that it will perceive the network as less congested than the lower classes. CoS represents a modest differentiation of the current "one size fits all" best-effort service into multiple classes of best-effort service. Internet service providers (ISPs) see a huge demand for these types of services in the short term and several router vendors will soon have the necessary functionality to support IP CoS.
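One simple way to realize such relative assurances is strict priority queuing, sketched below. The text does not prescribe this mechanism; it is chosen here only for illustration, and the class and method names are hypothetical. Strict priority can starve lower classes under sustained load, so weighted schemes are a common alternative.

```python
import heapq
import itertools

CLASS_PRIORITY = {"gold": 0, "silver": 1, "bronze": 2}  # lower value is served first


class StrictPriorityQueue:
    """Forward higher classes first; keep FIFO order within a class."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker preserving arrival order

    def enqueue(self, service_class, packet):
        entry = (CLASS_PRIORITY[service_class], next(self._arrival), packet)
        heapq.heappush(self._heap, entry)

    def dequeue(self):
        """Return the next packet to forward, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


q = StrictPriorityQueue()
for cls, pkt in [("bronze", "b1"), ("gold", "g1"), ("silver", "s1"), ("gold", "g2")]:
    q.enqueue(cls, pkt)
order = [q.dequeue() for _ in range(4)]  # gold packets leave first, bronze last
```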

While CoS is likely to be an important piece of functionality for Internet2 member institutions that want coarse differentiation of network traffic to meet institutional priorities or to support applications that are in need of better service in the short term, CoS approaches will never meet the requirements of many important advanced applications with real-time needs. It is therefore necessary that Internet2 support commercial CoS functionality as it becomes available, but focus mostly on supporting stronger notions of QoS.

Several important classes of application have requirements that go beyond how a particular flow's packets are forwarded. For example, interactive applications with multiple users often need multicast capability to scale gracefully. Multicast is an important priority for Internet2, as is the eventual extension of QoS to multicast flows. Another important requirement relates to the administrative framework that must accompany any QoS scheme. Consider an application like distance learning where classes and laboratories are scheduled in advance at known times. The instructor and students need the ability to schedule advance QoS reservations that cannot be preempted by later requests. It is therefore necessary that the Internet2 QoS administrative framework support such scheduling.
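The scheduling requirement can be sketched as a booking calendar for one shared link, in which an accepted reservation is never displaced by a later request. The class name, the use of plain hour numbers for times, and the capacity figures are all assumptions made for illustration.

```python
class ReservationCalendar:
    """Advance reservations on one link; accepted bookings are never preempted."""

    def __init__(self, capacity_mbps):
        self.capacity_mbps = capacity_mbps
        self.bookings = []  # list of (start, end, mbps), with end exclusive

    def _peak_load(self, start, end):
        """Peak committed bandwidth at any instant in [start, end)."""
        # Load only rises at booking start times, so checking those points suffices.
        points = [start] + [s for s, _e, _m in self.bookings if start < s < end]
        return max(
            sum(m for s, e, m in self.bookings if s <= t < e)
            for t in points
        )

    def book(self, start, end, mbps):
        """Accept the booking only if capacity holds for its whole duration."""
        if end <= start:
            raise ValueError("end must be after start")
        if self._peak_load(start, end) + mbps > self.capacity_mbps:
            return False
        self.bookings.append((start, end, mbps))
        return True


calendar = ReservationCalendar(capacity_mbps=10.0)
lecture = calendar.book(9, 11, 6.0)        # accepted
late_request = calendar.book(10, 12, 6.0)  # refused: would exceed capacity from 10 to 11
next_slot = calendar.book(11, 13, 6.0)     # accepted: the lecture has ended by then
```

A refused request leaves earlier bookings untouched, which is the non-preemption property the distance learning example calls for.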

3.1.2 The User View

Next we provide an overview of requirements from the perspective of both application developers and application users, framing the requirements while remaining naive to the underlying implementation. The intention is to avoid being prescriptive about the technology. Rather, the objective is a focus on the "end game" from the developers' and users' eyes.

The user view of QoS is that, when invoked, an application should deliver as expected from start to finish for all components of the application (text, audio, video, etc.). For example, if the application involves real-time audio collaboration, the predictable quality could be defined along the lines of a telephone conversation (plain old telephone service or POTS) at a minimum. If the application features full-frame, full-motion streaming video, it may be analogous to VHS or NTSC range quality, or better. If higher expectations are set, they too should be met on a predictable basis.

Overall, managing expectations is going to be an important part of deploying QoS. That is, users can't expect QoS to "create bandwidth". But, when expectations are reasonably established, the application behavior should be predictable.

It's expected that we'll maintain a "best effort" service analogous to the current internet functionality. What QoS will offer users is some level of "better than best effort" service (the actual number of such levels is left for investigation). The objective is not to create an environment with only advanced applications that utilize QoS features. Current versions of FTP, telnet, etc. must continue to work and have access to network services of comparable quality to today's overall best effort service (i.e., not be starved of resources).

It is reasonable to assume that for certain applications the sender makes the QoS reservation, while at other times the receiver makes it. In cases of receiver control, users should be able to alter QoS expectations when making different invocations of the same program. For example, FTP could be set to default to low priority classification. At times, however, a large data transfer may require higher priority and this should be conveyable by the user.

Users should not expect unreasonable levels of service from the network for every application. This is the trade-off required to get support for isochronous traffic, for example. Electronic mail is asynchronous and thus immediate, deterministic delivery is not expected.

Users also expect to "pay more to get more", thus combining engineering possibility with economic viability. This "payment" could be in the simple tradeoff of agreeing that email is low priority in order to obtain high priority space for streaming media applications. Or, it could come in the form of actually paying higher rates. The accounting and charging challenge for QoS is clearly one of the major areas of investigation. Whether it's a per-instantiation charge or on a subscription basis, the user should be informed somehow of the pricing structure (and it should be kept as simple as possible). However it works out, the allocation details must remain the province of each individual campus.

Users do not consider application security (e.g., data privacy) to be a QoS feature. That is, the QoS environment is not responsible for security of the application itself. However, any QoS control data linked to an individual should be private. In addition, users are willing to have to authenticate to the network, if necessary, to get "better than best effort" level service.

Users expect their access to QoS functionality to be both dynamic and scheduled. Dynamic invocations occur when one wants to use a QoS-enabled application "now", e.g., as one typically works and initiates new tasks. Scheduled invocations are for planned application use, e.g., when participating in a course lecture.

The user may get "busy signals" when trying to access "better than best effort" service levels. In these cases, an acceptable fallback may be to allow degraded performance (until the application completes or until the desired service level becomes available). Another fallback may be to alert the user as to when the desired service is likely to be available.

While it is important that quality of service reservations have an end-to-end effect, there is not a requirement that reservations must be made end-to-end. That is, the user doesn't require admission control be at the desktop versus an edge router as long as the application works as expected.

As collaborative, interactive applications are a key part of the Internet2 environment, bilateral and multilateral reservation paths must be possible.

Users recognize that client and server configurations are also factors that affect application performance (e.g., congestion, jitter). Application users have the responsibility to monitor their client system and shut down applications that may interfere with the QoS-enabled application. Users do expect that server operating systems be more sophisticated at handling prioritization than client systems -- although that gap is expected to close. A possible future task for users and developers is to establish a set of requirements for the "I2-enabled client workstation".

3.1.3 A Developer Perspective

Developers would like some standard form of feedback from the network. Such feedback would include reports from the network as to whether it can meet a certain request (so that the application perhaps can negotiate for a more limited set of resources), whether it can no longer meet a certain request (and thus the application must negotiate or adapt), and whether it is meeting the request (and thus there's a reasonable expectation for the end-to-end experience).

Developers must be able to access quality of service functionality via a well defined set of abstractions (APIs and libraries) that mask the implementation. This will allow the underlying mechanisms and technology to change without having to modify the application. And, it eases the programming task for the developer. For example, something like open video_conference(full_frame, full_motion) would invoke detailed bit rate, packet loss, and latency parameters. If desired, the developer should be able to get at and modify the underlying parameters. But this will be the exception not the rule.
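The kind of abstraction described above might look like the following sketch, in which an application-level call is translated into low-level parameters that the developer can override in the exceptional case. Everything here is hypothetical: the function `open_video_conference`, the `QosParams` fields, the profile names, and the numbers in the profile table are placeholders, not measured requirements or an existing API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class QosParams:
    bandwidth_mbps: float
    max_latency_ms: float
    max_loss_rate: float


# Hypothetical profile table; the values are placeholders for illustration.
PROFILES = {
    ("full_frame", "full_motion"): QosParams(4.5, 500.0, 0.01),
    ("quarter_frame", "full_motion"): QosParams(1.5, 500.0, 0.02),
}


def open_video_conference(frame_size, motion, **overrides):
    """Translate an application-level request into low-level QoS parameters."""
    params = PROFILES[(frame_size, motion)]
    # Overriding individual parameters is the exception, not the rule.
    return replace(params, **overrides)


default = open_video_conference("full_frame", "full_motion")
tuned = open_video_conference("full_frame", "full_motion", max_latency_ms=60.0)
```

Because the application names only the profile, the table behind it can change with the underlying technology without touching application code.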

Application developers must undertake the effort to measure their application requirements. Answers such as "as much bandwidth as you can give me" aren't suitable requirements. Even if the applications are adaptive, quantifying different quality levels is important. The application developers should work with the network engineers to see how the individual application requirements fit into the aggregate planning models required to forecast overall growth.

Adaptive applications still have a role in a QoS environment. Ranges of data rates, latency, packet loss, etc. will be established within which applications will run as expected. Within those ranges adaptive techniques can improve the overall quality of the application experience. Or, if the network unexpectedly degrades, the application may decide which portions to scale back. If it's a surgery lecture, the video will get higher priority than the audio component. If it's a music video, the audio takes priority. These are application level decisions. Thus the model is one of close cooperation between network-aware applications and a network that shares information with the application layer.
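The application-level decision about which components to scale back can be sketched as a priority-ordered allocation. The function name, the tuple layout, and the rates used in the example are hypothetical. The sketch assumes each component has a minimum usable rate and a preferred rate.

```python
def scale_back(components, available_mbps):
    """Allocate bandwidth to components in priority order (lower number first).

    components: list of (name, priority, min_mbps, preferred_mbps).
    Returns a dict mapping component name to allocated Mbps.
    """
    ordered = sorted(components, key=lambda c: c[1])
    allocation = {}
    remaining = available_mbps
    # First pass: give every component its minimum, most important first.
    for name, _prio, min_mbps, _pref in ordered:
        grant = min(min_mbps, remaining)
        allocation[name] = grant
        remaining -= grant
    # Second pass: top up toward the preferred rate, again most important first.
    for name, _prio, _min, pref_mbps in ordered:
        extra = min(pref_mbps - allocation[name], remaining)
        allocation[name] += extra
        remaining -= extra
    return allocation


# Surgery lecture: video matters most. Music video: audio matters most.
surgery = scale_back([("video", 0, 2.0, 3.5), ("audio", 1, 0.25, 1.0)], 3.0)
music = scale_back([("audio", 0, 0.25, 1.0), ("video", 1, 2.0, 3.5)], 3.0)
```

With the same 3 Mbps available, the surgery lecture spends the spare capacity on video while the music video restores audio to its preferred rate first.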

Developers (and users) should have reasonable expectations for the environments in which their applications will be deployed. That is, for applications that are very delay sensitive, there may be certain "delay radii" within which the application will reasonably run, but beyond that there are limits outside network engineering control (e.g., the speed of light).
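A rough "delay radius" can be estimated from propagation delay alone. The sketch below assumes light travels about 200 km per millisecond in optical fiber (roughly two thirds of its speed in a vacuum), treats the budget as one-way, and uses a hypothetical 20 msec allowance for everything other than propagation. The 60 msec figure is the SEM control bound from Section 3.1.5; the text does not say whether that bound is one-way or round-trip.

```python
FIBER_KM_PER_MS = 200.0  # assumption: approximate speed of light in fiber


def delay_radius_km(one_way_budget_ms, fixed_overhead_ms=0.0):
    """Largest path length whose propagation delay fits within the budget."""
    usable_ms = max(0.0, one_way_budget_ms - fixed_overhead_ms)
    return usable_ms * FIBER_KM_PER_MS


# 60 msec budget, with a hypothetical 20 msec spent on encoding and queuing.
radius = delay_radius_km(60.0, fixed_overhead_ms=20.0)
```

Under these assumptions the path length is limited to about 8,000 km, and no amount of network engineering can extend it.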

Application developers have the good citizen responsibility to use the network efficiently. For example, they should use multicast technology where that minimizes bandwidth consumption, and adaptive techniques that still maintain the user-level quality within acceptable ranges.

3.1.4 Measures of Success

The following is an illustrative (but not exhaustive) summary of how any Internet2 QoS approach will be evaluated by end users and applications developers:

  • Application users can invoke current applications and they will operate no worse than they do today
  • Application users can invoke new, advanced applications and get bandwidth and/or delay assurances throughout the duration of the application
  • Application users can invoke QoS-enabled applications with a minimum of "busy signals" (for which a specific metric needs to be determined)
  • Application users need to trust the integrity of the authentication, authorization, and accounting features of QoS reservations
  • Application users can adjust, as appropriate, QoS requests
  • Application developers access QoS features through standard abstractions (not network layer primitives)
  • Application developers have access to tools that allow them to quantify their application requirements

3.1.5 Example Applications

The paucity of hard and fast numbers from applications developers makes the QoS engineering challenge more difficult. Until we can conduct the necessary analysis, we must work with anecdotal examples.

Consider the needs of a secure video conferencing and remote instrument control application (provided by Andy Adamson and John Mansfield at the University of Michigan). This application is a real-time, interactive application using high quality video and high fidelity audio distributed among two or more sites (it can be multicast or unicast depending on the number of participants). Using VIC/VAT, the video streams were encoded with MJPEG hardware at 30 frames/second (audio at CD-quality 16 bit, 44.1kHz sampling). The entire session was then software encrypted. The quality of the video was necessary as it involved relaying the very fine resolution output from a scanning electron microscope (SEM) in addition to video conferencing remote participants. In addition to the audio and video flows, control flows are used to remotely manipulate the SEM. Minimal latency is important as operations like "focus" and "platform shift" must truly be interactive for the remote user.

A typical session with the SEM lasts anywhere from 15 minutes to two hours. In an instructional setting, up to 40 or so receivers might be part of the application. Any use of the SEM must be scheduled, as it is an expensive device. Thus, QoS must be schedulable.

The application's minimum bandwidth needs are:

  • Video: average 3.5 Mbps (bursts up to 4.5 Mbps)
  • Audio: average 1 Mbps (bursts up to 1.5 Mbps)
  • Control: 0.1 Mbps
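The figures in the list above can be totalled to give the size of a single reservation, and to show why multicast matters for the 40-receiver instructional case. The sketch assumes the control flow stays at a constant 0.1 Mbps (no burst figure is given) and that only the audio and video flows are distributed to every receiver.

```python
# Figures from the list above, in Mbps: (average, burst).
FLOWS = {
    "video": (3.5, 4.5),
    "audio": (1.0, 1.5),
    "control": (0.1, 0.1),  # assumption: control traffic does not burst
}

session_average_mbps = sum(avg for avg, _burst in FLOWS.values())
session_burst_mbps = sum(burst for _avg, burst in FLOWS.values())

receivers = 40  # upper end of the instructional setting
media_average_mbps = FLOWS["video"][0] + FLOWS["audio"][0]
unicast_media_mbps = media_average_mbps * receivers  # one copy per receiver
multicast_media_mbps = media_average_mbps            # one copy, replicated in the network
```

On these numbers a session needs about 4.6 Mbps on average and about 6.1 Mbps at peak, while sending the media by unicast to 40 receivers would require 180 Mbps at the source against 4.5 Mbps with multicast.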

Delay needs depend on whether the video is being used for controlling the SEM or videoconferencing to the remote site. With the microscope, delay needs to be 60ms or less before control becomes a problem. For video conferencing, a delay up to 500ms is tolerable (but certainly less than ideal).

The application does not degrade gracefully. For example, audio drops out and video freezes.

Finer detailed numbers on packet loss, leaky bucket models, and jitter bounds require more analysis of the application.

3.1.6 Expected QoS Load

A very important but open question is: "what demand for QoS will Internet2 see and from what classes of applications?" Will demand come from a few high-end applications like tele-immersion, data mining, and real time weather forecasting? Or from widespread use of less-demanding mainstream collaborative tools? The answer to this question has profound implications for how a QoS service is provisioned and engineered. For example, in the latter case, statistical multiplexing of a large number of QoS reservations may greatly reduce the need for dynamic resource allocation without sacrificing utilization efficiency. Learning what the expected QoS load is for Internet2 will be part of the iterative deployment process that must be undertaken.
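The potential gain from statistical multiplexing can be estimated with a simple binomial model. This is a sketch under strong assumptions: each reservation is treated as an independent on/off source that is active with a fixed probability, and all sources need the same bandwidth when active. The function name and the example numbers are hypothetical.

```python
from math import comb


def overflow_probability(n_sources, p_active, k_provisioned):
    """P(more than k_provisioned of n_sources are active at once), binomial model."""
    return sum(
        comb(n_sources, i) * p_active**i * (1 - p_active) ** (n_sources - i)
        for i in range(k_provisioned + 1, n_sources + 1)
    )


# 100 reservations, each active 10% of the time.
tight = overflow_probability(100, 0.1, 10)     # provisioned for the average load only
generous = overflow_probability(100, 0.1, 20)  # provisioned for twice the average load
```

Provisioning for twice the average load, rather than for all 100 sources at once, already makes overflow rare in this model, which is the effect the paragraph above describes.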

3.2 Concatenatability

Any Internet2 QoS approach must admit multiple, interoperable implementations of both network services and individual network elements.

3.2.1 Concatenatability of Network Clouds

We use the term network cloud loosely to denote a collection of packet forwarders under a single administrative control. A network cloud may correspond to an autonomous system (AS), a collection of ASs, or even part of an AS. The extent of a cloud may be defined by the extent of a particular implementation technology or by the location of congestion points. Clouds may provide network services to individuals or organizations under the same administrative control (in the case of a campus network) or to other network clouds (as with a gigaPoP or an ISP).

Consider the path between the two hosts in Figure 2. This path flows from the source host through its campus network to the regional gigaPoP through a national interconnect (e.g. the vBNS) through another gigaPoP and another campus network to the destination host. This is the typical situation in the Internet2 environment. Since campuses, gigaPoPs, and interconnects are each under separate administrative control, there is a need to standardize a notion of QoS across a cloud, so that the composition of clouds can provide a meaningful end-to-end service.

Figure 2. Separately administered clouds must be able to provide QoS services that concatenate to achieve an end-to-end QoS

Concatenatability can be considered to be a facet of administratable scalability. QoS-enabled flows and call setup signaling should be treated in a very standard and well-understood way at cloud-to-cloud boundaries, but clouds must be allowed to implement QoS internally in potentially very different ways. Internal implementations of QoS may vary depending on a cloud's underlying technologies, internal policies, and provisioning decisions.
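One way to see what a standardized cloud-to-cloud notion of QoS buys is to compose per-cloud commitments into an end-to-end figure. In the sketch below, bandwidth is limited by the weakest cloud, delay bounds add, and loss compounds under the assumption that losses in different clouds are independent. The `CloudService` type and all of the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudService:
    name: str
    bandwidth_mbps: float  # bandwidth the cloud commits to the flow
    max_delay_ms: float    # delay bound across the cloud
    loss_rate: float       # fraction of packets the cloud may drop


def concatenate(path):
    """Compose per-cloud commitments into an end-to-end commitment."""
    bandwidth = min(c.bandwidth_mbps for c in path)
    delay = sum(c.max_delay_ms for c in path)
    delivered = 1.0
    for c in path:
        delivered *= 1.0 - c.loss_rate  # assumes independent losses per cloud
    return CloudService("end-to-end", bandwidth, delay, 1.0 - delivered)


path = [
    CloudService("campus A", 100.0, 2.0, 0.0),
    CloudService("gigaPoP A", 45.0, 5.0, 0.0),
    CloudService("interconnect", 155.0, 30.0, 0.01),
    CloudService("gigaPoP B", 45.0, 5.0, 0.0),
    CloudService("campus B", 10.0, 2.0, 0.0),
]
end_to_end = concatenate(path)
```

The composition needs only each cloud's externally visible commitment, not its internal implementation, which is why clouds can remain free to implement QoS differently inside.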

3.2.2 Interoperability of Equipment

Any QoS approach adopted by Internet2 should be implementable by at least one (and preferably multiple) vendor partners. In a heterogeneous internetwork of the scale of Internet2, demanding multi-vendor interoperability is the only reasonable approach, and one taken by standards bodies and often vendors themselves when the success of an important new networking technology is in question. To ensure the wider success of QoS, Internet2 is very interested in pursuing a QoS strategy that is aligned with the directions of internet standards organizations such as the IETF. Though it is very likely that initial Internet2 QoS deployment will begin before the standards work is complete, the Internet2 QoS experience should provide invaluable early feedback to the ongoing standards work, including feedback on the interoperability or non-interoperability of alternative implementations.

3.3 Scalable

Certainly the most challenging engineering imperative to providing end-to-end, per-flow QoS is scalability. QoS approaches that require substantial per-flow state or computational overhead in the forwarding engines simply will not scale as the number of QoS-enabled flows through the network grows. Scalability is a particular concern in core routers, where the fan-in from the network edges can easily burden routers with forwarding thousands of flows.

Figure 3. Scalability is a serious concern in the network core where thousands of flows may pass through each router

Initially, the number of flows required by advanced applications may be small enough that unscalable experimental architectures may suffice. However, as today's advanced applications move out of the lab and onto thousands of hosts, an unscalable architecture will quickly fail. Pressure to carry voice-over-IP traffic could further accelerate the point where scalability becomes the prime concern.

One promising approach to the scalability problem is to aggregate the QoS assurances of individual flows so that core routers may forward the aggregate traffic in a single, simple manner consistent with the needs of the bundled individual flows.
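The aggregation idea can be sketched as follows: the network edge keeps per-flow state and maps each flow to a small number of classes, so that a core router only needs one entry per class. The function name, the class labels, and the flow data are hypothetical.

```python
from collections import defaultdict


def aggregate_at_edge(flows):
    """Map per-flow reservations to per-class totals.

    flows: iterable of (flow_id, service_class, mbps).
    The edge keeps per-flow state; the core only needs the returned totals.
    """
    totals = defaultdict(float)
    for _flow_id, service_class, mbps in flows:
        totals[service_class] += mbps
    return dict(totals)


# 1000 hypothetical flows of 0.5 Mbps each, split evenly between two classes.
flows = [(f"flow-{i}", "low_delay" if i % 2 else "bulk", 0.5) for i in range(1000)]
core_state = aggregate_at_edge(flows)
```

Here a thousand flows collapse into two entries of core state, and that number does not grow as more flows are added.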

3.4 Administratable

As with any scarce resource, there will need to be mechanisms to allocate and account for QoS. These mechanisms must operate efficiently, providing end users ready access to QoS features without unduly burdening network planning and operations staff with additional management complexity. Furthermore, these administrative mechanisms must support a flexible set of policies and deter theft of QoS services.

3.4.1 A New Economy

Because quality-of-service is better service, it will come at a cost to those who use it. As with any scarce resource, there is a need to control access to prevent theft. This in turn leads to the need to authenticate the identity of the user or institution that requests the resource, to make an administrative decision about whether to grant access to it (admissions control), and to account for its use appropriately.

Access to the QoS capacity of shared network resources will inevitably lead to allocation policies, pricing schemes, and payment arrangements. The rise of an "economy of QoS" will present unique policy challenges to Internet2 member institutions and a pressing need for a notion of QoS that lends itself to simple, understandable business models.

The macroeconomic effects of an economy of QoS are generally understood as follows. Premium users, who value the network's resources most highly, pay for their QoS, and these payments provide an additional revenue source for capital improvements (i.e., additional capacity). Certain network resources are reserved for premium users and are therefore unavailable to non-premium, best-efforts users. In the long run, however, the additional capacity made possible by the paying premium users lifts all boats. This is true even if all additional capacity is premium capacity, since the bursty nature of data traffic leaves "holes" that can be exploited by best-efforts flows.

3.4.2 Authentication, Admissions Control, and Policy

When a request is made for a QoS contract, there must be a secure means to authenticate the requester's identity. Once identity is known, an admissions control decision must be made. To support admissions control, there needs to be a policy framework and a means to determine resource availability. Since Internet2 flows may pass through multiple administrative domains, admissions control is not just a local decision, but may require setup across multiple administrative domains. Post hoc admissions control in the form of policing may also be required.
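Policing of an admitted flow is commonly implemented with a token bucket. This document does not prescribe any particular policing mechanism, so the following Python sketch is only one possible illustration; the class name, parameters, and units are assumptions made for the example.

```python
class TokenBucketPolicer:
    """Token-bucket policer sketch (an assumed mechanism, not a requirement).

    rate_bytes_per_s: contracted long-term rate of the flow.
    burst_bytes: largest burst the contract tolerates.
    """

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # the bucket starts full
        self.last = 0.0            # time of the previous packet, in seconds

    def conforms(self, now, packet_bytes):
        """Return True if the packet is within the contract at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # out of contract: drop or demote the packet
```

A network element could, for example, forward conforming packets with the contracted treatment and demote non-conforming packets to best-efforts service; which action to take is a policy choice that this sketch leaves open.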

Because Internet2 comprises scores of member institutions, each administered separately and with its own internal resource allocation policy, any Internet2 QoS approach must grant individual institutions maximal flexibility to manage their QoS resources as they see fit. Furthermore, since resource allocation policies are often sensitive, it should be possible to handle extra-domain admissions requests without extensive sharing of policy information. Detailed network utilization information is similarly sensitive and should be allowed to remain private.
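One way to picture multi-domain admissions control that keeps policy and utilization private is for each domain to answer a request with nothing more than yes or no. The Python sketch below assumes the requester has already been authenticated; the Domain class, the request_path function, and the example policies are hypothetical and are not drawn from any existing protocol.

```python
class Domain:
    """An administrative domain that keeps its policy and utilization private."""

    def __init__(self, name, capacity_kbps, policy):
        self.name = name
        self._available_kbps = capacity_kbps  # private utilization data
        self._policy = policy                 # private callable(requester, kbps) -> bool

    def admit(self, requester, kbps):
        """Answer only yes or no; no policy or utilization detail is exposed."""
        return self._policy(requester, kbps) and kbps <= self._available_kbps

    def commit(self, kbps):
        """Set aside capacity once every domain on the path has agreed."""
        self._available_kbps -= kbps


def request_path(domains, requester, kbps):
    """Admit a flow only if every domain along the path admits it."""
    if all(d.admit(requester, kbps) for d in domains):
        for d in domains:
            d.commit(kbps)
        return True
    return False


if __name__ == "__main__":
    campus = Domain("campus", 10000, lambda r, k: r.endswith("@example.edu"))
    gigapop = Domain("gigapop", 2000, lambda r, k: k <= 1500)
    print(request_path([campus, gigapop], "alice@example.edu", 1500))
    print(request_path([campus, gigapop], "alice@example.edu", 1500))
```

In the usage example the second request fails because the hypothetical gigapop no longer has enough capacity, yet the requester learns only that the request was refused, not why.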

3.4.3 Accounting and Payment

Although QoS models may vary widely in terms of the granularity and characterization of the transmission guarantee, every QoS model necessitates some notion of accounting and payment. Models of QoS that lead to simpler accounting mechanisms and business relations are more desirable.

3.5 Measurable

Since institutions (and eventually users) will pay for the QoS they receive, there must be a means for end users to measure and audit their network performance. The requirement of measurability translates not only to the need for measurement tools, but also to the need for well-understood network performance metrics. Network providers may need additional measurement tools to aid engineers in provisioning for QoS or in support of measurement-based admissions control mechanisms.
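As an illustration of what simple, well-understood metrics might look like, the Python sketch below summarizes a series of probe packets into loss, mean one-way delay, and jitter. The probe record format and the definition of jitter used here (mean absolute difference between consecutive delays) are assumptions made for the example; other definitions are in use, and the sketch assumes sender and receiver clocks are synchronized.

```python
from statistics import mean


def summarize(probes):
    """Summarize probe results.

    probes: list of (send_time, recv_time) pairs in a common time unit,
    with recv_time set to None for a probe that was lost.
    """
    delays = [recv - send for send, recv in probes if recv is not None]
    loss = 1 - len(delays) / len(probes)
    if len(delays) > 1:
        # Assumed jitter definition: mean absolute change between
        # the delays of consecutive received probes.
        jitter = mean(abs(b - a) for a, b in zip(delays, delays[1:]))
    else:
        jitter = 0
    return {
        "loss": loss,
        "mean_delay": mean(delays) if delays else None,
        "jitter": jitter,
    }


if __name__ == "__main__":
    # Times in milliseconds; the third probe was lost.
    print(summarize([(0, 20), (10, 32), (20, None), (30, 48)]))
```

An end user auditing a QoS contract could compare figures like these against the assurances the contract states.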

3.6 Host Requirements

In the long run, hosts should be able to initiate QoS requests for their flows. In the short term, however, QoS might need to be statically configured on a per-wall-jack, per-port-number, or per-protocol basis. Hosts must also be able to identify themselves or their users appropriately to the network for purposes of authentication, authorization, and accounting. Additionally, to provide true end-to-end QoS, host operating systems will need to support QoS-enabled flows. This kind of real-time functionality is not present in the operating systems of most current Internet2 hosts, where packets may experience bottlenecks within the network stack, the memory system, or even the process scheduler.

3.7 Incrementally Deployable

It would be unrealistic to adopt a QoS approach that required changing all network elements before it could be used. A far better approach would be one where a valuable service would arise from even a partial deployment in the network (for example only at the most congested points). This could have immediate benefits for a number of important existing applications that require QoS.

Another aspect to incremental deployment is the need to begin developing experience with QoS without having to tackle all the myriad complexities of a full-blown production QoS service. There are certain functionalities that may eventually be needed but that may be delayed until more experience has been obtained with early, simplified implementations (QoS routing is a good example). The need for early experimentation with QoS services in a test bed environment is discussed further in Section 4.

To push forward on Internet2 QoS in general, a project goal has been set to begin QoS deployment in calendar year 1998.

3.8 Non-Requirements

The term "quality of service" is sometimes used so expansively that it is worth stating that certain common misinterpretations are explicitly non-requirements for Internet2 at this point. Specifically, this document does not consider sophisticated network services like fault tolerance and privacy to be within scope.

4. Need for an Internet2 QoS Test Bed

Although quality-of-service has been an area of very active research and debate for some time, there currently exists no turnkey solution that would meet all the requirements outlined above. Furthermore, industry convergence around anything stronger than class-of-service is nowhere in sight. Since many important Internet2 applications even today will not work reliably without QoS assurances from the network, it will be necessary for Internet2 to begin experimenting with QoS in a test bed environment before there is industry and community consensus around standards.

Researchers have known for some time how to support hard, per-flow, end-to-end QoS guarantees. Most approaches for doing so, however, are expensive and unscalable, requiring per-flow state in all network elements along a path. Internet2 needs to explore scalable ways to provide medium-hard models of QoS, as well as to explore the spectrum of softer models based on statistical assurances.

There is a pressing need for Internet2 to begin to provide even experimental QoS functionality, so that researchers, developers, network engineers, and network administrators can begin to explore a QoS service and to develop a body of experience that will help push the relevant technologies forward. Perhaps even more important is the need to put some service in place soon to prompt a focused discussion on the administrative and policy transitions that will be needed.

Clearly, even the deployment of a QoS test bed is a major undertaking for Internet2 and should therefore reflect the best current thinking of the QoS research community, as well as a careful assessment of the migration strategy for evolving the particular QoS approach explored in the test bed into a production service that meets the requirements laid out in this document. The best QoS strategies for Internet2 will lend themselves to an iterative approach based on carefully chosen abstractions that allow for experimentation with multiple implementation alternatives.

5. Earlier Work

At a series of Internet2 workshops held during 1997, the problem of providing QoS in Internet2 was discussed at length and from various angles during dedicated breakout sessions. The first of these workshops (Ann Arbor, July 1997) focused on the network requirements of key advanced applications and on the need for good QoS middleware and APIs. The second workshop (Davis, September 1997) focused on campus networks and on the administrative and engineering concerns of campus network planners. A third workshop (San Jose, November 1997) was for gigapop planners and addressed the administrative and engineering complexities of extending QoS through an Internet2 gigapop. Finally, at the October 1997 membership meeting, two breakout sessions were held: the first to focus on the QoS strategies of Internet2's vendor partners, and the second to discuss the administrative and policy ramifications of deploying inter-domain QoS. Though not formally documented, the discussions at these events brought many important concerns to the attention of the Internet2 engineering team and QoS working group, and those concerns have subsequently been incorporated into this document.

Another early attempt to lay out the requirements for an Internet2 QoS approach was the work of the ad hoc Internet2 working group that met in San Jose in August 1997 to make initial attempts to clarify the requirements for Internet2 QoS and multicast. These discussions resulted in the submission of an Internet-Draft and helped significantly to clarify the QoS problem as well as the design space of potential solutions.

6. Acknowledgments

The authors acknowledge the entire Internet2 QoS working group, as well as Matt Zekauskas and Guy Almes for their invaluable comments on this document.

Authors' Addresses

Ben Teitelbaum
Internet Engineer
200 Business Park Drive
Armonk, NY 10504
voice: 914-765-1118
fax: 914-273-1809
e-mail: ben@internet2.edu

Ted Hanss
Director of Applications
3025 Boardwalk
Suite 100
Ann Arbor, MI 48108
voice: 734-913-4256
fax: 734-913-4255
e-mail: ted@internet2.edu

Last Edited April 22, 1998
