U N I V E R S I T Y C O L L E G E L O N D O N
MSc Data Communications
Networks &
Distributed Systems
D58: Pragmatic General Multicast
J O N C R O W C R O F T
MARIA PSALTAKI
RODRIGO ARAUJO
GHADAH ALDABBAGH
PANAGIOTIS KOUNIAKIS
ANDREAS GIANNOPOULOS
Executive Summary
This report is submitted along with the main project report, as an executive summary.
The title of the project developed is: "Pragmatic General Multicast (PGM) host implementation for FreeBSD." This included designing, writing the code and testing of the PGM host according to the specification (PGM draft). This project was sponsored by Cisco Systems
Pragmatic General Multicast (PGM) is a reliable multicast transport protocol mainly aimed for applications that require ordered, duplicate-free, multicast data delivery from multiple sources to multiple receivers. The advantage of PGM over traditional multicast protocols is that it guarantees that a receiver in the group either receives all data packets from transmissions and retransmissions, or is able to detect unrecoverable data packet loss. PGM is specifically intended as a workable solution for multicast applications with basic reliability requirements. Its central design goal is simplicity of operation with due regard for scalability and network efficiency.
The main aim of this project, as stated in the specification, was to develop the host side implementation for PGM on FreeBSD, based only on the specification provided by the company. This was a basic requirement, as it will support their goal of IETF standardisation of the protocol. Cisco have already developed a host implementation for Windows NT. Furthermore, a test plan had to be designed and performed.
There were also some optional requirements that could be implemented, however due to time constraints, only the compulsory requirements were met. The optional requirements involved:
The main report describes the approach taken by the group members and their course of action in order to complete the project.
Four of the chapters in the report are devoted to the four main phases of the project development: Research, Analysis and Design, Implementation and Testing.
The first chapter is an introduction to the project, as well as its aims and objectives.
Additionally, there is a detailed description of the project management in terms of the group organisation, the workplan and the work organisation.
The second chapter (Research) describes the research stage. There is a description of reliable multicast protocols in general and a detailed description of how PGM works.
The third chapter (Analysis and Design) presents the design methodology employed.
In the fourth chapter (Implementation) the implementation strategy is presented, along with a detailed description of the PGM sender and receiver code developed by the group.
The fifth chapter is devoted to the testing procedure and the methodology followed in order to assess the functionality and performance of the PGM FreeBSD host.
The final chapter provides an assessment of the overall work, in terms of the organisation and the goals achieved, including the skills acquired by the group members as well as some thoughts and conclusions.
Abstract
Pragmatic General Multicast (PGM) is a reliable multicast transport protocol aimed for applications that demand ordered, duplicate free, multicast data delivery from multiple sources to multiple receivers. This report provides a detailed interpretation of the PGM draft and more particularly the design and implementation of the host side (sender and receiver) supporting the PGM protocol for FreeBSD.
The report consists of six chapters. The first is an introduction to the project aims and objectives, the group structure and management. It also contains a detailed description of the time plan.
The second chapter describes the research stage. There is a description of reliable multicast protocols in general and a detailed description of how PGM works along with an explanation of the kernel structure and functionality.
The third chapter presents the design methodology employed. Furthermore, the initial use cases and initial design are shown along with their refined versions.
The fourth chapter is devoted mainly to the implementation phase, presenting the implementation strategy, the main building blocks of the networking code within the kernel. There is also a detailed description of the PGM sender and receiver code developed by the group.
The fifth chapter describes the testing procedure and the methodology followed in order to assess the functionality and performance of the PGM FreeBSD host.
The last chapter provides a critical assessment of the project in terms of the goals achieved and the skills acquired.
CONTENTS
Chapter 1
*Introduction
*1.1 Overview
*1.2 Project Title
*1.3 Group Members
*1.4 Specification
*1.5 Aims and Objectives
*1.6 A brief introduction to PGM
*1.7 Project management
*1.7.1 Group leader
*1.7.2 Meetings
*1.7.3 Group Organisation
*1.7.4 Time management
*Task Assignment
*Teams
*1.8 Platform of Implementation
*1.9 Choice of programming language
*Chapter 2
*Research and Analysis
*2.1 Overview
*2.2 Multicast protocols
*2.3 Reliable Multicast
*Session?
*ACK/NAK
*2.4 PGM protocol description
*2.5 Refining the given specification
*2.5.1 UML
*2.6 Protocol requirements
*2.7 FreeBSD Kernel
*2.7.1 Kernel directories
*Chapter 3
*Design
*3.1 Overview
*3.2 Methodology
*3.3 Use Cases
*3.3.1 Sender Use Cases
*3.3.2 PGM Receiver Use Cases
*3.4 Sender Architecture Diagram
*3.5 Receiver Architecture Diagram
*3.6 Use cases refined
*Chapter 4
*Implementation
*4.1 Overview
*4.2 Development strategy
*4.3 Networking code within the kernel
*4.3.1 Mbufs: Memory Buffers
*4.3.2 Protocol Control Blocks
*4.3.3 The protosw Structure
*4.4 Implementation of common structures
*4.5 Mapping of design to the kernel programming needs
*4.5.1 Sender Code Description
*4.5.2 Receiver Code Description
*Chapter 5
*Testing
*5.1 Overview
*5.2 Importance of testing
*5.3 Methodology
*5.3.1 Fidelity and Requirements errors
*5.4 General testing plan
*5.5 Functional testing of the Sender’s module
*5.5.1 Testing plan
*5.5.2 ODATA functional testing
*5.5.3 Traffic Shaper functional testing
*5.5.4 SPM functional testing
*5.5.5 Advance Window Buffer functional testing
*5.5.6 The NCF and NAK functional testing
*5.6 Receiver Testing Plan
*5.6.1 Testing plan for Receive ODATA/RDATA Use Case
*5.6.2 Testing the functionality of the Receive SPM Use Case
*5.6.3 Testing the functionality of the Receive NAK Use Case
*5.6.4 Testing the functionality of the Receive NCF Use Case
*5.6.5 Testing the functionality of the Retransmit locally RDATA Use Case
*5.6.6 Testing the functionality of the Send NAK Use Case
*5.7 Performance Testing
*5.7.1 PGM Sender Performance Testing
*5.7.2 PGM Receiver Performance Testing
*Chapter 6
*Conclusions
*6.1 Overview
*6.2 Achievements
*6.3 Future development
*6.4 Skills acquired
*6.5 Critical assessment
*6.6 Comments and Conclusions
*Bibliography
*APPENDIX A
*APPENDIX B
*APPENDIX C
*APPENDIX D
*
Acknowledgements
The group members would like to thank their supervisor, Professor J. Crowcroft, for his help and guidance throughout the duration of the project.
Many thanks to Mr. J. Andrews for his technical advice.
This chapter provides an introduction to the project, its specification, the aims and objectives as well as the platform of implementation and the tools used. The time plan is described along with the group organisation and each member’s responsibilities.
The title of this project is: "Pragmatic General Multicast (PGM) implementation for FreeBSD. This includes designing, writing the code and testing the host implementation according to the PGM specification (PGM draft). This project was sponsored by CISCO Systems.
DCNDS group1: Ghadah Aldabbagh
Rodrigo Araujo
Andreas Giannopoulos
Panagiotis Kouniakis
Maria Psaltaki
The following is the specification given to the group, in order to develop the PGM protocol.
Pragmatic General Multicast (PGM) is a departure from previous end to end reliable protocols that exploit Internet Multicast. It has both end system and router elements to the protocol. This is an attempt to overcome the scaling problems of protocol reliability techniques (ACK or NAK, retransmission) when operating them over intermittently lossy IP Networks. PGM is targeted at one to many applications, but of course could be used for many to many simply by using multiple sessions.
The deliverable will be a host side implementation of the protocol for FreeBSD. There is already a prototype of a host implementation for Windows NT by CISCO. One of the main goals is to have an independent implementation developed purely from the specification, as this will support the objective of IETF standardisation of the protocol.
An optional part of the implementation work suggested is to enhance the host PGM to implement the router part of the protocol. CISCO routers run IOS, but FreeBSD UNIX can operate on a PC with multiple link interfaces as a router. It is relatively simple to do the router enhancements, as there are two things that need to be done:
i) Add IP router alert option handling in the IP packet processing
ii) Add PGM processing
Other technical aspects of the work include
A variety of applications could be built ranging from newsfeed semi-reliable
video or audio feed (based on VIC or RAT tools available in the CS department)
including possible layered coding.
b) Designing a test plan.
c) Analyse the performance of the protocol for a range of network topologies and end
system usage patterns.
A recent version of the specification, which was used for this project, can be found in
/cs/docs/research/internet-drafts/draft-speakman-pgm-spec-03.txt
CISCO provided the protocol specification and 3 routers that implement the network side of PGM.
The main aim of this project, as stated in the specification, is to develop the host side implementation for PGM on FreeBSD, based only on the specification provided by the company. Furthermore, a test plan must be designed and performed.
There are also some optional requirements that can be implemented, provided that there is enough time. These include:
1.6 A brief introduction to PGM
"Pragmatic General Multicast (PGM) is a reliable multicast transport protocol mainly aimed for applications that require ordered, duplicate-free, multicast data delivery from multiple sources to multiple receivers. The advantage of PGM over traditional multicast protocols is that it guarantees that a receiver in the group either receives all data packets from transmissions and retransmissions, or is able to detect unrecoverable data packet loss. PGM is specifically intended as a workable solution for multicast applications with basic reliability requirements. Its central design goal is simplicity of operation with due regard for scalability and network efficiency"[1].
It was realised very early that a complex and demanding project such as this one could not be completed successfully, unless a well-planned management strategy was employed.
Following is a description of the decisions taken concerning the project management.
A group leader was elected, undertaking the tasks of the overall project management i.e. organise the whole course of action, assign roles to members and equally divide the tasks among them. At the same time, the group leader was responsible for organising the meetings and for ensuring that the work was done according to schedule.
The purpose of each meeting is to ensure that the work is being done according to schedule, and to try and resolve any arising difficulties. Group meetings were taking place on a regular basis (usually twice per week). Frequent meetings with the project supervisor were arranged so that he could view our progress, provide guidance and answer questions. That way it was easier for both sides (supervisor and group) to assess the development of the project and monitor the overall performance.
Minutes of all meetings were kept, so that the group could keep track of their progress and assess whether they have met their goals for a specific task. These minutes are included with this report. (Refer to App. C)
The group consisted of five students with different academic backgrounds and experience. Consequently, the allocation of tasks was done on a skills basis, however with taking into account that everyone had to get involved with every stage of the project. This was essential, as everyone would acquire knowledge and experience on different tasks, which they had never been involved with in the past.
An initial meeting was held in order to assess the group members' knowledge and potential in terms of programming, design and research. Following that, the tasks were divided among people in a more fair way. However, as mentioned previously, it had be ensured that everyone gets involved with all the aspects of the project implementation, to an extent that is possible, so as to gain experience and knowledge in areas not familiar to them.
During the first meeting the group organisation was discussed and the main areas comprising the project were identified. These were:
Since there were five students in the group, it was decided that it would be a good practice to assign responsibility of each section to each group member. This meant that each member would be in charge of a section in terms of deciding on the workplan of that particular section, and assigning tasks to other members. However all members would get involved in all stages of the design and implementation. The tasks were allocated as follows:
Research – G.Aldabbagh
Analysis and Design – A.Giannopoulos
Implementation – P.Kouniakis
Performance and Testing – M.Psaltaki
Application – R. Araujo, who was also the group leader.
The practical work on the project begun with research, which involved reading and understanding the given specification (PGM draft) in order to understand how the protocol works. Further to that, additional studies and reports on PGM were obtained mainly from the Internet so that together with the draft they would provide the group members with a clear picture of the issues involving the behaviour of the protocol.
Another area that required additional research was kernel and protocol programming. According to the specification, coding would have to be done within the FreeBSD kernel and there are several reasons for that decision. (Refer to Ch.2 section 2.7.1). Therefore some research was conducted on the way that the kernel functions, and the particular requirements of kernel programming.
The second phase of the project consisted of the analysis and design. This involved the detailed analysis of the protocol behaviour in terms of the sender and receiver, since these had to be implemented in FreeBSD. The design was mainly done with the use of the Unified Modelling Language (UML) in order to create diagrams that would help the group members visualise the actual sequence of events concerning the sender and receiver parts of the protocol. The choice of using UML for a protocol that would not be implemented in an object-oriented manner was taken mainly because of the structured and complete way that events and states could be described with UML.
The third phase was the actual implementation of the sender and receiver side of the PGM protocol. PGM is designed for use over IP (Internet Protocol) which means that it will be used in much the same way as TCP (Transport Control Protocol) and UDP (User Datagram Protocol) are used. Therefore, the FreeBSD code for the above two protocols was used as guide in writing the PGM code.
The next phase involved the testing of the protocol. Initially the testing methodology proposed, involved conducting several tests:
The last stage, that of report writing, began during the implementation phase, and it was completed shortly after the testing procedure was finished.
Time plans or workplans are the essential starting point of any project. It is very important to estimate the amount of time that each particular task will take. Moreover, following a time plan means that group members will be able to manage their time accordingly in order to produce results within the deadlines.
Previous experience has shown that it is not always possible to exactly follow the time plan due to the fact that some tasks may take more or less time than initially thought. Time plans however are the only way of ensuring that the projects are finished within a specified amount of time.
The project work officially started in April and lasted until the beginning of September In the beginning of June, the group members had finished the initial research concerning the project requirements and then it was possible to produce a detailed time management plan for the summer work on the project. The time plan shown in Appendix A includes the main activities of the project and illustrates the initial estimates for the time that each task would take, so as to allow for their successful completion.
It was realised that it would not always be possible to follow the plan exactly, however it was a good way of monitoring the overall progress of the project.
In the previous section it was described how each stage of the project development had a different manager. For each stage of analysis and design, its manager produced a workplan and allocated tasks to the other group members.
The workplan includes:
These workplans are presented below.
Research Workplan
Subtasks/Output
Task Assignment
All group members will take part in the research and try to find as much information as possible. Furthermore everyone has to read the draft.
Deadlines
1st week: subtasks 1, 2
2nd week: subtasks 3, 4
3rd week: subtasks 3 (cont.), 4 (cont.)
Analysis and Design Workplan
Subtasks/Output
The work can be divided into two teams, one team for the sender and one for the receiver. Research on implementation and kernel issues will run in parallel.
Team A (Sender): Andreas, Rodrigo
Team B (Receiver): Ghadah, Panos
Kernel Research: Maria
Deadlines
4th week: subtasks 1, 2, 3, 7
5th week: 4, 5, 7
6th week: 5(cont.), 6, 7
Implementation Workplan
Subtasks/Output
Task assignment
The two teams will work together initially, in order to implement the common structures and functions. Then the teams will split and develop the sender and receiver code.
Teams
Team A (Sender): Andreas, Rodrigo, Maria
Team B (Receiver): Ghadah, Panos
Deadlines
7th week: subtask 1
8th week: subtasks 2, 3, 4
9th week: subtasks 2 (cont.), 3 (cont.), 4 (cont.)
10th week: subtasks 2 (cont.), 3 (cont.), 4 (cont.), 5
Performance testing and Report writing
Subtasks/Output
Task assignment
The sender and receiver code will be put together therefore performance testing can now be done. The report will have to be completed before the end of the second week.
Deadlines
11th week: subtasks 1, 2
12th week: subtask 2 (cont.)
1.8 Platform of Implementation
As it is already mentioned the implementation platform was FreeBSD. Since there is already one implementation of PGM by Cisco, for Windows NT, the objective was to develop another implementation purely from the specification for another platform, thus supporting the goal of IETF standardisation of the protocol.
1.9 Choice of programming language
The choice of the programming language used for the implementation of the PGM protocol was rather straightforward, as the code developed for PGM would have to interact with the code for IP. Since both TCP and UDP implementations are in C, the obvious choice was to use C for PGM as well. In the initial report submitted for this project it was mentioned that another option would be C++. However after discussing the project issues with the supervisor and after conducting further research in the protocol programming area, it was obvious that the most suitable choice was C.
This chapter includes an introduction to multicast and multicast protocols in general. An analysis of the specification is presented as well as a preliminary design based on the analysis.
Multicast is one-to-many communication. There are numerous applications today that involve communications between one or more sources and many receivers. These applications include push technologies and multimedia applications as well as shared whiteboard applications, data conferencing, software distribution etc.
Multicasting is a technique that enables a single packet transmission to reach one or more destinations or group. The primary benefits of a packet reaching multiple destinations from a single transmission are the following: bandwidth minimisation, parallelism in the network, and optimisation of transmitter costs.
According to RFC 1458, the requirements of multicast protocols are:
The Transmission Control Protocol (TCP) meets the general requirements for reliable, ordered delivery of packets for unicast transmission. However such general-purpose protocol for multicast transmission does not exist yet.
With the term reliable it is meant that the protocol is able to achieve:
Reliable multicast communication is becoming increasingly important, especially for applications such as multimedia conferencing, replicated file servers, distributed interactive simulation and many others. Due to the fact that group communication applications have a wide variety of reliability requirements, many different reliable multicast protocols have been developed, none of which are dominant standards as TCP is for reliable unicast.
Group communication can be one-to-many or many-to-many with small or large group sizes. Some applications require packets to be delivered in order, while others do not. Some applications require stability (the fact that the sender knows that a packet has been received), while others do not. Different protocols provide different levels of reliability depending on their particular application.
The protocol reliability can be sender-initiated or receiver-initiated. That is, either of two is responsible for the detection of lost packets.
A sender-initiated reliability protocol places the burden of loss detection on the sender. A positive acknowledgement (ACK) is required from every receiver for every packet sent. A lost packet is detected when the sender fails to receive an ACK from every receiver within some time limit. When a loss is detected, the packet is retransmitted and the sender again waits for an ACK from every receiver.
Advantages: simple, limited buffering
Disadvantages: ACK-implosion, scalability
A receiver-initiated reliability protocol requires the individual receivers to detect if any packets are lost. Receivers generate a negative acknowledgement (NAK) when they detect a lost packet. The packet is retransmitted in response to one or more NAKs. NAK implosion is still possible if a large number of receivers lose the same packet. Suppression mechanisms can be used to minimise the number of duplicate NAKs produced when such correlated losses occur. Similar suppression mechanisms can be used to prevent a flood of retransmissions when any member with the appropriate data may respond to a NAK.
Advantages: more scalable, no ACKs
Disadvantages: unlimited buffering, NAK-implosion, more complex (timers)
The table below shows an overview of the different reliable multicast protocols that currently exist.
Protocol (authors)
|
Reliability semantics |
Participant structur e
|
Knowledge of participants? |
||
SRM (LBLet al.) |
Reliable |
No |
Local recovery groups |
Via wb session msgs |
NAK, receiver reliable |
RMTP (Bell Labs) |
Reliable |
Yes, with late join |
Hierarchy of regions, DR’s |
Optional may be known |
Window of pkts ACK/NAK’ed
|
MTP-2 (TU Bremen, also RFC1301) |
Reliable, totally ordered atomic delivery |
Yes, with explicit join |
Master, producer, consumer |
Known |
NAK (?) |
RAMP (TASC) RFC1458 |
Reliable |
Yes, with late join |
None |
Known |
NAK with selective ACK |
TMTP (Kentucky) |
Reliable |
No |
Tree-based, via ring search |
No |
Restricted NAKs, NAK suppression, periodic ACKs |
Log-based (Stanford) |
Reliable |
No |
Logging hierarchy |
Estimated |
NAK, statistical ACK |
RMP (NASA, UC Berkeley) |
Reliable up to totally ordered |
Explicit membership |
Peers, full members, clients |
Via membership |
ACK/NAK, fully stable |
Figure 2.1: Reliable multicast protocols
As it can be seen from the table, there are many reliable multicast protocols currently proposed and implemented. There are essentially two classes of reliability:
Protocols also differ in their design, in terms of group membership mechanisms, receiver/sender reliability, whether they use NAK, ACK or both and the logical structure of communication pathways.
As mentioned in the previous chapter, PGM is a reliable transport protocol for applications that require ordered duplicate-free, multicast data delivery from multiple sources to multiple receivers. PGM was designed with the aim if simplicity in mind. It is intended to be used with multicast applications with basic reliability requirements.
In the following, the PGM functionality is described, focusing on certain key issues such as:
In PGM there is no notion of group membership, a reliable multicast data delivery is provided, within a transmit window.
PGM loss detection and recovery - NAK suppression
Receivers detect lost packets based on gaps in the received sequence number sequence and unicast a NAK for each missing packet to the next-hop upstream PGM network element on the distribution tree for the TSI.
That PGM network element multicasts a NAK Confirmation (NCF) on the receiving interface in response to any NAK it receives on that interface. As soon as the receivers receive the corresponding NCF, they stop unicasting NAKs. Note that NCFs are not propagated by PGM network elements; they confirm the receipt of a NAK across a single PGM hop
However, the receiver does not send a NAK immediately when it detects a gap. First it delays the transmission of the NAK for a small random interval. If a corresponding NCF is received during that interval, the receiver cancels its NAK generation. This way only, one receiver in a LAN unicasts a NAK for a given TSI/SQN.
PGM loss detection and recovery - NAK elimination
PGM network elements create Retransmit State for each NAK they receive. The Retransmit State is associated with the interface on which the NAK is forwarded. It records the TSI and SQN of the NAK along with a list of the interfaces on which any instance of the NAK was received. Once the retransmit state exists for a given TSI/SQN, the PGM network elements confirm but do not forward further instances of that NAK.
This results in only one instance of a NAK being forwarded by a PGM network element.
PGM loss detection and recovery – Retransmit constraint
When a NAK is received, the source multicasts the requested retransmission (RDATA). The PGM network elements forward the RDATA only if they have the corresponding Retransmit State and only on those interfaces in the corresponding interface list. At the same time, the PGM network elements discard the current Retransmit State.
Upon receipt of a NAK, a source multicasts the requested retransmission (RDATA). The RDATA packets have exactly the same format as ODATA packets, however they differ in the type field. Therefore, retransmissions only propagate across the network segments, which reach receivers that lost the corresponding transmission.
PGM Local repair - Any receiver
Any receiver that receives an NCF for which it has the corresponding RDATA may multicast that RDATA (following a random back-off), thus resulting in a decrease in retransmit latency below the point of local recovery.
PGM Local Repair Designated Local Retransmitter (DLR)
A DLR is a dedicated host function configured to act as a retransmitter for selected groups in which it must also act as a receiver. In response to the NCFs that it receives for these selected groups, it multicasts a repeat of that NCF with an option providing its own NLA.
PGM network elements and receivers that receive redirecting NCFs may thereafter unicast NAKs for losses in the group directly to the DLR rather than to the source.
In response to directed NAKs, the DLR provides RDATA just as the source would.
PGM—Transmit Window Advancement
Sources may advance the trailing edge of the window arbitrarily. Implementations may support automatic adjustments such as keeping the window at a fixed size in bytes or packets, or fixed real time duration. Additionally they may optionally delay window advancement in the absence of NAKs.
When the transmit window is advanced, some trailing fraction of the transmit window, called the increment window, becomes unavailable for retransmission.
NAKs for packets with sequence numbers outside the transmit window are confirmed but do not result in the transmission of corresponding RDATA.
The trailing edge of the transmit window is carried in all PGM packets so receivers can detect unrecoverable data loss.
PGM Source Path establishment
Interleaved with ODATA, sources periodically multicast Source Path Messages (SPMs) to establish source path state for a given TSI in all PGM network elements and receivers on the distribution tree from the source. SPMs are propagated PGM-hop-by-PGM-hop from the source along the distribution tree for the TSI.
At each PGM network element, the source network-layer address (NLA) of the next-hop upstream PGM neighbour is recorded on the RPF interface for the TSI. Each PGM network element then re-sources the SPM with the network-layer address of each outgoing interface
SPMs also act to alert receivers that the oldest data in the transmit window is about to be removed from the transmit window and will, thereafter, not be available for retransmission from the source.
2.5 Refining the given specification
The specification given to the group consisted of the general requirements (goals) presented in Ch.1 and the PGM draft, which does not constitute and RFC at the time of writing, as it is still being revised by various researchers.
According to the specification, the main objectives are:
The first objective was also the main project requirement. In order to implement the host side, it was necessary to read and thoroughly understand the PGM draft.
As mentioned in Ch.1, the Unified Modelling Language (UML) was used to analyse the protocol behaviour in terms of the sender and receiver functions.
The UML is a notational system aimed at modelling systems using object oriented concepts, more specifically, it is a language for specifying, visualising and constructing the artefacts of software systems (Booch et al., 1997). Developed by Booch, Rumbaugh and Jacobson, the UML has received de facto approval in industry and the UML is likely to represent the standard for some time to come. Therefore, the group believed it would be in their best interest to use this method as the basis of analysis and design of the project.
The UML is a relatively large body of notation, however due to time constraints and in the spirit of simplified modelling, it was agreed to select areas of the UML that would best suit the project's needs. Therefore the following were developed:
Development of these models commenced during the analysis stage of the project. These diagrams can be found in Appendix A.
Having read the draft, the protocol and implementation requirements were identified. This was the first step in analysing the protocol behaviour in terms of the functions that would be used in the coding stage. The following are the protocol (user requirements for the host and the receiver.
User requirements (source)
User requirements (receiver)
Additionally, the PGM protocol has some non-functional requirements, which are:
Reliability
One of the goals of PGM is to get a strong definition of reliability from a simple protocol. ACK implosion can be addressed in a variety of effective but complicated ways, most of which require re-transmit capability from other than the original source. An alternative is to dispense with positive acknowledgements and resort instead to timeouts at the source to manage transmission resources. The definition of reliability with PGM is a direct consequence of the following design decision: PGM guarantees that a receiver either receives all data packets from transmissions, or is able to detect unrecoverable data packet loss.
PGM includes strategies for repeatedly soliciting NAKs from receivers, and for adding reliability to the NAKs themselves. By reinforcing the NAK mechanism, PGM minimises the probability that a receiver will detect a missing data packet so late that the packet is unavailable for retransmission either from the source, form another receiver, or from a designated local retransmitter (DLR).
Group membership
PGM group membership may change during the course of a transport session without the knowledge of the source, or any consequence to the source or the remaining receivers. Late joining allows a source to indicate whether or not receivers may request all available retransmissions when they initially join a particular transport session.
Efficiency
While PGM avoids the implosion of positive acknowledgements, simply by dispensing with ACKs, the implosion of negative acknowledgements is addressed directly. Receivers observe a random back off prior to generating a NAK. If, during this interval, a matching NAK or NCF is received then the receiver "suppresses" the NAK- it does not send it, but behaves as if it had-. In addition, PGM network elements eliminate duplicate NAKs received on different interfaces on the same network element, The combination of these two strategies usually results in the source receiving just a single NAK for any given lost data packets. Whether a retransmission is provided from another receiver, a DLR, or the original source, it is important to constrain that retransmission to only those network segments containing members that negatively acknowledged the original transmission rather than propagating it throughout the group. PGM specifies procedures for network elements to use the pattern of NAKs to define a sub-tree within the group upon which to forward the corresponding retransmission so that it reaches only those receivers that missed it in the first place.
Simplicity
PGM is designed to achieve the greatest improvement in reliability (as compared to the usual UDP) with the least complexity. As a result, PGM does NOT address conference control, global ordering amongst multiple sources in the group, nor recovery from network partitions.
Operability
PGM is designed to function even in the presence of network elements that have no knowledge of PGM. To that end, all PGM data packets can be conventional multicast routed by non-PGM network elements with no loss of functionality, but with some inefficiency in the propagation of RDATA and NCFs. In addition, since NAKs are unicast, to the last-hop PGM network element and NCFs are multicast to the group, NAK/NCF operation is also consistent across non-PGM network elements. However, since the NAK suppression back off delay is a protocol constant, and receivers rely on the NCF to suppress NAKs, receivers should always have a PGM network element as a first hop network element between themselves and every path to every PGM source. If receivers are several hops removed from the first PGM network element, the efficiency of NAK suppression may degrade
Performance requirements
The performance of PGM should be comparable to that of other reliable multicast protocols
Platforms (Operational environment)
Both Senders and receivers will have FreeBSD as their operating system.
Compliance to Standards
PGM will comply with IP multicast standards, utilising already implemented protocols and underlying operating systems.
Adaptability requirements
PGM must be able to adapt to various amounts of traffic and network load.
Accuracy requirements
Accuracy of calculations in terms of specifying the transmit window and the increment window, is crucial for reliable transfer in PGM.
Documentation requirements
There will be a textual description of the features of architecture (API), system requirements, key mechanism and designs of the design of the system.
The kernel is a single piece of software, which is permanently memory-resident and deals with a UNIX system's process scheduling and I/O control. All user processes and all file system accesses will be resourced, monitored and controlled by the kernel.
Typically the kernel includes the following:
A kernel may also include a manager for the operating system's address space in memory storage, sharing these among all components and other users of the kernel's services. A kernel's services are requested by other parts of the operating system or by applications, through a specified set of programming interfaces known as system calls.
Because the code that makes up the kernel is needed continuously, it is usually loaded into computer storage in an area that is protected so that it will not be overlaid with other less frequently used parts of the operating system.
To summarise, the specific responsibilities of the kernel are:
- To provide a number of additional services for use of other programs
The FreeBSD kernel is comprised of a number of different subdirectories that represent different parts of the kernel.
The most important, as far as the development of PGM is concerned, is the /sys/netinet directory. Within netinet resides the code for TCP, UDP, IP, IGMP and eventually PGM.
Since PGM operates over IP, not over UDP - applications would use PGM much as they use TCP. A consequence of this is that the implementation will be inside the kernel and there are several reasons for this:
i) If a process crashes but the system is up, then it can report the fact to the far end reliably
ii) Timers can be fired more accurately within the operating system
iii) If a user space protocol is written that is implemented above IP, it is only possible to have one such process per IP protocol number (due to the impossibility of de-multiplexing incoming packets to the right process otherwise).
The advantage of doing a kernel PGM host implementation is that it is more robust, generally more useful and can be used by small modifications to existing applications that use the socket API.
This chapter describes the methodology and the stages of the design phase of the PGM protocol.
Having completed the analysis of the specification it became obvious that the host behaviour can be very complicated to understand without the use of a structured and organised approach. It was then decided that the best way to overcome the problems of organising the design would be to use the concepts of the Unified Modelling Language (UML).
The group members formed two teams in order to develop the design for the sender and the receiver.
Although UML is a modelling language better suited to application development, the group chose to use it for protocol development. This decision was based purely on the belief that the UML principles could be used in order to visualise the sender and receiver sequence of actions as well as the different states within the protocol, and how the transitions from one state to the other occur.
In addition, defining the use cases would make sure that every possible aspect of the protocol behaviour was included in the design.
Once the UML design was completed, the sender and receiver architecture diagrams were produced. These illustrate the actions that take place to send or receive packets, and how these actions are controlled. These diagrams are presented in Appendix A.
A use case is a narrative document that describes the sequence of events of an actor (an external agent) using a system to complete a process. (Jacobson, 1992). It describes stories or cases of using a system, thereby illustrating requirements, by defining the domain processes and the external environment, which consists of actors who participate in these processes. Strictly speaking, use cases are not actually object oriented analysis artefacts and are often used within non-object oriented approaches. The UML formally includes the notion of use cases and use case diagrams.
Abbreviations
ACK Acknowledgement
NAK Negative Acknowledgement
NCF NAK Confirmation
ODATA Original Data
RDATA Retransmitted Data
SPM Source Path Message
Transmit Window (Definitions)
TXW_MAX_RTE Maximum transmit rate (in bytes/sec) maintained by a source to regulate its bandwidth utilisation.
TXW_SECS Amount of transmitted data (in seconds) a source retains for retransmissions.
TXW_BYTES The size of the transmit window in bytes.
TXW_SQNS The size of the transmit window in sequence numbers.
TXW_TRAIL The sequence number of the oldest data packet available for retransmission.
TXW_LEAD The sequence number of the most recent data packet source has transmitted
TXW_ADV_SECS The fraction of the transmit window size (in seconds of data) by which the transmit window is advanced.
TXW_INC The leading (or right) edge of the increment window.
TXW_ADV_IVL_TMR A timer that the source starts in anticipation of advancing the transmit window. This timer runs for TWX_ADV_IVL if not reset by special events meanwhile
TXW_ADV_IVL The interval after which the source will advance transmit window. It has a value in the range (0 - TXW_ADV_SECS)
Data transmission
Sources multicast ODATA packets to the group within the transmit window at a given transmit rate. A source numbers ODATA packets in the order in which they are submitted for transmission by the application. A source transmits ODATA packets in sequence and only within the transmit window beginning with TXW_RAIL at no greater rate than TXW_MAX_RTE. Source uses a token bucket scheme combined with a leaky bucket drained at a maximum transmit in order to regulate its transmit rate.
Source Path State
Source multicast SPMs to the group, interleaved with ODATA if present, to establish source path state in PGM network elements. There are two ways the source transmits SPMs. Ambient SPMs are interleaved with ODATA and RDATA at a rate at least sufficient to maintain current source path state in PGM network elements. In the absence of data to transmit, a source transmits heartbeat SPMs at a decaying rate in order to assist in early detection of lost data, to maintain current source path state in PGM network elements and to maintain current receive window state in the receivers.
NAK reliability
A source immediately multicasts an NCF in response to any NAK it receives. The NCF is required since the alternative of responding immediately with RDATA would not allow other PGM network elements on the same subnet to do NAK anticipation, nor would it allow DLRs on the same subnet to provide retransmissions.
Data retransmission
Sources multicast RDATA packets to the group in response to NAKs received for data packets within the transmit window. They do that after having send the corresponding NCF and while respecting TXW_MAX_RTE. RDATA are transmitted at priority over concurrent ODATA.
Transmit window advance
Sources may advance the trailing edge of the window arbitrarily. Implementations may support automatic adjustments such as keeping the window at a fixed size in bytes, a fixed number of packets or fixed real time duration. In addition they may optionally delay window advancement based on NAK-silence for a certain period. There are three suggested models for advancing the transmit window.
Advancing with Data: The transmit window maintains the last TXW_BYTES bytes of data for retransmission. TXW_MAX_RTE is calculated from both ODATA and RDATA, and NAKs reset TXW_ADV_IVL_TMR. While TXW_ADV_IVL_TMR is running, a source uses the receipt of a NAK for ODATA within the increment window to reset the timer to TXW_ADV_IVL. That way the transmit window advancement is delayed until no NAKs for data in the increment window are seen for TXW_ADV_IVL seconds. If the transmit window should fill in the meantime, further transmissions would be suspended until the transmit window can advance.
Advancing with Time: The transmit window maintains the last TXW_SECS of data in real-time, regardless of whether any data was sent in that real time period or not. TXW_MAX_RTE is calculated from ODATA only to maintain a constant data throughput rate by consuming extra bandwidth for retransmissions. TXW_ADV_IVL has the value of 0 which advances the transmit window without regard for whether NAKs for data in the increment window are still being received.
Advancing under explicit application control: An implementation may provide this ode and allow an application to explicitly control the window in terms of APDUs.
A PGM Receiver can: Receive PGM packet
Receive ODATA/RDATA
Receive Source Path Messages (SPM)
Receive NCF
Receive NAK
Send Negative Acknowledgements (NAK)
Retransmit locally RDATA
Some definitions first:
Contiguous data:
Contiguous data is comprised of those data packets within the receive window that have been received and are in the range from RXW_TRAIL up to (but not including) the first missing sequence number in the receive window. The most recently received data packet of contiguous data defines the leading edge of contiguous data.
Receive PGM packet Use Case:
When the receiver receives a PGM packet it checks if this packet is outside the receiving window. If yes it discards it otherwise it has to check the type of the packet. If the packet is ODATA/RDATA then it has to check if this packet is duplicated. If yes it discards it otherwise it will dispatch it to one of the four Use cases (Receive ODATA/RDATA, Receive NCF, Receive NAK, Receive SPM) according to the type of the packet.
Receive ODATA/RDATA Use Case:
The receiver initiates Data Reception by beginning with the first ODATA_SQN it receives within the advertised transmit window. This sequence number temporarily defines the trailing edge of the transmit window from the receiver's perspective. That is, it is assigned to RXW_TRAIL_INIT within the receiver, and until the trailing edge sequence number advertised in subsequent packets (SPMs or ODATA or RDATA) increments through RXW_TRAIL_INIT, the receiver must only request retransmissions for sequence numbers subsequent to RXW_TRAIL_INIT.
When the receiver receives any ODATA or RDATA packet it checks the sequence number of the packet. If the received packet is the next in the sequence then the receiver adds this packet to the contiguous data. A receiver must deliver only contiguous data packets to the application, and it must do so in the order defined by those data packets' sequence numbers. By receiving this packet the receiver updates the trail and the lead of the receive window. Finally A receiver may maintain full copies of any packet in the receive window for possible retransmission even after having delivered that data to the application.
Receive Source Path Messages (SPM) Use Case:
By receiving an SMP message the receiver determines/updates the NLA (Network Layer Address) of the upstream PGM network element for use in NAK addressing for every TSI. Also it must check for gaps in the expected data sequence by comparing the SPM_LEAD of the received SPM with the leading edge of the contiguous data. If the receiver has not received all intervening data packets, it must initiate selective NAK generation for each missing sequence number (See later). A receiver cannot initiate retransmit requests until it has received at least one SPM for the corresponding TSI.
Send Negative Acknowledgements (NAK) Use Case:
The receiver receives an ODATA/PDATA packet with a particular sequence number and detects gaps in the expected data sequence by comparing the sequence number on
the most recently received ODATA or RDATA packet with the leading edge of contiguous data. So it initiates a selective NAK generation for each intervening missing sequence number.
For every NAK the receiver:
Therefore NAK generation follows the detection of a missing data packet and is the cycle of waiting for NAK_RB_IVL, listening for matching NCFs, transmitting a NAK if a matching NCF is not heard, waiting NAK_RDATA_IVL, and recommencing NAK generation if the matching data is not received. During that period if any of the timers NAK_GEN_IVL or NAK_RPT_IVL goes off then the NAK generation is cancelled indicating unrecoverable data loss.
So the general procedure is: Assume that the receiver wants to generate a NAK with a particular NAK_SQN and NAK_TSI. Firstly receiver must suspend the NAK generation if a duplicate of this NAK is already pending from this receiver. A NAK is pending from this receiver if NAK_RB_IVL for this NAK has been initiated in this receiver but has not yet passed. Also the receiver must suspend the NAK generation if a duplicate of this NAK is already outstanding from this or another receiver. A NAK is outstanding from this or another receiver if NAK_RDATA_IVL for this NAK has been initiated in this receiver but has not yet passed.
If the NAK is not suspended, then before transmitting it the receiver waits some interval NAK_RB_IVL chosen randomly and uniformly over NAK_BO_IVL, during which it listens for a matching NCF that may be transmitted in response to the same NAK from another receiver. It also triggers the NAK_GEN_IVL timer. A matching NCF must match NCF_TSI with NAK_TSI, and NCF_SQN with NAK_SQN. If the receiver hears a matching NCF within the NAK_RB_IVL it suspends the NAK, triggers the NAK_RDATA_IVL and waits for the data. If a matching NCF is not received the receiver, after the expiry of NAK_RB_IVL, transmits a NAK (and also multicasts it with TTL = 1 to the local group) in order (from the oldest to the newest) to the upstream PGM network element for the TSI specifying the transport session identifier and missing sequence number. It must repeat the NAK at a rate of NAK_RPT_RTE for an interval of NAK_RPT_IVL until it receives a matching NCF. It must then wait NAK_RDATA_IVL before recommencing NAK generation. If it hears a matching NCF during NAK_RDATA_IVL, it must wait anew for NAK_RDATA_IVL before recommencing NAK generation (i.e., NCFs restart NAK_RDATA_IVL). Receivers should transmit NAKs for data packets in the increment window at priority over NAKs for data packets in the remainder of the receive window.
After sending a NAK the receiver triggers the NAK_RPT_IVL timer. If this timer goes off before a matching NCF has been received, then the receiver is indicated unrecoverable data loss. If a matching NCF is received then the NAK_RPT_IVL timer stops and the NAK_RDATA_IVL timer is triggered next. If this timer also expires before RDATA has been received then the receiver selects a new value of NAK_RB_IVL, and starts again. If the NAK_GEN_IVL expires before RDATA has been received, then the NAK generation is cancelled and the receiver is indicated unrecoverable data loss.
A receiver must discard any NCFs -it hears for data packets outside the receive window. If a receiver hears an NCF for a data packet in the receive window for which it has no retransmit state, it should discard the NCF only if it has already received the matching data packet. If it has not already received the matching data packet, it should wait NAK_RDATA_IVL and then commence NAK generation itself, beginning with the random back off procedure.
Retransmit locally RDATA Use Case:
By receiving an NCF packet, receivers may detect retransmit requests from other receivers by comparing the sequence number on any NCF received for any data packet in the receive window. If the receiver has received the corresponding data packet, it may initiate RDATA generation for that packet.
After detecting the retransmission the receiver sets a timer called RDATA_RB_IVL, which randomly delays the transmission of a given RDATA packet. RDATA_RB_IVL is counted down from the time the retransmit request is detected. Expiry of RDATA_RB_IVL causes transmission of the RDATA packet. Also the receiver must cancel RDATA generation if a duplicate of the RDATA packet is already pending from this receiver. An RDATA packet is pending from this receiver if RDATA_RB_IVL for this RDATA packet has been initiated in this receiver but has not yet passed.
During the RDATA_RB_IVL interval the receiver listens for a matching RDATA packet that may be transmitted from another receiver in response to the same NCF. In this case the receiver must cancel RDATA generation. A matching RDATA packet must match RDATA_TSI and RDATA_SQN.
Upon expiry of RDATA_RB_IVL, a receiver may multicast the RDATA packet to the group. The RDATA packet, other than its type (and therefore its checksum), must be an exact duplicate of the corresponding ODATA packet.
Receive NCF Use Case:
By receiving an NCF packet the receiver must check if it has this packet. If yes then activates the local retransmission Use Case. Otherwise, call the Send NAK Use Case (Back off and wait for the RDATA)
Receive NAK Use Case:
The receiver checks the NAK_SQN and the NAK_TSI to figure out if it has the packet. If yes then it ignores the NAK. If it doesn't have the packet but it has already sent a NAK then it ignores it again and waits for the NCF or RDATA. If there is a NAK pending then the receiver suspends the NAK generation and backs off (see Send NAK Use Case).
3.4 Sender Architecture Diagram
The following figure illustrates the initial design for the sender in terms of sequence of events and programming concepts.
Figure 3.1: Initial sender architecture diagram
As it can be seen form the figure, there are a number of threads: AdvanceThread, SPMThread, ODATAThread, TokenThread, RDATAThread, RegulatorThread, LeakyThread, and NAKReceiverThread.
The data is received from the application and put into the Window buffer. The ODATAThread takes the data from the Window, adds a PGM header and puts them in the ODATA queue. From there, the ODATA will be passed on to the Leaky buffer by the Regulator thread, as soon as it collects the right amount of tokens. Similarly, the SPM packets are put in the SPM queue by the SPM thread. As soon as a NAK is received, the NAKReceiver thread puts it in the NAK buffer. Next the RDATA thread arranges for an NCF to be sent and gets the corresponding packet from the Window buffer, which is then put in the RDATA queue. In addition it signals the AdvanceThread to advance the Window buffer.
3.5 Receiver Architecture Diagram
The following figure illustrates the initial design for the receiver. As in the case of the sender, there are a few threads: Input Handler, DATA Handler, NCF Handler, SPM Handler, NAK Handler, NAK Initiator, Local Retransmitter and Create NAK.
Figure 3.2: Initial receiver architecture diagram
The data is received and put in the input buffer. From there the input handler checks the type of packet in order to put it in the appropriate queue. Further checks are done to determine if the packet is out of range or if it is a duplicate. If the packet is RDATA, then the input handler informs the local retransmitter. If a missing packet is detected, then the NAK initiator is activated which in turn activates the Create NAK thread that creates and puts the corresponding NAK in the output buffer.
In the following the refined use cases for the sender and the receiver are presented.
Use Case: Send PGM ODATA packet
Section: Main
Use Case: Send PGM ODATA packet
Purpose: Senders multicast sequenced PGM packets.
Overview: The application opens a socket, calls sendto system call, the timers are initialised and the SPM generator is set to ambient mode. New sequence numbers and extension headers are created for the packets. The socket’s send buffer is used as the window buffer. The ODATA packet is copied to the socket buffer and then it is appended to the ODATA queue.
Type: primary and essential
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Use Case: Send Source Path State (SPM)
Section: Main
Use Case: Send Source Path State (SPM)
Purpose: Senders send SPMs in order to establish source path state.
Overview: Source multicast SPMs to the group, interleaved with ODATA if present, to establish source path state in PGM network elements. There are two ways the source transmits SPMs. Ambient SPMs are interleaved with ODATA and RDATA at a rate at least sufficient to maintain current source path state in PGM network elements. In the absence of data to transmit, a source transmits heartbeat SPMs at a decaying rate in order to assist in early detection of lost data, to maintain current source path state in PGM network elements and to maintain current receive window state in the receivers.
Type primary and essential
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Use Case: Data Retransmission
Section: Main
Use Case: Data Retransmission
Purpose: Senders retransmit packets (RDATA) in response to requests.
Overview: Sources multicast RDATA packets to the group in response to NAKs received for data packets within the transmit window. They do that after having send the corresponding NCF and while respecting TXW_MAX_RTE. RDATA are transmitted at priority over concurrent ODATA.
Type primary and essential
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Use Case: NAK Reliability
Section: Main
Use Case: NAK Reliability
Purpose: Senders multicast NCFs in response to any NAK they receive.
Overview: A source immediately multicasts an NCF in response to any NAK it receives. The NCF is required since the alternative of responding immediately with RDATA would not allow other PGM network elements on the same subnet to do NAK anticipation, nor would it allow DLRs on the same subnet to provide retransmissions.
Type: primary and essential
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Following are the receiver use cases.
Use Case: Receive PGM packet
Input_handler
Section: Main
Use Case: Receive PGM packet
Purpose: Receivers receive PGM packets within the transmit window and eliminate any duplicates.
Overview: When the receiver receives a PGM packet it checks if this packet is duplicated or outside the receiving window. If yes it discards it otherwise it will dispatch it to one of the four Use Cases (Receive ODATA/RDATA, Receive NFC, Receive NAK, Receive SPM)
Type: primary and essential
Cross References: Functions:
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: find_tsi
Typical Course of Events
|
|
Action |
Protocol Response |
|
If these conditions were met then, go to step 4. Otherwise, go to step 5.
|
Section: is_in_range
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: lr_check_if_pending
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: nak_initiator
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: check_if_pending
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: is_duplicate
Typical Course of Events
|
|
Action |
Protocol Response |
|
4. Return 0. (No match was found) |
Section: deliver_contiguous_data
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Use Case: Receive ODATA/RDATA
Data_handler
Section: Main
Use Case: Receive ODATA/RDATA
Purpose: Receive ODATA/RDATA from Receive PGM packet (T1), order them in the buffer if out of sequence, if a missing packet was noticed
Overview: The receiver initiates Data Reception by beginning with the first ODATA_SQN it receives within the advertised transmit window. This sequence number temporarily defines the trailing edge of the transmit window from the receiver’s perspective. That is, it is assigned to RXW_TRAIL_INIT within the receiver, and until the trailing edge sequence number advertised in subsequent packets (SPMs or ODATA or RDATA) increments through RXW_TRAIL_INIT, the receiver must only request retransmission for sequence numbers subsequent to RXW_TRAIL_INIT. When the receiver receives any ODATA or RDATA packet it checks the sequence number of the packet. It discards any data packet that duplicates one already received in the transmit window and also discards any data packet to the contiguous data. A receiver must deliver only contiguous data packets to the application, and it must do so in the order defined by those data packets’ sequence numbers. By receiving this packet the receiver updates the trail and the lead of the receive window. Finally A receiver may maintain full copies of any packet in the receive window for possible retransmission even after having delivered that data to the application.
Type primary and essential
Cross-References: Functions:
Use Cases: Receive PGM packet use case initiates this use case. (Input_handler calls data_handler)
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: Create_tsi
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: Create_pckt
Typical Course of Events
|
|
Action |
Protocol Response |
|
Return newly created packet structure. |
Section: advance_window
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Use Case: Receive Source Path Messages (SPM)
Spm_handler
Section: Main
Use Case: Receive Source Path Messages (SPM)
Purpose: Receivers use SPMs to determine the last-hop PGM network element for a given TSI to which to direct their NAKs.
Overview: By receiving an SPM message the receiver determines/updates the NLA (Network Layer Address) of the upstream PGM network element for use in NAK addressing for every TSI. Also it must check for gaps in the expected data sequence by comparing the SPM_LEAD of the received SPM with the leading edge of the contiguous data. If the receiver has not received all intervening data packets, it must initiate selective NAK generation for each missing sequence number (See later). A receiver cannot initiate retransmit requests until it has received at least one SPM for the corresponding TSI.
Type primary and essential
Cross-References: Functions:
Use Cases: Receive PGM packet use case initiates this use case. (Input_handler calls spm_handler)
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: del_pckts_rqueue
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: cancel_pending_naks
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: suspend_nak
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Use Case: Receive NAK Confirmation (NCF)
Ncf_handler
Section: Main
Use Case: Receive NAK Confirmation (NCF)
Purpose: By receiving an NCF packet the receiver must check if it has this packet. If yes then activates the local retransmission Use Case. Otherwise, call the Send NAK Use Case (Back off and wait for the RDATA)
Overview: This use case begins when an NCF packet for a given transport session identified by a TSI is dispatched by Receive PGM packets use case and so the "NCF handler" checks the NAK_SQN and the NAK_TSI to figure out if it has the packet. If yes then it ignores the NAK. If it doesn’t have the packet then it informs the "NAK Initiator" for the arrival of this NAK.
Type primary and essential
Cross-References: Functions:
Use Cases: Receive PGM packet Use Case initiates this use case. (Input_handler calls ncf_handler)
Typical Course of Events
|
|
Actor Action |
System Response |
|
|
Section: Find_packet
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Use Case: Receive Negative Acknowledgement (NAK)
Nak_handler
Section: Main
Use Case: Receive Negative Acknowledgement (NAK)
Purpose: Receive NAKs and direct them to proper action.
Overview: The receiver checks the NAK_SQN and the NAK_TSI to figure out if it has the packet. If yes then it ignores the NAK. If it doesn’t have the packet but it has already sent a NAK then it ignores it again and waits for the NCF or RDATA. If there is a NAK pending then the receiver suspends the NAK generation and backs off.
Type primary and essential
Cross-References: Functions:
Use Cases: Receive PGM packet Use Case initiates this use case. (Input_handler calls the nak_handler)
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Use Case: Send Negative Acknowledgments (NAK) Use Case
Nak_state_machine
Section: Main
Use Case: Send Negative Acknowledgement (NAK)
Purpose: Intiate the process of sending NAKs to the PGM upstream network element.
Overview: This use case begins when the "NAK Initiator" is activated and has received requests for NAK generations. For every NAK request, the "NAK Initiator" will do:
Step 1a: Activates a timer called NAK_RB_IVL, which randomly delays the transmission of a given NAK from a receiver. If this timer goes off, go to step 3. If it receives the corresponding ODATA/RDATA during that time, goes to step SUSP. If it receives a corresponding NAK/NCF, it stops NAK_RB_IVL timer and goes to step 4.
Step 1b: Activates a timer called NAK_GEN_IVL, which limits the length of time for which the "NAK Initiator" will retry a NAK while waiting for the corresponding RDATA. If this timer goes off then, the "NAK Initiator" goes to step UNRECOVERABLE DATA.
Step 3: The "NAK Initiator" activates the Send NAK function and sends the NAK. Then the "NAK Initiator" activates another timer, called NAK_RPT_IVL, which limits the length of time for which the "NAK Initiator" will repeat a NAK while waiting for the corresponding NCF. If this timer goes off the "NAK Initiator" goes to step UNRECOVERABLE DATA. On receipt of a corresponding NCF it stops the NAK_RPT_IVL timer and goes to step 4. On receipt of ODATA/RDATA the "NAK Initiator goes to step SUSP. Corresponding NAKs are ignored.
Step 4: Activates a final timer, called NAK_RDATA_IVL, that limits the length of time for which the "NAK Initiator" will wait for the RDATA corresponding to a confirmed NAK. On receipt of the corresponding ODATA/RDATA the receiver goes to step SUSP. If this timer goes off then the "NAK Initiator" has to go back to step 1. Receipt of corresponding NCFs causes the "NAK Initiator" to restart the timer and stay at step 4. Corresponding NAKs are ignored.
Step SUSP: The "NAK Initiator" stops all timers and indicates successful receipt of missing packets.
Step UNRECOVERABLE DATA: The "NAK Initiator" stops all timers and indicates unrecoverable missing data.
Type primary and essential
Cross-references: Functions:
Use Cases: Receive NCF Use Case and NAK handler Use Case both initiate this use case.
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: unrecoverable_data
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: send_nak
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Use Case: Retransmit locally RDATA
Lr_state_machine
Section: Main
Use Case: Use Case: Retransmit locally RDATA
Purpose: Receivers may multicast retransmissions of any data in their receive windows for which they receive a matching NFC. Receivers may also suppress retransmissions for which a matching retransmission is received during the retransmit back-off interval.
Overview: By receiving an NCF packet, receivers may detect retransmit requests from other receivers by comparing the sequence number on any NFC received for any data packet in the receive window. If the receiver has received the corresponding data packet, it may initiate RDATA generation for that packet. After detecting the retransmission the receiver sets a timer called RDATA_RB_IVL, which randomly delays the transmission of a given RDATA packet. RDATA_RB_IVL is counted down from the time the retransmit request is detected. Expiry of RDATA_RB_IVL causes transmission of the RDATA packet. Also the receiver must cancel RDATA generation if a duplicate of the RDATA packet is already pending from this receiver. An RDATA packet is pending from this receiver if RDATA_RB_IVL for this RDATA packet has been initiated in this receiver but has not yet passed. During the RDATA_RB_IVL interval the receiver listens for a matching RDATA packet that may be transmitted from another receiver in response to the same NCF. In this case the receiver must cancel RDATA generation. A matching RDATA packet must match RDATA_TSI and RDATA_SQN. Upon expiry of RDATA_RB_IVL, a receiver may multicast the RDATA packet to the group. The RDATA packet, other than its type (and therefore its checksum), must be an exact duplicate of the corresponding ODATA packet.
Type primary and essential
Cross-References: Functions:
Use Cases: SPM handler Use Case intiates this use case. (spm_handler calls lr_state_machine)
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: transmit_rdata
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: local_retransmitter
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
Section: lr_suspend
Typical Course of Events
|
|
Action |
Protocol Response |
|
|
This chapter describes the implementation procedure and the development strategy followed by the group members in order to do the coding.
As it was described in Ch.3, the design was divided in two sections: sender and receiver. Separate Use Cases were developed for each section and their functions were identified. Consequently, the implementation procedure was done in a similar way. The coding of the sender and the receiver was done concurrently by the two teams in which the group had split.
Initially the group worked together in writing code that would be used both by the sender and the receiver. Having completed that stage, two teams were formed to do the coding for the sender and the receiver.
Before elaborating on the coding stages, some kernel concepts and structures have to be presented.
4.3 Networking code within the kernel
The networking code in the kernel is organised in three layers:
Networking protocols are very demanding in terms of memory management of the kernel; moreover the protocol performance is directly related to the memory management scheme used within the kernel. This means that buffers of varying sizes need to be manipulated when prepending and appending data to them.
Memory buffers are used within the networking code of the FreeBSD kernel. The main use of mbufs is to hold the user data that travels from the process to the network interface and vice versa. They are also used to contain a variety of other data such as source and destination addresses, socket options and so on. Mbufs were used extensively within the PGM code to hold various kinds of information.
Another important issue in the development of the PGM code was the use of Protocol Control Blocks (PCBs). PCBs are used at the protocol layer to hold the various pieces of information required for each PGM socket. PGM also makes use of the Internet PCB. The Internet PCB is a transport layer data structure and is used by TCP, UDP, PGM and raw IP, but not by IP, ICMP or IGMP. The Internet PCB contains information common to all PGM (also TCP and UDP) endpoints. For PGM these are:
The PGM Protocol Control Block contains all the state information such as sequence numbers in both directions, window sizes, retransmission timers etc.
At compile time, a protosw structure is allocated for each protocol in the kernel and groups the structures for all protocols within a single domain into an array. Each domain structure references the appropriate array of protosw structures. A kernel may provide multiple interfaces to the same protocol by providing multiple protosw entries.
4.4 Implementation of common structures
The implementation of TCP and UDP were used as a guidance in order to develop the PGM code. The PGM code was structured in a similar way using some of the functions already present in UDP and TCP, modifying them however to suit the protocol needs in terms of structures and variables. Following that, the functions specific to PGM were created in order to implement the exact functionality of the protocol as it was described in the draft given to the group.
The first step towards coding was to create the structures that would be used by both the sender and the receiver.
pgm.h
This is the header file for PGM that contains structures relative to the PGM packet headers.
The initial structures added to this file involved the packet headers and the packet header extensions such as:
struct pgmhdr { }
PGM SPM header extension
struct pgm_spm_ext { }
PGM header extension for NAKs and NCFs
struct pgm_neg_ext { }
PGM header extension for ODATA and RDATA packets
struct pgm_data_ext { }
PGM Packet headers combined
struct pgm_neg_hdr { }
struct pgm_spm_hdr { }
struct pgm_data_hdr { }
More structures were added to pgm.h as both teams were developing their code.
Another file that contained structures as well as some constants common to the sender and the receiver was pgm_var.h
pgm_var.h
This file contains the PCB (Protocol Control Block) structures and their corresponding fields.
struct pgmiphdr { }
struct pgmcb { }
4.5 Mapping of design to the kernel programming needs
During the early programming phase it was realised that kernel programming needs a significantly different approach than the one initially employed. Therefore the diagrams presented in the previous section were adapted to the new requirements and were defined again in a more detailed way.
The following diagram shows the sender architecture diagram from Ch.3, adapted to the kernel programming needs.
Figure 4.1: Detailed sender architecture
This section presents a more detailed description of the sender part of the code and it will be divided into sub-sections. Each sub-section will cover a particular part or functionality of the sender module. If the reader wishes to acquirer a deeper understanding of the code, the same should refer to the source code appended to this report.
4.5.1.1 Sending ODATA
First of all, the application opens a socket by invoking the open socket system call. Then it calls the pgm_usrreq(so, req, m, addr, control) function with PRU_ATTACH as the request (req). This causes the pgm_attach(so) to run which in turn allocates a new inpcb for the socket, creates a new pgmcb and links the two of them.
Next, the application calls sendto system call. This again triggers pgm_usrreq( ) with PRU_SEND as request. Inside the send block the timers are initialised, the SPM generator is set to ambient mode (since now there are packets sent). New sequence numbers are created pgm_seq_num(tp, GEN_OD_SEQ) as well as the PGM extension header for the packet. The socket's send buffer is used as the window buffer sbappendrecord(&so® so_snd, m) and the packet (containing ODATA i.e. original data, sent for the first time) is copied into the socket buffer and then appended to the ODATA queue:
dcopy = m_copy(m, 0, M_COPYALL);
AppendToQueue(tp® ODATAQueue, dcopy, addr, control);
The traffic shaping function traffic_shaper(inp, tp) implements the token bucket and leaky bucket functionality shown in the original sender architecture diagram. What this function does is going through the three queues, SPM, RDATA and ODATA, in this order and if it finds a packet then the function pgm_output(inp, elem® buf, addr, control, pkt_type) is called and the packet is passed to the IP layer while resetting the timer. However, before calling the pgm_output() the traffic shaper first checks if there are enough tokens in the token buffer to send the packet. If this is the case pgm_output() is called and the amount of packets required for this packet is subtracted from the token buffer, otherwise nothing happens, i.e. the pgm_output() is not called hence the packet is not sent.
4.5.1.2 Receiving a NAK
When there is a NAK at the interface, IP calls pgm_input_recv(m, ip, ph, phext, type) which checks whether the destination address of the packet is unicast or multicast. If it is unicast then the packet is a NAK, it then checks to see if there is a matching PCB (i.e. for that destination address and port number) for it. When the PCB is found, the sender NAK handler is called snd_nak_handler(inp, phext). This function generates an NCF ncf = mgethdr(M_DONTWAIT, MT_HEADER) fills in the fields and the header extensions and then calls pgm_output(inp, ncf, tp® dest_addr, tp® control, PGM_NCF_TYPE) so that a response for the incoming NAK is acknowledged straight way.
Next, there is a check to see if the data corresponding to the NAK received are in the Window Buffer. If the ODATA is found, the advance window will be notified about the NAK, and the data is appended to the RDATAQueue: appendToQueue(tp® RDATAQueue, rdata, tp® dest_addr, tp® control).
4.5.1.3 Sending SPMs
As stated in the Sending ODATA section the SPM timer is triggered whenever the first user request is called to send some data. Therefore, this timer will run continuously according to the mode, which it is operating in. Each time that this timer expires the SPM generator function SPMGenerator(tp) is called. Its initial mode of operation is ambient mode as this is set when the user sends some data. This mode will change if no data has been sent when the timer expires. If the mode is changed to heartbeat, then SPMs are generated with a decay delay, until a maximum value is reached. When this value is reached, it becomes again a steady heartbeat.
The switch from ambient to SPM mode occurs in the following way: every time the user sends some data the mode is set to ambient, however, every time that an SPM is sent the mode is switched back to heartbeat.
Therefore, during the timer life cycle, if there are no new packets from the user the mode will remain heartbeat. Therefore, the SPM generator starts working in the heartbeat mode, sine there were no packets sent for that time period. If this is the case, each time the timer expires, as explained before, the next delay is increased until it reaches a maximum value. The generation of the new SPM hence the new mbuff is as follows:
spm = m_gethdr(M_DONTWAIT, MT_HEADER)
then the fields of the SPM are filled in and the packet is appended to the SPMQueue, in order to be delivered by the traffic shaper to pgm_output().
appendToQueue(tp® spmQueue, spm, tp® dest_addr, tp® control).
4.5.1.4 Timers used by the sender
The following three timers are used by the sender:
The function pgm_slowtimo() is called by the operating system every 500 ms. This function searches through all the PCBs and updates all the active timers. That is, it decreases all active timers by one, and if a particular timer reaches zero after this subtraction it then calls the pgm_timer() function with the appropriate parameter.
ipnxt = ip->inp_list.le_next;
tp = intopgmcb(ip);
if (tp == 0)
continue;
for (i = 0; i < PGM_NTIMERS; i++)
4.5.2 Receiver Code Description
The following diagram shows the receiver architecture diagram from Ch.3, adapted to the kernel programming needs.
Figure 4.2: Detailed receiver diagram
4.5.2.1 Receiving ODATA
When there is a packet at the interface, IP calls pgm_input_recv(m, ip, ph, phext, type) which checks whether the destination address of the packet is unicast or multicast. If it is multicast then the packet is an ODATA, it then checks to see if there is a matching PCB (i.e. for that destination address and port number) for it. When the PCB is found, then the function input_handler(inp, m, ph) is called. This function checks the type of the PGM packet received in order to dispatch it to the appropriate handler. There are also some further checks that are performed such as a check to see if the packet is within the receive window with the function is_in_range(m, inp, ph). If the packet is not in range it is discarded. There is also a check for duplicate packet with the function is_duplicate(m, tsi, sqn).
A similar procedure is followed in the case of RDATA, SPMs, NAKs and NCFs. Each time that one of the above packets is received, its type is checked and it is dispatched to the appropriate handler.
4.5.2.2 Sending NAK
A missing packet is detected by the input_handler() with the following procedure: when an ODATA packet is received and it is not a duplicate, the input_handler() calls the function check_if_pending(tsi, sqn) which checks whether there is a pending NAK for the packet with the specified sequence number. If not then a NAK cycle is initiated nak_state_machine(inp, tsi, nak, pgm_pkt_type).
Additionally the nak_state_machine() can be initiated by the nak_handler() and the ncf_handler().
4.5.2.3 Transmitting RDATA
This section begins when an NCF packet is received. Then the local retransmission queue is checked to see if it has the specified ODATA packet. If the packet exists, then the function local_retransmitter(tp, tsi, sqn) is called, which checks if there are pending RDATA. If there are no pending RDATA, then the local retransmission state machine is initiated lr_state_machine(inp, tsi, lr, event).
This chapter describes the testing procedure, starting with the testing methodology and then continuing with the tests performed on the sender and receiver.
Testing is one of the most important stages in software development. It is the stage where it can be proved whether the implementation and the requirements are in fidelity with each other. Testing can also confirm if the software specifications are complete and consistent. One of the main goals of testing is to have a minimum number of test cases that will find a majority of the implementation errors.
There are mainly four kinds of different test that can be performed.
These are:
- Conformance tests
- Functional tests
- Performance tests
- Quality of service testing
Conformance tests: they are used to verify whether the tested object is compliant to the standard. Protocol standards are defined layer per layer therefore the conformance tests are layer oriented.
Conformance is usually verified through extensive testing of an implementation in which tests are derived directly from the protocol specification. The implementation is said to conform to the specification if it behaves in a manner predicted by the protocol specification. If it doesn’t conform then an error exists in the implementation of the protocol.
Although this method does not formally verify that a protocol specification and an implementation are consistent, it represents the state-of-the-practice in this domain of software development. [Callahan 96]
The methodology to write and execute conformance tests is specified in ISO-9646
Functional tests
: they are used to examine the functionality of the system to be tested. Functional tests may go beyond conformance tests as they check the non-standardised features of the system under test as well. A much higher number of test cases is needed compared to conformance testing as functionality testing is inherently more complex. The key issue is the efficient generation of test cases.
Performance tests: they are used to measure the behaviour of the object to be tested related to time and resources. In the case of PGM performance tests concerned mainly CPU usage and memory usage. Additionally, the results obtained were compared with similar results for UDP.
Quality of Service testing: mainly performance testing at the host interfaces.
All the above are the general test methodology that were adapted to the testing needs of the project.
5.3.1 Fidelity and Requirements errors
The errors involved with protocol development can be categorised into two main areas, fidelity and requirement errors.
Fidelity errors are usually caused from inconsistencies between the requirements and the implementation. They can be identified by the fact that the implementation fails to behave in the way specified by the requirements. Requirements errors are normally due to either incomplete or inconsistent specifications. In the case of PGM there were no requirements errors as the group used as specification the latest version of the PGM draft. This was developed following the Cisco implementation of the protocol, therefore any requirements errors present in the previous version of the draft had been corrected.
The methodology described above was used as a basis in creating the testing plan for the PGM implementation.
Functional tests were conducted during the development of both the sender and receiver. Each group tested their implementation whenever a new addition in the code was done.
This was mainly done by adding printf() messages within each function and other crucial parts of the code. That way every time the kernel (and the code) was compiled, it was possible to see whether the code was actually behaving the way it was supposed to.
Both the sender and receiver were tested separately in the above way.
The next stage in testing was to merge the sender and receiver code in order to perform further functional as well as conformance tests.
5.5 Functional testing of the Sender’s module
This section illustrates the approach taken in order to test the functionality of the sender. The main objective of this procedure is not only to check if all the sender functions work according to their specification, but also to produce a well-documented report in order to demonstrate the internal testing carried out during the code development.
The internal testing is where the developer must have access to the source code so that it is possible to internally check the transitions from one state to the next. This testing does not involve the whole system, therefore treating the protocol as a black box, instead it goes inside the code and in the PGM case with some printf() messages showing the behaviour of the protocol step by step.
Further, as it can be realised through this section the sender functions were developed and tested in a systematic and sequential manner, that is, for example the protocol was first able to transmit ODATA without having any traffic shaping. Therefore, whenever these functions were behaving as expected more functionality was added to it. This helped the developers substantially, as the whole problem could be broken down into smaller problems.
The testing plan was designed according to the implementation plan. This is because, as stated before, the whole problem was broken down into stages, that is a very simple protocol was first created without almost any functionality and following that more complex functionality was added and tested.
Particularly in the sender development team, the first step was to test all the main components, which were also used by the receiver team. This included the data structures for the PGM protocol located in the pgm.h, pgm_var.h, pgm_timer.h and the pgm_usrreq.c, pgm_timer.c files containing functions such as pgm_usrreq(), pgm_timer(), pgm_slowtimo(), pgm_input(), pgm_output(), pgm_ctloutput().
When those functions as well as the data structures were properly finished and tested, then both teams had a base to work with. So, the receiver team then worked in dcnds-pc1 whereas the sender team in dcnds-pc2. The reason for this approach is that shared files would have to be changed and recompiled, therefore if both teams used the same machines many problems would occur.
After that, the sender team developed a very simple version of an ODATA transmission which would only take the data from the application and give it to the pgm_output(), hence to the ip_input() without any extra functionality added to it.
The traffic shaper and the three data structure queues ODATA, RDATA, SPM introduce enabling priorities of SPM over RDATA and RDATA over ODATA. The traffic shaper is also responsible for shaping the traffic by generating tokens in a particular rate, which in turn are stored/removed from the TokenBuffer. At this stage we had the opportunity to test the pgm_timer() as well as the pgm_slowtimo() since traffic shaper() is a function which is activated (called) from pgm_slowtimo(), hence pgm_timer().
The SPM generator was created in order to enable the generation of SPM messages. Note that this testing case was needed in order to test the generation of SPMs in ambient mode as well in heartbeat mode.
At this stage the team was able to test the WindowBuffer which stores the incoming ODATA so that retransmission is possible. The WindowBuffer uses the output buffer of the socket to do this.
At this stage, the advance window functions were added to the code and tested. The advance window can be set into three different modes. First mode is advance by time, the second mode advance by data, and the third advance by the application. The advance by time should be the default one, otherwise the user should set some socket options specifying whether a different advance method should be used, such as advance by data or advance by application. This testing case again tested the timers as well as the removal of some data from the window buffer, the output buffer of the socket.
Following the NCF and the NAK functionality was introduced. The first step was to test if a NCF message is generated and sent immediately after a NAK has been received. Therefore, the NAK handler, the function that is responsible for this section, must be able to search for the data. If the data exists, the NAK handler creates a NAK message and appends it into the NAK queue for retransmission. The NAK handler should also signal the advance window to delay the window advancement by one time slot. If the data is not found, the process will terminate without putting any notifications on the network, apart from the NCF, which was sent when the process was first triggered.
The last component to be tested independently was the control functions. This tested whether when the user sets the advance window, for example, to advance by data, the protocol behaves as requested.
5.5.2 ODATA functional testing
Step 1 - In ODATA testing, the application will first call Socket system call, which will trigger pgm_usrreq for the PRU_ATTACH request. In response to this request the kernel will create a new pgmcb for this socket.
Step 2 - Thereafter, the application will call SendTo system call passing some flags as well as the message to be transmitted. This call will trigger the pgm_usrreq function with a PRU_SEND request.
Step 3 - In pgm_usrreq the pgm extension header is created and added to the message. The sequence number of this message is also created and added to the seq_num in the pgm extension header. The pgm_output function is then called.
Step 4 - In the pgm_output function the pgmip header is created and appended in front of the pgm extension header and the message. This packet is then handed over to the ip_output routine.
5.5.3 Traffic Shaper functional testing
Step 1 – During the first SendTo system call the traffic sharper timer is initialised. Therefore, instead of handing the ODATA packet straight to the pgm_output as in step 3 of ODATA testing, the packet will be placed in the ODATA queue.
Step 2 – The traffic shaper is then responsible of fetching this packet from the ODATA queue by obeying some priority rules. The message is only sent whenever the traffic shaper finds enough tokens in the TokenBuffer. When this is the case pgm_output function is called and the relevant amount of tokens is removed from the TokenBuffer. Therefore, the remaining part of this test is the same as step 4 of ODATA testing.
Step 3 - In the case that all queues are empty and the detach flag is set, the traffic shaper signals the pgm_usrreq to carry on with the detach procedure and it will finish without resetting its timer. If this is not the case, the detach option is not set, the traffic shaper resets its timer and finishes as normal.
Step 1 – The SPM timer gets initialised when the first SendTo system call is invoked. The SPMs have two modes, ambient and heartbeat. At this stage the ambient mode is set. Note that this variable which holds the mode of the SPM, ambient or heartbeat mode, should be reset every time the SendTo system call is called in order to work according to the design.
Step 2 – Every time this timer expires, the SPM generator function is called and it generates an SPM message. The message is then put into the SPM queue where will be later sent by the traffic shaper.
Step 3 - Before this function ends it resets its timer again. However, this timer is set according to what mode it is supposed to be operating on. This is achieved by looking at the current value of the advance_mode variable, which holds the current state mode. If the current mode is ambient mode, it will only reset the timer in a pre-defined time and reset the advance_mode to heartbeat. Otherwise, if it is in the heartbeat mode the timer will be reset with a decay value. This value is increased until a maximum value is reached. Note that in this case the advance_mode variable should stay in heartbeat mode.
Window Buffer functional testing
Step 1 – During step 3 in ODATA testing the message will be added to the socket buffer. The results, if this works properly, can only be demonstrated when some data are actually needed to be fetched from it.
5.5.5 Advance Window Buffer functional testing
Step 1 – When the sendto system call is called for the first time the advance window timer is set. This setting is done according to the mode the advance window will work. The default mode is advance by time, however the application can ask a different mode by setting the appropriate socket options. The two other possible modes are advance by data and advance by application.
Step 2 – When the advance window timer expires, the advance window function is called. It first checks if the window can be advanced. Secondly, it checks if the NAK handler has signalled the advance window to delay the advancement for the next time slot. If it can be advanced and the NAK handler has not signalled, the window tail is advanced by the TXW_INC variable.
Step 3 – If the window has been advanced, then the expired data must be freed. The function will then reset its timer and finish.
5.5.6 The NCF and NAK functional testing
Step 1 - Firstly the ip_input() calls pgm_input(). The function pgm_input() calls pgm_input_snd().
Step 2 – The pgm_input_snd() calls snd_nak_handler(). The first action of the NAK handler is then to create a new NCF message. The function pgm_output() is then called in order to send the NCF immediately.
Step 3 – The NAK handler will then check if the wanted data is in the window buffer. If it is, advance_window is notified so that it can delay the window advancement by one time slot. The data is then copied into the RDATA queue and the function returns.
Step 4 – In case the wanted data packet is not found, the NAK handler will only terminate without any notifications and messages apart from the NCF.
5.6.1 Testing plan for Receive ODATA/RDATA Use Case
A testing plan has been designed to check the functionality of the Receive ODATA/RDATA Use case for a given TSI value.
Reaction:
Lead ³ ODATA sqn ³ Trail.
Reaction:
Lead < ODATA sqn ³ Trail.
Reaction:
Lead > ODATA sqn > Trail
Lead > continuous lead > Trail
Reaction:
(Lead > continuous lead > Trail) &&
(ODATA sqn = (continuous lead + 1)).
Then
Lead < ODATA sqn > Trail.
Lead = continuous lead > Trail
Reaction:
(ODATA sqn = (continuous lead + 1) &&
(Lead = continuous lead > Trail)).
Then
Lead = ODATA sqn > Trail.
Lead > continuous lead > Trail
Reaction:
(Lead < continuous lead > Trail) &&
(ODATA sqn = (continuous lead + 1) ).
Then
Lead > ODATA sqn < Trail.
Lead > continuous lead < Trail
Reaction:
(Lead > continuous lead < Trail) &&
(ODATA sqn = (continuous lead + 1)).
Then
Lead > ODATA sqn > Trail.
Lead > continuous lead = Trail
Reaction:
(ODATA sqn = (continuous lead + 1) &&
(Lead < continuous lead = Trail)).
Then
Lead > ODATA sqn = Trail.
Lead > continuous lead < Trail
Reaction:
(Lead > continuous lead < Trail) &&
(ODATA sqn = (continuous lead + 1) ).
Then
Lead > ODATA sqn > Trail.
Lead > continuous lead > Trail.
Reaction:
(Lead > continuous lead > Trail) &&
(ODATA sqn > (continuous lead + 1)).
Then
] continuous lead, ODATA sqn [
Lead < ODATA sqn > Trail.
Lead > continuous lead > Trail.
Reaction:
(Lead > continuous lead > Trail) &&
(ODATA sqn > (continuous lead + 1)).
Then
Lead < ODATA sqn > Trail.
Lead < continuous lead > Trail
Reaction:
(Lead < continuous lead > Trail) &&
(ODATA sqn > (continuous lead + 1)).
Then
Lead < ODATA sqn > Trail.
Lead = continuous lead > Trail
Reaction:
(ODATA sqn = (continuous lead + 1) &&
(Lead > continuous lead > Trail)).
Then
Lead = ODATA sqn > Trail.
Lead > continuous lead > Trail
Reaction:
(Lead < continuous lead > Trail) &&
(ODATA sqn > (continuous lead + 1)).
Then
Lead > ODATA sqn > Trail
Lead > continuous lead < Trail
Reaction:
(Lead > continuous lead < Trail) &&
(ODATA sqn > (continuous lead + 1)).
Then
Lead > ODATA sqn > Trail.
Lead > continuous lead = Trail
Reaction:
(ODATA sqn > (continuous lead + 1) &&
(Lead < continuous lead = Trail)).
Then
] continuous lead, ODATA sqn [
Lead > ODATA sqn = Trail.
Lead > continuous lead < Trail
Reaction:
(Lead > continuous lead < Trail) &&
(ODATA sqn > (continuous lead + 1) ).
Then
Lead > ODATA sqn > Trail
Lead > continuous lead > Trail
Reaction:
(Lead > continuous lead > Trail) &&
(ODATA sqn < (continuous lead + 1)).
Then
5.6.2 Testing the functionality of the Receive SPM Use Case
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
5.6.3 Testing the functionality of the Receive NAK Use Case
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
5.6.4 Testing the functionality of the Receive NCF Use Case
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
5.6.5 Testing the functionality of the Retransmit locally RDATA Use Case
Activating the local retransmitter will set the local retransmission timer –RDATA_RB_IVL- to start, and the retransmission cycle will begin.
The local retransmission cycle has two states:
A list of all possible scenarios that might occur once a call is made to the local retransmitter will be presented as follows:
The possible events are:
Reaction:
Reaction: Do nothing, because we haven’t started the RDATA_RB_IVL timer yet.
Reaction:
Reaction:
Reaction:
5.6.6 Testing the functionality of the Send NAK Use Case
Receiving requests for NAK generation activates the NAK initiator. Six states are associated with every NAK request:
A list of all possible scenarios that might occur once a call is made to the NAK initiator will be presented as follows:
The possible events are:
Reaction:
Reaction: Do nothing, because we haven’t started the NAK_RB_IVL nor NAK_GEN_IVL timers yet.
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Reaction:
Once in the we are in the SUSPEND State of the NAK initiation cycle, the "NAK Initiator" stops all timers and indicates successful receipt of missing packet.
Once we are in the UNREC_DATA State of the NAK initiation cycle, the "NAK Initiator" stops all timers and indicates unrecoverable missing data.
Performance testing was the last part of the testing procedure. It was mainly related to the performance of the PGM protocol in terms of speed. This involved testing various sections of the code to determine how fast some functions execute.
The objective was to measure the time that it takes from the point that a sendto system call is called, to the point where the packet is sent. In order to obtain realistic results, this procedure was repeated fifty times, and the average was taken.
A similar procedure was used to obtain the time elapsed since a packet is received at ip_input() until it arrives at the application layer. Again, this was repeated fifty times and the average was taken.
In order to be able to analyse the results, the same testing procedure was repeated for UDP. That way the results for UDP could be compared with the results for PGM.
5.7.1 PGM Sender Performance Testing
Example 1
In the graph above it can be seen how send requests are served w.r.t time. The reason that values decrease as more send requests are invoked, has to do with the values that were chosen for this particular measurement. There are five values than are responsible for this outcome. These are: Maximum Transmit Rate (TXW_MAX_RTE), Transmit Window Size (TXW_BYTES), Packet Size, and the Number of Packets to be transmitted. In the table below, we can see the values of those four variables as chosen for this example.
TXW_MAX_RTE |
TXW_BYTES |
Packet Size |
Number of Packets |
150kBytes/sec |
560kBytes |
3kBytes |
50 |
The amount of bytes to be send can fit in the transmit window without filling it up. This means that the application doesn’t have to block waiting for the window to advance in order to continue sending the rest of its packets. Also, the transmission rate can support the whole number of bytes to be transmitted in one go. That means that the total amount of data can be sent in one second. That’s exactly what happens. The application appends the data in the ODATA transmit buffer before the first second (sec) expires. The traffic shaper timer wakes up when the first second is over and transmits the whole amount of data in one go. The duration of the send request is measured from the time the application invokes the PRU_SEND request to the time the PGM protocol handles the packet to the IP layer and returns. According to this, the first values are a bit higher and decrease as the number of packets increases since the wait time until the traffic shaper is active is included to their duration but wears off for the rest of them.
Example 2
In this example, a new set of values for the variables mentioned above is chosen. The protocol response, as the graph shows, is quite different. The table below contains the new values for the variables. The response is normal if the facts are considered carefully.
TXW_MAX_RTE |
TXW_BYTES |
Packet Size |
Number of Packets |
50kBytes/sec |
560kBytes |
3kBytes |
50 |
The difference in this example is that the maximum transmit rate is smaller than the amount of data to be transmitted. That means that the traffic shaper cannot transmit the whole amount of data in one go. The token bucket limiting the traffic will be emptied and the rest of the packets will have to wait for the next token to be produced. That is what creates the steps in the graph. Every time the traffic shaper runs out of tokens, the packets have to wait for the next second to be transmitted. In between the steps it can be noticed that the duration of a request decreases for the same reason it did in the first graph.
Example 3
In this example a different scenario was chosen. In this case, the limiting factor is the transmission window size. The response is very different. The values for this example can be found in the table below.
TXW_MAX_RTE |
TXW_BYTES |
Packet Size |
Number of Packets |
100kBytes/sec |
50kBytes |
3kBytes |
50 |
Since the window size cannot accommodate the full amount of data to be transmitted, the application has to suspend transmission until the protocol advances the window, thus emptying some space. Then the transmission can resume for as long as there is space in the window. This behaviour can be seen in the graph as a big step. After that, the transmission can continue with the normal small steps because of the transmission rate, which is a bit higher, this time. That’s why the small steps are smoother this time.
5.7.2 PGM Receiver Performance Testing
This section presents the results of the performance testing from the receiver’s perspective. The way that this is done, is to count the time elapsed from the moment that pgm_input() was called up to the moment that the packet is delivered to the application. Although a PGM receiver may receive other packets as well, this measurement is focused only to ODATA/RDATA packets since only these types of packets are delivered up to the application layer. All other packets (NCFs, NAKs, SPMs) have only an internal, protocol specific use.
Moreover, a small application was created that opens a socket and sends fifty IP packets running on dcnds-PC1. On dcnds-PC2 the receiver was waiting for these fifty packets. For every single packet, the time is counted and stored in a structure. This experiment was done twice. The first time UDP was used as the underlying transport protocol and the second time PGM was used. This was done in order to make a comparison between these protocols. TCP was not used for the comparison, as it is not a multicast protocol. The table below shows the results using UDP.
According to the above diagram UDP delivers packet with an average delay of 224,9 m sec to the application layer. Not a big surprise since UDP is a light weight protocol with no extra functionality. PGM is supposed to be a lightweight protocol too. However one would expect it to be much slower than UDP since PGM has to keep a copy of every received packet in the right order, inside the receive window. In addition, a packet reception may cause other effects too such as receive window advance, where the receiver has to delete the last packets of the receive window. Also it must be taken into account that a receiver does not deliver packets out of sequence to the application. For instance, if the receiver loses a packet then it stores the next received packets but does not deliver them to the application layer until the sender retransmits the lost packet. So one would expect even longer delays in these kinds of scenarios. However in this case the receiver received all packets at the right order. The following diagram presents the results.
Not surprisingly, PGM receiver is slower than UDP. The reason for this is the extra functionality associated with a PGM receiver. The average delay for the receiver is 9170,367 m sec, which is slightly greater than that of UDP. The whole UDP code is 800 lines while just the PGM receiver code is more than 1500 lines! PGM after all is not as lightweight as it seems and definitely not optimised.
This final chapter provides an assessment of the overall project, in terms of achievements, state of the art, group organisation and performance. It also outlines the skills acquired by the group members during the project development.
The project was completed within the time allocated and the compulsory requirements of the specification were implemented successfully. The group members feel that their initial goals are met and all their objectives have been accomplished.
More specifically, the group:
Although the compulsory requirements of this project were implemented, there is still some more functionality that can be added. The group would have done this if there was more time available.
An extra feature that can be added easily is to configure the host to act as a router. That way it will be able to forward packets to other hosts, thus implementing the network element functionality described in the PGM draft. This can be achieved by:
Finally the performance of the protocol can be analysed for a range of network topologies and end system usage patterns. This will provide useful information as to how PGM behaves in different network topologies, and it could be also used to measure the performance as compared to other multicast protocols.
Academic skills
The project was inherently more difficult than any other project the group members had been involved in the past. Part of it required programming within the FreeBSD kernel, a task none of the members was familiar with.
Nevertheless, the project was completed successfully and the group members acquired a great amount of new skills and knowledge, and also improved the skills they already had.
Starting from the research and analysis, the group members gained knowledge on multicast protocols and especially on how the Pragmatic General Multicast protocol functions.
PGM is a new technology that solves a lot of the reliability problems that traditional multicast protocols have. Therefore knowing how PGM works in detail is an advantage, since it is believed that PGM will be used broadly in the near future.
The group also gained experience in using the Unified Modelling Language. UML was used in the analysis stage in order to create diagrams that would help the group members visualise the protocol behaviour. Although everyone had used UML before, it is agreed that using it for protocol analysis was significantly different and more difficult than using it for application design.
Kernel programming was another important issue. Experience was gained in learning the way that the kernel is structured and the way the code within the kernel interacts both with the application and the network. Furthermore, the group members had to study how the TCP and UDP code is structured within the kernel. This provided valuable insight as to how protocols are implemented.
Coding the PGM sender and receiver was the next step in the implementation procedure. This required advanced C programming and the group members acquired a lot of valuable experience. The coding procedure was more demanding than other stages of the implementation due to the fact that a lot of the already existing code, common to all protocols had to be used. This would sometimes make it more difficult to find errors in the code.
During the testing stage, the group used a complete methodology, which enabled them to perform tests at all levels. The methodology included conformance, functional and performance tests. This provided the group with valuable experience in terms of conducting well-organised and complete tests.
Non-Academic skills
Along with the very useful academic experience that the group members gained, there are also some other issues, such as teamwork, time management, and group organisation, which the group feel they have improved to a great extent.
Although everyone had taken part to group projects during their undergraduate degrees, the group agree that they have improved their communication and teamwork skills. Due to the fact that this was a very demanding project, the group improved their time and work management skills in order to produce the expected results within the deadline.
Additionally, since the project was divided into several stages, there were points where each person was working on a different thing. This fact enabled the group members to further develop the communication skills they already had, as each member would have to explain and inform the rest of the group about the completed tasks.
The design and implementation of the PGM host was a very challenging project, as the group members had no previous experience in the field of protocol and kernel programming.
The specification given to the group consisted only of the draft and some requirements-suggestions on how to structure their work. Furthermore, the amount of information regarding PGM, available on the Internet is very limited. This is due to the fact that PGM is a relatively new protocol. Therefore the group had to work from the specification only, and this was part of the requirements, as it would assist the goal of IETF standardisation.
That fact alone made this project an extremely demanding one.
The work completed was according to the specifications and a fully functional PGM host was implemented.
All the group members worked very hard and very professionally in every stage of the project, making sure that all tasks were completed within the stage-specific deadlines.
A suggested improvement in the PGM protocol is the following: the first network element (upstream) does not back off, so if it did, an improvement could be achieved as it would give time to a local host to re-transmit the data so that the packet loss could be resolved locally.
Further, only the first network element would back off otherwise a long delay could be added depending on the size of the tree from receiver until source. A suggestion is that this could be achieved by the source NLA as the net element is able to check if the address of the incoming packet is the same as the NAK originator.
The main aim of this project was to analyse, design and implement the PGM host for FreeBSD. The PGM host has been successfully implemented and the project aims have been met.
The final state of the project is as follows:
The group members believe they have met the project aims. They agree that this project, despite being very demanding, has been a great learning experience. The group members are sure that the experience and the skills acquired during this project will prove to be very useful in their future careers.
[1] Speakman, T., Farinacci, D., Lin, S., and Tweely, A. "PGM Reliable Transport Protocol Specification", INTERNET DRAFT Draft_speakman_pgm_spec_03, January 1999.
[2] Wright, G., Stevens, R. "TCP/IP Illustrated, The Implementation", Vol. 2, Addison Wesley 1994.
[3] Booch, G., Rumbaugh, J., Jacobson. I. (1999). "The Unified Modeling User Guide", Addison Wesley Logman Inc.
[4] Booch, G., Jacobson, I., Rumbaugh, J. (1997). The UML specification documents. Santa Clara, CA.: Rational Software Corp.
[5] Callahan, J., Montgomery, T, "An Approach to Verification and Validation of a Reliable Multicasting Protocol", NASA Software IV&V Facility, January 1996.
[6] http://web.jet.es/sola/inet98.html. "Scalability of Internet Multicast Protocols".
[7] http://www-mesh.cs.berkeley.edu/ns/ns-contributed.html. "UCB/LBNL Network Simulator: Contributed Code"
[8] http://www.east.isi.edu/rm/orlando-meeting.html. "The Reliable Multicast Research Group meeting in Orlando, Feb 1998"
[9] http://eewww.eng.ohio-tate.edu/drcl/grants/middleware97/framework.html. "Unified Framework for Achieving Multi-Dimensional QoS"
[10] http://w3.abdn.ac.uk/videomedical/papers/comparison_of_reliable_multicast.htm "Comparison of Reliable Multicast Protocols (RMP)"
[11] http://www.cisco.com/warp/public/146/february98/6.html. "Cisco Announces Support For Reliable Multicast Technology"
[12] http://www.mcclellanconsulting.com/whitepapers/pgm.html
McClellan Consulting, "Pragmatic General Multicast".
work plan
diagrams
code
minutes