Core Based Trees (CBT)

https://people.eecs.berkeley.edu/~sylvia/cs268-2019/papers/cbt.pdf

Paper read (background: 1993)

  • Problem:

    • Scalability is the main challenge in forming the delivery tree of one-to-many wide-area communication

      • I.e. too much memory, bandwidth, or too many processing resources

    • Dependent of the underlying unicast routing algorithm

      • Q: how and why??

  • Goal: design a lost-cost, relatively simple, and efficient multicast protocol

  • Killer application: audio and video conferencing, replicated database updating and quering, software update distribution, stock market information services, and resource discovery

  • Existing approach

    • Multicasting to many different types of networks and environments

      • Shortest-path source-based delivery tree between senders and recipients

      • Tight coupling tree algorithms with particular unicast algorithms so at boundary, ad hoc means are used when different multicast algorithms interfere

    • Properties

      • Host Group Model conformance

        • E.g. anonymous group members, sender needs not to be a member

      • High probability of delivery

        • Successful delivery rate should remain high enough to allow for recovery of lost packets by E2E protocols

      • Low delay

        • Example application: video conferencing

        • LANs: low delay; but WAN: higher due to greater geographic extent --> need to optimize multicast routes

    • Need to extend the properties

      • Scalability

        • Large-increase in wide-area multicasts; scaling properties across a full range of applications

        • Q: scalability in terms of what? # of group members? different applications?

      • Robustness

        • Maintaining and repairing connectivity between group members

      • Information hiding

        • Routers / bridges should not have to know any information as to the existence of that group evne though they are forwarding multicast packets

      • Routing Algorithm independence

        • Resulting in much simplified multicast tree formation across domain boundaries

      • Multicast tree flexibility

        • applications vary in sender / receiver population, traffic characteristics, and membership dynamics

    • Existing Algorithms

      • Three parts: environment, DVMRP, link-state multicast

      • Env

        • Broadcast LANs (e.g. Ethernet, FDDI, ATM) intrinsically support multicast addressing

          • IP: class D addressing (an address taken from a portion of the IP address space that is used for uniquely identifying a host group

        • Routers: promiscuously receive all multicast packets; polls the LAN for host memberships at intervals

          • Single router: membership interrogator / designated router

          • Implementation of poll: implemented by Internet Group Management Protocol (IGMP)

        • Host: only does so under application request

      • DVMRP

        • Send and prune (i.e. information about absence of group members propagates back up the tree towards the source) and graft (i.e. rejoin the sending router to the multicast tree)

        • But

          • Need modest amount of processing power, overhead determined by the stability of dynamictiy of the internet

      • Link-state Multicast Routing

        • Router includes the list of groups that have members on that link

        • Any router can compute the shortest-path tree from any source to any group using Dijkstra

        • But

          • Link-states packets flooded when topology or membership changes

          • Overhead of global group membership maintained by all routers (regardless of whether they form part of the trees or not)

      • Critique on existing approach

        • Scalability

          • S x N: number of active sources per multicast group x number of multicast groups present

          • Flood and prune: routers not interested in sending / receiving multicast packets are involved in reception, generation, and interpretation of prune and graft messages and additional storage of prunes per (src, group) pair

            • I.e. off-tree routers to keep per-source state

        • Underlying unicast-dependency

          • Tight coupling

            • Requires modifying the underlying unicast algorithm to take multicast into consideration

            • Requires specialized solutions for multicasting between domains running different multicast algorithms

    • CBT - the new architecture overview

      • Single node (i.e. router) to be the core of the tree

      • Branches have other routers (i.e. non-core routers)

        • form a shortest path between a member-host's directly attached router and the core

      • Leaf router: a router at the end of a branch

      • Pros

        • Scaling: S x N --> N

          • One multicast tree per group v.s One tree per (source, group) pair

        • Tree creation

          • Receiver-based

            • No router is invovled in becoming part of a tree for a particular group unless the router is intent on or is becoming the member of that group

          • Only one tree is every created per group

        • Unicast routing separation

          • All multicast tree information can be delivered solely from a router's existing unicast forwarding tables with no addtional processing necessary

      • Two routing phases

        • Unicast routing is used to route multicast packets to a multicast tree, allowing multicast groups and packets to remain "invisible" to routers not on the tree

        • Once on the corresponding tree, multicast packets span the tree based on the packet's group identifier (i.e. group-id)

      • Require no partition of the unicast address space

CBT protocol

  • Fields

    • Group

      • One or more core addresses: normal unicast addresses of the core routers

        • Use: get packets to the tree

      • A group identifier

        • Use: multicast based on this identifier

        • Group-id: flat, 32-bit identifier chosen independently by each core router

      • A group name

    • Cores and core placement

      • A group: core list, each router in the list is assigned a priority or ranking by the group initiator

      • Highest rank: primary core

      • Placement

        • No research on fidning the centre of a dynamic multicast spanning tree

        • But should positively reflect that group's characteristics

    • Forwarding algorithm (packet travel in IP datagrams)

      • Control packet

        • Tree building, re-configuration, tree tear-down

        • Processed at each hop

      • Data packet

        • Carry the group core address in "destination field", group id in the "option field"

        • Once arrived at the on-tree router: core address is discarded, group-id is placed in the destination address field

          • Allows faster tree switching

          • CBT multicast packets are prevented being forwarded (unicast) by these routers that is non-CBT

      • Can think of it as an overlay of the underlying unicast routing algorithm

      • JOIN-REQUEST (JOIN-ACK)

        • Unicast until it reaches the addressed core or reaches a CBT-capable router that is already part of the tree (i.e. identified by the group-id)

      • QUIT-REQUEST

      • Failure

        • Either 1) JOIN-REQUEST by itself to the highest-priority reachable core or 2) Send a FLUSH-FREE message downstream to allow each router independently re-attach itself to the tree, potentially via a better route than previously

        • Single-core CBT trees

          • Path / cores can fail

          • Use multiple backup cores

          • Some proposal to have robust single-core tree

            • How non-primary try join the primary core, and they can act as the targets of CBT control packets

        • Multi-core CBT trees

          • Useful for groups that are topologically widespread

          • Need to have explicit protocol operating between the cores

      • Loop detection

        • Explicit loop detection

          • "active" bit: indicates that the join is in the process of reaching a tree router

          • "rejoin" bit: set if the originating router is attempting to re-join the tree subsequent to that router's parent or parent-link having failed

          • Loop-detector with these two bits

          • QUIT-REQUEST: when the router that originated the active join receive the corresponding inactive join (?)

  • Why scalable?

    • Routers not on a multicast tree need know nothing about the existence of groups of which they are not a member

      • therefore can route packet from non-member senders to the tree as per unicast

    • Routers on a multicast tree need only know their intermediate up-tree and down-tree neighbor routers (i.e. minimal amount of info w/ respect to a group)

  • Disadvantages of CBT

    • Core placement and shortest-path trees

      • May not provide the most optimal paths between members of a group. This is especially true for small, localized groups that have a non-local core.

        • Quotes: "manual "best-guess" placement will be acceptable" ??

      • The core as a point of failure

        • --> multiple cores, with added complexity

    • Looking at other materials

      • I.e. SM-PIM over CBT was that discovering who the senders are could be separated from building efficient trees from those senders to receivers

Future work

  • Testing: shape of CBT tree affect performance / delay characteristics? Per-node wtiching overheads? how quickly can CBT adapt to failures?

  • Dynamic core selection and placement

  • Dynamic group id assignment: failures to address this means the group id conflicts are more likely to occur

    • Why specifically? any example?

  • Scoping

    • When applications operate in the global MBone, it is clear that not all groups should have global scope. This is especially the case for performance reasons with flood and prune multicast routing protocols, but it also the case with other routing protocols for application security reasons and because multicast addresses are a scarce resource. Being able to constrain the scope of a session allows the same multicast address to be in use at more than one place so long as the scope of the sessions does not overlap.

      Multicast scoping can currently be performed in two ways which are known as TTL scoping and administrative scoping. Currently TTL scoping is most widely used, with only a very few sites making use of administrative scoping.

  • Policy routing and multicasting

    • Right now evolve independently

    • But if internet-wise AD policies are going to be dynamic and wide-ranging, group membership could be severly constrained resluting in lack of openness

  • CBT security architecture

    • Auth, encryption

Question oriented

What problem does this paper address?

  • The most important problem this paper aims to address is the scalability of multicast architecture with respect to Internet of very large size (e.g. large number of sources / senders and groups)

  • It also tries to

    • 1) reduce the high cost of current multicast architecture in terms of computational overhead and storage requirements

    • 2) provide flexibility: simplify the design on top of heterogeneous domains with different unicast routing mechanisms

      • via decoupling multicast tree-building with a particular unicast algorithm to simplify designs

Do you believe the problem is/was important? Explain.

  • Yes.

  • The paper points out two important trends of the Internet: 1) ever-increasing size (i.e. large-number of internetwork-wide multicast applications) 2) ever-increasing heterogeneity and complexity (e.g. Internet with ASs running internal routing protocols of their choice).

  • The centre problem on how to scale multicast architecture thus became important on top of this fast-growing, enormously complex heterogeneous structure.

What is the authors' main insight?

  • Existing algorithms back then (e.g. DVMRP, Link-State Multicast Routing Protocol) were

    • 1) building source-based delivery trees for each group, requiring routers to store per source information per group

    • 2) flooding either data or group membership information to every router in the internetwork, even to routers that are off-tree

    • 3) coupling multicast and unicast algorithms

    • These mechanisms do not scale with larger number of sources and groups and more heterogeneous networks in the Internet.

  • In contrast, the authors proposed to construct a single core (center)-based delivery tree per group, regardless of the source. The core is made up of individual router, sometimes a set of routers in the internetwork. The core addresses are unicast addresses of these core routers. When a host wishes to join a group, it queries for both the group-id and core address at the local router and sends JOIN-REQUEST towards the core. The join continues until either it reaches the core or a CBT-capable router that is already part of the tree in the group. Then the receiving router sends back an acknowledgement (JOIN-ACK) to the original router. During this process, the tree branch is created, and the intermediate routers traversed by the ACK will record its parent and child interfaces. When sending data, unicast routing is used to route multicast packet to this multicast tree; once on the tree, these packets span the tree based on the group-id.

Do you think the solution is a good one? Explain.

  • I think this solution is good it outlines a mechanism that

    • 1) improves scaling factor from S x N to N where S is the # of sources and N is the # of groups. Each router only needs to store state information per group.

    • 2) decouples multicast tree-building with underlying unicast routing: simplify the design and development process, reduce additional processing overhead

  • Other comments

    • Core placement is critical to performance; it determines the efficiency of routes between sender and group members. The paper left this as an opening question and mentioned only some trivial heuristics.

    • Sometimes whether the solution is good or not also depends on the sparsity of the underlying multicast groups. For groups where recipients are densely distributed everywhere, the flood and prune approach might work even better in terms of performance compared to CBT, which provides no guarantee on optimal shortest path from each receiver to sender and can result in inefficient data paths.

    • Sharing a single tree per group can concentrate load in a specific set of edges / links that belong to this tree. The source-based approach might do a better job in terms of path diversity and load distribution. For multicast applications like audio conferencing that are delay-sensitive this might be a problem.

    • To understand the quality of this solution, we need to see a real implementation and analysis on top of an actual system, with varying workload scenarios.

Any other comments/thoughts?

  • The paper compares single-core v.s multi-core CBT trees when it mentions the system robustness upon core failure. Another case where single-core may be disadvantageous is that under the CBT design, most of the traffic load can happen around this single core router and its neighbors. Load balancing is also important to minimize delay exacerbation.

  • The paper talks about "multicast tree flexibility" as one of its proposed properties. It points out that multicast applications vary according to sender and receiver population, traffic characteristics, and membership dynamics, and the delivery tree should be built to reflect this nature. But later on this property along with the tree-building strategies on different applications are not mentioned again. Maybe this relies on prototype evaluations.

  • How will this architecture behave when the stability or dynamicity of the Internet changes quickly? Will the network be flooded with JOIN-REQUEST and JOIN-ACK?

Did you enjoy reading this paper?

  • Yes. The motivation(e.g. trends of Internet, inefficiencies of existing works) and how it leads to the design of CBT architecture are clearly stated. I also like the future work session entailing discussions on dynamic core selection and placement and policy routing. This approach seemed to be the earliest work proposing center-based multicast routing. It is intuitive and simple, and is innovative in the sense that it raises awareness on the current trends and provides a completely different way to structure the architecture according to it (i.e. core-based over source-based).

Question

  • Why not going to application-layer multicast but go to IP layer multicast?

  • No discussions on the performance (no evals either, just a proposal)

  • S x N: how many sources is normal?

  • Can each single router has multiple group ids? yes?

  • LAN v.s WAN is interesting; why is fault tolerance not considered carefully in our setup?

  • Core router traffic bottleneck? Load balancing? (it mentions about multi-core for robustness)

  • Sparsity of the network (i.e. how many members in the group) and how does that affect the result?

Last updated