Research notes for the design of the Dragonfly streaming scheduler.

Here, P2P streaming means that the data a peer downloads is passed straight on to its destination rather than being stored in the local file system. The data is kept in memory and sent directly to the target node.

This is necessary for P2P downloading when the peer server uses a cloud disk. In that case, writing the data to the peer server's file system is itself bounded by network I/O. If P2P streaming is adopted, that write step is eliminated entirely.

In this post, I will look into how to design the scheduler in the supernode/server that distributes pieces among peer servers.

# Is Random Scheduling Sufficient in P2P Video Streaming?

P2P streaming system can be broadly classified into two categories, namely tree-based and mesh-based. The tree-based systems have well-organized overlay structures and typically distribute data from a peer to its children peers. While the mesh-based P2P streaming system is not restricted to the static topology. The peering relationships are established/terminated based on the content availability and bandwidth availability on peers. Dragonfly is just based on the last type of systems.

### Baseline Scheduling: Random Pull Scheduling

This scheduling algorithm is similar to the current DF algorithm, summarized below:

• Distribute pieces evenly. Select the pieces with the fewest replicas in the entire P2P network, so that the distribution of each piece stays balanced and no piece becomes a scarce resource.
• Nearest-distance priority. For a peer, preferentially select the piece closest to the one currently being downloaded, so that the peer approximates sequential reads and writes, which improves file I/O efficiency.
• Local blacklist and global blacklist. An example makes this clear: when peer A fails to download from peer B, B goes onto A's local blacklist, and A's download tasks then filter B out; when the number of failed downloads from B reaches a certain threshold, B goes onto the global blacklist, and all download tasks filter B out.
• Self-isolation. After peer A fails to download from other peers multiple times, it will only download from the supernode and will also be added to the global blacklist, so A no longer interacts with any peer other than the supernode.
• Peer load balancing. This mechanism controls how many pieces each peer can upload and download simultaneously, as well as its priority as a target peer.
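The first two rules above can be combined into a single selection function. The sketch below is my own illustration, not DF code: among the pieces a peer still needs, pick the rarest one in the network, breaking ties by distance to the peer's current download position (the `pickPiece`, `replicaCount`, and `cursor` names are assumptions).

```go
package main

import "fmt"

// pickPiece chooses the next piece for a peer: among the pieces with the
// lowest replica count in the network (rarest first), prefer the one
// closest to the piece the peer just downloaded, which approximates
// sequential I/O. Returns -1 if every piece is already held.
func pickPiece(replicaCount map[int]int, have map[int]bool, cursor int) int {
	best, bestCount, bestDist := -1, 1<<31-1, 1<<31-1
	for piece, count := range replicaCount {
		if have[piece] {
			continue
		}
		dist := piece - cursor
		if dist < 0 {
			dist = -dist
		}
		if count < bestCount || (count == bestCount && dist < bestDist) {
			best, bestCount, bestDist = piece, count, dist
		}
	}
	return best
}

func main() {
	replicas := map[int]int{0: 3, 1: 1, 2: 1, 3: 2}
	have := map[int]bool{0: true}
	// Pieces 1 and 2 are equally rare; piece 2 is closer to cursor 3.
	fmt.Println(pickPiece(replicas, have, 3)) // prints 2
}
```

The blacklist rules would then act as a filter on the candidate set before this function runs.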

### Adaptive Queue-based Chunk Scheduling

This scheduling algorithm is less useful in the DF scenario.

A pull signal is sent from peer a to the server whenever its forwarding queue becomes empty (or falls below a threshold) (step 1 in Figure 1). The server responds to a pull signal by sending three data chunks back to peer a (step 2). These chunks are stored in the forwarding queue (step 3) and relayed to peer b and peer c (step 4). Once the server has responded to all pull signals in its pull-signal queue, it uploads one data chunk to all peers (step 5). That chunk is not stored in the peers' forwarding queues and is not relayed further.
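The server-side loop of this protocol can be sketched as follows. This is a minimal illustration of steps 1–5 under my own naming (`server`, `onPull`, `dispatch` are assumptions, not DF or paper APIs); real peers would receive chunks over the network rather than a log of strings.

```go
package main

import "fmt"

// server models the source in adaptive queue-based chunk scheduling.
type server struct {
	pullQueue []string // peers whose forwarding queue ran (nearly) empty
	next      int      // next chunk id to hand out
}

// onPull records a pull signal from a peer (step 1).
func (s *server) onPull(peer string) { s.pullQueue = append(s.pullQueue, peer) }

// dispatch answers each queued pull with three forwardable chunks
// (steps 2-4), then, with the pull queue drained, broadcasts one
// non-forwardable chunk to every peer (step 5).
func (s *server) dispatch(peers []string) []string {
	var log []string
	for _, p := range s.pullQueue {
		for i := 0; i < 3; i++ {
			log = append(log, fmt.Sprintf("forwardable chunk %d -> %s", s.next, p))
			s.next++
		}
	}
	s.pullQueue = nil
	for _, p := range peers {
		log = append(log, fmt.Sprintf("broadcast chunk %d -> %s", s.next, p))
	}
	s.next++
	return log
}

func main() {
	s := &server{}
	s.onPull("a")
	for _, line := range s.dispatch([]string{"a", "b", "c"}) {
		fmt.Println(line) // 3 forwardable chunks to a, then one broadcast to all
	}
}
```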

### Random Useful Packet Forwarding

This algorithm introduces a new concept: fresh data chunks.

The source server in RUPF always gives highest priority to the fresh data chunks that have not been sent out before.

There are also differences between RUPF and the DF architecture. The former is designed around peer choice: each peer has to choose a subset of peers to serve. A DF peer, by contrast, serves whatever requests come from other peers as scheduled by the supernode; the peer itself makes no choice. The concept of deprived peers can therefore be dropped in DF.
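The fresh-chunk priority rule is simple enough to show directly. The sketch below is my own illustration of the RUPF source-side rule, not code from the paper or from DF (`source`, `nextChunk`, and the fallback policy are assumptions): a chunk never sent to any peer always wins over one that has been sent before.

```go
package main

import "fmt"

// source models the RUPF source server's send decision.
type source struct {
	sent map[int]bool // chunk id -> already sent at least once
}

// nextChunk picks from the candidate chunks needed by peers:
// a fresh chunk (never sent before) gets highest priority;
// otherwise fall back to re-sending the first candidate.
func (s *source) nextChunk(candidates []int) int {
	for _, c := range candidates {
		if !s.sent[c] {
			s.sent[c] = true
			return c // fresh chunk wins
		}
	}
	if len(candidates) == 0 {
		return -1
	}
	return candidates[0] // nothing fresh: re-send an old chunk
}

func main() {
	src := &source{sent: map[int]bool{0: true, 1: true}}
	fmt.Println(src.nextChunk([]int{0, 1, 2})) // prints 2: the only fresh chunk
	fmt.Println(src.nextChunk([]int{0, 1, 2})) // prints 0: all sent, fall back
}
```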
