Summary of ASoC 2020

Project Information

1. Project Name

Dragonfly - an intelligent P2P-based image and file distribution system.

2. Project Task Description

Currently, the Dragonfly client performs multiple random disk reads and writes during the download process.
When dfget is used directly to download a file:

  • dfget randomly writes each piece to disk after downloading it
  • the dfget server randomly reads pieces from disk to share them
  • dfget sequentially reads the file from disk after the download to compute its checksum

When dfdaemon is used to pull images, Dragonfly incurs extra disk IO:

  • dfdaemon sequentially reads the file from disk to send it to dockerd

This is not a problem when the host has a local disk. But it becomes a potential
bottleneck when the Dragonfly client runs on a virtual machine with a cloud disk: all disk IO then turns into network IO, which performs poorly when reads and writes happen at the same time.

So a solution is needed to reduce the number of IO operations generated by Dragonfly.

3. Implementation Plan

As proposed in Issue #1164, P2P streaming was suggested to address the problem discussed above. My work in ASoC 2020 was to implement part of the streaming download method. The task can be summarized into four aspects: the supernode scheduler, the dfget downloader, the dfget uploader, and the IPC between the user and the stream-mode dfget.

3.1 Supernode Scheduler

The work in the supernode can be categorized into two parts:

  • Maintain the sliding window in supernode.

    • Initialize: The window size is registered during the dfget task registration phase. The window state is then recorded in the ProgressManager as a SyncMap indexed by ClientID. The window size is static once it is recorded by the supernode.
    • Update: The dfget client reports the result of each piece download to the supernode. In the supernode server's report API, the window is updated based on the received piece number and piece status. Specifically, the window keeps sliding forward until it reaches the first unacknowledged piece. Note that the supernode uses a sender window, which covers the range [una, una + wnd).
    • Finish: The DeleteCID method of ProgressManager is updated to delete the sliding window state when the task is in stream mode.
  • Schedule the pieces according to the window state

    The only modification I made in this part is in the GetPieceProgressByCID method of ProgressManager. In regular mode, the available pieces are the successful pieces that are not running; in stream mode, they are the cached pieces that are not running. After the modification, the ProgressManager only returns the available pieces inside the window when stream mode is on.

    A new piece status, UNCACHED, is introduced. When a piece is downloaded successfully, it is assumed to be stored into the cache immediately. Afterwards, the piece may be popped out of the cache; in that case, the supernode server handler deletePieceCache at supernode/server/router.go:101 is called to change the state of the piece.

    It should be noted that stream mode currently reuses the regular-mode scheduler, which may demand future optimization.

3.2 DFGET Downloader

The client stream writer had already been implemented under the p2p_downloader folder, meaning the piece downloading process was finished. My work here was to hand the successfully downloaded pieces over to the uploader. Besides that, in the preparation phase at dfget/core/core.go:171, the stream task must be registered with the uploader. Here are all the modifications:

  • dfget/core/core.go:210: The doDownload method is refactored. A new interface, DownloadTimeoutTask, is defined with two implementations: the regular downloader and the stream downloader.

  • dfget/core/downloader/downloader.go:79: The Start() method of the StreamDownloadTimeoutTask struct calls the RunStream method of the downloader and then passes the stream reader to the startWriter method.

  • dfget/core/downloader/downloader_util.go:55: The startWriter method fetches pieces from the stream reader concurrently and then delivers them to the uploader in order. Its logic is adapted from the stream downloading method in the CDN manager at supernode/daemon/mgr/cdn/super_writer.go:59.

    Unit tests have been added.

3.3 DFGET Uploader

The previous uploader was used solely for the regular downloading method. For the stream uploader, I added new APIs; in addition, a cache manager was added to manage the cached pieces.

  • dfget/core/api/supernode_api.go:359: A new piece-status-updating API has been implemented. It is used to change a piece's status from SUCCESSFUL to UNCACHED.

  • dfget/core/api/uploader_api.go:108: The uploader API RegisterStreamTask is added to register the stream task with the cache manager. The corresponding entry in the cache manager is initialized after the call.

  • dfget/core/api/uploader_api.go:127: The uploader API DeliverPieceToUploader is used to hand pieces from the downloader to the uploader.

  • dfget/core/uploader/cache.go:60: A FIFO cache manager is created to control the piece caches. It stores and pops pieces in FIFO order. Note that the cache manager uses a receiver window, which covers the range [start, una).

    Unit tests have been added.

  • TODO: GC for cache manager

3.4 IPC Between DFGET and User

This part demands further discussion. Currently, I propose that dfget write the content of the pieces directly to stdout; the user who invokes the dfget command can then redirect its stdout to the write side of a pipe using the widely supported popen mechanism.

After that, the user can read the successfully downloaded content directly from the pipe. Since this method uses an anonymous pipe, the whole process does not touch the file system.

4. Milestone Review

Date            Milestone
07/07 - 07/14   Community Bonding & Source Code Reading (Issue 1403)
07/15 - 07/27   Research on the Supernode Scheduler for Stream Downloading (Commit: Support Stream Mode in Supernode)
07/28 - 08/10   Discussion of the Stream Implementation with a Mentor from Ant (Commit: Implement Work Flow of Stream Mode in DFGET)
08/11 - 08/26   Research on the IPC Between the User and Stream DFGET (Commit: Add Unit Test and Fix Bugs for Integration Test)

Here is the change in code lines across the commits: (+2033, -77).

Project Summary

1. Project Deliverables

1.1 Detailed Design

The Overall Proposal: https://github.com/dragonflyoss/Dragonfly/issues/1436#issue-659970568

Design of Supernode Scheduler: https://github.com/dragonflyoss/Dragonfly/pull/1447#issue-459701444

Design of DFGET downloader, uploader and IPC between user and DFGET: https://github.com/dragonflyoss/Dragonfly/pull/1447#issuecomment-681619384

1.2 Source Code

Github Commits: https://github.com/dragonflyoss/Dragonfly/pull/1447/commits

1.3 Test Document

The tests are implemented in the test files listed below. The comments inside the source files should help developers run them.

cmd/dfget/app/root_test.go: https://github.com/dragonflyoss/Dragonfly/pull/1447/files#diff-14e54b172ea500c227d81045b4b132b4R139

dfget/core/downloader/downloader_util_test.go: https://github.com/dragonflyoss/Dragonfly/pull/1447/files#diff-9b3047c6073bec2c5c3a02c87cfce3e2R1

dfget/core/uploader/cache_test.go: https://github.com/dragonflyoss/Dragonfly/pull/1447/files#diff-61f496afc10516997e19f5ac636278f8R1

2. Project Highlights

The highlight of my work is that I implemented almost all of the downloading in stream mode. The total number of modified code lines is 2033 + 77 = 2110. I believe that stream mode will significantly improve the efficiency of Dragonfly's download process when the client runs on a remote file system.

Moreover, this will be one of the features that distinguish Dragonfly from traditional P2P systems.

3. Experience

The whole ASoC experience has benefited me a lot. First of all, I did a great deal of research during the application phase, which made me familiar with traditional distributed systems. Then, during the development phase, I learned system design, concurrent programming in Go, the state-of-the-art architecture of Dragonfly, and gained experience working in a group with mentors at Alibaba.

The mentors are really good at what they do: even in a brief discussion, they could always quickly find the weaknesses of my design and point out the correct direction for the next move.

In a word, I am very lucky to have been chosen to be part of the 2020 ASoC program.
