Project Information
1. Project Name
Dragonfly - an intelligent P2P-based image and file distribution system.
2. Project Task Description
Currently, the Dragonfly client reads and writes the disk multiple times during the downloading process.
When directly using `dfget` to download a file:

- `dfget` randomly writes each piece to disk after downloading it
- `dfget server` randomly reads pieces from disk to share them
- `dfget` sequentially reads the whole file from disk after downloading to verify its checksum

And when using `dfdaemon` to pull images, Dragonfly performs extra disk IO:

- `dfdaemon` sequentially reads the file from disk to send it to `dockerd`
This is not a problem when the host has a local disk. But it becomes a potential bottleneck when the Dragonfly client runs on a virtual machine backed by a cloud disk: all disk IO turns into network IO, which performs poorly when reads and writes happen at the same time.
So a solution is needed to reduce the number of IO operations generated by Dragonfly.
3. Implementation Plan
As proposed in Issue #1164, P2P streaming was put forward to address the problem discussed above, and my work in ASoC 2020 is to implement part of the streaming download method. The task can be summarized into four aspects: the supernode scheduler, the dfget downloader, the dfget uploader, and the IPC between the user and the stream-mode dfget.
3.1 Supernode Scheduler
The work in the supernode can be categorized into two aspects:
Maintain the sliding window in the supernode.

- Initialize: The window size is registered during the dfget task registration phase. The window state is then recorded in the ProgressManager as a SyncMap indexed by ClientID. The window size is static once it is recorded by the supernode.
- Update: The dfget client reports the download result of every piece to the supernode. In the `report` API of the supernode server, the window is updated based on the received piece number and piece status. Specifically, the window keeps sliding forward until it reaches the first unacknowledged piece. Note that the supernode uses a sender window, which covers the range `[una, una + wnd)`.
- Finish: The `DeleteCID` method of the ProgressManager is updated to delete the sliding-window state when the task is in stream mode.
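The window bookkeeping described above can be sketched as follows. This is a minimal illustration, not the actual ProgressManager code: the `slidingWindow` struct, its fields, and the `ack`/`inWindow` methods are all hypothetical names.

```go
package main

import "fmt"

// slidingWindow is a minimal sketch of the per-client sender window
// (indexed by ClientID in the real ProgressManager). Names are illustrative.
type slidingWindow struct {
	una   int          // lowest unacknowledged piece number
	wnd   int          // fixed window size, set at task registration
	acked map[int]bool // pieces reported as successfully downloaded
}

// inWindow reports whether a piece may be scheduled: [una, una+wnd).
func (w *slidingWindow) inWindow(piece int) bool {
	return piece >= w.una && piece < w.una+w.wnd
}

// ack records a successful piece report and slides the window forward
// until it hits the first still-unacknowledged piece.
func (w *slidingWindow) ack(piece int) {
	w.acked[piece] = true
	for w.acked[w.una] {
		delete(w.acked, w.una)
		w.una++
	}
}

func main() {
	w := &slidingWindow{una: 0, wnd: 4, acked: map[int]bool{}}
	w.ack(1) // out-of-order ack: the window cannot slide yet
	fmt.Println(w.una, w.inWindow(4)) // 0 false
	w.ack(0) // now pieces 0 and 1 are both acknowledged
	fmt.Println(w.una, w.inWindow(4)) // 2 true
}
```

Sliding only up to the first unacknowledged piece keeps the window contiguous, so a single slow piece holds back the scheduling horizon until it is reported.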
Schedule the pieces according to the window state.

The only modification I made in this part lies in the `GetPieceProgressByCID` method of the ProgressManager. In regular mode, the available pieces are the successfully downloaded pieces that are not currently running; in stream mode, the available pieces are the cached pieces that are not running. After the modification, the ProgressManager only returns the available pieces inside the window when stream mode is on.

A new piece status has been introduced: UNCACHED. When a piece is downloaded successfully, it is assumed to be stored into the cache immediately. Afterwards, the piece may be popped out of the cache; in that case, the supernode server handler `deletePieceCache` at `supernode/server/router.go:101` is called to change the state of the piece.

Note that the scheduler in stream mode currently reuses the regular-mode scheduler, which may demand future optimization.
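The stream-mode availability filter can be sketched like this; it is a simplified model, assuming a status map per piece and the hypothetical names `pieceStatus` and `availableStream` (the real check lives inside `GetPieceProgressByCID`).

```go
package main

import "fmt"

// pieceStatus is a simplified piece state; UNCACHED is the new status
// introduced for stream mode (a piece evicted from the uploader cache).
type pieceStatus int

const (
	statusRunning pieceStatus = iota // being transferred (also the zero value)
	statusSuccess                    // downloaded: available in regular mode
	statusCached                     // held in the uploader cache: available in stream mode
	statusUncached                   // evicted from the cache: no longer schedulable
)

// availableStream mimics the stream-mode filter described above: only
// cached, non-running pieces inside the sender window [una, una+wnd)
// are returned to the scheduler.
func availableStream(status map[int]pieceStatus, una, wnd int) []int {
	var pieces []int
	for p := una; p < una+wnd; p++ {
		if status[p] == statusCached {
			pieces = append(pieces, p)
		}
	}
	return pieces
}

func main() {
	status := map[int]pieceStatus{
		0: statusCached, 1: statusUncached, 2: statusCached, 3: statusRunning,
		5: statusCached, // outside the window, must not be scheduled
	}
	fmt.Println(availableStream(status, 0, 4)) // [0 2]
}
```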
3.2 DFGET Downloader
Since the client stream writer has been implemented under the p2p_downloader
folder, which means that the pieces downloading process has been finished. My work here is to fullfill the task of handling the successfully downloaded pieces to uploader. Besides that, in the perparation phase at dfget/core/core.go:171
, the registration of stream task is necessary for uploader. Here are all the modifications:
- `dfget/core/core.go:210`: The `doDownload` method is refactored. A new interface, `DownloadTimeoutTask`, is defined. The interface has two implementations: the regular downloader and the stream downloader.
- `dfget/core/downloader/downloader.go:79`: The `Start()` method of the `StreamDownloadTimeoutTask` struct calls the `RunStream` method of the downloader, and then passes the stream reader to the `startWriter` method.
- `dfget/core/downloader/downloader_util.go:55`: The `startWriter` method fetches pieces from the stream reader concurrently, and then delivers them to the uploader in order. The logic of `startWriter` is adapted from the stream downloading method in the CDN manager at `supernode/daemon/mgr/cdn/super_writer.go:59`.

Unit tests have been added.
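The reorder-then-deliver behavior of `startWriter` can be sketched as below. This is a minimal model under assumed names (`piece`, the channel-based `startWriter` signature): pieces finish downloading out of order, but must reach the uploader strictly in piece order.

```go
package main

import (
	"fmt"
	"sync"
)

// piece is a simplified downloaded piece; the real code carries richer
// metadata. All names here are illustrative, not the actual Dragonfly API.
type piece struct {
	num  int
	data []byte
}

// startWriter drains pieces that arrive out of order from concurrent
// downloads and delivers them to the uploader strictly in piece order.
func startWriter(in <-chan piece, deliver func(piece)) {
	next := 0
	pending := map[int]piece{} // pieces that arrived ahead of `next`
	for p := range in {
		pending[p.num] = p
		// Flush every consecutive piece starting from `next`.
		for q, ok := pending[next]; ok; q, ok = pending[next] {
			deliver(q)
			delete(pending, next)
			next++
		}
	}
}

func main() {
	in := make(chan piece)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // downloader side: pieces finish out of order
		defer wg.Done()
		for _, n := range []int{2, 0, 1} {
			in <- piece{num: n, data: []byte{byte(n)}}
		}
		close(in)
	}()
	startWriter(in, func(p piece) { fmt.Print(p.num, " ") }) // 0 1 2
	wg.Wait()
	fmt.Println()
}
```

Buffering out-of-order arrivals in a map keeps the downloader goroutines independent while the delivery side stays sequential, which is what the uploader's FIFO cache expects.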
3.3 DFGET Uploader
The previous uploader was solely used for the regular downloading method. For the stream uploader, I have added new APIs. Apart from that, a cache manager has been added to manage the cached pieces.
- `dfget/core/api/supernode_api.go:359`: A new piece-status-updating API has been implemented. It is used to change the piece status from SUCCESSFUL to UNCACHED.
- `dfget/core/api/uploader_api.go:108`: The uploader API `RegisterStreamTask` is added to register the stream task with the cache manager. The corresponding entry in the cache manager is initialized after the call.
- `dfget/core/api/uploader_api.go:127`: The uploader API `DeliverPieceToUploader` is used to hand a piece from the downloader to the uploader.
- `dfget/core/uploader/cache.go:60`: A FIFO cache manager has been created to control the piece caches. It stores and pops pieces in FIFO order. Note that the cache manager uses a receiver window, which covers the range `[start, una)`.

Unit tests have been added.
TODO: GC for cache manager
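A minimal sketch of the FIFO cache manager described above, under assumed names (`fifoCache`, `onEvict`): it keeps pieces in arrival order and evicts the oldest when capacity is exceeded. The real manager additionally reports evictions to the supernode so the piece status flips from SUCCESSFUL to UNCACHED; here that is modeled as a callback.

```go
package main

import "fmt"

// fifoCache is an illustrative FIFO piece cache: pieces are stored in
// insertion order and the oldest piece is evicted once capacity is hit.
type fifoCache struct {
	capacity int
	order    []int // piece numbers in insertion (FIFO) order
	pieces   map[int][]byte
	onEvict  func(num int) // e.g. report the piece as UNCACHED to the supernode
}

// put stores a piece and evicts the oldest one if over capacity.
func (c *fifoCache) put(num int, data []byte) {
	c.order = append(c.order, num)
	c.pieces[num] = data
	if len(c.order) > c.capacity {
		oldest := c.order[0]
		c.order = c.order[1:]
		delete(c.pieces, oldest)
		if c.onEvict != nil {
			c.onEvict(oldest)
		}
	}
}

// get returns a cached piece, or false if it has been evicted.
func (c *fifoCache) get(num int) ([]byte, bool) {
	d, ok := c.pieces[num]
	return d, ok
}

func main() {
	c := &fifoCache{capacity: 2, pieces: map[int][]byte{},
		onEvict: func(n int) { fmt.Println("evict piece", n) }}
	c.put(0, []byte("a"))
	c.put(1, []byte("b"))
	c.put(2, []byte("c")) // prints: evict piece 0
	_, ok := c.get(0)
	fmt.Println(ok) // false
}
```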
3.4 IPC Between DFGET and User
This part demands further discussion. Currently, I propose that dfget directly writes the content of the pieces to stdout; the user who invokes the dfget command can then redirect its stdout to the write side of a pipe via the widely supported `popen` mechanism.
After that, the user can read the successfully downloaded content directly from the pipe. Since this method uses an unnamed pipe, the whole process does not touch the file system.
4. Milestone Review
| Date | Milestone |
|---|---|
| 07/07 - 07/14 | Community Bonding & Source Code Reading (Issue 1403) |
| 07/15 - 07/27 | Research on the Supernode Scheduler for Stream Downloading. Commit: Support Stream Mode in Supernode |
| 07/28 - 08/10 | Discuss the Stream Implementation with a Mentor from Ant. Commit: Implement Work Flow of Stream Mode in DFGET |
| 08/11 - 08/26 | Research on the IPC between the User and Stream DFGET. Commit: Add Unit Test and Fix Bugs for Integration Test |
Here are the changes in lines of code after the commits: (+2033, -77).
Project Summary
1. Project Deliverables
1.1 Detailed Design
The Overall Proposal: https://github.com/dragonflyoss/Dragonfly/issues/1436#issue-659970568
Design of Supernode Scheduler: https://github.com/dragonflyoss/Dragonfly/pull/1447#issue-459701444
Design of DFGET downloader, uploader and IPC between user and DFGET: https://github.com/dragonflyoss/Dragonfly/pull/1447#issuecomment-681619384
1.2 Source Code
Github Commits: https://github.com/dragonflyoss/Dragonfly/pull/1447/commits
1.3 Test Document
The tests are implemented in the test files listed below. The annotations inside the source files should help developers run them.
cmd/dfget/app/root_test.go: https://github.com/dragonflyoss/Dragonfly/pull/1447/files#diff-14e54b172ea500c227d81045b4b132b4R139
dfget/core/downloader/downloader_util_test.go: https://github.com/dragonflyoss/Dragonfly/pull/1447/files#diff-9b3047c6073bec2c5c3a02c87cfce3e2R1
dfget/core/uploader/cache_test.go: https://github.com/dragonflyoss/Dragonfly/pull/1447/files#diff-61f496afc10516997e19f5ac636278f8R1
2. Project Highlights
The highlight of my work is that I have implemented almost all of the downloading in stream mode. The total number of modified code lines is (2033 + 77) = 2110. I believe that stream mode will improve the efficiency of Dragonfly's downloading process when a remote file system is used.
What's more, this will be one of the features that distinguishes Dragonfly from traditional P2P systems.
3. Experience
The whole ASoC experience has benefited me a lot. First of all, I did a great deal of research during the application phase, which made me familiar with traditional distributed systems. Then, during the development phase, I learned system design, concurrent programming in Go, the state-of-the-art architecture of Dragonfly, and how to collaborate with mentors at Alibaba.
The mentors are really good at what they do: through a simple discussion, they can always quickly find the weaknesses in my design and point out the right direction for the next step.
In a word, I feel lucky to have been chosen to be part of the 2020 ASoC program.