Multithreaded Iterative Dir Tree Scan


Barak Sason Rofman
Hello everyone!
Following a discussion I had with @Susant Palai some time ago, we decided to look into improving the rebalance process in the DHT layer by changing its underlying crawl mechanism. Currently, dir-tree crawling is done recursively by a single thread, which is likely slow and also risks a stack overflow. An iterative, multithreaded solution might improve both performance and stability (by eliminating the stack-overflow risk). I have prepared a POC doc on the matter, including a sample implementation of the iterative multithreaded solution. The doc can be found at:
https://docs.google.com/document/d/1JCl0T9zeagOcFFpgVQF8zNyhlR54VqkNAZ7TJb42egE/edit
Apart from the rebalance process, could this approach also be useful for other use cases where a dir-tree crawl is performed? Any comments on the concept, the design of the solution, and the implementation are welcome.
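
To make the idea concrete, here is a minimal Python sketch of an iterative, multithreaded crawl. It is not the implementation from the POC doc and not GlusterFS code: the function name `scan_tree`, the `num_workers` parameter, and the shared work-queue design are assumptions made purely for illustration. Directories waiting to be read sit in a queue instead of on the call stack, so the depth of the tree no longer affects stack usage, and several workers can read different directories at the same time.

```python
import os
import queue
import threading


def scan_tree(root, num_workers=4):
    """Iteratively scan a directory tree with a pool of worker threads.

    Hypothetical helper for illustration only. Returns a sorted list of
    the regular-file paths found under root.
    """
    pending = queue.Queue()          # directories waiting to be read
    files = []                       # collected results, guarded by files_lock
    files_lock = threading.Lock()

    def worker():
        while True:
            directory = pending.get()
            if directory is None:    # sentinel: no more work for this thread
                pending.task_done()
                return
            try:
                found = []
                with os.scandir(directory) as entries:
                    for entry in entries:
                        if entry.is_dir(follow_symlinks=False):
                            # Queue the subdirectory instead of recursing.
                            pending.put(entry.path)
                        elif entry.is_file(follow_symlinks=False):
                            found.append(entry.path)
                with files_lock:
                    files.extend(found)
            except OSError:
                pass                 # unreadable directory: skip it
            finally:
                # Subdirectories were queued before this call, so the
                # unfinished-task count cannot reach zero too early.
                pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    pending.put(root)
    pending.join()                   # wait until every queued directory is done
    for _ in threads:
        pending.put(None)            # one sentinel per worker
    for t in threads:
        t.join()
    return sorted(files)
```

Calling `scan_tree("/some/path", num_workers=8)` would return every regular file under that path. Symbolic links are deliberately not followed here, which avoids loops; a real rebalance crawl would have its own rules for that.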

--
Barak Sason Rofman

Gluster Storage Development

Red Hat Israel

34 Jerusalem rd. Ra'anana, 43501

bsasonro[hidden email]    T: +972-9-7692304
M: +972-52-4326355


_______________________________________________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: Multithreaded Iterative Dir Tree Scan

Amar Tumballi-3
This looks like a good effort to pick up, Barak. A needed one indeed.

-Amar

On Mon, Mar 23, 2020 at 3:18 PM Barak Sason Rofman <[hidden email]> wrote:
[...]



--
Container Storage made easy!


_______________________________________________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968




Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


Re: Multithreaded Iterative Dir Tree Scan

Sankarshan Mukhopadhyay


On Fri, 24 Apr 2020 at 08:05, Amar Tumballi <[hidden email]> wrote:
This looks like a good effort to pick up, Barak. A needed one indeed.


Should this be tracked with a release label and planned? The content of the document should probably be moved into the issue that tracks the PR(s).

 



--
sankarshan mukhopadhyay
<https://about.me/sankarshan.mukhopadhyay>

 
