Modifying gluster's logging mechanism

Modifying gluster's logging mechanism

Barak Sason Rofman

Hello Gluster community,

My name is Barak and I joined Red Hat's Gluster development team in August.
Shortly after my arrival, I identified a potential problem with Gluster's logging mechanism, and I'd like to bring the matter up for discussion.

The general concept of the current mechanism is that every worker thread that needs to log a message has to contend for a mutex which guards the log file, write the message, flush the data and then release the mutex (roughly the pattern sketched below).
I see two design / implementation problems with that mechanism:

  1. The mutex that guards the log file is likely under constant contention.

  2. Each worker thread performs the I/O itself, which slows down its "real" work.
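
Here is that sketch - a deliberately simplified illustration, not the actual gf_log()/gf_msg() code, with illustrative names:

#include <pthread.h>
#include <stdio.h>

/* Simplified sketch of the current, mutex-guarded logging pattern:
 * every worker thread that logs contends for the same lock and then
 * performs the file I/O (write + flush) itself. */
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static FILE *log_fp;                /* the single, shared log file */

void log_message(const char *msg)
{
    pthread_mutex_lock(&log_lock);  /* point of constant contention */
    fprintf(log_fp, "%s\n", msg);   /* the worker does the I/O itself */
    fflush(log_fp);                 /* ...and pays for the flush too */
    pthread_mutex_unlock(&log_lock);
}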


Initial tests, done by removing logging from the regression testing, show an improvement of about 20% in run time. This indicates we’re taking a pretty heavy performance hit just because of the logging activity.


In addition to these problems, the logging module is due for an upgrade:

  1. There are dozens of APIs in the logger, many of them deprecated - this makes it very hard for new developers to keep evolving the project.

  2. One of the key points for Gluster-X, presented in October at Bangalore, is the switch to structured logging all across Gluster.


Given these points, I believe we’re in a position that allows us to upgrade the logging mechanism by both switching to structured logging across the project AND replacing the logging system itself, thus “killing two birds with one stone”.

Moreover, if the upgrade is successful, the new logger mechanism might be adopted by other teams in Red Hat, which would lead to uniform logging across different products.


I’d like to propose a logging utility I’ve been working on for the past few weeks.
This project is still a work in progress (and much work still needs to be done on it), but I’d like to bring the matter up now so that, if the community wants to advance on that front, we can collaborate and shape the logger to best suit the community’s needs.

An overview of the system:

The logger provides several (number and size are user-defined) pre-allocated buffers which threads can 'register' to and receive a private buffer. In addition, a single, shared buffer is also pre-allocated (size is user-defined). The number of buffers and their size will be modifiable at runtime (not yet implemented).

Worker threads write messages in one of 3 ways that will be described next, and an internal logger thread constantly iterates over the existing buffers and drains the data to the log file.

As all allocations are done at the initialization stage, no special treatment is needed for "out of memory" cases.
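
To make this a bit more concrete, initialization and thread registration could look roughly like the following. This is a minimal sketch under assumed names and types - it is not the actual API of the project linked below, and error handling is omitted:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical shapes, for illustration only. */
struct ring_buffer {
    char  *data;          /* pre-allocated at init time */
    size_t size;
    size_t head, tail;    /* monotonically growing write/read counters */
    bool   in_use;        /* has a worker thread claimed this buffer? */
};

struct logger {
    struct ring_buffer *private_bufs; /* one per registered thread */
    int                 num_private;
    struct ring_buffer  shared_buf;   /* the single shared fallback buffer */
    pthread_mutex_t     shared_lock;  /* guards the shared buffer */
    FILE               *log_fp;
};

/* Everything is allocated up front, so later writes never hit malloc()
 * and there are no "out of memory" paths on the logging fast path. */
struct logger *logger_init(int num_bufs, size_t buf_size, const char *path)
{
    struct logger *lg = calloc(1, sizeof(*lg));

    lg->private_bufs = calloc(num_bufs, sizeof(*lg->private_bufs));
    lg->num_private  = num_bufs;
    for (int i = 0; i < num_bufs; i++) {
        lg->private_bufs[i].data = malloc(buf_size);
        lg->private_bufs[i].size = buf_size;
    }
    lg->shared_buf.data = malloc(buf_size);
    lg->shared_buf.size = buf_size;
    pthread_mutex_init(&lg->shared_lock, NULL);
    lg->log_fp = fopen(path, "a");
    return lg;
}

/* A worker thread claims one of the pre-allocated private buffers;
 * if none is free it gets NULL and falls back to levels 2/3 below. */
struct ring_buffer *logger_register_thread(struct logger *lg)
{
    for (int i = 0; i < lg->num_private; i++)
        if (__sync_bool_compare_and_swap(&lg->private_bufs[i].in_use,
                                         false, true))
            return &lg->private_bufs[i];
    return NULL;
}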

The following writing levels exist:

  1. Level 1 - Lockless writing: Lockless writing is achieved by assigning each thread a private ring buffer. A worker thread writes to that buffer and the logger thread drains it into the log file.

In case the private ring buffer is full and not yet drained, or in case the worker thread has not registered for a private buffer, we fall back to the following writing methods:

  2. Level 2 - Shared buffer writing: The worker thread writes its data into a buffer that is shared across all threads. This is done in a synchronized manner.

In case the private ring buffer is full and not yet drained (or the worker thread has not registered for a private buffer) AND the shared ring buffer is also full and not yet drained, we fall back to the last writing method:

  3. Level 3 - Direct write: This is the slowest form of writing - the worker thread writes directly to the log file.

The idea behind this utility is to reduce the impact of logging on runtime as much as possible. Part of this reduction comes at the cost of having to parse and reorganize the messages in the log files using a dedicated tool (yet to be implemented), as there is no guarantee on the order of the logged messages.
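
As a rough illustration of the three levels, and reusing the hypothetical types from the sketch above (ring_buffer_try_write() here is a naive single-producer / single-consumer helper; a real lockless implementation would also need proper memory barriers):

#include <string.h>

/* Naive SPSC try-write: the worker is the only writer and the logger
 * thread the only reader, so plain counters are enough for a sketch.
 * Fails when the buffer is full and not yet drained. */
static bool ring_buffer_try_write(struct ring_buffer *rb,
                                  const char *msg, size_t len)
{
    if ((rb->head - rb->tail) + len + 1 > rb->size)
        return false;                       /* full, not yet drained */
    for (size_t i = 0; i <= len; i++)       /* copy including the '\0' */
        rb->data[(rb->head + i) % rb->size] = msg[i];
    rb->head += len + 1;
    return true;
}

/* Set by the worker after a successful logger_register_thread(). */
static __thread struct ring_buffer *my_private_buf;

void logger_write(struct logger *lg, const char *msg)
{
    size_t len = strlen(msg);

    /* Level 1: lockless write into this thread's private ring buffer. */
    if (my_private_buf && ring_buffer_try_write(my_private_buf, msg, len))
        return;

    /* Level 2: synchronized write into the single shared buffer. */
    pthread_mutex_lock(&lg->shared_lock);
    bool ok = ring_buffer_try_write(&lg->shared_buf, msg, len);
    pthread_mutex_unlock(&lg->shared_lock);
    if (ok)
        return;

    /* Level 3: both buffers are full - direct, synchronized write to the
     * log file (the slowest path; a real implementation would coordinate
     * this with the draining logger thread). */
    pthread_mutex_lock(&lg->shared_lock);
    fprintf(lg->log_fp, "%s\n", msg);
    fflush(lg->log_fp);
    pthread_mutex_unlock(&lg->shared_lock);
}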

The full logger project is hosted on: https://github.com/BarakSason/Lockless_Logger

For project documentation visit: https://baraksason.github.io/Lockless_Logger/

I thank you all for reading through my suggestion and I’m looking forward to your feedback,

--
Barak Sason Rofman

Gluster Storage Development

Red Hat Israel

34 Jerusalem rd. Ra'anana, 43501

bsasonro[hidden email]    T: +972-9-7692304
M: +972-52-4326355



Re: Modifying gluster's logging mechanism

Atin Mukherjee-2
This is definitely a good start. In fact, the experiment you have done, which indicates a 20% improvement in run-time performance without the logger, does put this work in the ‘worth a try’ category for sure. The only thing we need to be mindful of here is the ordering of the logs, provided either through a tool or by the logger itself taking care of it.


--
--Atin


Re: Modifying gluster's logging mechanism

Xie Changlong-2
In reply to this post by Barak Sason Rofman


On 2019/11/21 21:04, Barak Sason Rofman wrote:

I see two design / implementation problems with that mechanism:

  1. The mutex that guards the log file is likely under constant contention.

  2. Each worker thread performs the I/O itself, which slows down its "real" work.


Initial tests, done by removing logging from the regression testing, show an improvement of about 20% in run time. This indicates we’re taking a pretty heavy performance hit just because of the logging activity.


Hi Barak Sason Rofman. Amazing perf improvement! Could you show me the detailed test method?

Thanks

    -Xie


Re: Modifying gluster's logging mechanism

Barak Sason Rofman
Thank you for your input Atin and Xie Changlong.

Regarding log ordering - my initial thought was to do it offline using a dedicated tool. This should be straightforward, as log messages have a timestamp composed of seconds and microseconds, so ordering them by this value is definitely possible.
This is actually one of the main reasons I wanted to bring this up for discussion - will it be fine with the community to run a dedicated tool to reorder the logs offline?
Reordering the logs offline will allow us to gain the most performance improvement, as keeping the logs ordered online will have some cost (probably through stronger synchronization).
Moreover, we can take log viewing one step further and maybe create some GUI system (Java based?) to view and handle logs (e.g. one window to show the combined, ordered logs, other windows to show logs per thread, etc.).
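
To make the 'dedicated tool' idea concrete, a minimal reordering sketch could look like the following. It assumes every log line starts with a "<seconds>.<microseconds>" timestamp, which is an assumption made for illustration, not a description of the current on-disk format:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct entry {
    long long sec, usec;
    char     *line;
};

static int cmp_entry(const void *a, const void *b)
{
    const struct entry *x = a, *y = b;
    if (x->sec != y->sec)
        return (x->sec > y->sec) - (x->sec < y->sec);
    return (x->usec > y->usec) - (x->usec < y->usec);
}

int main(int argc, char **argv)
{
    if (argc < 2)
        return 1;
    FILE *fp = fopen(argv[1], "r");
    if (!fp)
        return 1;

    struct entry *entries = NULL;
    size_t n = 0, cap = 0;
    char buf[4096];

    while (fgets(buf, sizeof(buf), fp)) {
        if (n == cap) {
            cap = cap ? cap * 2 : 1024;
            entries = realloc(entries, cap * sizeof(*entries));
        }
        struct entry *e = &entries[n++];
        e->sec = e->usec = 0;   /* lines without a timestamp sort first */
        sscanf(buf, "%lld.%lld", &e->sec, &e->usec);
        e->line = strdup(buf);
    }
    fclose(fp);

    qsort(entries, n, sizeof(*entries), cmp_entry);
    for (size_t i = 0; i < n; i++)
        fputs(entries[i].line, stdout);
    return 0;
}

(As long as the timestamp is the first field on each line, much the same result could probably be had with a plain "sort -n" on the shell.)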

Regarding the test method - my initial testing was done by removing all logging from regression. Modifying the method "skip_logging" to return 'true' in all cases seems to remove most of the logs (though not all; to be on the safe side, really removing all logging-related calls is probably even better).
As regression tests are mostly single-node tests, some additional testing was needed. I've written a couple of very basic scripts to create a large number of files / big files, read / write to / from them, move them around and exercise some other basic functionality.
I'd actually be glad to test this in some 'real world' use cases - if you have specific use cases that you run frequently, we can model them and benchmark against them - this will likely offer an even more accurate benchmark.


Re: Modifying gluster's logging mechanism

Xie Changlong-2


On 2019/11/22 17:43, Barak Sason Rofman wrote:

Regarding the test method - my initial testing was done by removing all logging from regression. Modifying the method "skip_logging" to return 'true' in all cases seems to remove most of the logs (though not all; to be on the safe side, really removing all logging-related calls is probably even better).


Thanks for sharing, I'll go back to your method and do some perf tests :)



Re: Modifying gluster's logging mechanism

Yaniv Kaul
In reply to this post by Barak Sason Rofman


On Fri, Nov 22, 2019 at 11:45 AM Barak Sason Rofman <[hidden email]> wrote:
Thank you for your input Atin and Xie Changlong.

Regarding log ordering - my initial thought was to do it offline using a
dedicated too. Should be straight forward, as the logs have time stamp
composed of seconds and microseconds, so ordering them using this value is
definitely possible.
This is actually one of the main reasons I wanted to bring this up for
discussion - will it be fine with the community to run a dedicated tool to
reorder the logs offline?
Reordering the logs offline will allow us to gain the most performance
improvement, as keeping the logs order online will have some cost (probably
through stronger synchronization).
Moreover, we can take log viewing one step further and maybe create some
GUI system (JAVA based?) to view and handle logs (e.g. one window to show
the combined order logs, other windows to show logs per thread etc').

If you need an external tool (please not Java - let's not add another language to the project), you might as well move to binary logging.
I believe we need to be able to do this sorting using Linux command-line tools only.

Regarding the test method - my initial testing was done by removing all
logging from regression. Modifying the method "skip_logging" to return
'true' in all cases seems to remove most of the logs (though not all, "to
be on the safe side", really removing all logging related methods is
probably even better).

This is not a fair comparison:
1. The regression tests are running with debug logging enabled.
2. Not logging at all != replacing the logging framework - the new one will have its own overhead as well.
3. I believe there's a lot of code there that does not reflect real-life scenarios - such as dumping a lot of data (which causes a lot of calls to inode_find() and friends, for example).
Y.


Re: [Gluster-Maintainers] Modifying gluster's logging mechanism

Ravishankar N
In reply to this post by Barak Sason Rofman

On 22/11/19 3:13 pm, Barak Sason Rofman wrote:
> This is actually one of the main reasons I wanted to bring this up for
> discussion - will it be fine with the community to run a dedicated
> tool to reorder the logs offline?

I think it is a bad idea to log without ordering and later rely on an external tool to sort it. This is definitely not something I would want to do while doing testing and development or while debugging field issues. Structured logging is definitely useful for gathering statistics and post-processing to make reports and charts and whatnot, but from a debugging point of view, maintaining causality of messages and working with command-line text editors and tools on log files is important IMO.

I had similar concerns when the brick multiplexing feature was developed, where a single log file was used for logging all multiplexed bricks' logs. It is so much extra work to weed out the messages of 99 processes to read the log of the one process you are interested in.

Regards,
Ravi


Re: [Gluster-Maintainers] Modifying gluster's logging mechanism

Barak Sason Rofman
Thank you all for participating in this discussion.

Regarding Yaniv's comments:
If you need an external tool (please not Java - let's not add another language to the project), you might as well move to binary logging.
I believe we need to be able to do this sorting using Linux command-line tools only.
My intention was not at all to perform the ordering using Java.
Ordering of the logs can be done easily using Python or even C (these are the tools that I know).
I'd be happy to know how this can be done with just the Linux CLI - please share your insight.
Regarding Java - I only meant to use it as a GUI tool (Java Swing).
A GUI tool that presents ordered logs in a couple of ways (e.g. a tab that shows the sorted logs for all threads, separate tabs per thread, etc.) may add some value.
Regarding the binary logging - actually, the system I proposed is already "semi-binary", as it logs the timestamp as raw hex. A switch to full binary is fairly simple, and I do see advantages in that proposal.
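
For the sake of discussion, a fully binary record could be as simple as a fixed-size header followed by the message payload; this layout is purely hypothetical and not what the project currently writes:

#include <stdint.h>

/* Hypothetical binary log record: a fixed-size header followed by
 * 'len' bytes of message payload. An offline tool would read these
 * records, sort them by (sec, usec) and render them as text. */
struct log_record_hdr {
    uint64_t sec;     /* timestamp, seconds */
    uint32_t usec;    /* timestamp, microseconds */
    uint32_t tid;     /* originating thread id */
    uint16_t level;   /* log level */
    uint16_t len;     /* payload length in bytes */
} __attribute__((packed));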

This is not a fair comparison:
1. The regression tests are running with debug logging enabled.
2. Not logging at all != replacing the logging framework - the new one will have its own overhead as well.
3. I believe there's a lot of code there that does not reflect real-life scenarios - such as dumping a lot of data (which causes a lot of calls to inode_find() and friends, for example).
1 - Actually I'm not sure about this; I need to verify.
2 - I haven't claimed that with a new system we'd suddenly have a 20% performance increase. I have pointed out a potentially problematic influence of the current mechanism that users (and developers) may be unaware of (as Strahil's comment suggests).
3 - That's the power of a community. I encourage users and developers to perform further tests on the matter, with real-life scenarios, so we'd have a better understanding of the impact.

Regarding Ravi's comments:
maintaining causality of messages and working with command-line text editors and tools on log files is important IMO.
Would running a tool in the form of "# logOrderer /someDirWithLogs" and having the logs sorted in the way they are currently sorted be so bad?
Furthermore, the system I proposed can easily produce ordered logs if no threads are registered for a private buffer (as I mentioned, and as the project documentation mentions, if a thread doesn't have a private buffer, it automatically falls back to "level 2" writing, which is writing to a shared buffer - basic async logging. Levels 2 and 3 maintain log ordering).

I think at this point we can focus the discussion on 2 points:
1 - Do we want to change the current system?
The current mechanism is "synced" logging, which definitely hurts performance. Are we OK with taking that hit, or do we want to improve?
"Async logging" is not a new concept and it certainly has its advantages over "synced" logging.
2 - Given that the answer to question 1 is "yes", what do we require from the new logging system?
I proposed a system I've been developing as a side project for the past couple of weeks, and I'd appreciate it if comments are made specifically on the proposed mechanism (remembering that the project is still a work in progress).
Lastly, there are a lot of logging alternatives out there (e.g. Log4c) which are definitely worth consideration.

Lastly, I want to address Strahil's comment:
As an end user, I think that performance improvements must be of utmost priority and this 'async logging' approach makes sense.
Actually, you make me think if I really need such detailed logs (I'm running an oVirt lab), as I can benefit from logless gluster's performance.
Obviously there are many different users with many different use cases out there, and I believe we should be flexible enough to provide them with a suitable solution for their needs. I'd hate to see users turn off logging just because it hurts their performance, as it would hurt our ability as developers to provide support when needed.

Again, thank you for participating, and I'm looking forward to more comments and input,


Re: [Gluster-Maintainers] Modifying gluster's logging mechanism

Sankarshan Mukhopadhyay
In reply to this post by Ravishankar N
The original proposal from Barak has merit for planning towards an alpha form being made available. Has the project moved away from writing specifications and reviewing those proposals? Earlier, we used to see those rather than discussing things in multi-threaded list emails. While the recorded gains in performance are notable, it would be prudent to cleanly assess the switch-over cost should the project want to move over to the new patterns of logging. This seems like a good time to plan well for a change that has both utility and value.

That said, I'd like to point out some relevant aspects. Logging is just one (although an important one) part of what is being talked about as o11y (observability). A log (or a structured log) is a record of events. Debugging of distributed systems requires understanding of a unit of work which flowed through a system firing off events which in turn were recorded. Logs thus are often portions of events which are part of this unit of work (assume this is a "transaction" if that helps grasp it better). Or, in other words, logs are portions of the abstraction, i.e. events. The key aspect I'd like to highlight is that (re)designing structured logging in isolation from o11y principles will not work as we see more customers adopt o11y patterns and tools within their SRE and other emerging sub-teams. Focusing just on logs keeps us rooted to the visual display of information via ELK/EFK models rather than considering the behavior-centric diagnosis of the whole system.




Re: Modifying gluster's logging mechanism

Amar Tumballi-2
In reply to this post by Barak Sason Rofman
Hi Barak,

My replies inline.

On Thu, Nov 21, 2019 at 6:34 PM Barak Sason Rofman <[hidden email]> wrote:

Hello Gluster community,

My name is Barak and I’ve joined RH gluster development in August.
Shortly after my arrival, I’ve identified a potential problem with gluster’s logging mechanism and I’d like to bring the matter up for discussion.

The general concept of the current mechanism is that every worker thread that needs to log a message has to contend for a mutex which guards the log file, write the message, flush the data and then release the mutex.
I see two design / implementation problems with that mechanism:

  1. The mutex that guards the log file is likely under constant contention.

  2. Each worker thread performs the I/O itself, which slows down its "real" work.


Both of the above points are true, and can have an impact when there is a lot of logging. While some of us would say we knew the impact of it, we had not picked this up as a priority item to fix, for the reasons below.

* First of all, when we looked at logging very early in the project's life, our idea was based mostly on kernel logs (/var/log/messages). We decided that, as a file system which is very active with I/O and should run for years together without failing, there should be NO log messages when the system is healthy, which should be 99%+ of the time.

* Now, if there are no logs when everything is healthy, and most things are healthy 99% of the time, naturally the focus was not the 'performance' of the logging infra but its correctness. This is where the strict ordering through locks, to preserve the timestamps of the logs and keep them organized, came from.


Initial tests, done by removing logging from the regression testing, show an improvement of about 20% in run time. This indicates we’re taking a pretty heavy performance hit just because of the logging activity.



That is an interesting observation. For this alone, can we have an option to disable all logging during regression? That would speed things up for normal runs immediately.
 

In addition to these problems, the logging module is due for an upgrade:

  1. There are dozens of APIs in the logger, many of them deprecated - this makes it very hard for new developers to keep evolving the project.

  2. One of the key points for Gluster-X, presented in October at Bangalore, is the switch to structured logging all across Gluster.



+1
 

Given these points, I believe we’re in a position that allows us to upgrade the logging mechanism by both switching to structured logging across the project AND replacing the logging system itself, thus “killing two birds with one stone”.

Moreover, if the upgrade is successful, the new logger mechanism might be adopted by other teams in Red Hat, which would lead to uniform logging across different products.



This, in my opinion, is a good reason to undertake this activity, mainly because our logging infra should be similar to that of other tools, and one shouldn't need a learning curve to understand Gluster's logging.
 

I’d like to propose a logging utility I’ve been working on for the past few weeks.
This project is still a work in progress (and much work still needs to be done on it), but I’d like to bring the matter up now so that, if the community wants to advance on that front, we can collaborate and shape the logger to best suit the community’s needs.

An overview of the system:

The logger provides several (number and size are user-defined) pre-allocated buffers which threads can 'register' to and receive a private buffer. In addition, a single, shared buffer is also pre-allocated (size is user-defined). The number of buffers and their size will be modifiable at runtime (not yet implemented).

Worker threads write messages in one of 3 ways that will be described next, and an internal logger thread constantly iterates over the existing buffers and drains the data to the log file.

As all allocations are done at the initialization stage, no special treatment is needed for "out of memory" cases.

The following writing levels exist:

  1. Level 1 - Lockless writing: Lockless writing is achieved by assigning each thread a private ring buffer. A worker thread writes to that buffer and the logger thread drains it into the log file.

In case the private ring buffer is full and not yet drained, or in case the worker thread has not registered for a private buffer, we fall back to the following writing methods:

  2. Level 2 - Shared buffer writing: The worker thread writes its data into a buffer that is shared across all threads. This is done in a synchronized manner.

In case the private ring buffer is full and not yet drained (or the worker thread has not registered for a private buffer) AND the shared ring buffer is also full and not yet drained, we fall back to the last writing method:

  3. Level 3 - Direct write: This is the slowest form of writing - the worker thread writes directly to the log file.

The idea behind this utility is to reduce the impact of logging on runtime as much as possible. Part of this reduction comes at the cost of having to parse and reorganize the messages in the log files using a dedicated tool (yet to be implemented), as there is no guarantee on the order of the logged messages.

This is an agreeable compromise for reducing the load on a running system. As logs are mostly used for debugging, and are mostly examined in isolation after the event happens, if there exists a tool to get things ordered, then it should be fine.

I am in agreement with Sankarshan, who responded in another thread about tools like EFK/ELK stack for centralized logging. By moving to centralized logging, and having proper structure for logging, the initial goals we had wouldn't be broken, and we will have better performance even when we are logging in DEBUG mode.
 

The full logger project is hosted on: https://github.com/BarakSason/Lockless_Logger

For project documentation visit: https://baraksason.github.io/Lockless_Logger/


Please make sure we have minimal dependencies for this project, and also that the install steps are easy. One of the reasons Gluster succeeded is that it is very easy to install (and uninstall), without many dependencies. I personally would like to keep that behavior the same.

While developing, and at least for a couple of releases after this gets used, we need to keep the current logging structure as an option as well, so there would be confidence to try the new thing.

Regards,
Amar


Re: [Gluster-Maintainers] Modifying gluster's logging mechanism

Sankarshan Mukhopadhyay


On Wed, Nov 27, 2019 at 7:44 AM Amar Tumballi <[hidden email]> wrote:
Hi Barak,

My replies inline.

On Thu, Nov 21, 2019 at 6:34 PM Barak Sason Rofman <[hidden email]> wrote:



[snip]
 


Initial tests, done by removing logging from the regression testing, show an improvement of about 20% in run time. This indicates we’re taking a pretty heavy performance hit just because of the logging activity.



That is an interesting observation. For this alone, can we have an option to disable all logging during regression? That would speed things up for normal runs immediately.

If having quicker regression runs is the objective, then perhaps we should not look at turning off logging to accomplish that. Instead, there ought to be additional aspects which can be reviewed, with turning off logging being the last available option.


Re: [Gluster-Maintainers] Modifying gluster's logging mechanism

Barak Sason Rofman
Greetings all and thank you for your contribution to this discussion.

Amar and Sankarshan, regarding your comments:
* First of all, when we looked at logging very early in the project's life, our idea was based mostly on kernel logs (/var/log/messages). We decided that, as a file system which is very active with I/O and should run for years together without failing, there should be NO log messages when the system is healthy, which should be 99%+ of the time.

* Now, if there are no logs when everything is healthy, and most things are healthy 99% of the time, naturally the focus was not the 'performance' of the logging infra but its correctness. This is where the strict ordering through locks, to preserve the timestamps of the logs and keep them organized, came from.
I believe this can be said about pretty much any piece of software - nothing is designed to break (except maybe piñatas?).
I am curious though - when the project began, were there no INFO-level logs? These are logged even when everything is healthy.
In any case, we can set the history aside and focus on the present. I believe many components were upgraded and improved over time while the logger was left behind, and it seems there are consequences of this now.

That is an interesting observation. For this alone, can we have an option to disable all logging during regression? That would speed things up for normal runs immediately.
We had an idea that could have improved CI time, but because the test module is built in such a way that each test starts Gluster by itself, our idea was not feasible, as it would require modifying all the tests (we wanted to run tests with no logging, and log only when a test fails, on the re-run).
Anyway, IMO, turning off logging does not help us handle the core problem at all, and I also believe it may send the wrong message to our users ("our logging is not good enough, so we decided to get rid of it instead of improving it").
And I fully agree with Sankarshan's comment:
If having quicker regression runs is the objective, then perhaps we should not look at turning off logging to accomplish that. Instead, there ought to be additional aspects which can be reviewed, with turning off logging being the last available option.
 
Please make sure we have minimal dependencies for this project, and also that the install steps are easy. One of the reasons Gluster succeeded is that it is very easy to install (and uninstall), without many dependencies. I personally would like to keep that behavior the same.
I did not mean for it to be a dependency of the project; I suggested it as an alternative logging solution that would be fully integrated into Gluster.

I am in agreement with Sankarshan, who responded in another thread about tools like EFK/ELK stack for centralized logging. By moving to centralized logging, and having proper structure for logging, the initial goals we had wouldn't be broken, and we will have better performance even when we are logging in DEBUG mode.
I do not know enough about EFK/ELK to form an opinion, but I understand from your comments that these are considered standard components for logging, etc.
I'm not against this at all, but I do need to investigate more before I can share my thoughts on the technical aspects.

[Amar] This, in my opinion, is a good reason to undertake this activity, mainly because our logging infra should be similar to that of other tools, and one shouldn't need a learning curve to understand Gluster's logging.
[Sankarshan] The original proposal from Barak has merit for planning towards an alpha form to be available.
I'm very happy to hear this, as it gives me confidence in my actions, thank you.

Unfortunately, I've been informed that I'm not to pursue this initiative at the current time, as there are other, higher priorities that require resources, and it was decided not to dedicate them to this initiative.
I do hope that when the time is right, we can revisit this discussion and see how we can advance on that front.
If anyone has insight they'd like to share, I'd be happy to discuss it.

Thank you all very much,




--
Barak Sason Rofman

Gluster Storage Development

Red Hat Israel

34 Jerusalem rd. Ra'anana, 43501

bsasonro[hidden email]    T: +972-9-7692304
M: +972-52-4326355


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel