[nmh-workers] strace show # pauses/takes 5+ sec | network issue? ( not nmh )

nmh
On Sun 9/8/19 15:48 -0700 Bakul Shah wrote:
>In addition to -f as Ken suggested, you can specify the -r
>flag to strace to know which syscalls take an unusually long
>time.  This can then lead you to check more things such as
>network traffic etc.

Thanks Bakul.

My model example now:

    strace -o /tmp/foo -r -t -f /usr/local/nmh/bin/scan last
        # -r        Print a relative timestamp upon entry to each system call...
        # -t        Prefix each line of the trace with the wall clock time.
        # -f        Trace child processes as they are created by currently traced processes...
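A minimal sketch for pulling the long pauses out of a trace file such as /tmp/foo. It assumes the `-r` field is the first number on the line with six decimal places (possibly wrapped in parentheses); the exact layout differs between strace versions, so the pattern may need adjusting. The names `REL_TIME` and `slow_gaps` are made up for this example. Because `-r` records the time between the starts of successive calls, a large value on one line is charged to the call on the line before it. With `-f` the lines of several processes interleave, so treat the result as a hint rather than proof.

```python
import re

# Assumption: the relative-time field looks like "0.000123" and is the first
# such number on the line. The lookbehind skips the fractional part of a
# wall-clock stamp such as "15:48:01.123456".
REL_TIME = re.compile(r"(?<![\d.:])(\d+\.\d{6})(?!\d)")


def slow_gaps(lines, threshold=1.0):
    """Return (gap_seconds, previous_line) pairs for gaps >= threshold.

    The gap printed on a line is the time since the previous call started,
    so the previous line is the one that was slow (or was followed by idle
    time in the traced process).
    """
    hits = []
    previous = None
    for line in lines:
        match = REL_TIME.search(line)
        if match is None:
            continue  # not a line carrying a relative timestamp
        gap = float(match.group(1))
        if previous is not None and gap >= threshold:
            hits.append((gap, previous.rstrip("\n")))
        previous = line
    return hits
```

One way to use it: open the file written by `-o` and print the result, for example `for gap, call in slow_gaps(open("/tmp/foo"), 1.0): print(gap, call)`.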

On Mon 9/9/19 21:10 +0100 Michael Richardson wrote:
>I suggest -ff, which will populate the files /tmp/foo.12343 by PID#.  I
>can't see why scan is doing DNS, but what do I know.
>
>If it's network related, we'll probably ask you to send a tcpdump, so:
>sudo tcpdump -i any -n -p -w /tmp/network.pcap
>
>if you are logged in by SSH, then please add "not port 22"

Thanks Michael.  

I'm looking at the strace man page.

It's been almost two days and the problem is still gone.

On Mon 9/9/19 22:45 +0100 Michael Richardson wrote:
>Also, are you running NMH on your VPS?

Yes, on Fedora 29, a Xen host on prgmr.com; mhmail -- nmh-1.7.1

On Tue 9/10/19 1:56 -0000 Krullen Van De Trap wrote:
>Set localname in your mts.conf. If this resolves the issue, the delay is
>almost definitely caused by DNS. In order to avoid such pauses, I always
>set localname in my mts.conf.

Thanks, that makes some sense.  In the case where it paused near the DNS
lookup, it seemed to be looking up a DNS record for its own hostname.

On Mon 9/9/19 20:50 -0400 Ken Hornstein wrote:
>>I suggest -ff, which will populate the files /tmp/foo.12343 by PID#
>
>When you have a generic sort of "hang" I'm NOT a fan of that, because it's
>tough to figure out which (if any) process is causing the hang without
>some (possibly a lot) cross-correlation.

OK.

>>I can't see why scan is doing DNS, but what do I know.
>
>I ran into this one time on an airplane.  If you don't have "localname"
>set in mts.conf, anything that calls LocalName() (which is essentially
>almost everything) ends up calling getaddrinfo(gethostname()), which may
>result in a DNS lookup.  And some things will still result in a call
>to getaddrinfo(gethostname()) even if you DO have localname set (scan
>I don't think will, but I could be wrong).

In /usr/local/nmh/etc/nmh/mts.conf today, I added the line:

    localname: localhost.localdomain
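A rough way to check whether resolving the machine's own hostname is the slow step, independent of nmh. This is only a sketch: it times Python's `socket.getaddrinfo()` on the name returned by `socket.gethostname()`, which approximates the getaddrinfo(gethostname()) call Ken describes but is not nmh's own code. The helper name `time_lookup` is made up for this example.

```python
import socket
import time


def time_lookup(name=None):
    """Time one getaddrinfo() call and return (seconds, error_or_None).

    With no argument, the name looked up is this machine's own hostname.
    """
    if name is None:
        name = socket.gethostname()
    start = time.monotonic()
    error = None
    try:
        socket.getaddrinfo(name, None)
    except socket.gaierror as exc:
        # A failed lookup is still worth timing: a resolver timeout shows
        # up as a multi-second delay followed by an error.
        error = exc
    return time.monotonic() - start, error
```

Calling `time_lookup()` before and after the mts.conf change, and comparing it with `time_lookup("localhost.localdomain")`, should show whether the hostname lookup accounts for a pause of several seconds.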

Thanks to all of you for the help!

--
Tom

--
nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers

Re: strace show # pauses/takes 5+ sec | network issue? ( not nmh )

Michael Richardson-5
[hidden email] wrote:
    > On Mon 9/9/19 22:45 +0100 Michael Richardson wrote:
    >> Also, are you running NMH on your VPS?

    > Yes, on Fedora 29, a Xen host on prgmr.com; mhmail -- nmh-1.7.1

So, the pause could be in sending you the output.
That could be a TCP MTU discovery issue.

    > In /usr/local/nmh/etc/nmh/mts.conf today, I added the line:

    >     localname: localhost.localdomain

    > Thanks to all of you for the help!

That sounds like the actual fix.

--
]               Never tell me the odds!                 | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works        | network architect  [
]     [hidden email]  http://www.sandelman.ca/        |   ruby on rails    [
       
