undefined sanitizer

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

undefined sanitizer

Mike-6
hi,

I keep having fun.
In attach compile report under -fsanitize=undefined in gcc or clang.
Take care.

(mike)

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel

undefined.txt (147K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Pascal Cuoq

> On 18 Jun 2019, at 09:54, Mike <[hidden email]> wrote:
>
> I keep having fun.
> In attach compile report under -fsanitize=undefined in gcc or clang.

I should warn you that your recent e-mails are about identical to a previous message from January: https://lists.nongnu.org/archive/html/tinycc-devel/2019-01/msg00093.html , which did not lead to any code change.

Obviously what is lacking is the time to investigate in depth what should be done about these warnings.

Pascal




_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Michael Matz-4
In reply to this post by Mike-6
Hi,

On Tue, 18 Jun 2019, Mike wrote:

> I keep having fun.
> In attach compile report under -fsanitize=undefined in gcc or clang.
> Take care.

I don't think we should care about alignment of 4 (when 8 would be
needed).  The 64bit platforms we support all handle misaligned memory
accesses quite fine, and doing it "correctly" just wastes space.


Ciao,
Michael.

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Vincent Lefevre-10
On 2019-06-22 00:43:48 +0200, Michael Matz wrote:

> Hi,
>
> On Tue, 18 Jun 2019, Mike wrote:
>
> > I keep having fun.
> > In attach compile report under -fsanitize=undefined in gcc or clang.
> > Take care.
>
> I don't think we should care about alignment of 4 (when 8 would be needed).
> The 64bit platforms we support all handle misaligned memory accesses quite
> fine, and doing it "correctly" just wastes space.

Even if there are no issues with the processor, undefined behavior
may yield the generation of bad code due to optimizations.

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Michael Matz-4
Hi,

On Sat, 22 Jun 2019, Vincent Lefevre wrote:

>> > I keep having fun.
>> > In attach compile report under -fsanitize=undefined in gcc or clang.
>> > Take care.
>>
>> I don't think we should care about alignment of 4 (when 8 would be needed).
>> The 64bit platforms we support all handle misaligned memory accesses quite
>> fine, and doing it "correctly" just wastes space.
>
> Even if there are no issues with the processor, undefined behavior
> may yield the generation of bad code due to optimizations.

Indeed.  The thing is that such "mis"-alignment isn't generically
undefined behaviour (and hence shouldn't even be part of
-fsanitize=undefined).  It's implementation defined what it means for a
pointer to an object type to be correctly aligned (e.g. one where the
natural alignment of all types is 1 is fully conforming).  Accessing
something via an incorrectly aligned pointer is undefined, but what
incorrectly aligned means is implementation defined.

Usually that's ultimately dictated by the capabilities of the underlying
hardware (and not by -fsanitize when it ignores those capabilities!).
Our hosts support alignment of 4 even for 8byte quantities.

Now, you do have a point in that some compilers assume larger natural
alignment (e.g. that a void* has alignment 8), and that there are
transformations like bit propagation of low bits of addresses, or
vectorization making use of instructions actually requiring large
alignment, that might be affected by assumptions about alignment, leading
to wrong code when those assumptions are violated.  When that actually
happens we can revisit, until then I personally prefer the space savings.


Ciao,
Michael.

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Vincent Lefevre-10
On 2019-06-22 20:59:57 +0200, Michael Matz wrote:

> Hi,
>
> On Sat, 22 Jun 2019, Vincent Lefevre wrote:
>
> > > > I keep having fun.
> > > > In attach compile report under -fsanitize=undefined in gcc or clang.
> > > > Take care.
> > >
> > > I don't think we should care about alignment of 4 (when 8 would be needed).
> > > The 64bit platforms we support all handle misaligned memory accesses quite
> > > fine, and doing it "correctly" just wastes space.
> >
> > Even if there are no issues with the processor, undefined behavior
> > may yield the generation of bad code due to optimizations.
>
> Indeed.  The thing is that such "mis"-alignment isn't generically undefined
> behaviour (and hence shouldn't even be part of -fsanitize=undefined).  It's
> implementation defined what it means for a pointer to an object type to be
> correctly aligned (e.g. one where the natural alignment of all types is 1 is
> fully conforming).  Accessing something via an incorrectly aligned pointer
> is undefined, but what incorrectly aligned means is implementation defined.

Yes, it's implementation defined, but I assume that -fsanitize=undefined
warns only when the implementation has decided that this was incorrectly
aligned.

> Usually that's ultimately dictated by the capabilities of the underlying
> hardware

No, I think that this is a decision from the ABI, which can cover
multiple kinds of hardware (e.g. some processors that can handle
alignment and others that can't). Some compiler options can also
relax requirements. For x86, GCC has -malign-double and -malign-data,
but I don't know how this will have an influence on the expected
requirements (i.e. if it just avoids misaligned accesses for its
generated code or if it also forbids misaligned accesses from how
the C source is written). For instance, the gcc(1) man page says:

           On x86-64, -malign-double is enabled by default.

(this covers "double", "long double" and "long long", but I'm
wondering about other 64-bit types such as pointers, as this
would make sense to have the same rule, IMHO).

> (and not by -fsanitize when it ignores those capabilities!). Our
> hosts support alignment of 4 even for 8byte quantities.
>
> Now, you do have a point in that some compilers assume larger natural
> alignment (e.g. that a void* has alignment 8), and that there are
> transformations like bit propagation of low bits of addresses, or
> vectorization making use of instructions actually requiring large alignment,
> that might be affected by assumptions about alignment, leading to wrong code
> when those assumptions are violated.

Yes, that's what might happen and may be the reason why
-fsanitize=undefined complains (it may now know whether the
generated code is actually broken or not, but the UB is a
sufficient condition under which the code may be broken).

> When that actually happens we can revisit, until then I personally
> prefer the space savings.

I'm wondering whether some compiler option (see above) could ensure
that misaligned accesses will not trigger any problem. In this case,
such an option could be recommended to build tcc.

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Christian Jullien-3
> Yes, it's implementation defined, but I assume that -fsanitize=undefined warns only when the implementation has decided that this was incorrectly aligned.

It's nice to have this warning even if implementation supports incorrect alignment. Suppose I provide a portable C library which will be compiled with on different architectures. As developer, I want what has a high risk to hang on machines I'm not even aware of.

For example, double can be misaligned on Intel core i7, but while also supported on older generation, misalignment had performance penalty (IIRC first i386 generation does not support misaligned doubles).

If I write a code with misaligned double, it will work on my Core i7 and I'll push this version with my lib. An ARM user will protest because my lib will do a SIGSEGV on its RPi (ARM).

Misalignment is an undefined behavior because a valid code that compiles wo error/warning (possibly with the right casts) can:
- works perfectly with no performance penalty
- works perfectly but with a small to very big performance penalty (sometimes with OS help as on sparc solaris)
- Do a SIGSEV (as with misaligned doubles on ARM).

Static analyzer as those we find now on VC++, gcc, clang or with external tools like splint (I love this one) help to find portability issues.

-----Original Message-----
From: Tinycc-devel [mailto:tinycc-devel-bounces+eligis=[hidden email]] On Behalf Of Vincent Lefevre
Sent: Saturday, June 22, 2019 22:29
To: [hidden email]
Subject: Re: [Tinycc-devel] undefined sanitizer

On 2019-06-22 20:59:57 +0200, Michael Matz wrote:

> Hi,
>
> On Sat, 22 Jun 2019, Vincent Lefevre wrote:
>
> > > > I keep having fun.
> > > > In attach compile report under -fsanitize=undefined in gcc or clang.
> > > > Take care.
> > >
> > > I don't think we should care about alignment of 4 (when 8 would be needed).
> > > The 64bit platforms we support all handle misaligned memory accesses quite
> > > fine, and doing it "correctly" just wastes space.
> >
> > Even if there are no issues with the processor, undefined behavior
> > may yield the generation of bad code due to optimizations.
>
> Indeed.  The thing is that such "mis"-alignment isn't generically undefined
> behaviour (and hence shouldn't even be part of -fsanitize=undefined).  It's
> implementation defined what it means for a pointer to an object type to be
> correctly aligned (e.g. one where the natural alignment of all types is 1 is
> fully conforming).  Accessing something via an incorrectly aligned pointer
> is undefined, but what incorrectly aligned means is implementation defined.

Yes, it's implementation defined, but I assume that -fsanitize=undefined
warns only when the implementation has decided that this was incorrectly
aligned.

> Usually that's ultimately dictated by the capabilities of the underlying
> hardware

No, I think that this is a decision from the ABI, which can cover
multiple kinds of hardware (e.g. some processors that can handle
alignment and others that can't). Some compiler options can also
relax requirements. For x86, GCC has -malign-double and -malign-data,
but I don't know how this will have an influence on the expected
requirements (i.e. if it just avoids misaligned accesses for its
generated code or if it also forbids misaligned accesses from how
the C source is written). For instance, the gcc(1) man page says:

           On x86-64, -malign-double is enabled by default.

(this covers "double", "long double" and "long long", but I'm
wondering about other 64-bit types such as pointers, as this
would make sense to have the same rule, IMHO).

> (and not by -fsanitize when it ignores those capabilities!). Our
> hosts support alignment of 4 even for 8byte quantities.
>
> Now, you do have a point in that some compilers assume larger natural
> alignment (e.g. that a void* has alignment 8), and that there are
> transformations like bit propagation of low bits of addresses, or
> vectorization making use of instructions actually requiring large alignment,
> that might be affected by assumptions about alignment, leading to wrong code
> when those assumptions are violated.

Yes, that's what might happen and may be the reason why
-fsanitize=undefined complains (it may now know whether the
generated code is actually broken or not, but the UB is a
sufficient condition under which the code may be broken).

> When that actually happens we can revisit, until then I personally
> prefer the space savings.

I'm wondering whether some compiler option (see above) could ensure
that misaligned accesses will not trigger any problem. In this case,
such an option could be recommended to build tcc.

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel


_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Pascal Cuoq
In reply to this post by Vincent Lefevre-10

On 22 Jun 2019, at 22:29, Vincent Lefevre <[hidden email]> wrote:

On 2019-06-22 20:59:57 +0200, Michael Matz wrote:

Indeed.  The thing is that such "mis"-alignment isn't generically undefined
behaviour (and hence shouldn't even be part of -fsanitize=undefined).  It's
implementation defined what it means for a pointer to an object type to be
correctly aligned (e.g. one where the natural alignment of all types is 1 is
fully conforming).  Accessing something via an incorrectly aligned pointer
is undefined, but what incorrectly aligned means is implementation defined.

Yes, it's implementation defined, but I assume that -fsanitize=undefined
warns only when the implementation has decided that this was incorrectly
aligned.

Probably everyone has already seen this blog post about GCC generating code that crashes if pointers to uint32_t are not aligned to 4, but I will post the URL just in case:




_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Vincent Lefevre-10
In reply to this post by Christian Jullien-3
On 2019-06-23 06:51:04 +0200, Christian Jullien wrote:
> > Yes, it's implementation defined, but I assume that -fsanitize=undefined warns only when the implementation has decided that this was incorrectly aligned.
>
> It's nice to have this warning even if implementation supports incorrect alignment. Suppose I provide a portable C library which will be compiled with on different architectures. As developer, I want what has a high risk to hang on machines I'm not even aware of.

The code may have #ifdef's or similar code to enable misaligned
accesses only when supported. In such a case, a warning would not
be useful, unless explicitly requested.

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Christian Jullien-3
#if does not help to deal with different processors when only some of them have performance penalties.
The benefit of a warning is to question the developer about its code and possible way to fix it.
At least, he knows it code has the prerequisite to support misaligned memory of given types.


-----Original Message-----
From: Tinycc-devel [mailto:tinycc-devel-bounces+eligis=[hidden email]] On Behalf Of Vincent Lefevre
Sent: Sunday, June 23, 2019 12:42
To: [hidden email]
Subject: Re: [Tinycc-devel] undefined sanitizer

On 2019-06-23 06:51:04 +0200, Christian Jullien wrote:
> > Yes, it's implementation defined, but I assume that -fsanitize=undefined warns only when the implementation has decided that this was incorrectly aligned.
>
> It's nice to have this warning even if implementation supports incorrect alignment. Suppose I provide a portable C library which will be compiled with on different architectures. As developer, I want what has a high risk to hang on machines I'm not even aware of.

The code may have #ifdef's or similar code to enable misaligned
accesses only when supported. In such a case, a warning would not
be useful, unless explicitly requested.

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel


_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Vincent Lefevre-10
In reply to this post by Pascal Cuoq
On 2019-06-23 08:53:33 +0000, Pascal Cuoq wrote:
> Indeed.  The thing is that such "mis"-alignment isn't generically undefined
> behaviour (and hence shouldn't even be part of -fsanitize=undefined).  It's
> implementation defined what it means for a pointer to an object type to be
> correctly aligned (e.g. one where the natural alignment of all types is 1 is
> fully conforming).  Accessing something via an incorrectly aligned pointer
> is undefined, but what incorrectly aligned means is implementation defined.

Yes, but once the "correctly aligned" conditions have been defined
by the implementation, if a condition is not met, this falls under
-fsanitize=undefined.

Note also that the implementation can also decide that some alignment
condition is supported but deserves a warning.

> Probably everyone has already seen this blog post about GCC
> generating code that crashes if pointers to uint32_t are not aligned
> to 4, but I will post the URL just in case:
>
> http://pzemtsov.github.io/2016/11/06/bug-story-alignment-on-x86.html

This shows again that one should rely on documentation from the
implementation (e.g. implementation-defined things) rather that
one can normally observe when running the code.

Thus -fsanitize=undefined may give warnings even if the generated
code actually works. In particular, optimizations can "break"
expectations by the user, e.g. due to additional alignment
requirement for automatic vectorization or dead-code detection.

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: undefined sanitizer

Vincent Lefevre-10
In reply to this post by Christian Jullien-3
On 2019-06-23 12:56:34 +0200, Christian Jullien wrote:
> #if does not help to deal with different processors when only some
> of them have performance penalties.

That's entirely the choice of the developer. Under various means
(compiler options, configure script, etc.), macros can be defined
to give information about the OS, the target architecture (e.g.
with -march=... with GCC), the target architecture for tuning
(e.g. with -mtune=... with GCC), whether some features are enabled
or not[*], the user choices (such as size vs performance), etc.
Then any of these macros can be used with #if.

[*] Some features may have stricter alignment constraints.

> The benefit of a warning is to question the developer about its code
> and possible way to fix it.

Spurious warnings are bad. Let the user/developer decide what he
wants.

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel