Re: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

Danny Milosavljevic
Hi Janneke,

I just found the bug in tinycc that caused failed our ARM bootstrap to fail.

I use the following reproducer:

int main() {
        double f = 1.0;
        return 0;
}

and then invoke

  tcc a.c

on ARM (32) using your patched tcc.  (but it's also broken in the unpatched tcc)

(tcc cross compiler is not enough.  tcc has to actually itself be an ARM
EABI executable)

I get a bus error here:

│   0x24698 <init_putv+1688>        vstr    d0, [r0]                                                                                                          │

Debugging some more, I find:

tccgen.c:

/* store a value or an expression directly in global data or in local array */
static void init_putv(CType *type, Section *sec, unsigned long c)
{
[...]
        size = type_size(type, &align);
        section_reserve(sec, c + size); // c == 0, size == 8
        ptr = sec->data + c; // sec->data == 0x24b01e, c == 0
            switch(bt) {
                /* XXX: when cross-compiling we assume that each type has the
                   same representation on host and target, which is likely to
                   be wrong in the case of long double */
            case VT_BOOL:
                vtop->c.i = vtop->c.i != 0;
            case VT_BYTE:
                *(char *)ptr = vtop->c.i;
                break;
            case VT_SHORT:
                *(short *)ptr = vtop->c.i;
                break;
            case VT_FLOAT:
                *(float*)ptr = vtop->c.f;
                break;
            case VT_DOUBLE:
                *(double *)ptr = vtop->c.d;
                break;
[... and so on]

tccelf.c:

/* reserve at least 'size' bytes from section start */
ST_FUNC void section_reserve(Section *sec, unsigned long size)
{
    if (size > sec->data_allocated)  // both 8
        section_realloc(sec, size);
    if (size > sec->data_offset) // both 8
        sec->data_offset = size;
}

Nothing here make sure that the VFP double is aligned to 8 Byte.

And indeed, (0x24b01e % 8) == 6, not 0.

Alignment could be disabled on the CPU

  https://developer.arm.com/documentation/ddi0464/f/system-control/register-descriptions/system-control-register

but I don't think EABI wants that.

tinycc does have:

/* reserve at least 'size' bytes aligned per 'align' in section
   'sec' from current offset, and return the aligned offset */
ST_FUNC size_t section_add(Section *sec, addr_t size, int align)
{
    size_t offset, offset1;

    offset = (sec->data_offset + align - 1) & -align;
    offset1 = offset + size;
    if (sec->sh_type != SHT_NOBITS && offset1 > sec->data_allocated)
        section_realloc(sec, offset1);
    sec->data_offset = offset1;
    if (align > sec->sh_addralign)
        sec->sh_addralign = align;
    return offset;
}

But that's not used for init_putv.

And section_reserve, which is used, doesn't care about alignment at all.

(it seems there's a reserved part and a data part in each section, and
it holds that the data part elements are aligned--but the reserved part
elements are NOT aligned.  I don't see how sec->data would be aligned
by the dynamic memory allocator either)

Other notes:

tccgen.c even has this:

>                /* XXX: when cross-compiling we assume that each type has the
>                   same representation on host and target, which is likely to
>                   be wrong in the case of long double */

Yeah, and even when NOT cross-compiling, the alignment is wrong--which means
it sometimes won't work at all on ARM, depending on luck.

As a workaround, we can patch tcc to instead do the assignments on elements
on the stack and then copy those over, instead of doing

                *(double *)ptr = vtop->c.d

(the latter of which emits VFP instructions that expect double-aligned
pointers).

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel

attachment0 (499 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

Jan Nieuwenhuizen
Danny Milosavljevic writes:

Hello Danny,

> I just found the bug in tinycc that caused failed our ARM bootstrap to fail.
>
> I use the following reproducer:
>
> int main() {
>         double f = 1.0;
>         return 0;
> }

Beautiful!  Well done!  I can confirm that adding this patch

--8<---------------cut here---------------start------------->8---
diff --git a/gnu/packages/commencement.scm b/gnu/packages/commencement.scm
index a50a238ddd..5a3fa694b3 100644
--- a/gnu/packages/commencement.scm
+++ b/gnu/packages/commencement.scm
@@ -1417,8 +1417,8 @@ ac_cv_c_float_format='IEEE (little-endian)'
                (substitute* "test/Makefile.in"
                  (("^bigtest:.*") "bigtest: basic\n")
                  (("( |\t)(childin|convfmt|fflush|longwrds|math|negexp)" all sep) sep))
-               (substitute* "io.c"
-                 (("char rs;") "int rs;"))
+               (substitute* '("builtin.c" "eval.c" "io.c")
+                 (("1.0") "1"))
                #t))
            (add-before 'configure 'setenv
              (lambda _
--8<---------------cut here---------------end--------------->8---

to the gawk-mesbot0 recipe also fixes "inc.awk".  The pre
increment/decrement code looks like this:

--8<---------------cut here---------------start------------->8---
                *lhs = make_number(lval +
                               (tree->type == Node_preincrement ? 1.0 : -1.0));

--8<---------------cut here---------------end--------------->8---

> on ARM (32) using your patched tcc.  (but it's also broken in the unpatched tcc)
>
> (tcc cross compiler is not enough.  tcc has to actually itself be an ARM
> EABI executable)
>
> I get a bus error here:
>
> │   0x24698 <init_putv+1688>        vstr    d0, [r0]                                                                                                          │

Ah, so it may result in a wrong assingment, or bus error even.  Great!

> Debugging some more, I find:
>
> tccgen.c:
>
> /* store a value or an expression directly in global data or in local array */
> static void init_putv(CType *type, Section *sec, unsigned long c)
> {
> [...]
>         size = type_size(type, &align);
>         section_reserve(sec, c + size); // c == 0, size == 8
>         ptr = sec->data + c; // sec->data == 0x24b01e, c == 0
>             switch(bt) {
>                 /* XXX: when cross-compiling we assume that each type has the
>                    same representation on host and target, which is likely to
>                    be wrong in the case of long double */
>             case VT_BOOL:
>                 vtop->c.i = vtop->c.i != 0;
>             case VT_BYTE:
>                 *(char *)ptr = vtop->c.i;
>                 break;
>             case VT_SHORT:
>                 *(short *)ptr = vtop->c.i;
>                 break;
>             case VT_FLOAT:
>                 *(float*)ptr = vtop->c.f;
>                 break;
>             case VT_DOUBLE:
>                 *(double *)ptr = vtop->c.d;
>                 break;
> [... and so on]

Ah yes, this code has been problematic in the sense that I found bus errors
here and tried workarounds several times (look at the
wip-arm-bootstrap14 branch).

> tccelf.c:
>
> /* reserve at least 'size' bytes from section start */
> ST_FUNC void section_reserve(Section *sec, unsigned long size)
> {
>     if (size > sec->data_allocated)  // both 8
>         section_realloc(sec, size);
>     if (size > sec->data_offset) // both 8
>         sec->data_offset = size;
> }
>
> Nothing here make sure that the VFP double is aligned to 8 Byte.
>
> And indeed, (0x24b01e % 8) == 6, not 0.

Ah...I wasn't aware of this requirement...

> Alignment could be disabled on the CPU
>
>   https://developer.arm.com/documentation/ddi0464/f/system-control/register-descriptions/system-control-register
>
> but I don't think EABI wants that.

Hmm, what does this mean?  We are not really using EABI, or are we?  We
seem to need this terrible hack

--8<---------------cut here---------------start------------->8---
diff --git a/tccelf.c b/tccelf.c
index 2ac7466c..42546f57 100644
--- a/tccelf.c
+++ b/tccelf.c
@@ -1867,7 +1867,7 @@ static void tcc_output_elf(TCCState *s1, FILE *f, int phnum, ElfW(Phdr) *phdr,
     ehdr.e_ident[EI_OSABI] = ELFOSABI_FREEBSD;
 #endif
 #ifdef TCC_TARGET_ARM
-#ifdef TCC_ARM_EABI
+#if defined (TCC_ARM_EABI) || BOOTSTRAP
     ehdr.e_ident[EI_OSABI] = 0;
     ehdr.e_flags = EF_ARM_EABI_VER4;
     if (file_type == TCC_OUTPUT_EXE || file_type == TCC_OUTPUT_DLL)
--8<---------------cut here---------------end--------------->8---

at least to get tcc's ARM binaries to run on Aarch64-Linux.  Is this
related; iow, could it be that this "fix" for Aarch64 break ARM?

> tinycc does have:
>
> /* reserve at least 'size' bytes aligned per 'align' in section
>    'sec' from current offset, and return the aligned offset */
> ST_FUNC size_t section_add(Section *sec, addr_t size, int align)
> {
>     size_t offset, offset1;
>
>     offset = (sec->data_offset + align - 1) & -align;
>     offset1 = offset + size;
>     if (sec->sh_type != SHT_NOBITS && offset1 > sec->data_allocated)
>         section_realloc(sec, offset1);
>     sec->data_offset = offset1;
>     if (align > sec->sh_addralign)
>         sec->sh_addralign = align;
>     return offset;
> }
>
> But that's not used for init_putv.

OK.

> And section_reserve, which is used, doesn't care about alignment at all.
>
> (it seems there's a reserved part and a data part in each section, and
> it holds that the data part elements are aligned--but the reserved part
> elements are NOT aligned.  I don't see how sec->data would be aligned
> by the dynamic memory allocator either)
>
> Other notes:
>
> tccgen.c even has this:
>
>>                /* XXX: when cross-compiling we assume that each type has the
>>                   same representation on host and target, which is likely to
>>                   be wrong in the case of long double */
>
> Yeah, and even when NOT cross-compiling, the alignment is wrong--which means
> it sometimes won't work at all on ARM, depending on luck.
>
> As a workaround, we can patch tcc to instead do the assignments on elements
> on the stack and then copy those over, instead of doing
>
>                 *(double *)ptr = vtop->c.d
>
> (the latter of which emits VFP instructions that expect double-aligned
> pointers).

So alignment should be fixed, but that's more work and you propose a
workaround, right?  I'm struggling to understand the implications of
this last bit...guessing you will be preparing a patch for the mes-0.23
branch of our "bootstrappable tinycc"?  Oh, and we need that same patch
for plain tcc-0.9.27, for "tcc-boot" of course!

Greetings,
Janneke

--
Jan Nieuwenhuizen <[hidden email]> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

grischka
In reply to this post by Danny Milosavljevic
Danny Milosavljevic wrote:
> int main() {
>         double f = 1.0;
>         return 0;
> }
...
> I get a bus error here:
>
> │   0x24698 <init_putv+1688>  vstr  d0, [r0]                                                                                                          │
...
> And indeed, (0x24b01e % 8) == 6, not 0.
...
>                 *(double *)ptr = vtop->c.d
>
> (the latter of which emits VFP instructions that expect double-aligned
> pointers).

It seems that in fact, on certain systems, initializing intentionally
misaligned (packed) structure members could crash tcc already during
compilation.

But no such thing happens in this case.  The 'ptr' in init_putv()
comes from

         ptr = sec->data + c;

and it seems that if tcc is doing the right thing then 'c' cannot
be misaligned, and if malloc/realloc on that system is doing the
right thing,  then sec->data cannot be misaligned either.  So...?

--- grischka


_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

arnold
In reply to this post by Jan Nieuwenhuizen
Jan Nieuwenhuizen <[hidden email]> wrote:

> to the gawk-mesbot0 recipe also fixes "inc.awk".  The pre
> increment/decrement code looks like this:
>
> --8<---------------cut here---------------start------------->8---
> *lhs = make_number(lval +
>       (tree->type == Node_preincrement ? 1.0 : -1.0));
>
> --8<---------------cut here---------------end--------------->8---

What in the world?  That looks like gawk 3.x code, which is
terribly ancient.  What project is still using a version that old?

Arnold
(The gawk maintainer)

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

Jan Nieuwenhuizen
> Jan Nieuwenhuizen <[hidden email]> wrote:

Hello Arnold!

>> to the gawk-mesbot0 recipe also fixes "inc.awk".  The pre
>> increment/decrement code looks like this:
>>
>> --8<---------------cut here---------------start------------->8---
>> *lhs = make_number(lval +
>>       (tree->type == Node_preincrement ? 1.0 : -1.0));
>>
>> --8<---------------cut here---------------end--------------->8---
>
> What in the world?  That looks like gawk 3.x code, which is
> terribly ancient.  What project is still using a version that old?

We are removing binary seeds from the GNU Guix package graph.  The
binary packages in Guix form an acyclic graph and at the bottom of the
graph we originally had binutils, glibc, gcc, bash, coreutils&co (gawk,
gzip, sed, tar, ...).

Since 2016 we have been working to eliminate those binary seeds.  For a
complete overview and more background see

    https://guix.gnu.org/en/blog/2020/guix-further-reduces-bootstrap-seed-to-25/
    https://guix.gnu.org/blog/2019/guix-reduces-bootstrap-seed-by-50/

but what we did is remove all those, replacing them by Stage0, GNU Mes,
tinycc...and multiple versions of ancient GNU tools.

Using ancient tools is less than great, we are using those because "it
works" or rather, we didn't succeed as yet using newer versions.  Often,
newer versions of a software are more demanding in their requirements
and are less bootstrappale.  In other cases, ancient software does not
build with newer tools, because they are more strict.

> Arnold
> (The gawk maintainer)

Thanks for reaching out!

Sadly I do not have more concrete information (let alone a bug report or
feature request) for you yet, other than that we are using gawk-3.0.0,
lateron v3.1.8, and only finally v5.0.1.  Simalarly for other tools.

The biggest hudle was bootstrapping glibc and gcc, as you can imagine.
Currently, we start with gcc-2.95.3 and I would very much like to target
gcc-4.6.4 directly instead.  For a tool as gawk, it would be great to
be able to the latest greatest!

Greetings,
Janneke (GNU Mes author)

--
Jan Nieuwenhuizen <[hidden email]> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

Danny Milosavljevic
In reply to this post by grischka
Hi grischka,
Hi Janneke,

On Fri, 08 Jan 2021 08:16:29 +0100
grischka <[hidden email]> wrote:

> But no such thing happens in this case.  The 'ptr' in init_putv()
> comes from
>
>          ptr = sec->data + c;
>
> and it seems that if tcc is doing the right thing then 'c' cannot
> be misaligned, and if malloc/realloc on that system is doing the
> right thing,  then sec->data cannot be misaligned either.  So...?

How does tcc allocate dynamic memory?  I've tried to find out, but
tcc_malloc is defined to be "use_tcc_malloc", which I don't find
anywhere.  Does it use libc malloc for that ?

(With MEM_DEBUG defined, tcc_malloc_debug seems to use malloc.  But
what about without MEM_DEBUG defined ?)

If so, is libc malloc supposed to ensure alignment of allocated memory?

According to https://man7.org/linux/man-pages/man3/malloc.3.html yes.

@Janneke: So our mes libc malloc should be aligning the stuff--but it's not
doing it.  So it's a bug in our libc.

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel

attachment0 (499 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [bootstrappable] Re: Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

Orians, Jeremiah (DTMB)
> If so, is libc malloc supposed to ensure alignment of allocated memory?
> According to https://man7.org/linux/man-pages/man3/malloc.3.html yes.
> @Janneke: So our mes libc malloc should be aligning the stuff--but it's not doing it.  So it's a bug in our libc.

Looks like you'll have to waste 3.7bytes on average per malloc to always pad to the 8byte boundary.

-Jeremiah

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

Danny Milosavljevic
In reply to this post by Jan Nieuwenhuizen
Hi Janneke,

On Fri, 08 Jan 2021 07:25:52 +0100
Jan Nieuwenhuizen <[hidden email]> wrote:

> > Alignment could be disabled on the CPU
> >
> >   https://developer.arm.com/documentation/ddi0464/f/system-control/register-descriptions/system-control-register
> >
> > but I don't think EABI wants that.  
>
> Hmm, what does this mean?  We are not really using EABI, or are we?

VFP is a floating point unit on ARM CPUs.  It has been designed to either
require aligned members and be fast, or not, depending on a CPU flag that
for example the kernel can set.

See also https://www.keil.com/support/man/docs/armcc/armcc_chr1359124231926.htm
for much more detail.

ARM is a famously mix-and-match CPU, so the client can choose whatever
they want.

If you choose aligned-only, you will get a bus error on misaligned access.
A single program really shouldn't be choosing--it's more the entire platform
doing the choosing.

EABI has standardized on particular settings, among which is required alignment.

<https://www.keil.com/support/man/docs/armcc/armcc_chr1359124228744.htm>

> So alignment should be fixed, but that's more work and you propose a
> workaround, right?  

No, I wanted to find out what's going on first.

Now that grischka commented and I looked up the malloc docs, I suggest
to fix mes libc's malloc to align what it gives back.  That's all--that
should fix everything.

We should provide maxalign_t to make it known to the user what the
malloc alignment we are using is.

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel

attachment0 (499 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [bootstrappable] Re: Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

Vincent Lefevre-10
In reply to this post by Orians, Jeremiah (DTMB)
On 2021-01-08 13:36:26 +0000, Orians, Jeremiah (DTMB) wrote:
> > If so, is libc malloc supposed to ensure alignment of allocated memory?
> > According to https://man7.org/linux/man-pages/man3/malloc.3.html yes.
> > @Janneke: So our mes libc malloc should be aligning the stuff--but it's not doing it.  So it's a bug in our libc.
>
> Looks like you'll have to waste 3.7bytes on average per malloc to
> always pad to the 8byte boundary.

Note that it's an 8-byte boundary for 32-bit systems, but a 16-byte
boundary for 64-bit systems:

  https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html

And about the average waste, this depends on other factors (the
main one may be that the block size is often a multiple of some
power of two).

--
Vincent Lefèvre <[hidden email]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
Reply | Threaded
Open this post in threaded view
|

Re: [bootstrappable] Re: wip-full-source-bootstrap: from a 357-byte `hex0' to 'hello'

Jan Nieuwenhuizen
In reply to this post by Danny Milosavljevic
Danny Milosavljevic writes:

Hi Danny, Grishka!

> On Fri, 08 Jan 2021 08:16:29 +0100
> grischka <[hidden email]> wrote:
>
>> But no such thing happens in this case.  The 'ptr' in init_putv()
>> comes from
>>
>>          ptr = sec->data + c;
>>
>> and it seems that if tcc is doing the right thing then 'c' cannot
>> be misaligned, and if malloc/realloc on that system is doing the
>> right thing,  then sec->data cannot be misaligned either.  So...?
>
> How does tcc allocate dynamic memory?  I've tried to find out, but
> tcc_malloc is defined to be "use_tcc_malloc", which I don't find
> anywhere.  Does it use libc malloc for that ?
>
> (With MEM_DEBUG defined, tcc_malloc_debug seems to use malloc.  But
> what about without MEM_DEBUG defined ?)
>
> If so, is libc malloc supposed to ensure alignment of allocated memory?
>
> According to https://man7.org/linux/man-pages/man3/malloc.3.html yes.
>
> @Janneke: So our mes libc malloc should be aligning the stuff--but it's not
> doing it.  So it's a bug in our libc.

Beautiful!  Maybe this explains other differences we saw between
aarch64-linux and arm-linux?

Greetings,
Janneke

--
Jan Nieuwenhuizen <[hidden email]> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel