Hello and a few questions about using libtcc


Joshua Scholar
Hi. I just joined this list.  

It feels a bit rude just barging in with questions, but here I go.

I would like to use libtcc to implement a jit so I have a few questions.
First a couple questions about the library's API, and then a couple on the generated code.

On the api:
1) If I've generated some code into a memory buffer and I've retrieved an entry point with tcc_get_symbol, is it safe to call tcc_delete_state before I call the entry point?  Looking at the source code, it looks like tcc_delete_state deletes some things that you don't need at run-time.

2) If you use TCC_RELOCATE_AUTO, does tcc ever delete that memory for you?  Does  tcc_delete_state delete the generated code?  I do realize that if I supply a buffer myself, I have to make it executable myself, but then I can also know that I can delete it.

3) Does the tcc compiler allocate space for a stack?  Or is it that when you call into code it generated, it uses your current stack? Or is there a difference between tcc_run and just calling a function you got back from tcc_get_symbol, where tcc_run allocates its own stack? 

3 a) I noticed that there's a bounds checking version of alloca, but I couldn't understand it.  Does this imply an answer to my previous question, that TCC DOES allocate its own stack? Generally operating systems expand user stacks as needed, up to a point, so bounds checking a stack doesn't make a lot of sense.

4) Is it possible to reuse a TCCState? What would it do?  Would it trash the code already generated?  Would it remember symbols from previous compiles?

5) Is the compiler thread safe?  While it would be surprising, I might as well ask if you can compile multiple sources at once.

Now questions about generated code.  The jit I'm hoping to make is for a language that's embarrassingly parallel, so I need to know how the generated code works with threads and stacks and contexts.

1) The simplest thing, what's the calling convention of generated functions?  On 64 bit windows?  On 64 bit Linux? 

2)  Is the TCC runtime multithread safe?
a) And what are the details?  Can I run generated code in multiple threads at once?  Does it use locks for anything?
Is it at least as thread safe as C usually is, ie, you can do anything that doesn't involve a shared buffer like an implicit error string. Would it be fine as long as I put a critical section around some non-thread safe part?

3) Does the run time library use thread local storage?  

4) If I did something weird like have a call out from generated code to my code, and my code returned on the same stack but in the context of a different thread than it entered from, would that break anything?

5) Some systems are broken by fibers, ie, if I switched between generated code, each using a different stack, but in the same thread, would that break anything?

6) Are there any pointers on how to use the built in assembler?

Thanks,

Joshua Scholar

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel

Re: Hello and a few questions about using libtcc

Kyryl Melekhin
On Sun, Dec 20, 2020 at 2:03 PM Joshua Scholar <[hidden email]> wrote:

> 1) If I've generated some code into a memory buffer and I've retrieved an entry point with tcc_get_symbol, is it safe to call tcc_delete_state before I call the entry point?  Looking at the source code, it looks like tcc_delete_state deletes some things that you don't need at run-time.
>
> 2) If you use TCC_RELOCATE_AUTO, does tcc ever delete that memory for you?  Does  tcc_delete_state delete the generated code?  I do realize that if I supply a buffer myself, I have to make it executable myself, but then I can also know that I can delete it.
>
> 3) Does the tcc compiler allocate space for a stack?  Or is it that when you call into code it generated, it uses your current stack? Or is there a difference between tcc_run and just calling a function you got back from tcc_get_symbol, where tcc_run allocates its own stack?
>
> 3 a) I noticed that there's a bounds checking version of alloca, but I couldn't understand it.  Does this imply an answer to my previous question, that TCC DOES allocate its own stack? Generally operating systems expand user stacks as needed, up to a point, so bounds checking a stack doesn't make a lot of sense.
>
> 4) Is it possible to reuse a TCCState? What would it do?  Would it trash the code already generated?  Would it remember symbols from previous compiles?
>
> 5) Is the compiler thread safe?  While it would be surprising, I might as well ask if you can compile multiple sources at once.
>
> Now questions about generated code.  The jit I'm hoping to make is for a language that's embarrassingly parallel, so I need to know how the generated code works with threads and stacks and contexts.
>
> 1) The simplest thing, what's the calling convention of generated functions?  On 64 bit windows?  On 64 bit Linux?
>
> 2)  Is the TCC runtime multithread safe?
> a) And what are the details?  Can I run generated code in multiple threads at once?  Does it use locks for anything?
> Is it at least as thread safe as C usually is, ie, you can do anything that doesn't involve a shared buffer like an implicit error string. Would it be fine as long as I put a critical section around some non-thread safe part?
>
> 3) Does the run time library use thread local storage?
>
> 4) If I did something weird like have a call out from generated code to my code, and my code returned on the same stack but in the context of a different thread than it entered from, would that break anything?
>
> 5) Some systems are broken by fibers, ie, if I switched between generated code, each using a different stack, but in the same thread, would that break anything?
>
> 6) Are there any pointers on how to use the built in assembler?
>

I'll do my best to answer these, make sure to reply if any corrections
are necessary.

1. tcc_delete_state is not a thing, you mean tcc_delete? No,
tcc_delete will free everything, if you call it before running
generated code it will crash. But you could free some stuff from
TCCState, just make sure to not call tcc_run_free() if you don't plan
to ever use the state for compiling.

2. TCC_RELOCATE_AUTO means that the TCCState manages the code's
memory, and no, you still have to free it manually by calling
tcc_delete() or tcc_run_free(). The memory pointer is s->runtime_mem

3. No. The program does not create additional stacks. tcc_run creates
a section in memory and calls __init_array_start, just like any normal
C program would when you run it with libc. But because it goes through
the new section init it will allocate the amount of static memory it
needs. So you can expect the static memory to be set 0.

3a Bounds checking code is for debug only, there are special trap
instructions inserted by compiler if you enable it, but it will not
mitigate any program errors, it can detect stack overflow and report
it, but that's the extent. It also checks the memory sections for
buffer overflows.

4. Yes you can reuse it, but it's not going to be efficient because it
will have to recompile the previous stuff also.

5. I'd say no, a compiler is not thread safe unless you run a TCCState
per thread.

-----

1.  Read the code in gfunc_prolog and gfunc_epilog. But I am confused by this
because x64 only has one calling convention, so I don't understand why
it matters except for i386 really. Windows is like the only exception
that does not follow the abi, but the difference is only register
order. Convention is fastcall

2. There is one semaphore in the code, but all I can tell you is that
it's there for some specific reason which I can't recall

3. No

4. I don't see any problems doing that. Stack memory is generally
thread safe and reentrant.

5. No idea what that means, probably no though

6. Use keywords like asm() or __asm() or __asm__() and only GAS
assembly syntax is supported.


Re: Hello and a few questions about using libtcc

Michael Matz
Hello,

In addition to the answers already given by Kyryl:

On Sun, 20 Dec 2020, Joshua Scholar wrote:

> Now questions about generated code.  The jit I'm hoping to make is for a
> language that's embarrassingly parallel, so I need to know how the generated
> code works with threads and stacks and contexts.
>
> 1) The simplest thing, what's the calling convention of generated
> functions?  On 64 bit windows?  On 64 bit Linux? 

TCC follows the native calling convention, i.e. msvc on windows and ELF
psABI on linux.

> 2)  Is the TCC runtime multithread safe?

Depends what you mean by runtime:
a) if you mean the code of libtcc itself, i.e. compiler/linker: you can
use several TCCStates from separate threads, but must not use the same
TCCState concurrently.
b) if you mean the code of libtcc1, i.e. support routines sometimes used
by the generated code: then, yes, that's thread-safe.  (It's also very
minimal, there's not much in terms of support code necessary).

> a) And what are the details?  Can I run generated code in multiple threads
> at once?  Does it use locks for anything?

Yes, the generated code is as thread-safe as the input C code is.  No it
doesn't use locks for anything (the runtime support for boundschecking
uses locks, but that's for debugging purposes).

> Is it at least as thread safe as C usually is, ie, you can do anything that
> doesn't involve a shared buffer like an implicit error string. Would it be
> fine as long as I put a critical section around some non-thread safe part?
>
> 3) Does the run time library use thread local storage?  

No.

> 4) If I did something weird like have a call out from generated code to my
> code, and my code returned on the same stack but in the context of a
> different thread than it entered from, would that break anything?

No.  Or, perhaps better said, it would break in the same way when the
generated code would also be your code and not generated by TCC, i.e. TCC
doesn't introduce additional restrictions.  In particular the usual
makecontext/swapcontext way of implementing lightweight threads via stack
switching should work just fine, as should any more unusual way of
switching threads but not stack (what is that even supposed to mean?), as
long as the input code doesn't have any problem if it had been written
literally without TCC involvement.

> 5) Some systems are broken by fibers, ie, if I switched between generated
> code, each using a different stack, but in the same thread, would that break
> anything?

See above, stack switching shouldn't be affected by TCC-generated code, if
the input C source isn't.
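
As a rough illustration of the makecontext/swapcontext pattern mentioned above, here is a minimal sketch in plain C. The function `worker` stands in for a routine that would really be obtained from tcc_get_symbol; that substitution, the stack size, and all names are assumptions made for the example. It relies on POSIX `<ucontext.h>` being available on the host.

```c
#include <stdlib.h>
#include <ucontext.h>

#define CO_STACK_SIZE (64 * 1024)   /* arbitrary size chosen for the sketch */

static ucontext_t main_ctx, co_ctx;
static int co_value;   /* value handed back to the scheduler at each switch */

/* Stand-in for generated code: runs on its own stack and yields twice by
   switching back to the scheduler instead of returning. */
static void worker(void)
{
    co_value = 1;
    swapcontext(&co_ctx, &main_ctx);   /* first yield */
    co_value = 2;
    swapcontext(&co_ctx, &main_ctx);   /* second yield */
    co_value = 3;
    /* returning from here resumes uc_link, i.e. main_ctx */
}

/* Acts as the scheduler: drives the worker to completion on a separate
   stack and returns the sum of the values observed, or -1 on failure. */
static int run_worker(void)
{
    int sum = 0;
    char *stack = malloc(CO_STACK_SIZE);
    if (!stack)
        return -1;

    if (getcontext(&co_ctx) != 0) {
        free(stack);
        return -1;
    }
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = CO_STACK_SIZE;
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, worker, 0);

    swapcontext(&main_ctx, &co_ctx);   /* run until the first yield */
    sum += co_value;
    swapcontext(&main_ctx, &co_ctx);   /* run until the second yield */
    sum += co_value;
    swapcontext(&main_ctx, &co_ctx);   /* worker finishes, uc_link brings us back */
    sum += co_value;

    free(stack);
    return sum;
}
```

Nothing in this sketch depends on which compiler produced `worker`, which is the point made above: stack switching is a property of the host code and the input C source, not of TCC.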

> 6) Are there any pointers on how to use the built in assembler?

It's the GCC builtin assembler syntax, but only for i386/x86-64, and with
some limits.  The limits aren't documented and subject to change if more
things are needed.  As for usage documentation, have a look at GCC's docs:

https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Using-Assembly-Language-with-C.html
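
For readers unfamiliar with the syntax, a small example of GCC-style extended inline assembly in GAS (AT&T) operand order might look like the sketch below. The function name `add_asm` is made up for illustration, and whether every constraint used here falls within TCC's undocumented limits has not been verified; the example only follows the GCC documentation linked above.

```c
/* Adds two ints with GCC-style extended inline assembly (AT&T/GAS syntax).
   Falls back to plain C on non-x86 targets so the file still builds. */
static int add_asm(int a, int b)
{
#if defined(__x86_64__) || defined(__i386__)
    int result = a;
    __asm__ ("addl %1, %0"      /* result += b (source first, destination second) */
             : "+r" (result)    /* %0: read-write register operand */
             : "r" (b)          /* %1: input register operand */
             : "cc");           /* the add modifies the flags register */
    return result;
#else
    return a + b;
#endif
}
```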


Ciao,
Michael.

Re: Hello and a few questions about using libtcc

Joshua Scholar




> TCC follows the native calling convention, i.e. msvc on windows and ELF
> psABI on linux.


Thank you for answering my question thoroughly.


> 4) If I did something weird like have a call out from generated code to my
> code, and my code returned on the same stack but in the context of a
> different thread than it entered from, would that break anything?

> No.  Or, perhaps better said, it would break in the same way when the
> generated code would also be your code and not generated by TCC, i.e. TCC
> doesn't introduce additional restrictions.  In particular the usual
> makecontext/swapcontext way of implementing lightweight threads via stack
> switching should work just fine, as should any more unusual way of
> switching threads but not stack (what is that even supposed to mean?), as
> long as the input code doesn't have any problem if it had been written
> literally without TCC involvement.

"switching threads but not stack (what is that even supposed to mean?)"  

What it means is this: imagine that a scheduler outside the generated code created a new stack, switched to it, then called some generated code.
At some point that generated code, rather than returning to the scheduler, yielded by making a call into my own code which, instead of returning, just saved the context/continuation somewhere and switched back to its native stack.

Then later on, a scheduler running on a different thread, with a different thread-local state for instance, takes that stack, switches to it, and returns into the generated code, appearing to return from the call, but in a different thread.

Any code that used thread local memory might break.



I assume that tcc_compile_string is equivalent to tcc_add_file.  Does that mean that you can add multiple strings to be compiled?  Does tcc copy them, or does it compile them immediately and forget them, or do the original buffers have to be retained?

Kyryl's  answers brought up a bunch more questions for me.

For instance, he said "tcc_delete will free everything, if you call it before running generated code it will crash."

So I wonder, 
1) when is the run time library loaded, when is it initialized and is it ever freed or finalized?

If a jit made a different TCCState for each routine it compiles, say 1000 routines, 

a) would libtcc load 1000 copies of the runtime library?

b) would it make a static variable section for the runtime library 1000 times and initialize the variables in it?

c) would it make a different heap for each routine?  If I call malloc in one routine and then free it in another routine would it try to free it into a different heap and corrupt a heap?

d) if the jit wanted to update a routine with different code, so it called tcc_delete on that code's state, then made a new state and compiled new code for it, would that break anything?  Would tcc_delete finalize anything?  Delete a heap?

i.e., is libtcc really designed to be used the way a jit would use it?

He also said that it's possible to reuse a TCCState: "Yes you can reuse it, but it's not going to be efficient because it
will have to recompile the previous stuff also."

This is all very ambiguous.  

2 What state is remembered from one tcc_relocate to the next?  
a) does it remember that I called tcc_set_output_type,  tcc_add_library_path, tcc_add_symbol, tcc_add_file, or tcc_compile_string?  If it remembers tcc_compile_string, did it save a copy of the string?

3 And since he said I'd have to recompile the previous things, what happened to the previously compiled code?  
a) Was it deleted?  What if I had supplied my own buffer?  
b) is the problem one I asked about before: Question 1's run time libraries or heaps, or Question 2's saving tcc_add_file or tcc_compile_string from last time?

It sure would be cool if it turns out that libtcc loaded and initialized its runtime library before any TCCState is made, and that runtime library is shared between all states.  That would be what people who make jits want.
But I'm still worried by the answer that you can't run code after a TCCState is deleted.

Thanks to everyone who answers,

Joshua Scholar


Re: Hello and a few questions about using libtcc

Michael Matz
Hello,

On Sun, 20 Dec 2020, Joshua Scholar wrote:

>>> 4) If I did something weird like have a call out from generated code to
>> my
>>> code, and my code returned on the same stack but in the context of a
>>> different thread than it entered from, would that break anything?
>>
>> No.  Or, perhaps better said, it would break in the same way when the
>> generated code would also be your code and not generated by TCC, i.e. TCC
>> doesn't introduce additional restrictions.  In particular the usual
>> makecontext/swapcontext way of implementing lightweight threads via stack
>> switching should work just fine, as should any more unusual way of
>> switching threads but not stack (what is that even supposed to mean?), as
>> long as the input code doesn't have any problem if it had been written
>> literally without TCC involvement.
>>
>> "switching threads but not stack (what is that even supposed to mean?)"
>
> What it means is this, imagine that a scheduler outside the generated code
> created a new stack, switched to it then called some generated code.
> And at some point that generated code rather than returning to the
> scheduler yielded by making a call into my own code which, instead of
> returning, just saved the context/continuation somewhere and switched back
> to its native stack.
>
> Then later on, a scheduler running on a different thread with a different
> thread local state, for instance, takes that stack and switches to it and
> returns into the generated code, appearing to
> return from the call, but in a different thread.

Yeah, so the classic makecontext stack switching/co-routines.  The above
does switch stacks, which is why I was confused by your saying you were not
doing that.  Just a misunderstanding.

> Any code that used thread local memory might break.

Yeah, and no, TCC is not doing any of that; the code it generates uses
exactly the features that the input code uses as if it were compiled by a
normal ahead-of-time C compiler.  So if your to-be-compiled source
snippets are free of effects breaking the above use case then the code
generated by TCC is free of them as well.
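
To make the thread-local concern concrete, here is a minimal sketch in plain C11 with pthreads. It does not perform real stack switching; it only shows that a `_Thread_local` variable written on one thread is not visible from another, which is what would bite input code that is resumed on a different thread. All names are invented for the example.

```c
#include <pthread.h>
#include <stddef.h>

/* One copy per thread; every thread starts from the initializer 0. */
static _Thread_local int tls_counter;

/* Runs on the second thread and reports that thread's own copy. */
static void *read_on_other_thread(void *out)
{
    *(int *)out = tls_counter;
    return NULL;
}

/* Writes the thread-local on the calling thread, then reads it from a
   second thread.  Returns 1 if the two threads saw different values,
   0 if they saw the same value, -1 if the thread could not be started. */
static int tls_differs_across_threads(void)
{
    pthread_t t;
    int seen_elsewhere = -1;

    tls_counter = 42;
    if (pthread_create(&t, NULL, read_on_other_thread, &seen_elsewhere) != 0)
        return -1;
    pthread_join(t, NULL);
    return seen_elsewhere != tls_counter;
}
```

If the source snippets handed to TCC avoid constructs like `tls_counter`, then by the reasoning above the generated code has no hidden per-thread state of its own to get confused by the migration.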

> I assume that tcc_compile_string is equivalent to tcc_add_file.  Does that
> mean that you can add multiple strings to be compiled?  Does tcc copy them,
> or does it compile them immediately and forget them, or do the original
> buffers have to be retained?

The string buffer doesn't have to be retained.  TCC generates machine code
into the appropriate buffers, binds them to the host programs (or other
parts of such machine code also added by tcc_compile_string) and that's
it.

> Kyryl's answers brought up a bunch more questions for me.
>
> For instance, he said "tcc_delete will free everything, if you call it
> before running generated code it will crash."
>
> So I wonder,
> 1) when is the run time library loaded, when is it initialized and is it
> ever freed or finalized?

Which run time library?  In a code snippet like
   "int foo(int a, int b) { return a + b; }"
there's nothing else involved than the assembly code containing basically
some moves and an add and return instruction.  No runtime library
involved.  There's a bit of runtime lib code in libtcc1.a which implements
support for some things the compiler relies on that is better expressed
with extra routines: some double-long arithmetics and va_list support,
alloca, and the helpers for bounds-checking.  The code for that lies in
libtcc1.a and is linked into the TCCState via tcc_add_runtime, called by
tcc_relocate_ex in some cases.  This linking also places the resulting
code in the provided buffers (or somewhere into TCCState).  It's deleted
with tcc_delete, like the code from compile_string.


> If a jit made a different TCCState for each routine it compiles, say 1000
> routines,
>
> a) would libtcc load 1000 copies of the runtime library?

Yes.  Basically a TCCState is a complete sandbox separate from each other
(not in a safety sense, but in design; if you have wild writes in one
TCCState it might affect memory that happens to be for another TCCState),
only communicating with the host executable (e.g. to provide a 'bar'
routine for this snippet: "void callbar(void) { bar(); }").  But libtcc1.a
is only loaded as necessary, so if your snippets don't use va_list and
alloca (and you don't use bounds checking) then you don't need any of it
on x86-64 for usual code (i.e. not one involving 128 bit arithmetics).

> b) would it make a static variable section for the runtime library 1000
> times and initialize the variables in it?

Yes.

> c) would it make a different heap for each routine?  If I call malloc in
> one routine and then free it in another routine would it try to free it
> into a different heap and corrupt a heap?

malloc and free aren't provided by TCC, they are provided by your host
program (or rather by the supporting C library linked into it), the
snippets, when calling malloc will use those routines.  So, if those are
thread-safe (they are in all but the most basic C systems) then all is
safe.  All TCCStates (and the routines therein) will use the same heap.

> d) if the jit wanted to update a routine with different code, so it called
> tcc_delete on that code's state, then made a new state and compiled new
> code for it, would that break anything?  Would tcc_delete finalize
> anything?  Delete a heap?

Nope.  Everything allocated (e.g. via malloc) by code snippets stays
allocated after tcc_delete (probably then causing a leak as you also lose
all data information, like where that malloced block was).  So you would
need to make sure that all resources are freed in the snippets before
calling tcc_delete, e.g. by providing a finalizer routine yourself (and
calling it!).
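
A snippet that owns heap memory could export such a finalizer along the lines of the sketch below. This is the source that would be handed to tcc_compile_string, not host code; the names `snippet_init`, `snippet_fini` and `snippet_is_live` are hypothetical, and the host is assumed to look up `snippet_fini` with tcc_get_symbol and call it before tcc_delete.

```c
#include <stdlib.h>
#include <string.h>

/* Memory owned by this snippet.  The host cannot see this pointer, so it
   has to call snippet_fini() before discarding the compiled code. */
static char *snippet_buf;

/* Allocates the snippet's working buffer from the host's heap.
   Returns 0 on success, -1 if the allocation fails. */
int snippet_init(size_t size)
{
    snippet_buf = malloc(size);
    if (!snippet_buf)
        return -1;
    memset(snippet_buf, 0, size);
    return 0;
}

/* Finalizer: releases everything the snippet allocated.  Calling it a
   second time is harmless because free(NULL) is a no-op. */
void snippet_fini(void)
{
    free(snippet_buf);
    snippet_buf = NULL;
}

/* Reports whether the snippet currently holds a buffer (1) or not (0). */
int snippet_is_live(void)
{
    return snippet_buf != NULL;
}
```

Because malloc and free come from the host's C library, the same heap is used no matter which TCCState the snippet was compiled in; only the bookkeeping pointer `snippet_buf` disappears with the state.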

About changes to an existing TCCState: it's quite probable that this
currently doesn't work, I honestly don't know.  In particular such
sequence:

    TCCState *s = tcc_new();
    tcc_compile_string(s, "...");
    tcc_relocate(s, ... buf ...);
    foo = tcc_get_symbol(s, "foo");
    // do something with foo, up to here everything is clear
    // now comes the part where I'm sure there are bugs/unsupported behavior:
    tcc_compile_string(s, " ... something else ..."); // uhh, changing a relocated state?
    tcc_relocate(s, ....); // finalize only new stuff ???
    foo2 = tcc_get_symbol(s, "foo2");
    // do something with foo2
    ...
    tcc_delete(s);

Even if the above happens to work (i.e. two relocate calls, where the
second would affect only the added snippets, or at least doesn't destroy
the old snippets), which I doubt, then you certainly will run into
problems when the second snippet tries to override symbol "foo" (from the
first snippet) and expects that even old snippets calling old foo will
then call new foo.

> i.e., is libtcc really designed to be used the way a jit would use it?

Depends.  If you expect that the jit can regularly replace existing
functions with new versions transparently, then no.  TCC could be extended
to do so, but it's not there as of now.  At the very least there needs to
be some way of unrolling the process of relocation and some symbol table
trickery for the symbol replacements.

> He also said that it's possible to reuse a TCCState but "Yes you can reuse
> it, but it's not going to be efficient because it
> will have to recompile the previous stuff also."

Well, to get around the above problem of not being able to replace
functions transparently you could use a scheme where you remember all
strings fed into tcc_compile_string, with some meta-info (like this string
was for function "foo", and order of addition).  Then, in order to replace
function "foo" you would replace that string, and then feed all the
collected strings (all old ones, except the old-foo string, now replaced
with new-foo) into a new TCCState.  You end up with a state equivalent to
the old one, except for the replaced foo function.  Of course the
addresses of stuff are all different, so you would have to refetch e.g. all
symbols that you were fetching from the old state.

TCC is extremely fast in generating code, but the above process,
especially if there's much non-changing code might be too slow in the end.
You would have to try.
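
The bookkeeping side of that scheme could be sketched as follows. Everything here is hypothetical host code: the registry remembers each source string under the name of the function it defines, replaces a string in place when the name is already known, and replays all strings in their original order through a callback. In real use the callback would wrap tcc_compile_string on a fresh TCCState; the `log_source` stand-in below only records what it is given so the sketch stays independent of libtcc.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_SNIPPETS 64   /* arbitrary limit for the sketch */

/* One remembered source string plus the name of the function it defines. */
struct snippet { char *name; char *source; };
struct registry { struct snippet items[MAX_SNIPPETS]; size_t count; };

/* Hypothetical callback: compiles one source string into `state`. */
typedef int (*compile_fn)(void *state, const char *source);

static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}

/* Adds a snippet, or replaces the source of an existing one with the same
   name while keeping its position.  Returns 0 on success, -1 on failure. */
static int registry_put(struct registry *r, const char *name, const char *source)
{
    char *copy = dup_str(source);
    if (!copy)
        return -1;
    for (size_t i = 0; i < r->count; i++) {
        if (strcmp(r->items[i].name, name) == 0) {
            free(r->items[i].source);
            r->items[i].source = copy;
            return 0;
        }
    }
    char *name_copy = (r->count < MAX_SNIPPETS) ? dup_str(name) : NULL;
    if (!name_copy) {
        free(copy);
        return -1;
    }
    r->items[r->count].name = name_copy;
    r->items[r->count].source = copy;
    r->count++;
    return 0;
}

/* Replays every remembered string, in order of addition, into `state`. */
static int registry_rebuild(const struct registry *r, compile_fn compile, void *state)
{
    for (size_t i = 0; i < r->count; i++)
        if (compile(state, r->items[i].source) != 0)
            return -1;
    return 0;
}

static void registry_free(struct registry *r)
{
    for (size_t i = 0; i < r->count; i++) {
        free(r->items[i].name);
        free(r->items[i].source);
    }
    r->count = 0;
}

/* Stand-in "compiler" that just concatenates the sources it receives. */
struct source_log { char text[256]; size_t len; };

static int log_source(void *state, const char *source)
{
    struct source_log *log = state;
    size_t n = strlen(source);
    if (log->len + n >= sizeof log->text)
        return -1;
    memcpy(log->text + log->len, source, n + 1);
    log->len += n;
    return 0;
}
```

After each rebuild the host would still have to refetch every symbol it uses, since addresses in the new state differ from those in the old one.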

> 2 What state is remembered from one tcc_relocate to the next?
> a) does it remember that I
> called tcc_set_output_type,  tcc_add_library_path, tcc_add_symbol,
> tcc_add_file,
> or tcc_compile_string?  If it remembers tcc_compile_string, did it save a
> copy of the string?

As said above, calling tcc_relocate twice on the same state is probably
not going to work right now, so the correct answer would be "mu".  But the
info of output_type and added libraries would still be there.  The "info"
(i.e. code/data/addresses) from added libs, files and symbols would still
be there, but in a relocated/finalized fashion, in such way that
relocating/finalizing it again would mangle it.  And no, compile_string
doesn't remember its argument anywhere.

> 3 And since he said I'd have to recompile the previous things, what
> happened to the previously compiled code?

Also mu.

> It sure would be cool if it turns out that libtcc loaded and initialized
> its runtime library before any TCCState is made, and that runtime
> library is shared between all states.  That would be what people who
> make jits want. But I'm still worried by the answer that you can't run
> code after a TCCState is deleted.

The runtime lib is really the smallest thing to worry about.  What's
probably the bigger issue for you right now would be that a TCCState can
be finalized only once safely.

It's actually probably not too much work to make TCCState into one where
you can repeatedly call compile_string and relocate in a mixed way (and
even override functions/data), it's just that no one invested the time to
do that.


Ciao,
Michael.
