Hi. I just joined this list.
It feels a bit rude to just barge in with questions, but here I go. I would like to use libtcc to implement a JIT, so I have a few questions: first a couple about the library's API, and then a couple about the generated code.

On the API:

1) If I've generated some code into a memory buffer and I've retrieved an entry point with tcc_get_symbol, is it safe to call tcc_delete_state before I call the entry point? Looking at the source code, it looks like tcc_delete_state deletes some things that you don't need at run time.

2) If you use TCC_RELOCATE_AUTO, does tcc ever delete that memory for you? Does tcc_delete_state delete the generated code? I do realize that if I supply a buffer myself, I have to make it executable myself, but then I also know that I can delete it.

3) Does the tcc compiler allocate space for a stack? Or does code it generated simply use your current stack when you call into it? Or is there a difference between tcc_run and just calling a function you got back from tcc_get_symbol, where tcc_run allocates its own stack?

3 a) I noticed that there's a bounds-checking version of alloca, but I couldn't understand it. Does this imply an answer to my previous question, that TCC DOES allocate its own stack? Operating systems generally expand user stacks as needed, up to a point, so bounds checking a stack doesn't make a lot of sense.

4) Is it possible to reuse a TCCState? What would it do? Would it trash the code already generated? Would it remember symbols from previous compiles?

5) Is the compiler thread safe? While it would be surprising, I might as well ask whether you can compile multiple sources at once.

Now questions about the generated code. The JIT I'm hoping to make is for a language that's embarrassingly parallel, so I need to know how the generated code works with threads and stacks and contexts.

1) The simplest thing: what's the calling convention of generated functions? On 64-bit Windows? On 64-bit Linux?

2) Is the TCC runtime multithread safe?
a) And what are the details? Can I run generated code in multiple threads at once? Does it use locks for anything? Is it at least as thread safe as C usually is, i.e. you can do anything that doesn't involve a shared buffer like an implicit error string? Would it be fine as long as I put a critical section around some non-thread-safe part?

3) Does the run-time library use thread-local storage?

4) If I did something weird like have a call out from generated code to my code, and my code returned on the same stack but in the context of a different thread than it entered from, would that break anything?

5) Some systems are broken by fibers, i.e. if I switched between pieces of generated code, each using a different stack, but in the same thread, would that break anything?

6) Are there any pointers on how to use the built-in assembler?

Thanks,
Joshua Scholar

_______________________________________________
Tinycc-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
On Sun, Dec 20, 2020 at 2:03 PM Joshua Scholar <[hidden email]> wrote:
> 1) If I've generated some code into a memory buffer and I've retrieved an entry point with tcc_get_symbol, is it safe to call tcc_delete_state before I call the entry point? Looking at the source code, it looks like tcc_delete_state deletes some things that you don't need at run-time.
>
> 2) If you use TCC_RELOCATE_AUTO, does tcc ever delete that memory for you? Does tcc_delete_state delete the generated code? I do realize that if I supply a buffer myself, I have to make it executable myself, but then I can also know that I can delete it.
>
> 3) Does the tcc compiler allocate space for a stack? Or is it that when you call into code it generated, it uses your current stack? Or is there a difference between tcc_run and just calling a function you got back from tcc_get_symbol, where tcc_run allocates its own stack?
>
> 3 a) I noticed that there's a bounds checking version of alloca, but I couldn't understand it. Does this imply an answer to my previous question, that TCC DOES allocate its own stack? Generally operating systems expand user stacks as needed, up to a point, so bounds checking a stack doesn't make a lot of sense.
>
> 4) Is it possible to reuse a TCCState? What would it do? Would it trash the code already generated? Would it remember symbols from previous compiles?
>
> 5) Is the compiler thread safe? While it would be surprising, I might as well ask if you can compile multiple sources at once.
>
> Now questions about generated code. The jit I'm hoping to make is for a language that's embarrassingly parallel, so I need to know how the generated code works with threads and stacks and contexts.
>
> 1) The simplest thing, what's the calling convention of generated functions? On 64 bit windows? On 64 bit Linux?
>
> 2) Is the TCC runtime multithread safe?
> a) And what are the details? Can I run generated code in multiple threads at once? Does it use locks for anything?
> Is it at least as thread safe as C usually is, ie, you can do anything that doesn't involve a shared buffer like an implicit error string. Would it be fine as long as I put a critical section around some non-thread safe part?
>
> 3) Does the run time library use thread local storage?
>
> 4) If I did something weird like have a call out from generated code to my code, and my code returned on the same stack but in the context of a different thread than it entered from, would that break anything?
>
> 5) Some systems are broken by fibers, ie, if I switched between generated code, each using a different stack, but in the same thread, would that break anything?
>
> 6) Are there any pointers on how to use the built in assembler?
>

I'll do my best to answer these; please reply if any corrections are necessary.

1. tcc_delete_state is not a thing, you mean tcc_delete? No, tcc_delete will free everything, if you call it before running generated code it will crash. But you could free some stuff from the TCCState; just make sure not to call tcc_run_free() if you don't plan to ever use the state for compiling.

2. TCC_RELOCATE_AUTO means that the TCCState manages the code's memory, and no, you still have to free it manually by calling tcc_delete() or tcc_run_free(). The memory pointer is s->runtime_mem.

3. No. The program does not create additional stacks. tcc_run creates a section in memory and calls __init_array_start, just like any normal C program would when you run it with libc. But because it goes through the new section init, it will allocate the amount of static memory it needs. So you can expect the static memory to be set to 0.

3a. Bounds-checking code is for debugging only. If you enable it, the compiler inserts special trap instructions, but it will not mitigate any program errors: it can detect a stack overflow and report it, but that's the extent. It also checks the memory sections for buffer overflows.

4. Yes you can reuse it, but it's not going to be efficient because it will have to recompile the previous stuff also.

5. I'd say no, a compiler is not thread safe unless you run a TCCState per thread.

-----

1. Read the code in gfunc_prolog and gfunc_epilog. But I am confused by this because x64 only has one calling convention, so I don't understand why it matters except for i386, really. Windows is like the only exception that does not follow the ABI, but the difference is only register order. The convention is fastcall.

2. There is one semaphore in the code, but all I can tell you is that it's there for some specific reason which I can't recall.

3. No.

4. I don't see any problems doing that. Stack memory is generally thread safe and reentrant.

5. No idea what that means, probably no though.

6. Use keywords like asm() or __asm() or __asm__(); only GAS assembly syntax is supported.
In reply to this post by Joshua Scholar
Hello,
In addition to the answers already given by Kyryl:

On Sun, 20 Dec 2020, Joshua Scholar wrote:

> Now questions about generated code. The jit I'm hoping to make is for a
> language that's embarrassingly parallel, so I need to know how the generated
> code works with threads and stacks and contexts.
>
> 1) The simplest thing, what's the calling convention of generated
> functions? On 64 bit windows? On 64 bit Linux?

TCC follows the native calling convention, i.e. MSVC on Windows and the ELF psABI on Linux.

> 2) Is the TCC runtime multithread safe?

Depends on what you mean by runtime:

a) If you mean the code of libtcc itself, i.e. the compiler/linker: you can use several TCCStates from separate threads, but must not use the same TCCState concurrently.

b) If you mean the code of libtcc1, i.e. the support routines sometimes used by the generated code: then yes, that's thread-safe. (It's also very minimal; there's not much in terms of support code necessary.)

> a) And what are the details? Can I run generated code in multiple threads
> at once? Does it use locks for anything?

Yes, the generated code is as thread-safe as the input C code is. No, it doesn't use locks for anything (the runtime support for bounds checking uses locks, but that's for debugging purposes).

> Is it at least as thread safe as C usually is, ie, you can do anything that
> doesn't involve a shared buffer like an implicit error string. Would it be
> fine as long as I put a critical section around some non-thread safe part?
>
> 3) Does the run time library use thread local storage?

No.

> 4) If I did something weird like have a call out from generated code to my
> code, and my code returned on the same stack but in the context of a
> different thread than it entered from, would that break anything?

No. Or, perhaps better said, it would break in the same way as if the generated code were also your code and not generated by TCC, i.e. TCC doesn't introduce additional restrictions.
In particular the usual makecontext/swapcontext way of implementing lightweight threads via stack switching should work just fine, as should any more unusual way of switching threads but not stack (what is that even supposed to mean?), as long as the input code wouldn't have any problem if it had been written literally without TCC involvement.

> 5) Some systems are broken by fibers, ie, if I switched between generated
> code, each using a different stack, but in the same thread, would that break
> anything?

See above: stack switching shouldn't be affected by TCC-generated code if the input C source isn't.

> 6) Are there any pointers on how to use the built in assembler?

It's the GCC builtin assembler syntax, but only for i386/x86-64, and with some limits. The limits aren't documented and are subject to change if more things are needed. For usage documentation have a look at GCC's documentation:
https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Using-Assembly-Language-with-C.html

Ciao,
Michael.
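As a small illustration of the GCC-style syntax mentioned in answer 6, here is a minimal sketch of an extended asm statement in GAS (AT&T) syntax. The function name `asm_add` is hypothetical. The statement uses only basic register and matching constraints, but since TCC's limits are undocumented, whether a given TCC build accepts it is an assumption to check. A plain C fallback is included for targets other than i386/x86-64.

```c
#include <assert.h>

/* Adds two ints with GCC-style extended inline assembly (GAS syntax).
   Operand %0 is the output, tied to input `a` through the "0" matching
   constraint; %2 is `b` in any general-purpose register. */
static int asm_add(int a, int b)
{
#if defined(__x86_64__) || defined(__i386__)
    int result;
    __asm__ ("addl %2, %0"
             : "=r"(result)       /* output */
             : "0"(a), "r"(b)     /* inputs */
             : "cc");             /* the add clobbers the flags */
    return result;
#else
    return a + b;                 /* fallback: no inline asm on this target */
#endif
}
```

Calling `asm_add(2, 3)` behaves like ordinary addition; the point is only the shape of the `__asm__` statement with its output, input and clobber lists.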
Thank you for answering my question thoroughly.
"switching threads but not stack (what is that even supposed to mean?)" What it means is this, imagine that a scheduler outside the generated code created a new stack, switched to it then called some generated code. And at some point that generated code rather than returning to the scheduler yielded by making a call into my own code which, instead of returning, just saved the context/continuation somewhere and switched back to its native stack. Then later on, a scheduler running on a different thread with a different thread local state, for instance, takes that stack and switches to it and returns into the generated code, appearing to return from the call, but in a different thread. Any code that used thread local memory might break.
I assume that tcc_compile_string is equivalent to tcc_add_file. Does that mean that you can add multiple strings to be compiled? Does tcc copy them, or does it compile them immediately and forget them, or do the original buffers have to be retained?

Kyryl's answers brought up a bunch more questions for me.

For instance, he said "tcc_delete will free everything, if you call it before running generated code it will crash."

So I wonder:

1) When is the run-time library loaded, when is it initialized, and is it ever freed or finalized?

If a JIT made a different TCCState for each routine it compiles, say 1000 routines,

a) would tcclib load 1000 copies of the runtime library?

b) would it make a static variable section for the runtime library 1000 times and initialize the variables in it?

c) would it make a different heap for each routine? If I call malloc in one routine and then free it in another routine, would it try to free it into a different heap and corrupt a heap?

d) if the JIT wanted to update a routine with different code, so it called tcc_delete on that code's state, then made a new state and compiled new code for it, would that break anything? Would tcc_delete finalize anything? Delete a heap?

I.e., is tcclib really designed to be used the way a JIT would use it?

He also said that it's possible to reuse a TCCState, but "Yes you can reuse it, but it's not going to be efficient because it will have to recompile the previous stuff also." This is all very ambiguous.

2) What state is remembered from one tcc_relocate to the next?

a) Does it remember that I called tcc_set_output_type, tcc_add_library_path, tcc_add_symbol, tcc_add_file, or tcc_compile_string? If it remembers tcc_compile_string, did it save a copy of the string?

3) And since he said I'd have to recompile the previous things, what happened to the previously compiled code?

a) Was it deleted? What if I had supplied my own buffer?

b) Is the problem one I asked about before: Question 1's run-time libraries and heaps, or Question 2's saving of tcc_add_file or tcc_compile_string from last time?

It sure would be cool if it turns out that libtcc loaded and initialized its runtime library before any TCCState is made, and that the runtime library is shared between all states. That would be what people who make JITs want. But I'm still worried by the answer that you can't run code after a TCCState is deleted.

Thanks to everyone who answers,
Joshua Scholar
Hello,
On Sun, 20 Dec 2020, Joshua Scholar wrote:

>>> 4) If I did something weird like have a call out from generated code to my code, and my code returned on the same stack but in the context of a different thread than it entered from, would that break anything?
>>
>> No. Or, perhaps better said, it would break in the same way when the generated code would also be your code and not generated by TCC, i.e. TCC doesn't introduce additional restrictions. In particular the usual makecontext/swapcontext way of implementing lightweight threads via stack switching should work just fine, as should any more unusual way of switching threads but not stack (what is that even supposed to mean?), as long as the input code doesn't have any problem if it had been written literally without TCC involvement.
>>
>> "switching threads but not stack (what is that even supposed to mean?)"
>
> What it means is this, imagine that a scheduler outside the generated code created a new stack, switched to it then called some generated code. And at some point that generated code rather than returning to the scheduler yielded by making a call into my own code which, instead of returning, just saved the context/continuation somewhere and switched back to its native stack.
>
> Then later on, a scheduler running on a different thread with a different thread local state, for instance, takes that stack and switches to it and returns into the generated code, appearing to return from the call, but in a different thread.

Yeah, so the classic makecontext stack switching/co-routines. The above does switch stacks, which is why I was confused by your saying you were not doing that. Just a misunderstanding.

> Any code that used thread local memory might break.

Yeah, and no, TCC is not doing any of that; the code it generates uses exactly the features that the input code uses, as if it were compiled by a normal ahead-of-time C compiler.
So if your to-be-compiled source snippets are free of effects breaking the above use case, then the code generated by TCC is free of them as well.

> I assume that tcc_compile_string is equivalent to tcc_add_file. Does that mean that you can add multiple strings to be compiled? Does tcc copy them, or does it compile them immediately and forget them, or do the original buffers have to be retained?

The string buffer doesn't have to be retained. TCC generates machine code into the appropriate buffers, binds it to the host program (or to other parts of such machine code also added by tcc_compile_string), and that's it.

> Kyryl's answers brought up a bunch more questions for me.
>
> For instance, he said "tcc_delete will free everything, if you call it before running generated code it will crash."
>
> So I wonder,
> 1) when is the run time library loaded, when is it initialized and is it ever freed or finalized?

Which run-time library? In a code snippet like "int foo(int a, int b) { return a + b; }" there's nothing involved other than the assembly code, containing basically some moves, an add and a return instruction. No runtime library is involved.

There's a bit of runtime lib code in libtcc1.a, which implements support for some things the compiler relies on that are better expressed with extra routines: some double-long arithmetic and va_list support, alloca, and the helpers for bounds checking. The code for that lies in libtcc1.a and is linked into the TCCState via tcc_add_runtime, called by tcc_relocate_ex in some cases. This linking also places the resulting code in the provided buffers (or somewhere in the TCCState). It's deleted with tcc_delete, like the code from compile_string.

> If a jit made a different TCCState for each routine it compiles, say 1000 routines,
>
> a) would tcclib load 1000 copies of the runtime library?

Yes.
Basically each TCCState is a complete sandbox separate from the others (not in a safety sense, but in design: if you have wild writes in one TCCState it might affect memory that happens to belong to another TCCState), only communicating with the host executable (e.g. to provide a 'bar' routine for this snippet: "void callbar(void) { bar(); }"). But libtcc1.a is only loaded as necessary, so if your snippets don't use va_list and alloca (and you don't use bounds checking), then you don't need any of it on x86-64 for usual code (i.e. code not involving 128-bit arithmetic).

> b) would it make a static variable section for the runtime library 1000 times and initialize the variables in it?

Yes.

> c) would it make a different heap for each routine? If I call malloc in one routine and then free it in another routine would it try to free it into a different heap and corrupt a heap?

malloc and free aren't provided by TCC; they are provided by your host program (or rather by the supporting C library linked into it), and the snippets will use those routines when calling malloc. So, if those are thread-safe (they are in all but the most basic C systems), then all is safe. All TCCStates (and the routines therein) will use the same heap.

> d) if the jit wanted to update a routine with different code, so it called tcc_delete on that code's state, then made a new state and compiled new code for it, would that break anything? Would tcc_delete finalize anything? Delete a heap?

Nope. Everything allocated (e.g. via malloc) by code snippets stays allocated after tcc_delete (probably then causing a leak, as you also lose all data information, like where that malloced block was). So you would need to make sure that all resources are freed in the snippets before calling tcc_delete, e.g. by providing a finalizer routine yourself (and calling it!).

About changes to an existing TCCState: it's quite probable that this currently doesn't work; I honestly don't know.
In particular a sequence such as this:

    TCCState *s = tcc_new();
    tcc_compile_string(s, "...");
    tcc_relocate(s, ... buf ...);
    foo = tcc_get_symbol(s, "foo");
    // do something with foo, up to here everything is clear
    // now comes the part where I'm sure there are bugs/unsupportedness:
    tcc_compile_string(s, " ... something else ...");  // uhh, changing a relocated state?
    tcc_relocate(s, ....);                             // finalize only new stuff ???
    foo2 = tcc_get_symbol(s, "foo2");
    // do something with foo2 ...
    tcc_delete(s);

Even if the above happens to work (i.e. two relocate calls, where the second would affect only the added snippets, or at least not destroy the old snippets), which I doubt, you will certainly run into problems when the second snippet tries to override the symbol "foo" (from the first snippet) and expects that even old snippets calling the old foo will then call the new foo.

> ie., is tcclib really designed to be used the way a jit would use it?

Depends. If you expect that the JIT can regularly replace existing functions with new versions transparently, then no. TCC could be extended to do so, but it's not there as of now. At the very least there needs to be some way of unrolling the process of relocation, and some symbol table trickery for the symbol replacements.

> He also said that it's possible to reuse a TCCState but "Yes you can reuse it, but it's not going to be efficient because it will have to recompile the previous stuff also."

Well, to get around the above problem of not being able to replace functions transparently, you could use a scheme where you remember all strings fed into tcc_compile_string, with some meta-info (like: this string was for function "foo", and the order of addition). Then, in order to replace function "foo", you would replace that string and feed all the collected strings (all the old ones, except the old-foo string, now replaced with new-foo) into a new TCCState. You end up with a state equivalent to the old one, except for the replaced foo function.
Of course the addresses of everything are all different, so you would have to refetch e.g. all symbols that you had fetched from the old state. TCC is extremely fast at generating code, but the above process might be too slow in the end, especially if there's a lot of non-changing code. You would have to try.

> 2 What state is remembered from one tcc_relocate to the next?
> a) does it remember that I called tcc_set_output_type, tcc_add_library_path, tcc_add_symbol, tcc_add_file, or tcc_compile_string? If it remembers tcc_compile_string, did it save a copy of the string?

As said above, calling tcc_relocate twice on the same state is probably not going to work right now, so the correct answer would be "mu". But the info about output_type and added libraries would still be there. The "info" (i.e. code/data/addresses) from added libs, files and symbols would still be there too, but in a relocated/finalized form, such that relocating/finalizing it again would mangle it. And no, compile_string doesn't remember its argument anywhere.

> 3 And since he said I'd have to recompile the previous things, what happened to the previously compiled code?

Also mu.

> It sure would be cool if it turns out that libtcc loaded and initialized its runtime library before any TCCState is made, and that runtime library is shared between all states. That would be what people who make jits want. But I'm still worried by the answer that you can't run code after a TCCState is deleted.

The runtime lib is really the smallest thing to worry about. The bigger issue for you right now is probably that a TCCState can be finalized only once safely. It's actually probably not too much work to make TCCState into something where you can repeatedly call compile_string and relocate in a mixed way (and even override functions/data); it's just that no one has invested the time to do that.

Ciao,
Michael.
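The bookkeeping half of the "remember all strings" scheme described above could look roughly like this. It is a sketch under stated assumptions, not part of libtcc: `struct registry`, `registry_put` and `registry_build` are hypothetical names, and joining the snippets into one translation unit is a simplification chosen here (feeding each string to tcc_compile_string separately would be another option). Compiling the result in a fresh TCCState and refetching the symbols is left to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SNIPPETS 64

/* One remembered source string, keyed by the function it defines. */
struct snippet { char *name; char *source; };

/* Snippets are kept in their original order of addition. */
struct registry { struct snippet items[MAX_SNIPPETS]; size_t count; };

static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}

/* Add a snippet, or replace the source of an existing one in place so its
   position in the order of addition is preserved. Returns 0 on success. */
static int registry_put(struct registry *r, const char *name,
                        const char *source)
{
    char *copy = dup_str(source);
    if (!copy)
        return -1;
    for (size_t i = 0; i < r->count; i++) {
        if (strcmp(r->items[i].name, name) == 0) {
            free(r->items[i].source);
            r->items[i].source = copy;
            return 0;
        }
    }
    if (r->count == MAX_SNIPPETS) {
        free(copy);
        return -1;
    }
    char *key = dup_str(name);
    if (!key) {
        free(copy);
        return -1;
    }
    r->items[r->count].name = key;
    r->items[r->count].source = copy;
    r->count++;
    return 0;
}

/* Join all snippets, in order, into one newly allocated string that the
   caller would compile in a new TCCState and then free(). */
static char *registry_build(const struct registry *r)
{
    size_t len = 1;   /* terminating NUL */
    for (size_t i = 0; i < r->count; i++)
        len += strlen(r->items[i].source) + 1;   /* source plus newline */
    char *out = malloc(len);
    if (!out)
        return NULL;
    out[0] = '\0';
    for (size_t i = 0; i < r->count; i++) {
        strcat(out, r->items[i].source);
        strcat(out, "\n");
    }
    return out;
}
```

To replace "foo", a caller would call registry_put with the new source, build the combined string, compile it in a new state, refetch every symbol it uses, and only then delete the old state.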