I am currently toying around with WebAssembly compiled through LLVM but I haven't yet managed to understand the stack / stack pointer and how it relates to the overall memory layout.
I learned that I have to use --allocate-stack N to make my program run and I figured that this is basically adding (data (i32.const 4) "8\00\00\00") (with N=8) to my generated wast, with the binary part obviously being a pointer to a memory offset and the i32 constant being its offset in linear memory.
What I do not quite understand, though, is why the pointer's value is 56 (again with N=8) and how this value relates to the exact region of the stack in memory, which, in my case, currently looks like:
7-35: other data sections
I know that I am probably more of a candidate for "just use Emscripten", but I'd also like to understand this.
<ul><li>Is the stack pointer always stored at offset 4 in linear memory?</li> <li>How is its initial value calculated? (aligned to next offset%16==0 + N after data?)</li> <li>What's stored before, and what's after the offset it points at?</li> </ul>
Answer1:
I touched on this in <a href="https://stackoverflow.com/questions/43571620/understanding-class-structures-and-constructor-calls" rel="nofollow">another question</a>. Values from C++'s stack can actually end up in 3 places:<ol><li>On the execution stack (each opcode pushes and pops values, so add pops 2 and then pushes 1).</li> <li>As a local.</li> <li>In the</li></ol>
Notice that you can't take the address of 1. and 2. Only when an address is needed would I expect a code generator to go with 3. How this is done isn't dictated by WebAssembly; it's up to whatever ABI you choose. What Emscripten and other tools do is store the stack pointer at address 4, and then very early in the program they choose a spot where the stack should go. It doesn't <em>have</em> to always be 4, but it's simpler to always stick to that ABI, especially if dynamic linking is involved.
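To make the "stack pointer lives at address 4" convention concrete, here is a minimal sketch that mimics what the data segment from the question does at instantiation time. The names LinearMemory, kStackPointerAddr, apply_data_segment, load_i32 and initial_stack_pointer are made up for illustration, and the 64-byte array is only a stand-in for a real linear memory. It relies on two facts: WebAssembly loads are little-endian, and the character 8 in the segment string is the byte 0x38, which is 56 in decimal.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Tiny stand-in for a WebAssembly linear memory (real ones are sized in 64 KiB pages).
using LinearMemory = std::array<std::uint8_t, 64>;

// Hypothetical name for the address where this ABI keeps the stack pointer.
constexpr std::size_t kStackPointerAddr = 4;

// Mimics applying (data (i32.const offset) "...") when the module is instantiated.
void apply_data_segment(LinearMemory& mem, std::size_t offset,
                        const std::uint8_t* bytes, std::size_t len) {
    assert(offset + len <= mem.size());
    std::memcpy(mem.data() + offset, bytes, len);
}

// Mimics i32.load: four bytes, least significant byte first.
std::uint32_t load_i32(const LinearMemory& mem, std::size_t addr) {
    assert(addr + 4 <= mem.size());
    return static_cast<std::uint32_t>(mem[addr]) |
           static_cast<std::uint32_t>(mem[addr + 1]) << 8 |
           static_cast<std::uint32_t>(mem[addr + 2]) << 16 |
           static_cast<std::uint32_t>(mem[addr + 3]) << 24;
}

// Reads back the stack pointer that the segment from the question leaves behind.
std::uint32_t initial_stack_pointer() {
    LinearMemory mem{};
    // "8\00\00\00": the character '8' (byte 0x38) followed by three zero bytes.
    const std::uint8_t segment[] = {0x38, 0x00, 0x00, 0x00};
    apply_data_segment(mem, kStackPointerAddr, segment, sizeof segment);
    return load_i32(mem, kStackPointerAddr);
}
```

Calling initial_stack_pointer() gives 56, so the "8" in the wast is not N itself; it is just how the byte 0x38 prints inside a string literal.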
On initial value: that location has to be big enough to hold the whole stack, and the implementation of malloc has to know about it because it can't allocate heap space over it. That's why some tooling allows you to specify a maximum stack size.
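As a rough illustration of the arithmetic guessed at in the question (round the end of the static data up to 16, then add N), the sketch below reproduces the value 56. Everything here is an assumption rather than something WebAssembly dictates: the names align_up, StackRegion and place_stack are made up, the static data is assumed to end at offset 35 (so the first free byte is 36), and the stack is assumed to grow downward from the initial stack pointer.

```cpp
#include <cassert>
#include <cstdint>

// Rounds addr up to the next multiple of alignment (which must be a power of two).
std::uint32_t align_up(std::uint32_t addr, std::uint32_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}

// Hypothetical description of where the in-memory stack ends up.
struct StackRegion {
    std::uint32_t low;   // lowest address the stack may use
    std::uint32_t high;  // initial stack pointer (assumes the stack grows downward)
};

// Assumed layout: static data, padding up to a 16-byte boundary, then stack_size bytes.
// A malloc implementation would then have to hand out addresses at or above `high`.
StackRegion place_stack(std::uint32_t first_free_byte, std::uint32_t stack_size) {
    const std::uint32_t low = align_up(first_free_byte, 16);
    return StackRegion{low, low + stack_size};
}
```

With place_stack(36, 8) the region is 48 to 56, which matches the stored pointer; whether a given toolchain really lays memory out this way is something to check against its own documentation.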
Anything can be stored before / after (though after it you'd likely have prior stack values). WebAssembly doesn't currently have guard pages, so exhausting the in-memory stack will clobber heap values (unless the code generator also emits stack checks). That's all "memory safe" in that it still can't escape the WebAssembly.Memory, so the browser can't get owned but the developer's own code can totally be owned. A memory-safe language built on top of WebAssembly would have to enforce memory safety within the
Note that I haven't explained 1. and 2. Their existence means that most C++ programs will use less in-memory stack in WebAssembly than the same program would use stack natively.
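A small sketch of the distinction between the three places, using made-up function names. The comments describe what I would expect a code generator to do, not what any particular compiler is guaranteed to emit; an optimizer that inlines store_square could well keep everything in locals.

```cpp
#include <cassert>

// No address is ever taken here, so n, total and i can live in WebAssembly
// locals or on the execution stack; no linear memory is needed.
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// Writes through a pointer, so the caller has to pass the address of a real object.
void store_square(int* out, int value) {
    assert(out != nullptr);
    *out = value * value;
}

// result has its address taken. Locals have no address, so without inlining
// result would likely be placed on the in-memory stack in linear memory.
int square_via_pointer(int value) {
    int result = 0;
    store_square(&result, value);
    return result;
}
```

Only the second pattern needs the stack pointer at all, which is why programs that rarely take addresses of locals barely touch the in-memory stack.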