200 lines
9.7 KiB
Plaintext
200 lines
9.7 KiB
Plaintext
Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST)
|
|
From: Chris Lattner <sabre@nondot.org>
|
|
To: Vikram Adve <vadve@cs.uiuc.edu>
|
|
Subject: Re: a few thoughts
|
|
|
|
Okay... here are a few of my thoughts on this (it's good to know that we
|
|
think so alike!):
|
|
|
|
> 1. We need to be clear on our goals for the VM. Do we want to emphasize
|
|
> portability and safety like the Java VM? Or shall we focus on the
|
|
> architecture interface first (i.e., consider the code generation and
|
|
> processor issues), since the architecture interface question is also
|
|
> important for portable Java-type VMs?
|
|
|
|
I forsee the architecture looking kinda like this: (which is completely
|
|
subject to change)
|
|
|
|
1. The VM code is NOT guaranteed safe in a java sense. Doing so makes it
|
|
basically impossible to support C like languages. Besides that,
|
|
certifying a register based language as safe at run time would be a
|
|
pretty expensive operation to have to do. Additionally, we would like
|
|
to be able to statically eliminate many bounds checks in Java
|
|
programs... for example.
|
|
|
|
2. Instead, we can do the following (eventually):
|
|
* Java bytecode is used as our "safe" representation (to avoid
|
|
reinventing something that we don't add much value to). When the
|
|
user chooses to execute Java bytecodes directly (ie, not
|
|
precompiled) the runtime compiler can do some very simple
|
|
transformations (JIT style) to convert it into valid input for our
|
|
VM. Performance is not wonderful, but it works right.
|
|
* The file is scheduled to be compiled (rigorously) at a later
|
|
time. This could be done by some background process or by a second
|
|
processor in the system during idle time or something...
|
|
* To keep things "safe" ie to enforce a sandbox on Java/foreign code,
|
|
we could sign the generated VM code with a host specific private
|
|
key. Then before the code is executed/loaded, we can check to see if
|
|
the trusted compiler generated the code. This would be much quicker
|
|
than having to validate consistency (especially if bounds checks have
|
|
been removed, for example)
|
|
|
|
> This is important because the audiences for these two goals are very
|
|
> different. Architects and many compiler people care much more about
|
|
> the second question. The Java compiler and OS community care much more
|
|
> about the first one.
|
|
|
|
3. By focusing on a more low level virtual machine, we have much more room
|
|
for value add. The nice safe "sandbox" VM can be provided as a layer
|
|
on top of it. It also lets us focus on the more interesting compilers
|
|
related projects.
|
|
|
|
> 2. Design issues to consider (an initial list that we should continue
|
|
> to modify). Note that I'm not trying to suggest actual solutions here,
|
|
> but just various directions we can pursue:
|
|
|
|
Understood. :)
|
|
|
|
> a. A single-assignment VM, which we've both already been thinking
|
|
> about.
|
|
|
|
Yup, I think that this makes a lot of sense. I am still intrigued,
|
|
however, by the prospect of a minimally allocated VM representation... I
|
|
think that it could have definite advantages for certain applications
|
|
(think very small machines, like PDAs). I don't, however, think that our
|
|
initial implementations should focus on this. :)
|
|
|
|
Here are some other auxiliary goals that I think we should consider:
|
|
|
|
1. Primary goal: Support a high performance dynamic compilation
|
|
system. This means that we have an "ideal" division of labor between
|
|
the runtime and static compilers. Of course, the other goals of the
|
|
system somewhat reduce the importance of this point (f.e. portability
|
|
reduces performance, but hopefully not much)
|
|
2. Portability to different processors. Since we are most familiar with
|
|
x86 and solaris, I think that these two are excellent candidates when
|
|
we get that far...
|
|
3. Support for all languages & styles of programming (general purpose
|
|
VM). This is the point that disallows java style bytecodes, where all
|
|
array refs are checked for bounds, etc...
|
|
4. Support linking between different language families. For example, call
|
|
C functions directly from Java without using the nasty/slow/gross JNI
|
|
layer. This involves several subpoints:
|
|
A. Support for languages that require garbage collectors and integration
|
|
with languages that don't. As a base point, we could insist on
|
|
always using a conservative GC, but implement free as a noop, f.e.
|
|
|
|
> b. A strongly-typed VM. One question is do we need the types to be
|
|
> explicitly declared or should they be inferred by the dynamic
|
|
> compiler?
|
|
|
|
B. This is kind of similar to another idea that I have: make OOP
|
|
constructs (virtual function tables, class heirarchies, etc) explicit
|
|
in the VM representation. I believe that the number of additional
|
|
constructs would be fairly low, but would give us lots of important
|
|
information... something else that would/could be important is to
|
|
have exceptions as first class types so that they would be handled in
|
|
a uniform way for the entire VM... so that C functions can call Java
|
|
functions for example...
|
|
|
|
> c. How do we get more high-level information into the VM while keeping
|
|
> to a low-level VM design?
|
|
> o Explicit array references as operands? An alternative is
|
|
> to have just an array type, and let the index computations be
|
|
> separate 3-operand instructions.
|
|
|
|
C. In the model I was thinking of (subject to change of course), we
|
|
would just have an array type (distinct from the pointer
|
|
types). This would allow us to have arbitrarily complex index
|
|
expressions, while still distinguishing "load" from "Array load",
|
|
for example. Perhaps also, switch jump tables would be first class
|
|
types as well? This would allow better reasoning about the program.
|
|
|
|
5. Support dynamic loading of code from various sources. Already
|
|
mentioned above was the example of loading java bytecodes, but we want
|
|
to support dynamic loading of VM code as well. This makes the job of
|
|
the runtime compiler much more interesting: it can do interprocedural
|
|
optimizations that the static compiler can't do, because it doesn't
|
|
have all of the required information (for example, inlining from
|
|
shared libraries, etc...)
|
|
|
|
6. Define a set of generally useful annotations to add to the VM
|
|
representation. For example, a function can be analysed to see if it
|
|
has any sideeffects when run... also, the MOD/REF sets could be
|
|
calculated, etc... we would have to determine what is reasonable. This
|
|
would generally be used to make IP optimizations cheaper for the
|
|
runtime compiler...
|
|
|
|
> o Explicit instructions to handle aliasing, e.g.s:
|
|
> -- an instruction to say "I speculate that these two values are not
|
|
> aliased, but check at runtime", like speculative execution in
|
|
> EPIC?
|
|
> -- or an instruction to check whether two values are aliased and
|
|
> execute different code depending on the answer, somewhat like
|
|
> predicated code in EPIC
|
|
|
|
These are also very good points... if this can be determined at compile
|
|
time. I think that an epic style of representation (not the instruction
|
|
packing, just the information presented) could be a very interesting model
|
|
to use... more later...
|
|
|
|
> o (This one is a difficult but powerful idea.)
|
|
> A "thread-id" field on every instruction that allows the static
|
|
> compiler to generate a set of parallel threads, and then have
|
|
> the runtime compiler and hardware do what they please with it.
|
|
> This has very powerful uses, but thread-id on every instruction
|
|
> is expensive in terms of instruction size and code size.
|
|
> We would need to compactly encode it somehow.
|
|
|
|
Yes yes yes! :) I think it would be *VERY* useful to include this kind
|
|
of information (which EPIC architectures *implicitly* encode. The trend
|
|
that we are seeing supports this greatly:
|
|
|
|
1. Commodity processors are getting massive SIMD support:
|
|
* Intel/Amd MMX/MMX2
|
|
* AMD's 3Dnow!
|
|
* Intel's SSE/SSE2
|
|
* Sun's VIS
|
|
2. SMP is becoming much more common, especially in the server space.
|
|
3. Multiple processors on a die are right around the corner.
|
|
|
|
If nothing else, not designing this in would severely limit our future
|
|
expansion of the project...
|
|
|
|
> Also, this will require some reading on at least two other
|
|
> projects:
|
|
> -- Multiscalar architecture from Wisconsin
|
|
> -- Simultaneous multithreading architecture from Washington
|
|
>
|
|
> o Or forget all this and stick to a traditional instruction set?
|
|
|
|
Heh... :) Well, from a pure research point of view, it is almost more
|
|
attactive to go with the most extreme/different ISA possible. On one axis
|
|
you get safety and conservatism, and on the other you get degree of
|
|
influence that the results have. Of course the problem with pure research
|
|
is that often times there is no concrete product of the research... :)
|
|
|
|
> BTW, on an unrelated note, after the meeting yesterday, I did remember
|
|
> that you had suggested doing instruction scheduling on SSA form instead
|
|
> of a dependence DAG earlier in the semester. When we talked about
|
|
> it yesterday, I didn't remember where the idea had come from but I
|
|
> remembered later. Just giving credit where its due...
|
|
|
|
:) Thanks.
|
|
|
|
> Perhaps you can save the above as a file under RCS so you and I can
|
|
> continue to expand on this.
|
|
|
|
I think it makes sense to do so when we get our ideas more formalized and
|
|
bounce it back and forth a couple of times... then I'll do a more formal
|
|
writeup of our goals and ideas. Obviously our first implementation will
|
|
not want to do all of the stuff that I pointed out above... be we will
|
|
want to design the project so that we do not artificially limit ourselves
|
|
at sometime in the future...
|
|
|
|
Anyways, let me know what you think about these ideas... and if they sound
|
|
reasonable...
|
|
|
|
-Chris
|
|
|