penberg / hornet
Hornet, a JVM optimized for low-latency applications.
License: Other
nwatkins@kyoto:~/src/hornet/build$ llvm-config --version
3.4
nwatkins@kyoto:~/src/hornet/build$ cmake ..
-- Found LLVM: /usr/lib/llvm-3.4 (found version "3.4")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nwatkins/src/hornet/build
nwatkins@kyoto:~/src/hornet/build$ make
Scanning dependencies of target jvm
[ 4%] Building CXX object CMakeFiles/jvm.dir/vm/alloc.cc.o
[ 9%] Building CXX object CMakeFiles/jvm.dir/vm/jvm.cc.o
[ 14%] Building CXX object CMakeFiles/jvm.dir/vm/klass.cc.o
[ 19%] Building CXX object CMakeFiles/jvm.dir/vm/object.cc.o
[ 23%] Building CXX object CMakeFiles/jvm.dir/vm/thread.cc.o
[ 28%] Building C object CMakeFiles/jvm.dir/mps/mps.c.o
[ 33%] Building CXX object CMakeFiles/jvm.dir/java/backend.cc.o
[ 38%] Building CXX object CMakeFiles/jvm.dir/java/class_file.cc.o
[ 42%] Building CXX object CMakeFiles/jvm.dir/java/constant_pool.cc.o
[ 47%] Building CXX object CMakeFiles/jvm.dir/java/ffi.cc.o
[ 52%] Building CXX object CMakeFiles/jvm.dir/java/intern.cc.o
[ 57%] Building CXX object CMakeFiles/jvm.dir/java/interp.cc.o
[ 61%] Building CXX object CMakeFiles/jvm.dir/java/jni.cc.o
[ 66%] Building CXX object CMakeFiles/jvm.dir/java/loader.cc.o
[ 71%] Building CXX object CMakeFiles/jvm.dir/java/opcode.cc.o
[ 76%] Building CXX object CMakeFiles/jvm.dir/java/prims.cc.o
[ 80%] Building CXX object CMakeFiles/jvm.dir/java/translator.cc.o
[ 85%] Building CXX object CMakeFiles/jvm.dir/java/verify.cc.o
[ 90%] Building CXX object CMakeFiles/jvm.dir/java/zip.cc.o
[ 95%] Building CXX object CMakeFiles/jvm.dir/java/llvm.cc.o
/home/nwatkins/src/hornet/java/llvm.cc:15:30: fatal error: llvm/IR/Verifier.h: No such file or directory
#include "llvm/IR/Verifier.h"
^
compilation terminated.
make[2]: *** [CMakeFiles/jvm.dir/java/llvm.cc.o] Error 1
make[1]: *** [CMakeFiles/jvm.dir/all] Error 2
make: *** [all] Error 2
Motivation
JIT compilation is a source of latency jitter that applications in low-latency environments can control only with a "warmup phase", which is inconvenient and error-prone. AOT compilation is a natural solution for applications that don't use dynamic JVM features such as reflection. AOT is also necessary when the hardware or software platform does not allow executing JIT-compiled code, as on the iPhone and Xbox 360.
Implementations
The Mono virtual machine for .NET supports ahead-of-time compilation.
GCJ is a portable, optimizing, ahead-of-time compiler for the Java Programming Language. It can compile Java source code to Java bytecode (class files) or directly to native machine code, and Java bytecode to native machine code.
The invokespecial bytecode instruction is not supported.
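For context, javac emits invokespecial for constructor calls (the <init> methods) and for super method calls, so even small programs depend on it. The following is a minimal sketch; the class names are hypothetical and are not taken from Hornet's test suite.

```java
// Hypothetical classes used only to show where javac emits invokespecial.
class Base {
    String name() {
        return "base";
    }
}

class Derived extends Base {
    Derived() {
        super();                            // invokespecial Base.<init>
    }

    @Override
    String name() {
        return super.name() + "+derived";   // invokespecial Base.name
    }
}

class InvokeSpecialDemo {
    static String run() {
        Base b = new Derived();             // new, dup, invokespecial Derived.<init>
        return b.name();                    // invokevirtual, dispatches to Derived.name
    }

    public static void main(String[] args) {
        System.out.println(run());          // prints "base+derived"
    }
}
```

Running javap -c on the compiled classes should show the invokespecial instructions at the commented call sites.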
Transform Java bytecode at link time into a format that can be interpreted more efficiently, similar to what Mono does for CIL:
https://github.com/mono/mono/blob/master/mono/interpreter/transform.c
References
Rose, John. "Explicit tail-call bytecode." (2011) http://cr.openjdk.java.net/~jrose/draft/vm-tailcall-jep.html
Schwaighofer, Arnold. "Tail Call Optimization in the Java HotSpot™ VM." Master's thesis, Johannes Kepler University Linz (2009).
Motivation
JRuby, for example, would benefit from fork() support to be able to fully support Ruby semantics.
Hornet needs to invoke <clinit> upon class initialization to make sure static fields are properly initialized.
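As an illustration of the behaviour a VM has to reproduce, here is a small sketch in which a static field initializer and a static block are compiled into <clinit>, and that method runs lazily on the first access to the class. The names ClinitDemo and Config are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ClinitDemo {
    // Records the order of events so the lazy execution of <clinit> is observable.
    static final List<String> events = new ArrayList<>();

    static class Config {
        // The field initializer and the static block are both compiled into Config.<clinit>.
        static int limit = 10;

        static {
            events.add("Config.<clinit>");
            limit *= 2;
        }
    }

    static int run() {
        events.add("before first use");
        int value = Config.limit;   // getstatic triggers initialization of Config
        events.add("after first use");
        return value;
    }

    public static void main(String[] args) {
        System.out.println(run() + " " + events);
    }
}
```

Without the <clinit> call, the read of Config.limit would see the default value 0 instead of 20.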
Design this so that other bytecode formats and IRs can run on top of it with minimal modification.
Some additions like JSleep and Virtualisation (Any OS) support would be cool - http://www.waratek.com/
Add support for a JIT activity inspection application:
Eliminating fast-path memory allocations is important for predictable execution. VisualVM, for example, does support memory profiling, but it is very cumbersome to use.
We need low-overhead, always-on support for something like aprof. aprof is implemented as a JVM agent and produces reports like this:
TOTAL allocation dump for 29,423 ms (0h00m29s)
Allocated 66,155,144 bytes in 2,870,108 objects in 1,108 locations of 230 classes
-------------------------------------------------------------------------------
java.lang.Integer: 34,953,568 (52%) bytes in 2,184,598 (76%) objects (avg size 16 bytes)
java.lang.Integer.valueOf: 34,931,824 (99%) bytes in 2,183,239 (99%) objects
FibonacciNumbers.fib: 34,852,928 (99%) bytes in 2,178,308 (99%) objects
...
Autobox elision is good for Scala applications in particular because you don't have much control over autoboxing in Scala.
David Keenan from Twitter mentions in his presentation "Twitter-Scale Computing with OpenJDK" that they do it in their in-house OpenJDK version.
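To make the cost concrete, here is a hedged sketch of the kind of code that produces allocation profiles dominated by Integer.valueOf. The class and method names are hypothetical; this is not the source of the FibonacciNumbers program from the report.

```java
class BoxingDemo {
    // Boxed version: each arithmetic result is re-boxed through Integer.valueOf,
    // which may allocate once values fall outside the small-integer cache.
    static Integer fibBoxed(Integer n) {
        if (n < 2) {
            return n;
        }
        return fibBoxed(n - 1) + fibBoxed(n - 2);
    }

    // Primitive version: same result, no boxing at all.
    static int fibPrimitive(int n) {
        if (n < 2) {
            return n;
        }
        return fibPrimitive(n - 1) + fibPrimitive(n - 2);
    }

    public static void main(String[] args) {
        System.out.println(fibBoxed(20) + " " + fibPrimitive(20));
    }
}
```

Autobox elision would let the VM run the first method roughly as if it had been written like the second one, without the programmer changing the source.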
Use something like DynASM to implement a fast JIT:
Motivation
Cache misses are a major concern in low-latency environments. Currently, developers are forced to use DirectByteBuffers or the Unsafe API to control memory layout and access memory in a predictable fashion. Giving developers control over memory layout via JVM intrinsics would help to mitigate the problem.
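The workaround mentioned above looks roughly like the following sketch, which packs (x, y) integer pairs contiguously into a direct ByteBuffer instead of allocating one object per point. The PointBufferDemo class and its layout are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class PointBufferDemo {
    // Each point occupies two ints: x at offset 0, y at offset 4.
    static final int STRIDE = 2 * Integer.BYTES;

    final ByteBuffer buf;
    final int count;

    PointBufferDemo(int count) {
        this.count = count;
        this.buf = ByteBuffer.allocateDirect(count * STRIDE).order(ByteOrder.nativeOrder());
    }

    void set(int i, int x, int y) {
        buf.putInt(i * STRIDE, x);
        buf.putInt(i * STRIDE + Integer.BYTES, y);
    }

    int x(int i) {
        return buf.getInt(i * STRIDE);
    }

    int y(int i) {
        return buf.getInt(i * STRIDE + Integer.BYTES);
    }

    // Walks the points in memory order, so accesses are sequential.
    long sumX() {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += x(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        PointBufferDemo points = new PointBufferDemo(3);
        points.set(0, 1, 10);
        points.set(1, 2, 20);
        points.set(2, 3, 30);
        System.out.println(points.sumX());
    }
}
```

The manual offset arithmetic is exactly the boilerplate that layout control in the JVM would remove.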
If you guys are open to suggestions, these are areas you could improve:
[ ] Box Elimination
[ ] Array performance (layout for multi dimensional arrays and cache contention in access of length etc.) - Also see Arrays 2.0 presentation at the JVM Summit
[ ] Shadow implementations (or alternate implementations) for selected classes, which delegate to the alternate implementation in such a way that it is transparent to the user - E.g. see Pauseless HashMap by Azul
[ ] Whole program Super JIT - Perhaps you can look at Rewriting (http://en.wikipedia.org/wiki/Rewriting, https://code.google.com/p/spsc/, http://pat.keldysh.ru/~ilya/, https://code.google.com/p/hosc/, https://sites.google.com/site/keldyshscp/) for this but targeted at JIT.
[ ] High performance closure support
[ ] High performance inner class / anonymous class support
[ ] Faster serialisation
[ ] Faster reflection
[ ] Faster bytecode manipulation
[ ] Better Numerical Computing Support and Performance - Maybe this can be shadowed by actual classes if they are used in Hornet but backed by a pure Java implementation for use in other JVMs. E.g. introduce a Decimal class.
[ ] Hardware-based synchronisation for better-performing synchronisation on architectures that support it. If there is no hardware support, fall back to an alternative which is more efficient than the current locking system
[ ] Warm-up feature like in JRockit, Azul
[ ] Multi tenant VM like IBM J9, Waratek
[ ] Support value types like IBM J9
[ ] Virtualisation of the JVM like in Waratek
[ ] Better resource management to manage cloud costs like in Waratek
[ ] JVM Clustering like in Terracotta
[ ] Automatic parallelization for multi-core, GPU and FPGA - Also planned in Java 9+. Pervasive DataRush is worth a look, although it is a framework.
[ ] Java 9+ Proof - should support modularisation and other planned Java 9+ features
[ ] Green threads at the VM level like the old JDK (see http://www.paralleluniverse.co/quasar/)
Implementations
VMKit distributes a "full Java virtual machine" called J3 with it.
The README sounds like LLVM is optional, but when I build without LLVM-dev installed I get the following problem (installing LLVM fixes it).
nwatkins@kyoto:~/src/hornet/build$ cmake ..
-- The C compiler identification is GNU 4.9.2
-- The CXX compiler identification is GNU 4.9.2
-- Check for working C compiler: /usr/lib/ccache/cc
-- Check for working C compiler: /usr/lib/ccache/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/lib/ccache/c++
-- Check for working CXX compiler: /usr/lib/ccache/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Found JNI: /usr/lib/jvm/default-java/jre/lib/amd64/libjawt.so
CMake Warning at cmake_modules/FindLLVM.cmake:126 (message):
Could not find llvm-config. Try manually setting LLVM_CONFIG to the
llvm-config executable of the installation to use.
Call Stack (most recent call first):
CMakeLists.txt:14 (find_package)
-- Could NOT find LLVM (missing: LLVM_ROOT_DIR LLVM_HOST_TARGET)
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nwatkins/src/hornet/build
nwatkins@kyoto:~/src/hornet/build$ make
Scanning dependencies of target jvm
[ 5%] Building CXX object CMakeFiles/jvm.dir/vm/alloc.cc.o
[ 10%] Building CXX object CMakeFiles/jvm.dir/vm/jvm.cc.o
[ 15%] Building CXX object CMakeFiles/jvm.dir/vm/klass.cc.o
[ 20%] Building CXX object CMakeFiles/jvm.dir/vm/object.cc.o
[ 25%] Building CXX object CMakeFiles/jvm.dir/vm/thread.cc.o
[ 30%] Building C object CMakeFiles/jvm.dir/mps/mps.c.o
[ 35%] Building CXX object CMakeFiles/jvm.dir/java/backend.cc.o
[ 40%] Building CXX object CMakeFiles/jvm.dir/java/class_file.cc.o
[ 45%] Building CXX object CMakeFiles/jvm.dir/java/constant_pool.cc.o
[ 50%] Building CXX object CMakeFiles/jvm.dir/java/ffi.cc.o
In file included from /home/nwatkins/src/hornet/java/ffi.cc:1:0:
/home/nwatkins/src/hornet/include/hornet/ffi.hh:5:17: fatal error: ffi.h: No such file or directory
#include <ffi.h>
^
compilation terminated.
make[2]: *** [CMakeFiles/jvm.dir/java/ffi.cc.o] Error 1
make[1]: *** [CMakeFiles/jvm.dir/all] Error 2
make: *** [all] Error 2
Motivation
JVM debugging, performance, and tracing tools don't play well with native code or the operating system. Deep integration with the Linux "perf" tool would make debugging and optimizing Java applications more powerful in low-latency environments, where the causes of latency spikes are spread across the software (and hardware) stack.
Problem
Garbage collectors typically need to halt execution of all mutator threads during the collection cycle. In practice, this means that applications are paused for 1-100 milliseconds or more, depending on the type of GC being used. Pauseless GCs attempt to mitigate the problem by reducing the "stop-the-world" time or eliminating it altogether.
Solution
The Memory Pool System advertises fast allocation and low pause times. Hornet needs to tell MPS about GC roots, which can be implemented with stackmaps (Agesen, 1997) as long as the jsr bytecode is not present. Luckily, jsr is no longer used by modern Java compilers, and it can be eliminated from legacy bytecode with subroutine inlining (Artho, 2005).
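The idea can be sketched as follows: for each GC point (a bytecode index), a stack map records which local slots hold object references, and root enumeration is a lookup keyed only by the bytecode index. Everything in this sketch (Frame, the map representation, the roots method) is hypothetical and is not Hornet's or MPS's API. It also shows why jsr is a problem: inside a jsr subroutine the same slot can hold a reference on one calling path and an integer on another, so a single map per bytecode index is no longer sufficient.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StackMapDemo {
    // Hypothetical interpreter frame: raw local slots plus the current bytecode index.
    static final class Frame {
        final long[] locals;
        final int pc;

        Frame(long[] locals, int pc) {
            this.locals = locals;
            this.pc = pc;
        }
    }

    // Returns the slot values that the stack map marks as references at frame.pc.
    static List<Long> roots(Map<Integer, BitSet> stackMap, Frame frame) {
        List<Long> out = new ArrayList<>();
        BitSet refSlots = stackMap.get(frame.pc);
        if (refSlots == null) {
            return out;   // no map for this pc: not a GC point
        }
        for (int slot = refSlots.nextSetBit(0); slot >= 0; slot = refSlots.nextSetBit(slot + 1)) {
            out.add(frame.locals[slot]);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, BitSet> stackMap = new HashMap<>();
        BitSet refsAtPc7 = new BitSet();
        refsAtPc7.set(0);   // slot 0 holds a reference at pc 7
        refsAtPc7.set(2);   // slot 2 holds a reference at pc 7
        stackMap.put(7, refsAtPc7);

        // Slot 1 holds a plain integer and must not be reported as a root.
        Frame frame = new Frame(new long[] {0x1000L, 42L, 0x2000L}, 7);
        System.out.println(roots(stackMap, frame));
    }
}
```

In a real VM the reported slots would be passed to the collector's root-scanning interface rather than collected into a list.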
References
Agesen, Ole, and David Detlefs. "Finding references in Java stacks." OOPSLA. Vol. 97. 1997.
Artho, Cyrille, and Armin Biere. "Subroutine inlining and bytecode abstraction to simplify static and dynamic analysis." Electronic Notes in Theoretical Computer Science 141.1 (2005): 109-128.
JVM value type proposal here: