[QCon 2019] Maximizing Performance with GraalVM

Thomas Wuerthinger 

For other QCon blog posts, see QCon live blog table of contents

Tradeoffs between the factors you can optimize for

  • Startup time
  • Peak throughput
  • Memory footprint
  • Maximum request latency
  • Packaging size (matters for mobile)
  • Can usually optimize a few (but not all) of these

GraalVM

  • Supports JVM languages, Ruby, Python, C, Rust, R, etc.
  • Can be embedded in Node.js and Oracle Database
  • Standalone binary
  • Community Edition and Enterprise Edition
  • Can run with OpenJDK using the Graal JIT compiler, or use AOT (ahead-of-time) compilation

AOT

  • To use it, create a new binary containing pre-compiled code
  • It packages the classes from the app, the libraries it uses and part of the VM
  • Iterate, adding things until you know what is needed. Then create the native executable.
  • Uses an order of magnitude less memory than JIT. The memory savings help when running on AWS Lambda
  • CPU usage is much lower up front, with only a small peak at startup
  • The JIT compiler has profiling feedback, so it can do better in the long run. AOT has PGO (profile-guided optimization) to compensate
  • Working on improvements – collecting profiles up front, a low-latency GC option and a tracing agent to ease configuration
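To make the configuration point concrete, here is a minimal sketch (not from the talk) of the kind of code that needs extra care with AOT. The class and method names (`ReflectionDemo`, `Greeter`, `callByName`) are hypothetical. The class name is only known at run time, so an ahead-of-time build cannot always tell from the code alone that `Greeter` must be packaged. The commands in the comments are the ones I believe GraalVM ships for this; check the documentation for your version before relying on them.

```java
import java.lang.reflect.Method;

public class ReflectionDemo {

    // Hypothetical class that may only ever be reached through reflection.
    public static class Greeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Looks up a class and a one-argument String method by name at run time.
    // On a JIT-based JVM this just works. In an AOT build, classes reached
    // only like this may need to be listed in the reflection configuration.
    static String callByName(String className, String methodName, String arg) {
        try {
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            Method method = type.getMethod(methodName, String.class);
            return (String) method.invoke(instance, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective call failed: " + className, e);
        }
    }

    public static void main(String[] args) {
        // The class name can come from the command line, so it is not a
        // compile-time constant.
        String className = args.length > 0 ? args[0] : Greeter.class.getName();
        System.out.println(callByName(className, "greet", "GraalVM"));
    }
}

// Assumed workflow (verify against the GraalVM docs for your release):
//   javac ReflectionDemo.java
//   java -agentlib:native-image-agent=config-output-dir=META-INF/native-image ReflectionDemo
//   native-image ReflectionDemo
```

Run on a regular JVM with no arguments, this prints `Hello, GraalVM`. The tracing agent run in the middle step is meant to record the reflective accesses it observes, so the "iterate until you know what you need" loop does not have to be done by hand.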

Performance

  • Startup time (from process start until the first request can be served) is two orders of magnitude faster with AOT
  • Starting up in less than 50 milliseconds allows spinning up a new process per request
  • Hard to measure. Results vary from run to run, so a single measurement can be lucky or unlucky.
  • JIT has an advantage for peak performance. It has profiling data and can make optimistic assumptions. If an assumption turns out not to be true, it can de-optimize (bail out of the optimization).
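The optimistic assumption point is easier to picture with code. This is a minimal sketch of my own, not an example from the talk, and all names (`SpeculationDemo`, `Shape`, `Square`, `Circle`, `totalArea`) are hypothetical. The loop only ever sees `Square` for a long time, which is the kind of profile a JIT compiler can speculate on. Passing a `Circle` later breaks that assumption. Whether a de-optimization actually happens depends on the JVM, the compiler and its thresholds, so treat this as an illustration of the pattern rather than a guaranteed trigger.

```java
import java.util.Arrays;

public class SpeculationDemo {

    interface Shape {
        double area();
    }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    static final class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        public double area() { return Math.PI * radius * radius; }
    }

    // The s.area() call site is what a profiling JIT can observe. If it has
    // only ever seen Square, the compiler may inline Square.area() behind a
    // type check instead of doing a full interface dispatch.
    static double totalArea(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) {
            sum += s.area();
        }
        return sum;
    }

    public static void main(String[] args) {
        // Warm-up phase: only squares, so the profile is monomorphic.
        Shape[] squares = new Shape[1000];
        Arrays.fill(squares, new Square(2));
        double total = 0;
        for (int i = 0; i < 20_000; i++) {
            total += totalArea(squares);
        }

        // A new receiver type shows up. Code compiled under the
        // "always Square" assumption is no longer valid for this call.
        Shape[] mixed = { new Square(2), new Circle(1) };
        total += totalArea(mixed);

        System.out.println(total);
    }
}
```

On HotSpot, running with `-XX:+PrintCompilation` is one way to look for compiled methods being invalidated around the point where the `Circle` arrives. An AOT build without profile data has no such history to speculate on, which is why the talk mentions PGO as the AOT counterpart.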

Benchmarks

  • Benchmarks are good. We should have more of them
  • Optimizing for too few benchmarks is like overfitting in machine learning
  • http://renaissance.dev/ – benchmark suite. Includes Scala and other less commonly tested workloads

Choosing

  • GraalVM JIT – use when you need peak throughput, low max latency and no configuration
  • GraalVM AOT – use when you need fast startup, a small memory footprint and a small packaging size

He recommends reading "Top 10 Things To Do With GraalVM"

Q&A

  • Have you considered using Epsilon in the benchmarks? Not yet. It makes sense since it doesn't do any GC
  • Why not use the parallel GC? Not sure if it would make a difference. Kirk noted that it would avoid the allocation hit of G1.
  • Does AOT make sense for large heaps? They can at least make sure it is not at a disadvantage.

My impressions

I had heard about Graal before and forgotten a lot, so I re-learned much. I liked the slides listing the steps and the diagram. I feel like it will be more memorable this time. I also liked the comparison at the end showing the impact on each of the dimensions covered up front.
