[QCon 2019] Maximizing Performance with GraalVM

Posted on June 25, 2019 by Jeanne Boyarsky

Thomas Wuerthinger

For other QCon blog posts, see QCon live blog table of contents

Tradeoffs between the factors being optimized

Startup time
Peak throughput
Memory footprint
Maximum request latency
Packaging size (matters for mobile)
Can usually optimize a few (but not all) of these

GraalVM

Supports JVM languages, Ruby, Python, C, Rust, R etc
Can embed in Node.js, Oracle Database
Standalone binary
Community Edition and Enterprise Edition
Can run with OpenJDK using Graal JIT compiler or AOT (ahead of time compiling)

AOT

To use, create new binary with pre-compiled code
Package classes from app, libraries used and part of the VM
Iterate, adding things until you know what is needed. Then create the native executable.
Uses an order of magnitude less memory than JIT. Saving memory helps when running on AWS Lambda
CPU usage a lot less up front. Small peak at startup
JIT compiler has profiling feedback so can do better in the long run. AOT has PGO (profile guided optimizations) to deal with this
Working on improving – collecting profiles up front, low latency GC option and tracing agent to facilitate configuration

Performance

Startup time (from start until first request can be served). Two orders of magnitude faster with AOT
Starting up in less than 50 milliseconds allows spinning up new process upon request
Hard to measure. Can be lucky/unlucky when get data.
JIT has an advantage for peak performance. It has profiling data and can make optimistic assumptions. If the assumption not true, can de-optimize/bail out of optimization.
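
A rough way to put a number on startup time as defined above (start until the first request can be served). This is only a sketch for a regular JVM: the class name and the initializeApplication placeholder are made up for illustration, and the JVM start timestamp comes from the standard RuntimeMXBean. As noted above, a single run can be lucky or unlucky, so repeat the measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

class StartupTimer {
    /** Milliseconds between JVM start (as reported by the runtime) and now. */
    static long millisSinceJvmStart() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        return System.currentTimeMillis() - runtime.getStartTime();
    }

    /** Hypothetical stand-in for "the app is ready to serve its first request". */
    static void initializeApplication() {
        // A real service would load config, open sockets, warm caches, etc.
    }

    public static void main(String[] args) {
        initializeApplication();
        System.out.println("Ready after ~" + millisSinceJvmStart() + " ms");
    }
}
```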

Benchmarks

Benchmarks are good. Should have more
Optimizing on too few benchmarks is like overfitting on machine learning
http://renaissance.dev/ – benchmark suite. Includes Scala and less commonly tested

Choosing

GraalVM JIT – when need peak throughput, max latency and no config
GraalVM AOT – use when need fast startup time, small memory footprint and small packaging size

Recommends reading top 10 things to do with GraalVM

Q&A

Have you considered using Epsilon in benchmark? Not yet. Makes sense since doesn’t do any GC
Why not use parallel GC? Not sure if it would make a difference. Kirk noted would avoid allocation hit over G1.
Does AOT make sense for large heaps? Can make sure don’t have disadvantage at least.

My impressions

I had heard about Graal and forgotten a lot. I re-learned much. I like the list of steps slides and the diagram. I feel like it will be more memorable this time. I also liked the comparison at the end on impact of the dimensions covered up front.

[QCon 2019] The Trouble With Memory

Posted on June 25, 2019 by Jeanne Boyarsky

Kirk Pepperdine

General

Slow database queries, inefficient app code and too many database queries are most reported problems
Once drill down, over 70% of all Java apps are bottlenecked on memory churn. It’s not reported because hard to observe
Tend to put logging around past problems.
If you apply instrumentation to a system, it will always tell you something. And then you act on it
Cheaper to predict than react

Common libraries

Logback
Marshalling JSON, SQL
Caching products
Hibernate

Memory

Java heap has generations
Hopefully people have moved to G1GC
Everything happens in the free list

Problems

Large number of temporary objects quickly fills Eden
Causes frequent young cycles. Causes premature promotion which means will go to tenured too early
Heap becomes more fragmented
Allocation is quick. No cost to collect if objects die quickly. However, still slow if you do something quick enough times.
Large live data set size. Data consistently live in your heap. Increases time to copy/compact. Likely have less space to copy to. Think about Windows defragmenter. [Do people still have to do that?]
Memory leak from unstable live data. JVM will terminate if you are lucky.
Out of memory – 98% of recent time spent in GC with less than 2% of heap recovered. If the app doesn’t meet both criteria, it is just really slow, but doesn’t get the out of memory error.
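
A small illustration of the "large number of temporary objects" problem. This is my own sketch, not code from the talk: the class and method names are hypothetical. Both methods compute the same sum, but the boxed version may create a new Long on most iterations (the JIT can sometimes remove these), while the primitive version allocates nothing.

```java
class AllocationChurn {
    // Boxed accumulator: each += unboxes, adds, and boxes the result again.
    // Values outside the small Long cache need a fresh object each time.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Primitive accumulator: same arithmetic, no temporary objects.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1_000_000) == sumPrimitive(1_000_000));
    }
}
```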

Escape analysis

Test applied to a piece of data. What is the visibility/scope.
If scoped locally, only thread that created it can see it.
If passed to method, partial escape.
If data scoped so multiple threads can see it (Ex: static), full escape.
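
The three levels above, sketched in code. The names (EscapeExamples, SHARED, and the methods) are hypothetical. Whether the JIT actually removes an allocation depends on the JVM and on inlining, so treat this as an illustration of scope only.

```java
import java.util.ArrayList;
import java.util.List;

class EscapeExamples {
    // Hypothetical static holder: anything stored here is visible to every thread.
    static final List<int[]> SHARED = new ArrayList<>();

    // No escape: the array never leaves this method.
    static int noEscape(int a, int b) {
        int[] pair = {a, b};
        return pair[0] + pair[1];
    }

    // Partial escape: the array is passed to another method but not stored anywhere.
    static int partialEscape(int a, int b) {
        int[] pair = {a, b};
        return sum(pair);
    }

    // Full escape: the array becomes reachable through a static field.
    static int fullEscape(int a, int b) {
        int[] pair = {a, b};
        synchronized (SHARED) {
            SHARED.add(pair);
        }
        return pair[0] + pair[1];
    }

    static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }
}
```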

Demo

Showed GC log. Want to see low pause times
Showed allocation rates. Problem if too high
In VisualVM, looked at profiler. Check filters to ensure not filtering the bottleneck out of your profile
Sort by # allocated objects to see frequency. It doesn’t take longer to allocate a large object than a small one.
Take a snapshot and look at trace
“Stop thinking” – explore what is shown without assuming
Time to look at the code from the stack trace that is creating all the objects
Escape analysis code
Run JITWatch to see allocations. Can see if direct/inline allocation. Can see when the JIT compiler eliminates an allocation
Profiler is lying to you.
Performance differs in test vs prod environment
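
The demo used GC logs and VisualVM. As a rough in-process alternative, the standard GarbageCollectorMXBean exposes collection counts and accumulated collection time. This is a sketch with hypothetical class and method names and a made-up workload, and it is much coarser than a real GC log.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcSnapshot {
    /** Total collections across all collectors (undefined values are skipped). */
    static long totalCollections() {
        long count = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 when the collector does not report it
            if (c > 0) {
                count += c;
            }
        }
        return count;
    }

    /** Approximate accumulated collection time in milliseconds. */
    static long totalCollectionMillis() {
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                millis += t;
            }
        }
        return millis;
    }

    public static void main(String[] args) {
        long collectionsBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        // Hypothetical workload: churn through short-lived arrays.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] scratch = new byte[1024];
            scratch[i % 1024] = 1;
            checksum += scratch[i % 1024];
        }

        System.out.println("checksum: " + checksum);
        System.out.println("collections during workload: " + (totalCollections() - collectionsBefore));
        System.out.println("GC ms during workload: " + (totalCollectionMillis() - millisBefore));
    }
}
```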

Q&A

How know the performance problem is the int[] in the demo? Went through profiler to show stack trace. Used BigInteger which uses up a lot more memory than a long
Absolute number for GC allocation rate? Sparc? Numbers seem to hold regardless of hardware. Should focus on the CPU going forward.
<missed question> – try to find mutable state that is not shared

My impression

This was great. I learned a lot and it kept my attention. I really liked the demo.

[QCon 2019] Java Futures

Posted on June 25, 2019 by Jeanne Boyarsky

Brian Goetz

General

Java is approaching middle age. Almost 25 years old
Keep promises to users
Prime directive is compatibility
Backward compatibility matters ex: generics. Don’t need to recompile old code
Patterns ex: single method interfaces for lambdas rather than having to rewrite libraries
Language features are forever. They interact with other features; even future ones
Waited 10 years for generics until had right story/timing. Knew copying C++ was the wrong choice
No language is ever finished
Languages are never good enough because hardware changes, new problems, developer expectations change
mid-2019 edition because things change so fast
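
To make the single method interface point concrete: Comparator long predates lambdas, yet a lambda can be passed wherever one is expected, so the library did not need rewriting. A sketch with hypothetical class and method names:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LambdaCompat {
    // Pre-Java 8 style: anonymous class implementing the single-method interface.
    static List<String> sortOldStyle(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }

    // Same interface, same library method, now fed by a lambda.
    static List<String> sortWithLambda(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort((a, b) -> Integer.compare(a.length(), b.length()));
        return copy;
    }
}
```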

Cadence

Used to release based on features rather than a date
Often didn’t feel worthwhile to do small feature because got stuck behind big ones
Now about two years into the six month release schedule.
Release management overhead went down to almost zero
Same rate of innovation; just changed rate of release
Java 13 already in rampdown. Released in September.
Already working on Java 14

Preview feature

Risk of things happening too quickly since features are forever
Preview means feature is done but not finalized
Not experimental/beta.
Think of as provisional feature.
Expected outcome is that will be promoted to real feature in next version or two
Full IDE/Tooling support for preview features
Need to turn them on so not accidentally using in production: --enable-preview flag

Current initiatives

Amber – right sizing language ceremony. Includes local variable type inference and future changes like pattern matching
Valhalla – adapt to modern hardware, value types, generic specialization
Loom – Fibers
Panama – Native code and data

Local Variable type inference

In Java 10. Future if on Java 8. Infinite past if you are Brian :).
var instead of type for local variable
Not syntactic sugar. Can expose “hidden” type (ex: capture types, intersection types and anonymous class types). So see more generics.
One of most commonly requested features
But also significant/vocal angst – can write bad code; giving into fashion
Not controversial once release. Fear of change in advance?
Will take time for good practices to emerge
Style guide https://openjdk.java.net/projects/amber/LVTIstyle.html
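
A small sketch of both points above: var as a replacement for a spelled-out type, and var exposing a "hidden" type (here an anonymous class type). The class and method names are hypothetical. Needs Java 10 or later.

```java
import java.util.HashMap;
import java.util.List;

class VarExamples {
    static int countDistinct(List<String> words) {
        // Inferred as HashMap<String, Integer>; the type is still static.
        var counts = new HashMap<String, Integer>();
        for (var word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts.size();
    }

    static int hiddenType() {
        // The inferred type is the anonymous class itself, so its extra members
        // are accessible. Declared as "Object counter", this would not compile.
        var counter = new Object() {
            int hits = 0;

            void hit() {
                hits++;
            }
        };
        counter.hit();
        counter.hit();
        return counter.hits;
    }
}
```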

Switch enhancements

Preview feature in Java 12 and 13
Significant fraction of switch statements want to be expressions. So have to assign in each case.
Break is annoying. Irritating and error prone
Looked at how switch statements need to evolve for pattern matching and then made that more generally useful.
Boilerplate bad because a place for bugs to hide
Two changes – can use switch as expression or statement. Streamlined syntax label -> consequence
Will be in Java 14 unless feedback said made some horrible error
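
The two changes side by side, as a sketch. The enum and the working hours are made up for illustration. The second method uses the syntax as it was finalized in Java 14; on Java 12 or 13 it needs the preview flag.

```java
class SwitchExamples {
    enum Day { MON, TUE, WED, THU, FRI, SAT, SUN }

    // Classic statement: assign in every case and remember every break.
    static int hoursOldStyle(Day day) {
        int hours;
        switch (day) {
            case SAT:
            case SUN:
                hours = 0;
                break;
            case FRI:
                hours = 6;
                break;
            default:
                hours = 8;
                break;
        }
        return hours;
    }

    // Switch as an expression with "label -> consequence": no break, no fall through.
    // All enum constants are covered, so no default is needed.
    static int hoursAsExpression(Day day) {
        return switch (day) {
            case SAT, SUN -> 0;
            case FRI -> 6;
            case MON, TUE, WED, THU -> 8;
        };
    }
}
```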

Multi-line String literals

Preview feature in Java 13
Require quotes and concatenation. Error prone for JSON, SQL and HTML
Manual mangling introduces errors
Was going to be a preview feature in Java 12. Withdrawn because had better idea of how to do it.
Triple quotes to start/end
Dots mark indentation that is just for the IDE, so you don’t get extra whitespace when the IDE reformats.
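
A sketch comparing the concatenated form with a triple quoted literal. The JSON content and names are made up. This uses the syntax as finalized in Java 15; on Java 13 it needs the preview flag. The indentation shared by every line, including the closing delimiter, is stripped from the result.

```java
class TextBlockExample {
    // Quotes, escapes and concatenation on every line.
    static String jsonConcatenated() {
        return "{\n" +
               "  \"name\": \"QCon\",\n" +
               "  \"year\": 2019\n" +
               "}\n";
    }

    // Same string as a text block; only the two-space JSON indent survives.
    static String jsonTextBlock() {
        return """
               {
                 "name": "QCon",
                 "year": 2019
               }
               """;
    }
}
```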

Pattern matching

Intend to deliver in phases; hopefully starting in Java 14
Phase 1
- Replaces if statement for instanceof and then the cast for the same
- “Not only does the language make you cast explicitly, but it gives you the chance to get it wrong”
- if (a instanceof Integer intvalue) – checks type and gives new variable can use as type safely
- Simplifies equals method implementation
Phase 2
- Use in switch
- case Integer i – matches if Integer, bind to i and allow use of variable inside the case.
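
Both phases in one sketch, with hypothetical names. The instanceof form was finalized in Java 16 and type patterns in switch in Java 21, so this needs a recent JDK.

```java
class PatternExamples {
    // Before: test, then cast, then use.
    static String describeOld(Object o) {
        if (o instanceof Integer) {
            Integer i = (Integer) o;
            return "int " + i;
        }
        return "other";
    }

    // Phase 1: the test binds a typed variable, so no cast is needed.
    static String describe(Object o) {
        if (o instanceof Integer intValue) {
            return "int " + intValue;
        }
        return "other";
    }

    // Phase 2: the same idea inside a switch.
    static String describeSwitch(Object o) {
        return switch (o) {
            case Integer i -> "int " + i;
            case String s -> "string of length " + s.length();
            default -> "other";
        };
    }

    // Hypothetical value class showing the simplified equals method.
    static final class Money {
        private final long cents;

        Money(long cents) {
            this.cents = cents;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Money other && cents == other.cents;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(cents);
        }
    }
}
```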

Records

Lots of boilerplate – constructor, accessors, Object methods
A lot of code to read only to find out didn’t need to read any of it.
IDE generates boilerplate, but it doesn’t help you read it.
record Point(int x, int y) {}
Like enum, give up some extra features to get functionality.
Get sensible defaults in record
Declaratively states “I am simply a carrier for my data”
Programmer making a commitment
[does this discourage adding instance methods?]
Product type of algebraic data type
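
What the one-line declaration gives you, as a sketch using records as finalized in Java 16. Range and its members are hypothetical. On my bracketed question above: records in their final form do allow instance methods and a validating compact constructor.

```java
class RecordExample {
    // Constructor, accessors x() and y(), equals, hashCode and toString are generated.
    record Point(int x, int y) {}

    // Hypothetical record with a compact constructor and an instance method.
    record Range(int lo, int hi) {
        Range {
            if (lo > hi) {
                throw new IllegalArgumentException("lo must not exceed hi");
            }
        }

        int length() {
            return hi - lo;
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        System.out.println(p + " equals copy: " + p.equals(new Point(1, 2)));
        System.out.println(new Range(2, 5).length());
    }
}
```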

Sealed types

Sum type of algebraic data types
Shape = Circle + Rect
Compiler will know only those types
sealed interface Shape {}
record Circle(Point center, int radius) implements Shape {}
record Rect(Point x,….
Can use in if statement and pattern matching to get fields of record as local variables
Can use in switch statement. Compiler will complain if don’t address all possible implementations of sealed types
ex: case Circle(var center, var r) -> PI * r * r;
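
A compilable sketch of the Shape example, using sealed types (final in Java 17) and record patterns in switch (final in Java 21). My notes cut off the Rect declaration, so the two corner fields here are an assumption made only so the code compiles. They are not what was on the slide.

```java
class SealedExample {
    record Point(int x, int y) {}

    // Sum type: the compiler knows Circle and Rect are the only Shapes.
    sealed interface Shape permits Circle, Rect {}

    record Circle(Point center, int radius) implements Shape {}

    // Assumption: hypothetical fields, since the original Rect declaration is truncated.
    record Rect(Point topLeft, Point bottomRight) implements Shape {}

    static double area(Shape shape) {
        // No default branch: leaving out Circle or Rect is a compile error.
        return switch (shape) {
            case Circle(var center, var r) -> Math.PI * r * r;
            case Rect(var topLeft, var bottomRight) ->
                (double) Math.abs((bottomRight.x() - topLeft.x()) * (bottomRight.y() - topLeft.y()));
        };
    }
}
```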

Project Valhalla

Aims to reboot the layout of data in memory
Hardware has changed a lot in past 25 years.
Cost of memory fetch vs arithmetic has increased heavily in last 25 years
An array of objects will use data all over so many cache misses
Big change because goes down so far.
We want to be able to specify the class should be inlined
Giving up mutability and representation polymorphism.
Getting hardware friendly data layout
Codes like a class, works like an int

My Impressions

Great talk! It covered both things I had seen and things I hadn’t. Great perspective. Excited for some of the coming features.

Down Home Country Coding With Scott Selikoff and Jeanne Boyarsky

Java/J2EE Software Development and Technology Discussion Blog
