[dev nexus 2024] a glance at the java performance toolbox

Speaker: Ana Maria Mihalceanu

@ammbra1508

For more, see the 2024 DevNexus Blog Table of Contents


What is performance

  • From the user's POV, how much work can be done in a reasonable amount of time
  • From the business POV, what the cost is in computational resources needed to provide that user experience

Cloud

  • Practically unlimited resources
  • Reasonable cost

Container images

  • Tools to build container images – docker, jib, kaniko, buildah, etc
  • All started with a Dockerfile
  • Other tools arrived later to make building images easier

JLink

  • A separate JRE stopped being shipped as of Java 11
  • Can use jlink to build a custom runtime with just the modules you need.
  • Can also omit man pages and header files (--no-man-pages, --no-header-files).
  • The zip-9 setting of --compress offers the best compression.
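
To make the jlink options above concrete, here is a minimal sketch that drives jlink from Java through java.util.spi.ToolProvider. The module list and output directory are placeholders I made up, not the ones from the talk, and the zip-9 syntax assumes a recent JDK (21 or later).

```java
import java.util.List;
import java.util.spi.ToolProvider;

public class JlinkSketch {
    // Builds the jlink arguments for a trimmed runtime image.
    // modules and outputDir are hypothetical placeholders supplied by the caller.
    static List<String> jlinkArgs(String modules, String outputDir) {
        return List.of(
            "--add-modules", modules,  // only the modules the app needs
            "--no-man-pages",          // omit man pages
            "--no-header-files",       // omit C header files
            "--compress=zip-9",        // strongest zip level (JDK 21+ syntax, an assumption)
            "--output", outputDir);    // directory must not exist yet
    }

    public static void main(String[] args) {
        // jlink is only present when running on a full JDK
        ToolProvider jlink = ToolProvider.findFirst("jlink")
            .orElseThrow(() -> new IllegalStateException("jlink not available in this runtime"));
        List<String> jlinkArgs = jlinkArgs("java.base,java.logging", "build/custom-runtime");
        int exit = jlink.run(System.out, System.err, jlinkArgs.toArray(new String[0]));
        System.out.println("jlink exit code: " + exit);
    }
}
```

In a container build the same arguments would more likely be passed to the jlink command directly. The Java wrapper is only here to keep the example self-contained.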

Fine Tuning JVM Flags

  • Ergonomics docs – describe the process by which the JVM/GC tunes itself to improve performance
  • Tune min/max heap size with -Xms and -Xmx
  • Consider Java heap ratio
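
A quick way to see what -Xms and -Xmx actually did is to ask the running JVM. This is a small sketch of my own, not from the talk. The values are approximate, since some collectors report a max slightly below -Xmx.

```java
public class HeapSettingsSketch {
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper bound the heap may grow to (roughly -Xmx)
        System.out.println("max heap (MiB):        " + toMiB(rt.maxMemory()));
        // Heap currently committed (starts near -Xms)
        System.out.println("committed heap (MiB):  " + toMiB(rt.totalMemory()));
        // Unused part of the committed heap
        System.out.println("free in committed (MiB): " + toMiB(rt.freeMemory()));
    }
}
```

For example, run it with `java -Xms256m -Xmx512m HeapSettingsSketch.java` and compare the numbers against a run without the flags.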

jcmd

  • Tracks native memory
  • Want it available in the container
  • Add the jdk.jcmd module to the application's runtime image

Other commands/tools

  • Use jinfo to see what flags are used in the app
  • Helps when you don’t know all the flag names
  • Look for the amount of memory reserved and the amount used
  • Look for big values
  • JConsole – can see graph of memory use
  • jstat – garbage collection statistics
  • jmap – histogram of heap summary
  • Profiling with Java Flight Recorder – use when looking for something, not all the time. Needs the jdk.jfr module. Can specify how long to record.
  • Prometheus server – monitors/alerts on events
  • JFR Streaming – sends metrics to monitoring service
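
Some of what jinfo and jstat report can also be read from inside the JVM through the management beans. The sketch below is my own illustration, not the speaker's demo. It assumes a HotSpot JVM, since HotSpotDiagnosticMXBean is HotSpot specific.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class InProcessDiagnosticsSketch {
    // Roughly what `jinfo -flag <name> <pid>` shows, read in-process.
    // Throws IllegalArgumentException if the flag name does not exist.
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean hs =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return hs.getVMOption(name).getValue();
    }

    // Cumulative number of collections across all collectors,
    // a small slice of what jstat prints.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {   // -1 means the collector does not report a count
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("MaxHeapSize = " + flagValue("MaxHeapSize"));
        System.out.println("GC count so far = " + totalGcCount());
    }
}
```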

Sample app for testing at https://github.com/ammbra/performance-glance

My take

Good information and good demo. It was nice seeing the commands actually get used. Clear how to apply.

[QCon 2019] Low Latency in the Cloud, with OSS

Mark Price @epickrram

For other QCon blog posts, see QCon live blog table of contents

Requirements

  • Trading app
  • Need microsecond (not millisecond) response time
  • Need data in memory rather than in a database
  • Lock free programming
  • Redundancy
  • High volume
  • Predictable latency

Hydra

  • System built on OSS
  • Opinionated framework to accelerate app dev
  • Clients communicate with stateless, scalable gateways
  • Persistors – manage data in memory.
  • Gateway – converts large text message to something smaller and more efficient

Design choices

  • Replay logs to reapply changes. Business logic must be fully deterministic. Bounded recovery times
  • Placement group in cloud – machines guaranteed to be near each other. Minimizes latency between nodes
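
The replay idea can be sketched in a few lines. This is my own toy example and not Hydra code. The Command class and the account balances are made up for illustration. The point is that the apply step depends only on the command and the current state, so replaying the same log always rebuilds the same state.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplaySketch {
    // A logged command (hypothetical shape, for illustration only)
    static final class Command {
        final String account;
        final long delta;

        Command(String account, long delta) {
            this.account = account;
            this.delta = delta;
        }
    }

    // Deterministic business logic: no clocks, randomness, or I/O
    static void apply(Map<String, Long> balances, Command c) {
        balances.merge(c.account, c.delta, Long::sum);
    }

    // Rebuilds in-memory state by reapplying every command in log order
    static Map<String, Long> replay(List<Command> log) {
        Map<String, Long> balances = new HashMap<>();
        for (Command c : log) {
            apply(balances, c);
        }
        return balances;
    }

    public static void main(String[] args) {
        List<Command> log = List.of(
            new Command("alice", 100), new Command("bob", 50), new Command("alice", -30));
        System.out.println(replay(log));
    }
}
```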

Testing latency

  • Do as part of CD pipeline
  • Can’t physically monitor with a fiber tap in the cloud
  • Capture latencies in a histogram to get a statistical view and calculate percentiles
  • Test under load
  • Fan out where you test from
  • Store percentiles as time series data
  • Can see jitter from garbage collection
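
Since the histogram came up so much, here is a minimal sketch of the idea as I understood it: count samples into buckets, then read percentiles off the cumulative counts. It is a simplified stand-in with one bucket per microsecond, not the histogram implementation used in the talk. The class and method names are my own.

```java
public class LatencyHistogramSketch {
    // counts[i] = number of samples that took i microseconds;
    // the last bucket also collects anything slower (overflow)
    private final long[] counts;
    private long total;

    LatencyHistogramSketch(int maxMicros) {
        counts = new long[maxMicros + 1];
    }

    void record(long micros) {
        int idx = (int) Math.min(Math.max(micros, 0), counts.length - 1);
        counts[idx]++;
        total++;
    }

    // Smallest bucket such that at least `percentile` percent of samples fall at or below it
    long valueAtPercentile(double percentile) {
        if (total == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        long target = Math.max(1, (long) Math.ceil(percentile * total / 100.0));
        long seen = 0;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= target) {
                return i;
            }
        }
        return counts.length - 1;
    }

    public static void main(String[] args) {
        LatencyHistogramSketch h = new LatencyHistogramSketch(1_000);
        for (long micros = 1; micros <= 100; micros++) {
            h.record(micros);
        }
        System.out.println("p50 = " + h.valueAtPercentile(50) + " us");
        System.out.println("p99 = " + h.valueAtPercentile(99) + " us");
    }
}
```

Recording into fixed buckets keeps memory constant under load. Storing the p50/p99 values from each test run as a time series is what would show regressions over time.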

Performance on shared box/cloud

  • Not in control of the resources you are running on
  • Containers share the L3 cache, so can see higher rates of cache misses
  • CPU throttling effects
  • Hard to measure since you can’t see what the neighbors are doing
  • One option is to rent the largest box possible and compare it to the specs on the vendor website. If you have the max # of cores, you know you have the box to yourself. Expensive. Was about five dollars a year. At that price, might be worth just buying your own machine in a data center
  • Can pack non-latency-sensitive services onto shared machines

<missed some at the end. I got an email that distracted me>

My impressions

There was a lot of discussion about the histogram. I would have liked to see some examples rather than just talking about how it is calculated. They didn’t have to be real examples to be useful. There were some interesting facts and it was a good case study so I’m glad I went. I was glad he addressed that non-cloud is a possible option for this scenario.

[2018 oracle code one] Bulletproof Java Enterprise Applications

Building Bulletproof Java Enterprise Applications
Speaker: Sebastian Daschner

For more blog posts, see The Oracle Code One table of contents


 

Being resilient

  • Don’t crash
  • Prevent failures from cascading
  • Don’t allow actions that are doomed to fail

Timeouts

  • Avoid deadlocks
  • Kill at some point so the overall system can continue
  • Especially http and database timeouts.
  • Some libraries default to no timeouts
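
As an illustration of setting HTTP timeouts explicitly, here is a sketch using the JDK's java.net.http client (Java 11+). This is a stand-in I picked, not the Java EE code from the slides. The URL is a placeholder.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class TimeoutSketch {
    // Client that gives up if a connection cannot be established in time
    static HttpClient clientWithTimeout(Duration connectTimeout) {
        return HttpClient.newBuilder()
            .connectTimeout(connectTimeout)
            .build();
    }

    // Request that fails with HttpTimeoutException if no response arrives in time
    static HttpRequest requestWithTimeout(String url, Duration timeout) {
        return HttpRequest.newBuilder(URI.create(url))
            .timeout(timeout)
            .GET()
            .build();
    }

    public static void main(String[] args) {
        HttpClient client = clientWithTimeout(Duration.ofSeconds(2));
        // Placeholder URL; nothing is sent in this sketch
        HttpRequest request = requestWithTimeout("http://localhost:8080/health", Duration.ofSeconds(5));
        System.out.println("connect timeout: " + client.connectTimeout());
        System.out.println("request timeout: " + request.timeout());
    }
}
```

Both timeouts are empty Optionals unless set, which matches the point about some libraries defaulting to no timeout.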

Retries

  • Immediately retry to get past a temporary failure – but be careful not to put more load on a failing server
  • Avoid unnecessary error codes
  • Decide how often and how many times to retry
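
The "how often and how many times" decision can be captured in a small helper. This is my own sketch, not the approach shown in the talk. It retries a bounded number of times and doubles the wait between attempts so a struggling server gets breathing room. Passing 0 as the initial delay gives the immediate retry.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    // Calls action up to maxAttempts times, sleeping between failures.
    // The delay doubles after each failed attempt (simple exponential backoff).
    static <T> T retry(Callable<T> action, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);   // back off before the next try
                    delay *= 2;
                }
            }
        }
        throw last;   // every attempt failed; surface the last error
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("temporary failure");
            }
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```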

Java EE Extensions

My take: I like that there are actual code examples. I don’t like that the text based slides are in vi (or a screenshot of vi). Such a small font and tons of wasted whitespace.