JavaOne – Make your CPU cores sweat with Parallel Streams

“Make your CPU cores sweat with Parallel Streams”

Speaker: Lukasz Pater

For more blog posts from JavaOne, see the table of contents


Started with the canonical Person/Car/age example to show what streams are [I think everyone at JavaOne knows that]

Then showed code that finds all the primes under 10 million that are palindromes when expressed in binary. Moore's law now relies on multi-core architectures. The example takes 8.6 seconds sequentially vs 1.6 seconds in parallel: about five times faster.
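The speaker's code isn't in my notes, so here is a minimal sketch of what such a pipeline could look like. The class and method names are my own, the primality test is plain trial division, and the timings on your machine will differ from the ones quoted above.

```java
import java.util.stream.IntStream;

class BinaryPalindromePrimes {

    // Trial division; fine for a demo, not an optimized primality test.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (int i = 3; (long) i * i <= n; i += 2) {
            if (n % i == 0) return false;
        }
        return true;
    }

    // True when the binary representation reads the same in both directions.
    static boolean isBinaryPalindrome(int n) {
        String bits = Integer.toBinaryString(n);
        return new StringBuilder(bits).reverse().toString().equals(bits);
    }

    // Counts the primes below limit that are binary palindromes.
    static long count(int limit, boolean parallel) {
        IntStream candidates = IntStream.range(1, limit);
        if (parallel) {
            candidates = candidates.parallel();
        }
        return candidates
                .filter(BinaryPalindromePrimes::isPrime)
                .filter(BinaryPalindromePrimes::isBinaryPalindrome)
                .count();
    }

    public static void main(String[] args) {
        int limit = 10_000_000;
        long start = System.nanoTime();
        long sequential = count(limit, false);
        long mid = System.nanoTime();
        long parallel = count(limit, true);
        long end = System.nanoTime();
        System.out.printf("sequential: %d found in %d ms%n", sequential, (mid - start) / 1_000_000);
        System.out.printf("parallel:   %d found in %d ms%n", parallel, (end - mid) / 1_000_000);
    }
}
```

The only difference between the two runs is the call to parallel(); the filters are stateless, which is what makes the switch safe.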

Good analogy: If told to wash the fork, then the knife, then the… it is slow. If told to wash the dishes, you can optimize internally.

History of threads

  • JDK 1 – Threads – still good for small background tasks
  • JDK 5 – ExecutorService, concurrent objects
  • JDK 7 – fork/join framework – recursively decompose tasks into subtasks and then combine results
  • JDK 8 – parallel streams – use fork/join framework and Spliterator behind the scenes
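To make the last two steps of that history concrete, here is a rough sketch (not from the talk) that sums a range of numbers twice: once with a hand-written fork/join task that splits recursively and combines partial results, and once with a parallel stream. The class names and the threshold value are arbitrary choices of mine.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.LongStream;

class ForkJoinSum {

    // JDK 7 style: split the range until it is small enough, then combine the partial sums.
    static class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 10_000; // arbitrary cut-off for illustration
        private final long from; // inclusive
        private final long to;   // exclusive

        SumTask(long from, long to) {
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (long i = from; i < to; i++) {
                    sum += i;
                }
                return sum;
            }
            long mid = from + (to - from) / 2;
            SumTask left = new SumTask(from, mid);
            SumTask right = new SumTask(mid, to);
            left.fork();                     // schedule the left half asynchronously
            long rightSum = right.compute(); // work on the right half in this thread
            return left.join() + rightSum;   // wait for the left half and combine
        }
    }

    static long forkJoinSum(long n) {
        return ForkJoinPool.commonPool().invoke(new SumTask(0, n));
    }

    // JDK 8 style: the splitting and combining happen behind the scenes.
    static long streamSum(long n) {
        return LongStream.range(0, n).parallel().sum();
    }
}
```

Both methods return the same result; the stream version simply hides the decomposition that the task spells out by hand.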

Making a parallel stream

  • parallelStream() – called on the source collection
  • parallel() – an intermediate operation; can go anywhere in the stream pipeline
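A small sketch of the two options, with class and method names made up for the example:

```java
import java.util.List;
import java.util.stream.Collectors;

class MakingParallelStreams {

    // Option 1: ask the source collection for a parallel stream.
    static List<Integer> squaresFromSource(List<Integer> numbers) {
        return numbers.parallelStream()
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    // Option 2: start with a regular stream and switch with parallel() mid-pipeline.
    static List<Integer> oddSquaresMidPipeline(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 1)
                .parallel()
                .map(n -> n * n)
                .collect(Collectors.toList());
    }
}
```

Collecting to a list keeps the encounter order of the source even when the pipeline runs in parallel, so both methods return their results in input order.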

Fork Join Pool

  • Uses work stealing to balance tasks amongst workers in pool
  • All parallel streams use one common pool instance with # threads = # CPU cores – 1. The remaining core is left for the thread that invokes the stream, which also takes part in the work.
  • Can change by setting the system property java.util.concurrent.ForkJoinPool.common.parallelism. Pass it on the command line: the property is read when the common pool is first initialized, so setting it in code may come too late
  • If you want a custom fork join pool, create one and submit your stream to it. He does not recommend doing this. One reason you might want to is to add a timeout to the stream
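Here is a hedged sketch of the custom pool with a timeout. It relies on the widely used trick that a parallel stream started from inside a ForkJoinPool task runs in that pool; as far as I know this is implementation behavior rather than a documented guarantee, which may be part of why the speaker advises against it. The class and method names are mine.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.stream.LongStream;

class CustomPoolWithTimeout {

    // Parallelism of the shared pool that parallel streams use by default.
    static int commonParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    // Sums 1..n in a dedicated pool and gives up after the timeout.
    static long sumWithTimeout(long n, int threads, long timeoutSeconds) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            Callable<Long> work = () -> LongStream.rangeClosed(1, n).parallel().sum();
            ForkJoinTask<Long> task = pool.submit(work);
            return task.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("stream failed or timed out", e);
        } finally {
            pool.shutdownNow(); // always release the pool's threads
        }
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: " + commonParallelism());
        System.out.println("sum: " + sumWithTimeout(1_000_000, 4, 5));
    }
}
```

To change the common pool instead, the property from the list above goes on the command line, for example -Djava.util.concurrent.ForkJoinPool.common.parallelism=8 (the value 8 is just an example).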

Warnings

  • Avoid IO – worker threads in the shared pool get tied up waiting for IO/network
  • Use only for CPU intensive tasks
  • Be careful with nested parallel streams
  • Having many smaller tasks in the pool will better balance the workload
  • Don’t create your own fork join pool

Spliterator

  • splittable iterator
  • to traverse elements of a source in parallel
  • tryAdvance(Consumer) – do something if an element exists
  • trySplit() – partition off some elements to another spliterator, leaving fewer elements in the original – this forks, so you get a tree of spliterators until the elements run out
  • characteristics()
  • estimateSize()
  • StreamSupport.stream(mySpliterator, true) – creates a parallel stream from a spliterator – shouldn’t need to do this
  • ArrayList decomposes into equal sizes. LinkedList splits off a smaller % of the elements because getting to elements is linear and it wants to minimize wait time
  • ArrayList and IntStream.range decompose well
  • LinkedList and Stream.iterate() decompose poorly – could even run out of memory
  • HashSet and TreeSet decompose in between

Other tips

  • Avoiding autoboxing also saves time. iterate() creates boxed objects whereas range() creates primitives
  • Parallel streams perform better where order doesn’t matter. findAny() or unordered().limit() [he missed the terminal operation in the limit example]
  • Avoid shared state
  • If there are multiple calls to sequential() and parallel(), the last one wins and takes effect for the entire stream pipeline
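A few of these tips in code form. This is a sketch with names of my own choosing, not the speaker's slides; the limit example includes the terminal operation that was missing.

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelStreamTips {

    // Primitive range: no boxing, and the source splits evenly.
    static long sumWithRange(int n) {
        return IntStream.range(0, n).parallel().asLongStream().sum();
    }

    // Same result, but iterate() produces boxed Integers one after another.
    static long sumWithIterate(int n) {
        return Stream.iterate(0, i -> i + 1)
                .limit(n)
                .parallel()
                .mapToLong(Integer::longValue)
                .sum();
    }

    // Order doesn't matter here, so any matching element will do.
    static OptionalInt anyEven(int n) {
        return IntStream.range(0, n).parallel().filter(i -> i % 2 == 0).findAny();
    }

    // unordered().limit() followed by a terminal operation.
    static long unorderedLimitCount(int n, int limit) {
        return IntStream.range(0, n).parallel().unordered().limit(limit).count();
    }

    // Avoid shared state: let collect() build the result instead of adding to a shared list.
    static List<Integer> evens(int n) {
        return IntStream.range(0, n)
                .parallel()
                .filter(i -> i % 2 == 0)
                .boxed()
                .collect(Collectors.toList());
    }

    // The last of parallel()/sequential() wins for the whole pipeline.
    static boolean isParallelAfterSequentialLast() {
        return IntStream.range(0, 10).parallel().filter(i -> i > 3).sequential().isParallel();
    }
}
```

isParallelAfterSequentialLast() returns false: even though parallel() appears first, the later sequential() call decides how the whole pipeline runs.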

My take:
Good discussion of performance and things to beware of. My blog wasn’t live because I couldn’t get internet in the room. I typed it live though! A couple of typos like findFist() but nothing significant

JavaOne – Maven BOF

“Maven 5 BOF”

Speakers: Brian Fox, Manfred Moser & Robert Scholte



[I was late because we talked more about JUnit 5 after the BOF]

Only 26.4% of Maven Central traffic is from Maven. Nothing else is more than 10% though; not even Ivy or Gradle

Some projects don’t have snapshots; instead every commit is a release

Talked about version ranges. Which version gets resolved depends on proximity to your project rather than on which version is the latest. Important to clean up pom dependencies before Java 9 so they are not on the module path. Use the Maven dependency plugin (analyze) to find unused ones. Make sure to use the latest version of the dependency plugin.

Maven won’t generate the module descriptor. It serves a different purpose: not all modules are dependencies, and the module descriptor contains more info. What to export is a decision the developer needs to make. jdeps can generate a rough descriptor from the binaries to get started.

Can have a .mvn directory inside projects with preferences starting in Maven 3.3.5. For example, you can specify that Maven should get more memory.

Shouldn’t be issues going from 3.3.5 to 3.5.9

Maven (dependency) resolver is now a standalone project

JavaOne – JUnit 5 BOF

“JUnit 5 BOF”

Speaker: Sam Brannen



Sam invited us other JUnit presenters at JavaOne to sit on stage with him as a panel. Which was cool. I got to sit on stage in a Moscone West room!

Sam reviewed and completed the material from his session earlier today

  • Tests package private
  • Create the assertion message string in a lambda if you have one that is slow to build
  • assert timeout
  • Eclipse releases non beta support next week
  • assert all
  • nested tests – good for BDD style so can read as outline/steps with indentation in test report. there’s also an example of the bowling game where the indentation and display names make it easier to follow the output
  • Repeated tests
  • Parameterized tests – values, csv and method sources. Someone wrote an extension for a json source
  • dynamic tests – use lambdas
  • For future releases of JUnit – scenario tests (like in TestNG) where later steps don’t run if earlier ones fail, test ordering (good for integration tests), parallel execution, option to have a new or existing instance for multiple tests, executing in user defined threads, declarative test suites for the JUnit platform (declare which tests to run based on annotations)
  • The name JUnit sucks because we write many types of tests
  • Order of tests is based on hash code of method name since JUnit 4.11. Jupiter would give options.
  • Eclipse only provides a few options to customize which tests run. Maven/Gradle have more
  • Spring 5 – released last week; supports existing features plus constructor/method injection, conditional test execution using SpEL expressions (can enable or disable a test by OS, date or other conditions with @EnabledIf/@DisabledIf), and @ExtendWith(SpringExtension.class) so it can be used alongside Mockito or other former runners. @SpringJUnitConfig/@SpringJUnitWebConfig combine the Spring extension with Spring config

Steve Moyer plugged the side project JUnit Pioneer for open source extensions. They will be building extensions for the JUnit 4 rules that don’t have JUnit 5 equivalents. It also serves as an incubator for JUnit core. Has a beta plugin for Pitest.

The JUnit 5 samples repo shows how to get started with Gradle/Maven. The Surefire team made a non-backward-compatible change, which is why you need 2.19 for it to work. A later version of JUnit 5 will require 2.21+ to avoid this issue.

Starting with Surefire 2.20, *Tests is recommended instead of *Test.