java mini talks at qcon

This is part of my live blogging from QCon 2015. See my QCon table of contents for other posts.

This session is four 10 minute talks.

Deterministic testing in a non-deterministic world

  • determinism – everything has a cause and effect
  • pseudorandom – algorithm that generates approximately random numbers

Should use LocalDateTime with a Clock instance to reproduce results in a program. With a fixed clock, all operations in the program see the same exact time. Similar to using a random seed for generating numbers.
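Here is a minimal sketch of the fixed clock idea using the java.time API. The class name, the isExpired method and the chosen instant are my own illustrations, not code from the talk.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Random;

class FixedClockDemo {

    // Hypothetical business method: it receives the Clock instead of
    // calling LocalDateTime.now() against the system clock.
    static boolean isExpired(LocalDateTime deadline, Clock clock) {
        return LocalDateTime.now(clock).isAfter(deadline);
    }

    // A clock frozen at an arbitrary instant picked for this example.
    static Clock fixedClock() {
        return Clock.fixed(Instant.parse("2015-06-10T12:00:00Z"), ZoneOffset.UTC);
    }

    // Same idea for random numbers: the same seed gives the same sequence.
    static int firstRandom(long seed) {
        return new Random(seed).nextInt();
    }

    public static void main(String[] args) {
        Clock clock = fixedClock();
        // Both reads return the same time because the clock never advances.
        System.out.println(LocalDateTime.now(clock));
        System.out.println(LocalDateTime.now(clock));
        System.out.println(isExpired(LocalDateTime.of(2015, 6, 10, 11, 0), clock));
        System.out.println(firstRandom(42L) == firstRandom(42L));
    }
}
```

In production code the same method would be passed Clock.systemDefaultZone() or similar, so only the test needs to know about the fixed clock.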

Hash spreads and probe functions: a choice of performance and consistency
Primitive collections are faster and use less memory than boxed implementations. The boxed implementation uses 56 bytes for each Integer.

  • hash spread – function that destroys simple patterns in input data while preserving maximal info about the input. Goal is to avoid collisions without spending too much time hashing.
  • hash probe – function that determines the order of slots to try in the array. For example, a linear probe goes down one slot on a collision. A quadratic probe goes down further on each collision.
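To make the two terms concrete, here is a small sketch of my own (not the speaker's code). It assumes an open addressing table whose capacity is a power of two. The spread is the same XOR-shift idea that java.util.HashMap uses, and the quadratic probe uses triangular numbers, which is one common variant.

```java
class OpenAddressingSketch {

    // Spread: fold the high 16 bits into the low 16 bits so that keys
    // differing only in their high bits do not all land in the same slot.
    static int spread(int hash) {
        return hash ^ (hash >>> 16);
    }

    // Home slot for a hash; capacity must be a power of two.
    static int homeSlot(int hash, int capacity) {
        return spread(hash) & (capacity - 1);
    }

    // Linear probe: each retry moves one slot further, wrapping around.
    static int linearProbe(int home, int attempt, int capacity) {
        return (home + attempt) & (capacity - 1);
    }

    // Quadratic (triangular) probe: the step grows with each retry.
    // With a power-of-two capacity this still visits every slot once.
    static int quadraticProbe(int home, int attempt, int capacity) {
        return (home + attempt * (attempt + 1) / 2) & (capacity - 1);
    }

    public static void main(String[] args) {
        int capacity = 8;
        int home = 3;
        for (int attempt = 0; attempt < capacity; attempt++) {
            System.out.println("attempt " + attempt
                    + " linear=" + linearProbe(home, attempt, capacity)
                    + " quadratic=" + quadraticProbe(home, attempt, capacity));
        }
    }
}
```

The trade-off as I understand it: the linear probe is friendlier to the CPU cache, while the quadratic probe spreads out clusters of collisions.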

Typesafe config on steroids
Property files are hard to scale. Apache Commons adds typing, but still limits you to the property file format and limits composition. Spring helps with scaling property files.

Typesafe Config – library used by Play and Akka. Standalone project without dependencies, so can use it in Java. JSON-like format called HOCON (Human-Optimized Config Object Notation).

Scopes – library built on top of typesafe config

Real time distributed event driven computing at Credit Suisse
Credit Suisse produced own language that they call “data algebra”. Looks like a DSL.

java 8 stream performance – maurice naftalin – qcon

See http://www.lambdafaq.org

Background
He started with background on streams. (This is old news by now, but still taking some notes). The goals were to bring a functional style to Java and “explicit but unobtrusive” hardware parallelism. The former is more important than performance.

The intention is to replace loops with aggregate operations. [I like that he picked an example that required three operations and not an oversimplified example]. More concise/readable. Easy to change to parallelize.
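I didn't copy down his example, so here is a made-up one of my own with three operations (filter, map, sum) written both ways. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class LoopVersusStream {

    // Loop version: total length of the words longer than three characters.
    static int totalLengthLoop(List<String> words) {
        int total = 0;
        for (String word : words) {
            if (word.length() > 3) {
                total += word.length();
            }
        }
        return total;
    }

    // Stream version: the same logic as a pipeline of aggregate operations.
    static int totalLengthStream(List<String> words) {
        return words.stream()
                .filter(word -> word.length() > 3) // intermediate operation
                .mapToInt(String::length)          // intermediate operation
                .sum();                            // terminal operation (reduction)
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("java", "qcon", "is", "fun", "streams");
        System.out.println(totalLengthLoop(words));
        System.out.println(totalLengthStream(words));
    }
}
```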

Reduction == terminal operation == sink

Performance Notes
Free lunch is over. Chips don’t magically get faster over time. Instead, manufacturers add cores. The goal of parallel streams is for the intermediate operations to run in parallel and then bring the results together in the reduction.
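A small sketch of what that looks like in code. This is my own illustration with hypothetical names, not something shown in the talk. The only difference between the two methods is the call to parallel(); the filter and map steps can then run on several cores, and sum() combines the partial results.

```java
import java.util.stream.IntStream;

class ParallelReductionSketch {

    // Sequential pipeline: sum of the squares of the even numbers in 1..n.
    static long sumOfEvenSquares(int n) {
        return IntStream.rangeClosed(1, n)
                .filter(i -> i % 2 == 0)
                .mapToLong(i -> (long) i * i)
                .sum();
    }

    // Same pipeline, but the intermediate operations may run in parallel
    // on the common fork/join pool before the reduction merges the results.
    static long sumOfEvenSquaresParallel(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .filter(i -> i % 2 == 0)
                .mapToLong(i -> (long) i * i)
                .sum();
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println(sumOfEvenSquares(n) == sumOfEvenSquaresParallel(n));
    }
}
```

Whether the parallel version is actually faster depends on the data size and the cost per element, so it needs to be measured rather than assumed.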

What to measure?

  • We want to know how code changes affect system performance in prod. Not feasible though because we would need to do a controlled experiment in prod conditions. Instead, we do a controlled experiment in lab conditions and hope we are not answering a simplified question.
  • Hard to microbenchmark because of inaccuracy, garbage collection, optimization over time, etc. There are benchmarking libraries – Caliper or JMH. [or better if don’t need to microbenchmark]
  • Don’t optimize code if you don’t have a problem. What’s your performance requirement? [and is it the bottleneck]. Similarly, don’t optimize the code if the problem lies in the OS or somewhere else.

Case study
This was a live demo. First we saw that not using BufferedReader makes a file slow to read. [not about streams]. Then we watched as JMeter didn’t work on the first try. [the danger of a live demo]. Then he showed how messing with the GC size and making it too small is bad for performance as well [still not on streams]. He is trying to show the process of performance tuning overall. Which is valid info. Just not what I expected this session to be about.

Then [after I didn’t see the stream logic being a problem in the first place], he showed how to solve subproblems and merge them. [oddly not calling it map reduce]

8 minutes before the end of the talk, we finally see the non-parallel code for the case study. It’s interesting code because it uses two terminal operations and two streams. At least reading in the file is done normally. Finally, we see that the combiner is O(n), which prevents speeding it up.

Some rules

  • The workload of the intermediate operations must be great enough to outweigh the overheads. Often quoted as size of data set * processing cost per element
  • sorted() is worse
  • Collectors cost extra. toMap(): merging maps is slow. toList() and toSet() are dominated by the accumulator.
  • In the real world, the fork/join pool doesn’t operate in isolation

My impressions: A large amount of this presentation wasn’t stream performance. Then the case study shows that reading without a BufferedReader is slow. [no kidding]. I feel like the example was contrived and we “learned” that poorly written code behaves poorly. I was hoping the talk would actually be about parallelization. When parallelStream() saves time and when it doesn’t, for example. What I learned was for this particular scenario, parallelization wasn’t helpful. And then right at the end, the generic rules. Which felt rushed and thrown at us.

working remotely successfully – brad greenlee – qcon

Etsy does remote better than anyplace else he worked. A lot of people are in the Brooklyn office, in other offices, or working from a home office. He uses “remotes” as a noun as shorthand for “remote employees”.

Advice for Organizations
Number one factor for success is critical mass. Having one remote on the team doesn’t work. Having enough makes communication happen in a remote friendly format. Using chat/email/video conferencing rather than in person/physical whiteboards.

Communication

  • Chat – like IRC or Slack. They use channels; not just one on one chat like Sametime or Lync. Have #remotes channel. Virtual water cooler.
  • Shorter, more frequent interactions build stronger bonds than longer, less frequent ones.
  • Etsy is a “reply all” email culture. Use ignore/mute feature so not reading all.
  • A/V – 4 full timers work on A/V. Google Hangouts wasn’t enough. Switched to Vidyo. Remotes type in name of room to join video conference. Remotes are never late to meetings, so a remote showing up on screen reminds the previous meeting to end
  • Make it easy. Don’t want resistance from on sites to including remotes

Info sharing

  • Can attend talks remotely or watch them later
  • Too many people for monthly all hands to attend in person

Other tips

  • For larger meetings, have remote advocate in room – make sure speaker repeats questions, remotes are heard, etc.
  • Weekly one on ones are more important. Might be only (virtual) face to face. And chance to give inside info everyone in the office knows.
  • Daily standups. Video conferencing in hasn’t worked well because clustered around computer in open office. Async check ins worked better because cross time zones. [we have remotes by phone and it has been smooth. We have a room with a door though and don’t try to do video]
  • Make visits special. Take time to talk, have lunch, etc. Make sure to have seat/monitor/etc
  • Try to go remote once in a while to see what person is dealing with

Policies

  • Remotes can visit any time they want and company pays for. He visits quarterly. Goal: appreciation
  • Local employees aren’t free. Company pays for desk. Should pay for remote to have good space too.
  • Responsive IT group.
  • Mailed hoodies in advance so everyone got on same day
  • Be mindful of decisions and how they affect remotes. Ex: Friday afternoon beers exclude remotes

Obstacles

  • Open office plans suck – not many quiet spaces to speak to remote
  • Remote collaboration is hard – ex: pair programming. Haven’t gotten enough experience with a tool to get past this. Don’t have a good virtual whiteboard tool yet.
  • Fear of remotes/fear of unknown

Advice for remotes

  • Visit at least once a quarter. Socialize when there. Visiting to talk to people; not to sit in corner and code
  • Make sure have proper work environment at home or find a co-working space
  • Some people need a dedicated space for work at home
  • Don’t forget to go “home” at end of work day. If start at 7am to sync with East Coast, don’t feel bad about ending at 3pm.
  • At disadvantage in being heard/seen, so put extra effort into being noticed.
  • Support each other; talk to other remotes
  • Mixer app – created app to randomly pair people and suggest they talk

Q&A

  • With open office, do headsets help with noise? Sometimes. Other times, cut in and out. People often don’t have a quick call to avoid disturbing neighbors
  • How do you evaluate remote people? Same as in office. What do you get done
  • How look for workers and find those good fit for remote culture? Networking. % of remotes depends on job function
  • If all remotes used to be in person and are now far away, don’t have critical mass. How address? People at company a long time tend to handle being the only remote person better. [presumably because already have network]
  • Agile and remotes? There was a presumption this can’t work. [I disagree and commented to that effect.]

Good talk from both points of view (company/team and remotes). I also saw an underlying theme that Etsy supports remote to get the best employees. Not to save money on office space. Good intent.