[QCon 2019] Low Latency in the Cloud, with OSS

Mark Price @epickrram

For other QCon blog posts, see QCon live blog table of contents

Requirements

  • Trading app
  • Need microsecond (not millisecond) response time
  • Need data in memory vs database
  • Lock free programming
  • Redundancy
  • High volume
  • Predictable latency

Hydra

  • System built on OSS
  • Opinionated framework to accelerate app dev
  • Clients communicate with stateless, scalable gateways
  • Persistors – manage data in memory.
  • Gateway – converts large text message to something smaller and more efficient

Design choices

  • Replay logs to reapply changes. Business logic must be fully deterministic. Bounded recovery times
  • Placement group in cloud – machines guaranteed to be near each other. Minimizes latency between nodes

Testing latency

  • Do as part of CD pipeline
  • Can’t physically monitor with fibertab
  • Capture in histogram to get statistical view and calculate data
  • Test under load
  • Fan out where test from
  • Store % in time series data
  • Can see jigger for garbage collection

Performance on shared box/cloud

  • Not in control of resources running on
  • Containers share L3 cache so can see higher rates of cache miss
  • CPU throttling effects
  • Hard to measure since can’t see what neighbors are doing
  • One option is to rent the largest box possible and compare to vendor website for specs. If have max # cores, know have box to self. Expensive. Was about five dollars a year. At that price, might be worth just buying own machine in data center
  • Can pack non latency services onto shared machines

<missed some at the end. I got an email that distracted me>

My impressions

There was a lot of discussion about the histogram. I would have liked to see some examples rather than just talking about how it is calculated. They didn’t have to be real examples to be useful. There were some interesting facts and it was a good case study so I’m glad I went. I was glad he addressed that non-cloud is a possible option for this scenario

Leave a Reply

Your email address will not be published. Required fields are marked *