I was taking a training class this week and was informed that some of the things I do aren’t “common knowledge.” So I’m going to be using the soft skills category in this blog to share some of them. Consider this the first blog post of many.
The advice that started this was a comment I made about blocking out your calendar. If you want to be able to do something, you need to make it a priority. And how do you do that if your calendar tells the world (well, your co-workers), “hey look, Fred is free at this time”?
Personally, I block out time for lunch each day. I also block out “coding blocks” so I have dedicated blocks in which I can focus. Want to do something after work? Block out your calendar starting at 5 or 5:30.
Now I know what you are thinking – that won’t work in my job. We have unplanned events, emergencies, high priority meetings. Well, that’s ok. You can make peace with the idea that you won’t keep all of your blocks. Maybe you want to run at 5pm three days a week. Block out 5pm on all five weekdays. That way you can have two unplanned events and rest assured that you will still run three days a week.
One reason this works is that you regain control of your calendar. If you have it blocked, someone has to ask you (or you have to offer) to free up that time for them, which ensures the reason is important enough to do so.
One of my co-workers started calendar blocks at my suggestion. He still has those blocks in his calendar so it must be working for him too! And if it helps us, why not you as well?
Title: Moving Java Forward Faster
Speakers: Donald Smith
See my live blog table of contents from Oracle Cloud
New Java Release Model
- “no more limos; think trains”
- From Java 2 to Java 8, had target release of every two years. Slipped a bunch of times (once to 5 years)
- From Java 2 to Java 8, did update releases roughly every 6 months. (not counting security releases). These update releases had new APIs or functionality. Ex: 8u20, 8u40, 8u60
- Java 9 was released in September 2017 and is already at end of life
- Tried to make case that Java 10 is really 9.1 and Java 12 is really 10.1 and Java 13 is really 10.2 and so forth. [I don’t think this analogy holds. Talking to Mark Reinhold suggests that they aren’t trying to make major changes specifically for 11.]
- Didn’t use 9.1, 10.1, etc because need major version number to make spec change.
- Carve up changes across releases instead of a major release every few years.
- Every six months is now a feature release and can potentially change the spec.
- Enterprises do not like 6 month releases so every 3 years is LTS release.
- Java 11 is 18.9 LTS. So the LTS still uses the yy.m naming convention. So Java 17 will be the 21.9 LTS. Running java -version gives a vendor string that has both Java 11 and 18.9 in it for Oracle JDK
- Java 11 (18.9 LTS) will be supported for about 5 years plus 3 years of extended support.
- LTS releases will be more stable because people who don’t care about LTS will be using the interim releases.
- Java 6 support ends in late 2018. Java 7 support ends in 2022 and Java 8 ends in 2025. Java 8 might be supported longer. TBD.
- Oracle will be producing OpenJDK binaries vs it being a third party thing. OpenJDK binaries are GPL licensed. They will only be available for 6 months for Java 11, 17, etc
- Open sourcing the commercial features that are part of the JDK – mission control, flight recorder, app class data sharing, java usage tracker (see how many JREs used in system).
- Separately packaged tools will stay commercial
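The yy.m arithmetic above can be sketched roughly as follows. This assumes the six-month cadence holds exactly, counts from Java 10 in March 2018 (18.3), and assumes an LTS every three years starting with Java 11, as described in the talk. The class and method names are my own for illustration and not part of the JDK.

```java
import java.time.YearMonth;

public class ReleaseTrain {
    // Java 10 shipped in March 2018. Assumption: one feature release every 6 months after that.
    private static final YearMonth JAVA_10 = YearMonth.of(2018, 3);

    // Month in which a given feature version would ship under the assumed cadence.
    public static YearMonth releaseMonth(int featureVersion) {
        return JAVA_10.plusMonths(6L * (featureVersion - 10));
    }

    // Format the release month in the yy.m style, e.g. 18.9.
    public static String yyDotM(int featureVersion) {
        YearMonth ym = releaseMonth(featureVersion);
        return (ym.getYear() % 100) + "." + ym.getMonthValue();
    }

    // Assumption from the talk: every 3 years (6 feature releases) is LTS, starting at 11.
    public static boolean isLts(int featureVersion) {
        return featureVersion >= 11 && (featureVersion - 11) % 6 == 0;
    }

    public static void main(String[] args) {
        for (int v = 10; v <= 17; v++) {
            System.out.println("Java " + v + " = " + yyDotM(v) + (isLts(v) ? " LTS" : ""));
        }
    }
}
```

Under those assumptions, Java 11 comes out as 18.9 and Java 17 as 21.9, which matches the naming in the notes.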
What’s new in Java 9
- Last “major” release. 100+ features
- No longer using word “major” because releases are frequent and incremental.
- Jigsaw gives smaller footprint to attack by having fewer modules.
- sun.misc.Unsafe – [the usual so not writing it up]
- AOT compiler – for application as well. Not immediate, but coming.
- jshell – live demo
- G1 is now default garbage collector
Long term goal of jlink
- Horror stories where people need a dozen versions of JREs because apps don’t work with various patches.
- Long term goal – shift thinking of how package apps from standalone JRE to shipping a JRE with the app itself. (for client side apps for users)
- Gut: This makes the problem worse
- jlink lets you create custom runtime optimized for program. Doesn’t have all the modules.
- Get smaller package with just what need.
- [security implications are interesting; have to patch each app but apps far less likely to contain vulnerability]
- jlink also requires packing for hardware
- Goal shifting to jlink because browsers and OS are heading away from one common Java for Windows and Mac. Worried that one day there will be an OS update that will block separate JRE.
- [I asked why it isn’t a problem for developers if OS blocks common java. He said maybe a configuration or a developer build. So power users vs end users problem]
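To get a feel for “doesn’t have all the modules,” here is a small sketch that lists the modules in whatever runtime it is run on. It uses the standard ModuleLayer API from Java 9. The class name RuntimeModules is just mine for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

public class RuntimeModules {
    // Collect the names of all modules in the boot layer of the running JVM, sorted.
    public static Set<String> moduleNames() {
        Set<String> names = new TreeSet<>();
        for (Module m : ModuleLayer.boot().modules()) {
            names.add(m.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        moduleNames().forEach(System.out::println);
    }
}
```

On a full JDK this prints a long list. On a custom runtime built with jlink, it should only show the modules that were linked in, and java.base is always one of them.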
What’s new in Java 10
- First feature release (vs major release)
- Type inference – var x – … (example of a feature that couldn’t be in an update release since changes Java spec)
- G1 garbage collector uses multiple threads
- 12 JEPs (JDK Enhancement Proposals) targeted.
- Open source root certificates. Can connect to many TLS servers out of the box. Vs OpenJDK for RHEL which assumes you have Firefox installed.
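A minimal sketch of the var type inference mentioned above, needing Java 10 or later. The class, method and variable names are made up for the example.

```java
import java.util.ArrayList;

public class VarDemo {
    public static int totalLength() {
        // The compiler infers ArrayList<String> from the right hand side.
        var names = new ArrayList<String>();
        names.add("limo");
        names.add("train");

        // Inferred as int.
        var total = 0;
        // var also works for the loop variable; here it is inferred as String.
        for (var name : names) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalLength());
    }
}
```

The variables are still statically typed. var only saves writing the type on the left hand side, and only works for local variables with an initializer.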
What’s new in Java 11
- 4 JEPs already targeted. Waiting until ready to target
- Removing some APIs
Future – version not yet known
- Optimize for data; not just code. So big data libraries don’t need to call native code
- Project Panama – interoperate with native libraries (better JNI)
- Project Loom – lightweight threads
- Project Valhalla – value types. Maybe Java 12?
Excellent session. I thought I understood the release model and still learned some nuances! Glad he spent more than half the session on this topic. And I hadn’t realized the long term implications for Jigsaw/jlink.
Title: Getting Started with Hadoop, Spark, Hive and Kafka
Speakers: Edelweiss Kammermann
See my live blog table of contents from Oracle Cloud
Nice beginning with picture of Uruguay and a map
- Volume – Lots of data
- Variety – Many different data formats
- Velocity – Data created/consumed quickly
- Veracity – Know data is accurate
- Value – Data has intrinsic value; but have to find it
- Manage huge volumes of data
- Parallel processing
- Highly scalable
- HDFS: Hadoop Distributed File System for storing info
- MapReduce – for processing data. Language/methods inside Hadoop
- Writes data into fixed size blocks
- NameNode – like index, central entry point
- DataNode – store data. Send data to next DataNode and so on until done.
- Fault tolerant – can survive node failure (each DataNode sends heartbeat every 3 seconds to NameNode; assumed dead after 10 minutes), communication failure (DataNode sends ack), data corruption (DataNodes send block report to NameNode of good blocks)
- Can have second NameNode for active/standby config. DataNodes report to both.
- Analyze and query HDFS data to find patterns
- Structure the data into tables so can write SQL-like queries – HiveQL
- HiveQL has multitable insert and cluster by clause
- HiveQL has high latency and lacks a query cache
- Can write in Java, Scala, Python or R
- Fast in-memory data processing engine
- Supports SQL, streaming data, machine learning and graph processing
- Can run standalone, on Hadoop or on Apache Mesos
- Much faster than MapReduce. How much faster depends on whether the data can fit into memory
- Includes packages for core, streaming, SQL, MLlib and GraphX
- RDD (resilient distributed dataset) – immutable programming abstraction of objects collection, can be split across clusters. Can create from text file, sql, nosql, etc
- Can choose which acks need to receive – none, from the leader or from all replicas
- Integrate data from different sources as input/output
- Producer/consumer pattern (called source and sink)
- Incoming messages are stored in topics
- Topics are identified by unique names and split into partitions (for redundancy and partitions)
- Partitions are ordered and each message in a partition has an id called the offset
- Brokers are Kafka servers in a cluster. Recommended to have three
- Define replication factor for data. 2 or 3 is common
- Consumers read data from a topic. They read in order from a partition, but in parallel between partitions.
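To make the topic/partition/offset idea concrete, here is a toy in-memory model. It is not the Kafka client API, just a sketch of the concepts. ToyTopic and its methods are hypothetical names, and using String.hashCode to pick a partition is a stand-in for whatever partitioner a real producer uses.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyTopic {
    // Each partition is an append-only list; a message's index is its offset.
    private final List<List<String>> partitions = new ArrayList<>();

    public ToyTopic(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // Assumption: messages with the same key go to the same partition.
    public int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Append a message and return its offset within the chosen partition.
    public int send(String key, String value) {
        List<String> log = partitions.get(partitionFor(key));
        log.add(value);
        return log.size() - 1;
    }

    // Read a partition in order, starting at the given offset.
    public List<String> read(int partition, int fromOffset) {
        List<String> log = partitions.get(partition);
        return new ArrayList<>(log.subList(fromOffset, log.size()));
    }
}
```

Messages sent with the same key land in the same partition and are read back in the order they were written. Different partitions are independent of each other, which is what allows consumers to read them in parallel.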
Good simplified intro for a bunch of topics. It was good seeing how things fit together. The audience asked what sounded like detailed questions. I would have liked it if they had held those for the end.