[2019 oracle code one] Machine Learning

Machine Learning for Java Developers in 45 Minutes

Speakers: Zoran Sevarac & Frank Greco – @zsevarac & @frankgreco

For more blog posts, see The Oracle Code One table of contents


General

  • “AI is the new electricity” – Andrew Ng (societies with AI end up ahead of those without)
  • For many tasks, algorithms are well known
  • Other problems are harder, e.g. image recognition. A rule-based approach means constantly adding rules until the rule set is large and complex.
  • When complexity goes up, bells should go off. Avoid complexity.
  • When the complexity index is too big, it isn’t scalable. Breeding ground for bugs.
  • Not all use cases are good for ML
  • Core of ML – recognizing patterns in data and making predictions against the data
  • Learn language by understanding all the rules (algorithm) or observing patterns (ML)
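
The contrast between writing the rules yourself and letting the machine observe patterns can be sketched in a few lines of Java. This is my own illustration, not code from the talk: the feature (a count of suspicious words per message), the sample data, the hard-coded cutoff of 5 and the class name `ThresholdSketch` are all assumptions.

```java
import java.util.Arrays;

final class ThresholdSketch {

    // Hand-written rule: the cutoff is a guess baked into the code.
    static boolean ruleBasedIsSpam(int suspiciousWords) {
        return suspiciousWords > 5;
    }

    // "Learned" rule: pick the cutoff that misclassifies the fewest labeled examples.
    static int learnThreshold(int[] counts, boolean[] isSpam) {
        int bestThreshold = 0;
        int fewestErrors = Integer.MAX_VALUE;
        int max = Arrays.stream(counts).max().orElse(0);
        for (int t = 0; t <= max; t++) {
            int errors = 0;
            for (int i = 0; i < counts.length; i++) {
                boolean predicted = counts[i] > t;
                if (predicted != isSpam[i]) {
                    errors++;
                }
            }
            // Strict comparison keeps the lowest cutoff among ties.
            if (errors < fewestErrors) {
                fewestErrors = errors;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }

    public static void main(String[] args) {
        // Made-up training data: suspicious-word counts and their labels.
        int[] counts = {0, 1, 1, 2, 3, 4, 6, 7};
        boolean[] isSpam = {false, false, false, false, true, true, true, true};

        int learned = learnThreshold(counts, isSpam);
        System.out.println("Learned threshold: " + learned);
        System.out.println("Rule flags 4 words as spam: " + ruleBasedIsSpam(4));
        System.out.println("Learned model flags 4 words as spam: " + (4 > learned));
    }
}
```

With this data the hand-written rule misses a message with 4 suspicious words, while the cutoff derived from the labeled examples catches it. Real spam filters use far richer features, but the idea is the same: the data sets the rule, not the programmer.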

Terms

  • AI – type of algorithm where machine emulates aspects of human behavior
  • ML – subset of AI. Allows machine to learn from experience/data
  • Deep learning – subset of ML. Uses powerful computing and advanced neural networks

Deep learning

  • Accuracy grows with more data.
  • Older learning algorithms get outperformed after a certain amount of data.
  • Think of deep learning as a graph. Each node performs computation. Computation can be reconfigured by tweaking coefficients on edges
  • Layer – groups of nodes
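
The graph picture above can be sketched as plain Java. This is a minimal illustration under my own assumptions, not code from the session: the sigmoid activation, the weight values and the class name `LayerSketch` are all made up for the example. Each node sums its weighted inputs (the coefficients on its incoming edges), adds a bias and applies the activation; a layer is just a group of such nodes.

```java
final class LayerSketch {

    // Sigmoid activation squashes any real number into the range (0, 1).
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // One layer: weights[j][i] is the coefficient on the edge from input i to node j.
    static double[] forward(double[] inputs, double[][] weights, double[] biases) {
        double[] outputs = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = biases[j];
            for (int i = 0; i < inputs.length; i++) {
                sum += weights[j][i] * inputs[i];
            }
            outputs[j] = sigmoid(sum);
        }
        return outputs;
    }

    public static void main(String[] args) {
        double[] inputs = {1.0, 0.5};

        // Hidden layer with two nodes, output layer with one node (arbitrary values).
        double[][] hiddenWeights = {{0.4, -0.6}, {0.3, 0.8}};
        double[] hiddenBiases = {0.0, -0.1};
        double[][] outputWeights = {{1.2, -0.7}};
        double[] outputBiases = {0.05};

        double[] hidden = forward(inputs, hiddenWeights, hiddenBiases);
        double[] output = forward(hidden, outputWeights, outputBiases);
        System.out.println("Network output: " + output[0]);
    }
}
```

Changing any entry in the weight arrays changes what the network computes, which is what "tweaking coefficients on edges" amounts to. Training is the process of finding good values for them automatically.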

Examples

  • Image recognition
  • Spam classification
  • Data classification
  • Identifying handwritten characters/image transformation

Data

  • Training data
  • Try to minimize the difference between predicted and expected output as training goes through the data
  • Once the error goes below a certain threshold, training stops
  • Determine whether false positives or false negatives are worse for your use case
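
The "minimize the difference, stop below a threshold" loop can be sketched as follows. This is a toy example under my own assumptions, not the speakers' code: it fits a single coefficient `w` in `y = w * x` with gradient descent on the mean squared error, and the data, learning rate, threshold and class name `TrainingLoopSketch` are all hypothetical.

```java
final class TrainingLoopSketch {

    // Mean squared difference between predictions (w * x) and expected values.
    static double meanSquaredError(double w, double[] xs, double[] ys) {
        double sum = 0.0;
        for (int i = 0; i < xs.length; i++) {
            double diff = w * xs[i] - ys[i];
            sum += diff * diff;
        }
        return sum / xs.length;
    }

    // Gradient descent on a single coefficient; returns the learned w.
    static double train(double[] xs, double[] ys, double learningRate,
                        double errorThreshold, int maxEpochs) {
        double w = 0.0;
        for (int epoch = 0; epoch < maxEpochs; epoch++) {
            if (meanSquaredError(w, xs, ys) < errorThreshold) {
                break; // error is low enough, so training stops
            }
            // Derivative of the mean squared error with respect to w.
            double gradient = 0.0;
            for (int i = 0; i < xs.length; i++) {
                gradient += 2.0 * (w * xs[i] - ys[i]) * xs[i];
            }
            gradient /= xs.length;
            // Step against the gradient to shrink the error.
            w -= learningRate * gradient;
        }
        return w;
    }

    public static void main(String[] args) {
        // Made-up training data that follows y = 2x.
        double[] xs = {1.0, 2.0, 3.0, 4.0};
        double[] ys = {2.0, 4.0, 6.0, 8.0};

        double w = train(xs, ys, 0.05, 1e-6, 10_000);
        System.out.println("Learned w: " + w);
        System.out.println("Final error: " + meanSquaredError(w, xs, ys));
    }
}
```

The `maxEpochs` cap is a safety net for cases where the error never drops below the threshold, for example when the learning rate is too large or the data is noisy.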

JSR381 – Visual Recognition API

  • Standard API for computer vision tasks using machine learning
  • Provides generic ML API design to support other libraries
  • Next phase is to figure out who/what can get it wider support/adoption
  • Brings ML closer to general Java dev audience
  • App programmers need to know this. Don’t need to become a data scientist to use.

Why it matters

  • Patterns
  • Can change data structures
  • The case for Learned Index Structures – https://arxiv.org/abs/1712.01208
  • New hardware for API
  • What happens to countries that host call centers and their economy?

Issues

  • Need clean data
  • Privacy and ethics
  • Correlation vs causality
  • Data hacking/poisoning
  • DeepFakes – can create people that don’t exist
  • Interpretability
  • AI/ML talent is scarce

My take

This was a great way to get started. There were a bunch of code samples as well using Java APIs.
