[2023 kcdc] busy developer’s guide to next generation languages

Speaker: Ted Neward

Twitter: @tedneward

For more, see the table of contents.


General

  • Covering 10 next gen languages
  • The languages we use here are old enough to drink
  • However, the world has changed. Problems have changed
  • We got lazy and added features to general purpose programming languages
  • “How many of you like abstractFactory.impl.impl”
  • If you don’t know all the features of your chosen language, maybe it is too complicated
  • Excel is the world’s most popular functional programming language. If you change a cell, everything in the dependency tree changes.
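The Excel point can be sketched as a toy spreadsheet. This is only an illustration in Python; the Sheet class and the cell names are hypothetical and not from the talk. Formula cells are re-evaluated on read, so changing one cell changes everything that depends on it.

```python
class Sheet:
    """Toy spreadsheet: a cell holds either a constant or a formula."""

    def __init__(self):
        self._cells = {}

    def set(self, name, value):
        # value is a constant, or a function taking the sheet (a formula)
        self._cells[name] = value

    def get(self, name):
        cell = self._cells[name]
        # Formulas are re-evaluated on every read, so a change to any
        # cell they depend on is picked up automatically.
        return cell(self) if callable(cell) else cell


sheet = Sheet()
sheet.set("A1", 2)
sheet.set("A2", 3)
sheet.set("A3", lambda s: s.get("A1") + s.get("A2"))   # =A1+A2
sheet.set("A4", lambda s: s.get("A3") * 10)            # =A3*10

before = sheet.get("A4")    # 50
sheet.set("A1", 5)          # change one cell...
after = sheet.get("A4")     # ...and the dependent cells follow: 80
```

A real spreadsheet pushes changes through a dependency graph rather than recomputing on read, but the observable behavior is the same.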

Crystal

  • crystal-lang.org
  • Online playground – https://play.crystal-lang.org/#/cr
  • native compilation via LLVM (Low Level Virtual Machine) toolchain
  • interoperates with other LLVM based platforms (ex: GraalVM)
  • heavily inspired by Ruby but has native performance
  • created specifically to tap into AI
  • statically type checked/type inferred
  • non-nilable types (compile time nil checks)
  • macro metaprogramming system
  • creates an executable

Julia

  • julialang.org
  • interactive shell: https://julialang.org/learning/tryjulia/
  • decently known in R/math/science community
  • compiled (via LLVM)
  • direct support for complex and rational numbers
  • OO and functional via multiple dispatch
  • dynamically typed
  • parallel/async/multithreaded
  • metaprogramming (code is data; data is code) – ex: Meta.parse("1 + 1")
  • good candidate for parallelizable math
  • can call from C
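The "code is data" idea can be shown with a rough analog in Python's standard ast module. This is not Julia code and not from the talk; it is a sketch of the same concept: parse source text into a tree, change the tree, then run it.

```python
import ast

# Parse source text into a syntax tree (data we can inspect and change).
tree = ast.parse("1 + 1", mode="eval")
is_addition = isinstance(tree.body.op, ast.Add)
original = eval(compile(tree, filename="<ast>", mode="eval"))

# Rewrite the tree: turn "1 + 1" into "1 * 20".
tree.body.op = ast.Mult()
tree.body.right = ast.Constant(value=20)
ast.fix_missing_locations(tree)   # fill in line info for the new node

# Turn the modified data back into running code.
rewritten = eval(compile(tree, filename="<ast>", mode="eval"))
```

Here original is the value of the parsed expression and rewritten is the value after the tree was edited.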

IO

  • iolanguage.org
  • IO for Graal
  • Development has ceased. Original creator proved his point. Others have set it up on top of other platforms, and those ports are active. Ex: IO for Java
  • homoiconic language – all values are objects; everything is a message
  • no keywords
  • will hurt brain until it clicks

Flix

  • flix.dev
  • functional-first, imperative, and logic language
  • runs on JVM
  • algebraic data types and pattern matching
  • Java took these features
  • easy to mix pure and impure code (re side effects)
  • First class Datalog constraints (based on Prolog) – rules and rules chaining
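To make "rules and rules chaining" concrete, here is a minimal Python sketch of how a Datalog-style engine derives new facts until nothing more can be derived. The rule notation in the docstring is generic Datalog, not Flix syntax, and derive_paths is a hypothetical name.

```python
def derive_paths(edges):
    """Naive Datalog-style evaluation of two rules:

    Path(x, y) :- Edge(x, y).
    Path(x, z) :- Path(x, y), Edge(y, z).
    """
    paths = set(edges)                      # rule 1
    while True:
        new = {(x, z)
               for (x, y) in paths
               for (y2, z) in edges
               if y == y2} - paths          # rule 2, chained on earlier results
        if not new:                         # fixpoint: nothing left to derive
            return paths
        paths |= new


edges = {("a", "b"), ("b", "c"), ("c", "d")}
paths = derive_paths(edges)
```

The fact ("a", "d") is never stated; it only appears because the second rule is applied to facts that the rules themselves produced.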

Pony

  • ponylang.io
  • statically typed, OO
  • uses actor model
  • capabilities-secure: type safe, memory safe, exception safe, no deadlocks, no data races
  • high performance
  • philosophy: get stuff done
  • guarantees that if it compiles, it won’t crash, etc
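The actor model can be approximated in Python with a thread and a queue. This is only a sketch of the idea (CounterActor and its messages are hypothetical); unlike Pony, Python does not check any of this at compile time.

```python
import queue
import threading


class CounterActor:
    """An actor owns its state; others interact only by sending messages."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0                     # touched only by the actor thread
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message):
        self._mailbox.put(message)          # asynchronous: returns at once

    def _run(self):
        while True:
            message = self._mailbox.get()   # handle one message at a time
            if message == "stop":
                return
            if message == "increment":
                self._count += 1

    def stop_and_read(self):
        self.send("stop")
        self._thread.join()                 # actor has finished; safe to read
        return self._count


def send_many(actor, times):
    for _ in range(times):
        actor.send("increment")


actor = CounterActor()
senders = [threading.Thread(target=send_many, args=(actor, 100)) for _ in range(4)]
for sender in senders:
    sender.start()
for sender in senders:
    sender.join()
total = actor.stop_and_read()
```

Four threads send concurrently, yet no lock is needed around the counter because only the actor's own thread ever modifies it.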

My take

Good high level overview of many things. Good to see code examples for each as well. Also interesting that he presented out of HTML and Dropbox. It worked well. I left when there were 10 minutes left (and 5 languages left) because my session is right after this. It was hard to leave; the session was excellent.

[2023 kcdc] DRYing out your GitLab Pipeline

Speaker: Lynn Owens

For more, see the table of contents.


Intro/Problem

  • Every GitLab project has its own .gitlab-ci.yml file. Great for getting started
  • You quickly end up with hundreds of projects
  • Goal is to eliminate copy/paste by centralizing in a few projects

What NAIC has

  • 200+ projects maintained by 11 teams in 2 dev orgs
  • Pipeline is inner source
  • Version 6 of pipeline; working on version 7
  • Reduced maintenance burden by making change once and not in each project
  • Hosted directly on gitlab.com

Milestone 1 – Hidden jobs for pipeline project

  • GitLab has “hidden” jobs
  • Start with a period
  • Don’t appear in any pipeline; just for the common code
  • The “pipeline” project has a .gitlab-ci-base.yml which contains common code
  • Common code makes no assumptions about teams and is configurable for all known use cases
  • v1 was about two dozen lines of common code
  • The client projects include the pipeline code (an include can point at any project in GitLab, so the project doesn’t need to be yours)
include:
  - project: 'NAIC/pipeline'
    file: '/.gitlab-ci-base.yml'
  • Then added jobs that extend the hidden jobs to call functions in the base code. Here deploy_foo is the client job and .deploy_s3 is the hidden job in the base code
deploy_foo:
  stage: deploy
  extends: .deploy_s3
  variables:
   ...
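What extending a hidden job does can be modeled roughly in Python. GitLab's documentation describes the merge as combining hashes but not arrays; the function below is a simplified model of that behavior, not GitLab's implementation, and the job contents (script, bucket, and region values) are made up for illustration.

```python
def merge_job(base, override):
    """Merge a client job on top of a hidden job.

    Mappings are merged key by key; any other value (strings, lists)
    in the client job replaces the base value outright.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_job(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical hidden job from the base file and a client job extending it.
hidden_deploy_s3 = {
    "script": ["aws s3 sync ./dist s3://$BUCKET"],
    "variables": {"BUCKET": "default-bucket", "REGION": "us-east-1"},
}
deploy_foo = {
    "stage": "deploy",
    "variables": {"BUCKET": "foo-bucket"},
}

effective = merge_job(hidden_deploy_s3, deploy_foo)
```

The client job only states what differs (the stage and one variable); the script and the remaining variables come from the common code.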

Suggested practices

  • Advises against pinning the pipeline to a tag because projects then don’t get bug fixes and everyone has to upgrade manually
  • Don’t include stages in the pipeline as it forces one opinion on everyone. Many groups had already written a pipeline for their own use case and they were not all the same.

Milestone 2 – Profiles

  • Found a half dozen use cases. ex: Maven for Java, NPM building Angular etc.
  • The .gitlab-ci.yml in each project was a copy/paste of the others with the same use case.
  • Made profiles/maven-java.yml and the like in the common pipeline project
  • Profiles are not one size fits all: there are a bunch of different ones, and a project can still use the milestone 1 approach instead.

Milestone 3 – Pipeline scripts

  • Common code like logging, calling REST APIs, etc
  • Switched from bash scripting to Python so the common code lives in modules and the modules can be unit tested
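As a sketch of what "common code in modules with unit tests" could look like, here is a small logging helper with a test case. The function name and log format are hypothetical; the talk did not show the actual scripts.

```python
import unittest


def format_log(level, message, job="unknown"):
    """Shared log format so every job's output looks the same."""
    return f"[{level.upper()}] [{job}] {message}"


class FormatLogTest(unittest.TestCase):
    def test_includes_level_and_job(self):
        line = format_log("info", "deploy started", job="deploy_foo")
        self.assertEqual(line, "[INFO] [deploy_foo] deploy started")

    def test_job_defaults_to_unknown(self):
        self.assertEqual(format_log("warn", "x"), "[WARN] [unknown] x")
```

With bash, a helper like this would be hard to test in isolation; as a Python function it can run under any standard test runner before the pipeline itself changes.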

Options to get scripts

  • Could have the pipeline create a tar.zip and upload to a repo. This is a little slow
  • Could have a global before_script that does a git clone of pipeline-scripts. Uses a network connection
  • Could bake the scripts into an image. Requires a pipeline

If he were doing it again, he wouldn’t create a separate pipeline-scripts project because it is tightly coupled to the pipeline. That doesn’t change the problem of using the scripts though.

Testing

  • If client projects are all using the default branch, small changes will affect them all.
  • Use a testing framework for script code (ex: python/go)
  • Follow development practices
  • Write a sample app for each profile. Have the common pipeline trigger a downstream pipeline on this project. For any merge to master, the downstream jobs must pass.
  • Before major refactors, inventory profile jobs and audit afterwards.

Milestone 4 – Profile Fragments

  • Had about 24 profiles (ex: maven-java-jar, maven-java-pom, maven-java-k8s, etc)
  • Typically three components – build tool, language, deployment method
  • These profiles had a lot of copy/paste
  • Decomposed into fragments (ex: maven, npm, java, angular, k8s, s3)
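The decomposition can be sketched as picking one fragment per component. The fragment file paths and the compose_profile function below are assumptions for illustration; the real layout of the NAIC project was not shown.

```python
# Hypothetical fragment files, one per building block.
FRAGMENTS = {
    "build": {"maven": "fragments/maven.yml", "npm": "fragments/npm.yml"},
    "language": {"java": "fragments/java.yml", "angular": "fragments/angular.yml"},
    "deploy": {"k8s": "fragments/k8s.yml", "s3": "fragments/s3.yml"},
}


def compose_profile(build, language, deploy):
    """Return the list of fragment files that make up one profile."""
    chosen = {"build": build, "language": language, "deploy": deploy}
    files = []
    for kind, name in chosen.items():
        if name not in FRAGMENTS[kind]:
            raise ValueError(f"unknown {kind} fragment: {name}")
        files.append(FRAGMENTS[kind][name])
    return files


maven_java_k8s = compose_profile("maven", "java", "k8s")
```

Six fragments cover eight possible combinations here, which is why fragments cut down the copy/paste that two dozen hand-written profiles had.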

Selling the idea

  • Needed to convince people to use this pipeline instead of writing their own or using another team’s.
  • Offer flexibility
  • Show value
  • Follow semantic versioning to the T (he tags every merge to master of the pipeline even though he encourages use of the default branch. The tags are good rollback points and help if a project needs something older)
  • Changelog everything
  • Document well
  • Train and evangelize
  • Record training so there is a library of sessions
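Since the tags double as rollback points, here is a minimal sketch of picking one. It assumes plain MAJOR.MINOR.PATCH tags with an optional leading "v" and ignores pre-release suffixes; the tag values and function names are made up.

```python
import re

TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse(tag):
    """Turn 'v6.2.10' into (6, 2, 10) so tags compare numerically."""
    match = TAG.match(tag)
    if not match:
        raise ValueError(f"not a MAJOR.MINOR.PATCH tag: {tag}")
    return tuple(int(part) for part in match.groups())


def rollback_target(tags, broken):
    """Newest tag that is older than the broken one, or None."""
    older = [t for t in tags if parse(t) < parse(broken)]
    return max(older, key=parse) if older else None


tags = ["v6.2.9", "v6.2.10", "v6.10.0", "v6.3.0"]
target = rollback_target(tags, "v6.10.0")
```

Comparing the parsed tuples matters: as plain strings, "v6.10.0" sorts before "v6.3.0", which would pick the wrong rollback point.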

My take

This was a good case study and useful to see concrete examples and techniques. I wish we could see the code, but I understand that belongs to their org.

[2023 kcdc] cve 101: the unfolding of a zero day attack

Speaker: Theresa Mammarella

Twitter: @t_mammarella

For more, see the table of contents.


Notes

  • Annual cost of cyber crime is predicted to top 8 trillion dollars. Only the US and China have a GDP larger than that

Terminology

  • Vulnerability – weakness/flaw in system
  • Threat – attack vector, potential action
  • Risk – probable frequency of loss.
  • Goal of cybersecurity is to minimize risk. Can’t control intent to do harm so focus on vulnerability

CVEs

  • CVE – Common Vulnerabilities and Exposures
  • Format CVE-xxxx-yyyyy. xxxx = year it came out. yyyyy = identifier
  • CVSS scoring – how bad is it on a scale of 0-10. Ten is worst
  • CVSS score has three parts – base (exploitability, impact), temporal, environmental. Good description here
  • The base score is the one we see on the CVE
  • A CVE can be rejected. The number stays used and cannot be reused. Example: someone thought they had found a vulnerability, but the investigation was flawed and it was not an actual issue. Story about it here.
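The ID format and the 0-10 scale can be sketched in a few lines of Python. The function names are hypothetical; the ID pattern (four-digit year, then four or more digits) and the severity bands follow the published CVE ID syntax and the CVSS v3.x qualitative rating scale.

```python
import re

# "CVE-", a four-digit year, then a sequence of four or more digits.
CVE_ID = re.compile(r"^CVE-(\d{4})-(\d{4,})$")


def parse_cve_id(cve_id):
    """Split a CVE ID into (year, identifier)."""
    match = CVE_ID.match(cve_id)
    if not match:
        raise ValueError(f"not a CVE ID: {cve_id}")
    return int(match.group(1)), match.group(2)


def severity(score):
    """Qualitative rating for a CVSS v3.x score."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


year, identifier = parse_cve_id("CVE-2021-44228")
```

The identifier is kept as a string because it is a label, not a quantity, and it can be longer than four digits.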

How to talk about

  • Private disclosure – organization can choose when/whether to fix/share
  • Coordinated/responsible disclosure – best practice – agreed upon time frame
  • Full/public disclosure – share everything
  • Best to report via company website, security.md file, security files on server, github private vulnerability reporting

Zero day vulnerability

Examples

  • Log4Shell (Log4j) – remote code loading. Was reported responsibly, but the fix was incomplete, so the follow-up CVEs were zero days
  • Could be as simple as a bounds check. For OpenSSL, they announced that something big was coming and to get ready. When it was announced, we learned it only affected OpenSSL 3 (not earlier versions) and was high, not critical, so it was a boy who cried wolf situation.

Security Practices for Developers

  • Insider threat includes poor training
  • A lot more developers than info security. Increasingly harder for security teams to keep up.
  • Cost of finding and fixing bugs increases over time
  • Ask: does this touch the internet? Take untrusted input? Handle sensitive data?
  • OWASP Top 10. Updated in 2021 to add insecure design, software/data integrity failures and server side request forgery (SSRF). Some merged such as injection.
  • Starting OWASP Top 10 for Large Language Model Applications. A draft version is available
  • mitre/hipcheck – scorecard for supply chain risk. Similarly, Sonatype security rating and OpenSSF Scorecard
  • Open source dependency management. Open source is embedded in many projects; on average 90% of an app is open source. North Korea attacked many apps including PuTTY

Attack types

  • Typosquatting – look alike domain with one or two wrong characters
  • Open source repo attacks – attempt to get malware/weakness added into dependency source
  • Build tool attacks
  • Dependency confusion – different version that shows up as latest
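One way to catch lookalike names is to compare a dependency name against a list of names you already trust. This is a minimal sketch, applied to package names rather than domains; the function names, the trusted list, and the distance threshold of 2 are all assumptions, and real scanners use more signals than edit distance.

```python
def edit_distance(a, b):
    """Levenshtein distance: single-character edits to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                        # deletion
                current[j - 1] + 1,                     # insertion
                previous[j - 1] + (char_a != char_b),   # substitution
            ))
        previous = current
    return previous[-1]


def possible_typosquats(name, trusted, max_distance=2):
    """Trusted names that `name` nearly, but not exactly, matches."""
    return [t for t in trusted if 0 < edit_distance(name, t) <= max_distance]


trusted = ["requests", "numpy"]
suspicious = possible_typosquats("reqeusts", trusted)
```

An exact match returns an empty list, so only near misses such as swapped or missing letters are flagged for a human to review.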

Trust?

  • Sometimes third party projects. ex: OpenSSF Scorecard
  • NPM and PyPI often have supply chain attacks. Maven Central more so
  • Scanning tools to find issues can be helpful
  • You are responsible when things go wrong

My take

Good talk. Covered concepts and good real life examples. I learned a few things like the OWASP Top 10 for LLMs. Appreciated the shout out to “the Java people in the front row” when talking about log4j. I added a few links in my blog that weren’t in the original presentation for things I wanted to learn more about.