How I studied for the AWS Certified Cloud Practitioner Exam

This post describes how I studied (and passed) the exam with an 895. It is not how I recommend studying. Also see:

Constraint 1: Why I took the exam in such a rush

  1. In early January, my employer told me I had to take this exam in the first three months of the year. (Scott suggested I ask if I could take the associate instead. Given the constraints in this post, I don’t think that would have been a good choice!)
  2. In early March, I’m giving two conference presentations: a full day lab that I’ve never given before and a separate presentation that includes Java 12 (which isn’t even final yet, so it requires reading). Later in the month, I’m going on two trips with the FRC (FIRST Robotics Competition) team. And when I get back, I’m putting the finishing touches on being Volunteer Coordinator for the NYC FRC competition in April. So I clearly don’t want to take the exam in March!
  3. The third week of February, the FRC robot is “due”. The programmers get the most time with the robot leading up to that, so I want to spend as much time and energy in the lab as possible. And right after that, I have to get ready for the early March conference. So I clearly don’t want to take the exam in February!
  4. Finally, on January 3rd, Oracle posted on Twitter:
    @jeanneboyarsky @thewiprogrammer @javacert New Java certs coming in early 2019. We hope to have more to post soon!
    I don’t know how Oracle defines “early 2019”. But once the OCA 11 objectives come out, I want that to be the only cert exam in my head!
  5. All this means I was heavily motivated to take and pass the exam as soon as possible.

Constraint 2: Why I wanted to minimize my study time

Amazon/AWS has four levels of certifications:

  • Foundational
  • Associate
  • Professional
  • Specialty

The further down the list you go, the harder the exams get and the more experience you are expected to have. The Foundational exam describes itself as:

useful for individuals in technical, managerial, sales, purchasing, or financial roles who work with the AWS Cloud

That’s quite a span of skillsets, which suggested to me that the exam would be either easy or a pile of memorization. (Spoiler: it was the latter.)

Given that the exam is only $100, I decided to take it quickly. I’d either pass or know what to study for a retake. While I passed on the first try, I don’t think it was by much. Also, Janeice DelVecchio took the exam a week before and scared me into studying more. Which is good. I absolutely would have failed with my original plan!

Constraint 3: Videos are not my preferred mechanism for learning

Most of the exam materials around are videos. I don’t like learning from videos. I like learning from either reading or something interactive. The Amazon videos were even worse for me than your average video, but more on that later.

This means I used a suboptimal study plan to avoid having to watch seven hours of video.

How I actually studied

Here’s how I studied, along with how long I spent on each step and my comments. (The times don’t include writing the study guide or these blog posts.) Remember to see the linked post if you want to see which resources I actually recommend. I’m posting this because it was hard to find out what anyone actually *did* to study or how long they needed. (When reading this, remember that I take mock exams really quickly!)

For the official videos, I read the transcript and then clicked through every few minutes to find the demos to watch. I did watch one video in full because there was no transcript. It was hard to pay attention. There’s no option to view at a faster speed. Watching a cartoon read to me is not my idea of optimal learning.

While I’m complaining about the official videos, the lack of a speed-up option wasn’t my biggest problem. Amazon has the videos set up to pause if you go to a different browser tab, so it becomes difficult to search for more detail while the lecture is going on. And they send you two emails for every video you view.

| Day | What Studied/How long | Comments |
| --- | --- | --- |
| 1/8-1/11 | Read three whitepapers and comparison of support plans listed on exam page. Spent: 6 hours? (didn’t time this, but I know how long my commute is) | The whitepaper listing all the services is long and dry. It covers over 200 services. I should have found out which services were important before reading. The other readings were good even as a starting point. |
| 1/12 | Unofficial 34 minute video with suggestion on how to study and overview of the most important services. Spent: 20 minutes | I don’t like learning from video, but Janeice strongly recommended this. It was good. And at least youtube allows you to watch videos on 2x speed. A number of people have reported getting more than 12% of the questions on billing/pricing. Started Study Reference |
| 1/13 | Started creating list of all services. Spent: 90 minutes before abandoned | Realized there were 150+ services and this wasn’t useful. |
| 1/13 | “Watched” official video: Cloud segment. Spent: 10 minutes | Read the transcript instead of watching the video. |
| 1/13 | Udemy mock exam #1. Spent: 40 minutes | Clearly I wasn’t ready yet. But I wanted to do one a day to focus my studying and treat them like flashcards. |
| 1/14 | “Watched” official video: Pricing. Spent: 10 minutes | Read the transcript instead of watching the video. |
| 1/14 | Did free official 10 questions and Whizlabs free 20 questions. Spent: 30 minutes | Since it was less than a week to the exam, I wanted to get another point of view on what I should be trying to learn |
| 1/15 | Udemy mock exam #2. Spent: 40 minutes | Still nowhere near ready, but skipping a day would invite all the material to fall out of my head. |
| 1/16 | “Watched” official video: Security. Spent: 10 minutes | More transcripts |
| 1/16 | Udemy mock exam #3. Spent: 40 minutes | More practice |
| 1/17 | “Watched” official video: Architecture & bonus materials. Spent: 40 minutes | I actually watched the bonus materials video in full. This one had a presenter who was actually interacting with the content. Figures that the video I liked the best was out of scope! |
| 1/17 | Udemy mock exam #4. Spent: 40 minutes | At this point, I was starting to feel ready. (I wasn’t. This was an illusion because all the available practice materials were easier than the exam.) |
| 1/18 | “Watched” official video: Core services. Redid all end of video questions. Spent: 2-3 hours | This freaked me out. I did significantly worse on the end of video questions than after my first attempt proving I didn’t retain the important stuff. |
| 1/18 | Udemy mock exam #5. Spent: 40 minutes | Don’t take this one. It has so much out of scope material, that all it does is scare you! |
| 1/19 | Udemy mock exam #6. Spent: 40 minutes | This wasn’t as scary as exam 5 but it also had a lot of stuff that was out of scope |
| 1/19 | Redid exams for muscle memory/review: Udemy mock exam 1-3; Amazon’s 10 practice questions; Whizlabs 20 questions; repeated end of chapter questions until I had them memorized. Spent: 3 hours | This was helpful. Both to review facts and get things loaded into my short term memory the morning of the test. |
| 1/19 | Skimmed study guide one final time. Spent: 15 minutes | Last minute subconscious |

Other studying – I have no idea how much time I spent with these:

  • Studying from my study guide: I brought my study guide with me everywhere the day before the test. I looked at it on and off on the subway and at the 5 hour robotics meeting. I don’t know how much time I actually spent looking at it. I also carried around pieces of it at robotics meetings earlier in the week.
  • Verbalizing facts – I added some facts I needed to memorize to conversations to help retain them.
  • I had signed up as “backup” at the Toastmasters meeting two days before the exam. The day before that, I was assigned a speaking slot. On the subway ride home that night, I wrote a speech that was “sort of” an entertaining story about clouds, mentoring, speaking and Amazon keywords. I wasn’t able to deliver it without notes because it had the flow of Alice in Wonderland’s rabbit hole. But again, I was saying the keywords out loud, which makes them more memorable.

How I did on the mocks

Many people find it useful to compare how they are doing on mock exams to how someone else did to see if they are “ready” and gauge studying.

There are multiple scores for each representing multiple attempts.

| Source | Scores |
| --- | --- |
| Cloud concepts video | 100%, 100% |
| Core services video | 60%, 80%, 100% |
| Security video | 50%, 40%, 70%, 100% |
| Architecture video | 60%, 100% |
| Pricing & Support video | 56%, 77%, 100% |
| Amazon official free 10 questions | 90%, 100% |
| Whizlabs free 20 questions | 85%, 95% |
| Udemy mock #1 | 81%, 92% |
| Udemy mock #2 | 71%, 98% |
| Udemy mock #3 | 72%, 87% |
| Udemy mock #4 | 82% |
| Udemy mock #5 | 60% |
| Udemy mock #6 | 68% |

Jeanne’s experiences taking the AWS Certified Cloud Practitioner Exam

Yesterday, I took and passed the AWS Certified Cloud Practitioner Exam. Also see:

Registering for the exam

Registering was pretty easy. You enter your zip code and it tells you the next available exam date of the nearby centers. You can click on each one to see actual dates/times. There were both weekday and weekend choices which was nice.

The exam center

I took the exam at “Forest Hills Brainseed.” Being able to walk to the venue was nice because I could leave “unnecessary objects” at home!

The center had a locker for your stuff. You hold the key during the exam. They weren’t strict about what you put in the locker. I kept my credit card and tissues in my pockets. (Some centers have made me empty my pockets.) The center keeps your driver’s license while you are in the exam room.

Like most exams, you are entitled to a writing utensil and something to write on. This center uses paper and pencil. I haven’t gotten physical paper at an exam in ages. It was *so* nice.

This center also provides ear plugs which I didn’t need.

The center had two rooms of 10 computers each. My room was less than half filled, but it was still hot. Luckily, I wore a short sleeve t-shirt under a sweatshirt so could just remove my sweatshirt.

The actual exam

The software didn’t log me in at first. The center had me change computers and then it worked.

After 22 minutes, I had completed my first pass of the exam and was 100% confident on 44 of the answers. (Passing is a little higher than that.) Luckily I was 50% confident of others. I did some more review but turned it in with about 50 minutes left. (I always finish cert exams quickly.)

After you click “End test”, you get 6-9 survey questions. Then you get your pass/fail result. One to five days later, you’ll get an email with your actual score. (Janeice and I both got the score one day later.) Given that this is a pass/fail exam, the score isn’t important to me. That said, I did get a good score: 895 (out of 1000).

How to view your score

  • Sign in to the AWS Training and Certification Portal.
  • Click the ‘Certification’ tab.
  • Click the ‘AWS Certification Account’ button.
  • Click “Previous Exams”
  • Click “Download” on the right hand side

The Amazon AWS Java SQS Client is *Not* Thread-safe

I recently added long polling to an Amazon SQS project that processes thousands of messages a minute. The idea was simple:

  • Spawn N threads (let’s say 60) that repeatedly check an SQS queue using long polling
  • Each thread waits for at most one message for maximum concurrency, restarting if no message is found
  • Each time a message is found, the thread processes it and ACKs via deleteMessage() (failure to do so causes the message to go back on the queue after the visibility timeout is reached)

For convenience, I used the Java Concurrency API ScheduledExecutorService.scheduleWithFixedDelay() method, giving each thread a 1 millisecond delay, although I could have accomplished the same thing using the Thread class and an infinite while() loop. With short polling, this kind of structure would tend to thrash, but with long polling, each thread is just waiting when there are no messages available. Note: For whatever reason, Java does not allow a 0 millisecond delay for this method, so 1 millisecond it is!
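
The zero-delay restriction is easy to check with a small JDK-only sketch. This is not the project code: pollOnce() is a hypothetical stand-in for a long-polling receive, and the worker count and run time are arbitrary. It confirms that scheduleWithFixedDelay() rejects a 0 millisecond delay with an IllegalArgumentException, then runs a few workers with the 1 millisecond delay.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FixedDelayPollingSketch {

    // Hypothetical stand-in for one long-polling receive call.
    static void pollOnce(AtomicInteger polls) {
        polls.incrementAndGet();
    }

    // Returns true if the executor refuses the given delay between runs.
    static boolean rejectsDelay(long delayMillis) {
        ScheduledExecutorService service = Executors.newScheduledThreadPool(1);
        try {
            service.scheduleWithFixedDelay(() -> { }, 0, delayMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        } finally {
            service.shutdownNow();
        }
    }

    // Runs one repeating poll task per worker for roughly runMillis; returns total polls.
    static int runPollers(int workers, long runMillis) {
        AtomicInteger polls = new AtomicInteger();
        ScheduledExecutorService service = Executors.newScheduledThreadPool(workers);
        try {
            for (int i = 0; i < workers; i++) {
                // Each task starts again 1 ms after its previous run finishes.
                service.scheduleWithFixedDelay(() -> pollOnce(polls), 0, 1, TimeUnit.MILLISECONDS);
            }
            Thread.sleep(runMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            service.shutdownNow();
        }
        return polls.get();
    }

    public static void main(String[] args) {
        System.out.println("0 ms delay rejected: " + rejectsDelay(0));
        System.out.println("1 ms delay rejected: " + rejectsDelay(1));
        System.out.println("polls in 200 ms: " + runPollers(3, 200));
    }
}
```

Because the delay is measured from the end of one run to the start of the next, a task that blocks for 20 seconds in a long poll simply runs again 1 millisecond after it returns.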

Noticing the Problem
When I started testing my new version based on long polling, I noticed something quite odd. While the messages all seemed to be processed quickly (1-10 milliseconds) and there were no errors in the logs, the AWS Console showed 50+ messages in-flight. Based on the number of messages being processed a second and the time it was taking to process them, the in-flight counter should have been only 3-4 messages at any given time, but it consistently stayed high.

Isolating the Issue
I knew it had something to do with long polling, since previously with short polling I never saw that many messages consistently in flight, but it took a long time to isolate the bug. I discovered that in certain circumstances the Amazon AWS Java SQS Client is not thread-safe. Apparently, the deleteMessage() call can block if too many other threads are performing long polling. For example, if you set the long polling to 10 seconds, the deleteMessage() can block for 10 seconds. If you set long polling to 20 seconds, the deleteMessage() can block for 20 seconds, and so on. Below is a sample class which reproduces the issue. You may have to run it multiple times and/or increase the number of polling threads, but you should see intermittent delays in deleting messages between the two println() calls on Lines 25 and 27 of the listing (counting from the package statement).

package net.selikoff.aws;

import java.util.concurrent.*;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.sqs.*;
import com.amazonaws.services.sqs.model.*;

public class SQSThreadSafeIssue {
	private final String queueName;
	private final AmazonSQS sqsClient;
	private final int numberOfThreads;
	
	public SQSThreadSafeIssue(Regions region, String queueName, int numberOfThreads) {
		super();
		this.queueName = queueName;
		this.sqsClient = AmazonSQSClientBuilder.standard().withRegion(region).build(); // Relies on locally available AWS creds
		this.numberOfThreads = numberOfThreads;
	}
	
	private void readAndProcessMessages(ReceiveMessageRequest receiveMessageRequest) {
		final ReceiveMessageResult result = sqsClient.receiveMessage(receiveMessageRequest);
		if(result!=null && result.getMessages()!=null && result.getMessages().size()>0) {
			result.getMessages().forEach(m -> {
				final long start = System.currentTimeMillis();
				System.out.println("1: Message read from queue");
				sqsClient.deleteMessage(new DeleteMessageRequest(queueName, m.getReceiptHandle()));
				System.out.println("2: Message deleted from queue in "+(System.currentTimeMillis()-start)+" milliseconds");
			});
		}
	}
	
	private void createMessages(int count) {
		for(int i=0; i<count; i++) {
			sqsClient.sendMessage(queueName, "test "+System.currentTimeMillis());
		}
	}
	
	public void produceThreadSafeProblem(int numberOfMessagesToAdd) {
		// Start up and add some messages to the queue
		createMessages(numberOfMessagesToAdd);
		
		// Create thread executor service
		final ScheduledExecutorService queueManagerService = Executors.newScheduledThreadPool(numberOfThreads);
		
		// Create reusable request object with 20 second long polling
		final ReceiveMessageRequest receiveMessageRequest = new ReceiveMessageRequest();
		receiveMessageRequest.setQueueUrl(queueName);
		receiveMessageRequest.setMaxNumberOfMessages(1);
		receiveMessageRequest.setWaitTimeSeconds(20);
		
		// Schedule some thread processors
		for(int i=0; i<numberOfThreads; i++) {
			queueManagerService.scheduleWithFixedDelay(() -> readAndProcessMessages(receiveMessageRequest),0,1,TimeUnit.MILLISECONDS);
		}
	}
	
	public static void main(String[] args) {
		final SQSThreadSafeIssue issue = new SQSThreadSafeIssue(Regions.YOUR_REGION_HERE,"YOUR_QUEUE_NAME_HERE",60);
		issue.produceThreadSafeProblem(5);
	}
}

And below is a sample output of this, showing that each message took 20 seconds (the long polling time) to be deleted.

1: Message read from queue
1: Message read from queue
1: Message read from queue
1: Message read from queue
1: Message read from queue
2: Message deleted from queue in 20059 milliseconds
2: Message deleted from queue in 20098 milliseconds
2: Message deleted from queue in 20024 milliseconds
2: Message deleted from queue in 20035 milliseconds
2: Message deleted from queue in 20038 milliseconds

Note: The SQSThreadSafeIssue class requires Java 8 or higher along with the following libraries to compile and run. It uses the latest version of the Amazon AWS Java SDK 1.11.278 available from AWS (although not in mvnrepository.com yet):

Understanding the Problem
Now that we see messages are taking 20 seconds (the long polling time) to be deleted, the large number of messages in-flight makes total sense. If the messages are taking 20 seconds to be deleted, what we are seeing is the total number of in-flight messages over the last 20 second window waiting to be deleted, which is not a ‘true measure’ of in-flight messages actually being processed. The more threads you add, say 100-200, the easier the issue becomes to reproduce. What’s especially interesting is that the polling threads don’t seem to be blocking each other. For example, if 50 messages come in at once and there are 100 threads available, then all 50 messages get read immediately, while not a single deleteMessage() is allowed through.
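
The arithmetic behind that window is Little’s law: the average number of messages in flight equals the arrival rate multiplied by how long each message stays in flight. Here is a minimal sketch of that calculation. The rate of 2.5 messages per second is a made-up figure, chosen only so that the delayed case lands near the 50+ shown in the console; it is not a measurement from the project.

```java
class InFlightEstimate {

    // Little's law: average items in the system = arrival rate x average time in the system.
    static double expectedInFlight(double messagesPerSecond, long millisUntilDeleted) {
        return messagesPerSecond * millisUntilDeleted / 1000.0;
    }

    public static void main(String[] args) {
        double rate = 2.5; // hypothetical throughput, not a measured figure

        // Delete completes about as fast as processing (10 ms, the slow end of the range).
        System.out.println("prompt delete:  " + expectedInFlight(rate, 10));

        // Delete is held up for the full 20 second long-polling wait.
        System.out.println("delayed delete: " + expectedInFlight(rate, 20_000));
    }
}
```

The same rate gives a count 2000 times larger when the delete takes 20 seconds instead of 10 milliseconds. In the setup described here, the count is also capped by the number of polling threads, since each thread holds at most one message at a time.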

So where does the problem lie? That’s easy. Despite being advertised as @ThreadSafe in the API documentation, the AmazonSQS client is certainly not thread-safe and appears to have a maximum number of connections available. While I imagine this doesn’t come up often when using the default short-polling, it is not difficult to reproduce this problem when long-polling is enabled in a multi-threaded environment.

Finding a Solution
The solution? Oh, that’s trivial. So trivial, I was tempted to leave it as an exercise for the reader! But since I’m hoping AWS developers will read this article and fully understand the bug, so they can apply a patch, here goes….

You just need to create two AmazonSQS instances in the constructor of SQSThreadSafeIssue, one for reading (the receiveMessage() call on Line 21 of the listing) and one for deleting (the deleteMessage() call on Line 26). Once you have two distinct clients, the deletes all happen within a few milliseconds. Once applied to the original project I was working on, the number of in-flight messages dropped significantly to a number that was far more expected.
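
One possible reading of why two clients help, going only on the observation that the client “appears to have a maximum number of connections”, is that each client instance has its own bounded connection pool, and idle long polls can hold every connection in it. The sketch below models that reading with plain JDK classes. It makes no AWS SDK calls: PooledClient is a toy, and the pool size of 5, the 6 pollers and the 300 millisecond wait are all made-up values.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class ConnectionPoolSketch {

    // Toy client: every call borrows one "connection" from a fixed-size pool.
    static final class PooledClient {
        private final Semaphore connections;

        PooledClient(int maxConnections) {
            this.connections = new Semaphore(maxConnections);
        }

        // Holds a connection for the whole long-polling wait, like an idle receive.
        void longPoll(long waitMillis, CountDownLatch connected) throws InterruptedException {
            connections.acquire();
            try {
                connected.countDown();
                Thread.sleep(waitMillis);
            } finally {
                connections.release();
            }
        }

        // Needs a connection only briefly; returns how long it waited to get one.
        long delete() throws InterruptedException {
            long start = System.nanoTime();
            connections.acquire();
            connections.release();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        }
    }

    // Times one delete while `pollers` threads are long polling on the receive client.
    static long deleteLatencyMillis(boolean separateClients, int pollers, int poolSize, long pollMillis) {
        PooledClient receiveClient = new PooledClient(poolSize);
        PooledClient deleteClient = separateClients ? new PooledClient(poolSize) : receiveClient;
        CountDownLatch pollersConnected = new CountDownLatch(Math.min(pollers, poolSize));
        ExecutorService threads = Executors.newFixedThreadPool(pollers);
        try {
            for (int i = 0; i < pollers; i++) {
                threads.submit(() -> {
                    receiveClient.longPoll(pollMillis, pollersConnected);
                    return null; // makes this a Callable, so longPoll may throw
                });
            }
            pollersConnected.await(); // pollers now hold every connection they can get
            return deleteClient.delete();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            threads.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared client:    " + deleteLatencyMillis(false, 6, 5, 300) + " ms");
        System.out.println("separate clients: " + deleteLatencyMillis(true, 6, 5, 300) + " ms");
    }
}
```

In this model the shared-client delete waits for roughly the full polling time, while the separate-client delete returns almost immediately. If this reading is right, raising the connection limit on a single client (version 1 of the SDK exposes ClientConfiguration.withMaxConnections()) might have the same effect, though that is untested here.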

Although this workaround is easy to apply, it should not be necessary; you should be able to reuse the same client. In fact, AWS documentation often encourages you to do so. The fact that the Amazon SQS client is not thread-safe when long polling is enabled is a very serious issue, one I’m hoping AWS will resolve in a timely manner!