toList() vs collect(Collectors.toList())

I had some extra time this week so went through a bunch of Sonar findings. One was interesting – in Java 17 you can use .toList() instead of .collect(Collectors.toList()) on a stream.

[Yes, I know this was introduced in Java 16. I live in a world where only LTS releases matter]

Cool. I can fix a lot of these without thinking. It’s a search and replace on the project level after all. I then ran the JUnit regression tests and got failures. That was puzzling to me because I’ve been using .toList() in code I write for a good while without incident.

After looking into it, I found the problem. .toList() guarantees the returned List is immutable. However, Collectors.toList() makes no promises about immutability. The result might be immutable. Or you can change it freely. Surprise?

That’s according to the spec. On the JDK I’m using (and Jenkins is using), Collectors.toList() was returning an ArrayList. So people were treating the returned List as mutable and it was working. I added a bunch of “let’s make this explicitly mutable” and then I was able to commit.

Here’s an example that illustrates the difference:

import java.util.*;
import java.util.stream.*;

public class PlayTest {

	public static void main(String[] args) {

		var list = List.of("a", "b", "c");
		var collectorReturned = collector(list);
		var toListReturned = toList(list);
		
		System.out.println(collectorReturned.getClass());  // class java.util.ArrayList (but doesn't have to be)
		System.out.println(toListReturned.getClass());  // class java.util.ImmutableCollections$ListN
		
		collectorReturned.add("x");
		System.out.println(collectorReturned);  // [bb, cc, x]
		toListReturned.add("x");  // throws UnsupportedOperationException

	}

	private static List<String> toList(List<String> list) {
		return list.stream()
				.filter(s -> ! s.equals("a"))
				.map(s -> s + s)
				.toList();
	}

	private static List<String> collector(List<String> list) {
		return list.stream()
				.filter(s -> ! s.equals("a"))
				.map(s -> s + s)
				.collect(Collectors.toList());
	}
}

Collectors.toList() also makes no promises about serializability or thread safety, but I wasn’t expecting it to.

Unexpected Pytest things for JUnit Developers

I used pytest for the first time recently. It’s similar to JUnit, as one would expect (since they are both xUnit frameworks). A few things were unexpected though, so I’m writing a blog post. That’ll help me remember if I forget 🙂 Plus posting something on the internet is a great way to find out what you misunderstood, if anything!

Getting started

I redownloaded Python as it has been a couple of years since I last used Python on my machine and I don’t remember the current state of affairs. Then I created a virtual env for this blog post.

python3 -m venv blog-post
cd blog-post
source bin/activate

Next I installed pytest and pytest-mock inside the virtual environment. (Adding them to the requirements file isn’t enough. I think this is because the IDE needs them installed to run the tests.)

pip3 install pytest
pip3 install pytest-mock

I confirmed my VS Code had the Microsoft Python extension configured. It did. I also set up my project for Pytest integration as described in Eric’s blog post so I can run tests in my IDE.

Surprise #1: unittest vs pytest

Python has a built in testing framework called unittest and an add on one called pytest. Java has JUnit and TestNG so it certainly isn’t unusual to have more than one option. I was surprised one was built in though. And this difference was definitely one I had to pay attention to when looking at docs. More on the differences if curious.

Creating hello world test

I created this directory structure: the code lives in src/my_mod/math.py and the test lives in tests/test_add.py.

math.py contains:

def add(x, y):
    return x + y

And test_add.py contains:

import pytest

import src.my_mod.math as target

def test_add():
    actual = target.add(4, 5)
    assert 9 == actual

I had to remember to use the flask icon to run in the IDE, but I’m not going to call that a surprise.

Surprise #2: Running at the command line

Running at the command line definitely yielded a surprise. None of these work because of the paths. (pytest sets up the import path differently when run as a command vs. as a module.)

pytest
pytest tests
pytest tests/test_add.py
python -m pytest

Either of the following works. I like the first one because I don’t have to change the name of the file I want to run. (Although I’m using the IDE more anyway.)

python -m pytest tests
python -m pytest tests/test_add.py
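As far as I understand it, the reason the python -m form helps is that it puts the current directory on sys.path, so the src package can be found. Below is a minimal sketch of that mechanism with no pytest involved. It recreates the layout from this post in a throwaway directory and tries the same import with and without the project root on the path. The temp directory and the can_import helper are made up for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate src/my_mod/math.py in a throwaway directory (hypothetical location).
root = Path(tempfile.mkdtemp())
pkg = root / "src" / "my_mod"
pkg.mkdir(parents=True)
(pkg / "math.py").write_text("def add(x, y):\n    return x + y\n")


def can_import():
    """Try the same import the test file uses and report whether it worked."""
    for name in ("src.my_mod.math", "src.my_mod", "src"):
        sys.modules.pop(name, None)  # forget any earlier attempt
    importlib.invalidate_caches()
    try:
        importlib.import_module("src.my_mod.math")
        return True
    except ModuleNotFoundError:
        return False


without_root = can_import()    # project root is not on sys.path yet
sys.path.insert(0, str(root))  # roughly what python -m does with the current directory
with_root = can_import()
sys.path.remove(str(root))     # leave sys.path as we found it

print(without_root, with_root)
```

If the import only succeeds once the root is on the path, that matches what I saw: the test file itself is fine, it just can’t find src.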

Surprise #3: Assertion messages

The basics of assertions make sense to me. assert False, assert True, assert x == y. So far so good.

Failing assertions are good as well. Having an incorrect expected value gives output like the following. It shows the expanded version of actual. And it shows those final expected vs actual in the short test summary.

    def test_add():
        actual = target.add(4, 5)
>       assert 7 == actual
E       assert 7 == 9

tests/test_add.py:7: AssertionError
=========================== short test summary info ============================
FAILED tests/test_add.py::test_add - assert 7 == 9
============================== 1 failed in 0.01s ===============================
Finished running tests!

In JUnit, it is good practice to add an assertion message to get more details. With pytest, the expanded values are still there when you add one. However, the short test summary info only shows your custom message. I’m not using the message as I prefer to have the summary include what was expected vs actual and I don’t want to have to repeat the code to make that so.

    def test_add():
        actual = target.add(4, 5)
>       assert 7 == actual, "addition result incorrect"
E       AssertionError: addition result incorrect
E       assert 7 == 9

tests/test_add.py:7: AssertionError
=========================== short test summary info ============================
FAILED tests/test_add.py::test_add - AssertionError: addition result incorrect
============================== 1 failed in 0.02s ===============================
Finished running tests!
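For reference, this is roughly what “repeating the code” would look like if you did want both a custom message and the values in the summary. It’s a sketch using a plain assert and an f-string. The check_add helper and the message wording are made up for this illustration.

```python
def add(x, y):
    return x + y


def check_add(expected):
    actual = add(4, 5)
    # The expected and actual values have to be repeated in the message by hand.
    assert expected == actual, (
        f"addition result incorrect: expected {expected}, got {actual}"
    )


try:
    check_add(7)
except AssertionError as e:
    message = str(e)

print(message)
```

It works, but every assertion now carries its own formatting code, which is exactly the repetition I’d rather avoid.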

Surprise #4: Logging output

When not doing TDD (ex: testing code that already exists), I like to write a test that prints the actual value and then add an assertion to match. I also do this when applying the golden master pattern (declare what exists to be working and codify it).

def test_add():
    actual = target.add(4, 5)
    print("Jeanne debug output " + str(actual))

I was baffled why there was no output. Reason? The prints are only shown if the test fails! (pytest captures output by default and only reports it for failing tests.)

On one of my machines, the prints don’t output in VS Code but do at the command line. On my other machine, it prints in both. It might be settings related, but not sure.
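Two things that help here, as far as I can tell. Running with the -s flag (python -m pytest -s tests) turns off output capturing, so prints show up even when the test passes. And if you actually want to check what was printed, pytest has a built-in capsys fixture. Here’s a small sketch of the latter. The test name and the printed text are made up for this illustration.

```python
def add(x, y):
    return x + y


def test_add_prints(capsys):
    # capsys is a built-in pytest fixture; pytest injects it by parameter name.
    print("debug output " + str(add(4, 5)))
    captured = capsys.readouterr()  # everything printed so far in this test
    assert captured.out == "debug output 9\n"
```

Note that reading the output through capsys consumes it, so it won’t also appear in the failure report.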

Surprise #5: Conventions matter

I accidentally created a file in tests whose name didn’t begin with test_. I learned this is critical: my test got ignored until I renamed the file. (Yes, I’d have known this if I had read the docs.)
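By default, pytest collects files named test_*.py or *_test.py. The sketch below approximates that check with fnmatch so you can sanity check a file name. It is not pytest’s actual collection code, and the looks_like_test_file helper is made up for this illustration.

```python
from fnmatch import fnmatch

# pytest's default file name patterns (the python_files setting).
DEFAULT_PATTERNS = ("test_*.py", "*_test.py")


def looks_like_test_file(filename):
    """Return True if the name matches one of the default patterns."""
    return any(fnmatch(filename, pattern) for pattern in DEFAULT_PATTERNS)


for name in ("test_add.py", "add_test.py", "add_tests.py", "testadd.py"):
    print(name, looks_like_test_file(name))
```

The last two names are the kind of near miss that gets silently skipped.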

Surprise #6: When you make a syntax error…

If you make a certain syntax error, all the tests fail. This is scary until you realize what’s going on.

On to mocking

Here’s a simple method to mock:

import requests

def get_url(url):
    return requests.get(url)

And the test:

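import requests

# "target" is the module containing get_url, imported the same way as in test_add.py

def test_get_url(mocker):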
    url = 'https://python.org'
    data = 'html data from url'
    mocker.patch('requests.get', return_value=data)

    actual = target.get_url(url)
    assert data == actual
    requests.get.assert_called_once_with(url)

I like that the return_value is specified on the same line as the mock call. The assert for the parameter is at the end which differs from Java. I also like that there isn’t an elaborate dependency injection system. For more on mocking see Eric’s write up.

Surprise #7: My mock is a string???

When I wrote this line of code in error, I got an error that the ‘str’ object is not callable.

mocker.patch('requests.get', data)

In hindsight, this makes sense. I replaced the “get” method with the string value data rather than setting data as a return value.
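The same thing can be reproduced with the standard library’s unittest.mock.patch, which takes the same arguments as mocker.patch as far as I can tell. This sketch patches os.getcwd instead of requests.get so it runs without any extra installs. That swap and the variable names are just for this illustration.

```python
import os
from unittest import mock

data = "fake cwd"

# The second positional argument is the replacement object itself,
# so os.getcwd becomes the string and can no longer be called.
with mock.patch("os.getcwd", data):
    try:
        os.getcwd()
    except TypeError as e:
        error = str(e)

# return_value keeps os.getcwd callable (it becomes a mock)
# and only sets what the call returns.
with mock.patch("os.getcwd", return_value=data):
    result = os.getcwd()

print(error)
print(result)
```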

Surprise #8: operating system environment variables

I was mocking out os.environ by setting a dictionary. I thought I was setting the environment variables to empty:

mock.patch.dict(os.environ, {})

However, when I ran this on a CI server, it failed because the OS variable was in fact set. I learned that the dictionary’s entries are merged into the existing variables by default. The fix is:

mock.patch.dict(os.environ, {}, clear=True)
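Here’s a small sketch that shows the difference outside of a test. BLOG_DEMO_VAR is a made-up variable name standing in for whatever happens to be set on the CI server.

```python
import os
from unittest import mock

os.environ["BLOG_DEMO_VAR"] = "set on the host"  # hypothetical variable

# Without clear=True the (empty) dictionary is merged into what is already there.
with mock.patch.dict(os.environ, {}):
    merged = "BLOG_DEMO_VAR" in os.environ

# With clear=True the environment really is empty inside the block.
with mock.patch.dict(os.environ, {}, clear=True):
    cleared = "BLOG_DEMO_VAR" in os.environ

# Either way, the original values come back once the block exits.
restored = os.environ["BLOG_DEMO_VAR"]
del os.environ["BLOG_DEMO_VAR"]  # tidy up the demo variable

print(merged, cleared, restored)
```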

The problem with attending team/project meetings on vacation

I was recently discussing the impact of attending team/project meetings while on vacation with someone. Seemed like a good blog post.

Vacation is supposed to be about taking a mental and physical break. This means that tangential work things (taking a video course, obtaining a cert, etc.) don’t fall under the scope of this blog post. Those are career related but help you and not just your current employer. Similarly, work things that aren’t related to your current team/project also don’t fall within this scope. Whether or not one considers Toastmasters, employee networks, or a town hall to be something they do on vacation, I feel like they fall in a different impact bucket than team/project meetings (although some of the impacts overlap).

Personal impact – not recharging

It’s harder to recharge if working during your vacation. Plus vacation belongs to you. A Fast Company article includes the phrase “as simple as answering emails or as involved as taking meetings and creating deliverables”. So taking meetings (the subject of this blog post) is one of the more impactful things. The article lists the many personal downsides of working on vacation, from burnout to being less creative.

I like that they use the word “disconnect.” It’s hard to disconnect while simultaneously joining a team/project meeting!

I know some people will say they don’t care about personal impact so my blog post is mostly about impact to others.

Team impact – coverage and opportunity

If a specific person never misses a meeting, you don’t know where coverage and skill gaps exist on the rest of the team. It’s easier to have the person who knows a topic the most answer or take notes or make a decision. Vacations are the most common way to address this. The person who is going to be on vacation during a meeting knows this in advance so has the opportunity to remind others what is important and what person X would bring up. It also lets people exercise the muscle of stepping up and not relying on person X. (Sick time accomplishes the same thing but isn’t often scheduled so adds stress to the person covering.)

Team impact – feedback

When person X returns, the others who attended the meeting brief person X. This is an opportunity for feedback. Person X learns what happened. Everyone else gets more ideas of what they might have said. Even if a “decision” was made that was flat out wrong, it can be revisited now that person X has pointed out something critical that nobody else noted. (Obviously this doesn’t apply to contract negotiation.)

Team impact – learning to not experience everything

We can’t experience everything first hand. Vacation generally involves missing some meeting we wanted to attend. For example, I will be missing my team’s next retrospective and sprint review. I would like to be at those meetings. But I also trust my teammates to tell me what I need to know. So I know it’ll be ok to miss them.

Team impact – pressure to others

When one person “volunteers” to work on his/her time off, others on the team feel the pressure to do so. It doesn’t matter if that is the intent. It becomes a convention, and (often newer or weaker) team members feel that working on vacation is now the floor of what is acceptable. And high impact things (team/project meetings and deliverables) are the worst because they are so visible. It’s not like anyone knows if someone skims (and doesn’t reply to) email.

Team impact – expectations

Similar to pressure to others, the pressure can exist on the other side. If it is seen that person X will attend team/project meetings on vacation, other meetings may be scheduled ignoring people’s schedules because of an expectation they will attend.

What about partial vacation days?

Where I work, we are allowed to take vacation in units of 2 hours. It’s interesting how this interacts with the “time away to recharge.” The answer is surprisingly well. I do take some longer vacations. But I also take a lot of 1-2 day vacations or partial days. For example, I take 2-4 hours off to go to “Broadway in Bryant Park” in the summer. Or I’ll work 8:30-10:30am on a Friday morning before spending the rest of the day at the pool with my friends. The key is that I’m focused on what I’m doing and have 100% forgotten about work on my vacation time. So even though it is short, my fun thing is my focus, not work. Taking 6 hours off and working 2 hours to attend a meeting is planning around the meeting, which has the same problems listed in the above sections.

What about mandatory consecutive leave?

I haven’t worked in a place that has consecutive leave requirements. However, I am aware that some banks require a week or two in a row. My understanding is that it is about preventing shady things from happening. It’s hard to have an unofficial process in place if someone is out that long. I’m not a fan of mandatory consecutive leave because I take a lot of shorter vacations. But not being allowed to work certainly solves this problem!

What about management?

This article is about staff positions where there is a team. It’s different for managers, CEOs, owners, etc. What’s awkward is that when these people attend meetings on vacation, they make others feel like they need to as well.