Fixing postgres

When upgrading my Mac, Postgres stopped working. This page documents my journey. First, the migration process gave me a long password for the postgres user. It also didn’t set up Postgres to start up automatically. netstat confirmed nothing running on 5432.

Posgres is started on the old machine using a launch daemon. I can see it at /Library/LaunchDaemons/postgresql-15.plist. Launch daemons require elevated access to run which seems related to the user recreation.

The plist file is a text file so I read it. It says that it runs at load with userName postgres. I ran dscl . list /Users and confirmed that user still exists.

I then came across this post which says I shouldn’t expect the database to start up from a time machine backup. Given that this is just a test database and there’s nothing in there I care about, I tried reinstalling Postgres. Which didn’t change anything.

I then checked /Library/PostgreSQL/15/data/log/<latest> and got:

2023-05-21 07:51:02.282 EDT [372] LOG:  listening on IPv6 address "::", port 5432
2023-05-21 07:51:02.282 EDT [372] LOG:  listening on IPv4 address "", port 5432
2023-05-21 07:51:02.282 EDT [372] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2023-05-21 07:51:02.300 EDT [402] LOG:  database system was interrupted; last known up at 2023-04-25 20:00:54 EDT
2023-05-21 07:51:02.667 EDT [402] LOG:  invalid record length at 0/331FC48: wanted 24, got 0
2023-05-21 07:51:02.667 EDT [402] LOG:  invalid primary checkpoint record
2023-05-21 07:51:02.667 EDT [402] PANIC:  could not locate a valid checkpoint record
2023-05-21 07:51:02.668 EDT [372] LOG:  startup process (PID 402) was terminated by signal 6: Abort trap: 6
2023-05-21 07:51:02.668 EDT [372] LOG:  aborting startup due to startup process failure
2023-05-21 07:51:02.668 EDT [372] LOG:  database system is shut down

That was helpful. I reset the log.

JeanneBrskysMBP:bin postgres$ ./pg_resetwal ../data
The database server was not shut down cleanly.
Resetting the write-ahead log might cause data to be lost.
If you want to proceed anyway, use -f to force reset.
JeanneBrskysMBP:bin postgres$ ./pg_resetwal -f  ../data
Write-ahead log reset

After restarting, I saw postgres up again. Yay.

nyjeanne@JeanneBrskysMBP ~ % netstat | grep 5432
222187ba86974afb stream      0      0 222187b5bb6cdf9b                0                0                0 /tmp/.s.PGSQL.5432

And the startup log is happy now.

2023-05-21 08:09:08.930 EDT [374] LOG:  database system is ready to accept connections
2023-05-21 08:14:08.987 EDT [403] LOG:  checkpoint starting: time
2023-05-21 08:14:08.992 EDT [403] LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.002 s, sync=0.001 s, total=0.005 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB

The other problem is that I had removed the database from pgadmin. Luckily this doesn’t delete the database. I registered it again with the same info as the settings on the old mac and all was well. My data is still there too. Nice!

postgresql and jdbc

In postgresql – selecting maximum for each group, we saw the actual stored function.  That was the easy part.  Then there was dealing with Postgresql and JDBC.  I encountered a few surprises while doing this.

Calling the stored function

I expected to use JDBC’s CallableStatement since I was calling a stored procedure.  Nope.  The proper way to call it is:

PreparedStatement stmt = conn.prepareStatement("select * from highlighted_topic_per_forum(?) limit ?");

stmt.setString(1, formattedDate);
stmtp.setInt(2, limit);

Passing a timestamp

Originally, I was passing a timestamp into the stored function by calling stmt.setTimestamp() and a type of “timestamp without time zone”.  Try as I may, postgresql didn’t like this.  It complains $1 not found.  I eventually decided to pass the timestamp as a string.

Passing an array

I was originally planning to pass the category ids via JDBC.  Given what what I experienced with timestamps, I wasn’t so enthusiastic about trying this on an application I work on in my free time.  When I saw the advice online to extend the Array class, I decided to hard code the two category ids.

postgresql – selecting maximum for each group

I had a requirement to return the thread with the most replies in each forum at JavaRanch‘s Coderanch forums.  In Postgresql 8.4, this would be very easy – just use the window functions.  Unfortunately, we aren’t on Postgresql 8.4 yet.  The other common pattern is something like

select stuff
from mytable t1
where date = (select max(date)
from mytable t2
where t1.key= t2.key)

This doesn’t work well for me either because the value I am comparing (number of posts is dynamic.)  I decided to use a stored procedure to “simplify” things.  I’ve written some stored procedures in Postgresql to do updates before and stored procedures in other languages to do queries so this didn’t seem like a huge task.

Postgresql calls them stored functions, so let’s proceed.  First you need to create a type to represent a row that gets returned by the stored function.

CREATE TYPE highlighted_topic_per_forum_holder AS
(last_user_id integer,
post_time timestamp without time zone,
&lt;other columns go here&gt;);

Then you create the stored procedure.  The outer loop goes through each forum.  The inner loop is the SQL that finds the post with the most replies that was posted to in some time period.  It uses a nested query with a limit to return only 1 thread per forum.  The rest of the SQL adds the relevant data.  See postgresql and JDBC for why it takes varchar rather than timestamp.

CREATE or replace FUNCTION highlighted_topic_per_forum(character varying)
RETURNS SETOF highlighted_topic_per_forum_holder AS
forum RECORD;
r highlighted_topic_per_forum_holder%rowtype;
for forum in EXECUTE 'select forum_id from jforum_forums where categories_id in (1,7)' loop
for r in EXECUTE 'select p.user_id AS last_user_id, p.post_time, p.attach AS attach, t.* '
|| 'from jforum_topics t, jforum_posts p, '
|| '(select topic_id, count(*) from jforum_posts '
|| ' where post_time &gt;= date '' ' || $1 || ' '' '
|| ' and forum_id = ' || forum.forum_id
|| ' AND need_moderate = 0 '
|| ' group by topic_id order by 2 desc limit 1) nested '
|| ' where p.topic_id = nested.topic_id '
|| ' and p.post_id = t.topic_last_post_id '
|| ' order by post_time desc' loop
return next r;
end loop;
end loop;
LANGUAGE 'plpgsql';