postgresql – selecting maximum for each group

I had a requirement to return the thread with the most replies in each forum at JavaRanch‘s Coderanch forums.  In Postgresql 8.4, this would be very easy – just use the window functions.  Unfortunately, we aren’t on Postgresql 8.4 yet.  The other common pattern is something like

select stuff
from mytable t1
where date = (select max(date)
from mytable t2
where t1.key= t2.key)

This doesn’t work well for me either because the value I am comparing (number of posts is dynamic.)  I decided to use a stored procedure to “simplify” things.  I’ve written some stored procedures in Postgresql to do updates before and stored procedures in other languages to do queries so this didn’t seem like a huge task.

Postgresql calls them stored functions, so let’s proceed.  First you need to create a type to represent a row that gets returned by the stored function.

CREATE TYPE highlighted_topic_per_forum_holder AS
(last_user_id integer,
post_time timestamp without time zone,
<other columns go here>);

Then you create the stored procedure.  The outer loop goes through each forum.  The inner loop is the SQL that finds the post with the most replies that was posted to in some time period.  It uses a nested query with a limit to return only 1 thread per forum.  The rest of the SQL adds the relevant data.  See postgresql and JDBC for why it takes varchar rather than timestamp.

CREATE or replace FUNCTION highlighted_topic_per_forum(character varying)
RETURNS SETOF highlighted_topic_per_forum_holder AS
$BODY$
DECLARE
forum RECORD;
r highlighted_topic_per_forum_holder%rowtype;
BEGIN
for forum in EXECUTE 'select forum_id from jforum_forums where categories_id in (1,7)' loop
for r in EXECUTE 'select p.user_id AS last_user_id, p.post_time, p.attach AS attach, t.* '
|| 'from jforum_topics t, jforum_posts p, '
|| '(select topic_id, count(*) from jforum_posts '
|| ' where post_time >= date '' ' || $1 || ' '' '
|| ' and forum_id = ' || forum.forum_id
|| ' AND need_moderate = 0 '
|| ' group by topic_id order by 2 desc limit 1) nested '
|| ' where p.topic_id = nested.topic_id '
|| ' and p.post_id = t.topic_last_post_id '
|| ' order by post_time desc' loop
return next r;
end loop;
end loop;
return;
END;
$BODY$
LANGUAGE 'plpgsql';

find friends in social networking without a password

I’ve always been concerned about the whole “give us your e-mail password and we will tell you which of your friends are registered on our service” thing on social networking sites.  To the point that I refuse to give out the password.  If I give out my password, the sites can do whatever they want with it.  Surely there is a better way!

While I’ve been reading about open standards for such things, today was the first day I actually saw it in practice.  I registered for GoodReads this week.  When clicking on find friends, you see the usual – click yahoo/hotmail/gmail/AOL/facebook/twitter/plaxo.  When clicking you have the option to type your password.  For some, you have an alternate choice.  Marked as “new”.  This alternate choice actually looks secure.

Summary of providers

Provider Allows providing password to glean contacts Comments on Non-password access to glean contacts
Yahoo Yes Worked well – similar to google as described below
Hotmail Yes Allows, but don’t have a hotmail account so untried
Gmail Yes Worked great; see below
AOL Yes No access
Facebook No Allows, but didn’t try.  I have to allow GoodReads access to write on my wall not just see contacts and didn’t want to go through the remove process at Facebook.
Twitter Yes Have to temporarily allow more access, but easy to revoke after from twitter’s connections page.
Plaxo No Not sure.  Plaxo wasn’t clear enough about what information they would be getting so I didn’t say ok.

Walking through gmail

  1. Click “Or: sign in directly on Gmail. (new)”
  2. Takes to page at a GOOGLE URL saying “The site www.goodreads.com is requesting access to your Google Account for the product(s) listed below.  Google Contacts
  3. Choose “grant access”
  4. [do stuff on GoodReads]
  5. Optional which I did because I only want to grant one time access – remove GoodReads from accessing my contacts list:
    1. Go to Google Accounts
    2. Click “change authorized websites”
    3. Click “revoke access”

The good

I am giving google my password.  Google already has my gmail password and is just checking it is correct.  I’m not passing it through GoodReads.  Google is also telling me specifically what information they are letting GoodReads see.

The bad

Just because I e-mailed someone once and they are in my Google contact list doesn’t mean I know them.  I also have to trust GoodReads won’t spam all my contacts.  Both of these problems exist with the old “give me your password” method.  I’m willing to accept both of these on a reputable site and not willing to provide a password.  So great progress.