csrf – jforum cleanup and problems

See part 1 for how we got here  and part 2 for how we changed the OWASP filter.

Code cleanup and problems

There is some poorly written code in JForum that CSRF now prevents from working.  In these cases, I needed to clean up our code.  For example:

  1. Links/anchors shouldn’t be used to update state.  They should only be used for gets.  The OWASP filter is realistic enough to recognize links are in fact used for updates in the real world and accomodates them.  It doesn’t accommodate code like
    onclick="document.location = '${contextPath}/${jforumMainServletMapping}/${moduleName}/...';" />

    In the cases this was used for actions that required CSRF protection, I needed to refactor to use a form.  In a few cases, I created a one to one form.  On others where there were a lot of actions or it wasn’t conducive, I used a common form

    <!-- CSRF token messes up submission if forum id isn't in URL -->
    function submitActionForForum(actionVerb, forumId) {
    var action = document.actionForm.action;
    action = action.replace("ACTION", actionVerb);
    action = action.replace("FORUM_ID", forumId);
    document.actionForm.action = action;
    <#-- general purpose form for the submit buttons on the screen; see JS for ACTION and FORUM_ID values -->
    <form method="post" name="actionForm" action="${contextPath}/${jforumMainServletMapping}/${moduleName}/ACTION/FORUM_ID">
    onClick="submitActionForForum('up', ${forum.id})"
  2. Some pages (like the first one I happened to test) didn’t have </head> so dynamically adding the JavaScript which sets the CSRF token wasn’t working.  I wrote a unit test to identify such files and ensure we don’t create any more.  We also added the missing HTML headers.  Most of them were on the admin pages.
JForum changes
JForum has some features that don’t play well with OWASP CSRF Guard
  1. JForum’s controller framework is heavily dependent on the number of query parameters . Adding a CSRF token broke all sorts of things.  I solved this by adding a check that the parameter wasn’t the CSRF token to the loops in WebRequestContext’s constructor, parseFriendlyURL()  and handleMultipart().  The one in handleMultiPart is unnecessary and just there for consistency.  I also have the constructor calling a method to see if the query string is empty instead of a one liner:
    private boolean isQueryStringEmpty(HttpServletRequest superRequest) {
    String queryString = superRequest.getQueryString();
    if (queryString == null || queryString.length() == 0) {
    return true;
    // ignore OWASP token (so if only OWASP present, ignore it)
    return queryString.matches("^&?" + CsrfFilter.OWASP_CSRF_TOKEN_NAME + "=[-0-9a-zA-Z]+$");
  2. I then learned that multipart reads the input stream and can’t be called twice.  One approach would be to copy the stream.  I didn’t go that route.  The only multipart requests are posts and PMs.  Both of which need CSRF protection.  Instead I have the filter set to just assume multi part requests need CSRF protection.
  3. Added <noscript> on post_form.htm to explain that JavaScript is now required to post.
  4. A couple places were missing <form> around input fields.
  5. In one place, a jQuery form needed the token set explicitly since it was choosing the parameters for jQuery to send on.  (I think this was only in our JForum fork.)
  6. The CSRF filter requires AJAX to pass the token has a header not a parameter.  To add it to a jquery.ajax block:
    headers: {'OWASP_CSRFTOKEN': '${OWASP_CSRFTOKEN!""}'},

    This syntax uses Freemarker’s default value to use a space instead of blowing up if there is no token.

Note to users
I was worried about users seeing the CSRF screen.  Some users won’t know what CSRF is.  And most wouldn’t know what to do.  Feel free to read our error page which includes things to try and who to email.
Enhancements/Longer term goals
We started these but it will take a while
  1. Don’t use token for anonymous users
  2. Get rid of CSRF Token from the  URLs (use forms consistently for posts)
  3. Rollback any hacks we made to fix things/cleanup code.
  4. Reconsider using JavaScript to add the tokens.

Note: all code is sanitized to remove references to javaranch/coderanch.


The story continues in part 4 – deciding not to use JavaScript for the solution.

csrf – extending the owasp solution and “interesting” IE javascript bugs (part 2)

While implementing CSRF for JForum, I needed to extend the OWASP solution.  Let me tell you, they don’t make it easy to extend.  Lots of final.  Here’s what I did – linked to code on github.

To read about the original problem or why I choose the OWASP filter, see part 1.

Extending the OWASP solution

  1. CsrfFilter.java – The OWASP filter is final so had to copy the logic in my own.  I added logic to get the action method name/
  2. CsrfHttpServletRequestWrapper.java – Since I’m using actions instead of URLs, I need to make the request look as if it the actions are URLs.  A short simple class.
  3. CsrfListener.java – OWASP assumes you are using URLs and you can enumerate them in the property file.  I have action names and there a lot of pages to allow as unprotected – without wildcard patterns to help, this is unwieldy.  The OWASP listener isn’t final but the logic I needed to extend was in the middle of a method.  So I copied their listener adding a method that reads the csrf.properties and creates “org.owasp.csrfguard.unprotected.” properties for all lines that weren’t set to “AddToken.”
  4. Redirect.java – the OWASP Redirect class works just fine if you always have the same context root.  We have a different one when testing locally vs on the server.  And since the path is in a property file, it isn’t easy to change.  Of course the OWASP class was final so I had to copy logic rather than extending it.  My Redirect class gets the server and context root from the request and adds it to the relative path in the OWAP property file.
  5. AddJavaScriptServletFilter.java – Adds the OWASP JavaScript to each page after </head> unless it is a download or a user isn’t logged in.  Since logged in users can’t do anything CSRF exploitable, they don’t need the token.  This also makes Google happier because googlebot doesn’t see the extra parameter confusing the URLs.
  6. AjaxHelper.java – The OWASP filter adds “OWASP CSRFGuard Project” to the X-Requested-With header for AJAX calls.  You can customize it to not include that, but I wasn’t sure what would happen if I left it blank.  It was easy enough to change JForum to use startsWith instead of equals() to make it more flexible.
  7. Owasp.CsrfGuard.js (attempt 1)
    1.  The JavaScript provided by OWASP has a bug.  It goes through the properties in document.all from top to bottom.  As it finds forms, it adds hidden input fields.  So far so good.  The problem is that when you have a lot of forums, some of them get bumped past the original end of the document.  So the loop never sees them and the CSRF token is never added.  The solution is to go through the document from bottom to top.  I’ve submitted a fix to OWASP for this.  What did the trick for understanding the problem was adding console.log(“length: ” +  document.all.length);  at the beginning and end.  It grows by roughly 100.   The bug fix was changing in injectTokens()
      for(var i=len -1; i>=0; i--) {


      // BUG FIX: if go through in ascending order, the indexes change as you add hidden elements
      // when have 100 forms, this moves some of the forms past the original document.all size
      for(var i=len -1; i>=0; i--) {
    2. IE doesn’t like
      var action = form.getAttribute("action");

      when you have a form field named action.  (If at all possible, DON’T DO THAT!!!).  Since JForum has 100’s of such reference, changing it at this point isn’t an option.  IE helpfully returns some object.  It’s not an array/map or list of any kind that I can tell.  After getting hopelessly stuck on this, I asked Bear Bibeault for help.  He noted the answer was in “Secrets of the JavaScript Ninja” a book he wrote that I had recently read.  And it was.  Solution

      // hack to test if action is a string since IE returns [object] when action in form and as hidden field
      // if not a string, assume it is our action and add token for now
      var action = form.getAttributeNode("action").nodeValue;
    3. The OWASP code contains this innocent looking code.
      var hidden = document.createElement("input");
      hidden.setAttribute("type", "hidden");
      hidden.setAttribute("name", tokenName);
      hidden.setAttribute("value", (pageTokens[uri] != null ? pageTokens[uri] : tokenValue));

      . It works fine in all browsers except IE. Want to know what IE does? That’s right.  It generates a hidden field with the correct token value and *no* token name.  I found this problem solution online and changed the code to

      var hidden;
      try {
      hidden = document.createElement('<input type="hidden" name="' + tokenName + '" />');
      } catch(e) {
      hidden = document.createElement("input");
      hidden.type = "hidden";
      hidden.name = tokenName;
      hidden.value = (pageTokens[uri] != null ? pageTokens[uri] : tokenValue);
    4. My last IE change was an easy one.  Easy in that it was a problem I had seen before.
      var location = element.getAttribute(attr);

      changed into

      //var location = element.getAttribute(attr);
      // hack - getting same error as on action - don't know why but hack to move forward
      var attr = element.getAttributeNode(attr);
      var location = null;
      if ( attr != null) {
      location = attr.nodeValue;
    5. Then I encountered an issue with the event handling framework in IE>  I didn’t even look at this.  All IE debugging was done between 9pm and midnight January 28th-30th after we all thought CSRF was fine from being in production the previous Sunday.  The only computer I had access to that ran IE is from 2002 and barely runs.  I was tired of IE and didn’t want to throw any more time down that rabbit hole.
  8. Owasp.CsrfGuard.js (attempt 2) – After encountering numerous IE problems, I would up merging the code from OWASP CSRF Guard version 2 and 3 to make it less dependent on the browser.

Finally, we get to look at the JForum specific parts in part 3.

postgres tuning – fixing a production problem

After a new feature was implemented (not by me), coderanch started crashing almost every day in the middle of the night.  In a few days, I’ll be blogging about the troubleshooting process and how timezones helped.  This post focuses on the end game – once we knew the problem was being caused by a large query – which is when I started being involved.

The tuning process

Like all good SQL tuning, I ran explain and iterated.  As background the new feature was a join table with half a million rows.

Explain cost What changed Observations
210,184 n/a No wonder the site is crashing.  For a web page (the forum list), this is forever!  While the query plan is using an index, it is using the index to join a table with half a million rows to a table with millions of rows.
40,590 Removed an unnecessary subquery.  (It was unnecessary because the column in populates isn’t used.) The problem is that the query isn’t using the index for a where clause.  Which is causing joins on very large tables to get a small amount of data.  Another problem is that the query limits the # rows returned to one page worth but does it at the end prohibiting the database from saving work.
1,807 Hack – we really want to  query the post time from the join table.  Since it wasn’t on there and it was too much work to add during the week, I introduced a hack.  I sorted by post creation (post id) and limited the query to sorting the most recent 100 records for the rest of the query. While this is much faster, it is functionally incorrect.  If an older post is still under discussion, it didn’t appear in the post list.  So broken, but fast enough to get us to the weekend.
288 Added the latest post time as a stored field on the join table. Ah.  Done

Learnings about postgres – locks

I ran a really simple statement to add a column to a table:

alter table jforum_topics_forums add column last_post_time TIMESTAMP without time zone not null default now();

Luckily I was on the test server because I had to kill it after 5 minutes.  At first, I thought the problem was setting the field to a value since it had to go through all the records.  That wasn’t the problem though.  The problem was that postgres was waiting on a lock.

SELECT * FROM pg_stat_activity;

select * from pg_locks where pid= 4503 and granted='f';

Running the above SQL, showed me postgres was waiting on an exclusive lock.  After I shut down the forum, the alter statement ran almost instantaneously.  The actual stored procedure to populate the new field (based on another table) took a few minutes.  But that makes sense as it was a stored procedure doing half a million queries.

Testing with lots of data

Everything went fine on my machine. On the test server (which does have lots of data), I realized that I forgot to add the index that uses the new last post time column.  That was the entire point of this exercise!  And it goes to show how important it is to have production volumes of test data.