restoring my iBrick to an iPad

This week, I updated my iPad to OS 3.2.2. It didn’t go well.

What happened:

  1. Plugged in iPad to laptop.
  2. iTunes noted there was an OS update and suggested I backup the iPad before applying it.  I did.  Whew!  This turned out to be a blessing.
  3. Selected option to update.
  4. Got error that could not update.  With error code 14. Apple lists the error codes which is nice.
  5. Tried again but iTunes said my iPad needed to be restored to factory state to work again.  Not happy at this point.
  6. Got a different error code on restore.  Now really nervous.  Tried unplugging and replugging USB.  Seemed to help.
  7. Restore worked

Ok now I’m “restored” to factory settings.  How do I get back to using the iPad.

  1. Synced with iTunes.  This gave me most system settings and some data back.  (It didn’t put my password back, but that was the only thing noticeably missing.)
  2. Most of my apps aren’t here at this point and they have data of their own.  I’m a little nervous at this point.
  3. iTunes as the list of apps that were installed, but I don’t see an easy way to put them back.
  4. Went through the app store and re-downloaded them one by one.  At least iTunes recognized they were bought (thanks to an up to date backup) and didn’t charge me again.
  5. Tried some of the apps.  They do still have the data from the iTunes sync.  That means iTunes does sync data even if the app it goes with isn’t currently installed.

Summary

  • Best thing – I didn’t lose any data
  • Worst thing – re-downloading each app is tedious.  Can’t iTunes send a list to the iPad and have the app store download everything on that list?

Memo: Avoid Functions in Database Queries

The most common type of questions in the JavaRanch JDBC forum tends to be about improving performance in a database (that, and “Where can I download a JDBC Driver?”). While remote trouble-shooting performance issues can be tricky, we often spot issues with the presented queries and offer alternatives to improve performance. One issue I see from time to time is the use of database built-in functions in database queries. In general, these should be avoided at all costs for commonly-executed queries, especially ones that could trigger the function to be applied to every record in a table. This article will address this issue of using functions in database queries in more detail and provide explanations and tips for avoiding them in the future.

1. Throwing away indexes

By default, databases search an entire table for records. The purpose of indexes, created automatically by some DBMS systems, is to allow records to be retrieved faster. Using Big O notation, a sorted set of records can find an item within a range in O(log(n)) while an unsorted set would have to search the entire table, or O(n). For hash searches on a specific key, the search time is near O(constant) for properly balanced data structures.

Developers can create indexes on multiple columns in a table, but they have no direct control over how and when they are used. It is the job of the DBMS query optimizer to apply what it thinks are the best indexes at the time the query is requested. Queries that apply a function to a column in a database will likely throw away any index that the query optimizer could take advantage of to search the table, resulting in a full-table scan every time the table is queried. If you are lucky and your table only contains a few dozen records, this issue might not be noticeable, but for the rest of us, this could pose a serious problem. While there are some DBMSs, Oracle being one of them, that support function-based indexes, but they are far from standard practice.

2. Goodbye Portability

My biggest issue with functions, like any database-specific feature, is the fact that they are database specific features! Once you write a query that takes advantage of a specific function, porting your application to a different database system becomes much more difficult. Oftentimes, functions are used because there is a muddling of the data, application, and presentation layer. I have seen developers use database-specific functions to format data (most commonly, dates) from SELECT queries that are transmitted from JDBC directly to the user. If you have a strong mid-tier platform like Java, it is better to leverage the formatting functions within the language, not database-specific ones, to present data to the user.

Porting an application to a different database is non-trivial at best, but the use of database-specific functions will make the job much more difficult. And to those developers out there who often comment that switching databases never happens, it does. I’ve done it. And it’s not for the faint of heart.

3. Slow if used incorrectly

There are correct ways to use functions in database queries that are often overlooked when writing the query. For example, compare the following two MySQL-based queries, both of which search for orders placed on February 15, 2010 and use built-in functions:

SELECT * FROM Widgets WHERE DATE_FORMAT(orderDate,"%Y-%m-%d") = '2010-02-15';
SELECT * FROM Widgets WHERE orderDate = (CAST '2010-02-15' as DATETIME);

First, question is, are they equivalent? The answer is no, but let’s skip that for now and discuss performance. Which is likely to perform better on a database that contains a tree-based index on Widgets.orderDate?

Give up? The second query, of course! The second query applies a function to a constant, and most database query analyzers are intelligent enough to only apply this function once. It then uses the range index to find the records in Log(n) time. On the other hand, the first query performs a function call on every record in the table and therefore ignores the index – resulting in a slow table scan.

As for accuracy, if orderDate is a DATETIME, the second query will return all the results placed only at midnight on February 2nd (2010-02-15 00:00:00) while the second query will return all results placed during the entire day. No worries – there is an easy fix that still uses the index on orderDate for optimal performance:

SELECT * FROM Widgets WHERE orderDate >= (CAST '2010-02-15' as DATETIME)
     AND orderDate < (CAST '2010-02-16' as DATETIME);

Even though this adds a second parameter to the search, the sorted index can be applied to both.

Final Thoughts

Like many things in the world, database functions are necessary evil that are required to solve certain problems. The goal shouldn’t be to never use them, but to keep using them to a minimum. One common solution, if the function is being applied to a column repeatedly, is to denormalize the data and add a new column containing the value of the original column with the function applied to it. This is equivalent to how a function-based index works, for databases that do not have this feature built-in. The only overhead is the cost of maintaining the value in the new column each time the original value is updated, although since reads are more common than writes, this is often easy in practice. I am aware some developers have issues with denormalized data, but it is often necessary on large tables with queries that can run for minutes or even hours.

In the previous example, if searching on specific single dates is extremely common, the developer could add an orderDateSTR column that contains strings such as ‘2010-02-15’. A hash index could then be built on the column which would allow searches for single dates to be accomplished in near constant time. Granted, this is not useful for range queries, but would be useful in situations where single date searching needed to be fast on large data sets.

Starting a voluntary discussion group

I co-founded a lunchtime discussion group at work five years ago.  As we approach the fifth anniversary, I was reflecting on what went into it.  What worked; what didn’t; what would I do differently.  Our purpose was to have a place to discuss technology that we weren’t necessarily working on and to share knowledge/ideas across teams/departments.

Background

Technically the group started more than five years ago as a technical book club.  That didn’t work because everyone had to read from the same book.  Which requires preparation each meeting.  It also makes it hard to come back after you’ve missed a meeting.  The organizer of that group and I had an informal conversation near the cafeteria hallway about how to improve things and the new group was born!

The new group

We started meeting twice a month and switched to once a month over time.  We have a mix of moderated discussion topics and presentations with heavy Q & A.  At the beginning, we said we would cancel the meeting if there weren’t 4 people in the room.  This hasn’t been a problem in years.

What I consider key to the group’s success

  • We always meet the same day and time – this makes it easy for people to remember when the group meets.  For example, the first Tuesday of the month; the first and third Thursday of the month, etc.
  • Advertising – We advertise through an opt-in mailing list (don’t spam people), telling people who might be interested and putting up physical signs the day of the meeting.  Speaking of which these signs are always the same color/font/logo to encourage branding/memory.  People have even said “you are the blue sign people”.
  • Voluntary – This one is important to me.  Attending the group is voluntary.  Running the group is voluntary.  This means it is not a work thing and does not get billed to work projects.  It also allows the group to remain independent topic wise and FUN.
  • Informal – While we do have presentations, they are usually of an informal type. (Sometimes someone is previewing a formal presentation.)  Again, this is in keeping with the goal of things being FUN.
  • Size of group – We haven’t had problems with the size of the group per se.  We get between 6 and 20 people most months which is a good size.  We’ve been asked about including other offices in other states in the group.  We do have a phone bridge for people within the company and have invited people from one or two other offices on occasion recently.  And this is fine if it is small.  If there are too many from too many offices, it changes the spirit of the group.  And there are already company wide information exchange type groups in place.  This particular item is something we are going to have to monitor so we don’t become victims of our own success and lose the benefits of a New York based group.
  • Varying formats – Having a mix of presentations and moderated discussion has served us well.  Presentations are good when someone has a lot of experience with a technology.  (or can throw something together on it.)  Moderated discussion is good when multiple people have lots of small thoughts.  And the moderator doesn’t have to be an expert.  Just like at JavaRanch where I moderate the BlackBerry forum without knowing anything about BlackBerry development.  Whenever I moderate at the discussion group, I come with a list of questions/talking points.
  • Hook to real work topics – Most of the time topics involve technologies or emerging technologies someone in the company is working on.  Or planning to work on.  Or a cross cutting concern like performance.
  • Range of topics – You’ll notice I said most of the time.  Some of our topics have nothing to do with work.  There’s that FUN thing again.  We also select topics based on what attendees want to hear.
  • Emphasis on interactiveness – While we do have people who just sit and listen, most participants are actively engaged.  This keeps a good dynamic, keeps in informal and keeps it FUN.
  • Different moderators – Trying to have as wide a range of people moderate as possible is helpful.  It gives the group a changing voice.  It helps more people feel invested in the group.  And it gives people a chance to practice leading.
  • Group leaders changing over time – We have had 5 different people co-lead the group at different points it time.  (Co-leaders are good because it is less work and you don’t have to worry about vacations.)  One person left the company.  And two others (including myself) stepped down from actively running the group after getting job responsibilities that would have compromised the informal appearance of the group.  The two of us still attend/moderate and are plenty involved.  But we don’t run it or give the opening/closing remarks anymore.
  • Topic control – I mentioned we are an informal group.  This gives a stronger ability to decline topics.  At times, people from outside the group come to us with ideas.  Some are good ideas and we use them.  Some are “good ideas that don’t fit.”  (There’s no such thing as a bad idea; we like ideas; that’s how brainstorming happens.)  Being able to decline these ideas keeps the group FUN and interesting.

What to think about if starting your own discussion group?

  1. What is your goal?– Knowing what your goal/vision is for the group helps you set it up in a way conclusive to what you want to accomplish.  It also tells you who to advertise your group to.
  2. Do you want it to be official or unofficial?– Both are valid options.  They have different tones/topics/styles and deciding which you have is important.  (Incidentally, the reason I stepped down from running this group is because I became involved with an official one.)
  3. Do you want it to be voluntary or mandatory? – Even an official group is hard to make mandatory.  Beware of this option.
  4. When is good to meet?– Think about when meetings are generally scheduled in your company.  It is ok to change this until you stabilize.  But then try to stick to the same date/time.
  5. What is a good group size?– What is your target?  Is there a number that is too high or too low?
  6. How do you gain critical mass? – It’s much easier to run a group than to start one.  Think about what you can do to get people to come to your first few meetings.

Summary

I used the word FUN a lot in this post.  That’s really the vision for this group.  Good techies are passionate about technology.  While that doesn’t always come out amidst business activities, our group is an opportunity to channel that pure FUN for what we do.