While migrating a large codebase from Subversion to Git recently I ran across the following error during the initial push from my development system to the new remote repository:

Read from remote host <…>: Operation timed out
error: pack-objects died of signal 13
error: failed to push some refs to ‘ssh://<…>’

Rerunning “git push” with the “--no-thin” option fixed the issue for me. Hope this saves someone else a headache!

It’s useful to (a) exclude the vendor directory from normal TextMate projects, but (b) still be able to pull up the Rails source under vendor/rails from time to time. If you simply add vendor to the TextMate Folder Pattern, you’ll get (a) but not (b). To get both, add vendor$.

Here are the patterns I use:

File Pattern: !(public/(favicon.ico|apple-touch-icon*\.png)|\.(|log|pid|sql)$)

Folder Pattern: !(\.git|\.svn|db/sphinx|public/(data|flash|images|pdf|audio|video)|tmp|vendor$)
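To see why the trailing $ matters, here is a small Ruby sketch. TextMate patterns are regular expressions in much the same dialect Ruby uses, so Ruby makes a reasonable stand-in. The folder paths are made up, and I'm assuming the pattern is tested against each folder's path; this is not TextMate's actual matching code.

```ruby
# Hypothetical folder paths, as they might appear for a Rails project.
FOLDERS = [
  "myapp/vendor",
  "myapp/vendor/rails",
  "myapp/vendor/rails/activerecord"
]

# Returns the folders a pattern would exclude, assuming the pattern is
# tested against each folder's path (an assumption, not TextMate's code).
def excluded(pattern, folders = FOLDERS)
  folders.select { |path| path =~ pattern }
end

puts "vendor  excludes: #{excluded(/vendor/).inspect}"
puts "vendor$ excludes: #{excluded(/vendor$/).inspect}"
```

The unanchored pattern matches every path that merely contains "vendor", while the anchored one matches only the vendor folder itself, so vendor/rails stays reachable.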

A lot of people who enter programming through hacking around with scripting languages rather than formal computer science educations don’t understand certain fundamentals such as the difference between functional and non-functional programming models (hint: Ruby isn’t functional). One of the Rails core team members argues that explicit return statements in Ruby code are a bad thing. There are at least two reasons that this is wrong: (a) Ruby is a dynamically typed language, and (b) the semantics of the Rails API are poorly defined. How many Rails programmers know that an ActiveRecord callback returning false will cancel all subsequent callbacks in the chain? How many even know that ActiveRecord callbacks return a value? If you’re a good programmer working on a large system, you sprinkle callbacks around multiple modules according to their responsibilities. Do you know off the top of your head the order in which multiple after_save callbacks on the same object are called? What exactly would happen if one of those callbacks DIDN’T have an explicit return statement, and happened to end with an “if” statement? Do you have any custom “validate” callbacks? What’s the last expression evaluated in each of them? What does Rails do in each case? Bet you’ll have to read the Rails source code to find out…
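Here is a plain-Ruby sketch of the trap, with no Rails involved. The Order class and the run_chain helper are hypothetical; run_chain only imitates the “a false return halts the chain” rule described above and is not ActiveRecord’s implementation.

```ruby
# Hypothetical model with three callback-style methods.
class Order
  attr_reader :total
  attr_accessor :flagged

  def initialize(total)
    @total = total
  end

  # Implicit return: the value of the assignment, which is false for
  # small orders.
  def flag_implicit
    self.flagged = total > 1000
  end

  # Explicit return: the caller always sees true.
  def flag_explicit
    self.flagged = total > 1000
    return true
  end

  # Ends with a modifier "if": returns nil when the condition is false.
  def note_if_large
    "large order" if total > 1000
  end

  def audit
    true
  end
end

# Toy callback runner: stops as soon as a callback returns false.
def run_chain(record, callback_names)
  ran = []
  callback_names.each do |name|
    ran << name
    break if record.send(name) == false
  end
  ran
end

p run_chain(Order.new(50), [:flag_implicit, :audit])
p run_chain(Order.new(50), [:flag_explicit, :audit])
```

With the implicit version the audit callback never runs for a small order, purely because of what the last expression happened to evaluate to. The “if” version returns nil rather than false, so what happens next depends on how the framework treats nil.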

The reason that Rails is popular with developers is freedom from all the baggage that normally comes with enterprise-level technology. Lack of such baggage leads to high productivity, but also high danger: with dynamically typed and generally under-documented platforms it’s even more critical to practice defensive programming by using things like explicit return statements. This is why the RoR community has such rabid TDD devotees. Small teams of experienced developers can indeed run very fast as long as they make the right assumptions, but a mistaken assumption can cost you dearly. Insurance against assumptions is what you get with statically typed languages, well-documented platforms, and other “enterprisy” features.

I’m comfortable with that tradeoff and have been using Rails exclusively for the last four+ years building new products for seed-stage startups, but I would have serious reservations if I were back in the corporate world managing a typical pyramid-shaped team (a few senior people at the top and a lot of junior people at the bottom).  I think the combination of small teams with loose tools is the right one for startups, but frankly it doesn’t scale.  Once the concept has been proven and the business enters “middle age” (probably around the time a professional CEO comes on board), it’s wise to rethink that mixture.

I like to configure my webapps to automatically send me email upon any error, including 404s (file not found), since that can indicate a bad link within the app.  Unfortunately, script kiddies trying URLs like /phpMyAdmin/scripts/setup.php and hundreds of similar variants can trigger the email and flood my inbox.

fail2ban to the rescue!  This wonderful tool automatically scans various system log files, detects when your system is under attack, and bans the offending IP address with iptables.  Here’s a fail2ban filter to match requests for nonexistent resources on your web server:

[Definition]
failregex = (?P<host>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) .+ 404 [0-9]+ "
ignoreregex =

Put that in a filter file named apache-404.conf and enable it in jail.local with something like:

[apache-404]
enabled = true
port = http,https
filter = apache-404
logpath = /var/log/apache*/*access.log
bantime = 3600
findtime = 600
maxretry = 5

This bans, for one hour, any IP address that racks up five bad URL requests within 10 minutes. Adjust to taste.
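Before turning the jail on, it’s worth checking the pattern against a few log lines. The sketch below does that in Ruby. Two assumptions: the named group is respelled as (?<host>…) because Ruby doesn’t accept the Python-style (?P<host>…) that fail2ban uses, and the log lines are invented examples using documentation-range addresses.

```ruby
# The filter's pattern, with the named group respelled for Ruby.
FAILREGEX = /(?<host>[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) .+ 404 [0-9]+ "/

# Invented sample lines in Apache "combined" log format.
MISSING = '203.0.113.7 - - [10/Oct/2010:13:55:36 -0700] "GET /phpMyAdmin/scripts/setup.php HTTP/1.1" 404 345 "-" "Mozilla/5.0"'
FOUND   = '198.51.100.4 - - [10/Oct/2010:13:55:40 -0700] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"'
# Same 404, but in "common" format: no referer or user-agent fields.
COMMON  = '203.0.113.7 - - [10/Oct/2010:13:55:36 -0700] "GET /nope HTTP/1.1" 404 345'

# Returns the offending address, or nil when the line does not match.
def offending_host(line)
  match = FAILREGEX.match(line)
  match && match[:host]
end

[MISSING, FOUND, COMMON].each { |line| p offending_host(line) }
```

Note that the pattern ends with a space and a double quote, so it only matches when a quoted referer field follows the response size. If your access log uses the shorter common format, the filter as written won’t fire.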

I’m working on a project which has a very “wide” database schema: not many rows, but lots and lots of (text) columns. Since I use InnoDB exclusively on MySQL (due to its lack of constraints and transactions, MyISAM is not suitable for production use), I ran into the mysterious “error 139” when adding rows to one such table. Some digging turned up this thread, which ends with the creator of InnoDB suggesting that instead of him fixing this bug, users should change their database schemas because “having 13 long columns in one row is not very common.” That’s a classic ivory tower response (I say that having three Ivy League degrees) from someone who obviously has little real-world experience and whose attitude would really be better suited to another profession.

The experience sent me over to PostgreSQL, which I’ve considered for some time but never had a specific motivation to tackle.  My app ported very easily and my schema works just fine there, thank you very much.

MySQL defaults to REPEATABLE-READ transactions.  This means that repeated SELECT statements within the same transaction return the same data (i.e., they don’t pick up interim updates made outside the transaction).  Specifically, reads take place from a snapshot of the database created for the first read in the transaction.

This may lead to unexpected results when combined with exclusive locks (SELECT … FOR UPDATE).

Consider two transactions, A and B, which run concurrently and each acquire an exclusive lock on a record.  One of them wins, let’s say A.  So A carries out its updates while B waits.  Now A completes, and B becomes unblocked.  One might reasonably assume that B will have access to the latest state of the database, namely the changes made by A.  But not with REPEATABLE-READ.

Under REPEATABLE-READ, B’s view of the database is as of the first read of the database within B, which was quite possibly before the lock was obtained.  So, B’s view of the database may not include the changes made by A.

Let’s consider an example: A and B both deduct $100 from a bank account initially containing $500.  What’s the proper result?  $300 of course.  But under REPEATABLE-READ, you could easily wind up with a result of $400.  Yikes.

The solution is to use READ-COMMITTED transaction isolation (that’s transaction-isolation = READ-COMMITTED in my.cnf).

Under READ-COMMITTED (which most industrial-strength databases use as the default), B’s view of the database will, by definition, include the changes made by A.  And, because we’re using an exclusive lock, B is isolated from other transactions.
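The arithmetic in the bank account example can be traced with a toy model. This is only a sketch: a Ruby hash stands in for the database, there is no MySQL and no real locking, and the run_withdrawals helper and its reread_after_lock flag are made up to contrast “B works from its earlier snapshot” with “B sees the committed state once it holds the lock”.

```ruby
# Toy model only: a hash stands in for the database.
def run_withdrawals(reread_after_lock)
  db = { balance: 500 }

  # B does a plain read before it gets the lock; under the snapshot
  # behaviour described above, this read fixes what B sees later.
  b_snapshot = db[:balance]

  # A wins the lock, deducts $100, and commits.
  db[:balance] -= 100

  # B is unblocked. It works either from the committed state or from
  # its old snapshot, depending on the flag.
  b_view = reread_after_lock ? db[:balance] : b_snapshot
  db[:balance] = b_view - 100

  db[:balance]
end

puts "stale snapshot:  $#{run_withdrawals(false)}"
puts "committed state: $#{run_withdrawals(true)}"
```

Working from the stale snapshot, B overwrites A’s deduction and the account ends at $400; reading the committed state gives the expected $300.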

You’re supposed to remember more of what you write, right?
  • Attributes are used for sorting, filtering and grouping your search results. Their values do not get paid any attention by Sphinx for search terms, though, and they’re limited to the following data types: integers, floats, datetimes (as Unix timestamps – and thus integers anyway), booleans, and strings. Take note that string attributes are converted to ordinal integers, which is especially useful for sorting, but not much else.
  • Just remember: :conditions is for fields, :with is for attributes (and :without for exclusive attribute filters).
  • If you remember the details about fields and attributes, you’ll know that you can’t sort by fields.
  • indexes [first_name, last_name], :as => :name, :sortable => true
    How is this done? Thinking Sphinx creates an attribute under the hood, called name_sort, and uses that, as Sphinx is quite fine with sorting by strings if they’re converted to ordinal values (which happens automatically when they’re attributes).
  • http://groups.google.com/group/thinking-sphinx/browse_thread/thread/1dc88624cb56ce/a415d1c9797d3c80
    says using :conditions automatically sets match mode to :extended

I’m using Capistrano to deploy a Rails app running under JRuby via the Glassfish gem. For no apparent reason, the Glassfish Java process would exit immediately when started via Capistrano, but not when started via plain SSH or locally via the command line. I finally tracked this down to the following in my deploy.rb file:

default_run_options[:pty] = true

I include that as a matter of practice, because without it I’ve experienced mysterious hanging of commands via Capistrano in the past. However, I couldn’t remember a specific reason for it in this project, and removing it fixed my Glassfish issue (I assume Glassfish has some sort of dependency on the TTY session, blah blah blah).

Yea!