An issue came up at work recently where my development team was asked to create a web interface for the storage, updating, and retrieval of PDF documents. After taking a look at my options, I decided to go with Attachment Fu. Getting started was pretty simple: just follow the directions in the README and away we go.
After doing a couple of uploads on my development machine with some random PDFs I happened to have lying around, it seemed as if only small PDFs would work properly. Larger PDFs, upon retrieval, came back to the browser as corrupt files. This wasn’t really an issue, more of a nagging concern, since the PDFs we needed to store were only a couple hundred kilobytes in size. The corruption I was encountering only happened with files on the order of a couple of megabytes. We did not have the actual files that were going to be uploaded at the time, so we tested with sample files of a similar size, and everything worked perfectly.
That is, until we finally got our hands on the actual PDFs, and then everything went to hell. The 200KB file came back from the database corrupt, and a quick hotfix had to be devised to give users access to a hard copy of the file in question.
After about 2 days of investigation, throwing logger statements all over the code, and learning quite a bit more about connection adapters in Rails than I had previously known, I was able to narrow the problem down a bit. The problem seems to stem from the null character \0, or more specifically \000, in the PDF and the pg gem’s string_to_binary method when used through ActiveRecord. To get around this issue, I modified the save_to_storage method in my copy of Attachment Fu’s DbFile backend (in RAILS_ROOT/vendor/plugins/attachment_fu/lib/technoweenie/attachment_fu/backends/db_file_backend.rb) as follows: