Category Archives: Tech - Page 2

The Use of S3 and EC2 for Remote Backup

Even before the introduction of the Amazon S3 storage service I was intrigued by the possibilities of secure backup over the Internet. Over the years I’ve evaluated a number of possibilities such as the use of rsync and Unison either to my own remote servers or to a service. I’m really not too interested in the commercial vendors as most of their software works on Windows or maybe Mac and my files reside on a Linux fileserver. It only makes sense that my backup solution should run on the Linux server as well.

None of these solutions seemed to quite fit the bill for me because of expense, concerns about data security or speed. Since the introduction of S3 I have started playing around with some of the scripts and software which have been developed to take advantage of these powerful services. I was still disappointed though, mostly because of some data encryption concerns (on the storage system, not in transit) and the potential charges associated with backing up data to the S3 service. Ideally I would want something rsync-like which would only transfer the changed parts of the files instead of recopying the entire file or directory. Unfortunately there is no built-in support for anything like this in the low-level S3 system. So after playing with many scripts that suggested they would be able to do something along these lines and remaining unimpressed I decided to put things on hold for a while longer.

Eventually Amazon released the EC2 cloud computing platform but that still didn’t seem particularly useful for my purposes because of the lack of persistent storage between sessions. Once the Elastic Block Store (EBS) became available things got more interesting. Now that I could retain data between sessions I had visions of a backup script which would launch an EC2 instance, mount an EBS volume and run rsync or Unison to back up directories on my local server to the remote site. I started playing around with EC2 and soon discovered that although it is very powerful it is a monster to control unless you are writing your own application from the ground up. For a simple job like this, which should be easily accomplished by a script, it can be a nightmare with several shell variables to set and paths to keep straight. Never mind the several encryption keys and the changing SSH host identifier to deal with. Eventually with some help from two fantastic blog entries (Ereblog and Free Wisdom Online) I was able to get something working…mostly.

It’s quite a fragile thing and you have to make sure that things are executed in the correct paths and with the correct environment variables set. In addition the returned data from the control commands is just awked from the output so it could easily break if the control package were updated, etc. The final nails in the coffin for me were my increased backup storage requirements for photos, audio and video which are huge and can change the economics of doing remote backup quickly. Even for a slimmed down set of documents I found the process to be too slow and fragile for my needs. In the end I have gone back to hauling hard drives with data backups off site and using the rsync program locally to sync these periodically with my live storage.

*Edited 2/2/09 to fix the several times I mistakenly called EC2 EC3 although I knew better. Thanks to the commenter for pointing this out!

Remembering our Media Past

One of my more recent pastimes, when I have a few minutes to spare, am already caught up on the news and either need to unwind a bit or just don’t have time to dive into a more substantial project, is to browse around on YouTube (similar to what I do on Wikipedia) and see what turns up. One of the more interesting things that I have turned up is a set of old “airchecks” from Twin Cities area television stations.

Being a media geek I’m fascinated by how news has changed over the years, particularly in my market. I’ve known about many of the private collectors of radio airchecks for some time but thanks to the fine people at there are now many TV airchecks from the area available online as well. Some of my favorites are actually the TV news reports on some of the area radio stations (which is how I found the archive in the first place). It’s amazing to see just how different news reporting looked even 15 years ago. While we can debate whether the content is any better, there is no doubt that the production quality has drastically improved.

Deconstructing the Car Talk Jukebox

The great folks at National Public Radio’s Car Talk recently switched from using Real Player to a flash based MP3 media player for online listening. I think this is a fantastic change as the only thing I was still using Real Player for was to listen to Car Talk online. I do realize that for some time a podcast version of the show has been available through NPR but I tend to listen to it spread out over several days and the Real Player (and now Flash based player) allow me to jump directly to specific segments of the show, a big advantage over one long MP3 file for my purposes.

The only problem with the new player is that I initially couldn’t get it to work with my system. The player would never fully load and would not play the show. This really presents a problem if one wants to listen to the show. Of course I submitted an email to Click and Clack notifying them about the problem. Apparently they’ve been receiving quite a bit of email about the new player, for better or worse, because it took about 5 days to even get a form response back. Like most of what Click and Clack have to say it wasn’t that helpful (install the latest version of Flash, etc). Since I already waste time with Flash websites on a regular basis I was sure that Flash wasn’t my problem. This led me to start deconstructing their player architecture to find and fix the problem myself, in true Car Talk fashion.

To make a long story short for the impatient reader I’ll cut to the chase. Ads are loaded and played from a third party site (NPR) and require cookies (3rd party cookies) to play. The ad must play before the player will load the show audio. I had 3rd party cookies disabled, hence no show. I fixed this by explicitly allowing cookies from both and Of course, there are many more interesting ways of solving the problem and more that can be learned by total deconstruction so the reader looking for further edification may want to read on.

By looking at the page source and link formats I fairly quickly determined that Car Talk was using the JW FLV Media Player and it was loading a playlist file called showAllsmil.xml which likely contained the asset (MP3 audio) URLs to be played. The trick would be to find this file and figure out why my player wouldn’t work. By looking at the source of the player page I could determine that before the player fully loaded it needed to play an ad from NPR. That certainly gave me quite a clue as to why things weren’t working and eventually led me to the cookie solution you read about above but let’s explore the javascript code that selects and gives the player a URL for the MP3 ad:

var site = 'CARTALK';
var area = 'Cartalk.Player';
var pageNum = Math.round(Math.random() * 100000000);
var randomNum = Math.round(Math.random() * 100000000);'+site+'/area='+area+'

It’s interesting that this is all done in client-side JavaScript instead of randomly serving an ad from a static server-side URL, but I guess doing things in JavaScript is the Web 2.0 way! Now you know what to do if you want to listen to NPR ads all day long: generate a bunch of random numbers and load up some URLs. What if you want to listen to an actual Car Talk show, perhaps on an unsupported player/OS like Linux without Flash installed? For this you’re going to want to get your hands on that XML playlist file. First you’ll want to find the URL, which as it turns out is also generated by some bits of JavaScript which could also be done server side:

var f=gup('play');
var s=gup('show');	
if (s==null || s=="") s="WeeklyShow";
var file2 = ''+s+'/'+f;

Where gup is a function which pulls some variables out of the URL, again something really easy to do in a server side language like PHP, oh well…javascript it is. If you want to listen to the entire most recent weekly show you’ll end up with a URL that looks something like:

If you want to just hear the last segment (segment 10 of the show) you’d end up with:

Of course it’s similar for 01smil.xml through 09smil.xml. Note yet again that this could all be handled without creating a million files if it were done server side, but I digress. When you open up that XML playlist file you end up with something where it’s easy to see the MP3 asset files can be found at:

and so on. Note that these streaming MP3s can be played in any MP3 player so you could play them on Linux or just about anything else that plays MP3 files.
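
For readers curious about the gup helper used in the snippet above, here is a minimal sketch of how such a "get URL parameter" function might look. This is my own hypothetical reconstruction, not the code from the Car Talk page: it takes the URL as an argument (so it runs outside a browser), and playlistPath is an illustrative name that returns only the relative show/file portion of the playlist path.

```javascript
// Hypothetical re-creation of a "gup" (get URL parameter) helper.
// Assumption: the real page reads window.location; here the URL is passed in.
function gup(name, url) {
  // Drop any #fragment, then keep whatever follows the first "?".
  var query = url.split('#')[0].split('?')[1] || '';
  var pairs = query.split('&');
  for (var i = 0; i < pairs.length; i++) {
    var parts = pairs[i].split('=');
    if (decodeURIComponent(parts[0]) === name) {
      // Re-join in case the value itself contained "=".
      return decodeURIComponent(parts.slice(1).join('=') || '');
    }
  }
  return null; // parameter not present
}

// Mirrors the defaulting logic of the original snippet:
// fall back to "WeeklyShow" when no show is given.
function playlistPath(url) {
  var f = gup('play', url);
  var s = gup('show', url);
  if (s == null || s === '') s = 'WeeklyShow';
  return s + '/' + f;
}
```

With a made-up player address such as http://example.com/player?play=10smil.xml, playlistPath would yield WeeklyShow/10smil.xml, which matches the pattern of playlist file names described above.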

An interesting project would be to create a script which dynamically generated an advertisement MP3 URL, pulled the SMIL file and stripped out the asset URLs and spit out a more standard M3U playlist file. If this were done in server side scripting (PHP anyone) you could easily create a link which would feed any player a playlist of the most recent show segments (plus an opening advert to keep NPR happy). Such a M3U playlist would be useful as it would allow you to play streaming Car Talk MP3s from just about any player/OS without manually getting all the segment URLs.
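
As a rough starting point for that project, here is a small sketch of the SMIL-to-M3U step. I wrote it in JavaScript to match the snippets above, though the same idea ports easily to PHP. It assumes the playlist lists each segment as an audio element with a src attribute, which is a guess about the file's layout on my part; the function name smilToM3u and the example URLs are made up for illustration.

```javascript
// Sketch: turn SMIL playlist text into a plain M3U playlist (one URL per line).
// Assumption: each segment appears as <audio src="..."> in the SMIL file.
function smilToM3u(smilText, adUrl) {
  var urls = [];
  // Optionally lead with an advertisement URL to keep NPR happy.
  if (adUrl) urls.push(adUrl);
  var re = /<audio\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/gi;
  var match;
  while ((match = re.exec(smilText)) !== null) {
    urls.push(match[1]); // captured src value
  }
  return urls.join('\n') + '\n';
}

// Usage with made-up data:
var sample = '<smil><body><seq>' +
  '<audio src="http://example.com/seg01.mp3"/>' +
  '<audio src="http://example.com/seg02.mp3"/>' +
  '</seq></body></smil>';
console.log(smilToM3u(sample, 'http://example.com/ad.mp3'));
```

A regular expression is a blunt tool for XML, so a real version would probably want a proper XML parser, but for a small, predictable playlist file this is likely good enough.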

Computer Collecting

Friends who have seen my electronics warehouse, err.. basement, know that I’m an avid collector of “antique” electronics. From the 8-Track recorder, yes you heard that right not just an 8-Track player, but a recorder, to my collection of cell phones and landline phones my interest in history seems to manifest itself in collecting bits of history.

As an information technology professional I think it’s both important and useful to realize how I got to where I am. For me this means both the people like “Mr. C” my elementary school computer teacher who showed me the inside of an Apple //e and taught me the fundamentals of computing as well as those early machines I worked with. This means that it has been one of my personal goals to collect some of those influential machines from my early years. A fun side benefit is the ability to play the games and software I remember from my youth on real hardware instead of an emulator.

This means that I also have quite a collection of computers in my basement, primarily Motorola 68k Macs and a few Commodores. I’ve even gone so far as to have like-minded geek friends over for a LAN party consisting of these early Macs in a LocalTalk environment. Nothing like a good game of Wagon Train 1848 (multiplayer Oregon Trail) to get things going!

Because of these interests I try to stay on top of what’s going on in vintage computing circles, subscribe to several mailing lists and visit quite a few websites devoted to the topic. There’s something to be said for experimenting with computers just to see what can be done even though it may not be practical (LocalTalk to Ethernet bridge for Internet access from a 512K Mac anyone?) though it seems to be something that occurs less frequently these days.

I recently ran across 1000BiT, a website devoted to vintage computing which I had not seen before. 1000BiT is a great website for finding everything you can related to a specific vintage computer in one place. From system specs to original advertising, brochures and manuals they’ve got it covered. It’s a great stroll through personal computing history and an easy place to get lost in for hours as you pore over the specs and adverts which built an empire.

The Open Source Microsoft Access Alternative

Databases are a wonderful tool for organizing all those bits of information in your life. While open source technology took database backend technology by storm (MySQL anyone?) there remains a gap in desktop database technology. Let’s say you wanted to create a database for your address book. You could certainly do it in MySQL and write a PHP front end for it and make it web based but this really seems like overkill for a personal address book, it also seems like a lot of work.

You could also do it in a spreadsheet program but you give up a lot of advantages of a database (especially a relational database) when you do so. In an effort to fill this void between the massive SQL database with frontend application and the spreadsheet Microsoft offers Microsoft Access. This is both a backend database engine and a frontend design package in one which allows you to generate forms for updating data as well as reports. As a bonus if your database is too big for its engine you can connect via ODBC to a bigger backend such as SQL.

Unfortunately, this segment of database tools has been largely overlooked by open source software, especially in the Windows environment. This is probably not without reason as middle-level database tools like this, even Microsoft Access, are often too complicated for most end users and too limiting for most developers. In fact, if you asked many Microsoft Office users what the “Access” program does they probably wouldn’t be able to tell you. Still, if you need a quick database form for entering data it’s tough to beat this type of application. Perhaps the most widely known open source office suite, OpenOffice, has made an attempt at an Access alternative in their “Base” tool but, frankly, it leaves a lot to be desired.

A better choice is the KOffice program, Kexi. Like Microsoft Access, Kexi can serve as a combination backend/frontend or as a frontend to a remote backend database. Kexi provides scripting through the Python and Ruby languages in addition to the basic tables, forms and reports. In fact, the only real problem with Kexi is that it is not available in an open source version for Windows.

Because KOffice relies on the Qt graphics toolkit it was not made available in an open source version on the Win32 platform. Recognizing the interest in an Access alternative Kexi was ported to Windows and a commercial version is available for $72. The winds of change are in the air though. Trolltech which makes the Qt toolkit has released the Windows version of their toolkit under the GPL meaning Qt based apps can now be made available in Windows under an open source license.

Based on this development the KDE developers have started porting applications, including KOffice and Kexi, over to Windows. Because of the large codebase and complex nature of KOffice it’s going to take a while to get things stable on Windows (they’re currently at Alpha 10) but someday in the not too distant future there will be a good open source alternative to Microsoft Access on Windows. You can see the progress being made and check out the alpha on the KDE for Windows site. In the meantime KOffice/Kexi is available for use on Linux and Mac.

Movable Type Goes Open Source

For reasons I can only speculate about, two of my most popular articles to date remain “The Next Big Thing In Blogging Software” and “a year later: an overview of multiblog software options”. The first was written over four years ago and the second just under three years ago. In the online world that is eons.

One might ask why, if these have proven to be such popular articles, I don’t update them more frequently. To be honest, this blog is as much for me to remember and track my interests and solutions to technical problems as it is to share knowledge and information with you the reader. Given the significant amount of time which was invested in installing, testing and reviewing the blog software choices and the return on investment, it simply doesn’t make sense to spend the time to do an annual or even semi-annual update. This is primarily because I have been extremely happy with my chosen solution, b2evolution. Despite the continued prevalence of WordPress in the blogosphere I see no compelling reason to change and one good reason to stay with b2evolution: multiblogging. Despite the continued development of WordPressMU it remains a sort of kludge which may or may not work in your specific instance. b2evolution, on the other hand, was built from the ground up to support multiple users and blogs so support exists throughout the product. This is reason enough for me to stick with b2evolution, the blogging software that I still believe is undervalued and an excellent choice for the vast majority of independent blogging sites.

For those that have forgotten, once upon a time the independent blogging software market was ruled by Greymatter and, after its discontinuation, by Movable Type. There were no other serious contenders. All was good in the land of the blogger, then the sky fell. As I wrote four years ago…

On May 13, 2004 Six Apart, the company behind Movable Type, announced the long-awaited version 3.0. With this blog entry they also single handedly managed to start the demise of the Movable Type monopoly and changed the face of blogging software forever.

What they did was try to commercialize what had been free software while maintaining a crippled free version to placate complainers. As it turned out this was perhaps the biggest mistake Six Apart ever made. As bloggers such as myself became vocal about these changes and provided developing alternatives which were improving on a daily basis the vast majority of independent bloggers abandoned Movable Type for other platforms such as WordPress and b2evolution. I have an unsubstantiated hunch that my prediction of the demise of Six Apart became a haunting reality for the company who saw customers fleeing by the thousands. Although they retained some market share, particularly among the commercial bloggers it would never be the same for Movable Type, once the king of the bloggers.

Despite attempts to rectify the situation and improve the pricing structure it seems that eventually the stubborn Six Apart came to realize the gravity of their mistake. In December 2007, more than three years after that infamous day, Six Apart made what I believe to be one last ditch effort to regain the market share they once had. It was then that Six Apart announced “as of today, and forever forward” Movable Type would be open source. Finally a victory for those who complained so mightily about that initial pricing structure.

How does this change things? It doesn’t really. Movable Type will never again see the market penetration it once had. The decision to go open source is far too late to have that kind of transformational effect. The market has become far too diluted and there is no single competitor (WordPress would be closest) to try and overtake. Had the decision been made shortly after the original backlash we would probably all still be running Movable Type for our blogging needs, as many of the other contenders would never have seen the development influx they did in the weeks and months after the MT 3.0 announcement. Certainly there is now a possibility that over time Movable Type will innovate and become a serious contender but for the time being it will remain a WordPress (and b2evolution) world. I applaud the move made by Six Apart and it probably will keep the Movable Type software alive and viable for the time being but it’s too bad this lesson was such a hard one for Six Apart. Better late than never. At least the sentiment is right.

2TB and growing

About a year ago I built several 1.2TB fileservers for a number of my consulting clients which utilized RAID5 arrays for redundancy with LVM running on top for expandability. One of my clients, which does some media work, has exhausted the storage space and called a few weeks ago about expanding the storage space on the server.

The four hard drives in the server now were already utilizing all the onboard SATA-II ports. I certainly could have replaced the drives with larger ones (which I did do for another client) but that would have entailed some careful shuffling of data and wouldn’t provide for much future expandability. For another client who uses space much more slowly I could have added a two port SATA expansion card and added two drives in RAID1 but here I expect to need to continue adding space and so I proposed an external storage tower with a multiport SATA link. I was looking for a PCI Express controller which would support eight drives on a single card and would be supported in Debian Linux. I ended up selecting a Highpoint RocketRAID 2322 which seemed to fit the bill.

As it turns out packaged driver support for Linux is only available for Fedora, Red Hat and SuSE. Luckily I found great instructions at this University of Northern Iowa site for building the drivers from source provided by Highpoint. Although there is some grumbling in the open source community about these drivers being non-free licensed (hence no package from Debian) just about everything else is great. The kernel module built without any problems and without a huge number of dependencies and I was able to get the drives up and running without too much work.

Unfortunately, I did not get the module into the initramfs as I had intended and so on reboot it all came crashing down. This entailed a trip to the customer and several hours to fix because the entire system including the root filesystem is LVM on RAID. Luckily, I was able to boot off an Ubuntu CD and build the RocketRAID kernel module again then start the RAID and then the LVM which finally allowed me to mount the filesystem. After doing this a few times I was finally able to get the initramfs straightened out and things working again. Needless to say it was a long night, but a successful one nonetheless.

FOSS Disk Imaging

I’ve written before suggesting the use of Linux for open source drive imaging and it seems there has been some movement in this direction. About a year after my initial posting the folks at PackRatStudios posted this article with a list of free and open source alternatives to the Symantec Ghost software. A quick look at the utilities they reviewed indicates that there is still much work to be done on using Linux as a disk imaging platform, particularly when it comes to ease of use and filesystem (NTFS in particular) support. On the other hand we’re much further along than we were and progress is clearly being made.

Revisiting Open Source Whole Drive Encryption: TrueCrypt vs. DiskCryptor

About a month and a half ago I wrote about open source whole disk encryption software (this was just before TrueCrypt 5 came out) and mentioned an open source program called DiskCryptor which has been available since late fall and was the first open source whole drive encryption (system partition encryption) utility to support Windows that I’m aware of.

DiskCryptor has releases hosted on SourceForge and additional information on the primary developer’s website. Though the developer’s site is in Russian the Google translation facility does an ok job of translating it.

I started using DiskCryptor a few weeks before TrueCrypt 5 came out and was really impressed. Once TrueCrypt was released I tried that and while I do appreciate some aspects of the super redundancy in TrueCrypt whole disk encryption I soon went back to using DiskCryptor for a couple of reasons.

First, I had problems with TrueCrypt blue-screening on me and sometimes preventing my system from shutting down properly (it would sometimes reboot instead of shutting down). This made me quite uncomfortable as I was trusting my data to the software. I understand there have been a few patches to TrueCrypt since I tested version 5.0 which fix some of the problems people were having and which I have not tried yet, but there are other reasons I prefer DiskCryptor.

Second, while all the hand holding and redundant systems in TrueCrypt do make it (to some extent) dummy-resistant they are actually quite a pain when being utilized by a power user and there is no way to bypass them. In some cases it is either inconvenient or unnecessary to create a recovery CD. DiskCryptor does not require that a recovery CD be created and has different, perhaps more robust methods of recovering the data should the need arise.

Third, DiskCryptor supports hibernation! This is reason enough to use DiskCryptor for many laptop users. I understand that TrueCrypt 5.1 includes hibernation support but it appears a bug may have been introduced at the same time with dire consequences for drive security. Read about this bug in English and see the code problem in Russian. This may be fixed in TrueCrypt 5.1a but is not specifically mentioned as fixed in the TrueCrypt changelog as far as I can see.

Fourth, DiskCryptor has (in my mind) more robust/useful recovery options. This is for several reasons. While there is no recovery CD or extensive boot loader decryption à la TrueCrypt, the encrypted volumes are fully compatible with standard TrueCrypt encrypted volumes (including pre-TrueCrypt 5). This means you can take a DiskCryptor encrypted volume and physically attach the drive to another system or boot into another OS and then mount and decrypt the drive with TrueCrypt. You cannot even do this with TrueCrypt encrypted drives as the technology behind TrueCrypt whole drive encryption is not compatible with regular TrueCrypt encrypted volumes. To me this is really exciting and useful as it allows me to move drives between systems and retain access to the encrypted data. There is also a BartPE plugin for DiskCryptor so you can boot from a BartPE CD and decrypt/access the encrypted drive. Finally, version 0.3 (coming out shortly) adds support for installing the DiskCryptor boot code on other media (e.g. flash memory keys, CD-ROMs, etc.).

Fifth, DiskCryptor appears to be faster than TrueCrypt 5 WDE. At least on my system I noticed no slowdown with DiskCryptor but TrueCrypt 5 significantly slowed down my disk intensive operations. This is a major reason I personally switched back to DiskCryptor and I’m not the only one as evidenced by some posts in the DiskCryptor forums which indicate that in terms of MB/s DiskCryptor is as much as twice as fast as TrueCrypt 5, at least on some systems. Based on my experience I would agree. I understand there have been some performance enhancements in TrueCrypt 5.1a which include some assembly optimization (which was already a part of DiskCryptor) and I have not had a chance to test this latest version yet but believe speed improvements have also been made in the latest version of DiskCryptor which may still give it the edge.

Sixth, the development of DiskCryptor is both more active and more responsive to users than TrueCrypt. “ntldr”, the developer of DiskCryptor, has been very open to suggestions and very responsive to users through the forum on their website. The same cannot be said for TrueCrypt. Based on what I’ve seen from various TrueCrypt users, they have often been ignored by the TrueCrypt developers, who seem to be a small group that does not respond particularly well to users or accept development assistance (one of the major benefits of open source development). The disenfranchised users include the DiskCryptor developer “ntldr” along with OS X users who started a project called OS X Crypt because of the unresponsive nature of TrueCrypt developers. I think this potentially will be a huge problem for TrueCrypt and it makes me somewhat concerned about the motives and long term success of the TrueCrypt development team. This is also manifested in the somewhat restrictive nature of the TrueCrypt source license compared with other open source licenses such as the GPL (which is used by DiskCryptor). While TrueCrypt may be open source it is most definitely not GPL software and not GPL compatible (read about the issues of including this with GPL software here).

There is one downside to DiskCryptor: there is currently no real help file or instructions for using it, but I was able to figure it out by looking at the menu options, all of which seem fairly straightforward to me. This is an acknowledged flaw and is being actively worked on by a few DiskCryptor users. In the meantime the primary developer is more concerned with enhancing the feature set and eradicating bugs than with developing documentation, an understandable position for many volunteer software developers.

Communication and publicity are not strong suits for DiskCryptor and this may be partially due to the fact that English is not the first language of the developer. In my opinion this, more than anything, is holding back what is otherwise an excellent (and in my mind superior to TrueCrypt) product. Much of the information is available but it’s in the DiskCryptor forums which contain a mix of Russian and English, making them not the most user friendly way to learn about the software. There has also been little tech press coverage of the program.

I am not so much trying to make the case by myself that DiskCryptor is a better product for everyone, though it was for me. I am trying to bring some attention to the first open source whole disk encryption program (there was even a Wikipedia vote where it was decided to eliminate the page for DiskCryptor as non-notable and where people seriously questioned if it was just a knock off of TrueCrypt 5!) and encourage others to talk about and try DiskCryptor. Certainly the program could use some English language press if it is to grow significantly. Hopefully by explaining my reasons for selecting DiskCryptor as my choice I’ve encouraged you to at least keep an open mind and try the software then write and share your experience with others.

Whole Disk Encryption

My laptop is one of the IBM (Lenovo) Thinkpads which includes a fingerprint reader and TPM chip which can be used to both unlock the system at boot and log on to Windows using software supplied with the computer. One thing that the supplied software does not do, but something that I’ve been interested in doing, is whole disk encryption (something also called by a few other names depending on the vendor and software).

You can learn more about whole disk encryption in this article written by Bruce Schneier a couple of months ago or from the Wikipedia article. Essentially the idea is to encrypt the entire hard drive rather than a small subset of files. Obviously this does not protect the files while the computer is operating but is especially useful if you have a laptop (something prone to being stolen) and want to ensure that if someone stole it the data on it would be useless. While some free utilities such as TrueCrypt have allowed you to encrypt entire volumes they have not allowed you to encrypt the boot drive, at least not when using the Windows operating system. You see the trick with encrypting the boot drive is that you need to unencrypt it for the system to boot so a driver must be loaded at boot time which will prompt the user for a password and thus unlock the key allowing the drive to be unencrypted and the system booted. Until recently there were no free or open source programs which allowed you to do this with the Windows OS (solutions for Linux were available).

In the span of just over a month that has all changed. In December a Russian security consultant released the open source program DiskCryptor (and on SourceForge) which allows you to install a Windows driver (which can be renamed for extra obscurity) which will encrypt your drive and also allow you to install a boot time driver onto the disk which allows for the encryption of the boot volume. The encryption algorithm and container is TrueCrypt compatible so if need be you can access the drive by putting it in another computer which has TrueCrypt installed and mount the volume (with the appropriate password of course). This is an especially nice touch as it ensures some kind of compatibility between the open source projects and makes data recovery from an otherwise dead system a bit less problematic. I’ve been successfully running DiskCryptor on my laptop boot drive for several weeks now and have found the program works as advertised though there is essentially no help file or other documentation so you have to learn the program by playing around with it and looking at menus.

Later today TrueCrypt plans to release version 5.0 of their popular open source encryption software which among other things promises to include a boot driver for Windows systems which will allow the encryption of the boot drive. I plan to try out this software once it becomes available. I am excited to see that there will be two open source solutions to whole drive encryption and look forward to improvements in one or both of the programs.

A few things to note. Neither of these solutions (as far as I’m aware) supports the TPM chip and fingerprint reader in my laptop. This means that you need to enter a separate password to unlock the hard drive in addition to unlocking the computer. It also means that the encryption is all taking place in software and utilizing CPU cycles and slowing down drive access times. While I haven’t noticed a pronounced effect in my usual word processing and Internet browsing on this system I can see that this might be problematic for a media or gaming intensive situation. Hopefully advancements to these solutions will allow for better integration with hardware acceleration and authentication to improve this situation.