My Boog Pages

Wednesday, January 25

CODE: MONKEY

Now It Can Be Told! The Real True Story Of CrimeSpot.net

Now that CrimeSpot.net is up and running, I thought I'd give a little history of how it came about, for both of you who are interested in such things.

It all started back in late September. I was cruising the net, as usual, bouncing around from site to site. I checked Google News to see what was up, then surfed over to a blog I'd never heard of before. It turned out to be pretty cool, and I thought, "Too bad there's no site that rounds up this stuff, so I wouldn't have to find it on my own..."

That's when a man came down from Heaven on a flaming pie and said, "YOU SHALL CREATE A SITE TO COLLECT CRIME FICTION BLOGS, AND IT SHALL BE CALLED CRIMESPOT.NET."

See, most blogs create some sort of syndication file when they are updated, containing the titles, links, and brief summaries of the site's recent posts. These files are invariably in XML, in one or another of the common formats (RSS or Atom, mostly). And I happened to have been working on an XML processing library for over two years.

The next decision: where to host. My code was all in VBScript, which normally runs only on Windows servers, but I really wanted to host on NearlyFreeSpeech.net, which is cheap and reliable but doesn't support VBScript. So that man on the flaming pie made another visit and said, "GENERATE THE PAGES OFF-LINE THEN UPLOAD THEM VIA FTP." Which really wasn't a bad idea - I could create the pages on any old PC and send them up to the server whenever I wanted.

This also meant that these pages could be static: they wouldn't have to be created on the fly, which would help with the site's performance.
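For the curious, the generate-then-upload idea looks roughly like this. It's a Python sketch, not my actual VBScript; the function names, the page layout, and the host and login values you'd pass in are all made up for illustration.

```python
from ftplib import FTP
from html import escape
from pathlib import Path


def render_page(title, entries):
    """Build one static HTML page from a list of entry dicts.

    Each entry is assumed (hypothetically) to have "title", "link"
    and "summary" keys.
    """
    items = "\n".join(
        '<li><a href="{}">{}</a>: {}</li>'.format(
            escape(e["link"], quote=True),
            escape(e["title"]),
            escape(e["summary"]),
        )
        for e in entries
    )
    return (
        "<!DOCTYPE html>\n<html><head><title>{0}</title></head>\n"
        "<body><h1>{0}</h1>\n<ul>\n{1}\n</ul>\n</body></html>\n"
    ).format(escape(title), items)


def upload(path, host, user, password):
    """Send one generated file to the web host over FTP."""
    path = Path(path)
    with FTP(host) as ftp:
        ftp.login(user, password)
        with path.open("rb") as fh:
            ftp.storbinary("STOR " + path.name, fh)
```

The page gets written to disk on the local PC first, then `upload` pushes it up whenever it's convenient. The server never runs any code of mine; it just serves files.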

The next thing to do was gather some examples and see what I would be dealing with, so I downloaded RSS or Atom files from sites like Ed Gorman's, Jim Winters', Dave White's, etc. And I got a really big shock.

My code was originally intended to generate documents from Access databases, and I got to pick the format. Naturally I chose features that were straightforward to implement. Saving documents back to the database was a later add-on, and was only intended to handle files with the structure I expected. When I looked at the sample files I'd downloaded, I thought, "Holy crap!", because they contained all sorts of weirdness.

After dicking around for a while I finally decided not to try extending my code to handle them. Instead I used XSLT templates to transform them into a common format that I could handle easily. Each type of syndication file got its own template, more than one in cases where different providers had slightly different formats. This proved to be a very important choice down the road.
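To give a feel for the "common format" idea, here's a minimal sketch in Python rather than XSLT (so not what I actually wrote). It assumes only two input flavors, RSS 2.0 and Atom 1.0, and the output field names ("title", "link", "summary", "date") are invented for the example. Real feeds come in more variants than this.

```python
import xml.etree.ElementTree as ET

# Atom 1.0 namespace; older Atom drafts use a different one (not handled here).
ATOM = "{http://www.w3.org/2005/Atom}"


def _text(node, tag):
    """Return the stripped text of a child element, or "" if it is missing."""
    return (node.findtext(tag) or "").strip()


def normalize_feed(xml_text):
    """Turn an RSS 2.0 or Atom 1.0 document into a list of plain dicts."""
    root = ET.fromstring(xml_text)
    entries = []
    if root.tag == "rss":
        for item in root.iter("item"):
            entries.append({
                "title": _text(item, "title"),
                "link": _text(item, "link"),
                "summary": _text(item, "description"),
                "date": _text(item, "pubDate"),
            })
    elif root.tag == ATOM + "feed":
        for entry in root.iter(ATOM + "entry"):
            link = ""
            for el in entry.findall(ATOM + "link"):
                # A link with no rel attribute counts as rel="alternate".
                if el.get("rel", "alternate") == "alternate":
                    link = el.get("href", "")
                    break
            entries.append({
                "title": _text(entry, ATOM + "title"),
                "link": link,
                "summary": _text(entry, ATOM + "summary")
                or _text(entry, ATOM + "content"),
                "date": _text(entry, ATOM + "updated"),
            })
    else:
        raise ValueError("unrecognized feed type: " + root.tag)
    return entries
```

Everything downstream only ever sees the one dict shape, which is the whole point: a new oddball feed means one more branch (or, in my case, one more template), not changes to the rest of the code.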

These templates allowed me to add features when I had to. When I found that standard Internet time formats could not be imported into Access, I used the templates to change them into something that could. When I needed to cut down the site summaries to 25 words (most sites use 40 or 70), I used a nice recursive template. When I needed to cut HTML tags out of the input - yeah, a template.
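Those three cleanup jobs are easy to show in miniature. This is a Python sketch, not my XSLT: the function names are made up, the word trimmer is a plain slice instead of a recursive template, and the "YYYY-MM-DD HH:MM:SS" output is just my assumption of a database-friendly shape. RSS dates follow RFC 822 and Atom dates follow RFC 3339, so the date helper handles both.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from html.parser import HTMLParser


def to_db_timestamp(text):
    """Convert an RFC 822 or RFC 3339 date to 'YYYY-MM-DD HH:MM:SS' in UTC."""
    text = text.strip()
    if text[:4].isdigit():
        # RFC 3339 (Atom); older Pythons do not accept a trailing "Z".
        dt = datetime.fromisoformat(text.upper().replace("Z", "+00:00"))
    else:
        # RFC 822 (RSS), e.g. "Wed, 25 Jan 2006 17:39:00 -0500".
        dt = parsedate_to_datetime(text)
    if dt.tzinfo is None:
        # No usable offset given: treat the time as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def truncate_words(text, limit=25):
    """Keep at most `limit` words, adding '...' if anything was cut."""
    words = text.split()
    if len(words) <= limit:
        return " ".join(words)
    return " ".join(words[:limit]) + "..."


class _TextOnly(HTMLParser):
    """Collect text content and ignore every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html_text):
    """Remove HTML tags, decode entities, and collapse whitespace."""
    parser = _TextOnly()
    parser.feed(html_text)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

Run in order (strip the tags, then trim the words), a summary comes out as short plain text, and the date comes out as something a database will take without complaint.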

Once the data was in the database I could spit it back out in whatever form I chose, so the programming was done. It took almost as much time to come up with a layout for the site, but that was MUCH simpler, and only took so long because of my "mad" "HTML" "skillz".

And now it's up. Enjoy.

posted by Graham at 5:39 PM
