LangPop.com Updates
It took me a while to get the September update of LangPop.com out. The reason: I had to redo the book statistics.
Previously, I got those from Amazon's web service, which worked ok as far as those things go. However, recently, I ran into several problems:
- They started requiring signatures on the requests, which meant I had to stop and fiddle with my code, which I did not appreciate. (If you have this problem and use Ruby, get the amazon-ecs Ruby gem.)
- Their results suffered from some kind of regression in any case: C, C# and C++ all showed up with the same number of results. They didn't use to show the same numbers. Not good from my point of view.
- They were completely unresponsive to requests for information or help.
- And for good measure, they hid the information about their products API on some other web site, with no links from their web services page, which makes it annoying to find and makes me suspect that they may remove it at some point in the future.
All of these together made me decide it was time to switch data providers. Where to go? The answer was obvious: back to my home state of Oregon, to Powell's Books, a huge independent book seller. If you ever go to Portland, do yourself a favor and go to Powell's. It is a city block full of books on multiple floors. When I lived in Portland, I would go there before eating to give myself a reason to leave (although they do have a cafe...). And that's just the regular book store. They have another huge space a few blocks away dedicated entirely to technical books. Powell's is one of my favorite things in Oregon, and a great experience.
Anyway, I decided that they're big enough, and do enough business online that I'd ask them about getting data, so I sent some email asking after what options were available. I got a very helpful answer from a real person, CJ Stritzel, who was kind enough to explain what the options were, and work with me to help find the best one for what I needed. Definitely better than "/dev/null", the Amazon message board!
The API is still something they're working on, but it works well enough for what I need, and I'm quite happy with the situation. It has caused some changes in the rankings, though: they show Java books as being far and away the most popular, which has pushed that language into the top spot in the rankings. We'll see over the coming months; I may rework some of the queries I'm using as their API stabilizes and I experiment with different combinations of keywords.
In any case, thanks Powell's, and enjoy the new statistics everyone.
Tcl and the Tk Toolkit, 2nd edition
I'm not sure what I was expecting, but the news that "my" book is finally out is cause for excitement, anxiety, and relief.
It's not actually "my" book at all; it was originally written by Dr. John Ousterhout, who created the Tcl programming language and the Tk toolkit. And Ken Jones is listed as the co-author of the 2nd edition, because he's the guy that pulled everything together and managed to actually bring the project to fruition.
However, I did play a part in the project. About 5 years ago, I was doing a lot with Tcl, and then, as now, I thought it was a shame that the language was not promoted as well as it could be. It's not a perfect language, but at nearly 20 years old, it has stood the test of time, and it has a lot going for it. In any case, at the time, I had noticed that there was no good, publicly available tutorial. Tcl has always had an excellent set of man pages describing all of its various commands and subsystems, but did not ship with a tutorial that would help new users get started.
I thought that that would be a good problem to solve in order to make the language more attractive to newcomers. However, writing a tutorial from scratch is no simple task, so I started looking around for material that might be adapted. One thing that sprang to mind was the first edition of "Tcl and the Tk Toolkit". At that point the book was nearly 10 years old, which in tech years is practically forever, and indeed, the material in the book, while good, was quite dated for some portions of Tcl. Figuring it couldn't hurt, I emailed Dr. Ousterhout to see what he thought of open sourcing the book, or at least the first part of it, with the idea of updating and utilizing that as a tutorial. He was actually ok with that idea, but the copyright belonged to the publisher rather than him, so he gave me a contact with the publisher, who I then sent my idea to. They weren't necessarily against the idea of releasing the material under an open license, but came back with another idea: why not update the book? That sounded like an interesting project in its own right.
I don't recall all the details (this was in 2004), but the publisher ended up with Ken Jones as the guy to run the project, as he'd participated successfully in updating another popular Tcl book. In the end, there were several of us (Eric Foster-Johnson, Donal Fellows and Brian Griffin, in addition to Ken and myself) that worked on the project, divvying things up by sections, with Ken coordinating. I am honored to have my name listed amongst such august company.
Which brings me back to being nervous: the idea of having something I wrote "set in stone" is a source of some anxiety for someone like me, used to working in a world as malleable as that of software. Also, I wrote most of my contribution (I focused on the C API) several years ago, and I hope it's still valid, and of course that there aren't any huge and glaring errors.
For a while, the book seemed like it was in limbo - just when I thought it was dead, Ken would come back with an update. This was also a cause for concern, as I wanted the thing to either die a dignified death, so I could get on with my life, or appear in print.
As for the tutorial, I did manage to make that happen as well. Clif Flynt, who authored a Tcl/Tk book of his own, was kind enough to donate some material from one of his tutorials, which several of us hacked at to get it into shape. It's available here: http://www.tcl.tk/man/tcl/tutorial/tcltutorial.html
In any case, I sincerely hope that the material I wrote manages to do justice to the years and years of work that have gone into the language and its implementation, and hope you find the book valuable if you pick up a copy.
Stopping DocBook Version Control Churn
I have been getting frustrated with DocBook's generation of HTML files, because it seems to gratuitously insert random ids for things in a pattern that doesn't repeat: run xsltproc on the same file twice, and the output changes. Subversion, git, etc. pick up on the differences and want to save the new version, which is quite annoying.
I figured out a way to get around this. I have no idea if it's ugly or not, but it works and that's enough for me. Comments/ideas/improvements welcome.
<xsl:template name="object.id">
  <xsl:param name="object" select="."/>
  <xsl:choose>
    <xsl:when test="$object/@id">
      <xsl:value-of select="$object/@id"/>
    </xsl:when>
    <xsl:when test="$object/@xml:id">
      <xsl:value-of select="$object/@xml:id"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:text>id-</xsl:text><xsl:number level="multiple" count="*"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
The key is the line:
<xsl:text>id-</xsl:text><xsl:number level="multiple" count="*"/>
which replaces
<xsl:value-of select="generate-id($object)"/>
(The xsl:text wrapper matters: a bare "id-" on its own line would carry the newline before it into the generated id.)
The problem is that generate-id, at least in xsltproc, uses a fairly random bit of data (memory location) to create the id.
My fix uses the location of the element within the document as the id.
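For anyone who wants to try this, here is a minimal sketch of how the override might be wired in as a customization layer. The file name custom-html.xsl and the import URI are assumptions (point the import at wherever your DocBook stylesheets actually live). It also includes one variation that is not part of my original fix: xsl:number numbers the current node rather than $object, so the sketch makes $object the current node first, which should matter whenever object.id is called with an explicit object parameter.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- custom-html.xsl: hypothetical name for a DocBook customization layer. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Assumption: the stock DocBook HTML stylesheet resolves at this URI,
       either directly or through an XML catalog. Adjust for your install. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>

  <!-- Takes precedence over the imported object.id template. -->
  <xsl:template name="object.id">
    <xsl:param name="object" select="."/>
    <xsl:choose>
      <!-- An explicit id on the element always wins. -->
      <xsl:when test="$object/@id">
        <xsl:value-of select="$object/@id"/>
      </xsl:when>
      <xsl:when test="$object/@xml:id">
        <xsl:value-of select="$object/@xml:id"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- xsl:text keeps stray newlines and indentation out of the id. -->
        <xsl:text>id-</xsl:text>
        <!-- Variation: make $object the current node, because xsl:number
             counts the position of the current node, not of $object. -->
        <xsl:for-each select="$object[1]">
          <xsl:number level="multiple" count="*"/>
        </xsl:for-each>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Running something like "xsltproc -o out.html custom-html.xsl mydoc.xml" twice (mydoc.xml being a placeholder for your own document) should then give identical output both times. The trade-off is that these ids follow the document structure, so inserting or removing an element shifts the ids of whatever comes after it; elements you want to link to reliably are better off with an explicit id.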