07 May 2002

Tue, 07 May 2002

Sam Ruby further explores Mark Pilgrim's referer code

Sam Ruby states:



Mark described his implementation here.  The key addition was to actually fetch and parse the pages identified in the referrer logs. [Sam Ruby]


Exactly. I had actually studied this as well as Chris Wenham's code. The problem is that there's a whole lot of code involved here that is web-server or platform specific. Confession: In a former life, I worked on web server usage analysis software. I know how difficult it is to write a log file parser that works across 18 flavors of Unix, Windows, and Macintosh AND handles the fact that log file formats differ and can change on the fly. When we sit down to write one for ourselves, we take very little of that into account, if any. Any attempt at doing that in a generic way would be a huge undertaking. Thus, the only way to provide that deep context, such as Chris Wenham described to me, is to state that this solution only works with web server X on platform Y with blog Z. And that's why the rate of proliferation on these things has been slow. There are a lot of folks who might like to try it but aren't skilled at Perl or the various hackery that's required to get this stuff running. I'd like to make it easier for the masses, at least in the simple case.
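To make the "server X on platform Y" point concrete, here is a minimal sketch of pulling the referer out of one log line. It assumes the Apache/NCSA "combined" log format, which is only one of the many formats mentioned above; the regular expression, the function name parse_line, and the sample line are all made up for illustration and are not taken from anybody's actual code.

```python
import re

# ASSUMPTION: Apache/NCSA "combined" log format. Other servers differ.
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def parse_line(line):
    """Return (requested_path, referer), or None if the line does not match."""
    match = COMBINED.match(line.strip())
    if match is None:
        return None  # a different log format, or a damaged line
    parts = match.group("request").split()
    path = parts[1] if len(parts) >= 2 else ""
    referer = match.group("referer")
    if referer in ("", "-"):
        referer = None  # the browser sent no referer
    return path, referer

# Hypothetical sample line, not real log data.
sample = ('192.0.2.1 - - [07/May/2002:23:11:00 -0400] '
          '"GET /2002/05/07.html HTTP/1.0" 200 5120 '
          '"http://example.org/blog/" "ExampleBrowser/1.0"')

print(parse_line(sample))
```

Anything that does not match the pattern is simply skipped here, which is exactly the kind of shortcut that works for one site and fails on the next server's format.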


My approach was to provide a solution that does not solve all the world's problems and has a low implementation cost (since that cost is subtracted from my sleeping hours ;->), but offers utility that is at least in the ballpark of what Pilgrim, Wenham, and Orchard have achieved.


Fact is, I have the referers for my site right here, and I have Webalizer running out of a cron-scheduled PHP script. I could easily use that data for my own site, and I could derive the context that Chris describes, but that would only solve the problem for me. I'd rather take a stab at solving it for a broad class of users, and the response from "there is no spoon" suggests that there's a market for such a beast, even if it's not perfect.

Posted at: 23:11 | permalink

Updating templates with getReferers web service

there is no spoon says: 



David Watson comes to the rescue for Radio users who would like to add "autolinking" or automatic "backlinking" to their blogs. He's created a little webservice called "getReferers" that automatically generates a list of links to the sites who are linking to you. It then helpfully puts that list at the end of your page. Allowing your readers to see who is linking to you helps to put your comments in context (which, btw, is what everyone who likes the Instant Outliner concept is so excited about). Thank you, David! One question: Um, where am I supposed to paste your code? Does it go in the homepage template? By the way, I think the way JimsLog handles this referer feature is the best I've seen because it indicates referers by post. This makes context even better because you can quickly and easily jump to those links that are specifically related to an individual post you like. I wonder if this was what Sam Ruby was getting at when he said getReferers does "not appear to be distinguishing between pages which contain specific links to specific articles and pages which simply link to something on your website." Of course, above I just linked to the home page of JimsLog, rather than to a specific post. It doesn't really look like his "auto-backlink" method has a way to account for such general links. The "auto-backlink" method employed at diveintomark is sort of a half-way point between getReferers and what JimsLog is doing. (It seems Disenchanted.com (I'm sorry I keep spelling that wrong) doesn't have this problem because it deals in "stories" or "articles" which are one-per-page rather than the blog's multiple-posts-per-page format.) So, what if I put the getReferers script in my Item template? (I'm guessing it would still call for links to the whole page, not just links to that item, right?) [there is no spoon]


You're welcome. Thanks for trying it! I'll try and respond as best I can. Please let me know if I leave anything out. There's so much complexity in some of this stuff, it's easy to overlook otherwise important bits. Doc writing is hard work! Anyway, here goes...


In Radio, you paste the code into your templates. From my experience, I would suggest that you start by creating a temporary placeholder story. I called mine test. It'll look a little funny since I have the referrers coming out in the footer also but you get the picture. My main point with creating the test story is that it gives you a place to try out the code without destroying your entire site by horking up the templates. I've done that a couple of times because I'm not particularly skilled at editing HTML and so I made a mistake and my blog looked like a bad car accident. You can mess up the test story and it's not going to mess up your site.


So, take the code from the article, create a new story, switch to source mode if you're using the WYSIWYG editor, and paste the code. Edit your user number in the code and click the Post Changes button. You should see your referer list come back from my app server shortly. If that works, you can move on to the templates. In my case, I edited two templates, #homeTemplate and #template. I added the code immediately after the body text in each template. Digression: I use CSS templates from Joe Gregorio at BitWorking. These may look quite a bit different from what you're running. Here's what it looks like in my #homeTemplate and #template:


<%bodytext%><% scratchpad.s = tcp.httpClient (server:"www.watsondesign.com", path:"/soap/urn:rcsproxy/getReferers?site=0102172&group=radio1", ctFollowRedirects:"5"); string.httpResultSplit (scratchpad.s) %>


Save your templates, open Radio, and click Radio/Publish/Entire Website. I believe the bodytext macro should be the same in your templates, so you might try searching for it in the template text. Try making yours look similar to this and you should get similar results. The only issue might be if the formatting of the table isn't right for your blog. I'm passing along whatever formatting I get back from the RCS server.


Finally, you are correct in your analysis of the referer linking issues with regard to Sam Ruby's comments. I'll go further into the issue in response to Sam shortly. It's largely a matter of design constraints. In short, no matter where you include the call to my service, you'll get exactly what the RCS server thinks your referers are, and that's always by site, not by page.
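Here is a rough sketch of the by-site versus by-page distinction. The hit list is invented, and the function names referers_by_site and referers_by_page are hypothetical; nothing here reflects how the RCS server actually stores its data, which is opaque to me.

```python
from collections import defaultdict

# Hypothetical hits: (page requested on your site, referring URL).
hits = [
    ("/2002/05/07.html", "http://example.org/blog/"),
    ("/2002/05/07.html", "http://example.net/links.html"),
    ("/", "http://example.org/blog/"),
]

def referers_by_site(hits):
    """One flat list for the whole site: the requested page is thrown away."""
    return sorted({referer for _page, referer in hits})

def referers_by_page(hits):
    """Referers grouped under the page they pointed at."""
    grouped = defaultdict(set)
    for page, referer in hits:
        grouped[page].add(referer)
    return {page: sorted(refs) for page, refs in grouped.items()}

print(referers_by_site(hits))
print(referers_by_page(hits))
```

Once the requested page has been dropped, as in the first function, no amount of template placement can bring it back, which is why putting the call in the Item template would not change the result.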


Good luck! Let us know how it goes.

Posted at: 23:06 | permalink

Sam Ruby responds to referers issue

Sam Ruby says:



OK, David - I'm helping spread the word.  But I must say, what it appears to me is that you have put your referrer logs in context (which is a step forward), but do not appear to be distinguishing between pages which contain specific links to specific articles and pages which simply link to something on your website. [Sam Ruby]


Thanks for shedding some light on the issue, Sam. You are correct, this approach does not provide the same level of context that you see at disenchanted, decafbad, or diveintomark. However, there's a tradeoff between the autolink software being tightly coupled to the particular website implementation, meaning scripting language, log files, etc., and the design working for a broad range of sites, Radio and RCS users in this case. The data at the RCS server is telling us the referers linking to the site, not the page. If there's a way to get referers by page, it's opaque to me. I will say that if the data is available at the RCS server, I'd be glad to enhance my service to return the data in context. Further, if the data isn't available but Userland would like to add or expose it, I'll do the same.
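For the curious, the distinction Sam draws can be sketched in a few lines, assuming you already have the HTML of the referring page in hand (the fetch is left out). This is not Mark's, Chris's, or anyone else's actual code; the class LinkCollector, the function classify, and the example URLs are all invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the referring page's URL.
                self.links.append(urljoin(self.base_url, href))

def classify(referring_html, referring_url, article_url):
    """Return 'article', 'site', or 'none' for one referring page."""
    collector = LinkCollector(referring_url)
    collector.feed(referring_html)
    if article_url in collector.links:
        return "article"  # links to this specific article
    my_host = urlsplit(article_url).netloc
    if any(urlsplit(link).netloc == my_host for link in collector.links):
        return "site"     # links to something on the site, but not this article
    return "none"

article = "http://example.com/2002/05/07.html"
page = '<p>See <a href="http://example.com/2002/05/07.html">this post</a>.</p>'
print(classify(page, "http://example.org/blog/", article))
```

The comparison is an exact string match on the URL, so a link with a different fragment or a trailing slash would be counted as a site link; a real implementation would need to normalize URLs first.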


The sites that have implemented this in the most effective way - disenchanted, decafbad, and diveintomark - to the best of my knowledge are all doing it with the local data that their respective web servers provide. That assumes a lot about the web server implementation and is the principal reason why I rejected that approach in this case.


I did intend to discuss this in some detail in my original post, but I ran out of steam and was not thinking clearly after hacking for hours. I often need to revisit these posts and fill in the blanks. My view right now is that if what I did provokes more discussion on the subject, it'll be a good thing. I'd be perfectly happy if somebody came along and explained how to do this without any compromises with Radio and RCS, but I don't believe that's trivial given the aforementioned design constraints. Finally, a big tip of the hat to the folks who pioneered this technique. They deserve a huge round of applause.

Posted at: 18:51 | permalink

I am not Dave Winer, but I am listening

There is no spoon comments on all the link back talk and I provide the answer for Radio or RCS users. Sometimes, I don't think anybody can hear me scream. There are linkbacks here! They are easy to use. If not, ask a question. I won't bite. Come and get 'em. BTW, love the blog name. Cool.

Posted at: 16:19 | permalink

getReferers web service works with ctFollowRedirects in Radio now

This update concerns the debugging that Jon Udell was helping me out with on my getReferers web service.


<% scratchpad.s = tcp.httpClient (server:"www.watsondesign.com", path:"/soap/urn:rcsproxy/getReferers?site=0102172&group=radio1", ctFollowRedirects:"5"); string.httpResultSplit (scratchpad.s) %>


The syntax of the ctFollowRedirects parameter was wrong in the sample code for Radio. I have corrected it here and in the article so that you can embed the call with the correct domain name as it is shown here into your Radio code. This has been tested outside of my firewall via Radio as well as through a proxy server.
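If you are calling the service from something other than Radio, the request boils down to a plain URL with two query parameters. The sketch below only builds that URL and does not make a network call. The server, path, and the site and group parameters come from the macro above; the function name get_referers_url and the plain http scheme are my assumptions.

```python
from urllib.parse import urlencode

# Server and path as they appear in the Radio macro.
SERVER = "www.watsondesign.com"
PATH = "/soap/urn:rcsproxy/getReferers"

def get_referers_url(site, group="radio1"):
    """Build the getReferers request URL for a Radio user number.

    ASSUMPTION: plain http on the default port.
    """
    query = urlencode({"site": site, "group": group})
    return f"http://{SERVER}{PATH}?{query}"

print(get_referers_url("0102172"))
```

Swap in your own user number for the site argument, just as you would when editing the macro in your templates.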


Thanks Jon!

Posted at: 15:53 | permalink

CocoBlog and Cocoon

Ugo Cei answers my question about which version of Cocoon CocoBlog has been tested with: the May 3rd CVS version.

Posted at: 14:24 | permalink

CocoBlog on the move

Looks like Ugo Cei's not the only one with a Cocoon blog anymore. Cool.

Posted at: 14:00 | permalink

Cracks in the armor at Oracle

I.B.M. Overtakes Oracle in Total Database Sales. PALO ALTO, (Reuters) - International Business Machines Corp. took the No. 1 spot last year in terms of total new database software sales from long-time leader Oracle Corp., according to a new report from Dataquest, a unit of technology research firm Gartner Inc. By Reuters. [New York Times: Technology]

Posted at: 06:38 | permalink

The relationship between google and weblogs

Jon Udell: Google's bias is a temporary anomaly.  Agreed.   See also Google Blogs. [Sam Ruby]

Posted at: 06:28 | permalink