November 14, 2003

Blog Roundup Explained

I had a couple of emails and a couple of comments about the selection process, so I thought I would explain it a little better by walking through the process for this week. Effectively I have a little spider which grabs the index.html page from all my blogs. I don't grab the index page from the humour blog because it has been merged back into the main blog. The spider then extracts only the links from these pages and throws away anything that doesn't qualify (for example, cross links between the sites). I try hard to allow for multiple blogs on a single site, which means the per-site count may sometimes be off a little, but that doesn't matter in the long run. The links that get counted could come from (1) the blog roll (see right), (2) the blog family, (3) trackback links, (4) comments or (5) articles. Note that recent comments will get counted twice because they appear both in the comments under the article and in the "Recent Comments" list.
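For the curious, the extract-and-discard step could be sketched roughly like this in Python. This is not the actual spider, just a minimal illustration: the host names in OWN_HOSTS and the names LinkCollector and qualifying_links are made up for the example, and I am assuming that "doesn't qualify" means links back to my own sites plus anything that is not a plain http or https link.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical host names standing in for "all my blogs".
OWN_HOSTS = {"blog.example.com", "family.example.com"}


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, duplicates included."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def qualifying_links(html, own_hosts=OWN_HOSTS):
    """Return the links worth counting from one index page."""
    collector = LinkCollector()
    collector.feed(html)
    kept = []
    for href in collector.links:
        parts = urlparse(href)
        # Assumption: relative links, mailto: and the like do not qualify.
        if parts.scheme not in ("http", "https"):
            continue
        # Throw away cross links between my own sites.
        if (parts.hostname or "") in own_hosts:
            continue
        kept.append(href)
    return kept
```

Duplicates are kept on purpose, which is why a link that shows up both under an article and in the "Recent Comments" list ends up counted twice.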

Next, the spider cleans up the links and sorts them by quantity. This gives me a Top 10 list. Then it builds a flat list with the appropriate number of entries in it and makes ten random selections from that list. Obviously there will occasionally be a dud link, but that is just too bad :-) I may, at my discretion, substitute an alternative.
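The counting and picking could look something like the sketch below. Again, this is an illustration rather than the real code, and it rests on a few assumptions: that "the appropriate number of entries" means one entry in the flat list per counted link, that the ten selections should be ten different sites, and that a site can be identified by its host name (which ignores the multiple-blogs-on-one-site wrinkle). The function names are invented for the example.

```python
import random
from collections import Counter
from urllib.parse import urlparse


def site_of(link):
    """Reduce a link to its host name (simplification: one blog per host)."""
    return urlparse(link).hostname or link


def top_sites(links, n=10):
    """Sort sites by how many links they received and keep the top n."""
    return Counter(site_of(link) for link in links).most_common(n)


def weighted_picks(links, k=10, rng=random):
    """Pick up to k distinct sites; more links means a better chance."""
    counts = Counter(site_of(link) for link in links)
    # The flat list: each site appears once for every link counted.
    flat = list(counts.elements())
    picks = []
    while flat and len(picks) < k:
        choice = rng.choice(flat)
        picks.append(choice)
        # Assumption: a site is picked at most once, so drop its entries.
        flat = [site for site in flat if site != choice]
    return picks
```

Calling top_sites(links) gives the Top 10 with counts, and weighted_picks(links) gives the random selection. Passing rng=random.Random(some_seed) makes the selection repeatable, which is handy for checking the behaviour.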

Note that the list is semi self-fulfilling, in that a site picked last week will have four or five additional links this week. Again, looking at the numbers, I don't think this will be a serious problem, but if it turns out that the same sites are always in the selection list then I may bring in a one-week delay before they can reappear (or something).

Posted by Ozguru at November 14, 2003 05:11 PM
