http://www.tcl.tk has a lot of broken links. I ran [TclLinkCheck] against it and put the results up here: http://www.dedasys.com/report.html - I think it wouldn't be a bad idea to do this with the wiki as well.
----
''Would it be an idea to integrate this into wikit - or run it periodically on the same site? -[jcw]''
----
If it were integrated, it could be much more efficient. I think it makes a lot of sense - there are too many dead links in the Tcl world :-(

[jcw] - Ok, let's try to sketch the logic - the link checker runs once in a while. What should it do?
   * create a page with broken links and page refs (might be quite large)?
   * have a mechanism to only declare a link broken if several checks fail?
   * alter the page, perhaps simply adding "[[Broken link]]"?
   * ...
----
[davidw] - If I had to implement it, I would do something like this:

1) Create a page which references all the pages with broken links - but don't include the broken links themselves, at least for now; I think it would be huge.

2) Run a weekly cron job that goes through each page, tests its external links, and updates the page with some kind of broken-link tag. The original URL should not disappear, because it might be useful in tracking down the correct URL, especially when it's just a spelling mistake. This could even be monthly, since after the initial run its main purpose would be to turn up links that have gone stale in the meantime, which hopefully won't happen every day.

3) Maybe each night check the pages that were changed that day, to make sure we're feeding good data into the wiki.

[AK]: Consider an image for the [[Broken Link]] tag in addition to the text ''broken link'' - the text for searching, the image for visually scanning the page.

[jcw] - Ok, how about marking bad links thus: some text, a link http://blah.blah [http://mini.net/badurl.png] with some more text. With an "align=middle" and "alt=BADURL" tag, this would not exceed the line height and would be easy to search for. It can be done for verbatim (fixed font) text too (where URLs are auto-converted to links as well), but I can't show that here.
----
[LV] After software has identified bad links, then what? Are there volunteers standing ready to look into the situation and resolve the bad links? I've seen people propose - and in fact implement - deleting bad links. Of course, that results in pages with basically empty content. And perhaps the link is only down for the current week because the site owner is on vacation, there was a power outage, or someone hijacked their DNS entry. Or are we saying, in the text above, that the link checker's result would be to put the 'broken link' image beside the actual link? If so, many pages would change on a regular basis just from the link checker. And the link checker needs to be smart enough not to re-mark a link that is already marked as broken - but to unmark a marked link if it comes back.
----
[SC] Instead of making a change in the source, why not just make the change when the text is rendered, whether as HTML or some other format? One way would be to add a class attribute to the link, which the stylesheet could render differently. You could then add a note to each page encouraging folks to try to fix bad links. To do this, the page renderer would need access to a db of bad links found by the web spider.
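----
Here is a rough, untested sketch of what the periodic check described above might look like in Tcl, using the http package. The proc names, the retry count and the idea of dumping results to a plain badlinks.txt file are just placeholders for illustration - a checker built into wikit would presumably pull the list of external URLs out of the page database and store its results there too.

 package require http

 # Return 1 if a URL looks broken, 0 if it looks fine.  A link is only
 # declared broken after every one of several attempts has failed,
 # along the lines of [jcw]'s suggestion above.
 proc checkUrl {url {tries 3}} {
     for {set i 0} {$i < $tries} {incr i} {
         if {[catch {
             # -validate 1 sends a HEAD request; a full GET may be
             # needed for servers that mishandle HEAD.
             set tok [http::geturl $url -validate 1 -timeout 15000]
         }]} {
             continue   ;# could not even connect
         }
         set status [http::status $tok]
         set code   [http::ncode $tok]
         http::cleanup $tok
         if {$status eq "ok" && [string is integer -strict $code] && $code < 400} {
             return 0   ;# reachable - not broken
         }
     }
     return 1   ;# every attempt failed
 }

 # Hypothetical driver for a cron job: check a list of external URLs
 # and write the bad ones to a report file, one per line.
 proc checkLinks {urls {report badlinks.txt}} {
     set f [open $report w]
     foreach url $urls {
         if {[checkUrl $url]} { puts $f $url }
     }
     close $f
 }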
----
[LV] So this latest proposal is to do a link check of each wiki page when it is being converted into HTML. That will increase page rendering time. And of course, since the wiki works on a cache, if I change a page in February 2003 and the sites on that page happen to be unreachable just then due to web traffic, etc., then anyone coming to that page months or years in the future will see the site marked as having a broken link, even if the problem was resolved moments later.

[SC] No, I was thinking that the link check could be done periodically and that the renderer would refer to the results of the check when rendering links. I presume the results could be stored in an MK database and that looking up each link during rendering wouldn't take too long. Interaction with the cache might be complicated, I agree. If the link check ran once a month, it could perhaps be run more often on just the bad links, to guard against random traffic-related problems. I'm not sure how the cache logic works, but could cached pages be flagged if they contain bad links, and the cache entry invalidated once a link is found to be good again?
----
[Category Suggestion] | [Category Tcler's Wiki] | [Category Internet] |
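----
To make [SC]'s rendering-time idea a bit more concrete, here is an equally rough and untested sketch of how a renderer could consult the checker's results and tag bad links with a CSS class. This is not how wikit's renderer is actually structured - the proc names and the badlinks.txt file are the same made-up placeholders as in the sketch above, and in practice the results would more likely live in a [Metakit] view next to the page data.

 # Load the bad-link report into an array once, so that a lookup during
 # page rendering is nothing more than an array access.
 proc loadBadLinks {{report badlinks.txt}} {
     global badlink
     array unset badlink
     set f [open $report r]
     while {[gets $f url] >= 0} {
         set badlink($url) 1
     }
     close $f
 }

 # Render one external link as HTML, adding a class attribute when the
 # URL is on the bad-link list, so the stylesheet can display it
 # differently - nothing in the page source needs to change.
 proc renderLink {url} {
     global badlink
     if {[info exists badlink($url)]} {
         return "<a href=\"$url\" class=\"badlink\">$url</a>"
     }
     return "<a href=\"$url\">$url</a>"
 }

The stylesheet could then give a.badlink a warning colour or a strike-through, and a note on each page could invite readers to fix anything shown that way.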