Online htmldiff service
Many Webmasters have heard about or used the W3C Link Checker to find dead links on their pages, but very few know that this service was initially created to help editors of W3C specifications find broken links in their documents, as required by the W3C publication rules, themselves a corollary of our motto on stable cool URIs.
Every once in a while, we provide new services to make the life of our collaborators easier, and offer them to the public at large as much as possible; our latest toy in this category is an htmldiff service, which takes two online HTML documents and creates a new document highlighting the differences between them.
This is of course mostly useful for finding the changes between two versions of a given document; indeed, it was created to help show the variations between two versions of a given Technical Report.
The tool itself is a pretty simple Python wrapper around Shane McCarron's htmldiff Perl script; I'm happy to share the code of the Python wrapper if anyone is interested.
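For readers who want to call the service from a script, the request URL can be assembled by hand. The sketch below is only an illustration: the endpoint and the doc1/doc2 parameter names are taken from an example request quoted in the comments on this post, while the function name htmldiff_url and the reading of doc1 as the earlier version are assumptions.

```python
from urllib.parse import urlencode

# Endpoint as it appears in an example request quoted in the comments.
HTMLDIFF_ENDPOINT = "http://www.w3.org/2007/10/htmldiff"


def htmldiff_url(old_url, new_url):
    """Build a request URL asking the service to compare two online documents.

    Assumption: doc1 is the earlier version and doc2 the later one.
    """
    # urlencode percent-encodes ':' and '/' so each document URL
    # travels safely as a single query parameter value.
    query = urlencode({"doc1": old_url, "doc2": new_url})
    return f"{HTMLDIFF_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(htmldiff_url(
        "http://dev.hotelbookings.cz/test/html1.html",
        "http://dev.hotelbookings.cz/test/html2.html",
    ))
```

Opening the resulting URL in a browser should return the highlighted document, provided both input documents are publicly reachable.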
as a blind user dependent upon speech-output, it would be of INESTIMABLE benefit if the DIFF-marker used actual semantic markup when inserting a DIFF, rather than the generic SPAN -- the obvious candidates are: INS and DEL, and to a lesser extent STRONG and EM; SPAN carries no semantics, so it is difficult to communicate to a non-visual user what the color conventions defined for each SPAN actually mean; instead of using SPAN, PLEASE use semantically meaningful elements that can trigger voice characteristic changes (regardless of styling) that alert the non-visual user to the status of the text marked as a DIFF -- currently, the use of color coding alone (albeit through CSS) to convey meaning/context is a clear and inexcusable violation of the Web Content Accessibility Guidelines, versions 1.0 and 2.0 (http://www.w3.org/TR/wcag10/ and http://www.w3.org/TR/wcag20/).<br />
Good point, indeed; as I mentioned, we're re-using an existing script that we didn't develop, but I'll try to see if the original author has a good reason for having used <span> rather than <ins> and <del>.<br />Also, I wonder if "simply" putting CSS aural indications (and if so, which) could help work around that issue?<br />
thank you, dom, for investigating the reasoning behind the use of, and alternatives to, the <code>SPAN</code> element...<br />"simply" putting aural CSS would help, but would only benefit a <strong><em>small</em></strong> segment of users, although with developments like charles chen's FireVox extension for Firefox (which makes the <abbr title="User Agent">UA</abbr> a self-voicing application and supports the CSS3-speech module) the number of users who can take advantage of aural CSS is steadily growing... and, of course, there is tv raman's emacspeak, which (as one might expect from the originator of the concept of aural CSS) has the most robust support for aural CSS of any implementation currently available...<br />i can suggest some concrete markup for <code>@media aural</code> (which would be CSS2 compliant, as Section 19 of CSS2 is the normative section on aural CSS) properties in email -- is there a list to which i should <abbr title="carbon copy">CC</abbr> my suggestions/proposals?<br />generated content using CSS is very spottily supported -- even when the generated content added by the <code>:before</code> and <code>:after</code> pseudo-elements <em>is</em> rendered by a user agent, since the CSS-generated text isn't included in the DOM, it doesn't get passed to screen readers or other assistive technologies -- for more data on this topic, consult the thread that unspools from:<br />http://lists.w3.org/Archives/Public/wai-xtech/2007Nov/0030.html<br />one consideration is a variation on the <acronym title="Web Content Accessibility Guidelines, version 2.0">WCAG 2.0</acronym> Technique C7:<br /><br />http://www.w3.org/TR/WCAG20-TECHS/C7.html<br />which utilizes the "overflow" property of CSS to hide declarative text in "plain sight", so that there is an aural indicator available within the actual document source to indicate the beginning and end of an inserted, deleted or proposed text (it's a "what works now" kludge to a problem that would benefit 
from actual pseudo-elemental text, were it available to a user's assistive technology)... note that the <acronym title="Web Content Accessibility Guidelines, version 2.0">WCAG2</acronym> technique does not use <code>visibility:hidden</code> or <code>display:none</code> as those are universal properties that affect the aural and tactile, as well as the visual canvas, the latter with a non-perceptible gap in the aural canvas, the former with a period of silence (equivalent to the way <code>visibility</code> affects the visual canvas)<br />i hope to work with you and enlist the aid of others with pertinent expertise in these areas, to ensure that DIFF-marked documents which bear the imprint of the W3C conform to <acronym title="Web Content Accessibility Guidelines, version 1.0">WCAG 1.0</acronym> as stipulated by the W3C publications criteria. the use of color or stylistic conventions alone to convey meaning is a clear violation of the Priority 1 checkpoint under <acronym title="Web Content Accessibility Guidelines, version 1.0">WCAG 1.0</acronym>, Guideline 2:<br /><blockquote cite="http://www.w3.org/TR/WCAG10/#gl-color"><br /><strong>Guideline 2. Don't rely on color alone. Ensure that text and graphics are understandable when viewed without color.</strong><br />If color alone is used to convey information, people who cannot differentiate between certain colors and users with devices that have non-color or non-visual displays will not receive the information. When foreground and background colors are too close to the same hue, they may not provide sufficient contrast when viewed using monochrome displays or by people with different types of color deficits.<br /><br /><strong>Checkpoints:</strong><br /><br />2.1 Ensure that all information conveyed with color is also available without color, for example from context or markup. 
<strong>[Priority 1]</strong><br />Techniques for checkpoint 2.1 (http://www.w3.org/TR/WAI-WEBCONTENT-TECHS/#tech-color-convey)<br /></blockquote><br />
I have contacted the script's author:<br />http://lists.w3.org/Archives/Member/w3c-tools/2007OctDec/0017.html (Member-only link, unfortunately).<br />He's offering to implement that change, so hopefully I'll be able to set up the new script to achieve this better effect.<br />That said, I would be interested to read your proposal for CSS aural properties for a diff document, so if you could send it (you could cc w3c-tools@w3.org, or wai-xtech@w3.org, or www-archive@w3.org, I don't have a strong feeling about it), I would appreciate it.<br />Thanks for your detailed feedback!<br />
OK, I've upgraded the script, and it now returns <ins> and <del> as recommended.<br />
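To illustrate what diff output built on <ins> and <del> can look like, here is a minimal Python sketch. It is not Shane McCarron's htmldiff script nor the wrapper used by the service: it compares plain text word by word with the standard difflib module, and the function name mark_changes is hypothetical.

```python
import difflib
from html import escape


def mark_changes(old_text, new_text):
    """Return new_text with word-level changes wrapped in <del>/<ins>.

    Illustration only: works on plain text, not on HTML markup.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        removed = escape(" ".join(old_words[i1:i2]))
        added = escape(" ".join(new_words[j1:j2]))
        if tag == "equal":
            parts.append(added)
            continue
        # "delete", "insert" and "replace" all reduce to: emit whatever
        # was removed inside <del>, then whatever was added inside <ins>.
        if removed:
            parts.append(f"<del>{removed}</del>")
        if added:
            parts.append(f"<ins>{added}</ins>")
    return " ".join(parts)


if __name__ == "__main__":
    print(mark_changes("the quick brown fox", "the slow brown fox"))
```

Because the change status lives in the element names rather than in a style rule, a user agent can announce insertions and deletions whether or not any CSS is applied.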
Unfortunately that script won't handle even simple changes inside an HTML table, like a new row or a change inside a column. It displays a broken HTML table although the HTML is valid. Pity :(<br /><br />example:<br /><br />http://www.w3.org/2007/10/htmldiff?doc1=http%3A%2F%2Fdev.hotelbookings.cz%2Ftest%2Fhtml1.html&doc2=http%3A%2F%2Fdev.hotelbookings.cz%2Ftest%2Fhtml2.html<br />
I've finally released the Python wrapper if anyone is interested: http://dev.w3.org/cvsweb/2009/htmldiff/<br />