Finding a Strict Online WYSIWYG HTML Editor

At work, we’re looking at buying a new Content Management System to run our web site. Right now we’re using what I built by hand, but we could really use a system built by more than one or two people.

But one of the scariest issues for me is ensuring that only pure, spotless, valid, accessible XHTML 1.0 Strict content goes into my database. And generally speaking, web-based WYSIWYG HTML editors are… less than exemplary.

I’ve used TinyMCE, FCKEditor, and XStandard. Currently XStandard is by far the most successful at stripping inappropriate code. It is well worth the modest license fee — and the free version is very nearly as good as the paid version.

Why Strict?

When I inherited this web site, it had been built the bad old way, mostly using Microsoft FrontPage™. It was riddled with deeply nested <font> tags, often a font tag for every other character in the content. The code was unreadable, it was easily twenty times its necessary file size, and it was very unfriendly to search engines. Reusing the same text elsewhere (in print documents, for example) required running several cleanup routines, and still there was hour after hour of manually removing little niggly bits of bad code that lingered even after the RegEx tools had had their fun.

After I got my code to a POSH (plain old semantic HTML) state, it was very distasteful to see users trying to enter <font> tags, inline styles (which bloat the code nearly as badly as <font> tags), inline JavaScript, Microsoft Smart Tags and other invalid code. This tended to break things, adversely modify the College design, and cause unwanted browser behaviour (like breaking the back button).

With clean, strict code, I can allow Marketing to make the design decisions, I can make the behaviour decisions globally, and content providers can focus on what they need to focus on — providing good, clean, well-written content to their visitors.

I Hate Bad Editors

There are lots of online WYSIWYG HTML editors out there, but most of them allow any old abomination of HTML. The better ones might provide a cleanup routine based on Tidy, but it tends to be only about as effective as a RegEx tool. Many of those require the person editing to click a button or some such to do the cleaning, and people just don’t bother. Some claim to be XHTML-compliant, but they usually don’t forbid users from adding presentational markup to the content (leading to code bloat, design problems, etc.), because XHTML Transitional isn’t so worried about those elements.

But just try to find an editor that enforces XHTML 1.0 Strict. XStandard is by far the closest. But even XStandard still allows my users to insert JavaScript or certain weird tags like <marquee> (shudder), requiring a small post-editor cleanup routine. I’m not sure that I’ll be able to edit the CMS that we choose to provide those routines, so a top-quality editor is still essential.
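For illustration only, here is a minimal sketch of what such a post-editor cleanup routine could look like, written in Python with the standard library's html.parser. It is not the routine I actually run, and it is not how XStandard or any CMS does it. The names Cleaner and clean, and the three policy sets, are hypothetical. It assumes you want scripts removed along with their contents, <font> and <marquee> tags removed but their text kept, and style and on* attributes dropped. It is a filter, not a validator, so it does not by itself guarantee XHTML 1.0 Strict.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical policy for illustration; tune these sets for your own site.
DROP_WITH_CONTENT = {"script"}   # remove the element and everything inside it
UNWRAP = {"font", "marquee"}     # remove the tags but keep the text inside
VOID = {"br", "hr", "img", "input", "meta", "link", "area", "base", "col", "param"}


class Cleaner(HTMLParser):
    """Re-serialises markup, leaving out whatever the policy forbids."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; intact on output.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag in UNWRAP:
            return
        # Drop inline styles and inline event handlers (onclick, onload, ...).
        kept = [(k, v) for k, v in attrs if k != "style" and not k.startswith("on")]
        rendered = "".join(
            f' {k}="{escape(v if v is not None else k, quote=True)}"' for k, v in kept
        )
        self.out.append(f"<{tag}{rendered}{' /' if tag in VOID else ''}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in UNWRAP or tag in VOID:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

    # Comments, doctypes and processing instructions are silently dropped,
    # because their handlers are left at the do-nothing defaults.


def clean(markup):
    """Return markup with the forbidden tags and attributes removed."""
    cleaner = Cleaner()
    cleaner.feed(markup)
    cleaner.close()
    return "".join(cleaner.out)
```

Calling clean() on whatever the editor submits, just before it is saved, is the obvious place to hook this in, provided the CMS lets you run code at that point. A parser-based filter like this avoids the worst of the RegEx leftovers, but it makes no attempt to repair badly nested tags.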

I’ve played around with TinyMCE and FCKEditor enough to get them kind of working as Strict… but I don’t have a lot of confidence in those products.

Peter Krantz did a good Evaluation of WYSIWYG editors for semantic features, but the need for Strict code is implied, not stated explicitly.

To test the CMSs that are coming up in our evaluation process, I put together the most abominable page I could. The CMS editor should strip most, if not all, of the bad markup listed in the document. Warning: this is not for the weak of stomach.

Test Web Page for Cleanup of Bad HTML
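
To compare editors on a page like this one, it helps to count what survives after saving. The following is a rough sketch in Python, not part of our actual evaluation process. The names MarkupAuditor and audit and both blocklists are hypothetical and deliberately short. The idea is to paste the test page into the editor, save it, and feed the saved markup to audit().

```python
from html.parser import HTMLParser

# Hypothetical blocklists for illustration; extend them to match the test page.
# <font>, <center>, align and bgcolor are not part of XHTML 1.0 Strict, and
# <marquee> and <blink> were never standard. <script> and the style attribute
# are valid in Strict, but they are listed because I do not want them in content.
FORBIDDEN_TAGS = {"font", "marquee", "script", "center", "blink"}
FORBIDDEN_ATTRS = {"style", "align", "bgcolor"}


class MarkupAuditor(HTMLParser):
    """Collects every forbidden tag or attribute found in the parsed markup."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag in FORBIDDEN_TAGS:
            self.problems.append(f"<{tag}>")
        for name, _value in attrs:
            # Inline event handlers all start with "on" (onclick, onload, ...).
            if name in FORBIDDEN_ATTRS or name.startswith("on"):
                self.problems.append(f"<{tag} {name}>")


def audit(saved_html):
    """Return a list of the forbidden markup that survived the editor."""
    auditor = MarkupAuditor()
    auditor.feed(saved_html)
    auditor.close()
    return auditor.problems
```

An empty list means nothing on the blocklists got through. It does not mean the output is valid XHTML 1.0 Strict, so a proper validator is still needed for that.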