code-decorations.js

Client-side syntax highlighter for mixed source files (HTML, PHP, JavaScript, Lua). Produces styled HTML by wrapping recognised constructs in <span> elements with CSS classes. No dependencies — pure vanilla JS.

Overview

The library exposes three functions that form a two-phase pipeline:

  1. Phase 1 — structural: tag_parsing(html) identifies and wraps HTML/PHP tags.
  2. Phase 2 — source: source_coloring(html) colorizes language constructs.
  3. Orchestrator: text_coloring(viewer, html) runs both phases and injects the result into the DOM.

The key design constraint is that Phase 2 must not corrupt the markup emitted by Phase 1. Both phases solve this with a tokenize → transform → restore pattern: protected regions are replaced with opaque placeholder tokens before any regex passes run, and restored verbatim afterwards.
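
The tokenize → transform → restore pattern can be illustrated with a minimal sketch. The helper names protect and restore, and the NUL-delimited placeholder format, are assumptions made for illustration; the library's own internals may differ.

```javascript
// Hypothetical helpers, not the library's actual code.
// protect(): swap every match for an opaque placeholder and remember the original.
function protect(text, pattern, store) {
  return text.replace(pattern, (match) => {
    const index = store.push(match) - 1;
    return `\u0000${index}\u0000`;
  });
}

// restore(): put the remembered originals back verbatim.
function restore(text, store) {
  return text.replace(/\u0000(\d+)\u0000/g, (_, index) => store[Number(index)]);
}

const store = [];
const input = "tag <span class='tag'>tag</span>";

// 1. Tokenize: existing <span> markup becomes placeholders.
const shielded = protect(input, /<\/?span[^>]*>/g, store);
// 2. Transform: this pass can no longer see the class attribute.
const transformed = shielded.replace(/\btag\b/g, "TAG");
// 3. Restore: the markup returns untouched.
const output = restore(transformed, store);

console.log(output);
```

Without step 1, the transform would also rewrite the word inside class='tag' and break the markup.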

CSS Classes Reference

Class             Applied to
tag               The < and > angle brackets of an HTML tag
tag-inside        Everything between the angle brackets of an HTML tag
tag-name          The tag name itself (e.g. div, span), set in post-processing
source-coloring   Base class on all source-level highlights
source-string     String literals
source-comment    Comments (C-style, Lua)
source-keyword    Language keywords (JS + Lua)
source-underline  Function names in definitions

Function Reference

tag_parsing(html: string): string

Converts raw source text into display-safe HTML, detecting and wrapping HTML tags with styled spans.

Processing order:

  1. Escape bare & as &amp;.
  2. Escape } as &#x7d;, because the } character appears literally in the {open-tag} / {close-tag} sentinel strings and must not collide with them.
  3. Tokenize PHP blocks (<? … ?>) — replaces each block with a placeholder so that > inside PHP expressions (e.g. $obj->prop) cannot close an HTML tag prematurely. Each block is stored as entity-escaped text.
  4. Replace all remaining < / > with sentinels {open-tag} / {close-tag}.
  5. Match {open-tag}letter…{close-tag} and wrap as .tag / .tag-inside spans.
  6. Convert any remaining sentinels to &lt; / &gt;.
  7. Restore PHP tokens as their entity-escaped source text.
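
Steps 3 and 7 can be sketched as follows. This is a simplified illustration, not the library's code: tokenizePhp, restorePhp, escapeEntities and the placeholder format are hypothetical names, and the & escaping of step 1 is left out for brevity.

```javascript
// Hypothetical sketch of PHP block tokenization (steps 3 and 7).
function escapeEntities(text) {
  return text.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Replace each <? … ?> block with a placeholder; keep the escaped source aside.
function tokenizePhp(source) {
  const blocks = [];
  const text = source.replace(/<\?[\s\S]*?\?>/g, (block) => {
    blocks.push(escapeEntities(block));
    return `\u0001${blocks.length - 1}\u0001`;
  });
  return { text, blocks };
}

// Put the entity-escaped PHP source back in place of each placeholder.
function restorePhp(text, blocks) {
  return text.replace(/\u0001(\d+)\u0001/g, (_, i) => blocks[Number(i)]);
}

const sample = '<input value="<?= $contact->name ?>">';
const { text, blocks } = tokenizePhp(sample);

// Only the real closing bracket of <input …> is left for the tag matcher.
console.log(text.split(">").length - 1);
console.log(restorePhp(text, blocks));
```

While the placeholder is in place, the > of $contact->name is invisible to steps 4 to 6, so it cannot close the <input> tag early.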

text_coloring(text_viewer: Element, html: string): void

Orchestrates the full highlight pipeline and writes the result into the DOM.

  1. Calls tag_parsing(html).
  2. Calls source_coloring on the output of tag_parsing.
  3. Sets text_viewer.innerHTML to the result.
  4. Post-processes each .tag-inside span to wrap the leading tag name in a .tag-name span.
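
The post-processing step might look roughly like the sketch below. wrapTagName is a hypothetical helper, and the exact pattern the library uses to recognise a tag name is an assumption; the DOM part only runs where a document is available.

```javascript
// Hypothetical helper: wrap the leading tag name of a .tag-inside span's content.
// Content that does not start with a letter is returned unchanged.
function wrapTagName(inside) {
  return inside.replace(
    /^([A-Za-z][\w:-]*)/,
    (_, name) => `<span class='tag-name'>${name}</span>`
  );
}

console.log(wrapTagName('div class="x"'));

// In a browser, the same helper could be applied to every .tag-inside span.
if (typeof document !== "undefined") {
  document.querySelectorAll(".tag-inside").forEach((span) => {
    span.innerHTML = wrapTagName(span.innerHTML);
  });
}
```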

source_coloring(html: string): string

Colorizes source-code constructs. Receives the output of tag_parsing (which contains real <span> markup). All passes use tokenization so that earlier-protected regions are immune to later passes.

Pass order:

  1. Protect HTML tags — tokenize existing <span …> / </span> markup emitted by tag_parsing; restored last.
  2. Protect strings — JS/Lua single-quoted, double-quoted, and template (`…`) literals; stored with source-string wrapping already applied.
  3. Protect comments: C-style comments and Lua comments (single-line and multi-line).
  4. Highlight HTTP/HTTPS links.
  5. Highlight function names in definitions of the form function name(...).
  6. Highlight braces { } and the escaped form &#x7d;.
  7. Highlight keywords (JavaScript + Lua, word-boundary matched).
  8. Restore comments → strings → HTML tags in reverse protection order.
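
A reduced sketch of passes 2, 3, 7 and 8 shows why the restore order matters. makeVault, the control-character markers, the simplified string and comment patterns, and the exact class attribute are all assumptions for illustration, not the library's implementation.

```javascript
// Hypothetical sketch: each "vault" protects one kind of region.
function makeVault(marker) {
  const items = [];
  return {
    // Swap each match for a placeholder, storing the already-wrapped HTML.
    protect(text, pattern, cssClass) {
      return text.replace(pattern, (match) => {
        items.push(`<span class='source-coloring ${cssClass}'>${match}</span>`);
        return `${marker}${items.length - 1}${marker}`;
      });
    },
    // Replace each placeholder with its stored HTML.
    restore(text) {
      const token = new RegExp(`${marker}(\\d+)${marker}`, "g");
      return text.replace(token, (_, i) => items[Number(i)]);
    },
  };
}

const strings = makeVault("\u0002");
const comments = makeVault("\u0003");

let code = 'local s = "if x" -- return "later"';
code = strings.protect(code, /"[^"\n]*"|'[^'\n]*'/g, "source-string"); // pass 2
code = comments.protect(code, /--[^\n]*/g, "source-comment");          // pass 3 (Lua single-line only)
code = code.replace(                                                   // pass 7 (three keywords only)
  /\b(local|if|return)\b/g,
  "<span class='source-coloring source-keyword'>$1</span>"
);
code = comments.restore(code); // pass 8: comments first, because a stored comment
code = strings.restore(code);  // may still contain string placeholders

console.log(code);
```

Only local is wrapped as a keyword: if sits inside a protected string and return inside a protected comment. Restoring strings before comments would leave the string placeholder inside the comment unresolved.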

Supported keywords:

Language    Keywords
JavaScript  function var let const if else switch for while break continue return yield null true false
Lua         nil and or not repeat until goto local do end if then else elseif while for in
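
As an illustration of word-boundary matching over these lists, a single pattern can be built from both keyword sets. The constant and function names below, and the class attribute on the wrapper, are assumptions rather than the library's own identifiers.

```javascript
// Keyword lists copied from the table above; all names here are hypothetical.
const JS_KEYWORDS =
  "function var let const if else switch for while break continue return yield null true false".split(" ");
const LUA_KEYWORDS =
  "nil and or not repeat until goto local do end if then else elseif while for in".split(" ");

// Merge and de-duplicate (if, else, while and for appear in both lists).
const KEYWORDS = [...new Set([...JS_KEYWORDS, ...LUA_KEYWORDS])];

// \b on both sides means a keyword only matches as a whole word.
const KEYWORD_PATTERN = new RegExp(`\\b(${KEYWORDS.join("|")})\\b`, "g");

function highlightKeywords(text) {
  return text.replace(
    KEYWORD_PATTERN,
    "<span class='source-coloring source-keyword'>$1</span>"
  );
}

console.log(highlightKeywords("elseif x then"));
console.log(highlightKeywords("format information"));
```

The second call returns its input unchanged, because for, or and in occur there only inside longer words.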

Changelog (since initial version)

Fix — source_coloring corrupting tag_parsing output
Added HTML tag tokenization as the first pass in source_coloring. Previously, keyword and brace regexes would match characters inside <span class='tag'> markup, breaking the generated HTML structure.
Fix — PHP expressions with > breaking tag detection
Replaced the one-liner html.replaceAll("?>", "?&gt;") with a full PHP block tokenization pass in tag_parsing. The old approach only escaped the closing ?> but left inner > (e.g. in $obj->prop or htmlspecialchars($x)) unprotected, causing the tag matcher to misfire on attribute values like <input value="<?= htmlspecialchars($contact->name) ?>">.
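
The failure mode of the old one-liner is easy to reproduce in isolation. The snippet below only replays the replaceAll call quoted above on the example line; the variable names are illustrative.

```javascript
// The old approach escapes only the closing ?> of the PHP block.
const line = '<input value="<?= htmlspecialchars($contact->name) ?>">';
const oldResult = line.replaceAll("?>", "?&gt;");

// The first > still present is the one inside "->", so a tag matcher
// scanning for the end of <input …> would stop there.
const firstClose = oldResult.indexOf(">");
const arrowAt = oldResult.indexOf("->");

console.log(oldResult);
console.log(firstClose === arrowAt + 1); // true
```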
Feature — string and comment protection in source_coloring
Added tokenization passes for string literals and comments before keyword/brace/function passes. Previously, keywords or braces appearing inside strings or comments were incorrectly highlighted. Lua multi-line and single-line comment styles added.
Feature — expanded keyword list
Added full Lua keyword set and missing JS keywords (null, true, false, yield). Removed the ops() parenthesis-suffix helper — plain word-boundary (\b) matching is more correct and handles all usage patterns (assignments, returns, etc.).
Docs — JSDoc headers
Added /** … */ JSDoc blocks to all three public functions documenting parameters, return types, and the multi-pass pipeline order. Note: the literal */ sequence inside the source_coloring JSDoc is written as /* … * / (space before /) to avoid prematurely closing the comment block — a self-referential constraint of the format.
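
For illustration, a JSDoc block that mentions the C-style comment terminator could be written as in the sketch below. The function name and the wording of the comment are hypothetical and are not copied from the library.

```javascript
/**
 * Colorizes source-code constructs (illustrative stub only).
 *
 * Comments of the form /* … * / are protected before the keyword pass.
 * The space in "* /" keeps this JSDoc block from ending early.
 *
 * @param {string} html - Output of the structural phase.
 * @returns {string} The colorized HTML.
 */
function source_coloring_example(html) {
  // Stub body: returns its input unchanged so the snippet stays runnable.
  return html;
}

console.log(source_coloring_example("<b>x</b>"));
```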

Known Limitations