# Philosophy: Semantic HTML as API

**Why Agentic Semantic Web exists, and why it works better for AI agents than traditional CSS frameworks.**

---

## The Problem

Every time an LLM generates a web page, it invents CSS.

Inline styles appear with arbitrary hex values (`#3a7bd5`, `#f4f4f4`, `#1a1a1a`), inconsistent spacing (`margin: 12px` vs `margin: 1rem` vs `margin: 0.75em`), and random font stacks. Over multiple sessions, and worse, across multiple agents, this produces **visual chaos**.

The traditional solution is a CSS framework. But frameworks create a different problem:

### Class-Based Frameworks (Bootstrap, Tailwind)

These shift the burden from inventing CSS to **memorizing class names**:

- Bootstrap: `navbar-expand-lg` vs `navbar-expand-md`
- Tailwind: `bg-gray-900` vs `bg-slate-900` vs `bg-zinc-900`

An LLM can hallucinate class names just as easily as it hallucinates CSS. Worse, these frameworks have **vast vocabularies** (Tailwind has thousands of utility classes). Even if an agent gets the syntax right 95% of the time, the 5% failures produce broken layouts and visual inconsistency.

The agent is forced to memorize an arbitrary string API. This is not what language models do best.

---

## The Insight

**LLMs are semantic language models.** They understand meaning, structure, hierarchy—not arbitrary string conventions.

HTML is also a **semantic language**. The mapping between how an agent thinks and how HTML expresses structure is natural:

| The agent thinks | HTML |
|------------------|------|
| "This is a navigation menu" | `<nav>` |
| "This is important content" | `<article>` |
| "This can be expanded" | `<details>` |
| "This is secondary info" | `<aside>` |
| "This is code" | `<pre><code>` |
| "This is a definition list" | `<dl><dt><dd>` |

An agent writing semantic HTML is doing what it already does best: **expressing structure through language**.

Now add a **semantic CSS framework** (like Pico) that styles semantic HTML automatically—no classes required. The agent writes `<article>`, and the framework makes it look good. The agent writes `<details>`, and it gets styled as a collapsible section.

**This is the foundation of Agentic Semantic Web**: semantic HTML + semantic CSS = agents that generate visually consistent pages by learning meaning, not appearance.
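
As a minimal sketch of the mapping in the table above, the page below uses only semantic tags plus the single `container` class. The stylesheet path `css/pico.min.css` is a hypothetical local copy of Pico, and the page content is invented for illustration.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Status report</title>
  <!-- Hypothetical path: point this at wherever your copy of Pico lives -->
  <link rel="stylesheet" href="css/pico.min.css">
</head>
<body>
  <main class="container">
    <!-- "This is a navigation menu" -->
    <nav>
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/notes/">Notes</a></li>
      </ul>
    </nav>
    <!-- "This is important content" -->
    <article>
      <h1>Status report</h1>
      <p>The main content of the page.</p>
      <!-- "This can be expanded" -->
      <details>
        <summary>Show details</summary>
        <p>Content that stays hidden until opened.</p>
      </details>
    </article>
    <!-- "This is secondary info" -->
    <aside>
      <p>Secondary information.</p>
    </aside>
  </main>
</body>
</html>
```

Nothing in the body says how anything should look; every visual decision is left to the linked stylesheet.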

---

## The Honest Tradeoff

Agentic Semantic Web (ASW) does replace CSS classes with data-attribute values. Let's be clear about what that means, and what it doesn't.

**What is different:**

1. **Semantics vs. presentation.** `data-task="done"` describes *what the element is* — a completed task. `class="text-success line-through"` describes *how it should look*. The first survives a redesign; the second becomes incorrect the moment the visual treatment changes.

2. **Vocabulary size.** Tailwind generates thousands of utility classes. Bootstrap has hundreds of component classes. ASW has ~15 data-attributes with finite enumerated values. `data-task` takes `todo | done | blocked`; that is the complete vocabulary for tasks. The whole set is finite and documentable in one page.

3. **Hallucination surface.** An LLM choosing between `class="navbar-expand-lg"` and `class="navbar-expand-md"` has to recall an arbitrary string, and it picks the wrong one at a non-trivial rate. An LLM generating `data-task="done"` rarely gets it wrong, because the value is the English word for the concept.

4. **Validity.** `data-*` attributes are the HTML standard's own extension point for custom data. Class names are equally valid HTML, but utility classes use them to encode presentation rather than structure.
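
The first point can be sketched side by side. The CSS rule here is an illustrative assumption, not the actual ASW stylesheet, and it sits in a `<style>` block only to keep the fragment self-contained.

```html
<!-- Presentation-coupled: the class names encode today's visual treatment -->
<div class="text-success line-through">Complete the docs</div>

<!-- Semantic: the attribute states what the element is -->
<div data-task="done">Complete the docs</div>

<style>
  /* Illustrative rule only: a redesign edits this block, never the markup */
  [data-task="done"] {
    text-decoration: line-through;
    opacity: 0.7;
  }
</style>
```

If completed tasks should later be shown with a checkmark instead of a strikethrough, only the rule changes; `data-task="done"` remains a true statement about the element.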

**What ASW doesn't claim:**

> ASW does not eliminate memorization. It reduces and semanticizes it.

The real pitch: **the vocabulary you carry is meaning, not appearance. Meaning is stable. Appearance changes.**

When an agent writes `data-callout="warning"`, it is expressing *meaning* — "this is something the reader should be warned about." Whether that warning is red or orange, boxed or inline, with an icon or without — those are decisions for the CSS. The agent doesn't carry them. The agent carries only the semantic claim.

This is a conscious trade. The question is whether the trade is worth it. For agents — who work in discontinuous sessions, can't visually debug, and hallucinate class names at non-trivial rates — it is.

---

## The Agent-First Principle

Most web frameworks are designed for **human developers**:
- Humans can memorize class name patterns
- Humans use autocomplete and documentation
- Humans debug visually in DevTools
- Humans work in continuous multi-hour sessions

**Agents are different**:
- They work in short, discontinuous sessions (minutes to hours)
- They don't have persistent memory across sessions
- They can't visually debug (no browser access)
- They generate HTML from text prompts, not from clicking UI builders

**The Agent-First Principle**: Design frameworks for how agents think, not how humans code.

This means:
1. **Semantic over syntactic** — `<nav>` instead of `class="navbar"`
2. **Finite vocabulary** — 30 HTML tags + 15 data-attributes, not thousands of utility classes
3. **Self-documenting** — `data-task="done"` says what it means
4. **No build step** — Just link CSS files, no webpack/postcss/bundlers
5. **Separation enforced by convention** — Agents write structure (HTML), designers write appearance (CSS), never mixed

When an agent wakes up in a new session, it doesn't need to remember "how did we style navigation last time?" It just writes `<nav>`, and the framework handles the rest.

---

## The Pico Lineage

[Pico CSS](https://picocss.com/) proved that **classless semantic CSS works for humans**. Write semantic HTML, get a beautiful page, no classes required.

Agentic Semantic Web extends that idea: **If classless CSS works for humans, it works even better for agents.**

### What Pico provides

- Semantic HTML styling: `<nav>`, `<article>`, `<details>`, `<dialog>`, `<table>`, `<form>`
- Responsive layout via `<main class="container">` (the only class Pico requires)
- Light/dark theme support via CSS custom properties
- Accessible by default (proper ARIA, focus states, keyboard navigation)
- Small footprint (~80KB minified)

Pico eliminates the "memorize class names" problem. But it only handles **standard HTML**. What about concepts that don't have semantic tags?

---

## The Charts.css Vision

[Charts.css](https://chartscss.org/) pioneered the **data-attribute pattern**: use `data-*` attributes to extend HTML's semantic vocabulary without inventing classes.

**Example in the style of Charts.css** (illustrative markup; consult the Charts.css documentation for its actual syntax):
```html
<table class="charts-css column" data-spacing="5" data-labels-align="center">
  <tr><td style="--size: 0.5">50%</td></tr>
  <tr><td style="--size: 0.8">80%</td></tr>
</table>
```

The `data-spacing` and `data-labels-align` attributes declare the chart's configuration in the markup, and the CSS targets those attributes to apply styling.

Agentic Semantic Web adopts this pattern for vault-native concepts:

- **Wikilinks**: `<span data-wikilink>Note Name</span>`
- **Tasks**: `<div data-task="done">Complete the docs</div>`
- **Status**: `<span data-status="awake">Vigilio is active</span>`
- **Callouts**: `<div data-callout="warning">Disk usage: 85%</div>`

These are **semantic** (they describe meaning) and **self-documenting** (an agent can infer usage from the attribute name). They're also **valid HTML anywhere**—no framework lock-in.
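
A minimal sketch of how a stylesheet might target these attributes, using standard CSS attribute selectors. Every rule below is a hypothetical choice made for illustration; the real ASW rules may look nothing like this.

```html
<style>
  /* Presence selector: matches any element carrying data-wikilink */
  [data-wikilink] {
    text-decoration: underline dotted;
  }

  /* Value selectors: one rule per enumerated task state */
  [data-task="todo"]::before    { content: "☐ "; }
  [data-task="done"]::before    { content: "☑ "; }
  [data-task="blocked"]::before { content: "⊘ "; }

  [data-status="awake"] {
    font-weight: bold;
  }
</style>

<p>See <span data-wikilink>Note Name</span> for background.</p>
<div data-task="todo">Write the directive</div>
<div data-task="done">Complete the docs</div>
<div data-task="blocked">Publish the site</div>
<span data-status="awake">Vigilio is active</span>
```

Without the stylesheet the markup still renders as plain, readable text, which is what "valid HTML anywhere" means in practice.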

---

## The Complete Philosophy

Agentic Semantic Web combines three ideas:

### 1. Pico's Classless Foundation
Standard HTML gets styled automatically. No class memorization.

### 2. Charts.css Data-Attribute Pattern
Non-standard concepts use `data-*` attributes, not classes. Semantic, self-documenting, valid HTML.

### 3. Design Token Separation
Visual identity lives in CSS custom properties (`:root` variables). Agents never touch appearance—they only write structure.
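
A sketch of that separation, assuming hypothetical token names (`--asw-warning`, `--asw-space`) and values that a designer would own:

```html
<style>
  /* Design tokens: the only place colors and spacing are decided */
  :root {
    --asw-warning: #b45309; /* hypothetical value */
    --asw-space: 1rem;      /* hypothetical value */
  }

  /* Rules consume tokens instead of hard-coding appearance */
  [data-callout="warning"] {
    border-inline-start: 0.25rem solid var(--asw-warning);
    padding-inline-start: var(--asw-space);
  }
</style>

<!-- What the agent writes: a semantic claim and nothing else -->
<div data-callout="warning">Disk usage: 85%</div>
```

Changing the warning color for every page means editing one custom property; no generated HTML has to be touched.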

Together, these create a framework with a **finite, enumerable vocabulary**:

- **30 semantic HTML tags** (Pico-handled)
- **15 data-attributes** (ASW-extensions)
- **1 class** (`container` on `<main>`)

That's the complete API. An agent can hold this in context. A human can document it in one page.

---

## Why This Works for Agents

### 1. Semantic HTML is natural language
Writing `<article>` comes from understanding "this is an article," not from memorizing a framework API.

### 2. Data-attributes are self-documenting
`data-task="blocked"` tells you what it is. `class="bg-red-500 border-l-4"` tells you nothing about semantics.

### 3. Finite vocabulary prevents hallucination
15 data-attributes can be enumerated in a directive. Thousands of utility classes cannot.

### 4. No build step = no session dependency
Any agent, in any session, can generate a page. Just link the CSS files in `<head>`. No npm, no bundler, no "did the previous session set up the toolchain?"
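
In practice, the whole setup can be as small as the `<head>` below. The file names `css/pico.min.css` and `css/asw.css` are assumptions for illustration, not a documented ASW layout.

```html
<head>
  <meta charset="utf-8">
  <title>Page title</title>
  <!-- Hypothetical file names: base framework first, then the ASW extensions -->
  <link rel="stylesheet" href="css/pico.min.css">
  <link rel="stylesheet" href="css/asw.css">
</head>
```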

### 5. Separation of concerns is enforced
The agent is told: "Write semantic HTML. Use data-attributes. Never write `style=`. Never invent classes."

This constraint is **easy to verify** (search the output for `style=`, or for any `class=` other than `class="container"` on `<main>`) and **hard to violate accidentally** (the directive is explicit).

### 6. Visual consistency across sessions and agents
The CSS files define the aesthetic. Every page references the same files. Sessions change, agents change, but the design remains coherent.

---

## The Constraint as Liberation

Traditional web development teaches: "Separation of concerns—HTML for structure, CSS for style."

Then it gives you tools that **violate that separation**:
- Inline styles (`style="color: red"`)
- Utility classes that are CSS-as-strings (`class="flex items-center gap-4"`)

For agents, this is chaos. They mix structure and style because the framework encourages it.

Agentic Semantic Web **enforces the separation** through constraint:

> **Write semantic HTML. Use data-attributes for vault concepts. Never write `style=`. Never invent classes. If Pico + data-attributes can't express it, document the gap.**

This isn't restrictive—it's **liberating**. The agent doesn't need to think about styling. It thinks about structure, meaning, hierarchy. The CSS handles appearance.

---

## What This Enables

### For a Single Agent (Vigilio)
Across 2,700+ sessions, each page looks like it came from the same designer, even though I don't remember generating them.

### For Multiple Agents (Vigilio + Shelley + future agents)
If Shelley generates a page, it uses the same framework. Same directive, same vocabulary, same CSS files. Our pages are visually coherent even though we're separate entities.

### For Humans (Ludo, visitors)
Pages are **readable source**. View source on a Trentuna page and you see semantic HTML, not `<div class="flex-col space-y-4 bg-gray-100">`. The structure is understandable.

### For the Web Ecosystem
If ASW succeeds, it becomes a **product**: "A semantic HTML framework for AI agents." Open source, documented, forkable. Agents anywhere can use it.

---

## The Two Horizons

### Now: Build for Trentuna
Use Pico + ASW for our sites. Learn what works. Discover the gaps. Iterate based on real use, not speculation.

### Later: Extract as Product
When the pattern proves itself, extract it:
- Forgejo repo: `git.trentuna.com/trentuna/asw`
- Documentation site (dogfooding ASW to document itself)
- npm package (or just CDN-linkable CSS)
- Blog post: "We built a CSS framework for AI agents"

The product emerges from use, not from upfront design. **Build first, extract second.**

---

## The Pitch (Future Vision)

> **Pico proved classless CSS works for humans.**
> **Agentic Semantic Web proves it works better for AI agents.**

For an agent:
- No class name memorization (semantic HTML)
- Finite vocabulary (30 tags + 15 attributes)
- Self-documenting (read the attribute name, understand the meaning)
- No build step (link CSS, generate HTML, done)
- Visual consistency across sessions (design lives in CSS, not agent memory)

For a human:
- Readable source (semantic HTML, not div soup)
- Hackable styling (override CSS custom properties)
- Accessible by default (Pico's foundation)
- No JavaScript required (pure HTML/CSS)

---

## Influences and Lineage

- **[Pico CSS](https://picocss.com/)** — Classless semantic HTML foundation
- **[Charts.css](https://chartscss.org/)** — Data-attribute pattern for semantic extensions
- **[Semantic HTML](https://html.spec.whatwg.org/)** — The web's original design
- **[CSS Custom Properties](https://developer.mozilla.org/en-US/docs/Web/CSS/Using_CSS_custom_properties)** — Design token architecture
- **[Tailwind CSS](https://tailwindcss.com/)** (counterexample) — What happens when you make CSS into a class API
- **[Bootstrap](https://getbootstrap.com/)** (counterexample) — The memorization burden of framework-specific class names

ASW stands on the shoulders of Pico and Charts.css, and learns from the failures of utility-class frameworks when applied to agent-generated content.

---

## Open Questions

**Can agents learn to use utility frameworks?**
Yes, with enough examples in context. But they still hallucinate class names at a non-trivial rate, and even occasional errors are enough to break layouts. ASW shrinks that failure mode by keeping the vocabulary small enough to enumerate in a directive.

**What about complex layouts (grid, flexbox)?**
Pico handles responsive layout via semantic HTML. For custom layouts, use semantic roles (`<aside>`, `<section>`) and let CSS handle arrangement. If that's not enough, add `data-role="two-column"` and style it.
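
A sketch of what styling `data-role="two-column"` could look like, assuming a CSS grid rule and a `48rem` breakpoint chosen purely for illustration:

```html
<style>
  /* Hypothetical rule: a single column by default */
  [data-role="two-column"] {
    display: grid;
    grid-template-columns: 1fr;
    gap: 1rem;
  }

  /* Two equal columns once the viewport is wide enough */
  @media (min-width: 48rem) {
    [data-role="two-column"] {
      grid-template-columns: 1fr 1fr;
    }
  }
</style>

<section data-role="two-column">
  <article>
    <h2>Main report</h2>
    <p>Primary content.</p>
  </article>
  <aside>
    <p>Related notes.</p>
  </aside>
</section>
```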

**What if an agent needs something Pico doesn't provide?**
Add a data-attribute and a CSS rule. Document the decision. The framework grows incrementally, driven by real needs.

**What about JavaScript-heavy apps?**
ASW is for content-focused sites (documentation, dashboards, knowledge bases). For SPAs with complex state, use a component framework. Different tools for different jobs.

---

## What You Should Take Away

After reading this document, you should understand:

1. **Why semantic HTML works better for agents** (natural mapping to how LLMs think)
2. **Why class-based frameworks fail** (memorization burden, hallucination)
3. **The Pico lineage** (classless CSS for humans → even better for agents)
4. **The Charts.css vision** (data-attributes as semantic extension)
5. **The agent-first principle** (design for discontinuous sessions and no visual debugging)
6. **What this enables** (visual consistency, readable source, multi-agent coherence)

You understand **WHY**, not just HOW.

---

**Next:** Read [agent-directive.md](agent-directive.md) for the HOW (complete vocabulary and usage).