Why SIF

The Salient Intelligence Format exists because every existing format wastes tokens when delivering structured security intelligence to AI agents. We measured the waste and built a format that eliminates it.

The Research

We benchmarked identical security intelligence encoded in different formats, measuring token count and AI parse quality:

Format	Tokens (same data)	Overhead	AI Parse Quality
Markdown tables	412	51% structural waste	Excellent
YAML	398	50% overhead	Excellent
Natural prose	287	15% filler words	Excellent
JSON	253	1% structural	Excellent
SIF	183	<5% structural	Excellent (with schema header)

SIF uses 36% fewer tokens than prose and about 56% fewer than markdown tables for identical information (183 tokens versus 287 and 412). The schema header costs ~60 tokens once, then every subsequent line is maximally compressed.
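As a quick check, the reductions can be recomputed from the token counts in the table. This is a minimal sketch: the dictionary keys and the `percent_fewer` helper are illustrative names, not part of SIF, and the counts are copied from the benchmark as reported.

```python
# Token counts copied from the benchmark table above.
TOKENS = {
    "markdown_tables": 412,
    "yaml": 398,
    "prose": 287,
    "json": 253,
    "sif": 183,
}


def percent_fewer(baseline: int, candidate: int) -> float:
    """Percentage by which `candidate` is smaller than `baseline`."""
    return 100.0 * (baseline - candidate) / baseline


# Reduction achieved by SIF relative to each other format, rounded to 0.1%.
savings = {
    name: round(percent_fewer(count, TOKENS["sif"]), 1)
    for name, count in TOKENS.items()
    if name != "sif"
}

for name, pct in savings.items():
    print(f"SIF uses {pct}% fewer tokens than {name}")
# markdown_tables: 55.6, yaml: 54.0, prose: 36.2, json: 27.7
```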

Context Rot

Anthropic's own research on context utilization shows that AI models degrade in recall and reasoning as context windows fill. When your security posture data competes for space with the actual task (incident analysis, exercise facilitation, threat assessment), format efficiency becomes a capability multiplier.

The math is simple

If your compiled twin takes 3,000 tokens in markdown but 800 tokens in SIF, you have 2,200 more tokens for reasoning, task context, and output quality. At scale, this is the difference between an AI that "knows" your organization and one that forgets half of it.
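The same arithmetic, written out as a small sketch. The 3,000 and 800 token figures come from the paragraph above; the 8,000-token context window is a hypothetical value chosen only to make the proportions concrete.

```python
MARKDOWN_TWIN_TOKENS = 3_000  # from the text above
SIF_TWIN_TOKENS = 800         # from the text above
CONTEXT_WINDOW = 8_000        # hypothetical window size, for illustration only


def remaining(window: int, twin_tokens: int) -> int:
    """Tokens left for task context, reasoning, and output after loading the twin."""
    return window - twin_tokens


freed = MARKDOWN_TWIN_TOKENS - SIF_TWIN_TOKENS
left_with_markdown = remaining(CONTEXT_WINDOW, MARKDOWN_TWIN_TOKENS)
left_with_sif = remaining(CONTEXT_WINDOW, SIF_TWIN_TOKENS)

print(f"freed by switching to SIF: {freed} tokens")          # 2200
print(f"left for the task with markdown: {left_with_markdown}")  # 5000
print(f"left for the task with SIF: {left_with_sif}")            # 7200
```

The freed amount is independent of the window size; only the share of the window it represents changes.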

Five White Spaces

SIF fills gaps that no existing format addresses:

Security-domain compression — NIST CSF function codes, severity levels, confidence tiers are first-class citizens, not embedded in prose
Temporal encoding — trends (improving/stable/declining) and trajectory are native, not derived
Contradiction representation — {declared:X actual:Y} captures the gap between policy and reality
Confidence hierarchy — V>O>D>U>X (verified > observed > declared > uncertain > contradicted) is structural, not annotative
Tiered compilation — same source data produces executive (~150 tokens), standard (~800), and full (~3K) views
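To make three of these ideas concrete, here is a minimal in-memory sketch of the confidence ordering, a declared-versus-actual contradiction, and tier selection by token budget. It is not the SIF wire format or the compiler's API: the `Confidence`, `Claim`, and `largest_tier_within` names, the sample claims, and the numeric enum values are all assumptions made for illustration. Only the V>O>D>U>X ordering and the approximate tier sizes come from the list above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Confidence(IntEnum):
    """Ordering V > O > D > U > X; the numeric values themselves are arbitrary."""
    X = 0  # contradicted
    U = 1  # uncertain
    D = 2  # declared
    O = 3  # observed
    V = 4  # verified


@dataclass(frozen=True)
class Claim:
    """Hypothetical record pairing what policy declares with what was found."""
    subject: str
    declared: str
    actual: str
    confidence: Confidence

    def is_contradiction(self) -> bool:
        # A contradiction is any gap between declared policy and observed reality.
        return self.declared != self.actual


# Approximate tier sizes from the list above (~150, ~800, ~3K tokens).
TIER_TOKENS = {"executive": 150, "standard": 800, "full": 3000}


def largest_tier_within(budget: int) -> Optional[str]:
    """Return the most detailed tier whose approximate size fits the budget."""
    fitting = [name for name, size in TIER_TOKENS.items() if size <= budget]
    return max(fitting, key=TIER_TOKENS.__getitem__) if fitting else None


# Made-up sample data.
claims = [
    Claim("mfa", declared="enforced", actual="enforced", confidence=Confidence.V),
    Claim("patching", declared="30 days", actual="90 days", confidence=Confidence.X),
]

contradicted = [c.subject for c in claims if c.is_contradiction()]
weakest = min(c.confidence for c in claims)

print(contradicted)               # ['patching']
print(weakest.name)               # X
print(largest_tier_within(1000))  # standard
```

Using an ordered enum means "structural, not annotative" comparisons such as `Confidence.V > Confidence.D` work directly, without parsing free-text qualifiers.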

Format Comparison

Capability	JSON	YAML	Markdown	STIX/OCSF	SIF
Token efficiency	Good	Poor	Poor	Poor	Best
AI parseability	Excellent	Excellent	Excellent	Poor	Excellent
Security-domain native	No	No	No	Partial	Yes
Confidence levels	Manual	Manual	Manual	No	Native
Trend encoding	No	No	No	No	Native
Contradiction capture	No	No	No	No	Native
Tiered detail levels	No	No	No	No	Native
Human readable	Poor	Good	Excellent	Poor	Readable with schema

When to Use SIF

SIF is the right choice when:

An AI agent needs organizational security context in its prompt
Context window space is constrained (always)
You need multiple detail levels from the same data
Confidence and contradiction tracking matters
The consumer is a machine, not a human

For human consumption, the same compiler produces markdown or PDF. SIF is not a replacement for human-readable formats — it is a purpose-built machine format that coexists with them.
