ZeroBytes Methodology

ZeroCLIP

Text-Free Deterministic Image Conditioning for Stable Diffusion

Replace text prompts with 16-byte seeds. Same seed = same image, forever, on any machine.
Seed (concept, style, mood, world)
→ FNV-1a Hash (16 bytes → 64 bits)
→ Conditioning Vector ([D] float32, unit sphere)
→ Diffusion Model (SD 1.x / 2.x / SDXL)
→ Image (deterministic output)

Basis Decomposition CLIP Conditioning

CLIP encodes a vocabulary of ~1024 concepts once, offline, into a static anchor library. At runtime, a seed tuple is hashed and fed into coherent noise to produce weights over this library. The conditioning vector is a weighted sum of anchors, projected onto the unit sphere.

1 Seed (concept, style, mood, world)
2 pack_seed() → 16 bytes → FNV-1a → 64-bit hash
3 coherent_value() × N_anchors → raw weights
4 softmax() → normalized weights [N]
5 weights @ anchors → conditioning [D]
6 L2-normalize → unit sphere [D]
Nearby seeds produce similar weight distributions, so coherent noise yields smooth visual transitions across the semantic space.
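The six steps above can be sketched in plain NumPy. Note the hedges: `coherent_value` here is a hypothetical stand-in (the real coherent noise varies smoothly between nearby seeds, which this simple re-hash does not), and the softmax temperature is an assumed parameter.

```python
import numpy as np

FNV_OFFSET, FNV_PRIME = 0xCBF29CE484222325, 0x100000001B3

def fnv1a_64(data: bytes) -> int:
    h = FNV_OFFSET
    for b in data:
        h = ((h ^ b) * FNV_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h

def coherent_value(base_hash: int, i: int) -> float:
    # Hypothetical stand-in: derive a per-anchor value in [0, 1) from the hash.
    child = fnv1a_64(base_hash.to_bytes(8, "little") + i.to_bytes(4, "little"))
    return child / 2**64

def anchor_conditioning(seed_bytes: bytes, anchors: np.ndarray,
                        temperature: float = 0.1) -> np.ndarray:
    n_anchors = anchors.shape[0]
    base = fnv1a_64(seed_bytes)                                          # step 2
    raw = np.array([coherent_value(base, i) for i in range(n_anchors)])  # step 3
    w = np.exp(raw / temperature)                                        # step 4: softmax
    w /= w.sum()
    cond = w @ anchors                                                   # step 5: [D]
    return cond / np.linalg.norm(cond)                                   # step 6: unit sphere
```

Because every step is a pure function of the seed bytes and the anchor matrix, re-running with the same inputs reproduces the vector bit-for-bit.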

Anchor Weight Distribution

Interactive demo: top 10 anchor weights (softmax-normalized) for adjustable concept_id and style_id; nearby seeds transition smoothly.

Latent Coordinate MLP

A tiny neural network (~50k parameters) learns to map 3D coordinates directly to CLIP embedding vectors. Trained offline on a CLIP-encoded vocabulary, the MLP becomes the entire runtime algorithm: at inference, the text encoder is gone.

1 Seed → pack_seed() → FNV-1a → 64-bit hash
2 child_hash(×3) → (cx, cy, cz) ∈ [0,1]³
3 MLP: [3] → 256 → 256 → 256 → [D]
4 L2Normalize → unit sphere [D]
The MLP learns a continuous function over the CLIP manifold; it generalizes to coordinates it never saw during training.
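A minimal NumPy sketch of the inference path above. The little-endian child-hash scheme and the ReLU hidden activations are assumptions; in practice the trained checkpoint supplies the real weights.

```python
import numpy as np

def fnv1a_64(data: bytes) -> int:
    h = 0xCBF29CE484222325
    for b in data:
        h = ((h ^ b) * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
    return h

def seed_to_coords(seed_bytes: bytes) -> np.ndarray:
    # child_hash(×3): re-hash the base hash with an axis index → (cx, cy, cz) ∈ [0,1]³
    base = fnv1a_64(seed_bytes)
    return np.array([fnv1a_64(base.to_bytes(8, "little") + bytes([axis])) / 2**64
                     for axis in range(3)])

def mlp_conditioning(coords: np.ndarray, layers: list) -> np.ndarray:
    # [3] → 256 → 256 → 256 → [D]; ReLU on hidden layers (activation is an assumption)
    x = coords
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x / np.linalg.norm(x)          # L2-normalize onto the unit sphere [D]
```

Random layers stand in for a trained checkpoint below; the shapes and the normalization are the point, not the output values.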

3D Coordinate Space

Interactive demo: concept (x), style (y), and mood (z) sliders set the 3D coordinate fed to the MLP; a dot marks the current position in the learned latent space.

Pure Entropy Conditioning

Conditioning vectors that have never been text and never will be. A seed is hashed and Box-Muller transformed into D normal samples, then L2-normalized onto the unit hypersphere. In guided mode, PCA projection keeps samples near the CLIP manifold.

1 Seed (u64) → FNV-1a → base hash
2 child_hash(×D) → Box-Muller → D normal samples
3a Pure: L2-normalize → uniform point on S^(D-1)
3b Guided: PCA project → mean + componentsᵀ · z → L2-normalize
These conditioning vectors exist only as positions in latent space, never expressible as text — yet the diffusion model responds to them.
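A sketch of both modes under the same hashing assumptions as the other variants. The `(mean, components)` tuple format for the PCA projection is an assumption about what the .npz file holds.

```python
import math
import numpy as np

def fnv1a_64(data: bytes) -> int:
    h = 0xCBF29CE484222325
    for b in data:
        h = ((h ^ b) * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
    return h

def entropy_conditioning(world_seed: int, dim: int = 768, projection=None) -> np.ndarray:
    base = fnv1a_64(world_seed.to_bytes(8, "little"))
    def uniform(i: int) -> float:
        # child_hash(×D): one deterministic uniform in (0, 1] per index
        return (fnv1a_64(base.to_bytes(8, "little") + i.to_bytes(4, "little")) + 1) / (2**64 + 1)
    z = np.empty(dim)
    for i in range(0, dim, 2):           # Box-Muller: two normals per uniform pair
        r = math.sqrt(-2.0 * math.log(uniform(i)))
        theta = 2.0 * math.pi * uniform(i + 1)
        z[i] = r * math.cos(theta)
        if i + 1 < dim:
            z[i + 1] = r * math.sin(theta)
    if projection is not None:           # guided: mean + componentsᵀ · z_k
        mean, components = projection    # components assumed shaped [k, dim]
        z = mean + components.T @ z[: components.shape[0]]
    return z / np.linalg.norm(z)         # pure mode: uniform point on S^(D-1)
```

Normalizing i.i.d. Gaussian samples is the standard way to draw uniformly from a hypersphere, which is why the pure mode covers the full sphere rather than clustering near axes.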

Hypersphere Sampling (2D Projection)

Interactive demo: a blend slider morphs between pure entropy (full sphere) and guided samples near the CLIP manifold (green region); a star marks the region anchor.

Self-Bootstrapped Prior Anchors

The most radical variant: no text encoder at any stage. The diffusion model probes random conditioning vectors, measures which produce coherent images, refines them via gradient ascent, and clusters the results into a diverse anchor library. The model discovers its own semantic basis.

P1 Probe: N random vectors → partial denoise → coherence score
P2 Refine: Riemannian gradient ascent on latent sharpness
P3 Cluster: cosine k-means → diverse anchor representatives
RT Runtime: identical to Option A (weighted sum over anchors)
The anchors have no names. Some may correspond to known concepts; others activate visual patterns that language has no word for.
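The clustering stage (P3) can be sketched as spherical k-means over unit vectors. The probe and refine stages are elided here, and the initialization and iteration count are assumptions, not the project's actual settings.

```python
import numpy as np

def cosine_kmeans(vectors: np.ndarray, k: int, iters: int = 20, seed: int = 0) -> np.ndarray:
    """Cluster unit vectors by cosine similarity; return k unit-norm anchor representatives."""
    rng = np.random.default_rng(seed)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    centers = v[rng.choice(len(v), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = (v @ centers.T).argmax(axis=1)       # nearest center by cosine similarity
        for j in range(k):
            members = v[assign == j]
            if len(members):                          # keep old center if a cluster empties
                c = members.sum(axis=0)
                centers[j] = c / np.linalg.norm(c)    # spherical mean, back onto the sphere
    return centers
```

Because the resulting centers are unit vectors of the same shape as CLIP embeddings, the runtime path really can stay identical to Option A: hash the seed, weight the anchors, sum, normalize.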

Bootstrap Pipeline

Interactive demo: random vectors scored by coherence (brighter = higher score); top candidates are selected.

Core Concepts

The foundations shared by all four ZeroCLIP variants.

The Seed

A 4-tuple of integers: (concept_id, style_id, mood_salt, world_seed), packed into 16 bytes. The entire identity of a conditioning vector fits in the space of a single float32 RGBA pixel.

concept_id and style_id are u16 (0–65535). mood_salt is u16. world_seed is u64.
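Three u16 fields plus one u64 total 14 bytes, so this sketch assumes two padding bytes to reach the stated 16; the exact field layout is a guess, but `struct` makes the round-trip explicit.

```python
import struct

# Little-endian: u16 ×3, 2 pad bytes (assumed), u64 → 16 bytes total.
SEED_FMT = "<HHHxxQ"

def pack_seed(concept_id: int, style_id: int, mood_salt: int, world_seed: int) -> bytes:
    return struct.pack(SEED_FMT, concept_id, style_id, mood_salt, world_seed)

def unpack_seed(packed: bytes) -> tuple:
    return struct.unpack(SEED_FMT, packed)
```

pack_seed(1000, 500, 0, 42) always yields the same 16 bytes, so everything hashed from it downstream is stable too.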

The CONDITIONING Format

ComfyUI expects [(tensor[1, 77, D], dict)]. The conditioning vector [D] is repeat-padded across 77 token positions. For SDXL, the dict carries pooled_output.

D=768 for SD1.x, D=1024 for SD2.x, D=2048 for SDXL sequence embeddings.

Determinism Guarantee

FNV-1a is a pure function: same bytes in, same hash out, always. No Math.random(), no Date.now(), no floating-point order dependence. The entire pipeline is a composition of pure functions.

Same seed = byte-identical conditioning = identical image, on any machine, in any session, forever.
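The 64-bit FNV-1a at the heart of this guarantee is only a few lines; the constants are the standard FNV offset basis and prime.

```python
FNV64_OFFSET = 0xCBF29CE484222325   # standard 64-bit FNV offset basis
FNV64_PRIME = 0x100000001B3         # standard 64-bit FNV prime

def fnv1a_64(data: bytes) -> int:
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte                                    # xor first (the "1a" order)...
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF   # ...then multiply, wrapped to 64 bits
    return h
```

No randomness, no clock, no floating point: the hash of a packed seed is integer arithmetic only, identical on every machine and in every session.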

Seed Explorer

Enter seed values and see the packed bytes and FNV-1a hash output.


ComfyUI Node Pack

20 nodes across 5 categories. Drop into ComfyUI/custom_nodes/ZeroClip-nodes/ and restart. Zero pip dependencies.

Node | Variant | Inputs | Output | Purpose
ZeroClip Seed Pack | Shared | concept, style, mood, world | ZEROCLIP_SEED | Pack 4-tuple seed
ZeroClip Seed From Random | Shared | seed (INT) | ZEROCLIP_SEED | Derive 4-tuple from single int
ZeroClip Empty Conditioning | Shared | dimension, sdxl | CONDITIONING | Zero vector for negative input
ZeroClip Conditioning Blend | Shared | cond_a, cond_b, weight | CONDITIONING | Weighted interpolation
ZeroClip Conditioning Info | Shared | conditioning | STRING | Debug: shape, norm, stats
ZeroClip-A Load Anchors | A | anchors_file | ANCHORS | Load .npy anchor library
ZeroClip-A Conditioning | A | anchors, seed | CONDITIONING | Anchor weighted sum
ZeroClip-A Conditioning (SDXL) | A | anchors_seq, anchors_pooled, seed | CONDITIONING | SDXL dual-anchor
ZeroClip-A Batch | A | anchors, concept range, batch | CONDITIONING | Sweep concept_id
ZeroClip-B Load MLP | B | model_file | MODEL | Load .pt checkpoint
ZeroClip-B Conditioning | B | model, seed | CONDITIONING | MLP forward pass
ZeroClip-B Conditioning (SDXL) | B | model_seq, model_pooled, seed | CONDITIONING | SDXL dual-MLP
ZeroClip-C Load Projection | C | projection_file | PROJECTION | Load .npz PCA matrix
ZeroClip-C Conditioning | C | seed, mode, dim, (projection) | CONDITIONING | Entropy sampling
ZeroClip-C Coherent Entropy | C | seed, region, blend, mode, dim | CONDITIONING | Regional coherence blend
ZeroClip-C Conditioning (SDXL) | C | seed, mode, dims, (projections) | CONDITIONING | SDXL entropy
ZeroClip-D Load Anchors | D | anchors_file | ANCHORS | Load bootstrap .npy
ZeroClip-D Conditioning | D | anchors, seed | CONDITIONING | Bootstrap anchor sum
ZeroClip-D Conditioning (SDXL) | D | anchors_seq, anchors_pooled, seed | CONDITIONING | SDXL bootstrap
ZeroClip-D Batch | D | anchors, concept range, batch | CONDITIONING | Sweep concept_id

Minimal Workflows

Variant A

ZeroClip Seed Pack (1000, 500, 0, 42)
ZeroClip-A Load Anchors → ZeroClip-A Conditioning
↓ positive
KSampler (+ Empty Conditioning as negative)
VAEDecode → SaveImage

Variant B

ZeroClip Seed Pack (1000, 500, 0, 42)
ZeroClip-B Load MLP → ZeroClip-B Conditioning
↓ positive
KSampler (+ Empty Conditioning as negative)
VAEDecode → SaveImage

Variant C (Pure)

ZeroClip-C Conditioning (seed=42, pure, dim=768)
↓ positive (zero setup required)
KSampler (+ Empty Conditioning as negative)
VAEDecode → SaveImage

Variant D

ZeroClip Seed Pack (1000, 500, 0, 42)
ZeroClip-D Load Anchors → ZeroClip-D Conditioning
↓ positive
KSampler (+ Empty Conditioning as negative)
VAEDecode → SaveImage

Getting Started

01

Install

Copy ZeroClip-nodes/ into ComfyUI/custom_nodes/. Restart ComfyUI. 20 nodes appear under the ZeroClip category. No pip install needed.

02

Build Artifacts

Run the build scripts for your chosen variant. Place output files in ComfyUI/models/zeroclip/. Variant C pure mode needs no build at all.

03

Wire Nodes

Connect Seed Pack or Seed From Random to a Conditioner node. Wire the CONDITIONING output to KSampler's positive input. Use Empty Conditioning for negative.