13: GGUF Token Embedding Visualizer
Platform: Kaggle (2× Tesla T4)
Overview
This notebook explores token embeddings by extracting vectors from GGUF models and visualizing them with GPU‑accelerated UMAP and Plotly/Graphistry.
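The extraction step can be sketched as follows. This is a minimal illustration, not the notebook's own code: it assumes a llama.cpp server has been started with embeddings enabled and exposes the OpenAI-compatible `/v1/embeddings` route. The address in `SERVER_URL` and the helper names `parse_embeddings` and `fetch_embeddings` are hypothetical.

```python
import json
import urllib.request

# Assumption: a llama.cpp server with embeddings enabled is listening here.
# The host and port are placeholders, not values taken from the notebook.
SERVER_URL = "http://127.0.0.1:8080/v1/embeddings"


def parse_embeddings(payload):
    """Return the embedding vectors ordered by each item's 'index' field."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def fetch_embeddings(texts, url=SERVER_URL, timeout=60):
    """POST a list of strings to the endpoint and return one vector per string."""
    body = json.dumps({"input": texts}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embeddings(json.load(response))
```

Calling `fetch_embeddings(["cat", "dog", "car"])` against a running server should return three vectors whose length equals the model's embedding dimension. Sorting by `index` keeps the vectors aligned with the input order even if the server returns them out of order.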
What You’ll Learn
- Extract embeddings via the llama.cpp embeddings endpoint
- Reduce 3072D → 3D with RAPIDS cuML UMAP
- Visualize semantic clustering across categories
- Compare quantization effects on embedding geometry
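One simple way to quantify the last item is to embed the same inputs with two quantizations of the same model and measure how far each vector moves. The sketch below is an assumption about how such a comparison could be done, not the notebook's method. The function names are hypothetical, and it expects both lists to hold non-zero vectors for the same inputs in the same order.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def quantization_drift(reference, quantized):
    """Summarize per-input cosine similarity between two embedding sets.

    reference: vectors from the higher-precision model (assumed baseline)
    quantized: vectors from the more aggressively quantized model
    """
    if len(reference) != len(quantized):
        raise ValueError("both sets must embed the same inputs")
    sims = [cosine(r, q) for r, q in zip(reference, quantized)]
    return {"mean": sum(sims) / len(sims), "min": min(sims)}
```

A mean close to 1.0 suggests the quantized model preserves the embedding directions, while a low minimum points to individual inputs whose vectors shifted the most.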
Requirements
- Kaggle notebook with GPU T4 × 2
- llcuda v2.2.0
- RAPIDS cuML + Plotly
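As a rough illustration of how cuML and Plotly fit together, the sketch below reduces embedding vectors to 3D and plots them. It is an assumption-laden outline rather than the notebook's code: the helper names, the L2 normalization step, and the UMAP parameter values are illustrative choices. NumPy, cuML, and Plotly are imported inside the functions so the normalization helper can run without a GPU.

```python
import math


def l2_normalize(rows):
    """Scale each non-zero vector to unit length so distances reflect direction."""
    normalized = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        normalized.append([x / norm for x in row])
    return normalized


def reduce_to_3d(rows, n_neighbors=15, min_dist=0.1, seed=42):
    """Project vectors to 3D with cuML UMAP (requires a CUDA GPU and RAPIDS)."""
    import numpy as np
    from cuml.manifold import UMAP

    data = np.asarray(l2_normalize(rows), dtype=np.float32)
    reducer = UMAP(
        n_components=3,
        n_neighbors=n_neighbors,  # should be smaller than the number of rows
        min_dist=min_dist,
        random_state=seed,
    )
    return reducer.fit_transform(data)


def plot_3d(coords, labels):
    """Return a Plotly 3D scatter figure colored by category label."""
    import plotly.express as px

    return px.scatter_3d(
        x=coords[:, 0], y=coords[:, 1], z=coords[:, 2], color=labels
    )
```

On a Kaggle T4 session, `plot_3d(reduce_to_3d(vectors), labels).show()` would render the interactive plot, where `vectors` and `labels` are the embeddings and their category names.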
Quick Start
Open the notebook in Kaggle to run the full workflow.