# Split-GPU Setup

Configure the LLM on GPU 0 and Graphistry on GPU 1.
## Architecture

```mermaid
flowchart TD
    A[GPU 0: llama-server] --> B[Extract entities + relations]
    B --> C[GPU 1: RAPIDS + Graphistry]
```

## Setup GPU 0 (LLM)
```python
from llcuda.server import ServerManager, ServerConfig
from llcuda.graphistry import SplitGPUManager

# Pin the LLM to GPU 0, leaving GPU 1 free for graph analytics
manager = SplitGPUManager()
manager.assign_llm(0)

# Offload all layers to the GPU (99 exceeds any model's layer count)
config = ServerConfig(model_path="model.gguf", n_gpu_layers=99)
server = ServerManager()
server.start_with_config(config)
```
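With the server running, extraction prompts can be sent over HTTP; llama-server exposes an OpenAI-compatible chat endpoint. A minimal sketch of building such a request (the port, prompt wording, and JSON-lines output format are assumptions, not llcuda defaults):

```python
import json
import urllib.request

# Assumed endpoint; match the host/port your ServerConfig actually uses.
LLM_URL = "http://localhost:8080/v1/chat/completions"

EXTRACT_PROMPT = (
    "Extract entities and relations from the text below. "
    'Respond with one JSON object per line: {"src": ..., "rel": ..., "dst": ...}\n\n'
)

def build_request(text: str, url: str = LLM_URL) -> urllib.request.Request:
    """Build (but do not send) one extraction request for the GPU-0 LLM."""
    payload = {
        "messages": [{"role": "user", "content": EXTRACT_PROMPT + text}],
        "temperature": 0.0,  # deterministic output suits extraction
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Ada Lovelace worked with Charles Babbage.")
```

Sending the request with `urllib.request.urlopen(req)` returns the usual OpenAI-style completion body, from which the JSON lines can be parsed into triples.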
## Setup GPU 1 (Graphistry)
```python
from llcuda.graphistry import GraphWorkload, register_graphistry

# Reserve GPU 1 for the graph workload, then connect to Graphistry Hub
workload = GraphWorkload(gpu_id=1)
register_graphistry(api=3, protocol="https", server="hub.graphistry.com")
```
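The graph itself is just an edge table. A sketch of building one from extracted triples, written with pandas so it runs anywhere; on GPU 1, `cudf.DataFrame` is a drop-in replacement with the same API (the column names and duplicate-weighting scheme are illustrative choices, not part of llcuda):

```python
import pandas as pd  # swap for `import cudf as pd`-style use on GPU 1

# Triples as extracted by the LLM on GPU 0 (illustrative sample data)
triples = [
    ("Ada Lovelace", "worked_with", "Charles Babbage"),
    ("Charles Babbage", "designed", "Analytical Engine"),
    ("Ada Lovelace", "worked_with", "Charles Babbage"),  # duplicate
]

edges = pd.DataFrame(triples, columns=["src", "rel", "dst"])

# Collapse duplicate triples into an edge weight for the visualization
edges = edges.groupby(["src", "rel", "dst"], as_index=False).size()
edges = edges.rename(columns={"size": "weight"})
```

The resulting frame plugs straight into PyGraphistry's bind-and-plot API: `graphistry.edges(edges, source="src", destination="dst").plot()`.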
## Workflow

1. Run LLM inference on GPU 0.
2. Extract entities and relations from the model output.
3. Build the graph on GPU 1 with cuDF.
4. Visualize with Graphistry.
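Step 2 above can be sketched as a small parser, assuming the model was prompted to emit one JSON object per line (that format is an assumption, not something llcuda enforces):

```python
import json

def parse_triples(reply: str) -> list[tuple[str, str, str]]:
    """Parse JSON-lines LLM output into (src, rel, dst) triples."""
    triples = []
    for line in reply.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
            triples.append((obj["src"], obj["rel"], obj["dst"]))
        except (json.JSONDecodeError, KeyError):
            # Skip malformed lines rather than failing the whole batch
            continue
    return triples

reply = '{"src": "Ada Lovelace", "rel": "worked_with", "dst": "Charles Babbage"}'
parse_triples(reply)
```

Tolerating malformed lines matters in practice: even at temperature 0, LLM output occasionally drifts from the requested schema, and a graph pipeline should degrade gracefully rather than abort.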