This work introduces Deep Geometric Moments (DGM) as a novel, training-free guidance mechanism for text-to-image diffusion models. Unlike existing guidance signals such as segmentation maps, depth maps, or CLIP features, which either impose rigid spatial constraints or rely heavily on global semantics, DGM captures fine-grained, subject-specific visual features through robust geometric representations. The proposed method uses a pretrained DGM model during the diffusion process to steer image generation in a flexible yet identity-preserving manner. Experiments show that DGM strikes a better balance between control and diversity than these alternatives, enabling more nuanced and visually consistent image synthesis without retraining the diffusion model.
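To make the guidance mechanism concrete, the following is a minimal sketch of training-free, classifier-guidance-style steering with a pretrained feature extractor, in the spirit described above. It is not the authors' implementation: the names `unet`, `dgm_extractor`, the DDIM-style update, the MSE feature loss, and `guidance_scale` are illustrative assumptions; the paper's actual DGM loss, sampler, and schedule may differ.

```python
import torch

@torch.no_grad()
def ddim_step(x_t, eps, alpha_t, alpha_prev):
    """Deterministic DDIM-style update from the current noisy sample and noise estimate."""
    x0_pred = (x_t - (1 - alpha_t).sqrt() * eps) / alpha_t.sqrt()
    return alpha_prev.sqrt() * x0_pred + (1 - alpha_prev).sqrt() * eps

def dgm_guided_sample(unet, dgm_extractor, ref_image, timesteps, alphas_cumprod,
                      shape, guidance_scale=1.0, device="cuda"):
    """Sample from a frozen diffusion model, nudging each denoising step toward
    the reference subject's deep geometric moment features (training-free)."""
    x_t = torch.randn(shape, device=device)
    ref_feats = dgm_extractor(ref_image.to(device)).detach()

    for i, t in enumerate(timesteps):  # timesteps in descending order
        alpha_t = alphas_cumprod[t]
        alpha_prev = (alphas_cumprod[timesteps[i + 1]] if i + 1 < len(timesteps)
                      else torch.tensor(1.0, device=device))

        with torch.enable_grad():
            x_in = x_t.detach().requires_grad_(True)
            eps = unet(x_in, t)  # frozen denoiser; assumed (x, t) signature
            # Predict the clean image and compare its DGM features to the reference.
            x0_pred = (x_in - (1 - alpha_t).sqrt() * eps) / alpha_t.sqrt()
            loss = torch.nn.functional.mse_loss(dgm_extractor(x0_pred), ref_feats)
            grad = torch.autograd.grad(loss, x_in)[0]

        # Shift the noise estimate against the feature-loss gradient, then take the step.
        eps_guided = eps.detach() + guidance_scale * (1 - alpha_t).sqrt() * grad
        x_t = ddim_step(x_t.detach(), eps_guided, alpha_t, alpha_prev)
    return x_t
```

Because the guidance enters only as a gradient added to the noise estimate at sampling time, the diffusion model itself stays frozen, which is what "training-free" refers to in the summary above.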