We introduce AttenGluco, a novel Transformer-based framework for long-term blood glucose forecasting from multimodal data, including continuous glucose monitoring (CGM) and activity signals. By integrating cross-attention and multi-scale attention mechanisms, the model fuses time series with different sampling rates and captures long-term dependencies. On the AI-READI dataset, it outperforms a multimodal LSTM baseline by up to 12% in RMSE across multiple cohorts (healthy, prediabetes, and type 2 diabetes).
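
To illustrate how cross-attention can fuse two time series of different lengths, the following is a minimal NumPy sketch of single-head scaled dot-product cross-attention. It is not the AttenGluco implementation: the sequence lengths (12 CGM steps, 60 activity steps), the embedding size, the random projection matrices, and the function names are all assumptions chosen for illustration, and multi-scale attention is not shown.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys_values, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    queries:     (T_q, d_model)  sequence that asks the questions (here: CGM)
    keys_values: (T_kv, d_model) sequence that is attended to (here: activity)
    Returns the fused sequence (T_q, d_model) and the weights (T_q, T_kv).
    """
    q = queries @ w_q
    k = keys_values @ w_k
    v = keys_values @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each query row sums to 1
    return weights @ v, weights


# Hypothetical toy inputs: the two modalities have different lengths
# because they are sampled at different rates over the same window.
rng = np.random.default_rng(0)
d_model = 8
cgm = rng.normal(size=(12, d_model))       # assumed: 12 embedded CGM steps
activity = rng.normal(size=(60, d_model))  # assumed: 60 embedded activity steps

# Random projections stand in for learned parameters.
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

fused, weights = cross_attention(cgm, activity, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Because the output has one row per query step, the fused representation stays aligned with the CGM timeline regardless of how densely the activity signal is sampled, so no explicit resampling of either series is needed in this sketch.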