Accurate prediction of fuel cell performance degradation is crucial for improving reliability and extending service life. However, existing methods are often limited by inconsistent data modalities and scarce data. This study proposes a novel forecasting method based on a large language model (LLM) that includes a data–text cross-modal transformation module. The model also employs a data–knowledge multimodal alignment mechanism to deeply integrate time-series features, such as voltage, with domain-specific prior knowledge. Ablation experiments show that model performance correlates positively and significantly with the degree of domain knowledge integration, further validating the prior knowledge embedding strategy. Validation on multiple fuel cell experimental datasets indicates that the proposed model generalizes better across different operating conditions and achieves higher accuracy than conventional deep learning models. Experimental results further demonstrate that the proposed model significantly improves the long-term tracking accuracy of local nonlinear features and can accurately capture certain abrupt degradation behaviors. The method requires no fine-tuning of the LLM, which significantly reduces training and deployment costs. In summary, by building a “data-physics” collaborative health management framework, the model achieves highly accurate and robust fuel cell degradation prediction in scenarios where certain data types are scarce.