arXiv:2411.18253v2 Announce Type: replace-cross Abstract: Purpose: Immunotherapies have revolutionized the landscape of cancer treatments. However, our understanding of response patterns in advanced cancers treated with immunotherapy remains limited. Combining routinely collected noninvasive longitudinal and multimodal data with artificial intelligence could improve immunotherapy for cancer patients and pave the way for personalized treatment approaches. Methods: In this study, we developed a novel artificial neural network architecture, the multimodal transformer-based simple temporal attention (MMTSimTA) network, building upon a combination of recent successful developments. We integrated pre- and on-treatment blood measurements, prescribed medications, and CT-based organ volumes from a large pan-cancer cohort of 694 patients treated with immunotherapy to predict mortality at three, six, nine, and twelve months. Different variants of our extended MMTSimTA network were implemented and compared to baseline methods incorporating intermediate- and late-fusion-based integration. Results: The strongest prognostic performance was achieved by a variant of the MMTSimTA model, with areas under the curve (AUCs) of $0.84 \pm 0.04$, $0.83 \pm 0.02$, $0.82 \pm 0.02$, and $0.81 \pm 0.03$ for 3-, 6-, 9-, and 12-month survival prediction, respectively. Discussion: Our findings show that integrating noninvasive longitudinal data using our novel architecture yields improved multimodal prognostic performance, especially in short-term survival prediction. Conclusion: Our study demonstrates that multimodal longitudinal integration of noninvasive data using deep learning may offer a promising approach for personalized prognostication in immunotherapy-treated cancer patients.
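As an illustration of the evaluation setup described above, the following minimal sketch shows one way fixed-horizon mortality labels and AUC values could be derived from follow-up data. It is not the authors' pipeline: the function names, the toy patient records, the handling of patients censored before a horizon (excluded here), and the reading of the $\pm$ values as a standard deviation across repeated runs are all assumptions made for this example.

```python
from statistics import mean, stdev

HORIZONS_MONTHS = (3, 6, 9, 12)  # prediction horizons named in the abstract


def mortality_label(time_months, died, horizon):
    """Return 1 if the patient died within the horizon, 0 if known alive at it.

    Patients censored before the horizon have an unknown outcome; this sketch
    returns None for them so they can be excluded (an assumption).
    """
    if died and time_months <= horizon:
        return 1
    if time_months >= horizon:
        return 0
    return None


def roc_auc(labels, scores):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_at_horizon(patients, horizon):
    """patients: iterable of (follow_up_months, died, predicted_risk) tuples."""
    labelled = [(mortality_label(t, d, horizon), r) for t, d, r in patients]
    kept = [(y, r) for y, r in labelled if y is not None]
    return roc_auc([y for y, _ in kept], [r for _, r in kept])


def summarize(aucs):
    """Mean and sample standard deviation of AUCs from repeated runs or folds."""
    return mean(aucs), stdev(aucs)


# Hypothetical toy cohort: (follow-up in months, died?, model risk score).
toy_patients = [
    (2.0, True, 0.90),
    (5.0, True, 0.70),
    (14.0, False, 0.20),
    (4.0, False, 0.50),   # censored at 4 months: dropped for horizons > 4
    (10.0, True, 0.15),
    (20.0, False, 0.10),
]

if __name__ == "__main__":
    for h in HORIZONS_MONTHS:
        print(f"{h:>2}-month AUC: {auc_at_horizon(toy_patients, h):.3f}")
```

In practice a tested library routine such as scikit-learn's `roc_auc_score` would normally replace the hand-written `roc_auc`; the explicit version is used here only to keep the sketch dependency-free.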