
Mochi: Aligning Pre-training and Inference for Efficient Graph Foundation Models via Meta-Learning

We propose Mochi, a Graph Foundation Model that addresses task unification and training efficiency by adopting a meta-learning based training framework. Prior models pre-train with reconstruction-based objectives such as link prediction, and assume that the resulting representations can be aligned with downstream tasks through a separate unification step such as class prototypes. We demonstrate through synthetic and real-world experiments that this procedure, while simple and intuitive, has limitations that directly affect downstream task performance. To address these limitations, Mochi pre-trains on few-shot episodes that mirror the downstream evaluation protocol, aligning the training objective with inference rather than relying on a post-hoc unification step. We show that Mochi, along with its more powerful variant Mochi++, achieves competitive or superior performance compared to existing Graph Foundation Models across 25 real-world graph datasets spanning node classification, link prediction, and graph classification, while requiring 8 to 27 times less training time than the strongest baseline.

Source: arXiv:2604.22031
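For readers who want a concrete picture of what "pre-training on few-shot episodes that mirror the downstream evaluation protocol" looks like, here is a minimal sketch in plain PyTorch. Mochi's actual encoder, episode sampler, and loss are not described in this card, so the one-layer GCN, the prototypical-network episode loss, and the random pseudo-labels below are illustrative assumptions, not the paper's method.

```python
# Minimal sketch of episodic (N-way, K-shot) pre-training on a graph.
# Everything here is a stand-in assumption: Mochi's real encoder,
# episode sampler, and objective may differ.
import torch
import torch.nn.functional as F

class TinyGCN(torch.nn.Module):
    """One-layer GCN: H = ReLU(A_hat @ X @ W). A stand-in encoder."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin = torch.nn.Linear(in_dim, hid_dim)

    def forward(self, x, a_hat):
        return F.relu(a_hat @ self.lin(x))

def episode_loss(emb, support_idx, support_y, query_idx, query_y, n_way):
    """Prototypical-network loss: class prototypes are support-set means,
    and negative squared distance to each prototype serves as the logit."""
    protos = torch.stack([emb[support_idx[support_y == c]].mean(0)
                          for c in range(n_way)])            # (N, d)
    logits = -torch.cdist(emb[query_idx], protos) ** 2       # (Q, N)
    return F.cross_entropy(logits, query_y)

# Toy graph: random features, a sparse symmetric adjacency with
# self-loops and row normalization, and random pseudo-labels.
n, d, n_way, k_shot, q = 100, 16, 3, 5, 10
x = torch.randn(n, d)
a = torch.rand(n, n) < 0.05
a_hat = (a | a.T).float() + torch.eye(n)
a_hat = a_hat / a_hat.sum(1, keepdim=True)
y = torch.randint(0, n_way, (n,))

model = TinyGCN(d, 32)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(100):
    # Sample one few-shot episode: K support and q query nodes per class,
    # mirroring the downstream few-shot evaluation protocol.
    sup, qry = [], []
    for c in range(n_way):
        idx = (y == c).nonzero(as_tuple=True)[0]
        perm = idx[torch.randperm(len(idx))]
        sup.append(perm[:k_shot])
        qry.append(perm[k_shot:k_shot + q])
    sup, qry = torch.cat(sup), torch.cat(qry)

    emb = model(x, a_hat)
    loss = episode_loss(emb, sup, y[sup], qry, y[qry], n_way)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because every pre-training step optimizes exactly the few-shot objective used at evaluation time, no separate prototype-alignment step is needed afterward; that is the training/inference alignment the abstract contrasts with post-hoc unification.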


Why it matters: this story sits in Research and gives readers direct context on a current development from arXiv's Artificial Intelligence feed. Mochi closes the gap between how Graph Foundation Models are pre-trained and how they are evaluated, and does so at a fraction of the training cost of the strongest baseline, which is a practical concern for anyone building or deploying graph models.

ai-research · papers · llm · machine-learning · deep-learning · peer-review-dynamics-in-ai-research
