← Back to archive day

SecureRouter: Encrypted Routing for Efficient Secure Inference

Yukuan Zhang, Mengxin Zheng, Qian Lou

36

Recommendation Score

breakthrough🔴 AdvancedMachine LearningEfficient InferenceSystemBest for builders

Research context

Primary field

Machine Learning

Core modeling, optimization, inference, and systems efficiency.

Topics

Efficient Inference

Paper type

System

Best for

Best for builders

arXiv categories

cs.CRcs.AIcs.CR

Why It Matters

SecureRouter enables efficient encrypted inference by dynamically adapting model structure per query, slashing MPC overhead—making privacy-preserving AI feasible for real-time, high-throughput production systems.

Abstract

Cryptographically secure neural network inference typically relies on secure computing techniques such as Secure Multi-Party Computation (MPC), enabling cloud servers to process client inputs without decrypting them. Although prior privacy-preserving inference systems co-design network optimizations with MPC, they remain slow and costly, limiting real-world deployment. A major bottleneck is their use of a single, fixed transformer model for all encrypted inputs, ignoring that different inputs require different model sizes to balance efficiency and accuracy. We present SecureRouter, an end-to-end encrypted routing and inference framework that accelerates secure transformer inference through input-adaptive model selection under encryption. SecureRouter establishes a unified encrypted pipeline that integrates a secure router with an MPC-optimized model pool, enabling coordinated routing, inference, and protocol execution while preserving full data and model confidentiality. The framework includes training-phase and inference-phase components: an MPC-cost-aware secure router that predicts per-model utility and cost from encrypted features, and an MPC-optimized model pool whose architectures and quantization schemes are co-trained to minimize MPC communication and computation overhead. Compared to prior work, SecureRouter achieves a latency reduction by 1.95x with negligible accuracy loss, offering a practical path toward scalable and efficient secure AI inference. Our open-source implementation is available at: https://github.com/UCF-ML-Research/SecureRouter

Published April 16, 2026
© 2026 A2A.pub — AI to Action. From papers to practice, daily.
Summaries are AI-assistedPrivacyTerms