ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference
Topics: AI (Deep Learning), Ranking, Reranking, Scoring, User Signals
This Google paper introduces ED2LM (Encoder-Decoder to Language Model), an approach to document re-ranking that substantially improves inference efficiency while keeping ranking performance competitive. Cross-attention re-rankers built on models such as BERT and T5 are expensive because the full model must run over every query-document pair at inference time. ED2LM reduces this cost by decomposing an encoder-decoder model into a decoder-only language model at inference time, which speeds up ranking while retaining effectiveness. The paper reports up to a 6.8x speedup over existing re-ranking models.
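The efficiency argument can be illustrated with a toy sketch. It assumes that the expensive document-side work is done once, offline, and that only a cheap query-side scoring step runs per query. A smoothed unigram model stands in for both the neural encoder and the decoder, so this is not the paper's implementation, and every name here (`build_cache`, `encode_document`, `score_query`, `rerank`, `OOV_LOGP`) is hypothetical.

```python
import math
from collections import Counter

# Assumption: fixed log-probability for query words that appear in no document.
OOV_LOGP = math.log(1e-6)


def tokenize(text):
    """Lowercase whitespace tokenizer (toy stand-in for a real tokenizer)."""
    return text.lower().split()


def encode_document(text, vocab, alpha=1.0):
    """Offline step: build a static document representation.

    Toy stand-in for the encoder: add-alpha smoothed unigram log-probabilities.
    """
    counts = Counter(tokenize(text))
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}


def build_cache(docs, alpha=1.0):
    """Encode every document once, before any query arrives."""
    vocab = {w for text in docs.values() for w in tokenize(text)}
    return {doc_id: encode_document(text, vocab, alpha)
            for doc_id, text in docs.items()}


def score_query(query, doc_repr):
    """Online step: log-likelihood of the query given a cached representation.

    Toy stand-in for the decoder-only pass; the document text is never re-read.
    """
    return sum(doc_repr.get(w, OOV_LOGP) for w in tokenize(query))


def rerank(query, cache):
    """Return (doc_id, score) pairs sorted from most to least likely."""
    scored = [(doc_id, score_query(query, rep)) for doc_id, rep in cache.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "d1": "the cat sat on the mat",
        "d2": "stock markets fell sharply today",
    }
    cache = build_cache(docs)            # done once, offline
    print(rerank("cat on mat", cache))   # per-query work touches only the cache
```

In this sketch the per-query cost depends on the query length and the number of cached candidates, not on document length, which is the kind of saving the summary describes. A real system would cache encoder outputs from a neural model rather than word counts.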