There have been a few good attempts at Transformer-based recommendation systems, but this one from Jiaqi Zhai and the folks on the modern recommendation systems team at Meta is particularly strong.
https://arxiv.org/html/2402.17152v3
"In this work, we treat user actions as a new modality in generative modeling. Our key insights are, a) core ranking and retrieval tasks in industrial-scale recommenders can be cast as generative modeling problems given an appropriate new feature space; b) this paradigm enables us to systematically leverage redundancies in features, training, and inference to improve efficiency. Due to our new formulation, we deployed models that are three orders of magnitude more computationally complex than prior state-of-the-art, while improving topline metrics by 12.4%."
The model consumes sequences of user history rather than traditional embedding towers, and is far more compute-intensive than most recommendation systems. To make that practical, the accompanying GitHub repo includes some custom Triton kernels: https://github.com/facebookresearch/generative-recommenders
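
To make the "sequences of user history" idea concrete, here is a minimal sketch of how a user's actions could be flattened into a single token sequence and turned into next-token training pairs, the same shape of problem as language modeling. This is an illustration only and not code from the paper or the repo: the event format, the `item:`/`action:` token prefixes, and the function names `to_token_sequence` and `next_token_examples` are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical event format: (item_id, action) pairs in chronological order.
Event = Tuple[str, str]


def to_token_sequence(history: List[Event]) -> List[str]:
    """Flatten a history into one sequence: item, action, item, action, ..."""
    tokens: List[str] = []
    for item_id, action in history:
        tokens.append(f"item:{item_id}")
        tokens.append(f"action:{action}")
    return tokens


def next_token_examples(tokens: List[str]) -> List[Tuple[List[str], str]]:
    """Build (prefix, target) pairs, as in autoregressive language modeling."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


# Toy history for a single user (made-up IDs and actions).
history = [("video_17", "like"), ("video_42", "skip"), ("video_8", "share")]
tokens = to_token_sequence(history)
examples = next_token_examples(tokens)

for prefix, target in examples:
    print(f"{len(prefix)} tokens of context -> predict {target}")
```

In this toy framing, predicting an `action:` token given the preceding item looks like ranking, while predicting the next `item:` token looks like retrieval. A real system would map these tokens to embeddings and feed them to a sequence model instead of working with strings.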