888.470760_415140.lt. [Top 10 QUICK]
The implementation was made publicly available within TensorFlow .
The paper proposes training both components simultaneously rather than separately. This allows the model to optimize for both accuracy (via the wide component) and serendipity/novelty (via the deep component) [1606.07792]. Key Results & Impact 888.470760_415140.lt.
A wide linear model is used, which excels at memorizing sparse feature interactions (e.g., user clicked 'item A' and user is from 'location B') [1606.07792]. 888.470760_415140.lt.
This architecture has since become a standard baseline for many recommendation tasks in industry, including those described in studies on YouTube recommendations [1606.07792]. If you'd like, I can: 888.470760_415140.lt.
Online experiments showed that "Wide & Deep" significantly increased app acquisitions compared to models that used either approach alone [1606.07792].