Ambridge (2019) calls for exemplar-based accounts of language acquisition. Do modern neural networks such as transformers or word2vec – which have been extremely successful in modern natural language processing (NLP) applications – count? Although these models often have ample parametric complexity to store exemplars from their training data, they also go far beyond simple storage by processing and compressing the input via their architectural constraints. The resulting representations have been shown to encode emergent abstractions. If these models are exemplar-based then Ambridge’s theory only weakly constrains future work. On the other hand, if these systems are not exemplar models, why is it that true exemplar models are not contenders in modern NLP?