Check out our [preprint](https://www.biorxiv.org/content/10.1101/2024.10.03.616542v1) on understanding how training data affects protein language model likelihoods!