Emojis should pass through untouched if they’re not in the dataset; if they are, we try to translate them like any other token. It’s not clear which strategy is better, or whether users would generally want emojis left intact or treated as part of the translation context and possibly modified. The span trick is a good way to explicitly preserve sections of text, though.
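
For anyone who wants emojis kept intact regardless of what the model has seen, one option is to mask them before translation and restore them afterwards. The sketch below is only an illustration of that idea, not the span trick mentioned above and not anything the project ships: `protect_emojis`, `restore_emojis`, `translate_keeping_emojis`, the `__EMO0__` placeholder format, and the `translate` callable are all hypothetical names, and the regex covers only a few common emoji blocks. It also assumes the translator leaves the placeholder tokens unchanged, which a real model may not do.

```python
import re

# Rough matcher: a few common pictographic blocks, not the full Unicode emoji set.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def protect_emojis(text):
    """Swap each emoji for a numbered placeholder; return the masked text and the saved emojis."""
    saved = []

    def stash(match):
        saved.append(match.group(0))
        return f"__EMO{len(saved) - 1}__"

    return EMOJI_RE.sub(stash, text), saved


def restore_emojis(text, saved):
    """Put the saved emojis back where their placeholders ended up."""
    for i, emoji in enumerate(saved):
        text = text.replace(f"__EMO{i}__", emoji)
    return text


def translate_keeping_emojis(text, translate):
    """Run a hypothetical translate(str) -> str callable with emojis masked out."""
    masked, saved = protect_emojis(text)
    return restore_emojis(translate(masked), saved)
```

The trade-off is the one described above: masked emojis survive unchanged, but the model no longer sees them as context, so any meaning they carried is invisible to the translation.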