Attention
Embedding
Directions in embedding space encode semantics,
e.g. one direction might correspond to gender.
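
A rough sketch of the "direction = meaning" idea. The 3-d vectors below are invented toy numbers, not real embeddings, and `gender_score` is a hypothetical helper. The assumption is that a direction can be estimated as the difference between two words that differ only in that attribute, and that projecting a word onto it (dot product) measures how far the word lies along that attribute.

```python
# Toy 3-d embeddings. The numbers are made up for illustration only.
emb = {
    "king":  [0.8, 0.9, 0.1],
    "queen": [0.8, 0.1, 0.9],
    "man":   [0.2, 0.9, 0.1],
    "woman": [0.2, 0.1, 0.9],
}

def sub(a, b):
    """Element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    """Element-wise sum a + b."""
    return [x + y for x, y in zip(a, b)]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Assumed "gender" direction: the difference between a word pair
# that differs only in gender.
gender = sub(emb["woman"], emb["man"])

def gender_score(word):
    """Projection of a word's embedding onto the gender direction (unnormalized)."""
    return dot(emb[word], gender)

# Moving "king" along the gender direction lands on "queen" in this toy setup.
shifted = add(emb["king"], gender)
```

In this toy setup `gender_score("queen")` is positive and `gender_score("king")` is negative, and `shifted` matches `emb["queen"]` up to floating-point error. Real embeddings have hundreds of dimensions or more, and such relations hold only approximately.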
# Softmax
$ probability_i = e^{x_i}/\sum_{n=0}^{N-1}{e^{x_n}} $
Logits ($x_i$) -- raw input to softmax, unnormalized, not a probability distribution.
Probabilities -- output of the softmax; each is between 0 and 1 and they sum to 1.
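
A minimal sketch of softmax in plain Python, to make the logits-to-probabilities step concrete. Subtracting the maximum logit before exponentiating is a common numerical-stability trick that is not part of the notes above. It leaves the result unchanged because the common factor cancels between numerator and denominator. The input values are arbitrary examples.

```python
import math

def softmax(logits):
    """Map a list of logits to probabilities: e^{x_i} / sum_n e^{x_n}."""
    m = max(logits)                            # shift so the largest exponent is 0 (avoids overflow)
    exps = [math.exp(x - m) for x in logits]   # unnormalized, all positive
    total = sum(exps)                          # normalizing constant
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]   # arbitrary example logits
probs = softmax(logits)    # roughly [0.659, 0.242, 0.099]
```

The largest logit gets the largest probability, and adding the same constant to every logit does not change the output.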