Fictitious Self-play develops strategy mixtures by optimizing against the uniform distribution over all previous agents.
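
As a minimal sketch of the uniform-mixture objective, assume a hypothetical table `payoff[i][j]` holding the expected payoff of candidate strategy `i` against previous agent `j`. The function name and the numbers are illustrative assumptions, not part of any particular implementation. The candidate picked is the one with the highest average payoff across all previous agents, each weighted equally.

```python
def best_response_to_uniform(payoff):
    """Pick the candidate that does best against a uniform mix of past agents.

    payoff[i][j] is the (assumed) expected payoff of candidate i
    against previous agent j.
    """
    n_opponents = len(payoff[0])
    # Expected payoff of each candidate when every past agent is equally likely.
    values = [sum(row) / n_opponents for row in payoff]
    best = max(range(len(values)), key=values.__getitem__)
    return best, values


# Illustrative numbers only: three candidates, three previous agents.
payoff = [
    [1.0, 0.0, -1.0],
    [0.5, 0.5, 0.5],
    [-1.0, 1.0, 1.0],
]
best, values = best_response_to_uniform(payoff)
print(best, values)
```

Every previous agent counts equally here, including ones that are no longer competitive.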

A more natural alternative is to optimize against the Nash equilibrium mixture over the population of agents, rather than against the uniform one.

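This is intuitively better because some of the early strategies may have become strictly worse than newer ones as overall playing quality improves. A Nash equilibrium assigns zero weight to strictly dominated strategies, whereas the uniform distribution keeps training against them.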

However, this requires computing a Nash equilibrium of the finite payoff matrix generated by the current population of strategies. This is always computable, but not necessarily efficient, so approximations may be necessary.
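
One possible approximation, sketched here under stated assumptions rather than as the method any specific system uses, is to run classical fictitious play directly on the population's payoff matrix. The sketch assumes a two-player, symmetric, zero-sum game, so that `payoff[i][j]` is the payoff of strategy `i` against strategy `j` and `payoff[j][i] == -payoff[i][j]`. The function name and the four-strategy population (rock, paper, scissors, plus a hypothetical weak strategy that loses to all three) are made up for illustration.

```python
def fictitious_play_nash(payoff, iterations=20000):
    """Approximate a Nash mixture of a symmetric zero-sum matrix game.

    Repeatedly plays a best response to the empirical mixture of all
    past plays; the empirical frequencies are returned as the estimate.
    """
    n = len(payoff)
    counts = [0] * n
    counts[0] = 1  # arbitrary initial play
    for _ in range(iterations):
        # Payoff of each pure strategy against the (unnormalized) empirical mixture.
        values = [
            sum(payoff[i][j] * counts[j] for j in range(n)) for i in range(n)
        ]
        best = max(range(n), key=values.__getitem__)
        counts[best] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Rows and columns: rock, paper, scissors, weak (loses to the other three).
payoff = [
    [0, -1, 1, 1],
    [1, 0, -1, 1],
    [-1, 1, 0, 1],
    [-1, -1, -1, 0],
]
mix = fictitious_play_nash(payoff)
print(mix)
```

In this toy population the weak strategy is never a best response, so it receives no weight in the estimated mixture, while the uniform distribution would still give it a quarter of the weight. The estimate for the other three strategies is only approximate and improves slowly with more iterations.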