-Suppose the outcome space of _X_ is `{H, T}` and the outcome space of _Y_ is `{1, 2, 3, 4, 5, 6, 7, 8}`. I _wanted_ to say that if observing _X_=`H` concentrates _Y_'s probability mass on `{1, 2, 3}`, that's _more useful_ than if it concentrates _Y_ on `{1, 5, 8}`—but that would require the numbers in Y to be _numbers_ rather than opaque labels; as far as elementary information theory was concerned, mapping eight states to three states reduced the entropy from log<sub>2</sub> 8 = 3 to log<sub>2</sub> 3 ≈ 1.58 no matter "which" three states they were.
+Suppose the outcome space of _X_ is `{H, T}` and the outcome space of _Y_ is `{1, 2, 3, 4, 5, 6, 7, 8}`. I wanted to say that if observing _X_=`H` concentrates _Y_'s probability mass on `{1, 2, 3}`, that's _more useful_ than if it concentrates _Y_ on `{1, 5, 8}`—but that would require the numerals in _Y_ to be _numbers_ rather than opaque labels; as far as elementary information theory was concerned, mapping eight states to three states reduced the entropy from log<sub>2</sub> 8 = 3 to log<sub>2</sub> 3 ≈ 1.58 no matter "which" three states they were.
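
The label-blindness of entropy can be checked numerically. The following is a minimal sketch, assuming _Y_ is uniform over all eight states before the observation and uniform over the three surviving states afterwards (the uniformity is an assumption implied by the log<sub>2</sub> 3 figure, not stated outright); the helper names `entropy` and `uniform` are made up for illustration.

```python
from math import log2

def entropy(dist):
    """Shannon entropy, in bits, of a distribution given as {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def uniform(outcomes):
    """Uniform distribution over the given outcomes (an assumption of this sketch)."""
    outcomes = list(outcomes)
    return {y: 1 / len(outcomes) for y in outcomes}

prior = uniform(range(1, 9))    # Y before observing X: eight equally likely states
clustered = uniform([1, 2, 3])  # Y given X=H, mass on adjacent numerals
scattered = uniform([1, 5, 8])  # Y given X=H, mass on spread-out numerals

print(entropy(prior))      # 3.0
print(entropy(clustered))  # about 1.585
print(entropy(scattered))  # identical to the clustered case
```

The outcomes appear only as dictionary keys and never enter the arithmetic, so any relabeling of the three surviving states gives the same entropy.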