Why does this proof technique make sense? The technique should smell familiar. It's inductive in nature. There's a base case (``for each initial state '') and an inductive step, defined in terms of all possible action instances (``for each state transitions ''). As before, because the action sets are finite, we can do a big case analysis, one action per case, in the inductive step.
After showing the base case and inductive step, we will have shown that each trace in the behavior set of the concrete machine has a corresponding (modulo the abstraction function) trace in the behavior set of the abstract machine. Hence, the behavior set of the concrete machine is a subset of the behavior set of the abstract machine (modulo the abstraction function), which is what we needed to prove.
Aside: In our proof technique, because of our simplistic way of associating actions in one machine to another (through the one-to-one function ) we are associating a single state transition in C to a single state transition in A. More generally, a state transition in C might map to a sequence of transitions in the abstract machine A. This generalization is especially needed for proving concurrent state machines correct and/or dealing with state machines with internal as well as external actions. We won't be using this generalization at all.