An important class of concurrent objects are those that are non- blocking, this is, whose operations are not contained within a mutually exclusive critical sections. A non-blocking object can be accessed by many threads at a time, yet update protocols based on atomic cas operations guarantee the object's consistency.
In this paper, we take a practical look at the cas operation in the context of contemporary shared memory multiprocessors. We first describe an operating system-based solution which permits the construction of a non-blocking function on architectures which only support more primitive atomic primitives such as tas or Atomic Exchange. We then evaluate several locking strategies which are used to synthesize a cas operation, and show that common techniques for reducing synchronization overhead in the presence of contention are inappropriate when used as the basis for non-blocking synchronization. We describe a simple synchronization strategy which avoids much of the overhead normally associated with contention.