docs: document atomic_load_acquire and atomic_store_release

We will use them in the next patch, document what they do. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-23 13:58:31 +01:00 · 2018-02-23 13:58:31 +01:00 · 729c0ddd3c
commit 729c0ddd3c
parent b9b7581754
1 changed files with 30 additions and 27 deletions
--- a/docs/devel/atomics.txt
+++ b/docs/devel/atomics.txt
@ -122,20 +122,30 @@ In general, if the algorithm you are writing includes both writes
 and reads on the same side, it is generally simpler to use sequentially
 consistent primitives.
-When using this model, variables are accessed with atomic_read() and
+When using this model, variables are accessed with:
-atomic_set(), and restrictions to the ordering of accesses is enforced
+
 - atomic_read() and atomic_set(); these prevent the compiler from
  optimizing accesses out of existence and creating unsolicited
  accesses, but do not otherwise impose any ordering on loads and
  stores: both the compiler and the processor are free to reorder
  them.
 - atomic_load_acquire(), which guarantees the LOAD to appear to
  happen, with respect to the other components of the system,
  before all the LOAD or STORE operations specified afterwards.
  Operations coming before atomic_load_acquire() can still be
  reordered after it.
 - atomic_store_release(), which guarantees the STORE to appear to
  happen, with respect to the other components of the system,
  after all the LOAD or STORE operations specified afterwards.
  Operations coming after atomic_store_release() can still be
  reordered after it.
 Restrictions to the ordering of accesses can also be specified
 using the memory barrier macros: smp_rmb(), smp_wmb(), smp_mb(),
 smp_mb_acquire(), smp_mb_release(), smp_read_barrier_depends().
 atomic_read() and atomic_set() prevents the compiler from using
 optimizations that might otherwise optimize accesses out of existence
 on the one hand, or that might create unsolicited accesses on the other.
 In general this should not have any effect, because the same compiler
 barriers are already implied by memory barriers.  However, it is useful
 to do so, because it tells readers which variables are shared with
 other threads, and which are local to the current thread or protected
 by other, more mundane means.
 Memory barriers control the order of references to shared memory.
 They come in six kinds:
@ -232,7 +242,7 @@ make atomic_mb_set() the more expensive operation.
 There are two common cases in which atomic_mb_read and atomic_mb_set
 generate too many memory barriers, and thus it can be useful to manually
-place barriers instead:
+place barriers, or use atomic_load_acquire/atomic_store_release instead:
 - when a data structure has one thread that is always a writer
  and one thread that is always a reader, manual placement of
@ -243,18 +253,15 @@ place barriers instead:
     thread 1                                thread 1
     -------------------------               ------------------------
     (other writes)
-                                             smp_mb_release()
+     atomic_mb_set(&a, x)                    atomic_store_release(&a, x)
-     atomic_mb_set(&a, x)                    atomic_set(&a, x)
+     atomic_mb_set(&b, y)                    atomic_store_release(&b, y)
                                             smp_wmb()
     atomic_mb_set(&b, y)                    atomic_set(&b, y)
                                       =>
     thread 2                                thread 2
     -------------------------               ------------------------
-     y = atomic_mb_read(&b)                  y = atomic_read(&b)
+     y = atomic_mb_read(&b)                  y = atomic_load_acquire(&b)
-                                             smp_rmb()
+     x = atomic_mb_read(&a)                  x = atomic_load_acquire(&a)
-     x = atomic_mb_read(&a)                  x = atomic_read(&a)
+     (other reads)
                                             smp_mb_acquire()
  Note that the barrier between the stores in thread 1, and between
  the loads in thread 2, has been optimized here to a write or a
@ -276,7 +283,6 @@ place barriers instead:
                                             smp_mb_acquire();
  Similarly, atomic_mb_set() can be transformed as follows:
  smp_mb():
                                             smp_mb_release();
     for (i = 0; i < 10; i++)          =>    for (i = 0; i < 10; i++)
@ -284,6 +290,8 @@ place barriers instead:
                                             smp_mb();
  The other thread can still use atomic_mb_read()/atomic_mb_set().
 The two tricks can be combined.  In this case, splitting a loop in
 two lets you hoist the barriers out of the loops _and_ eliminate the
 expensive smp_mb():
@ -296,8 +304,6 @@ expensive smp_mb():
                                               atomic_set(&a[i], false);
                                             smp_mb();
  The other thread can still use atomic_mb_read()/atomic_mb_set()
 Memory barrier pairing
 ----------------------
@ -386,10 +392,7 @@ and memory barriers, and the equivalents in QEMU:
  note that smp_store_mb() is a little weaker than atomic_mb_set().
  atomic_mb_read() compiles to the same instructions as Linux's
  smp_load_acquire(), but this should be treated as an implementation
-  detail.  QEMU does have atomic_load_acquire() and atomic_store_release()
+  detail.
  macros, but for now they are only used within atomic.h.  This may
  change in the future.
 SOURCES
 =======