注册 登录  
 加关注
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

gmd20的个人空间

// 编程和生活

 
 
 

日志

 
 

最简单的同步问题和原子操作(转)  

2008-08-15 13:29:35|  分类: 程序设计 |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |

原文来自:Linux_Kernel_Development_Second_Edition.SAMS.pdf

The Single Variable
Now, let's look at a specific computing example. Consider a very simple shared resource, a single global integer, and a very simple
critical region, the operation of merely incrementing it:
i++;
This might translate into machine instructions to the computer's processor that resemble the following:
get the current value of i and copy it into a register
add one to the value stored in the register
write back to memory the new value of i
Now, assume that there are two threads of execution, both enter this critical region, and the initial value of i is seven. The desired
outcome is then similar to the following (with each row representing a unit of time):
Thread 1 Thread 2
get i (7) -
increment i (7 -> 8) -
write back i (8) -
- get i (8)
- increment i (8 -> 9)
- write back i (9)
As expected, seven incremented twice is nine.A possible outcome, however, is the following:
Thread 1 Thread 2
get i (7) get i (7)
This document was created by an unregistered ChmMagic, please go to http://www.bisenter.com to regis.ter it. Thanks
Thread 1 Thread 2
increment i (7 -> 8) -
- increment i (7 -> 8)
write back i (8) -
- write back i (8)
If both threads of execution read the initial value of i before it is incremented, both threads will increment and save the same value. As a
result, the variable i contains the value eight when, in fact, it should now contain nine. This is one of the simplest examples of a critical
region. Thankfully, the solution is equally as simple: We merely need a way to perform these operations in one indivisible step. Most
processors provide an instruction to atomically read, increment, and write back a single variable. Using this atomic instruction, the only
possible outcome is
Thread 1 Thread 2
increment i (7 -> 8) -
- increment i (8 -> 9)
Or:
Thread 1 Thread 2
- increment i (7 -> 8)
increment i (8 -> 9) -
It would never be possible for the two atomic operations to interleave. The processor would physically ensure that it was impossible.
Using such an instruction would alleviate the problem. The kernel provides a set of interfaces that implement these atomic instructions;
they are discussed in the next chapter.

==========================

========================================

Atomic Operations
Atomic operations provide instructions that execute atomicallywithout interruption. Just as the atom was originally thought to be an
indivisible particle, atomic operators are indivisible instructions. For example, as discussed in the previous chapter, an atomic increment
can read and increment a variable by one in a single indivisible and uninterruptible step. Instead of the race discussed in the previous
chapter, the outcome is always similar to the following (assume i is initially seven):
Thread 1 Thread 2
atomic increment i (7 -> 8) -
- atomic increment i (8 -> 9)
The resulting value, nine, is correct. It is never possible for the two atomic operations to occur on the same variable concurrently.
Therefore, it is not possible for the increments to race.
The kernel provides two sets of interfaces for atomic operationsone that operates on integers and another that operates on individual bits.
These interfaces are implemented on every architecture that Linux supports. Most architectures either directly support simple atomic
operations or provide an operation to lock the memory bus for a single operation (and thus ensure another operation cannot occur
simultaneously). Architectures that cannot easily support primitive atomic operations, such as SPARC, somehow cope.
(The only atomic
instruction that is guaranteed to exist on all SPARC machines is ldstub.)
Atomic Integer Operations
The atomic integer methods operate on a special data type, atomic_t. This special type is used, as opposed to having the functions work
directly on the C int type, for a couple of reasons. First, having the atomic functions accept only thea tomic_t type ensures that the atomic
operations are used only with these special types. Likewise, it also ensures that the data types are not passed to any other nonatomic
functions. Indeed, what good would atomic operations be if they were not consistently used on the data? Next, the use of atomic_t ensures
the compiler does not (erroneously but cleverly) optimize access to the valueit is important the atomic operations receive the correct
memory address and not an alias. Finally, use of atomic_t can hide any architecture-specific differences in its implementation.
Despite being an integer, and thus 32 bits on all of the machines that Linux supports, developers and their code once had to assume that
an atomic_t was no larger than 24 bits in size. The SPARC port in Linux has an odd implementation of atomic operations: A lock was
embedded in the lower 8 bits of the 32-bit int (it looked like Figure 9.1). The lock was used to protect concurrent access to the atomic type
because the SPARC architecture lacks appropriate support at the instruction level. Consequently, only 24 usable bits were available on
SPARC machines. Although code that assumed that the full 32-bit range existed would work on other machines, it would have failed in
strange and subtle ways on SPARC machinesand that is just rude. Recently, clever hacks have allowed SPARC to provide a fully usable
32-bit atomic_t, and this limitation is no more. The old 24-bit implementation is still used by some internal SPARC code, however, and lives
in SPARC's <asm/atomic.h>.
Figure 9.1. Old layout of the 32-bit atomic_t on SPARC.
This document was created by an unregistered ChmMagic, please go to http://www.bisenter.com to register it. Thanks
The declarations needed to use the atomic integer operations are in <asm/atomic.h>. Some architectures provide additional methods that
are unique to that architecture, but all architectures provide at least a minimum set of operations that are used throughout the kernel.
When you write kernel code, you can ensure that these operations are correctly implemented on all architectures.
Defining an atomic_t is done in the usual manner. Optionally, you can set it to an initial value:
atomic_t v; /* define v */
atomic_t u = ATOMIC_INIT(0); /* define u and initialize it to zero */
Operations are all simple:
atomic_set(&v, 4); /* v = 4 (atomically) */
atomic_add(2, &v); /* v = v + 2 = 6 (atomically) */
atomic_inc(&v); /* v = v + 1 = 7 (atomically) */
If you ever need to convert an atomic_t to an int, use atomic_read():
printk("%d\n", atomic_read(&v)); /* will print "7" */
A common use of the atomic integer operations is to implement counters. Protecting a sole counter with a complex locking scheme is silly,
so instead developers use atomic_inc() and atomic_dec(), which are much lighter in weight.
Another use of the atomic integer operators is atomically performing an operation and testing the result. A common example is the atomic
decrement and test:
int atomic_dec_and_test(atomic_t *v)
This function decrements by one the given atomic value. If the result is zero, it returns true; otherwise, it returns false. A full listing of the
standard atomic integer operations (those found on all architectures) is in Table 9.1. All the operations implemented on a specific
architecture can be found in <asm/atomic.h>.
This document was created by an unregistered ChmMagic, please go to http://www.bisenter.com to register it. Thanks
Table 9.1. Full Listing of Atomic Integer Operations
Atomic Integer Operation Description
ATOMIC_INIT(int i) At declaration, initialize an atomic_t to i
int atomic_read(atomic_t *v) Atomically read the integer value of v
void atomic_set(atomic_t *v, int i) Atomically set v equal to i
void atomic_add(int i, atomic_t *v) Atomically add i to v
void atomic_sub(int i, atomic_t *v) Atomically subtract i from v
void atomic_inc(atomic_t *v) Atomically add one to v
void atomic_dec(atomic_t *v) Atomically subtract one from v
int atomic_sub_and_test(int i, atomic_t *v) Atomically subtract i from v and return true if the result is zero;
otherwise false
int atomic_add_negative(int i, atomic_t *v) Atomically add i to v and return true if the result is negative;
otherwise false
int atomic_dec_and_test(atomic_t *v) Atomically decrement v by one and return true if zero; false
otherwise
int atomic_inc_and_test(atomic_t *v) Atomically increment v by one and return true if the result is zero;
false otherwise
The atomic operations are typically implemented as inline functions with inline assembly (apparently, kernel developers like inlines). In the
case where a specific function is inherently atomic, the given function is usually just a macro. For example, on most sane architectures, a
word-sized read is always atomic. That is, a read of a single word cannot complete in the middle of a write to that word. The read will
always return the word in a consistent state, either before or after the write completes, but never in the middle. Consequently,
atomic_read() is usually just a macro returning the integer value of thea tomic_t.
Atomicity Versus Ordering
The preceding discussion on atomic reading begs a discussion on the differences between atomicity and ordering. As
discussed, a word-sized read will always occur atomically. It will never interleave with a write to the same word; the read
will always return the word in a consistent stateperhaps before the write completed, perhaps after, but never during. For
example, if an integer is initially 42 and then set to 365, a read on the integer will always return 42 or 365 and never
some commingling of the two values. This is atomicity.
It might be that your code wants something more than this, perhaps for the read to always occur before the pending
write. This is not atomicity, but ordering. Atomicity ensures that instructions occur without interruption and that they
complete either in their entirety or not at all. Ordering, on the other hand, ensures that the desired order of two or more
instructionseven if they are to occur in separate threads of execution or even separate processorsis preserved.
The atomic operations discussed in this section guarantee only atomicity. Ordering is enforced via barrier operations,
which we will discuss later in this chapter.
This document was created by an unregistered ChmMagic, please go to http://www.bisenter.com to register it. Thanks.
In your code, it is usually preferred to choose atomic operations over more complicated locking mechanisms. On most architectures, one
or two atomic operations incur less overhead and less cache-line thrashing than a more complicated synchronization method. As with any
performance-sensitive code, however, testing multiple approaches is always smart.
Atomic Bitwise Operations
In addition to atomic integer operations, the kernel also provides a family of functions that operate at the bit level. Not surprisingly, they are
architecture specific and defined in <asm/bitops.h>.
What may be surprising is that the bitwise functions operate on generic memory addresses. The arguments are a pointer and a bit number.
Bit zero is the least significant bit of the given address. On 32-bit machines, bit 31 is the most significant bit and bit 32 is the least
significant bit of the following word. There are no limitations on the bit number supplied, although most uses of the functions provide a
word and, consequently, a bit number between 0 and 31 (or 63, on 64-bit machines).
Because the functions operate on a generic pointer, there is no equivalent of the atomic integer's atomic_t type. Instead, you can work with
a pointer to whatever data you desire. Consider an example:
unsigned long word = 0;
set_bit(0, &word); /* bit zero is now set (atomically) */
set_bit(1, &word); /* bit one is now set (atomically) */
printk("%ul\n", word); /* will print "3" */
clear_bit(1, &word); /* bit one is now unset (atomically) */
change_bit(0, &word); /* bit zero is flipped; now it is unset (atomically) */
/* atomically sets bit zero and returns the previous value (zero) */
if (test_and_set_bit(0, &word)) {
/* never true */
}
/* the following is legal; you can mix atomic bit instructions with normal C */
word = 7;
A listing of the standard atomic bit operations is in Table 9.2.
This document was created by an unregistered ChmMagic, please go to http://www.bisenter.com to register it. Thanks
Table 9.2. Listing of Atomic Bitwise Operations
Atomic Bitwise Operation Description
void set_bit(int nr, void *addr) Atomically set the nr-th bit starting from addr
void clear_bit(int nr, void *addr) Atomically clear the nr-th bit starting from addr
void change_bit(int nr, void *addr) Atomically flip the value of the nr-th bit starting
from addr
int test_and_set_bit(int nr, void *addr) Atomically set the nr-th bit starting from addr and
return the previous value
int test_and_clear_bit(int nr, void *addr) Atomically clear the nr-th bit starting from addr
and return the previous value
int test_and_change_bit(int nr, void *addr) Atomically flip the nr-th bit starting from addr and
return the previous value
int test_bit(int nr, void *addr) Atomically return the value of the nr-th bit starting
from addr
Conveniently, nonatomic versions of all the bitwise functions are also provided. They behave identically to their atomic siblings, except they
do not guarantee atomicity and their names are prefixed with double underscores. For example, the nonatomic form of test_bit() is
__test_bit(). If you do not require atomicity (say, for example, because a lock already protects your data), these variants of the bitwise
functions might be faster.
What the Heck Is a Non-Atomic Bit Operation?
On first glance, the concept of a non-atomic bit operation may not make any sense. Only a single bit is involved, thus
there is no possibility of inconsistency. So long as one of the operations succeeds, what else could matter? Sure,
ordering might be important, but we are talking abouta tomicity here. At the end of the day, if the bit has a value that was
provided by any of the instructions, we should be good to go, right?
Let's jump back to just what atomicity means. Atomicity means that either instructions succeed in their entirety,
uninterrupted, or instructions fail to execute at all. Therefore, if you issue two atomic bit operations, you expect two
operations to succeed. Sure, the bit needs to have a consistent and correct value (the specified value from the last
successful operation, as suggested in the previous paragraph). Moreover, however, if the other operations succeed, then
at some point in time the bit needs to have those intermediate values, too.
For example, assume you issue two atomic bit operations: Initially set the bit and then clear the bit. Without atomic
operations, the bit may end up cleared, but it may never have been set. The set operation could occur simultaneously
with the clear operation and fail. The clear operation would succeed, and the bit would emerge cleared as intended. With
atomic operations, however, the set would actually occurthere would be a moment in time when a read would show the
bit as setand then the clear would execute and the bit be zero.
This behavior might be important, especially when ordering comes into play.
The kernel also provides routines to find the first set (or unset) bit starting at a given address:
This document was created by an unregistered ChmMagic, please go to http://www.bisenter.com to regis.ter it. Thanks
int find_first_bit(unsigned long *addr, unsigned int size)
int find_first_zero_bit(unsigned long *addr, unsigned int size)
Both functions take a pointer as their first argument and the number of bits in total to search as their second. They return the bit number of
the first set or first unset bit, respectively. If your code is searching only a word, the routines __ffs() and ffz(), which take a single
parameter of the word in which to search, are optimal.
Unlike the atomic integer operations, code typically has no choice whether to use the bitwise operationsthey are the only portable way to
set a specific bit. The only question is whether to use the atomic or nonatomic variants. If your code is inherently safe from race conditions,
you can use the nonatomic versions, which might be faster depending on the architecture.

================================================

原子操作 一般都是通过 使用处理器支持的带lock前缀汇编指令来实现的,像intel的处理器里面都提供这种指令,在执行这样的操作的时候处理器会锁定总线,这样就确保就算在多cpu环境中也同时有一个cpu在指定的内存数据进行操作。这种锁定会造成一定的性能损失吧。看看linux内核中include/asm-x86/atomic_32.h 文件里面的原子操作就是这样实现的:
http://lxr.linux.no/linux+v2.6.26.2/include/asm-x86/atomic_32.h
41/**
42 * atomic_add - add integer to atomic variable
43 * @i: integer value to add
44 * @v: pointer of type atomic_t
45 *
46 * Atomically adds @i to @v.
47 */
48static inline void atomic_add(int i, atomic_t *v)
49{
50        asm volatile(LOCK_PREFIX "addl %1,%0"
51                     : "+m" (v->counter)
52                     : "ir" (i));
53}
54
55/**
56 * atomic_sub - subtract integer from atomic variable
57 * @i: integer value to subtract
58 * @v: pointer of type atomic_t
59 *
60 * Atomically subtracts @i from @v.
61 */
62static inline void atomic_sub(int i, atomic_t *v)
63{
64        asm volatile(LOCK_PREFIX "subl %1,%0"
65                     : "+m" (v->counter)
66                     : "ir" (i));
67}
68


windows 系统有类似的API函数提供使用,就是就是对 lock前缀的指令的简单封装而已。

  评论这张
 
阅读(537)| 评论(0)
推荐 转载

历史上的今天

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2017