Userspace/kernel memcpy implementation

At the heart of the code for copying data between kernel space and user space on x86 is the macro __copy_user, which expands to the following inline assembly:
#define __copy_user(to,from,size)                       \
do {                                                    \
 int __d0, __d1;                                        \
 __asm__ __volatile__(                                  \
  "0: rep; movsl\n"                                     \
  " movl %3,%0\n"                                       \
  "1: rep; movsb\n"                                     \
  "2:\n"                                                \
  ".section .fixup,\"ax\"\n"                            \
  "3: lea 0(%3,%0,4),%0\n"                              \
  " jmp 2b\n"                                           \
  ".previous\n"                                         \
  ".section __ex_table,\"a\"\n"                         \
  " .align 4\n"                                         \
  " .long 0b,3b\n"                                      \
  " .long 1b,2b\n"                                      \
  ".previous"                                           \
  : "=&c"(size), "=&D" (__d0), "=&S" (__d1)             \
  : "r"(size & 3), "0"(size / 4), "1"(to), "2"(from)    \
  : "memory");                                          \
} while (0)

These few lines are perhaps intimidating, so here we go through the code in more detail. Don't forget to read the inline assembly links on this site.

Naive copy

  "0: rep; movsl\n"
  " movl %3,%0\n"
  "1: rep; movsb\n"
  "2:\n"

The first three instructions naively copy 'size' bytes from 'from' to 'to': 'rep; movsl' copies size / 4 32-bit words, 'movl %3,%0' reloads the counter with size & 3, and 'rep; movsb' copies those remaining bytes. I write "naively" because the copy may fail partway through with a memory fault. A fault can occur for several reasons, each handled in its own way. The extra code in '__copy_user' is one way to handle simple errors.
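The copy sequence can be modelled in plain C to make the roles of the two counters explicit. This is only a user-space sketch: the function name naive_copy_model is hypothetical, and because ordinary C cannot recover from a fault it always reports zero bytes left over.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model of the three copy instructions.
 * Returns the final counter value, i.e. the number of bytes NOT copied.
 * That is always 0 here, since this model has no fault handling. */
size_t naive_copy_model(void *to, const void *from, size_t size)
{
    unsigned char *dst = to;            /* plays the role of %edi */
    const unsigned char *src = from;    /* plays the role of %esi */
    size_t count = size / 4;            /* initial %ecx: "0"(size / 4) */
    size_t tail = size & 3;             /* operand %3: "r"(size & 3) */

    /* "0: rep; movsl": copy 4 bytes at a time until the counter is 0 */
    while (count != 0) {
        memcpy(dst, src, 4);
        dst += 4;
        src += 4;
        count--;
    }

    /* "movl %3,%0": reload the counter with the leftover byte count */
    count = tail;

    /* "1: rep; movsb": copy the remaining 0..3 bytes one at a time */
    while (count != 0) {
        *dst++ = *src++;
        count--;
    }

    /* "2:": the counter is what the macro leaves in 'size' */
    return count;
}
```

For an 11-byte copy the first loop runs twice (8 bytes) and the second loop three times.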

Fixup

  ".section .fixup,\"ax\"\n"
  "3: lea 0(%3,%0,4),%0\n"    
  " jmp 2b\n"
The kernel contains a table (in the '__ex_table' section) of entries (X, Y), each saying that if a fault occurs at address X, execution should resume at address Y. The entries declared in '__copy_user' thus say that if a memory fault occurs while executing the instruction at label '0:', the fault handler resumes at label '3:', which calculates the number of bytes not yet copied and then jumps back to '2:'. Likewise, a fault at '1:' resumes at '2:', simply skipping the rest of the copy; at that point %ecx already holds the number of bytes left.

Note that lea 0(%3,%0,4),%0 computes %ecx = (size % 4) + %ecx * 4: the leftover bytes plus four bytes for each 32-bit word that 'rep; movsl' had not yet copied.
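The same arithmetic can be written as a small C helper. The name fixup_remaining and its parameter names are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper mirroring "lea 0(%3,%0,4),%0".
 * dwords_left: value of %ecx when "rep; movsl" faulted
 * tail_bytes:  operand %3, i.e. size & 3 */
size_t fixup_remaining(size_t dwords_left, size_t tail_bytes)
{
    return tail_bytes + dwords_left * 4;
}
```

As a worked case, take size = 11, so size / 4 = 2 and size & 3 = 3. If the fault hits after one 32-bit word has been copied, %ecx is 1 and the result is 3 + 1 * 4 = 7, which is 11 minus the 4 bytes already copied.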

Labels like '0:', '1:', ... are "local labels", which can be used several times for different locations in the same code. They are then referenced by '0b', '1f', ... meaning "label 0 searching backwards" and "label 1 searching forwards" respectively.

Sections

  ".previous\n"
The assembler directive .previous tells the assembler to switch back to the section that was active before the most recent .section directive, here most likely the .text section.

The exception table data is put into the __ex_table section used by all exception table code in the kernel.

Exception table

  ".section __ex_table,\"a\"\n"    
  " .align 4\n"     
  " .long 0b,3b\n"     
  " .long 1b,2b\n"     
  ".previous"      
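This is the exception table itself, in the (X, Y) form described above: each '.long X,Y' line emits one entry, so a fault at '0:' resumes at '3:' and a fault at '1:' resumes at '2:'.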
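How a fault handler might consult such a table can be sketched in C. This is a simplified user-space model, not the kernel's code: the names ex_entry and find_fixup are hypothetical, the addresses used with it are made up, and a real kernel may organise and search its table differently.

```c
#include <stddef.h>

/* Simplified model of one table entry: the (X, Y) pair from the text. */
struct ex_entry {
    unsigned long insn;   /* X: address of an instruction that may fault */
    unsigned long fixup;  /* Y: address at which to resume after a fault */
};

/* Hypothetical lookup: linear search for the faulting address.
 * Returns the fixup address, or 0 when no entry covers the fault,
 * in which case the fault has to be treated as a genuine error. */
unsigned long find_fixup(const struct ex_entry *table, size_t n,
                         unsigned long fault_addr)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].insn == fault_addr)
            return table[i].fixup;
    }
    return 0;
}
```

With a two-entry table modelling '.long 0b,3b' and '.long 1b,2b', a lookup of the address of '0:' yields the address of '3:', while an address with no entry yields 0.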

Originally by Per Persson, modified by John Levon.