When you mention jmp+body+test, I believe you are talking about the translation of a `while` loop in high-level languages. There is a reason for the second approach. Let's take a look.
Consider:

    x = N
    while (x != 0) {
        BODY
        x--
    }

The naive way is:

        mov ecx, N      ; store var x in ecx register
    top:
        cmp ecx, 0      ; test at top of loop
        je bottom       ; loop exit when while condition false
        BODY
        dec ecx
        jmp top
    bottom:

This executes N+1 conditional jumps (N that fall through, plus the final one that exits the loop) and N unconditional jumps.

The second way is:

        mov ecx, N
        jmp bottom
    top:
        BODY
        dec ecx
    bottom:
        cmp ecx, 0
        jne top

Now we still execute N+1 conditional jumps, but only ONE unconditional jump. A small saving, but it just might matter, because it removes one jump from every iteration of the loop.
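
To check the jump counts, here is a minimal C sketch that mirrors both layouts with `goto` and counts the jumps that execute. The struct and function names (`jump_counts`, `count_test_at_top`, `count_test_at_bottom`) are made up for illustration, and the counts include the final test that exits the loop, so each layout shows N+1 conditional jumps.

```c
/* Jump instructions executed during one run of a loop. */
struct jump_counts {
    unsigned conditional;    /* conditional jumps executed, taken or not */
    unsigned unconditional;  /* unconditional jumps executed */
    unsigned body_runs;      /* times BODY ran */
};

/* Test-at-top layout (the naive translation). */
struct jump_counts count_test_at_top(unsigned n) {
    struct jump_counts c = {0, 0, 0};
    unsigned ecx = n;               /* mov ecx, N */
top:
    c.conditional++;                /* cmp ecx, 0 / je bottom */
    if (ecx == 0) goto bottom;
    c.body_runs++;                  /* BODY */
    ecx--;                          /* dec ecx */
    c.unconditional++;              /* jmp top */
    goto top;
bottom:
    return c;
}

/* Test-at-bottom layout (jump in once, then body + test). */
struct jump_counts count_test_at_bottom(unsigned n) {
    struct jump_counts c = {0, 0, 0};
    unsigned ecx = n;               /* mov ecx, N */
    c.unconditional++;              /* jmp bottom */
    goto bottom;
top:
    c.body_runs++;                  /* BODY */
    ecx--;                          /* dec ecx */
bottom:
    c.conditional++;                /* cmp ecx, 0 / jne top */
    if (ecx != 0) goto top;
    return c;
}
```

For `n = 5`, the first function reports 6 conditional and 5 unconditional jumps, the second 6 conditional and 1 unconditional, and both run the body 5 times. This only counts instructions; it says nothing about how a real processor predicts or schedules them.
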
Now, you did mention the `loop` instruction, which (apart from leaving the flags untouched) is essentially:

        dec ecx
        cmp ecx, 0
        jne somewhere   ; jump while ecx is still nonzero

How would you work that in? Probably like this:

        mov ecx, N
        cmp ecx, 0      ; Must guard against N==0
        je bottom
    top:
        BODY
        loop top        ; built-in dec, test, and jump if not zero
    bottom:

This is a pretty little solution typical of CISC processors. Is it faster than the second way above? That depends a great deal on the architecture. I suggest you do some research on the performance of the `loop` instruction in the IA-32 and Intel 64 processor architectures, if you really want to know more.
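
To see why the `loop` version needs the guard against N==0, here is a small C model of the same control flow. It is only a sketch: `loop_taken`, `guarded_loop`, `unguarded_loop`, and the `cap` parameter are hypothetical names introduced for illustration, and the cap exists only so the unguarded case terminates quickly. Because `loop` decrements before testing, starting with `ecx == 0` wraps the 32-bit counter around instead of exiting.

```c
#include <stdint.h>

/* Models `loop top`: decrement ecx, then report whether the jump is taken. */
static int loop_taken(uint32_t *ecx) {
    *ecx -= 1;                      /* unsigned wraparound: 0 becomes 0xFFFFFFFF */
    return *ecx != 0;
}

/* Guarded version from the answer; returns how many times BODY ran. */
uint64_t guarded_loop(uint32_t n) {
    uint32_t ecx = n;               /* mov ecx, N */
    uint64_t body_runs = 0;
    if (ecx == 0) goto bottom;      /* cmp ecx, 0 / je bottom */
top:
    body_runs++;                    /* BODY */
    if (loop_taken(&ecx)) goto top; /* loop top */
bottom:
    return body_runs;
}

/* Same loop without the guard; stops after `cap` body runs at the latest. */
uint64_t unguarded_loop(uint32_t n, uint64_t cap) {
    uint32_t ecx = n;               /* mov ecx, N */
    uint64_t body_runs = 0;
top:
    body_runs++;                    /* BODY */
    if (body_runs < cap && loop_taken(&ecx)) goto top;  /* loop top, capped */
    return body_runs;
}
```

With `n = 0`, `guarded_loop` runs the body zero times, while `unguarded_loop(0, 1000)` keeps going until it hits the cap; without a cap it would run the body 2^32 times. For any nonzero `n`, both versions run the body exactly `n` times.
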