learn from asm work in progress

dwelch
2014-02-25 03:06:09 -05:00
parent daaa73fa5c
commit d861fc427d
2 changed files with 300 additions and 3 deletions

@@ -449,7 +449,304 @@ So the compiler chose to add 7 to b (b is held in r1 here) and then
add b+1. We know that a+b = b+a so why the compiler did it that way
r0,r1,r0 instead of r0,r0,r1 we may never know. It works.
So if you are following along in the manual you may notice our syntax
has some handwaving going on. The mov and add both have this shifter_operand
ADD{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
Under ADDs shifter_operand in my manual it tells me to look here
The options for this operand are described in Addressing
Mode 1 - Data-processing operands on page A5-2, i
The one we are using in the case of these add instructions is this one
2. <Rm>
See Data-processing operands - Register on page A5-8.
The mov instruction takes us to the same place and for that mov pc,lr
it is also a register mov but the mov r0,#12 was an immediate
1. #<immediate>
See Data-processing operands - Immediate on page A5-6.
Now ARM is generally pretty good at using pseudo code to tell you
what is going on.
0: e3a0000c mov r0, #12
Going back to the mov in the alphabetical listing (understand that
assembly language is generally case insensitive, you can use MOV or
mov and it works). That table of bits and such at the beginning of
each of these instructions is the machine code and it is not a bad idea
to look at it. If you are learning assembly language you need to
strengthen your bit manipulation skills, hex/binary and such. So the
top 4 bits of the instruction, 31 to 28, are the condition field, to be
talked about later. For now most instructions use a 0xE which means
always execute, we see that e at the top of our machine code. Now it is
more obvious with other instruction sets, with ARM our "opcode" bits
are often strewn about. In this case there are two zeros at 27 and
26, then 25 is an I. It does not tell us on this page, but when we look
at the shifter_operand stuff we find that this means immediate or not.
In this case we are using an immediate so this bit is a 1. Then 24 - 21
is a 0xD. The S bit we are not using so it is a zero. SBZ in this manual
means should be zero, so hopefully the assembler did that. Then Rd,
our destination register, in this case r0 so that should be a zero.
And then the shifter_operand stuff. So far without looking at the
shifter operand our instruction looks like this in machine code
111000i1101000000000ssssssssssss
where the s bits will be filled in once we find our shifter operand.
Your manual may be slightly different mine says
A5.1.3 Data-processing operands - Immediate
And of course there is another diagram of the instruction, this one
a bit more generic. It doesn't know we are doing a move, and the Rn
field is there where we had an SBZ for MOV. That is all okay, this is
the same encoding just generic to show us how to use an immediate with
the various instructions that can use this immediate encoding.
We first see that bit 25 is a 1, we didn't know that before, so we now
have
11100011101000000000ssssssssssss
And that so far matches what the compiler/assembler generated
1110 0011 1010 0000 0000 ssssssssssss
0xE3A00...
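The field-by-field walk above can be checked with a few shifts and ORs.
This is only a sketch: the name encode_mov_imm and its parameters are
made up for illustration, and it hardcodes the always condition (0xE),
S = 0 and the MOV opcode 0xD discussed above. The 12 bit shifter operand
is passed in as is.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the manual): build the 32-bit word for
   MOV Rd, #immediate from the fields described in the text. */
uint32_t encode_mov_imm(uint32_t rd, uint32_t shifter_operand)
{
    assert(rd < 16);                  /* Rd is a 4-bit field            */
    assert(shifter_operand < 0x1000); /* shifter_operand is 12 bits     */

    uint32_t word = 0;
    word |= 0xEu << 28;      /* bits 31-28: condition, 0xE = always     */
                             /* bits 27-26: 00                          */
    word |= 1u << 25;        /* bit 25: I = 1, immediate operand        */
    word |= 0xDu << 21;      /* bits 24-21: opcode, 0xD = MOV           */
                             /* bit 20: S = 0, bits 19-16: SBZ          */
    word |= rd << 12;        /* bits 15-12: Rd                          */
    word |= shifter_operand; /* bits 11-0: shifter operand              */
    return word;
}
```

Calling encode_mov_imm(0, 0x00C) gives 0xE3A0000C, the same word the
assembler produced for mov r0, #12.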
shifter_operand = immed_8 Rotate_Right (rotate_imm * 2)
if rotate_imm == 0 then
shifter_carry_out = C flag
else /* rotate_imm != 0 */
shifter_carry_out = shifter_operand[31]
So let's work backwards from what the assembler produced. It has 0x00C
for those lower 12 bits, so that is 4 bits of rotate_imm, bits 11 to 8,
and 8 bits of immed_8, bits 7 to 0. For this encoding
rotate_imm = 0000 = 0x0
immed_8 = 00001100 = 0x0C
shifter_operand = immed_8 Rotate_Right (rotate_imm * 2)
Since rotate_imm is zero in our case it doesn't rotate, we simply
get shifter_operand = immed_8 = 0x0C which is 12 decimal which is
what we wanted
0: e3a0000c mov r0, #12
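The manual's pseudo code for this operand is short enough to try out in
C. A minimal sketch, assuming the field layout just described
(rotate_imm in bits 11 to 8, immed_8 in bits 7 to 0). The names ror32
and decode_dp_immediate are invented here, and the carry out part of
the pseudo code is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits; bits that fall off the right
   end come back in on the left. */
static uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Hypothetical helper: expand the 12-bit immediate shifter_operand
   field into the 32-bit value the instruction will use. */
uint32_t decode_dp_immediate(uint32_t field)
{
    assert(field < 0x1000);                   /* only 12 bits are valid */
    uint32_t rotate_imm = (field >> 8) & 0xF; /* bits 11-8              */
    uint32_t immed_8 = field & 0xFF;          /* bits 7-0               */
    return ror32(immed_8, rotate_imm * 2);
}
```

decode_dp_immediate(0x00C) returns 12, matching the mov r0, #12 above.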
What we have learned from this exercise is that for this immediate
encoding type, which is used by many ARM instructions, we can basically
have immediates with any 8 bit value rotated left or right an even
number of bits. A rotate right in this case means shift the bits to the
right and the rotate means the bits that fall off the end on the right
fill in the bits on the left. So the number 0x1234 will not fit, from
its first one bit to its last one bit it spans more than 8 bits
   123456789-1
0001001000110100
11 significant bits.
What about 0x1220?
   12345678
0001001000100000
unsigned int fun ( void )
{
return(0x122);
}
00000000 <fun>:
0: e59f0000 ldr r0, [pc] ; 8 <fun+0x8>
4: e12fff1e bx lr
8: 00000122 andeq r0, r0, r2, lsr #2
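Nope, doesn't quite work, the compiler had to load the constant from
memory instead. 0x1220 has five zero bits behind the eight significant
bits and the 0x122 compiled here has one, either way that is an odd
number and we can only rotate an even number of bits. But we can try
   12345678
0001001000100
0x244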
unsigned int fun ( void )
{
return(0x244);
}
00000000 <fun>:
0: e3a00f91 mov r0, #580 ; 0x244
4: e12fff1e bx lr
Yes, worked... the lower 12 bits are 0xF91
so rotate_imm is 1111 and immed_8 is 0x91.
So we need to rotate 0x91 right 0xF*2 bits, which is 30 bits.
In a 32 bit register 30 bits right is the same as 2 bits left so
00000...0010010001
0000...00100100010 one bit
000...001001000100 two bits
and there is our 0x00000244
Pretty cool. Now you know what I know. A really cool one is something
like
unsigned int fun ( void )
{
return(0x10000001);
}
00000000 <fun>:
0: e3a00211 mov r0, #268435457 ; 0x10000001
4: e12fff1e bx lr
taking full advantage of the rotate around the end.
0x11 rotated right 2*2 bits or 4 bits
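Going the other direction, from a constant to the 12 bit field, can be
done by brute force since there are only 16 rotations to try. This is a
sketch and not necessarily how any real assembler does it. The names
rol32 and encode_dp_immediate are made up, and when more than one
encoding exists this one simply returns the one with the smallest
rotate_imm.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits. */
static uint32_t rol32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Hypothetical helper: look for a rotate_imm/immed_8 pair that
   produces value. Returns 1 and stores the 12-bit field in *field on
   success, returns 0 if the value cannot be encoded this way. */
int encode_dp_immediate(uint32_t value, uint32_t *field)
{
    assert(field != 0);
    for (uint32_t rotate_imm = 0; rotate_imm < 16; rotate_imm++) {
        /* Undo the rotate right by rotating left the same amount. */
        uint32_t immed_8 = rol32(value, rotate_imm * 2);
        if (immed_8 <= 0xFF) {
            *field = (rotate_imm << 8) | immed_8;
            return 1;
        }
    }
    return 0;
}
```

For 0x244 this finds 0xF91 and for 0x10000001 it finds 0x211, the same
fields seen in the disassembly above. For 0x122, 0x1220 and 0x1234 it
returns 0, which is why the compiler fell back to a load from memory.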
So what if we and instead of add
unsigned int fun ( unsigned int a, unsigned int b )
{
return(a&b);
}
00000000 <fun>:
0: e0000001 and r0, r0, r1
4: e12fff1e bx lr
with the add we had
0: e0800001 add r0, r0, r1
The difference being the opcode bits, one says do an add the other do
an and. The rest of the instruction is the same because all of the
other operands are the same r0,r0,r1.
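To see that only the opcode bits differ, the two words can be compared
in C. A small sketch: dp_opcode is an invented name, and the opcode
values in the comment (0x0 for AND, 0x4 for ADD, 0xD for MOV) are the
ones from the data-processing encoding in the ARM manual.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pull the 4-bit opcode (bits 24-21) out of a
   data-processing instruction word.
   0x0 = AND, 0x4 = ADD, 0xD = MOV. */
uint32_t dp_opcode(uint32_t word)
{
    /* bits 27-26 are 00 for the data-processing instructions */
    assert(((word >> 26) & 0x3) == 0);
    return (word >> 21) & 0xF;
}
```

dp_opcode(0xE0000001) is 0x0 and dp_opcode(0xE0800001) is 0x4. An
exclusive or of the two words leaves 0x00800000, a single bit inside
the opcode field.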
And if you have not done this already, look up the register
encoding for "shifter_operand", in my book it is in section
A5.1.4 Data-processing operands - Register
Those lower 12 bits are mostly zeros with the last four being that
third register.
Be careful though, they have a little swizzle in there
4: e0810000 add r0, r1, r0
The destination register Rd, where the result goes, is in bits 15 to 12.
The first operand, Rn, is in bits 19 to 16. So you can't just read left
to right in the instruction and see the three registers, you have to
know that for this flavor of encoding the machine code order is middle,
first, then at the end the last/third register from the assembly
language.
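The swizzle is easier to trust once the fields are pulled out by
position. A sketch with made up names (struct dp_regs,
dp_register_fields), assuming the plain register form of the operand
where bit 25 is zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three register numbers. */
struct dp_regs {
    unsigned rd; /* destination                */
    unsigned rn; /* first operand              */
    unsigned rm; /* second operand (register)  */
};

/* Hypothetical helper: extract Rd, Rn and Rm from a data-processing
   word that uses the register form of shifter_operand. */
struct dp_regs dp_register_fields(uint32_t word)
{
    assert(((word >> 25) & 1) == 0); /* I = 0: operand is a register */

    struct dp_regs r;
    r.rn = (word >> 16) & 0xF; /* bits 19-16, first in the word      */
    r.rd = (word >> 12) & 0xF; /* bits 15-12, second in the word     */
    r.rm = word & 0xF;         /* bits 3-0, last in the word         */
    return r;
}
```

For 0xE0810000 this gives rn = 1, rd = 0, rm = 0, which the
disassembler prints in the order Rd, Rn, Rm as add r0, r1, r0.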
Moving on...
unsigned int fun ( unsigned int *a )
{
return(*a+8);
}
00000000 <fun>:
0: e5900000 ldr r0, [r0]
4: e2800008 add r0, r0, #8
8: e12fff1e bx lr
So we saw an ldr before with the special program counter register, this
time it is a general purpose register, r0, in both places. The thing in
the [brackets] is the address to load from and the other register is
where to put the thing read from memory. In this case the C code passes
an address of a 32 bit thing as its only operand. We know that
parameter, the address a, will be passed in r0. So we read from that
address. Since we are not going to need the address a again and we want
to use r0 for our return value, we can trash r0 by saving what we read
over it then adding 8 to it. If we had further uses for a (we will do
this kind of thing later) then the compiler would have had to use a
different register for a while then eventually computed the return
value and put it in r0.
Let's do that now...
unsigned int fun ( unsigned int *a )
{
return(a[0]+a[11]+8);
}
00000000 <fun>:
0: e5902000 ldr r2, [r0]
4: e590302c ldr r3, [r0, #44] ; 0x2c
8: e0820003 add r0, r2, r3
c: e2800008 add r0, r0, #8
10: e12fff1e bx lr
Since we will need the pointer/address a again we cannot trash it on
that first read. This brings up a point in the calling convention
that will become obvious in the near future. The calling convention,
as far as we care for now, says that we can trash/destroy r0-r3 in our
function without causing any harm to anyone else. Other registers
we will see that we need to preserve so that when we return they
contain the same value they had when the caller called our function.
If you had looked at the ldr encoding you may have realized that the
ldr r2,[r0]
encoding actually has an offset of zero
ldr r2,[r0,#0]
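The offset can be read straight out of the machine code, it is the
lower 12 bits of the word. A sketch under some assumptions: the name
ldr_immediate_offset is invented, it only handles the immediate offset
form, and it ignores the bit that selects adding or subtracting the
offset (both words here add).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the 12-bit immediate offset of an
   LDR Rd, [Rn, #offset] instruction word. */
uint32_t ldr_immediate_offset(uint32_t word)
{
    /* bits 27-25 are 010 for the immediate offset form */
    assert(((word >> 25) & 0x7) == 0x2);
    return word & 0xFFF; /* bits 11-0 */
}
```

ldr_immediate_offset(0xE5902000) is 0 and
ldr_immediate_offset(0xE590302C) is 44.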
Quite literally an instruction like this
ldr r3,[r0,#44]
means take the number in r0, add 44 to it then use that as an address to
read from memory.
This program said to get a[11], and a points at 32 bit integers so that
means 4 bytes per item so 11*4 = 44. Take the address a, add 44 and read
that item. The compiler here can now trash r0 if it wants, we don't need
it anymore as the address a, but we will need to get our answer in it in
an efficient manner if we are optimizing, which we are. r2 for the
moment we cannot trash because it holds the first item we need to add
together, a[0]. r0, r1, and r3 are fair game, the compiler chooses r3,
we don't know why, it just did. So we have three things we need to add:
a[0], a[11] and the immediate 8. The add instruction can only add two
things at a time so the compiler chose to add a[0] and a[11] then take
that result and add 8.
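The four instructions can be written back out in C, one statement per
instruction, to check the byte offset arithmetic. This is a model for
illustration only: load_word and fun_model are made up names, and
uint32_t stands in for the 32 bit unsigned int of this target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of ldr Rd, [Rn, #offset]: add the byte offset to
   the base address and read 32 bits from there. */
static uint32_t load_word(const void *base, uint32_t byte_offset)
{
    uint32_t value;
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}

/* The body of fun(), one C statement per instruction. The local
   variables are named after the registers they stand for. */
uint32_t fun_model(const uint32_t *a)
{
    assert(a != 0);
    uint32_t r2 = load_word(a, 0);  /* ldr r2, [r0]       reads a[0]  */
    uint32_t r3 = load_word(a, 44); /* ldr r3, [r0, #44]  reads a[11] */
    uint32_t r0 = r2 + r3;          /* add r0, r2, r3                 */
    r0 = r0 + 8;                    /* add r0, r0, #8                 */
    return r0;
}
```

With a[0] = 100 and a[11] = 20 the model returns 128, the same as
a[0]+a[11]+8.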
The compiler could have just as well done something like this
ldr r1,[r0]
ldr r0,[r0, #44] ; 0x2c
add r0,r0,r1
add r0,r0,#8
bx lr
and it would have worked just as well. There are algorithms that
compilers use to determine which registers they can use, which they
can't, and which they can but it takes more work. In this case for the
first instruction the easy to use ones were r1,r2,r3. Then for the
second instruction r0,r1,r2,r3 minus the one they used as Rd in the
first instruction. And for the third instruction Rd didn't have to be
r0, they could have done this for example
ldr r3,[r0]
ldr r2,[r0, #44] ; 0x2c
add r1,r2,r3
add r0,r1,#8
bx lr
It would have been just as good as the other combinations.

@@ -1,5 +1,5 @@
-unsigned int fun ( unsigned int a, unsigned int b )
+unsigned int fun ( unsigned int *a )
 {
-return(a+b+7);
+return(a[0]+a[11]+8);
 }