learn from asm work in progress

2014-02-25 03:06:09 -05:00
parent daaa73fa5c
commit d861fc427d
2 changed files with 300 additions and 3 deletions
--- a/learnasmfromc/LAFC.txt
+++ b/learnasmfromc/LAFC.txt
@@ -449,7 +449,304 @@ So the compiler chose to add 7 to b (b is held in r1 here) and then
 add b+1.  We know that a+b = b+a so why the compiler did it that way
 r0,r1,r0 instead of r0,r0,r1 we may never know.  It works.

-
+So if you are following along in the manual you may notice our syntax
+has some handwaving going on.  The mov and add both have this shifter_operand
+
+ADD{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
+
+
+Under ADDs shifter_operand in my manual it tells me to look here
+
+The options for this operand are described in Addressing
+Mode 1 - Data-processing operands on page A5-2, i
+
+The one we are using in the case of these add instructions is this one
+
+2. <Rm>
+See Data-processing operands - Register on page A5-8.
+
+The mov instruction takes us to the same place and for that mov pc,lr
+it is also a register mov but the mov r0,#12 was an immediate
+
+1. #<immediate>
+See Data-processing operands - Immediate on page A5-6.
+
+Now ARM is generally pretty good at using pseudo code to tell you
+what is going on.
+
+   0:   e3a0000c    mov r0, #12
+
+Going back to the mov in the alphabetical listing (understand that
+assembly language is generally case insensitive, you can use MOV or
+mov and it works).  That table of bits and such at the beginning of
+each of these instructions is the machine code and it is not a bad idea
+to look at it.  If you are learning assembly language you need to
+strengthen your bit manipulation skills, hex/binary and such.  So the
+top 4 bits of the instruction, 31 to 28, are the condition field, to be
+talked about later.  For now most instructions use a 0xE which means
+always execute, we see that e at the top of our machine code.  It is
+more obvious with other instruction sets, with ARM our "opcode" bits
+are often strewn about.  In this case there are two zeros at 27 and
+26, then 25 is an I.  It does not tell us on this page, but when we
+look at the shifter_operand stuff we find that this means immediate or
+not.  In this case we are using an immediate so this bit is a 1.  Then
+24 to 21 is a 0xD, the opcode for mov.  The S bit, bit 20, we are not
+using so it is a zero.  SBZ in this manual means should be zero, so
+hopefully the assembler did that for bits 19 to 16.  Then Rd, bits 15
+to 12, is our destination register, in this case r0 so that should be
+a zero.
+And then the shifter_operand stuff.  So far without looking at the
+shifter operand our instruction looks like this in machine code
+
+111000i1101000000000ssssssssssss
+
+where the s bits will be filled in once we find our shifter operand.
+
+Your manual may be slightly different mine says
+
+A5.1.3 Data-processing operands - Immediate
+
+And of course there is another diagram of the instruction this one
+a bit more generic it doesnt know we are doing a move and the Rn field
+is there where we had a SBZ for MOV.  That is all okay this is the
+same encoding just generic to show us how to use an immediate with
+the various instructions that can use this immediate encoding.
+
+We first see that bit 25 is a 1, we didnt know that before we now have
+
+11100011101000000000ssssssssssss
+
+And that so far matches what the compiler/assembler generated
+
+1110 0011 1010 0000 0000 ssssssssssss
+
+0xE3A00...
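If you would rather let the computer do the bit picking, here is a
minimal sketch in C that pulls those fields out of a 32 bit
data-processing instruction.  The struct and function names (dp_fields,
dp_decode) are made up for this example, they are not from the manual
or any library.  The field positions are the ones walked through above.

```c
#include <stdint.h>

/* Fields of an ARM data-processing instruction.  Names are
   illustrative only. */
struct dp_fields {
    unsigned cond;     /* bits 31-28, 0xE means always execute */
    unsigned i_bit;    /* bit 25, 1 means immediate operand */
    unsigned opcode;   /* bits 24-21, 0xD is mov, 0x4 is add, 0x0 is and */
    unsigned s_bit;    /* bit 20 */
    unsigned rn;       /* bits 19-16, SBZ for mov */
    unsigned rd;       /* bits 15-12, destination register */
    unsigned shifter;  /* bits 11-0, the shifter_operand field */
};

/* Split one instruction word into its fields with shifts and masks. */
struct dp_fields dp_decode(uint32_t insn)
{
    struct dp_fields f;
    f.cond    = (insn >> 28) & 0xF;
    f.i_bit   = (insn >> 25) & 0x1;
    f.opcode  = (insn >> 21) & 0xF;
    f.s_bit   = (insn >> 20) & 0x1;
    f.rn      = (insn >> 16) & 0xF;
    f.rd      = (insn >> 12) & 0xF;
    f.shifter = insn & 0xFFF;
    return f;
}
```

Feeding it 0xE3A0000C should give cond 0xE, I bit 1, opcode 0xD, S bit 0,
Rd 0 and 0x00C left over as the shifter_operand field.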
+
+
+shifter_operand = immed_8 Rotate_Right (rotate_imm * 2)
+if rotate_imm == 0 then
+   shifter_carry_out = C flag
+else /* rotate_imm != 0 */
+   shifter_carry_out = shifter_operand[31]
+
+so lets work backwards from what the assembler produced.  It has 0x00C
+for those lower 12 bits, so that is 4 bits of rotate_imm, bits 11 to 8,
+and 8 bits of immed_8, bits 7 to 0.  For this encoding
+rotate_imm = 0000 = 0x0
+immed_8 = 00001100 = 0x0C
+
+
+shifter_operand = immed_8 Rotate_Right (rotate_imm * 2)
+
+since rotate_imm is zero in our case then it doesnt rotate we simply
+get shifter_operand = immed_8 = 0x0C which is a 12 decimal which is
+what we wanted
+
+   0:   e3a0000c    mov r0, #12
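The pseudo code from the manual is easy enough to try in C.  This is
just a sketch of the shifter_operand calculation for the immediate
encoding, it ignores the carry out, and the function names (ror32,
arm_imm_decode) are mine not ARMs.

```c
#include <stdint.h>

/* Rotate a 32 bit value right by n bits, n taken modulo 32.  The n == 0
   case is handled separately because shifting by 32 is undefined in C. */
static uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Expand the lower 12 bits of an immediate-form data-processing
   instruction into the 32 bit value the processor will use. */
uint32_t arm_imm_decode(uint32_t field12)
{
    uint32_t rotate_imm = (field12 >> 8) & 0xF;  /* bits 11 to 8 */
    uint32_t immed_8    = field12 & 0xFF;        /* bits 7 to 0 */
    return ror32(immed_8, rotate_imm * 2);
}
```

arm_imm_decode(0x00C) gives 12, the same answer we worked out by hand.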
+
+What we have learned from this exercise is that for this immediate
+encoding type, which is used by many ARM instructions, we can basically
+have immediates with any 8 bit value rotated right an even number of
+bits, 0 to 30 (rotating right far enough is the same as rotating
+left).  A rotate right in this case means shift the bits to the right,
+and the rotate means the bits that fall off the end on the right
+fill in the bits on the left.  So the number 0x1234 does not fit, from
+its first one bit to its last one bit it spans more than 8 bits
+
+   123456789-1
+0001001000110100
+
+11 significant bits.
+
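+what about 0x122?
+
+   12345678
+000100100010
+
+unsigned int fun ( void )
+{
+    return(0x122);
+}
+
+00000000 <fun>:
+   0:   e59f0000    ldr r0, [pc]    ; 8 <fun+0x8>
+   4:   e12fff1e    bx  lr
+   8:   00000122    andeq   r0, r0, r2, lsr #2
+
+Nope doesnt quite work.  It is only 8 significant bits, but there is
+1 bit behind it and we cant rotate that value an even number of bits
+to get there, so the compiler loaded the constant from memory instead
+(the word at address 8 is data, the disassembler just tried to decode
+it as an instruction).  But we can try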
+
+   12345678
+0001001000100
+
+0x244
+
+unsigned int fun ( void )
+{
+    return(0x244);
+}
+
+00000000 <fun>:
+   0:   e3a00f91    mov r0, #580    ; 0x244
+   4:   e12fff1e    bx  lr
+
+yes, worked... the lower 12 bits are 0xF91
+
+so rotate_imm is 1111 and immed_8 is 0x91
+
+so we need to rotate 0x91 right 0xF*2 times. which is 30 times.
+
+30 times right is the same as 2 left so
+
+00000...0010010001
+0000...00100100010 one bit
+000...001001000100 two bits
+
+and there is our 0x00000244
+
+Pretty cool.  Now you know what I know.  A really cool one is something
+like
+
+unsigned int fun ( void )
+{
+    return(0x10000001);
+}
+
+00000000 <fun>:
+   0:   e3a00211    mov r0, #268435457  ; 0x10000001
+   4:   e12fff1e    bx  lr
+
+taking full advantage of the rotate around the end.
+
+0x11 rotated right 2*2 bits or 4 bits
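Going the other direction, here is a rough sketch of how a tool might
figure out whether a constant fits this encoding at all: try all 16
rotations and see if what is left fits in 8 bits.  The function name
arm_imm_encode is made up, and returning the first rotation that works
is an assumption of this sketch, a real assembler is free to pick
another valid one.

```c
#include <stdint.h>

/* Try to express value as immed_8 rotated right by 2*rotate_imm.
   Returns the 12 bit field (rotate_imm << 8) | immed_8, or -1 if no
   even rotation of an 8 bit value produces it. */
int arm_imm_encode(uint32_t value)
{
    for (unsigned rot = 0; rot < 16; rot++) {
        unsigned n = rot * 2;
        /* undo the rotate right by rotating left the same amount */
        uint32_t immed = n ? (value << n) | (value >> (32 - n)) : value;
        if (immed <= 0xFF)
            return (int)((rot << 8) | immed);
    }
    return -1;
}
```

With the numbers from above, 0x244 comes back as 0xF91 and 0x10000001
as 0x211, matching the machine code, while 0x1234 and 0x122 come back
as -1, which is why the compiler had to load 0x122 from memory.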
+
+So what if we use an and instead of an add
+
+unsigned int fun ( unsigned int a, unsigned int b )
+{
+    return(a&b);
+}
+
+00000000 <fun>:
+   0:   e0000001    and r0, r0, r1
+   4:   e12fff1e    bx  lr
+
+with the add we had
+
+   0:   e0800001    add r0, r0, r1
+
+the difference being the opcode bits, one says do an add the other do
+an and.  The rest of the instruction is the same because all of the
+other operands are the same r0,r0,r1.
+
+And if you have not done this already, look up the register encoding
+for "shifter_operand", which in my book is in section
+
+A5.1.4 Data-processing operands - Register
+
+those lower 12 bits are mostly zeros with the last four being that
+third register.
+
+be careful though, they have a little swizzle in there
+
+   4:   e0810000    add r0, r1, r0
+
+the destination register Rd, where the result goes, is in bits 15 to 12
+and the first operand Rn is in bits 19 to 16.  So you cant just read
+left to right in the machine code and see the three registers in the
+order you wrote them.  You have to know that for this flavor of
+encoding the order is middle (Rn), first (Rd), and then at the end, in
+bits 3 to 0, the last/third register (Rm) from the instruction.
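A small sketch in C makes the swizzle easy to check.  The function name
dp_registers is just something picked for this example, and it assumes
the plain register form of shifter_operand (no shift applied to Rm).

```c
#include <stdint.h>

/* Pull the three register numbers out of a data-processing instruction
   that uses the plain register form of shifter_operand.  The results
   are handed back in assembly language order: Rd, Rn, Rm. */
void dp_registers(uint32_t insn, unsigned *rd, unsigned *rn, unsigned *rm)
{
    *rn = (insn >> 16) & 0xF;  /* first source operand, bits 19 to 16 */
    *rd = (insn >> 12) & 0xF;  /* destination, bits 15 to 12 */
    *rm = insn & 0xF;          /* second source operand, bits 3 to 0 */
}
```

For 0xE0810000 this gives Rd = 0, Rn = 1, Rm = 0, which is add r0, r1, r0.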
+
+Moving on...
+
+unsigned int fun ( unsigned int *a )
+{
+    return(*a+8);
+}
+
+00000000 <fun>:
+   0:   e5900000    ldr r0, [r0]
+   4:   e2800008    add r0, r0, #8
+   8:   e12fff1e    bx  lr
+
+
+So we saw an ldr before with the special program counter register, this
+one uses a general purpose register, r0, in both places.  The thing in
+the [brackets] is the address to load from and the other register is
+where to put the thing read from memory.  In this case the C code
+passes an address to a 32 bit thing as its only parameter.  We know
+that parameter, the address a, will be passed in in r0, so we read
+from that address.  Since we are not going to need the address a again
+and we want to use r0 for our return value, we can trash r0 by saving
+what we read over it and then adding 8 to it.  If we had further uses
+for a (we will do this kind of thing later) then the compiler would
+have had to use a different register for a while, then eventually
+computed the return value and put it in r0.
+
+lets do that now...
+
+
+unsigned int fun ( unsigned int *a )
+{
+    return(a[0]+a[11]+8);
+}
+
+00000000 <fun>:
+   0:   e5902000    ldr r2, [r0]
+   4:   e590302c    ldr r3, [r0, #44]   ; 0x2c
+   8:   e0820003    add r0, r2, r3
+   c:   e2800008    add r0, r0, #8
+  10:   e12fff1e    bx  lr
+
+Since we will need the pointer/address a again we cannot trash it on
+that first read.  This brings up a point in the calling convention
+that will become obvious in the near future.  The calling convention,
+as far as we care so far, says that we can trash/destroy r0-r3 in our
+function without causing any harm to anyone else.  Other registers
+we will see that we need to preserve so that when we return they
+contain the same value they had when the caller called our function.
+
+If you had looked at the ldr encoding you may have realized that the
+  ldr r2,[r0]
+encoding actually has an offset of zero
+  ldr r2,[r0,#0]
+Quite literally an instruction like this
+  ldr r3,[r0,#44]
+means take the number in r0, add 44 to it then use that as an address to
+read from memory.
+
+This program said to get a[11], and a points at 32 bit integers so that
+means 4 bytes per item, so 11*4 = 44.  Take the address a, add 44 and
+read that item.  The compiler here can now trash r0 if it wants, we dont
+need it anymore as the address a, but we will need to get our answer in
+it in an efficient manner if we are optimizing, which we are.  r2 for
+the moment we cannot trash because it holds the first item we need to
+add together, a[0].  r1, r0, and r3 are fair game, the compiler chooses
+r3, we dont know why it just did.  So we have three things we need to
+add: a[0], a[11] and the immediate 8.  The add instruction can only add
+two things at a time so the compiler chose to add a[0] and a[11] then
+take that result and add 8.
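If it helps, here is the same thing written back in C one instruction
at a time.  This is only a model to show the address arithmetic, the
names load_word and fun_steps are invented for it, and it assumes
unsigned int is 32 bits (so uint32_t is used to make that explicit).

```c
#include <stdint.h>
#include <string.h>

/* Model of "ldr rd, [rn, #offset]": add a byte offset to an address
   and read the 32 bit word found there. */
uint32_t load_word(const void *base, uint32_t byte_offset)
{
    uint32_t word;
    memcpy(&word, (const unsigned char *)base + byte_offset, sizeof word);
    return word;
}

/* What the compiled fun() does, one line per instruction.  The local
   variables are named after the registers the compiler used. */
uint32_t fun_steps(const uint32_t *a)
{
    uint32_t r2 = load_word(a, 0);                     /* ldr r2, [r0]      */
    uint32_t r3 = load_word(a, 11 * sizeof(uint32_t)); /* ldr r3, [r0, #44] */
    uint32_t r0 = r2 + r3;                             /* add r0, r2, r3    */
    return r0 + 8;                                     /* add r0, r0, #8    */
}
```

The offset is in bytes, not array elements, which is where the 44 in the
disassembly comes from.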
+
+The compiler could have just as well done something like this
+
+ldr r1,[r0]
+ldr r0,[r0, #44]    ; 0x2c
+add r0,r0,r1
+add r0,r0,#8
+bx  lr
+
+and it would have worked just as well.  There are algorithms that
+compilers use to determine which registers they can use, which they
+cant, and which they can but it takes more work.  In this case for the
+first instruction the easy to use ones were r1,r2,r3.  Then for the
+second instruction r0,r1,r2,r3 minus the one they used as Rd in the
+first instruction.  And in the third instruction Rd didnt have to be
+r0, they could have done this for example
+
+ldr r3,[r0]
+ldr r2,[r0, #44]    ; 0x2c
+add r1,r2,r3
+add r0,r1,#8
+bx  lr
+
+It would have been just as good as the other combinations.



--- a/learnasmfromc/test.c
+++ b/learnasmfromc/test.c
@@ -1,5 +1,5 @@

-unsigned int fun ( unsigned int a, unsigned int b )
+unsigned int fun ( unsigned int *a )
 {
-    return(a+b+7);
+    return(a[0]+a[11]+8);
 }