removing stuff
@@ -1,771 +0,0 @@
|
||||
|
||||
Learn Assembly Language from C
|
||||
|
||||
This is an attempt to learn assembly language by compiling simple C
|
||||
code segments and analyzing what is going on.
|
||||
|
||||
You will want to go to http://infocenter.arm.com. Along the left side
|
||||
expand ARM architecture. Then expand Reference Manuals then click on
|
||||
ARMv5 Reference Manual
|
||||
|
||||
Then in the right side of the page, low, center, click on PDF Version
|
||||
|
||||
These may or may not be direct links, if not follow the instructions
|
||||
above.
|
||||
|
||||
Reference Manuals
|
||||
http://infocenter.arm.com/help/topic/com.arm.doc.set.architecture/index.html
|
||||
|
||||
ARMv5 Reference Manual
|
||||
http://infocenter.arm.com/help/topic/com.arm.doc.subset.architecture.reference/index.html#v5
|
||||
|
||||
This might be a direct link to the pdf.
|
||||
https://silver.arm.com/download/download.tm?pv=1073121
|
||||
|
||||
You may need to create a user name and password, it only costs you
|
||||
an email address...
|
||||
|
||||
I know that the Raspberry Pi uses an ARMv6. The document they now
|
||||
call the ARMv5 Architectural Reference Manual, is a direct derivative
|
||||
of what was simply called the ARM ARM (ARM Architectural Reference
|
||||
Manual). But it probably became too complicated to try to
|
||||
cover all the architecture variations, so they just started making
|
||||
new manuals for each and this one stopped here. It is still a good
|
||||
starting point, includes the classic 32 bit instructions and the
|
||||
original 16 bit thumb instructions. A good place to build a foundation
|
||||
for the ARM instruction set.
|
||||
|
||||
I have included a copy of the build_arm script that I use to download
and build a GNU based ARM toolchain from sources. This is the toolchain
I use for my projects. I maintain the script in a different github
repo, build_gcc, so this copy may get stale; the real one is in my
build_gcc repo. I no longer run on Windows myself and have stopped
trying to maintain a Windows build script. I preferred mingw to cygwin,
but the build was possible on both. Even better, just download one of
the many pre-built toolchains out there.
|
||||
|
||||
Linux or Windows you can get a pre-built GNU toolchain here.
|
||||
|
||||
https://launchpad.net/gcc-arm-embedded
|
||||
|
||||
which should work well enough, or
|
||||
|
||||
go here
|
||||
http://www.mentor.com/embedded-software/sourcery-tools/sourcery-codebench/editions/lite-edition/
|
||||
Under ARM Processors select the Download the EABI Release link, then fill
in your email/whatever. They will email you a link to download the
Linux or Windows version. This formerly-codesourcery-now-mentor-graphics
lite version is more complete than the one I use.
|
||||
|
||||
Now we can start. We have to master the compiler first; it is quite
easy to let the optimizer in the compiler remove your code, for example
|
||||
|
||||
|
||||
void fun ( void )
|
||||
{
|
||||
unsigned int a;
|
||||
unsigned int b;
|
||||
unsigned int c;
|
||||
a = 5;
|
||||
b = 7;
|
||||
c = a+b;
|
||||
}
|
||||
|
||||
Assuming you have your toolchain in place this is how we are going to
|
||||
learn asm for every example:
|
||||
|
||||
arm-none-eabi-gcc -O2 test.c -c -o test.o
|
||||
arm-none-eabi-objdump -D test.o
|
||||
|
||||
you may have to adjust the prefix to -gcc and -objdump
|
||||
arm-none-linux-gnueabi or arm-elf or whatever...it should all work fine
|
||||
|
||||
------ cut -------
|
||||
test.o: file format elf32-littlearm
|
||||
|
||||
|
||||
Disassembly of section .comment:
|
||||
|
||||
00000000 <.comment>:
|
||||
0: 43434700 movtmi r4, #14080
|
||||
------ cut -------
|
||||
|
||||
Here is the problem. There is no .text. The stuff it did disassemble
wasn't really code, it was some text, and the disassembler just chewed
on it anyway. What happened is that the function DOES NOTHING. Think
about it: there are no inputs, there are no outputs, it calls no
functions, and the math it does means nothing because the result is
sent nowhere.
|
||||
|
||||
So lets try this
|
||||
|
||||
unsigned int fun ( void )
|
||||
{
|
||||
unsigned int a;
|
||||
unsigned int b;
|
||||
unsigned int c;
|
||||
a = 5;
|
||||
b = 7;
|
||||
c = a+b;
|
||||
return(c);
|
||||
}
|
||||
|
||||
run those same two commands
|
||||
|
||||
Disassembly of section .text:
|
||||
|
||||
00000000 <fun>:
|
||||
0: e3a0000c mov r0, #12
|
||||
4: e12fff1e bx lr
|
||||
|
||||
So we can learn a little here, but it wasn't really what we wanted: the
addition was removed. The optimizer knew that we were simply adding
5+7=12, so it just moves the answer 12 into register r0 and the function
returns.
|
||||
|
||||
So in the ARM ARM there is a chapter titled ARM Instructions, under that
Alphabetical list of ARM instructions, and under that each instruction
has its own subsection.
|
||||
|
||||
So we start with MOV.
|
||||
|
||||
We see a diagram of the instruction bits and then some Syntax
|
||||
|
||||
MOV{<cond>}{S} <Rd>, <shifter_operand>
|
||||
|
||||
As with other things you are by now used to, {these brackets} mean
optional. Rd in this case means the destination register, and for
shifter_operand we have to dig deeper.

The condition field and the S bit we will get to later.
|
||||
|
||||
Most processors use "registers". If you look that word up in the
dictionary it talks about a book in which records of acts, events, names,
etc., are kept, or variations on that type of a register. The word
here is not really incorrectly used. It is not a book but it is a place
where we keep information, bits. Most processors have several of them;
some have only one or two, some have hundreds, and something in the
4, 8, 16, or 32 range is typical. The ARM as far as we are concerned
has 16 (the newer 64 bit ARM, which has a different instruction set,
has 31 general purpose registers).
|
||||
|
||||
Now when I talk about some processors I don't mean some ARM processors
have this and some ARM processors have that. What I mean is that
different processor architectures are different. An Intel x86 processor
is different from an ARM, is different from a MIPS, is different from
a PowerPC and so on. There are many, many different processor
architectures designed and sold by many different companies. Once you
learn one instruction set (assembly language) it is not hard to learn
a second or third and so on. They are more similar than they are
different. Yes, this does mean that ARM processors are different from
x86; they are not compatible, and you can't directly run code compiled
for one on the other.
|
||||
|
||||
So registers, we use them here, there are 16 of them in this ARM. In
|
||||
C programs we have variables we can make as many of them as we want
|
||||
(within reason) and call them what we want. We are going to stick with
|
||||
their proper names for the most part.
|
||||
|
||||
Registers in the ARM instruction set are mostly general purpose, meaning
one is the same as the other, they don't have special features or powers.
BUT...a few of them do have special features or powers, in that they
are tied to some instructions. r0-r12 are general purpose, nothing
special about them. r13 is also known as the stack pointer, to be
talked about later. For now it is really general purpose, but it is
commonly used as the stack pointer so we will assume it has that
special property. r14 is also known as the link register or lr. Think
about what happens when we call a function in C
|
||||
|
||||
a=7;
|
||||
b=5;
|
||||
printf("blah");
|
||||
c=b+a;
|
||||
|
||||
We know enough at the C level that the call to printf() means our program
|
||||
changes path, runs through all the code in printf(), then comes back
|
||||
to the line after we called printf.
|
||||
|
||||
Assembly works the same way but of course much simpler, lower level.
|
||||
We call a function with the bl instruction which you can look up.
|
||||
Branch and Link. We have to talk about r15 for a second first. R15
|
||||
in this ARM is the program counter or PC. It is the register that keeps
|
||||
track of where we are in our program. It keeps the address of the
|
||||
instructions we are fetching and executing. Thinking in C in the code
|
||||
above the pc would be the line number perhaps, keeping track of where
|
||||
we are. Now when you call a function that you expect to return from
|
||||
you need to do two things. You need to save the address of the
|
||||
line/instruction after your branch, and you need to then branch to the
|
||||
code where the function you are calling lives. (branch, jump, goto, all
|
||||
the same thing)
|
||||
|
||||
So to return from a function we need to put the address of the instruction
after the call back in the pc so that execution branches back to where
we were before the function call. Basically the end of printf() needs
to point the pc back to the c=b+a; line. Now since you can call printf()
from a zillion different places you can't hardcode that return address
in printf, it has to be more flexible. So r14, a.k.a. lr, is used.
When we call a function lr will contain the return address.
|
||||
|
||||
|
||||
00000000 <fun>:
|
||||
0: e3a0000c mov r0, #12
|
||||
4: e12fff1e bx lr
|
||||
|
||||
So our program is moving the value 12 into r0 and then returning to
the calling function at the return address in lr. The old way to do
the return was
|
||||
|
||||
mov pc,lr
|
||||
|
||||
And for what we are doing for now that is fine. But then ARM created
this thumb instruction set thing where the instructions are 16 bits
instead of 32, basically a completely different instruction set, and
to bounce between thumb and arm mode you use the BX instruction. You can
look that up in your manual. It may be a little confusing in the manual
depending on how they worded it. And some of the manuals are kinda wrong;
you may already know that there is no perfect programmer's reference
manual, and ARM is no different. We may run across some errors.
|
||||
|
||||
The traditional ARM instructions are 32 bits wide, and as a rule they
must be on aligned addresses, basically a multiple of 4 bytes, so
0x0, 0x4, 0x8, 0xC, 0x10, and so on. The lower two address bits must
be zero for ARM instructions. The traditional thumb instructions
are 16 bits wide and must be aligned so the lower bit is always zero,
0x0, 0x2, 0x4, 0x6, 0x8, and so on. What they did for the BX
instruction is this: if the register you give it, the lr in this case,
has an lsbit of 1 then the processor knows the target is thumb code; it
strips that lsbit off (makes it a zero) and starts fetching thumb
instructions at that address. If the register specified in the bx
instruction contains an address with an lsbit of zero, then the bx
instruction puts that value in the PC and starts fetching ARM
instructions in ARM mode. The beauty of this is that the bl instruction
does the complement: if you were in thumb mode then the lr is loaded
with the return address|1, the lsbit is set. If the bl happens in arm
mode then the lsbit is not set in the lr.
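To make the lsbit rule concrete, here is a small C model of what bx and
bl do with that bit. This is only a sketch: the struct and the function
names (cpu_state, model_bx, model_bl_lr) are made up for illustration,
and it ignores everything else a real processor does.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of the two pieces of state that bx changes. */
struct cpu_state {
    uint32_t pc;     /* address the next instruction is fetched from */
    bool     thumb;  /* true = thumb mode, false = ARM mode */
};

/* bx Rm: the lsbit of the register picks the mode and is then cleared. */
struct cpu_state model_bx(uint32_t rm)
{
    struct cpu_state s;
    s.thumb = (rm & 1u) != 0;
    s.pc    = rm & ~1u;
    return s;
}

/* Value bl leaves in lr: the address of the instruction after the bl,
   with the lsbit set when the caller was running in thumb mode. */
uint32_t model_bl_lr(uint32_t next_instr_addr, bool caller_in_thumb)
{
    return caller_in_thumb ? (next_instr_addr | 1u) : next_instr_addr;
}
```

Feeding the value from model_bl_lr() into model_bx() gets you back to
the instruction after the call, in the mode the caller was in.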
|
||||
|
||||
The mov pc,lr instruction simply copies the value in lr into pc.
These are shortcut names; you can also write
|
||||
|
||||
mov r15,r14
|
||||
|
||||
Some folks would call this intel syntax (vs att). This is an ARM; it
is neither intel nor att nor anything else. I prefer this style where
the destination is on the left (with an exception of course). Replace
that comma with an equals sign
|
||||
|
||||
mov r15=r14
|
||||
|
||||
when you read that code, I am putting the thing on the right into the
|
||||
thing on the left.
|
||||
|
||||
The way the ARM processors work, the mov instruction does not work like
the bx; it simply copies the register. If you were in thumb mode, let's
say, and you called a function in ARM mode that did a mov pc,lr, you
would make the processor very upset because it won't fetch an
instruction at an unaligned address (more on aligned and unaligned
later).
|
||||
|
||||
So our little two line program which didnt do what we wanted was still
|
||||
quite the talking point.
|
||||
|
||||
00000000 <fun>:
|
||||
0: e3a0000c mov r0, #12
|
||||
4: e12fff1e bx lr
|
||||
|
||||
One more thing and we are done with this one. For the #12 on the right
there, as we saw above, think of the comma as an equals: r0=#12. The #
is just a syntax thing to help the assembler parse our code, just like
brackets and semicolons and such are used in C to help the parser and
the human keep track of things.
|
||||
|
||||
Now forgetting about thumb for now, the ARM instruction set is known as
a fixed length instruction set. All of the ARM instructions are
32 bits, no more, no less. Other instruction sets like the Intel x86
are variable length instruction sets. You can have instructions as
small as one byte and some that are many bytes long. There are pros
and cons to each approach. One of the cons of fixed length
instructions shows up here, and it is worse when the length of the
instruction is the size of a register. Explain how you would encode
0xABCD1234 into a single instruction
|
||||
|
||||
mov r0,#0xABCD1234
|
||||
|
||||
and have some other bits there to tell the processor this is a mov and
the destination is r0? The answer is you can't. ARM's approach is
confusing at first, but in this case the value 12 fits in the bits we
have, so
|
||||
mov r0,#12
|
||||
fits in a single instruction.
|
||||
|
||||
This number at the end there is called an immediate. The bit pattern
for that 12 is encoded in the instruction itself, in the immediate
vicinity if you will.
|
||||
|
||||
Lets make the compiler deal with an immediate that ARM cannot encode
|
||||
in a single instruction. since I happen to know how ARM does this I
|
||||
can pick one at will...
|
||||
|
||||
unsigned int fun ( void )
|
||||
{
|
||||
unsigned int a;
|
||||
unsigned int b;
|
||||
unsigned int c;
|
||||
a = 0x1200;
|
||||
b = 0x0034;
|
||||
c = a+b;
|
||||
return(c);
|
||||
}
|
||||
|
||||
|
||||
00000000 <fun>:
|
||||
0: e59f0000 ldr r0, [pc] ; 8 <fun+0x8>
|
||||
4: e12fff1e bx lr
|
||||
8: 00001234 andeq r1, r0, r4, lsr r2
|
||||
|
||||
You should know this, but in case you don't: not all compilers produce
the same machine/assembly code from the same high level language.
You might actually get a different answer here than I do depending
on your compiler. And I don't just mean gcc vs clang vs borland vs
microsoft. You can easily have gcc produce things different ways
depending on the command line settings or the version of gcc you are
using and so on. So just because I happen to get these results for this
code today doesn't mean you will. You just have to roll with what my
compiler is producing and then figure out what yours is doing later.
|
||||
|
||||
What they did here is know that they couldnt encode a
|
||||
|
||||
mov r0,#0x1234
|
||||
|
||||
into a single instruction. That is invalid, and the assembler will
complain if you try. So they put that 32 bit number 0x00001234 in some
memory location, then they said read this 32 bit thing from memory and
put all 32 bits in r0. With that technique they can have any 32 bit
pattern they want.
|
||||
|
||||
A beauty of a fixed length instruction set is that you can be lazy with
your disassembler: you can assume everything is an instruction and
just disassemble it. So the andeq r1 stuff is not real. It is not
an instruction, it is our 0x00001234 data. If you happened to have
that andeq instruction just like that then the machine code would
be 0x00001234. So sometimes with these ARM disassemblers you have
to just know which is instructions and which is data.
|
||||
|
||||
Now the ldr r0,[pc], that is a real instruction. Ldr means load register,
or load into a register the value at some address. The address, for
syntax parsing and human readable purposes, is in [brackets], and in
this case it is the program counter. So get the address that is in
the program counter, read from memory at that address, and place that
value in r0.
|
||||
|
||||
Now you should be asking: isn't the pc the address of our instruction,
so shouldn't that load 0xe59f0000 instead of 0x00001234? Well, this is
one of those pipeline things you may have heard of. These days the pipe
is actually deeper, and the rule is kept for backward compatibility; we
just have to know it. For ARM the rule is that whenever you use the pc
in an instruction it points two instructions ahead, that is, at the
address after the next instruction. In this case the bx lr is the next
instruction, so while we are in the instruction at address 0 the pc is
pointing at address 8, the pc contains 0x00000008, two ahead. So this
is actually loading from memory at address 0x00000008, which holds the
value 0x00001234, and puts that 0x00001234 into r0.
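The two-ahead rule is easy to write down in C. This is just a sketch of
the arithmetic for ARM mode (4 byte instructions); the function names
are mine and not from any manual or toolchain.

```c
#include <stdint.h>

/* In ARM mode, reading the pc inside an instruction gives the address
   of that instruction plus 8 (two 4 byte instructions ahead). */
uint32_t arm_pc_as_read(uint32_t instr_addr)
{
    return instr_addr + 8u;
}

/* Address that "ldr rX, [pc, #offset]" located at instr_addr loads
   from. A plain "ldr rX, [pc]" is the offset == 0 case. */
uint32_t arm_pc_relative_address(uint32_t instr_addr, int32_t offset)
{
    return arm_pc_as_read(instr_addr) + (uint32_t)offset;
}
```

For the ldr at address 0 above, arm_pc_relative_address(0, 0) gives 8,
which is where the 0x00001234 word sits.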
|
||||
|
||||
Moving on.
|
||||
|
||||
The problem with the code so far, from a "how do I see an add"
perspective, is that we told the compiler what the inputs to the
addition are, so the compiler can do that addition for us at compile
time instead of at runtime. If we want to see the compiler generate
an add operation then we have to hide the operands from it by making
them inputs to the function:

unsigned int fun ( unsigned int a, unsigned int b )
{
    return(a+b);
}

We also need this function to do something, so it has to return
something as well, and to see the add it has to return the sum or
something derived from it. Otherwise the addition serves no purpose
and will be removed as dead code.
|
||||
|
||||
So the above generates
|
||||
|
||||
00000000 <fun>:
|
||||
0: e0800001 add r0, r0, r1
|
||||
4: e12fff1e bx lr
|
||||
|
||||
Okay, I should have said something by now, but here goes. Compilers use
a calling convention in order to manage the code being generated. It
is up to the compiler at the end of the day what that convention is.
Some processor families will try to encourage or dictate the calling
convention, assuming they know more about their processor and how
it interacts with compiled code. Sometimes they don't, and the same
processor may have different conventions from different compilers or
versions of compilers. Naturally objects made with different conventions
won't necessarily link together and run.
|
||||
|
||||
A calling convention, by this or other terms, is a list of rules if you
will for knowing where to find the inputs to a function, where to place
the output, in some cases where to find the return address, and so on.
|
||||
|
||||
In the case of ARM, for these simple 32 bit variables the first variable,
a in this case, will always be in the r0 register, the second in r1, and
so on up to r3. After those four registers (r0,r1,r2,r3) are used, the
stack holds the rest. We will get to the stack later, and we may get
to more complicated situations where the r0-r3 usage gets more confusing.
|
||||
|
||||
For the time being the function parameters are in r0,r1,r2... and the
compiler can assume this when it compiles a function. You may have
noticed we are not compiling an entire program, we are only compiling
one function into an object. Yet the compiler knows where
the operands are because it always uses the same set of rules. In this
case a comes into the function in r0, b into the function in r1.
|
||||
|
||||
This add operation like many of the arm math operations can be read
|
||||
by your mind this way
|
||||
|
||||
add r0,r0,r1 ; the syntax
|
||||
r0=r0+r1 ; what you should see/think
|
||||
|
||||
Now I left this out before, but these functions are returning something
as well. The calling convention, so much as we need to know for now,
puts the return value in r0. Just like every caller built by this
compiler places operands in the same registers before calling our
function, we place the return value in a known place so the function
we return to can find it.
|
||||
|
||||
So we have successfully prevented the compiler from optimizing out our
addition as dead code by hiding the inputs and forcing the result as
an output.
|
||||
|
||||
lets get slightly more complicated.
|
||||
|
||||
unsigned int fun ( unsigned int a, unsigned int b )
|
||||
{
|
||||
return(a+b+7);
|
||||
}
|
||||
|
||||
00000000 <fun>:
|
||||
0: e2811007 add r1, r1, #7
|
||||
4: e0810000 add r0, r1, r0
|
||||
8: e12fff1e bx lr
|
||||
|
||||
|
||||
So the compiler chose to add 7 to b (b is held in r1 here) and then
add a to that result. We know that a+b = b+a, so why the compiler wrote
the second add as r0,r1,r0 instead of r0,r0,r1 we may never know. It
works.
|
||||
|
||||
So if you are following along in the manual you may notice our syntax
|
||||
has some handwaving going on. The mov and add both have this shifter_operand
|
||||
|
||||
ADD{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
|
||||
|
||||
|
||||
Under ADDs shifter_operand in my manual it tells me to look here
|
||||
|
||||
The options for this operand are described in Addressing
|
||||
Mode 1 - Data-processing operands on page A5-2, i
|
||||
|
||||
The one we are using in the case of these add instructions is this one
|
||||
|
||||
2. <Rm>
|
||||
See Data-processing operands - Register on page A5-8.
|
||||
|
||||
The mov instruction takes us to the same place and for that mov pc,lr
|
||||
it is also a register mov but the mov r0,#12 was an immediate
|
||||
|
||||
1. #<immediate>
|
||||
See Data-processing operands - Immediate on page A5-6.
|
||||
|
||||
Now ARM is generally pretty good at using pseudo code to tell you
|
||||
what is going on.
|
||||
|
||||
0: e3a0000c mov r0, #12
|
||||
|
||||
going back to the mov in the alphabetical listing (understand that
assembly language is generally case insensitive, you can use MOV or
mov and it works). That table of bits at the beginning of each of
these instructions is the machine code, and it is not a bad idea to
look at it. If you are learning assembly language you need to
strengthen your bit manipulation skills, hex/binary and such. So the
top 4 bits of the instruction, 31 to 28, are the condition field, to be
talked about later; for now most instructions use 0xE, which means
always execute, and we see that e at the top of our machine code. With
other instruction sets the opcode is more obvious; with ARM our
"opcode" bits are often strewn about. In this case there are two zeros
at 27 and 26, then 25 is an I. It doesn't tell us on this page, but
when we look at the shifter_operand stuff we learn that this means
Immediate or not; in this case we are using an immediate so this bit
is a 1. Then 24 to 21 is 0xD. The S bit we are not using, so it is a
zero. SBZ in this manual means should be zero, so hopefully the
assembler did that. Then Rd, our destination register, in this case
r0, so that should be a zero. And then the shifter_operand stuff. So
far, without looking at the shifter operand, our instruction looks
like this in machine code
|
||||
|
||||
111000i1101000000000ssssssssssss
|
||||
|
||||
where the s bits will be filled in once we find our shifter operand.
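If you would rather let C do the bit picking, the fields described
above can be pulled out of the 32 bit machine code with shifts and
masks. This is a minimal sketch; the struct and function names
(dp_fields, decode_dp) are made up for illustration, and it only makes
sense for data-processing instructions like mov and add.

```c
#include <stdint.h>

/* Fields of an ARM data-processing instruction (MOV, ADD, AND, ...). */
struct dp_fields {
    unsigned cond;     /* bits 31-28, 0xE = always execute */
    unsigned i_bit;    /* bit 25, 1 = shifter operand is an immediate */
    unsigned opcode;   /* bits 24-21, 0xD = MOV, 0x4 = ADD */
    unsigned s_bit;    /* bit 20 */
    unsigned rn;       /* bits 19-16, first operand (SBZ for MOV) */
    unsigned rd;       /* bits 15-12, destination register */
    unsigned shifter;  /* bits 11-0, the shifter_operand bits */
};

struct dp_fields decode_dp(uint32_t instr)
{
    struct dp_fields f;
    f.cond    = (instr >> 28) & 0xFu;
    f.i_bit   = (instr >> 25) & 0x1u;
    f.opcode  = (instr >> 21) & 0xFu;
    f.s_bit   = (instr >> 20) & 0x1u;
    f.rn      = (instr >> 16) & 0xFu;
    f.rd      = (instr >> 12) & 0xFu;
    f.shifter = instr & 0xFFFu;
    return f;
}
```

Calling decode_dp(0xe3a0000c) on our mov r0, #12 gives cond 0xE,
opcode 0xD, Rd 0 and 0x00C in the shifter bits.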
|
||||
|
||||
Your manual may be slightly different mine says
|
||||
|
||||
A5.1.3 Data-processing operands - Immediate
|
||||
|
||||
And of course there is another diagram of the instruction this one
|
||||
a bit more generic it doesnt know we are doing a move and the Rn field
|
||||
is there where we had a SBZ for MOV. That is all okay this is the
|
||||
same encoding just generic to show us how to use an immediate with
|
||||
the various instructions that can use this immediate encoding.
|
||||
|
||||
We first see that bit 25 is a 1, we didnt know that before we now have
|
||||
|
||||
11100011101000000000ssssssssssss
|
||||
|
||||
And that so far matches what the compiler/assembler generated
|
||||
|
||||
1110 0011 1010 0000 0000 ssssssssssss
|
||||
|
||||
0xE3A00...
|
||||
|
||||
|
||||
shifter_operand = immed_8 Rotate_Right (rotate_imm * 2)
|
||||
if rotate_imm == 0 then
|
||||
shifter_carry_out = C flag
|
||||
else /* rotate_imm != 0 */
|
||||
shifter_carry_out = shifter_operand[31]
|
||||
|
||||
so lets work backwards from what the assembler produced. It has 0x00C
for those lower 12 bits, so that is 4 bits of rotate_imm, bits 11 to 8,
and 8 bits of immed_8, bits 7 to 0. For this encoding
rotate_imm = 0000 = 0x0
immed_8 = 00001100 = 0x0C
|
||||
|
||||
|
||||
shifter_operand = immed_8 Rotate_Right (rotate_imm * 2)
|
||||
|
||||
since rotate_imm is zero in our case then it doesnt rotate we simply
|
||||
get shifter_operand = immed_8 = 0x0C which is a 12 decimal which is
|
||||
what we wanted
|
||||
|
||||
0: e3a0000c mov r0, #12
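The shifter_operand line of that pseudo code translates to C almost
word for word. This is a sketch of just the value calculation (it
ignores shifter_carry_out); the function names are mine.

```c
#include <stdint.h>

/* Rotate a 32 bit value right by n bits (n is taken modulo 32). */
uint32_t rotate_right32(uint32_t value, unsigned n)
{
    n &= 31u;
    if (n == 0)
        return value;            /* avoid an undefined shift by 32 */
    return (value >> n) | (value << (32u - n));
}

/* Expand the lower 12 bits of a data-processing immediate instruction:
   shifter_operand = immed_8 Rotate_Right (rotate_imm * 2) */
uint32_t decode_arm_immediate(uint32_t shifter_bits)
{
    uint32_t rotate_imm = (shifter_bits >> 8) & 0xFu;  /* bits 11-8 */
    uint32_t immed_8    = shifter_bits & 0xFFu;        /* bits 7-0  */
    return rotate_right32(immed_8, rotate_imm * 2u);
}
```

decode_arm_immediate(0x00C) returns 12, the value our mov puts in r0.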
|
||||
|
||||
What we have learned from this exercise is that for this immediate
encoding type, which is used by many ARM instructions, we can basically
have immediates made of any 8 bit value rotated an even number of
bits. A rotate right in this case means shift the bits to the right,
and the rotate means the bits that fall off the end on the right
fill in the bits on the left. So the number 0x1234 does not work: from
its first one bit to its last one bit it spans more than 8 bits

   123456789-1
0001001000110100

11 significant bits.
|
||||
|
||||
what about 0x122?

       12345678
0000000100100010

unsigned int fun ( void )
{
    return(0x122);
}

00000000 <fun>:
   0: e59f0000 ldr r0, [pc] ; 8 <fun+0x8>
   4: e12fff1e bx lr
   8: 00000122 andeq r0, r0, r2, lsr #2

Nope, doesn't quite work. The pattern is only 8 bits wide, but there is
1 zero bit behind it, and we can't get there by rotating an 8 bit value
an even number of bits. But we can try
|
||||
|
||||
   12345678
0001001000100
|
||||
|
||||
0x244
|
||||
|
||||
unsigned int fun ( void )
|
||||
{
|
||||
return(0x244);
|
||||
}
|
||||
|
||||
00000000 <fun>:
|
||||
0: e3a00f91 mov r0, #580 ; 0x244
|
||||
4: e12fff1e bx lr
|
||||
|
||||
yes, worked... the lower 12 bits are 0xF91
|
||||
|
||||
so rotate_imm is 1111 and immed_8 is 0x91
|
||||
|
||||
so we need to rotate 0x91 right 0xF*2 times. which is 30 times.
|
||||
|
||||
30 times right is the same as 2 left so
|
||||
|
||||
00000...0010010001
|
||||
0000...00100100010 one bit
|
||||
000...001001000100 two bits
|
||||
|
||||
and there is our 0x00000244
|
||||
|
||||
Pretty cool. Now you know what I know. A really cool one is something
|
||||
like
|
||||
|
||||
unsigned int fun ( void )
|
||||
{
|
||||
return(0x10000001);
|
||||
}
|
||||
|
||||
00000000 <fun>:
|
||||
0: e3a00211 mov r0, #268435457 ; 0x10000001
|
||||
4: e12fff1e bx lr
|
||||
|
||||
taking full advantage of the rotate around the end.
|
||||
|
||||
0x11 rotated right 2*2 bits or 4 bits
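Going the other way, here is a sketch of how you could check in C
whether a value fits this encoding: try all 16 rotations and see if
what is left fits in 8 bits. The function names are made up, and I am
assuming the lowest rotation that works is the one to use, which
matches the encodings in the disassembly above; a real assembler may go
about it differently.

```c
#include <stdint.h>
#include <stdbool.h>

/* Rotate a 32 bit value left by n bits (n is taken modulo 32). */
uint32_t rotate_left32(uint32_t value, unsigned n)
{
    n &= 31u;
    if (n == 0)
        return value;            /* avoid an undefined shift by 32 */
    return (value << n) | (value >> (32u - n));
}

/* Try to express value as immed_8 rotated right by rotate_imm*2.
   On success store the 12 bit field (rotate_imm in bits 11-8,
   immed_8 in bits 7-0) in *shifter_bits and return true. */
bool encode_arm_immediate(uint32_t value, uint32_t *shifter_bits)
{
    for (unsigned rotate_imm = 0; rotate_imm < 16; rotate_imm++) {
        /* Undo the rotate right by rotating left the same amount. */
        uint32_t immed_8 = rotate_left32(value, rotate_imm * 2u);
        if (immed_8 <= 0xFFu) {
            *shifter_bits = ((uint32_t)rotate_imm << 8) | immed_8;
            return true;
        }
    }
    return false;                /* needs a load from memory instead */
}
```

With this, 0x244 comes out as 0xF91 and 0x10000001 as 0x211, while
0x1234 is rejected.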
|
||||
|
||||
So what if we and instead of add
|
||||
|
||||
unsigned int fun ( unsigned int a, unsigned int b )
|
||||
{
|
||||
return(a&b);
|
||||
}
|
||||
|
||||
00000000 <fun>:
|
||||
0: e0000001 and r0, r0, r1
|
||||
4: e12fff1e bx lr
|
||||
|
||||
with the add we had
|
||||
|
||||
0: e0800001 add r0, r0, r1
|
||||
|
||||
the difference being the opcode bits: one says do an add, the other do
an and. The rest of the instruction is the same because all of the
other operands are the same, r0,r0,r1.
|
||||
|
||||
And if you have not done this already, look up the register
encoding for "shifter_operand"; in my book it is in section
|
||||
|
||||
A5.1.4 Data-processing operands - Register
|
||||
|
||||
those lower 12 bits are mostly zeros with the last four being that
|
||||
third register.
|
||||
|
||||
be careful though, they have a little swizzle in there
|
||||
|
||||
4: e0810000 add r0, r1, r0
|
||||
|
||||
the destination register Rd, where the result goes, is in bits 15 to 12
and the first operand, Rn, is in bits 19 to 16. So you can't just read
left to right in the machine code and see the three registers in syntax
order. You have to know that for this flavor of encoding the machine
code holds the middle register from the syntax (Rn) first, then the
first (Rd), and then at the end the last/third register from the
instruction.
|
||||
|
||||
Moving on...
|
||||
|
||||
unsigned int fun ( unsigned int *a )
|
||||
{
|
||||
return(*a+8);
|
||||
}
|
||||
|
||||
00000000 <fun>:
|
||||
0: e5900000 ldr r0, [r0]
|
||||
4: e2800008 add r0, r0, #8
|
||||
8: e12fff1e bx lr
|
||||
|
||||
|
||||
So we saw an ldr before with the special program counter register; this
one uses the general purpose register r0 in both places. The thing in
the [brackets] is the address to load from and the other register is
where to put the thing read from memory. In this case the C code passes
the address of a 32 bit thing as its only operand. We know that
parameter, the address a, will be passed in in r0, so we read from that
address. Since we are not going to need the address a again and we want
to use r0 for our return value, we can trash r0 by saving what we read
over it and then adding 8 to it. If we had further uses for a (we will
do this kind of thing later) then the compiler would have had to use a
different register for a while, then eventually compute the return
value and put it in r0.
|
||||
|
||||
lets do that now...
|
||||
|
||||
|
||||
unsigned int fun ( unsigned int *a )
|
||||
{
|
||||
return(a[0]+a[11]+8);
|
||||
}
|
||||
|
||||
00000000 <fun>:
|
||||
0: e5902000 ldr r2, [r0]
|
||||
4: e590302c ldr r3, [r0, #44] ; 0x2c
|
||||
8: e0820003 add r0, r2, r3
|
||||
c: e2800008 add r0, r0, #8
|
||||
10: e12fff1e bx lr
|
||||
|
||||
Since we will need the pointer/address a again we cannot trash it on
that first read. This brings up a point in the calling convention
that will become obvious in the near future. The calling convention,
as far as we care for now, says that we can trash/destroy r0-r3 in our
function without causing any harm to anyone else. Other registers,
we will see, need to be preserved so that when we return they
contain the same value they had when the caller called our function.
|
||||
|
||||
If you had looked at the ldr encoding you may have realized that the
|
||||
ldr r2,[r0]
|
||||
encoding actually has an offset of zero
|
||||
ldr r2,[r0,#0]
|
||||
Quite literally an instruction like this
|
||||
ldr r3,[r0,#44]
|
||||
means take the number in r0, add 44 to it then use that as an address to
|
||||
read from memory.
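The same address arithmetic can be spelled out in C. This is only a
sketch to show where the 44 comes from for a[11]; the helper names
(word_offset, load_word_at) are made up for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Byte offset of element index in an array of 32 bit words. */
size_t word_offset(size_t index)
{
    return index * sizeof(uint32_t);
}

/* What "ldr rD, [rN, #offset]" does, written with byte arithmetic:
   take the base address, add the byte offset, read 32 bits there. */
uint32_t load_word_at(const uint32_t *base, size_t byte_offset)
{
    const unsigned char *address = (const unsigned char *)base + byte_offset;
    uint32_t value;
    memcpy(&value, address, sizeof value);
    return value;
}
```

word_offset(11) is 44, so load_word_at(a, 44) reads the same word that
a[11] does.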
|
||||
|
||||
This program said to get a[11], and a points at 32 bit integers, so
that means 4 bytes per item and 11*4 = 44. Take the address a, add 44
and read that item. The compiler can now trash r0 if it wants; we don't
need it anymore as the address a, but we will need to get our answer
into it in an efficient manner if we are optimizing, which we are. r2
for the moment we cannot trash because it holds the first item we need
to add, a[0]. r1, r0, and r3 are fair game; the compiler chose r3, we
don't know why, it just did. So we have three things we need to add:
a[0], a[11] and the immediate 8. The add instruction can only add two
things at a time, so the compiler chose to add a[0] and a[11], then take
that result and add 8.
|
||||
|
||||
The compiler could have just as well done something like this
|
||||
|
||||
ldr r1,[r0]
|
||||
ldr r0,[r0, #44] ; 0x2c
|
||||
add r0,r0,r1
|
||||
add r0,r0,#8
|
||||
bx lr
|
||||
|
||||
and it would have worked just as well. There are algorithms that
compilers use to determine which registers they can use, which they
can't, and which they can but with more work. In this case, for the
first instruction the easy to use ones were r1,r2,r3. For the second
instruction it was r0,r1,r2,r3 minus the one used as Rd in the first
instruction. And for the third instruction Rd didn't have to be r0,
they could have done this for example
|
||||
|
||||
ldr r3,[r0]
|
||||
ldr r2,[r0, #44] ; 0x2c
|
||||
add r1,r2,r3
|
||||
add r0,r1,#8
|
||||
bx lr
|
||||
|
||||
And it would have been just as good as the other combinations.
|
||||
@@ -1,42 +0,0 @@
|
||||
|
||||
# Usage
|
||||
# sudo ./build_arm
|
||||
|
||||
# Setup vars
|
||||
export TARGET=arm-none-eabi
|
||||
export PREFIX=/opt/gnuarm
|
||||
export PATH=$PATH:$PREFIX/bin
|
||||
export JN
|
||||
export JN='-j 8'
|
||||
|
||||
rm -rf build-*
|
||||
rm -rf gcc-*
|
||||
rm -rf binutils-*
|
||||
|
||||
# Get archives
|
||||
wget http://ftp.gnu.org/gnu/binutils/binutils-2.23.2.tar.bz2
|
||||
wget http://ftp.gnu.org/gnu/gcc/gcc-4.8.2/gcc-4.8.2.tar.bz2
|
||||
|
||||
# Extract archives
|
||||
bzip2 -dc binutils-2.23.2.tar.bz2 | tar -xf -
|
||||
bzip2 -dc gcc-4.8.2.tar.bz2 | tar -xf -
|
||||
|
||||
# Build binutils
|
||||
mkdir build-binutils
|
||||
cd build-binutils
|
||||
../binutils-2.23.2/configure --target=$TARGET --prefix=$PREFIX
|
||||
echo "MAKEINFO = :" >> Makefile
|
||||
make $JN all
|
||||
sudo make install
|
||||
|
||||
# Build GCC
|
||||
mkdir ../build-gcc
|
||||
cd ../build-gcc
|
||||
../gcc-4.8.2/configure --target=$TARGET --prefix=$PREFIX --without-headers --with-newlib --with-gnu-as --with-gnu-ld --enable-languages='c'
|
||||
make $JN all-gcc
|
||||
sudo make install-gcc
|
||||
|
||||
# Build libgcc.a
|
||||
make $JN all-target-libgcc CFLAGS_FOR_TARGET="-g -O2"
|
||||
sudo make install-target-libgcc
|
||||
|
||||
@@ -1,5 +0,0 @@
|
||||
|
||||
unsigned int fun ( unsigned int *a )
|
||||
{
|
||||
return(a[0]+a[11]+8);
|
||||
}
|
||||