From 1aac22342945a4a99e1b20bd0289ff297a9bb6e2 Mon Sep 17 00:00:00 2001
From: dwelch67 <dwelch@dwelch.com>
Date: Sat, 20 Sep 2014 09:47:02 -0400
Subject: [PATCH] giving up on that

---
 bare_metal_rev_two/ARM_TOOLS |  150 ----
 bare_metal_rev_two/README    | 1336 ----------------------------------
 2 files changed, 1486 deletions(-)
 delete mode 100644 bare_metal_rev_two/ARM_TOOLS
 delete mode 100644 bare_metal_rev_two/README

diff --git a/bare_metal_rev_two/ARM_TOOLS b/bare_metal_rev_two/ARM_TOOLS
deleted file mode 100644
index 7735159..0000000
--- a/bare_metal_rev_two/ARM_TOOLS
+++ /dev/null
@@ -1,150 +0,0 @@
-
-If you have not figured it out yet there are different processors
-out there.  Like people some folks speak spanish, french, english,
-etc even though we are all people.  Some processors use one
-instruction set others use another.  If you are programming on an
-x86 computer the native compiler compiles code for x86 which is not
-compatible with ARM.  So you have two choices find an ARM computer
-and use its native compiler or use what is called a cross compiler
-one that generates programs that are not native.
-
-There are other toolchains (collection of compiler tools) that will
-compile programs for ARM processors the one we care about here is
-the tools from the GNU folks http://gnu.org.  Now the problem with
-the GNU tools if you choose to call it a problem is that when you
-build these tools you have to choose the processor family, and the
-toolchain you build will only compile for that processor family.
-
-The first solution is to get another Raspberry Pi, one for running
-Linux as the foundation intended, which gives you an ARM computer
-basically and that means the native compiler tools know how to build
-ARM programs, the other Raspberry Pi is the one that you are doing
-your bare metal programming on.  Yes you could also use one Raspberry
-Pi and swap sd cards back and forth.  You can also run QEMU which
-is capable of simulating many different instruction sets and it is
-possible to run ARM Linux on anything that supports QEMU.  My Makefiles
-are not native compiler friendly but you could probably fix that
-if you take this path (ideally I am teaching you to fish not giving
-you a fish anyway so these are just examples that you then make
-your own).
-
-It is not hard to get the gnu sources and build the toolchain yourself
-using your native (gnu) compiler, well not hard until it fails to
-work.  Nevertheless I have a repository where I keep the simple
-build scripts for the cross compilers that I personally use.
-https://github.com/dwelch67/build_gcc
-I tend to use the tools I build from the gnu sources.  These scripts are
-for Linux users, they can be easily modified for Windows or MAC users
-but I long ago stopped running on those platforms and testing scripts
-like these.
-
-The easier path is to just get tools that someone else has built and
-you simply install.  These folks have tools for Windows, Linux
-and MAC.
-
-https://launchpad.net/gcc-arm-embedded
-
-Just download and install.
-
-Now if you are running one of the most recent Ubuntu distributions
-or derivatives (personally I run Linux Mint) then all you have to do
-is:
-
-apt-get install gcc-arm-linux-gnueabi
-
-and there you are, installed and ready to use.
-
-What was formerly http://codesourcery.com has now been assimilated by
-Mentor Graphics and the gnu tools they maintained still offer a Lite
-(free) version.  As well as the pay-for version, you are not necessarily
-paying for open source software but more like paying for tech support
-for open source software.  You have to wade through a few web pages
-sacrifice an email address where they send a special for you link
-to the download for the lite version you asked for.  Where I work
-we send our customers to Mentor Graphics, personally I typically use
-the ones I built, but will sometimes try out the launchpad one above
-and the apt-got one.
-
-What is abi, eabi, the difference between arm-none-eabi and
-arm-linux-gnueabi and all that?  Well much of it has to do with using
-those triplet names when building the toolchain, the gnu build system takes
-that triplet and tailors the build.  In particular it targets a
-particular operating system or operating environment for the default
-linking and libraries linked in.  We are bare metal here so we dont
-have/want an operating system and we are not going to use the default
-linker script nor are we going to link in the operating system specific
-libraries.  So long as we dont use any C library functions that
-ultimately make an operating system call (printf, fopen, etc) we can
-compile our bare metal programs using an arm cross compiler that is
-meant normally to build arm linux programs or an arm cross compiler
-that is meant to make arm binaries for other environments.  We need
-an assembler, a linker, and a compiler that makes object files and
-we will learn how to beat those tools into submission.
-
-ABI, application binary interface, is a standard that arm developed for
-compilers so they conform to arms parameter passing rules, something
-we will learn about to some extent.  EABI is the embedded abi, they
-basically changed/improved the calling convention.  Again those
-triplets are gnu specific and mean something mostly to the gnu toolchain
-build system.  And fortunately or unfortunately you can tell the
-build system my triplet is a-b-c but when you build the final binaries
-dont call them a-b-c call them d-e-f which might be some other
-triplet that further confuses folks.
-
-So as mentioned in the main text, once installed you will have an
-assembler  something-as a linker something-ld and a compiler something-gcc
-the assembler and linker come from a gnu package called binutils.
-If you have no interest in the C programming and want assembly only
-then you only need binutils, you can
-
-apt-get install binutils-arm-linux-gnueabi
-
-for example instead of getting the compiler or take my build script
-and chop off gcc and libc and just build binutils.
-
-Now whatever your triplet is called once installed you should be
-able to go to a command line (set your PATH as needed) and run
-
-arm-linux-gnueabi-as --version
-
-and get some output that indicates that it is installed and working
-
-GNU assembler (GNU Binutils for Ubuntu) 2.24
-Copyright 2013 Free Software Foundation, Inc.
-This program is free software; you may redistribute it under the terms of
-the GNU General Public License version 3 or later.
-This program has absolutely no warranty.
-This assembler was configured for a target of `arm-linux-gnueabi'.
-
-
-arm-none-eabi-as --version
-
-GNU assembler (GNU Binutils) 2.24
-Copyright 2013 Free Software Foundation, Inc.
-This program is free software; you may redistribute it under the terms of
-the GNU General Public License version 3 or later.
-This program has absolutely no warranty.
-This assembler was configured for a target of `arm-none-eabi'.
-
-same goes for the linker
-
-arm-linux-gnueabi-ld --version
-GNU ld (GNU Binutils for Ubuntu) 2.24
-Copyright 2013 Free Software Foundation, Inc.
-This program is free software; you may redistribute it under the terms of
-the GNU General Public License version 3 or (at your option) a later version.
-This program has absolutely no warranty.
-
-and gcc if you are going to use the compiler (I highly recommend you do
-but if building from sources getting the compiler to build is harder
-than binutils)
-
-arm-linux-gnueabi-gcc --version
-arm-linux-gnueabi-gcc (Ubuntu/Linaro 4.7.3-12ubuntu1) 4.7.3
-Copyright (C) 2012 Free Software Foundation, Inc.
-This is free software; see the source for copying conditions.  There is NO
-warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-
-The readme might default to arm-none-eabi-as for an example but if you
-have arm-linux-gnueabi-as installed instead you need to substitute the
-commands or for Makefiles modify the define at the top.
diff --git a/bare_metal_rev_two/README b/bare_metal_rev_two/README
deleted file mode 100644
index d75fed3..0000000
--- a/bare_metal_rev_two/README
+++ /dev/null
@@ -1,1336 +0,0 @@
-
-I am contemplating a do-over on this Raspberry Pi Bare Metal Programming
-repo.
-
-Bare Metal Programming simply means no operating system.  Although we
-could, we are not going to run off and make a gui based web browser
-or anything like that.  Bare metal is often used for things like
-booting a computer or the software that runs an alarm clock or TV
-remote control.  We are of course going to do it here for fun and
-education.  The purpose of the Raspberry Pi is education, for every
-million or so Python programmers we need a bare metal programmer.  The
-Raspberry Pi has pros and cons for use in learning bare metal
-programming.  On the pro side the peripherals are relatively easy to
-program on the con side the vendor provided documentation is far from
-the best I have seen.
-
-Most of bare metal programming has to do with things other than writing
-programs.  Reading datasheets, programmers reference manuals, schematics
-are all at the center of bare metal programming.  You dont have to be
-a computer engineer nor electrical engineer, if/when you do this
-professionally then there should be electrical engineers that you work
-very closely with, they do their thing, you do yours.  Hopefully I can
-hold your hand through the electrical part.
-
-Some assembly language programming is required for bare metal
-programming, the bulk of bare metal is C.  One nice thing about bare
-metal programming is that the programming itself does not have
-to be that complicated.  You need to have some programming experience
-here, doesnt have to be assembly language nor C although C would help.
-I will try to explain the assembly language, and the C should feel
-relatively natural for an experienced programmer, just a matter of
-syntax.
-
-My statistic above about a million to one Python to bare metal
-programmers is completely made up, but the percentage of bare metal
-programmers to other forms is a very small number.  This means for
-example the documentation we need is read by a relatively small
-number of people, it only has to be good enough, doesnt have to be
-great.  Likewise, more than the programming languages themselves
-(generally C with some assembly language) we do have to beat the
-programming tools into submission (assembler, compiler, linker) because
-we are going to use them in a way that is equally rarely used.
-
-The last word on bare metal programming in this introduction before
-we go onto what you need is that unlike programming an application
-on top of your favorite operating system, with bare metal programming
-it is possible to destroy hardware.  Sometimes you "let the smoke out"
-(the joke is there is a finite amount of smoke in chips and if you
-let even a little bit out the chip wont work) and sometimes you "brick"
-the system.  Bricking something in this context means that you have
-done something fatal to the hardware that doesnt let the smoke out
-but the board/product is not much more than a paperweight or a brick
-you might use to hold a door open.  On the good side, so far as we
-know, you cannot brick a Raspberry Pi, if your program crashes you do
-have the tools to fix it, in this case the tool is removing the sd card
-and replacing the program that crashed with one that doesnt.  With
-hardware other than the raspberry pi, there are various levels of pain
-for bricking a board sometimes you might be able to recover the board
-with a JTAG debugger.  Sometimes you can get a soldering iron out and
-remove and replace some components.  It is all part of the experience
-unfortunately.  With the raspberry pi if you are careful not to
-short anything out (dont touch the board with metal items, dont set it
-on metal items, basically dont create an electrical connection between
-any two exposed bits of metal on the board) and when connecting the
-serial interface below or other additional items we may talk about
-you dont get those connections wrong, you shouldnt have any smoke or
-bricking problems with your Raspberry Pi.  I will not take any
-responsibility for you damaging your hardware.
-
-Take a deep breath, you CAN do this...
-
-Naturally you will need a raspberry pi.  I am probably going to use
-my Model A for much of this since I added a reset button to it.  I have
-a number of Raspberry Pi boards, and for the most part this material
-should work on all of them.  If something board specific comes along,
-we will deal with it then.
-
-Looks like folks are retiring the Model A, Adafruit also showed the
-Model A as retired.
-
-https://www.sparkfun.com/products/retired/11837
-
-The Model B that is the same pc board as my Model A, but has more stuff
-on it (and costs a little more).
-
-https://www.sparkfun.com/products/11546
-
-The B+ has its led wired differently than the rest so you might have
-some first programs not work but later can catch up.
-
-Note that you dont have to sacrifice your linux install on your
-Raspberry Pi to play with bare metal, renaming a file will preserve
-that, as you will see.
-
-
-Why they didnt start from the beginning with a micro sd slot I will
-never understand, and the way the full sized sd slot sits so that
-the card hangs way out the side.  I have broken a number of sd cards
-in those slots, this little adapter board is wonderful for converting
-to a micro sd slot in a durable way.  This board is not required but
-you certainly have to have an sd card that fits in the board you are
-using.  It does not have to be a huge card (huge as in lots of
-gigabytes) in fact we will be using three fairly small files and that
-is it, early testing my old cards measured in megabytes didnt work
-for some reason, and 2GB and maybe even 4GB cards are harder and
-harder to find.  But whatever the popular size is under $10 or so
-should work just fine.
-
-https://www.sparkfun.com/products/12824
-
-I hate to do this but almost immediately you will need a serial
-interface to the Raspberry Pi to continue this tutorial.  Computers
-in general do not ship with serial ports any more, and even if they
-did you cant wire that directly up to this board, the voltage levels
-are wrong (smoke will come out somewhere).  The best solution is some
-flavor of usb to serial and it has to be 3.3V not 5.0V (smoke).  This
-cable with an integrated usb to serial built in is ideal.  You dont
-have to shop at sparkfun, in the USA it is a great place for this kind
-of stuff, and easy on the wallet as far as shipping goes, from the
-picture the wires appear to be labelled, you can probably find these
-usb to TTL 3.3v serial cables all kinds of places, ebay, etc.  They
-may not have labelled ends and if you are not experienced at electrical
-engineering and have the tools (multimeter, maybe a scope, etc) you
-dont want to just guess at it (smoke).
-
-https://www.sparkfun.com/products/12977
-
-You could go with other usb to serial and separately buy the usb
-cable and the hook up wires, but that is more expensive.  At the same
-time if you stick with bare metal programming beyond the Raspberry
-Pi, you will need tools like these in your toolbox.  A uart/serial
-port is still one of your primary debugging interfaces.
-
-https://www.sparkfun.com/products/9873
-https://www.sparkfun.com/products/9140
-
-The first documents you will need are found here
-
-You will want to go here
-http://elinux.org/RPi_Hardware
-And get the datasheet for the part
-http://www.raspberrypi.org/wp-content/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
-(might be an old link, find the one on the wiki page)
-And the schematic for the board
-http://www.raspberrypi.org/wp-content/uploads/2012/04/Raspberry-Pi-Schematics-R1.0.pdf
-http://www.raspberrypi.org/wp-content/uploads/2012/10/Raspberry-Pi-R2.0-Schematics-Issue2.2_027.pdf
-(might be an old link, find the one on the wiki page)
-As well as some documents from ARM.
-
-The Raspberry Pi is centered around the Broadcom BCM2835 media
-processor.  ARM does not make chips they sell/license the source code
-to their processor design, which is normally integrated into what is
-called an SoC or System on Chip.  Which means some useful peripherals
-are added to the chip that might historically have been on separate
-chips like a DDR (memory) controller, or a USB controller, PCIe, etc.
-For power or size or economy of scale reasons the folks that buy ARM
-processor cores generally need a processor to add to their chip and
-it is easier sometimes to buy than make your own.  Most folks dont
-realize it and think that because almost every big box computer (server,
-desktop or laptop) is Intel x86 based (or a clone) that x86 processors
-dominate the world, not realizing that that same box has many other
-processors inside, not all of them ARM's but some.  For every x86 you
-own or use you likely own or use many many ARM based products.  This
-chip from Broadcom is one of the myriad of ARM based products out there
-fighting for a space in the various niche markets.
-
-Be it an ARM based chip or some other the first thing a bare metal
-programmer needs to do is figure out which processor you have.  Simply
-stating it is an ARM processor is not remotely enough.  ARM has an ever
-growing array of processor products.  Some chip vendors are more
-helpful than others at figuring this out.  The BCM2835 document
-mentioned above would normally be the place where you would find this
-out, but in this case it does say ARM in the document but doesnt even
-say ARM11 much less arm1176jzfs.  Fortunately the Raspberry Pi
-creators and community have the wiki page above which provides the
-information we need.  ARM has at least four different cores in the
-ARM11 category this one is the ARM1176 specifically arm1176jzfs a bunch
-of letters that mean something to ARM as to the features included.  For
-us that means we can find one of the two documents we need from ARM.
-Generally you start at
-http://infocenter.arm.com
-And along the left side you find the processor series, in this case
-ARM11 processors.  Expand that and see the ARM1136, ARM1156, ARM1176
-and the MPCore.  We want ARM1176.  Our first goal here is to find
-the Technical Reference Manual, TRM, for the core we are using.  For
-the moment this is an accurate link directly to that document
-http://infocenter.arm.com/help/topic/com.arm.doc.ddi0301h/DDI0301H_arm1176jzfs_r0p7_trm.pdf
-In the preface of the TRM it gives us a hint as to the ARM ARM we need
-(ARM ARM = ARM Architecture Reference Manual).
-
-ARM Architecture Reference Manual (ARM DDI 0406)
-
-There used to be only one ARM ARM for the whole ARM world but the
-architectural differences were such that they left the original ARM ARM
-with the last architecture it supported and started creating new ones.
-So back on the left of the page expand ARM Architecture and then expand
-Reference Manuals.
-
-Unfortunately they didnt tell us in the TRM which architecture name to
-look for, so we have to fumble around a bit or do some Googling to find
-that we need the ARMv7-AR Reference Manual.  From that page it shows
-
-This manual describes the instruction set, memory model, and programmers'
-model for ARMv7 (A&R profile) compliant processors, including:
-  Cortex-A series
-  Cortex-R series
-  Qualcomm Scorpion.
-It also describes the later ARMv6 architecture releases for ARM11
-processors, and describes Thumb-2 and the TrustZone security
-extensions.
-
-If you get the manual through ARMs website they appear to require a
-login. It is free other than giving up an email address which no doubt
-you have or can create a gmail one or whatever.
-
-https://silver.arm.com/download/download.tm?pv=1603196
-
-So r0p7 means rev 0.7, the r is the major rev and the p the minor rev.
-Now hopefully the Raspberry Pi folks who provided that link gave us the
-right rev.  Just because ARM has fixed some bugs in some rev and the
-currently selling rev is some other number, any ARM based chip you are
-using is built from a specific rev of that product and there are times
-where a rev change generates different internal addressing or features
-in the chip (certainly if you have access to the errata, you need to
-be very careful to apply the correct errata to the right rev, far too
-often are workarounds applied improperly to arm code causing more
-problems for that software than solutions).  The ARM1176JZF-S has only
-the r0p7 rev of TRM.  But look at the ARM11 MPCore TRM and see there
-is an r1p0 and r2p0 and I know that if you use the wrong one there
-you can have stuff not work.  When in doubt take the newest one and
-hope for the best, if you know for sure, then even if the ARM web page
-marks that doc as Superseded, use that doc.
-
-To add to the confusion wikipedia shows that the ARM1176 is architecture
-version ARMv6Z.  The part we care about is the ARMv6 part as you will
-see soon.
-
-So what was the point of that exercise?  Well first off I gave you
-many answers for finding info, but finding that stuff on your own is
-a big part of bare metal programming.  Sometimes the TRM but usually
-the ARM ARM details the instruction set for that architecture.  And yes
-the ARM instruction sets are generally backward compatible but ARM did
-create some new instruction sets that we might talk about.  Each
-architecture adds a few or more instructions.  The original ARM ARM
-became what is now the ARMv5 reference manual which covers ARMv4 and
-ARMv5.  ARMv5 is basically the same instruction set but the processor
-added caches and an MMU which makes it significantly easier to run
-an operating system like Linux for example.  I want you to also
-download the ARMv5 Architecture Reference Manual because it is a little
-easier getting us started with booting the ARM.  We need an instruction
-set reference so we can write assembly language, we need assembly language
-so we can manage booting the processor, and we need the manual to tell us
-how the processor boots.  In ARM land the architecture manuals are the
-more common stuff across the architecture version in question (the
-instruction set), and the technical reference manual deals with specific
-processor core products within that architecture version (this one has an
-FPU that one has a cache, etc), the various ARM11 processors for example
-are different processor products basically within the ARMv6 architecture.
-
-Really, the Raspberry Pi is not a bad introduction to bare metal
-programming, but there has already been and will be more of these
-nitty gritty details to work through.  So all processors have a
-procedure they follow for booting.  The hardware folks worry about
-supplying power and a clock or clocks to the processor and releasing
-reset then the fun begins.  Processors made by different companies
-dont all follow the same rules, if you take the time to study a few
-different ones you will see that they are as similar as they are
-different.  Generally you have some sort of non-volatile (meaning
-doesnt forget when it is powered off) storage like a rom (flash) or
-hard disk or something like that which holds the code that at a
-minimum boots the processor up to the point that you can run fun
-and interesting programs.  The ARM processor used in the Raspberry
-Pi as far as the ARM is concerned after reset starts running by
-starting execution at address 0x00000000.  And that is what we care
-about.  Normally the hardware folks will make the logic around
-the ARM processor core such that when the ARM does a read from address
-0x00000000 (and a lot more addresses that follow) that the chip
-talks to some flash somewhere on or off chip to fetch the instructions.
-But there may be some other address space maybe starting at 0x40000000
-that the chip folks make read from ram.  Your x86 computer for example
-has a rom/flash with a bootloader and eventually that bootloader
-reads from a hard disk and then boots the operating system from some
-code on the hard disk that knows how to do that and so on.  This is
-all very typical a flash/rom that either contains the application or
-operating system and some ram and if the flash doesnt contain everything
-then it contains code that knows how to reach out to some other storage
-and run the application or operating system.
-
-The Raspberry Pi boot process is not what you normally find.  Now
-remember this chip was not designed to be a Raspberry Pi, it was meant
-to be some sort of tablet or phone or set top box (ROKU) type product.
-So that basically means it has video processing capabilities, and in
-this case it has a relatively powerful (for its size and price)
-graphics processor which itself is a completely independent processor
-from the ARM.  It has a completely different instruction set, it
-has some normalish instructions but then a lot of floating point
-computation capabilities and other things that help it do graphics
-processing.  Broadcom is generally extremely secretive about their
-chips, and perhaps by plan or accident or against their will the
-Raspberry Pi has drawn the proper attention to first cause the
-GPU to be reverse engineered and then later for Broadcom to open
-up a fair amount of information about that part of the chip.  I didnt
-look for this answer, but either it is built into logic or there is some
-on board flash or one time programmable rom that allows the GPU to
-boot first, before the ARM.  The GPU is what actually boots the
-Raspberry Pi.  Again either raw logic or a bootloader on chip the
-first thing that we see is the sd card is read looking for a file
-named bootcode.bin.  That is a program written in the GPU's instruction
-set.  It performs some booting tasks like initializing the DDR
-interface and other stuff.  Then comes start.elf, also GPU code.
-This is more of the embedded operating system that knows how to do
-all the GPU video processing supported by this chip in case you wanted
-to make a tablet or set top box out of this chip and wanted to play
-videos.  Then the GPU boots the ARM by going back to the sd card and
-looking for a file named kernel.img which is an ARM binary.  Although
-there are ways to change this but the default is for the GPU to place
-the bytes (ARM code) from that kernel.img file into ram (DRAM) at
-a place that is address 0x00008000 to the ARM.  So first off I thought
-you said the ARM boots at address 0x00000000, second why are you playing
-word games, the ARMs address rather than simply saying just address 0x8000.
-Well the GPU also writes to the ARM's address 0x00000000 the instruction
-or instructions needed for the ARM to jump to address 0x8000 causing
-it to run the program that was found on the sd card.  Second, another
-thing you dont normally see, is that the entire memory space is
-shared between the ARM and the GPU.  Depending on the generation
-of Raspberry Pi you might have 256MBytes or 512, but all of that is
-available to both processors almost equally.  If both processors
-try to access the same memory at the same time the GPU wins and gets
-there first the ARM is held off to wait, otherwise if the ARM won
-and the GPU waited then the video output would stutter or get messed
-up.
-
-The BCM2835 manual linked above, page 5 has a picture with three
-address spaces, VC CPU Bus Addresses (VC = Video Core or the GPU),
-ARM Physical Addresses and ARM Virtual Addresses.  The one we care
-about is the middle one the ARM Physical Addresses, but also the
-real map of the world is the left one the VC CPU Bus Addresses.
-The first thing this picture is telling us (and this is a complicated
-or perhaps at least confusing picture) is that however much RAM
-we have (I may have called it DDR or DRAM) in the system, called SDRAM
-in this picture, be it 256MBytes or 512MBytes or whatever, both the
-ARM and VC/GPU have access to all of that ram.  For the ARM that ram
-starts at ARM address 0x00000000 and goes up to whatever amount the
-system has.  In the middle it is marked as SDRAM (for the ARM) and
-VC SDRAM (optional), and there is a line in the middle that is vague,
-determined by VC platform configuration.  I dont keep track of this
-constantly for every version, but it has typically been a 50/50
-split, again something we can ask the VC/GPU bootloader to change
-but for this discussion there is no need.  So let's assume that
-if our Raspberry Pi has 512MB then 256MBytes or address 0x00000000
-to address 0x0FFFFFFF belongs to the ARM and the rest is for the GPU.
-This chart is also showing us that in the GPU's address space that
-ram is mapped certainly at address 0xC0000000 and 0x00000000 and
-0x40000000 and 0x80000000.  That may seem strange to you but it is
-very easy to do in hardware and you will see this over time in your
-career.  We dont really care about that since that is GPU side and
-we are programming the ARM.  The other information that matters here is
-that the I/O base address for the peripherals starts at 0x20000000
-in the ARM address space and that maps to the same stuff at address
-0x7E000000 in the GPU address space.  This manual uses 0x7E000000
-based addresses throughout the document, but as ARM programmers we
-need to see 0x7E001000 for example and replace the 7E with a 20 and
-instead use address 0x20001000.  Again this may all seem very strange
-to you but is not uncommon and is generally easy to do in hardware.
-So what we can see here is that the GPU has the ability to read
-the kernel.img file (because it can get to the I/O Peripherals for
-example one of which talks to the sd card) and it can copy that
-data into its memory at 0xC0008000 which instantly becomes the
-ARMs memory at address 0x00008000 since it is the same physical
-memory.  Then the GPU can write an instruction or two to its
-address 0xC0000000 which is ARM's address 0x00000000 that will tell
-the ARM processor to jump to address 0x8000.  In addition since
-this platform is intended to run Linux on the ARM side the bootloader
-has a few more things to do before releasing reset on the ARM
-and allowing it to run.  If you have messed with Linux elsewhere
-even on a laptop or desktop computer there are things that can be
-passed to the kernel when it boots to change its behavior, in the
-case of the ARM we might want to have the same kernel.img work on
-both the 256MB Raspberry Pi and the 512MB Raspberry Pi so we need
-to tell that kernel how much memory it has to work with.  The scheme
-used is to take some of that memory in the case of the Raspberry Pi
-between 0x0000 and 0x8000 and put information like how much memory
-and other parameters in a formatted table and when the kernel starts
-it knows to look for that stuff.  Eventually the GPU releases reset
-on the ARM meaning it allows the ARM to run.  Like a normal ARM
-processor after a reset it looks for its first instruction at address
-0x00000000 and that instruction says jump to address 0x00008000 and
-all of a sudden the ARM is running the program that was basically
-the file kernel.img.  This is where we as bare metal programmers
-take over.  Instead of that kernel.img file being a linux kernel, we
-can make it any program we want.  The Raspberry Pi doesnt care, there
-is no magic or encryption or secret handshake, whatever bytes we put
-there the ARM will at least try to execute, if those bytes are
-not ARM instructions it may crash but so be it that is us taking over
-this platform.  You can see the beauty here though, if we do have a
-kernel.img file that is buggy or broken, all we have to do to fix it
-is power off the Raspberry Pi, pull out the sd card and overwrite
-the kernel.img file with something we hope is not broken and try
-again.
-
-Okay so let's actually get started.  You need to open the ARMv5 ARM ARM,
-chapter A2 the Programmers Model.  Hopefully ARM doesn't change the
-chapter numbers on me, but A2.6 Exceptions.  In this document the
-word exception means the processor is running along normally and
-something happens to cause it to stop what it was doing and run
-something else.  The first one on the list is Reset.  Now at the
-very first reset after the power comes on the ARM wasn't doing anything
-that could be interrupted, but if it were possible (and it probably
-is) on this chip to have a reset while running then that exception
-would do the same thing as the first reset after power on.  This
-table shows us that Reset changes the processor to Supervisor mode,
-which just means that our programs are not limited, we can run any
-instruction we want and access any address we want.  And that the
-normal thing to do is start executing the instruction at address
-0x00000000.  From the manual:
-
-"When an exception occurs, execution is forced from a fixed memory
-address corresponding to the type of exception. These fixed addresses
-are called the exception vectors."
-
-Execution is forced, basically the processor is forced to run from the
-address specified.  That is how I know that the first instruction
-executed after a reset is the instruction at address 0x00000000, the
-processor is forced to do that.
-
-Now if you have experience with this kind of stuff but maybe not
-the ARM you might have noticed that address 0x00000004 is where
-another exception starts, and you may or may not know that ARM
-instructions are 32 bits or 4 bytes.  So we have exactly one instruction
-to react to a reset.  If we were to use two instructions that
-second instruction would be at address 0x00000004 and that second
-instruction would be the first instruction for an undefined exception,
-which is when the ARM is asked to execute machine code
-that is not defined by that processor as an instruction.
-
-The short answer is address 0x00000000 matters to us for booting an
-ARM and we will learn that there are only two instructions we can
-choose from that will do a jump and consume only 4 bytes.
-
-This is where the "some assembly language required" starts, we have
-to use assembly language so that we can place the exact instruction
-we want in the right place or order to do things like this jump.  On
-the Raspberry Pi the GPU has placed the machine code for the instruction
-we want at address 0x00000000.  Later we are going to mess with exceptions,
-for now the GPU did that for us.  Now we are going to start with
-assembly language and then quickly move to using C.  Now if you know C or
-know other programming languages you can imagine that there is some
-software magic required before your program's first function actually
-runs.
-
-unsigned int myfun ( void )
-{
-    int a=5;
-    return(a+7);
-}
-
-Now an optimizer will simply return 12 and not generate the extra code.
-But pretend that didn't happen.  To literally implement the above program
-somebody has to set aside some storage for the variable a and somebody
-has to fill that storage with the number 5 and THEN you can generate
-some code that does the add and the return.  So before we actually
-get to our program's first operation, the add, there was other stuff
-that had to happen, and that stuff has to happen in the world of
-software.  You might have heard the word stack and maybe have a vague
-idea of what it means, with assembly language you get to see what
-it really is (and it isn't all that magical).  In C before the code
-in the main() function actually executes, there is some bootstrap code
-that is required and you get this chicken and egg problem, how do you
-bootstrap C if you can't use C, because you would need a bootstrap for
-the C you are using to bootstrap C?  That bootstrap has to be in
-some other language, basically that other code is assembly language.
-
-Before we get to that, please see the ARM_TOOLS file for ways to get
-yourself a gnu based assembler and linker initially, then pretty soon
-we need a C compiler as well.  As far as this document is concerned
-the exact name of the programs you have may vary but they will
-in theory all work the same and you can be on a Linux box or Windows
-or Mac.  Your assembler command line might be arm-none-eabi-as or
-arm-elf-as or just as, so you will need to
-mentally substitute the names I use for the ones you have.  See ARM_TOOLS.
-
-
-Now that you have your assembler and linker, I am not going to go into
-as much detail as I might like if this were purely about learning
-assembly language.  Processors are programmable logic, they are
-programmable in the sense that they are designed to operate on machine
-code.  Machine code or machine language being blobs of bits that
-define instructions that tell the processor what you want it to do.
-The machine language for a particular processor is very well defined
-in that it doesn't vary, the bit patterns for the instructions are
-what they are.  Now we can write programs in binary bits, but it isn't
-easy or reliable, so as humans and programmers we take the
-binary bit patterns and give them names we can read and write.  Naturally
-to sell their product the inventor of the instruction set needs users
-and to get users they will generally create the assembly language which
-is the name of the human readable programming language whose syntax
-represents the machine code instructions.  They will also need to
-make or get someone to make an assembler, which is the program that
-takes the assembly language and converts it into machine code.  And
-typically a linker and a C compiler are the minimum tools needed to
-get folks to use your processor.  So they have defined an assembly
-language, but that doesn't make it a worldwide standard, it could
-have been invented on the fly by a single individual at the company and
-imposed on the rest of us.  The machine language is not changeable
-but the assembly language is, and it is not unheard of to have a
-company's assembly language syntax changed.  gnu for example has
-changed a few subtle things with respect to most of the processors
-they support with their assembler.  Naturally as programmers we want
-labor saving features to our programming tools and languages and
-assembly language is no different.  Look at the C function from above
-
-unsigned int myfun ( void )
-{
-    int a=5;
-    return(a+7);
-}
-
-The syntax unsigned, int, myfun, void, int and even the variable
-name itself are not actually converted to actions we want the
-processor to perform.  They are part of the syntax that is there
-to support us telling the processor what to do and assembly language
-has labels and defines and other similar features.  And that extra
-stuff is another area where one assembler (software tool) may vary
-from another.  The short answer here is that the processor defines
-the machine code or machine language and that cannot vary, but the
-assembler, the tool that parses the assembly language program, defines
-what the assembly language is and so long as the assembler generates
-machine code that conforms to the processor the assembler can define
-whatever programming language syntax it wants.  You will soon see
-that I try to write my code to lean toward portable and reusable and
-try to avoid tool specific features because those things change
-over time and those things are definitely not portable so you have
-to re-write those portions more than the body of the program.  A
-weirdism you will see from me for example: the assembly language
-world almost universally uses a semicolon (;) to mark a comment, the
-rest of the line after a semicolon is ignored as a comment.  But
-the gnu assembler folks (gas is a shortcut for gnu assembler) for the
-ARM assembler defined the semicolon to separate instructions on the
-same line.  Assembly languages almost universally only allow one
-instruction per line, so this is pretty insane behavior by the gas
-folks.  They chose to use the @ sign to mark a comment, so my
-weirdism or protest or whatever is I often use ;@ for comments.  There
-was a time that I had access to (the folks I worked for were willing to
-pay for) the ARM tools from ARM and I was writing assembly back
-and forth between ARM tools and GNU tools, so if you try to make as
-much of the code as possible not have to be re-written, the combination
-of ;@ will give you a comment on both...
-
-Registers, these are the variables of assembly language, different
-processors have different numbers of them and different sizes sometimes
-some are general purpose some are special purpose.  Back to the
-ARMv5 ARM ARM, section A2.3 Registers, now ARM tries to confuse us
-by saying
-
-The ARM processor has a total of 37 registers:
-  Thirty-one general-purpose registers
-
-From an assembly language programmer's perspective the ARM actually
-has only 16 general purpose registers, their names are r0,r1,r2,r3...
-to r15.  r15 is a special purpose register, it is called the
-program counter.  Program counter is a generic processor term, it
-keeps track of the program's address.  We talked above about how
-the first instruction after reset is at address 0x00000000, then to
-run on the Raspberry Pi we need that first instruction to jump or
-branch to address 0x00008000.  The program counter is the register
-that keeps track of those addresses for us.  Probably all of our
-Raspberry Pi ARM programs will start with an instruction at 0x0000 then
-one at 0x8000 and one at 0x8004 and one at 0x8008 and at some point
-we are going to jump or branch or something and go backwards or skip
-some and so on.  The program counter keeps track of that.  All
-processors have one, usually they use the term program counter or PC,
-but not always.  And not all processor families let you access the
-PC but ARM does.  And you can mess yourself up if you try to modify
-r15, that can and will make the processor change course to execute the
-instruction at the address you changed r15 to, so we have to be careful
-with r15.  The other 15 registers r0-r14 do not have that problem.
-Now there are two other registers that are special in some way, one
-(r14) because it is hardcoded by the logic for some of the instructions,
-the other is used as the stack pointer as a convention.  You could
-technically use another register as you will see but ARM intended
-r13 to be the stack pointer and we will get into what a stack is
-and a stack pointer in a bit.
-
-In the ARMv5 ARM ARM the same A2.3 Registers section Figure A2-1
-Register organization
-
-So what this is showing us is where that weird count of 37 registers
-came from.  Vertically we have these processor Modes, which is another
-topic for later, but what it is trying to show here is for example
-there is only one r0 register, when you switch modes you don't switch
-to a different r0, there is only one r0.  But for example there are
-many r13 registers, there is one r13 shared by User and System mode
-but Supervisor has its own r13 that is not the same.  If you set
-r13 to some value while in supervisor mode, then switch to user
-mode and have an instruction that uses r13, it will not have the
-same value because it is a different r13 that gets wired in when
-you switch modes.  r14 works the same way, and there is the cpsr/spsr which we will talk
-about later.  Fast interrupt mode has a bunch of registers that are
-special to that mode and we will cover that later as well.  For almost
-all of this document assembly or C we are going to stay in supervisor
-mode and we have 16 registers to worry about r0 to r15.
-
-So chapters A3 and A4 in the ARMv5 ARM ARM begin to cover the
-instruction set the machine code, ARM has also defined their
-assembly language syntax here as well.  When it comes to the
-assembly language that has a one to one relationship with machine
-language instructions the gnu assembler and this documentation are
-in sync, if we hit a variation we will talk about it then.  The
-ARMv7 ARM ARM also defines the instruction set and being newer it
-includes the ARMv4, v5, v6 and v7 instructions and for each will
-tell you which architectures support that instruction.  So using
-the newer manual will help figure out which instructions were added
-at what time.  The older manual generally shows instructions that
-are supported on all future processors (there are maybe one or a few
-exceptions).
-
-Let's stick with the ARMv5 ARM ARM for a little longer, A4.1 is
-the alphabetical list of ARM instructions, don't go down the thumb
-instruction path just yet.  So let's start by adding two numbers together,
-how about 5 and 7.  In C we might do something like
-
-unsigned int a;
-unsigned int b;
-unsigned int c;
-
-a = 5;
-b = 7;
-c = a + b;
-
-For now we have complete freedom to use almost any general purpose
-register (gpr) that we want for our programs (naturally avoiding r15).
-
-So go to A4.1.35 MOV.
-
-Under syntax we see
-
-MOV{<cond>}{S} <Rd>, <shifter_operand>
-
-And it describes each of these items.  Rd is the register we want to
-put our number in (r0 - r15, the one we choose).  The thing we are
-moving into Rd, the shifter operand, is generic here because there
-are a number of different flavors of MOV that we can use.  To find
-these we follow the document's link and go to
-
-Addressing Mode 1 -Data-processing operands on page A5-2,
-
-The one we are going to use is
-
-1.
-#<immediate>
-See Data-processing operands - Immediate on page A5-6.
-
-The term immediate with respect to machine code means that the value
-is found in the immediate area, basically the value is part of the
-machine code.  The short answer is that our first two instructions are
-
-mov r0,#5
-mov r1,#7
-
-Some assemblers make you use capitals for the syntax, but we don't have
-to for these ARM tools.  We are not going to worry about the optional
-{<cond>} and {S} parameters.
-
-Our third and last instruction to perform this task is A4.1.3 ADD
-
-ADD{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
-
-And to shortcut the hop through the document in this case the shifter
-operand we are using is Rm another register, the instruction we want
-is
-
-add r2,r0,r1
-
-Mentally read this instruction by replacing the commas
-
-add r2=r0+r1
-
-Our first ARM program
-
-mov r0,#5
-mov r1,#7
-add r2,r0,r1
-
-So let's assemble this code and then disassemble it.
-
-arm-none-eabi-as fun.s -o fun.o
-arm-none-eabi-objdump -D fun.o
-
-fun.o:     file format elf32-littlearm
-
-
-Disassembly of section .text:
-
-00000000 <.text>:
-   0:   e3a00005    mov r0, #5
-   4:   e3a01007    mov r1, #7
-   8:   e0802001    add r2, r0, r1
-
-The gnu tools work like most toolchains capable of more than tiny
-projects, your source code files are compiled or assembled into
-object files.  Object files have the machine code for the instructions
-plus some extra stuff to help the linker do its job.  The code in an
-object file doesn't know where in memory it is going to live, that is
-the linker's job.  For example if we wanted these three instructions
-to live starting at address 0x8000 the object file doesn't know that,
-the linker will be told to do that and the linked binary will
-reflect the 0x8000 address.  Since the object doesn't know this the
-disassembly shows address 0x0000.
-This e3a00005 is the machine code for mov r0, #5, we can go back
-to the ARM ARM and see that the 32 bit machine code definition is
-broken into a number of fields of which some are defined as either
-zero or one and those bits forced to zero or one are the ones that
-make this instruction a mov and not an add or some other instruction.
-So we see from the doc
-xxxx00x1101xxxxx....
-and from the disassembly
-111000111010....
-
-xxxx00x1101xxxxx....
-111000111010....
-
-They match.
-
-Also we see bits 15:12 are 0b0000 for the mov r0 instruction and that
-matches what we programmed (0b0000 = r0).  The second instruction
-has 0b0001 in those bits which are also correct 0b0001 = r1, 0b0010 =
-r2 and so on.
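-
-Splitting the whole of e3a00005 into the fields from the MOV encoding
-diagram, worked by hand here as a sanity check rather than copied from
-the tools:
-
-    1110 00 1 1101 0 0000 0000 000000000101
-    cond    I opc  S SBZ  Rd   shifter_operand
-
-    cond   = 0b1110  always
-    I      = 1       the operand is an immediate
-    opcode = 0b1101  mov
-    S      = 0       flags are not updated
-    Rd     = 0b0000  r0
-    shifter_operand = 0x005, our #5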
-
-SBZ means Should Be Zero and those bits are also zero, although
-should is not equal to must otherwise those bits would explicitly be
-defined as zeros.  Not for us to worry about right now but these
-could be bits that are ignored by this instruction in the processor
-and maybe in the future these bits could be used to create a new
-instruction where zeros is mov and something else is the new instruction.
-
-Note that most folks are not going to teach assembly by talking you
-through machine code as well.  I find that at least loosely understanding
-the machine code helps with the assembly language, it resolves many
-otherwise unanswered questions, why can't I do this, why can I do that,
-and the answer being simple, because the instruction set, the machine
-code does not permit it.  As to the whys and why nots of the machine
-code well the short answer there is it is because that is how the
-designers of the processor designed the instruction set, if you can
-find and ask them go ahead but otherwise it is what it is, deal with it.
-
-We can do this with the ADD instruction as well.
-
-e0802001    add r2, r0, r1
-
-xxxx00x0100xxxx document
-111000001000xxx disassembly
-
-Now just like in C there is more than one way to do things...
-
-unsigned int a;
-
-a = 5;
-a = a + 7;
-
-Our second program
-
-mov r6,#5
-add r6,r6,#7
-
-assemble and disassemble:
-
-arm-none-eabi-as fun.s -o fun.o
-arm-none-eabi-objdump -D fun.o
-
-fun.o:     file format elf32-littlearm
-
-
-Disassembly of section .text:
-
-00000000 <.text>:
-   0:   e3a06005    mov r6, #5
-   4:   e2866007    add r6, r6, #7
-
-The next thing we need to learn to aim for an interesting program on
-hardware is to make a loop:
-
-    mov r0,#0
-top:
-    add r0,r0,#1
-    cmp r0,#7
-    bne top
-
-assemble and disassemble:
-
-arm-none-eabi-as fun.s -o fun.o
-arm-none-eabi-objdump -D fun.o
-
-fun.o:     file format elf32-littlearm
-
-
-Disassembly of section .text:
-
-00000000 <top-0x4>:
-   0:   e3a00000    mov r0, #0
-
-00000004 <top>:
-   4:   e2800001    add r0, r0, #1
-   8:   e3500007    cmp r0, #7
-   c:   1afffffc    bne 4 <top>
-
-
-Now the indentation doesn't matter, it just makes it a little easier to read.
-
-Text with a colon is a label just like in C, so top: is not an
-instruction, we will use it later.  The mov and add we know, cmp is new.
-Section A4.1.15 CMP shows us under Operation what is going on, for now
-assume the condition code passed so we go into alu_out = Rn - shifter_operand,
-in this case alu_out = r0 - 7.  Then it gets into flags, the flag we
-care about is the Z flag which says if alu_out == 0 then 1 else 0.
-The first time we run through this loop r0 by the time it hits the
-cmp instruction is equal to 1, and 1 - 7 is not equal to 0 so the Z
-flag will be a 0.
-
-We will come back to the cmp instruction, let's look at the bne
-instruction.  The first problem is there is no BNE listed in the
-alphabetical list of instructions.  What we are looking for is
-A4.1.5 B,BL and now we have to talk about {<cond>}.  bne is really
-a B instruction with a condition code of NE.  If we look at the
-operation for this instruction, if the condition passes then we hit
-if L == 1 then, that is the BL instruction so we don't care about that,
-so on to PC = PC + (SignExtend_30(signed_immed_24) << 2).  Basically
-if the condition code passes then we are modifying the pc, and
-hopefully the modification is such that we branch (jump) back to the
-top label, add one more to r0 and keep doing that until the condition
-code doesn't pass.  But how do I know it is going to do that?
-
-A3.2 talks about the condition field.  All of the ARM mode instructions
-(thumb mode is later) start with a 4 bit condition field.  Up until
-now we have been operating with the default of AL or always, encoded as
-0b1110, which is such that the condition code always passes.  For the
-bne, ne is the condition code, and the description says Z clear, so the
-ne condition code will pass if the Z flag is clear (0).  The Z flag is
-modified by the cmp instruction in this loop, and the Z flag
-doesn't change after the cmp and before the bne.  So cmp is defining
-the state of the Z flag for the bne instruction.  To get the Z flag
-set (1) so that the branch is not taken, r0 - 7 has to equal zero
-and that will happen when r0 = 7.  So the first time through
-r0 = 1, Z is 0, bne (branch if not equal, branch if r0 is not equal to 7)
-branches back to top, we add one more, r0 = 2, Z is still 0, and this
-continues for r0 = 3,4,5,6 and when r0 = 7 then Z is 1 and the bne
-does not modify the pc so the program will continue to whatever
-instruction we program after bne.
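-
-Here is that loop traced pass by pass, just a hand-worked table that
-follows the CMP operation (Z is 1 only when alu_out is 0) and the NE
-condition (passes when Z is 0):
-
-    r0 after add   r0 - 7   Z   bne
-    1              -6       0   branches to top
-    2              -5       0   branches to top
-    3              -4       0   branches to top
-    4              -3       0   branches to top
-    5              -2       0   branches to top
-    6              -1       0   branches to top
-    7               0       1   falls through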
-
-Now if we change the program to this
-
-    mov r0,#0
-top:
-    add r0,r0,#1
-    cmp r0,#7
-    b top
-
-The b instruction is now unconditional, it uses the default of always
-as the condition so it always branches.  The cmp can modify all the
-flags it wants, it won't change the branch.
-
-So what are and where are flags?  Flags are individual bits in a register
-generically called the program status word.  In section A2.5 ARM
-calls them Program status registers.  bit 30 is the Z flag, bit
-31 the N flag, 29 is C and 28 is V, the four that we generally deal with
-and will worry about later.  ARM has names for their program status
-registers, CPSR and SPSR.  We care about and maybe sometimes use CPSR
-the current program status register.  SPSR is the saved program status
-register and is used to save a copy of the CPSR in case we need to say
-handle an interrupt and then return, if an interrupt happened between
-the cmp and the bne above we don't want the interrupt to mess up
-our Z flag.  We will worry about interrupts later.
-
-Next thing before we can play with hardware is I cheated a little.  ARM
-at least for what we are looking at uses fixed length instructions,
-in ARM mode (thumb is later) every instruction is exactly 32 bits or
-4 bytes, no more no less.  And you may have seen in A2.3 that the
-registers are also 32 bits.  And we have learned enough about
-machine code to know that we need some of those instruction bits to
-tell the processor one instruction from another.  Specifically with the
-mov instruction we saw that a bunch of the bits are consumed just
-defining the parameters to the mov instruction.  We moved an immediate
-value of 5 and 7 and that worked fine, but what about a larger number
-like 0x1234, or even worse 0x12345678, how could 0x12345678 possibly
-fit in the 12 bit shifter operand?
-
-mov r0,#0x12345678
-
-arm-none-eabi-as fun.s -o fun.o
-fun.s: Assembler messages:
-fun.s:2: Error: invalid constant (12345678) after fixup
-
-The answer is it can't.  You cannot squeeze 32 bits into 12 bits without
-losing some.  Obviously there is another way to get that value into a register.
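-
-For the curious, Data-processing operands - Immediate (page A5-6) splits
-those 12 bits into a 4 bit rotate_imm and an 8 bit immed_8, and the value
-is immed_8 rotated right by 2 * rotate_imm.  A few cases worked by hand
-(my own arithmetic, not output from the tools):
-
-    value        rotate_imm  immed_8  fits?
-    0x00000005   0           0x05     yes, 0x05 rotated right by 0
-    0xFF000000   4           0xFF     yes, 0xFF rotated right by 8
-    0x00001234   -           -        no, the set bits span 11 bits
-    0x12345678   -           -        no, the set bits span 26 bits
-
-So an immediate only fits when all of its one bits sit inside an 8 bit
-window that lands on an even rotation.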
-
-The assembly for this is
-
-ldr r0,somenumber
-...
-somenumber:
-    .word 0x12345678
-
-So the words (with no spaces) ending in a colon are labels.  Labels
-are simply addresses, we don't know nor care what the actual address is
-but to let the assembler do the work for us we give the label a name
-and then somewhere else use that label to reference the address we
-are interested in.  Think about our function names in C those are just
-labels and we expect the compiler and assembler and lastly linker
-to finally give that label/function name an address so that other
-code that wants to call it or jump to it or otherwise access that
-address can.  As programmers we use the label, we let the tools
-do the hard work of figuring out how to get there.
-
-If we look up the ldr instruction it stands for load register, load
-is basically a read from some address.  So somenumber is an address,
-we are asking the processor to read a word (a word is defined as 32
-bits in the ARM world (intel x86 world it is 16 bits) see A2.1 Data
-types) from the address somenumber and take the 32 bits it finds
-there and put them in register r0.  The label somenumber: tells
-the assembler that when you are generating the machine code, whatever
-address happens to be here in the program use that address for
-somenumber wherever I have referenced that label.  .word is a directive
-to the assembler, it is not an instruction, it tells the assembler I
-want you to reserve a 32 bit memory location in the program and I want
-you to put the value I have defined there.  So the assembler is going
-to put the 32 bit value at the address somenumber, it and/or the linker
-will figure out what somenumber is and then ldr will know how to find
-that 32 bit number.  And there we go we can now load any 32 bit pattern
-into a register.
-
-Just to perhaps make this more clear
-
-    ldr r0,somenumber
-top:
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    b top
-somenumber:
-    .word 0x12345678
-    .word 0xABCD
-
-assemble and disassemble which you know how to do now.
-
-fun.o:     file format elf32-littlearm
-
-
-Disassembly of section .text:
-
-00000000 <top-0x4>:
-   0:   e59f0014    ldr r0, [pc, #20]   ; 1c <somenumber>
-
-00000004 <top>:
-   4:   e2800001    add r0, r0, #1
-   8:   e2800001    add r0, r0, #1
-   c:   e2800001    add r0, r0, #1
-  10:   e2800001    add r0, r0, #1
-  14:   e2800001    add r0, r0, #1
-  18:   eafffff9    b   4 <top>
-
-0000001c <somenumber>:
-  1c:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000
-  20:   0000abcd    andeq   sl, r0, sp, asr #23
-
-
-I put the add instructions in there to give some space between
-ldr and the address it was using.  Now the ARM docs and the disassembly
-are showing something interesting.  Off to the right it tells us
-the address is 1c which is the label somenumber.
-
-What happened is the assembler is doing some math on the program
-counter r15, it is saying add 20 to the program counter and then
-use that as an address to read from memory, then take that value read
-and put that in r0.  Well 20 in decimal is 0x14 hex, if this
-instruction were really at address 0x0000 then 0x0000+0x14 is 0x0014
-but the number we want is at address 0x1C.
-
-Well two things are going on.  Think about how a very simple
-processor would have to work using the program counter as we have
-loosely defined it.  The program counter would say the instruction
-we want to execute is at address 0x0000 how it says that is that
-register simply holds the address 0x0000.  So the processor is ready
-to execute the next instruction the pc is 0x0000 so it reads the
-instruction 0xe59f0014 from memory.  Now what does the pc do?  At
-some point before it starts the next instruction at address 0x0004
-it has to change from 0x0000 to 0x0004.  Well many/most processors
-do just that after reading (called fetching if you are reading
-an instruction from memory) the instruction before actually executing
-it they move the program counter so in this case that moves the
-program counter to 0x0004.  0x0004 + 0x14 = 0x0018 we still are not
-at the 0x001C where our data is and where the disassembler implied
-it knew where our data is.  That is the second thing going on, something
-called pipelining.  It is just like a production line,
-you have stations along the production line, the product is moved
-from one station to another, each station performs a relatively simple
-task on the product and the product moves on.  Well a pipelined
-processor does that as well.  If you had say only one employee at the
-assembly line then you could still have the assembly line but that
-one employee could only do one of the tasks at a time.  If there
-were 100 tasks then it would take 100 steps and then they could start
-over on the next product.  But if you had 100 employees after
-some time every station has a product in some partial state of
-completion every step the first person starts the product from scratch
-and every step the last person outputs a new product, so once all
-the stations have filled up you get one product every step instead of
-one product every 100 steps with the single employee.  The 100
-employees are working in parallel even though the production line is
-serial.  Well a processor has a few basic steps, first it has to fetch
-the instruction from memory, then it has to decode it, look for those
-fixed ones and zeros that tell it this is a mov instruction or an add
-instruction or whatever.  For the add we used above it then needs to
-go get the operands, it may have to go get r0 and then go get r1.  And
-then it actually executes, it does the add, then it saves the result
-and done.  The even simpler steps are fetch, decode, execute.  Using
-that simplistic model if we were to step through a mini assembly
-line we would start with address zero entering the first station,
-the fetch, then the address 0x00 instruction moves from the first
-station to the second, decode.  In parallel the 0x04 instruction is in
-the first station, fetch.  Then the next step the 0x00 instruction
-moves to execute, 0x04 moves to decode and 0x08 moves to fetch.  Fetch
-in this case means the pc is 0x08, go fetch from 0x08.  So when the
-0x00 instruction is executing the program counter is set to 0x08, the
-address of the instruction being fetched.  That is two instructions
-ahead not just the one we talked about before.  That is the model
-that ARM is operating on, when you execute an instruction the
-program counter register is at an address two instructions ahead.
-So when we execute the ldr instruction at address 0x00 that means
-the program counter is two ahead, each is 0x04 so two ahead is
-0x00+0x04+0x04 = 0x08.  So if the pc is 0x0008 and we add the offset
-of 0x14 we get 0x1C.  Now here is the rub, that may have actually been
-the tiny pipeline used in very early ARM processors, and for
-backward compatibility they preserved that two ahead rule for the PC,
-but the actual logic we run on today has a much deeper pipeline.
-How we don't get screwed up by having a program counter that is a bunch
-of instructions ahead is that the actual program counter used today
-to keep track of fetching is not the same register we see as r15, it is
-a hidden register.  The logic we use today provides us with an r15 that
-pretends to be the real pc but is actually a fake one two ahead.  They
-really had to do it that way.  Had they known that down the road we
-would not only have pipelined processors but much more complicated
-processor internals and that they would no longer have to impose this
-pc being adjusted by the pipeline, but instead would fake its value,
-I would like to think they would have simply faked the value as being
-the address of the next instruction, 0x04 in this case, not two after,
-0x08.  And faked that address from the first pipelined processor to
-the current pipelined processor.
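-
-Putting the two ahead rule to work on the disassembly above, this is my
-own hand arithmetic rather than anything the tools print:
-
-    ldr at 0x00, machine code e59f0014, offset 0x14:
-        pc as seen by the instruction = 0x00 + 8 = 0x08
-        address read = 0x08 + 0x14 = 0x1C   (somenumber)
-
-    b at 0x18, machine code eafffff9:
-        signed_immed_24 = 0xfffff9 = -7, shifted left 2 = -28 = -0x1C
-        pc as seen by the instruction = 0x18 + 8 = 0x20
-        new pc = 0x20 - 0x1C = 0x04   (top)
-
-The same math explains the bne from the earlier loop, 1afffffc at 0x0c:
-0xfffffc is -4, shifted left 2 is -0x10, and 0x0c + 8 - 0x10 = 0x04.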
-
-Back to our problem of putting any value in a register.
-
-
-    ldr r0,somenumber
-top:
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    b top
-somenumber:
-    .word 0x12345678
-    .word 0xABCD
-
-I added a few more lessons here.  First off I put a branch before
-the somenumber label, what if I had not done that?  Well what would
-happen is the assembler would without a peep have assembled what I
-told it to assemble:
-
-
-    ldr r0,somenumber
-top:
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-somenumber:
-    .word 0x12345678
-    .word 0xABCD
-
-
-
-fun.o:     file format elf32-littlearm
-
-
-Disassembly of section .text:
-
-00000000 <top-0x4>:
-   0:   e59f0010    ldr r0, [pc, #16]   ; 18 <somenumber>
-
-00000004 <top>:
-   4:   e2800001    add r0, r0, #1
-   8:   e2800001    add r0, r0, #1
-   c:   e2800001    add r0, r0, #1
-  10:   e2800001    add r0, r0, #1
-  14:   e2800001    add r0, r0, #1
-
-00000018 <somenumber>:
-  18:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000
-  1c:   0000abcd    andeq   sl, r0, sp, asr #23
-
-And if you look at that after that fifth add r0,r0,#1 the next
-"instruction" is the bit pattern 0x12345678 and the processor would
-fetch that pattern and try to execute it.  And maybe that pattern is
-an actual instruction or maybe not but no doubt it is not something
-we meant to be an instruction.  If you are going to do something like
-this then you need to make sure you put that value somewhere that
-is not in the execution path, but is close enough to the ldr in
-this case so that the offset can be encoded in the instruction.
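-
-To see what close enough means, work the offset in the disassembly
-above by hand.  This assumes the ARM mode rule discussed earlier, that
-the pc reads as the address of the instruction plus 8:
-
-    address of the ldr           0x00
-    pc as the ldr sees it        0x00 + 8    = 0x08
-    address of somenumber        0x18
-    offset encoded               0x18 - 0x08 = 0x10 (16)
-
-That matches the [pc, #16] in the disassembly.  The offset field in
-this form of ldr is 12 bits, so the word has to be within 4095 bytes
-of that pc value, ahead of it or behind it.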
-
-I also put the 0xABCD in there to illustrate a point.  The
-somenumber label resulted in the assembler deciding that that label
-is at the address 0x18 in this last example.  So a ldr of somenumber
-gives us the value at that address which is 0x12345678.  Just because
-0xABCD is a .word after the label doesn't mean it is also at the same
-address, it can't be, it is at address 0x1C or somenumber+4.  If we
-wanted to use this technique to load another value that won't fit in
-the immediate field, then we need another label.
-
-    ldr r0,hello
-    ldr r1,world
-...
-hello:
-    .word 0x12345678
-world:
-    .word 0xABCD
-
-And the gnu assembler will allow you to put the instruction or
-directive on the same line as the label, you don't have to use a
-separate line.
-
-    ldr r0,hello
-    ldr r1,world
-...
-hello: .word 0x12345678
-world: .word 0xABCD
-
-Note .word is a gnu assembler specific directive.  I don't think that
-is what the ARM assembler uses (I believe it spells this DCD), so this
-is not necessarily portable code.
-
-Now both the ARM assembler and the GNU assembler have a nice little
-program saving device for lazy programmers:
-
-    ldr r0,=0x12345678
-    ldr r1,=0xABCD
-top:
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    add r0,r0,#1
-    b top
-
-
-assemble and disassemble
-
-fun.o:     file format elf32-littlearm
-
-
-Disassembly of section .text:
-
-00000000 <top-0x8>:
-   0:   e59f0018    ldr r0, [pc, #24]   ; 20 <top+0x18>
-   4:   e59f1018    ldr r1, [pc, #24]   ; 24 <top+0x1c>
-
-00000008 <top>:
-   8:   e2800001    add r0, r0, #1
-   c:   e2800001    add r0, r0, #1
-  10:   e2800001    add r0, r0, #1
-  14:   e2800001    add r0, r0, #1
-  18:   e2800001    add r0, r0, #1
-  1c:   eafffff9    b   8 <top>
-  20:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000
-  24:   0000abcd    andeq   sl, r0, sp, asr #23
-
-
-Generically the =something means the address of something.  Whether or
-not the thing after the equals is a label or a number the assembler
-finds a location for you in a safe place (not in the execution path)
-and then encodes a pc relative load (pc plus an offset).  If the
-thing after the equals is a label then the assembler (or linker) will
-place the address in that location so that it can be loaded into
-the register.  By putting a number here we can cheat and get the
-assembler to put that 32 bit value in our register.  It is possible
-that the assembler might not be able to find a place for our number
-that is close enough for the pc relative load to reach, and that is
-where this shortcut can get you into trouble.  Also you don't get to
-control exactly where the number is placed, so you are giving up
-control to the assembler which is generally not what an assembly
-language programmer wants to do.
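-
-If you want some say in where those values land, the GNU assembler
-has a .ltorg directive (also spelled .pool) that dumps the pending
-literals at that spot.  A minimal sketch, the placement right after
-the branch is my choice for illustration:
-
-    ldr r0,=0x12345678
-top:
-    add r0,r0,#1
-    b top
-    .ltorg              @ the 0x12345678 word is placed here
-
-It is still up to you to put the directive somewhere that is not in
-the execution path and is close enough to the ldr.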
-
-So we can now put any bit pattern we want into a register, we can
-loop, we roughly understand that ldr means load a register with
-a value from an address.  We also saw from the disassembly that we
-can load from a register which holds an address, the ldr instructions
-above are encoded as load from r15 plus an adjustment to r15.  But we
-can use another register.
-
-    ldr r0,=0x12345678
-    ldr r1,[r0]
-
-The [brackets] mean a level of indirection.  Instead of the value in
-r0, the brackets mean the thing at the address in r0.  The above code
-means read from memory at address 0x12345678 and put the value read
-in r1.
-
-There has to be a write instruction as well right?  Well load is a read
-and store is a write, store something at an address.
-
-
-    ldr r0,=0x12345678
-    mov r2,#7
-    str r2,[r0]
-
-This says write the number 7 to address 0x12345678.
-
-Some magic that may or may not be obvious to a non-bare metal
-programmer is that addresses don't only point at memory.  In the
-address map for the ARM we saw a space starting at 0x20000000 where
-the I/O peripherals live.  Those peripherals are not ram, the things
-at those addresses are registers which are defined in the rest of that
-Broadcom manual.  Reading and writing things in that address space
-causes hardware stuff to happen.
-
-Hopefully by now you have figured out that
-
-#include <stdio.h>
-
-int main ()
-{
-    printf("Hello World!\n");
-    return 0;
-}
-
-when run on your desktop or laptop is a massively complicated program
-and obviously that is not at all an introductory program for bare
-metal programming.  The bare metal equivalent is turning on and/or
-blinking an led.
-
-(It should be painfully obvious that I wasn't kidding, most of bare
-metal is not programming but finding out the information from manuals
-on what to program.)
-
-If/when you get a job as a bare metal programmer and work closely with
-the hardware engineers, they should already know this, but it is a good
-idea to wire up an led to a general purpose I/O port and/or wire some
-pads/test points to the general purpose I/O so that you can probe them
-with an oscilloscope.  The led can be on your prototype board even if
-it is not on the production boards.  The Raspberry Pi folks did
-just that.
-
-You need to open one of the schematics mentioned above, I am looking
-at the rev 1 board.  What we are looking for is a symbol that has a
-triangle with its tip up against a line, similar to the symbol for
-fast forward or rewind on an mp3 player but with one triangle not two.
-That is a diode symbol.  A light emitting diode (LED) also has some
-sort of a lightning like symbol on or next to it that indicates light
-comes out of it.  Sheet 04 of 05, upper middle of the page, shows
-STATUS OK LED and POWER ON LED, each with a diode symbol with two
-arrows pointing out.  Following the wire on one side of the status led
-we see the signal name STATUS_LED_N, and the other end is connected
-to +3V3, which means 3.3 Volts, the voltage that powers stuff on this
-board.
-
-Now from middle school science class we know that if you want to turn
-the light on you need to complete the circuit.  To complete the
-circuit in this case means one end of that wire needs to be on the
-power voltage (3.3V) and the other end on ground to make the power
-flow.  If one end is left hanging then no power flows, no light.  If
-both ends are tied to 3.3V (you probably didn't do this in middle
-school) then no power flows and the light doesn't come on.
-
-So now go to the upper middle left of Sheet 02 of 05.  What you are
-looking for is STATUS_LED_N connected to a box labelled BCM2835, and
-the thing it is wired to is GPIO 16.  So we are done with the
-schematic for now, we can mess with the status led by messing with
-gpio 16.  In general, and true with this processor, if we make gpio 16
-an output and write a 0 to that gpio pin we will make it 0 Volts or
-ground, and that means the electricity flows and the led comes on.  If
-we write a 1 that makes the pin 3.3 Volts, no electricity flows and
-the led goes off.
-
-Now to the Broadcom BCM2835 manual, chapter 6 General Purpose I/O (GPIO).
-There is a diagram there, and it is certainly not obvious what is
-going on, but basically we will be messing with the Pin set and
-clear registers which affect the output state, which work their
-way left to the box on the left side which represents the gpio pin.
-For safety reasons (don't let the smoke out) GPIO pins typically are
-configured after reset as inputs.
-
-So now we get serious.  Remember this document uses the 0x7Exxxxxx
-based addresses for peripherals but that 0x7E has to be replaced with
-0x20 for ARM.  We need to make pin 16 an output.  Fumbling around
-in this chapter we see
-
-"All pins reset to normal GPIO input operation."
-
-So we know we need to change it from input to output.  We also see
-in Table 6-2 – GPIO Alternate function select register 0 it shows
-a chart for FSEL9 that describes bit patterns for that three bit
-field that controls the function for that gpio, input, output, and
-the alternate functions.  What we take away from this is that to make
-a pin an output we need to set the three bits that control that
-pin to the bit pattern 0b001.
-
-Table 6-3 – GPIO Alternate function select register 1
-
-Contains the FSEL16 bits, which are not obviously connected to GPIO16
-but that is what they mean.  The bits we need to change to 0b001 are
-bits 18 to 20, so bit 18 needs to be a 1, bit 19 a 0 and bit 20 a 0.
-Some peripherals
-and/or some processors have a way that makes it easy to modify just
-some of the bits in a register.  This is not one of those cases we can
-only access this register on complete 32 bit reads or writes.  The
-proper way to modify these bits is read the register, modify the three
-bits then write the register back.  The power on state for this register
-is supposed to be all zeros (that is what the reset column means) so
-we can cheat for the purpose of this example and just write the whole
-register zeros for the other pins and 0b001 for gpio 16.  That
-means the value we need to write is 0x00040000.  Now the address to
-write to.  Function select register 1, we go up a few pages to
-6.1 Register View.  GPFSEL1 at address 0x7E200004 for the VC which is
-0x20200004 for the ARM.
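-
-A minimal sketch of the proper read-modify-write approach, in GNU
-assembler syntax.  The register choices (r0, r1) are arbitrary and
-the masks are my own arithmetic from the numbers above: 0x001C0000 is
-bits 18 to 20 and 0x00040000 is bit 18.
-
-    ldr r0,=0x20200004      @ GPFSEL1 as seen by the ARM
-    ldr r1,[r0]             @ read the current function selects
-    bic r1,r1,#0x001C0000   @ clear bits 18 to 20 (FSEL16)
-    orr r1,r1,#0x00040000   @ set bit 18 so FSEL16 is 0b001, output
-    str r1,[r0]             @ write it back, other pins unchanged
-
-The cheat is the same thing without the ldr r1,[r0] and the bic/orr,
-just load 0x00040000 into r1 and store it.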
-
-Now that just makes 16 an output, now we need to control the state
-of that pin, a 0 or 1 (0 volts/ground or 3.3Volts).  Fumble around some
-more and we see the GPSETn and GPCLRn registers.  We can figure out
-from the register table above that the n is either 0 or 1.  The n picks
-a bank of pins, it is not the value that gets written: GPSET0 and
-GPCLR0 cover GPIO 0 to 31, GPSET1 and GPCLR1 cover GPIO 32 to 53.
-GPIO 16 is bit 16 in the 0 registers.
-
-Table 6-8 – GPIO Output Set Register 0
-
-If a bit is set in the value we write to this register then that GPIO
-pin changes to a 1.  Bits that are zero have no effect.
-
-Table 6-10 – GPIO Output Clear Register 0
-
-If a bit is set in the value we write to this register then that GPIO
-pin changes to a 0.  Bits that are zero have no effect.
-
-This is one of those cases where they have given us an easy way to
-change one output without messing up the others while still being
-limited to 32 bit writes.
-
-The GPSET0 register is at ARM address 0x2020001C and the GPCLR0
-register is at ARM address 0x20200028.  For our led that means a write
-of bit 16 to GPCLR0 turns it on and a write of bit 16 to GPSET0 turns
-it off.
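-
-A sketch of what changing one output without messing up the others
-looks like in code.  The value is my own arithmetic, GPIO 16 is bit
-16 which is 1 shifted left 16 or 0x00010000, and the registers r0 and
-r1 are arbitrary choices.  The address is the GPSET0 address given
-above, the same pattern applies to whichever set or clear register
-you point r0 at.
-
-    ldr r0,=0x2020001C      @ GPSET0 as seen by the ARM
-    mov r1,#0x00010000      @ only bit 16 is a one
-    str r1,[r0]             @ one 32 bit write, only GPIO 16 changes
-
-No read is needed first because the zero bits in the value written
-leave their pins alone.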