adding asmdelay experiment, one of many reasons why benchmarks can't be trusted.

root
2016-08-04 15:41:35 -04:00
parent 1fd89a6305
commit db252e2420
6 changed files with 873 additions and 0 deletions

@@ -0,0 +1,35 @@
ARMGNU ?= arm-none-eabi
#ARMGNU ?= arm-linux-gnueabi
COPS = -Wall -O2 -nostdlib -nostartfiles -ffreestanding
all : asmdelay.bin
clean :
	rm -f *.o
	rm -f *.bin
	rm -f *.hex
	rm -f *.elf
	rm -f *.list
	rm -f *.img
	rm -f *.bc
	rm -f *.clang.s
start.o : start.s
	$(ARMGNU)-as start.s -o start.o
asmdelay.o : asmdelay.c
	$(ARMGNU)-gcc $(COPS) -c asmdelay.c -o asmdelay.o
periph.o : periph.c
	$(ARMGNU)-gcc $(COPS) -c periph.c -o periph.o
asmdelay.bin : memmap start.o periph.o asmdelay.o
	$(ARMGNU)-ld start.o periph.o asmdelay.o -T memmap -o asmdelay.elf
	$(ARMGNU)-objdump -D asmdelay.elf > asmdelay.list
	$(ARMGNU)-objcopy asmdelay.elf -O ihex asmdelay.hex
	$(ARMGNU)-objcopy asmdelay.elf -O binary asmdelay.bin

@@ -0,0 +1,249 @@
See the top level README file for more information on documentation
and how to run these programs.
Demonstrating the performance differences of a two instruction loop.
Same machine code, but where you put it, and whether the cache and
branch prediction are on or off, makes a vast difference in performance.
.globl ASMDELAY
ASMDELAY:
subs r0,r0,#1
bne ASMDELAY
bx lr
The two instructions in the loop are the subs and bne, so this is not
even a difference in compilers or options. The same two instructions,
131 thousand times in a loop. Should I explain this or my theories on
this or not?
Here is the punch line
min max difference
00016DDE 003E025D 003C947F
Yes! The minimum is 0.71 timer counts per loop on average, less than
one count per instruction! How is that possible? (The counts come from
the free running ARM timer counter, which does not have to tick at the
cpu clock rate.)
And the worst case I could get was 43 times slower! How could those
two instructions on the same chip/board execute at such vastly
different speeds? Do you really want to know just how bogus benchmarks
really are? This is only a small taste. Apply these simple things
to any benchmark, then add compiler differences on the same source code.
Many folks don't realize that the same source code can execute several
times faster or slower simply by changing compiler options. Likewise
two different compilers, or versions of the same compiler, can/will/do
produce different results (with source distributions like gcc or llvm,
just how you build the compiler can change its output, without the
different builds being given different command line options).
Simple alignment tricks like adding or removing a single instruction
in the right place can/will move the whole binary up or down in memory,
changing where it falls in what I call fetch lines or cache lines (two
separate but similar terms).
I have performed this stunt many times many ways, and there are
things that can be done to further widen the performance gap. Adding
some magic number of nops between the subs and bne should help branch
prediction and save time on the fast side, and cost more fetches per
loop on the slow side. Not going to do that today, these two
instructions are enough.
This time around I am using self modifying code. Traditionally I would
re-assemble with more or fewer nops out front of the loop under test
to adjust its alignment.
Using the disassembly of the loop in start.s
0000802c <ASMDELAY>:
802c: e2500001 subs r0, r0, #1
8030: 1afffffd bne 802c <ASMDELAY>
8034: e12fff1e bx lr
We can see the raw instructions. The conditional branch is pc relative,
not absolute, so the code is basically position independent and can be
used as is.
PUT32(ra+0x00,0xe2500001);
PUT32(ra+0x04,0x1afffffd);
PUT32(ra+0x08,0xe12fff1e);
I learned something new on this one. Another ARM was doing fine, but
the raspberry pi (zero) was hanging with branch prediction enabled. I
didn't know there was a prefetch flush you needed to do. I went way
overboard and used flushes and dmbs and dsbs liberally, needed or not.
The prefetch flush made it so that the pi worked.
should I dive into this or not? hmm...
12345678 12345678 12345678 12345678 12345678
0019F158
0019F149
0019F0FE
0019F142
0019F1C6
00045C3F
00045C28
00045C27
00045C28
0000004A
00000031
00000031
00000031
00000041
00000031
00000031
00000031
C0000000 C0000000 C0000000 C0000000
00050078
00050078
C0006000 002200D2 002200D2 002200D2 00000000
C0006000 002200A6 002200A6 002200D2 0000002C
C0006000 00220145 002200A6 00220145 0000009F
C0006008 00220173 002200A6 00220173 000000CD
C0006010 00280096 002200A6 00280096 0005FFF0
C0006010 00280104 002200A6 00280104 0006005E
C000601C 003E015C 002200A6 003E015C 001C00B6
C000601C 003E01AA 002200A6 003E01AA 001C0104
C000602C 0022009D 0022009D 003E01AA 001C010D
C000603C 003E01BC 0022009D 003E01BC 001C011F
C000603C 003E0211 0022009D 003E0211 001C0174
C0006060 0022005E 0022005E 003E0211 001C01B3
C00060FC 003E024D 0022005E 003E024D 001C01EF
00050078
00050878
C0006000 001E0119 001E0119 001E0119 00000000
C0006000 001E00FB 001E00FB 001E0119 0000001E
C0006000 001E00C0 001E00C0 001E0119 00000059
C0006004 00200101 001E00C0 00200101 00020041
C0006008 001E00AD 001E00AD 00200101 00020054
C000600C 0020015F 001E00AD 0020015F 000200B2
C0006010 001E00A0 001E00A0 0020015F 000200BF
C0006014 00200177 001E00A0 00200177 000200D7
C000601C 003C010A 001E00A0 003C010A 001E006A
C000601C 003C01C0 001E00A0 003C01C0 001E0120
C0006028 001E008D 001E008D 003C01C0 001E0133
C000603C 003C01EC 001E008D 003C01EC 001E015F
C0006040 001E0065 001E0065 003C01EC 001E0187
C000605C 003C0252 001E0065 003C0252 001E01ED
C000609C 003C0258 001E0065 003C0258 001E01F3
C00060B0 001E0064 001E0064 003C0258 001E01F4
00050878
00050078
C0006000 0005B72B 0005B72B 0005B72B 00000000
C0006000 0005B6F1 0005B6F1 0005B72B 0000003A
C000601C 0005B731 0005B6F1 0005B731 00000040
C0006058 0005B732 0005B6F1 0005B732 00000041
C0006078 0005B73B 0005B6F1 0005B73B 0000004A
00051078
00051878
C0006000 00016E12 00016E12 00016E12 00000000
C0006000 00016DDE 00016DDE 00016E12 00000034
C0006004 000224E4 00016DDE 000224E4 0000B706
C000601C 000224F0 00016DDE 000224F0 0000B712
00051878
00051078
80000000 80000000 80000000 80000000
00050078
00050078
80006000 002200E1 002200E1 002200E1 00000000
80006000 002200C5 002200C5 002200E1 0000001C
80006000 002200B8 002200B8 002200E1 00000029
80006000 002200E7 002200B8 002200E7 0000002F
80006004 002200E9 002200B8 002200E9 00000031
80006004 002200AE 002200AE 002200E9 0000003B
80006004 0022018A 002200AE 0022018A 000000DC
80006008 00220075 00220075 0022018A 00000115
8000600C 0022005F 0022005F 0022018A 0000012B
80006010 00280105 0022005F 00280105 000600A6
8000601C 003E0168 0022005F 003E0168 001C0109
8000601C 003E01B7 0022005F 003E01B7 001C0158
8000603C 003E024B 0022005F 003E024B 001C01EC
800060FC 003E025A 0022005F 003E025A 001C01FB
00050078
00050878
80006000 001E00B2 001E00B2 001E00B2 00000000
80006000 001E00CD 001E00B2 001E00CD 0000001B
80006000 001E0158 001E00B2 001E0158 000000A6
80006004 00200102 001E00B2 00200102 00020050
80006004 0020010F 001E00B2 0020010F 0002005D
80006004 002001FC 001E00B2 002001FC 0002014A
80006008 001E006F 001E006F 002001FC 0002018D
80006008 001E005C 001E005C 002001FC 000201A0
8000601C 003C0161 001E005C 003C0161 001E0105
8000601C 003C0267 001E005C 003C0267 001E020B
8000603C 003C026C 001E005C 003C026C 001E0210
80006048 001E005B 001E005B 003C026C 001E0211
00050878
00050078
80006000 0005B711 0005B711 0005B711 00000000
80006000 0005B6F3 0005B6F3 0005B711 0000001E
80006004 0005B721 0005B6F3 0005B721 0000002E
80006018 0005B732 0005B6F3 0005B732 0000003F
80006018 0005B6F1 0005B6F1 0005B732 00000041
80006058 0005B733 0005B6F1 0005B733 00000042
00051078
00051878
80006000 00016E0A 00016E0A 00016E0A 00000000
80006000 00016DDF 00016DDF 00016E0A 0000002B
80006000 00016DDE 00016DDE 00016E0A 0000002C
80006004 000224E4 00016DDE 000224E4 0000B706
8000601C 000224F0 00016DDE 000224F0 0000B712
00051878
00051078
40000000 40000000 40000000 40000000
00050078
00050078
40006000 002200C8 002200C8 002200C8 00000000
40006000 00220118 002200C8 00220118 00000050
40006004 002200BB 002200BB 00220118 0000005D
40006004 00220190 002200BB 00220190 000000D5
40006008 002200A2 002200A2 00220190 000000EE
4000600C 00220073 00220073 00220190 0000011D
40006010 0028009C 00220073 0028009C 00060029
40006010 002800AF 00220073 002800AF 0006003C
40006010 002800BC 00220073 002800BC 00060049
40006014 002800DD 00220073 002800DD 0006006A
4000601C 003E014D 00220073 003E014D 001C00DA
4000601C 003E015F 00220073 003E015F 001C00EC
4000601C 003E0175 00220073 003E0175 001C0102
4000601C 003E0255 00220073 003E0255 001C01E2
4000603C 003E025D 00220073 003E025D 001C01EA
400060AC 0022005F 0022005F 003E025D 001C01FE
00050078
00050878
40006000 001E010C 001E010C 001E010C 00000000
40006000 001E0109 001E0109 001E010C 00000003
40006000 001E00DD 001E00DD 001E010C 0000002F
40006004 002000D4 001E00DD 002000D4 0001FFF7
40006004 00200103 001E00DD 00200103 00020026
40006004 00200196 001E00DD 00200196 000200B9
40006008 001E00AD 001E00AD 00200196 000200E9
40006010 001E007C 001E007C 00200196 0002011A
4000601C 003C025F 001E007C 003C025F 001E01E3
40006020 001E0073 001E0073 003C025F 001E01EC
40006020 001E006F 001E006F 003C025F 001E01F0
4000603C 003C0267 001E006F 003C0267 001E01F8
40006040 001E0069 001E0069 003C0267 001E01FE
400060B0 001E0066 001E0066 003C0267 001E0201
400060D0 001E0057 001E0057 003C0267 001E0210
00050878
00050078
40006000 0005B712 0005B712 0005B712 00000000
40006000 0005B6F3 0005B6F3 0005B712 0000001F
40006000 0005B6F1 0005B6F1 0005B712 00000021
40006008 0005B716 0005B6F1 0005B716 00000025
4000600C 0005B71E 0005B6F1 0005B71E 0000002D
40006018 0005B729 0005B6F1 0005B729 00000038
4000601C 0005B72F 0005B6F1 0005B72F 0000003E
4000605C 0005B730 0005B6F1 0005B730 0000003F
40006078 0005B733 0005B6F1 0005B733 00000042
00051078
00051878
40006000 00016E0A 00016E0A 00016E0A 00000000
40006000 00016DDE 00016DDE 00016E0A 0000002C
40006004 000224E5 00016DDE 000224E5 0000B707
4000601C 000224F0 00016DDE 000224F0 0000B712
4000603C 000224F2 00016DDE 000224F2 0000B714
00051878
00051078
00016DDE 003E025D 003C947F
12345678

@@ -0,0 +1,289 @@
//-------------------------------------------------------------------
// Copyright (C) 2010 Netronome Systems
//-------------------------------------------------------------------
//d6004024 <ASMDELAY>:
//d6004024: e2500001 subs r0, r0, #1
//d6004028: 1afffffd bne d6004024 <ASMDELAY>
//d600402c: e12fff1e bx lr
#define ARM_TIMER_LOD 0x2000B400
#define ARM_TIMER_VAL 0x2000B404
#define ARM_TIMER_CTL 0x2000B408
#define ARM_TIMER_DIV 0x2000B41C
#define ARM_TIMER_CNT 0x2000B420
extern void PUT32 ( unsigned int, unsigned int );
extern unsigned int GET32 ( unsigned int );
extern void ASMDELAY ( unsigned int );
extern void uart_init(void);
extern void hexstrings ( unsigned int );
extern void hexstring ( unsigned int );
extern void HOP ( unsigned int, unsigned int );
extern unsigned int GET_CONTROL ( void );
extern void SET_CONTROL ( unsigned int );
extern void CLR_CONTROL ( unsigned int );
extern void start_l1cache ( void );
extern void stop_l1cache ( void );
extern void invalidate_l1cache ( void );
extern void PrefetchFlush ( void );
unsigned int gmin,gmax;
void do_it ( unsigned int base )
{
unsigned int ra;
unsigned int beg,end;
unsigned int rb;
unsigned int min,max;
unsigned int rc;
stop_l1cache(); //just in case
invalidate_l1cache();
hexstrings(base);
hexstrings(base);
hexstrings(base);
hexstring(base);
//pass 1: l1 icache off, branch prediction off
hexstring(GET_CONTROL());
CLR_CONTROL(1<<11); //bit 11, branch prediction
hexstring(GET_CONTROL());
max=0;
min=0; min--; //wraps to 0xFFFFFFFF
for(ra=base+0x6000;ra<base+0x6100;ra+=4)
{
unsigned int flag;
PUT32(ra+0x00,0xe2500001);
PUT32(ra+0x04,0x1afffffd);
PUT32(ra+0x08,0xe12fff1e);
GET32(ra+0x08);
PrefetchFlush();
for(rc=0;rc<4;rc++)
{
beg=GET32(ARM_TIMER_CNT);
HOP(0x20000,ra);
end=GET32(ARM_TIMER_CNT);
rb=end-beg;
flag=0;
if(rb>gmax) gmax=rb;
if(rb<gmin) gmin=rb;
if(rb>max) { flag++; max=rb; }
if(rb<min) { flag++; min=rb; }
if(flag)
{
hexstrings(ra);
hexstrings(rb);
hexstrings(min);
hexstrings(max);
hexstring(max-min);
}
}
}
//pass 2: l1 icache off, branch prediction on
hexstring(GET_CONTROL());
SET_CONTROL(1<<11);
hexstring(GET_CONTROL());
if(1)
{
max=0;
min=0; min--;
for(ra=base+0x6000;ra<base+0x6100;ra+=4)
{
unsigned int flag;
PUT32(ra+0x00,0xe2500001);
PUT32(ra+0x04,0x1afffffd);
PUT32(ra+0x08,0xe12fff1e);
GET32(ra+0x08);
PrefetchFlush();
for(rc=0;rc<4;rc++)
{
beg=GET32(ARM_TIMER_CNT);
HOP(0x20000,ra);
end=GET32(ARM_TIMER_CNT);
rb=end-beg;
flag=0;
if(rb>gmax) gmax=rb;
if(rb<gmin) gmin=rb;
if(rb>max) { flag++; max=rb; }
if(rb<min) { flag++; min=rb; }
if(flag)
{
hexstrings(ra);
hexstrings(rb);
hexstrings(min);
hexstrings(max);
hexstring(max-min);
}
}
}
}
//pass 3: l1 icache on, branch prediction off
hexstring(GET_CONTROL());
CLR_CONTROL(1<<11);
hexstring(GET_CONTROL());
ra=GET32(ARM_TIMER_CNT);
start_l1cache();
max=0;
min=0; min--;
for(ra=base+0x6000;ra<base+0x6100;ra+=4)
{
unsigned int flag;
PUT32(ra+0x00,0xe2500001);
PUT32(ra+0x04,0x1afffffd);
PUT32(ra+0x08,0xe12fff1e);
GET32(ra+0x08);
PrefetchFlush();
invalidate_l1cache();
for(rc=0;rc<4;rc++)
{
beg=GET32(ARM_TIMER_CNT);
HOP(0x20000,ra);
end=GET32(ARM_TIMER_CNT);
rb=end-beg;
flag=0;
if(rb>gmax) gmax=rb;
if(rb<gmin) gmin=rb;
if(rb>max) { flag++; max=rb; }
if(rb<min) { flag++; min=rb; }
if(flag)
{
hexstrings(ra);
hexstrings(rb);
hexstrings(min);
hexstrings(max);
hexstring(max-min);
}
}
}
//pass 4: l1 icache on, branch prediction on
hexstring(GET_CONTROL());
SET_CONTROL(1<<11);
hexstring(GET_CONTROL());
max=0;
min=0; min--;
for(ra=base+0x6000;ra<base+0x6100;ra+=4)
{
unsigned int flag;
PUT32(ra+0x00,0xe2500001);
PUT32(ra+0x04,0x1afffffd);
PUT32(ra+0x08,0xe12fff1e);
GET32(ra+0x08);
PrefetchFlush();
invalidate_l1cache();
for(rc=0;rc<4;rc++)
{
beg=GET32(ARM_TIMER_CNT);
HOP(0x20000,ra);
end=GET32(ARM_TIMER_CNT);
rb=end-beg;
flag=0;
if(rb>gmax) gmax=rb;
if(rb<gmin) gmin=rb;
if(rb>max) { flag++; max=rb; }
if(rb<min) { flag++; min=rb; }
if(flag)
{
hexstrings(ra);
hexstrings(rb);
hexstrings(min);
hexstrings(max);
hexstring(max-min);
}
}
}
hexstring(GET_CONTROL());
CLR_CONTROL(1<<11);
hexstring(GET_CONTROL());
stop_l1cache();
}
//-------------------------------------------------------------------------
int notmain ( void )
{
//unsigned int ra,rb;
unsigned int ra;
unsigned int beg,end;
//uart_init();
hexstrings(0x12345678);
hexstrings(0x12345678);
hexstrings(0x12345678);
hexstrings(0x12345678);
hexstring(0x12345678);
gmax=0;
gmin=0; gmin--;
PUT32(ARM_TIMER_CTL,0x00000000);
PUT32(ARM_TIMER_CTL,0x00000200); //free running counter on, prescaler 0
beg=GET32(ARM_TIMER_CNT);
ASMDELAY(100000);
end=GET32(ARM_TIMER_CNT);
hexstring(end-beg);
for(ra=0;ra<4;ra++)
{
beg=GET32(ARM_TIMER_CNT);
ASMDELAY(100000);
end=GET32(ARM_TIMER_CNT);
hexstring(end-beg);
}
start_l1cache();
for(ra=0;ra<4;ra++)
{
beg=GET32(ARM_TIMER_CNT);
ASMDELAY(100000);
end=GET32(ARM_TIMER_CNT);
hexstring(end-beg);
}
invalidate_l1cache();
for(ra=0;ra<4;ra++)
{
beg=GET32(ARM_TIMER_CNT);
ASMDELAY(10);
end=GET32(ARM_TIMER_CNT);
hexstring(end-beg);
}
invalidate_l1cache();
for(ra=0;ra<4;ra++)
{
beg=GET32(ARM_TIMER_CNT);
ASMDELAY(10);
end=GET32(ARM_TIMER_CNT);
hexstring(end-beg);
}
stop_l1cache();
do_it(0xC0000000);
do_it(0x80000000);
do_it(0x40000000);
hexstrings(gmin); hexstrings(gmax); hexstring(gmax-gmin);
hexstring(0x12345678);
return(0);
}

@@ -0,0 +1,12 @@
MEMORY
{
ram : ORIGIN = 0x8000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > ram
.bss : { *(.bss*) } > ram
}

@@ -0,0 +1,152 @@
//-------------------------------------------------------------------------
//-------------------------------------------------------------------------
#define PBASE 0x20000000
extern void PUT32 ( unsigned int, unsigned int );
extern void PUT16 ( unsigned int, unsigned int );
extern void PUT8 ( unsigned int, unsigned int );
extern unsigned int GET32 ( unsigned int );
extern void dummy ( unsigned int );
#define ARM_TIMER_CTL (PBASE+0x0000B408)
#define ARM_TIMER_CNT (PBASE+0x0000B420)
#define GPFSEL1 (PBASE+0x00200004)
#define GPSET0 (PBASE+0x0020001C)
#define GPCLR0 (PBASE+0x00200028)
#define GPPUD (PBASE+0x00200094)
#define GPPUDCLK0 (PBASE+0x00200098)
#define AUX_ENABLES (PBASE+0x00215004)
#define AUX_MU_IO_REG (PBASE+0x00215040)
#define AUX_MU_IER_REG (PBASE+0x00215044)
#define AUX_MU_IIR_REG (PBASE+0x00215048)
#define AUX_MU_LCR_REG (PBASE+0x0021504C)
#define AUX_MU_MCR_REG (PBASE+0x00215050)
#define AUX_MU_LSR_REG (PBASE+0x00215054)
#define AUX_MU_MSR_REG (PBASE+0x00215058)
#define AUX_MU_SCRATCH (PBASE+0x0021505C)
#define AUX_MU_CNTL_REG (PBASE+0x00215060)
#define AUX_MU_STAT_REG (PBASE+0x00215064)
#define AUX_MU_BAUD_REG (PBASE+0x00215068)
//GPIO14 TXD0 and TXD1
//GPIO15 RXD0 and RXD1
//------------------------------------------------------------------------
unsigned int uart_lcr ( void )
{
return(GET32(AUX_MU_LSR_REG));
}
//------------------------------------------------------------------------
unsigned int uart_recv ( void )
{
while(1)
{
if(GET32(AUX_MU_LSR_REG)&0x01) break;
}
return(GET32(AUX_MU_IO_REG)&0xFF);
}
//------------------------------------------------------------------------
unsigned int uart_check ( void )
{
if(GET32(AUX_MU_LSR_REG)&0x01) return(1);
return(0);
}
//------------------------------------------------------------------------
void uart_send ( unsigned int c )
{
while(1)
{
if(GET32(AUX_MU_LSR_REG)&0x20) break;
}
PUT32(AUX_MU_IO_REG,c);
}
//------------------------------------------------------------------------
void uart_flush ( void )
{
while(1)
{
if((GET32(AUX_MU_LSR_REG)&0x100)==0) break;
}
}
//------------------------------------------------------------------------
void hexstrings ( unsigned int d )
{
//unsigned int ra;
unsigned int rb;
unsigned int rc;
rb=32;
while(1)
{
rb-=4;
rc=(d>>rb)&0xF;
if(rc>9) rc+=0x37; else rc+=0x30;
uart_send(rc);
if(rb==0) break;
}
uart_send(0x20);
}
//------------------------------------------------------------------------
void hexstring ( unsigned int d )
{
hexstrings(d);
uart_send(0x0D);
uart_send(0x0A);
}
//------------------------------------------------------------------------
void uart_init ( void )
{
unsigned int ra;
PUT32(AUX_ENABLES,1);
PUT32(AUX_MU_IER_REG,0);
PUT32(AUX_MU_CNTL_REG,0);
PUT32(AUX_MU_LCR_REG,3);
PUT32(AUX_MU_MCR_REG,0);
PUT32(AUX_MU_IER_REG,0);
PUT32(AUX_MU_IIR_REG,0xC6);
PUT32(AUX_MU_BAUD_REG,270);
ra=GET32(GPFSEL1);
ra&=~(7<<12); //gpio14
ra|=2<<12; //alt5
ra&=~(7<<15); //gpio15
ra|=2<<15; //alt5
PUT32(GPFSEL1,ra);
PUT32(GPPUD,0);
for(ra=0;ra<150;ra++) dummy(ra);
PUT32(GPPUDCLK0,(1<<14)|(1<<15));
for(ra=0;ra<150;ra++) dummy(ra);
PUT32(GPPUDCLK0,0);
PUT32(AUX_MU_CNTL_REG,3);
}
//------------------------------------------------------------------------
void timer_init ( void )
{
//0xF9+1 = 250
//250MHz/250 = 1MHz
PUT32(ARM_TIMER_CTL,0x00F90000);
PUT32(ARM_TIMER_CTL,0x00F90200);
}
//-------------------------------------------------------------------------
unsigned int timer_tick ( void )
{
return(GET32(ARM_TIMER_CNT));
}
//-------------------------------------------------------------------------
//-------------------------------------------------------------------------
//-------------------------------------------------------------------------
//
// Copyright (c) 2012 David Welch dwelch@dwelch.com
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
//
//-------------------------------------------------------------------------

@@ -0,0 +1,136 @@
;@-------------------------------------------------------------------------
;@-------------------------------------------------------------------------
.globl _start
_start:
mov sp,#0x8000
bl notmain
hang: b hang
.globl PUT32
PUT32:
str r1,[r0]
bx lr
.globl GET32
GET32:
ldr r0,[r0]
bx lr
.globl dummy
dummy:
bx lr
.globl GETPC
GETPC:
mov r0,lr
bx lr
.globl BRANCHTO
BRANCHTO:
bx r0
.globl ASMDELAY
ASMDELAY:
subs r0,r0,#1
bne ASMDELAY
bx lr
bne ASMDELAY
bne ASMDELAY
bne ASMDELAY
nop
.globl HOP
HOP:
bx r1
.globl GET_CONTROL
GET_CONTROL:
MRC p15,0,r0,c1,c0,0
bx lr
.globl SET_CONTROL
SET_CONTROL:
MRC p15,0,r1,c1,c0,0
orr r1,r0
mov r0,#0
MCR p15, 0, r0, c7, c10, 4 ;@ gross overkill
MCR p15, 0, r0, c7, c10, 5 ;@ gross overkill
MCR p15, 0, r0, c7, c5, 4 ;@ gross overkill
MCR p15, 0, r0, c7, c5, 6 ;@ gross overkill
MCR p15,0,r1,c1,c0,0
MCR p15, 0, r0, c7, c10, 4 ;@ gross overkill
MCR p15, 0, r0, c7, c10, 5 ;@ gross overkill
MCR p15, 0, r0, c7, c5, 4 ;@ gross overkill
MCR p15, 0, r0, c7, c5, 6 ;@ gross overkill
bx lr
.globl CLR_CONTROL
CLR_CONTROL:
MRC p15,0,r1,c1,c0,0
bic r1,r0
MCR p15,0,r1,c1,c0,0
bx lr
.globl PrefetchFlush
PrefetchFlush:
mov r0,#0 ;@ operand should be zero
MCR p15, 0, r0, c7, c5, 4 ;@ flush prefetch buffer
MCR p15, 0, r0, c7, c5, 6 ;@ flush branch target cache
bx lr
.globl start_l1cache
start_l1cache:
mov r0, #0
mcr p15, 0, r0, c7, c7, 0 ;@ invalidate caches
mcr p15, 0, r0, c8, c7, 0 ;@ invalidate tlb
MCR p15, 0, r0, c7, c10, 4 ;@ DSB needed?
MCR p15, 0, r0, c7, c10, 5 ;@ DMB needed?
mrc p15, 0, r0, c1, c0, 0
orr r0,r0,#0x1000
mcr p15, 0, r0, c1, c0, 0
bx lr
.globl stop_l1cache
stop_l1cache:
mrc p15, 0, r0, c1, c0, 0
bic r0,r0,#0x1000
mcr p15, 0, r0, c1, c0, 0
bx lr
;@ these two sit after the bx lr above and are never executed
MCR p15, 0, r0, c7, c10, 4 ;@ DSB
MCR p15, 0, r0, c7, c10, 5 ;@ DMB
.globl invalidate_l1cache
invalidate_l1cache:
mov r0, #0
mcr p15, 0, r0, c7, c7, 0 ;@ invalidate caches
mcr p15, 0, r0, c8, c7, 0 ;@ invalidate tlb
MCR p15, 0, r0, c7, c5, 4 ;@ gross overkill
MCR p15, 0, r0, c7, c5, 6 ;@ gross overkill
MCR p15, 0, r0, c7, c10, 4 ;@ DSB needed?
MCR p15, 0, r0, c7, c10, 5 ;@ DMB needed?
bx lr
;@-------------------------------------------------------------------------
;@-------------------------------------------------------------------------
;@-------------------------------------------------------------------------
;@
;@ Copyright (c) 2012 David Welch dwelch@dwelch.com
;@
;@ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
;@
;@ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
;@
;@ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
;@
;@-------------------------------------------------------------------------