AN720  
PRECISION32™ OPTIMIZATION CONSIDERATIONS FOR  
CODE SIZE AND SPEED  
1. Introduction  
The code size and execution speed of a 32-bit MCU project can vary greatly depending on the way the code is  
written, the toolchain libraries used, and the compiler and linker options. This document addresses how to  
determine what portions of code are taking extra space or time and ways to optimize for space or speed for  
different toolchains, including GCC with redlib and newlib (Precision32 IDE) and Keil.  
2. Key Points  
The key topics of this document are:  
How to determine what portions of the project are taking the most space  
Ways to benchmark code execution speed  
Common strategies to reduce code size or improve execution speed  
Code startup time and ways to reduce it  
3. Using CoreMark™ as a Speed Benchmark  
CoreMark is a standard code base that can be ported to various processors to provide a speed benchmark. The  
CoreMark software provides a score that rates how fast the core and code are, providing a relative comparison  
between various toolchain options and settings. The CoreMark software package cannot be modified except for  
device-specific information in the portme files. For modes that do not support printf (nohosting libraries), the  
results were calculated using the value of the variable in code. See the CoreMark website for more information on  
the test and score reporting requirements (www.coremark.org).  
4. Non-Toolchain Considerations  
The coding style and technique can have a great effect on the overall size of the project.  
4.1. Coding Techniques  
There are many ways coding technique can affect code size, including library calls, inline code or data, or code  
optimizations made for global variables or pointers.  
For more information on writing C code for ARM architectures, see the following resources:  
EETimes - Energy efficient C code for ARM devices by Chris Shore:  
http://www.eetimes.com/design/embedded/4210470/Efficient-C-Code-for-ARM-Devices  
Compiler Coding Practices - ARM:  
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0472c/CJAFJCFG.html  
These guidelines will largely apply regardless of the compiler used for the project.  
4.2. Number of Function Parameters  
Functions with either Keil or GCC can have as many parameters as desired. In general, the first four parameters  
are passed to the function efficiently using registers. Any additional parameters beyond four must be moved on or  
off the stack, which results in extra code size for each additional parameter and extra time to execute those  
instructions. If possible, keeping functions to no more than four parameters can help reduce code size and  
execution time.  
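As a hedged illustration of the four-parameter guideline, the sketch below bundles related arguments into a structure and passes a single pointer instead of six separate values. The type and function names (uart_config_t, uart_checksum_args, uart_checksum_struct) are hypothetical and are not part of the si32HAL; the arithmetic is a placeholder for real work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical configuration bundle; field names are illustrative only. */
typedef struct {
    uint32_t baud;
    uint8_t  data_bits;
    uint8_t  stop_bits;
    uint8_t  parity;
    uint8_t  flow_control;
    uint16_t rx_timeout;
} uart_config_t;

/* Six scalar parameters: on ARM, the first four are passed in registers
 * and the remaining two are typically passed on the stack. */
uint32_t uart_checksum_args(uint32_t baud, uint8_t data_bits,
                            uint8_t stop_bits, uint8_t parity,
                            uint8_t flow_control, uint16_t rx_timeout)
{
    return baud + data_bits + stop_bits + parity + flow_control + rx_timeout;
}

/* One pointer parameter: the whole bundle is reached through one register. */
uint32_t uart_checksum_struct(const uart_config_t *cfg)
{
    assert(cfg != NULL);
    return cfg->baud + cfg->data_bits + cfg->stop_bits + cfg->parity
         + cfg->flow_control + cfg->rx_timeout;
}
```

The structure version trades stack traffic at each call site for memory loads inside the callee, so the benefit depends on how often the function is called and should be confirmed in the map file.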
Rev. 0.1 9/12  
Copyright © 2012 by Silicon Laboratories  
4.3. Alignment  
In most cases, Cortex-M3 linkers place code in memory efficiently. In some projects, however, the alignment of  
functions and code can be carefully managed manually to reduce code size or change code execution speed. For  
example, if two functions in the same file call each other, but one ends up in flash and one ends up in RAM, the  
compiler may need to place extra code to perform a long jump and take longer to execute that jump. If needed,  
functions and variables can be explicitly located using scatterfiles and linker flags. More information on linker  
scripts and scatterfiles can be found on the Code Red website  
(http://support.code-red-tech.com/CodeRedWiki/OwnLinkScripts) and the ARM website  
(http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.kui0101a/armlink_babddhbf.htm).  
4.4. RAM Size  
The RAM size of a project can be just as important as the code size. In particular, the default configurations for  
SiM3xxxx projects place the stack at the top of memory growing down and the heap at the end of program data  
growing up. If too much of the RAM is used by program data, then the stack and heap may collide, leading to  
difficult-to-debug issues at run time. Projects should always leave enough RAM space to accommodate the  
function-calling depth of the code.  
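One common way to check how much stack a project really uses is stack painting, a technique this document does not describe but which fits the collision concern above. The sketch below assumes a stack that grows down from the top of its region; the function names and the fill value are hypothetical, and the region is passed as a plain pointer so the code can be tried on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xDEADBEEFu   /* arbitrary marker value */

/* Fill the whole stack region with the marker before the stack is in use. */
void stack_paint(uint32_t *stack_low, size_t words)
{
    assert(stack_low != NULL);
    for (size_t i = 0; i < words; i++) {
        stack_low[i] = STACK_FILL_PATTERN;
    }
}

/* Count the marker words that survive at the low end of the region.
 * Because the stack grows down from the top, these words were never used. */
size_t stack_unused_words(const uint32_t *stack_low, size_t words)
{
    size_t unused = 0;

    assert(stack_low != NULL);
    while (unused < words && stack_low[unused] == STACK_FILL_PATTERN) {
        unused++;
    }
    return unused;
}
```

Calling stack_unused_words() after the application has run through its deepest call paths gives an estimate of the remaining margin between the stack and the heap.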
4.5. SiM3xxxx Core and Flash Access Speed  
At the maximum device AHB speed, an SiM3xxxx device reading flash every pipeline cycle may violate the  
maximum flash access speed. To compensate for this, the FLASHCTRL module has controls to reduce the flash  
access speed (SPMD and RDSEN). Depending on the code density and make-up (i.e., 16-bit or 32-bit  
instructions), this may lead to stalls in the core before the next instructions can be fetched from flash. Executing at  
high speeds with strings of 16-bit instructions may yield the fastest core operation.  
4.6. SiM3xxxx Core and the Direct Memory Access (DMA) Module  
On SiM3xxxx devices, the core and the DMA can access multiple AHB slaves at the same time without any  
performance degradation. If the core and DMA access the same AHB slave at the same time (i.e., RAM), then the  
AHB has priority-based arbitration in the following precedence:  
1. Core data fetch  
2. DMA  
3. Core instruction fetch  
If multiple DMA channels are active at the same time and accessing the same memory areas as the core, this  
could lead to a reduction in core execution speed.  
5. Precision32 IDE (redlib and newlib)  
This section discusses ways to optimize projects using the Precision32 IDE and both redlib and newlib libraries.  
The Precision32 GCC tools used for the code size and execution speed testing discussed in this document are  
ARM/embedded-4_6-branch revision 182083 (http://gcc.gnu.org/svn/gcc/branches/ARM/embedded-4_6-branch/)  
with newlib v1.19 and Redlib v2 (Precision32 IDE v4.2.1 [Build 73]).  
5.1. Reading the Map File  
The first step in the code size optimization process is to analyze the project map file and determine what portions of  
code take the most space.  
The map file is an output of the linker that shows the size of each function and variable and their positions in  
memory. This map file is located in the build files for a project.  
In addition to the functions, the map file includes information on variables and other symbols, including unused  
functions that are removed.  
For a Precision32 IDE Debug build, the map file is located in the project’s Debug directory. Figure 1 shows an  
excerpt of the sim3u1xx_Blinky redlib Debug example map file.  
For each function in the project, the map file lists the starting address and the length. For example, the  
my_rtc_alarm0_handler function starts at address 0x0000_04D4 and occupies 0x70 bytes of memory.  
Figure 1. sim3u1xx_Blinky Precision32 Debug Map File Example  
5.2. Determining a Project’s Code Size  
Each project’s library and function usage is different. Analyzing the project’s makeup can help determine the most  
effective way to reduce code space.  
All Precision32 SDK projects automatically output the code and RAM size after a build. To modify this output in the  
Precision32 IDE:  
1. Right-click on the project_name in the Project Explorer view.  
2. Select Properties.  
3. In the C/C++ Build → Settings → Build Steps tab, remove or add the following in the Post-build  
steps → Command box: arm-none-eabi-size "${BuildArtifactFileName}"  
After building the si32HAL 1.0.1 sim3u1xx_Blinky example, the IDE outputs:  
   text    data     bss     dec     hex  
  13312       4     344   13660    355c  
The areas of memory are:  
text: code and read-only memory in decimal  
data: read-write data in decimal  
bss: zero-initialized data in decimal  
dec: total of text, data, and bss in decimal  
hex: total of text, data, and bss in hex  
More information about the size tool can be found on the Code Red website  
(http://support.code-red-tech.com/CodeRedWiki/FlashRamSize).  
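The size report can be turned into flash and RAM figures with a little arithmetic. The sketch below assumes the usual GCC arrangement, in which the initial values of the read-write data are stored in flash and copied to RAM at startup; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Section sizes as reported by arm-none-eabi-size. */
typedef struct {
    uint32_t text;  /* code and read-only data */
    uint32_t data;  /* initialized read-write data */
    uint32_t bss;   /* zero-initialized data */
} size_report_t;

/* Flash holds the code plus the initial values of the read-write data. */
uint32_t flash_bytes(const size_report_t *r)
{
    assert(r != NULL);
    return r->text + r->data;
}

/* RAM holds the read-write and zero-initialized data. Stack and heap
 * usage are not part of this figure. */
uint32_t ram_bytes(const size_report_t *r)
{
    assert(r != NULL);
    return r->data + r->bss;
}
```

For the sim3u1xx_Blinky output above (text 13312, data 4, bss 344), this gives 13316 bytes of flash and 348 bytes of statically allocated RAM.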
Figure 2. Automatically Reporting Project Size on Project Build in Precision32  
5.3. Toolchain Library Usage  
Some toolchains have multiple libraries or settings that can change the size or execution speed of code. The  
Precision32 tools have six options:  
newlib (standard GCC) with no standard I/O  
newlib (standard GCC) with nohosting standard I/O  
newlib (standard GCC) with semihosting standard I/O  
redlib (GCC) with no standard I/O  
redlib (GCC) with nohosting standard I/O  
redlib (GCC) with semihosting standard I/O  
The semihosting libraries have additional hooks to enable a project to send debugging information to an IDE  
running on a PC. The nohosting libraries have this additional capability removed. The none versions of the  
toolchains have no standard I/O capability (i.e., no printf()).  
For some example projects (like si32HAL 1.0.1 sim3u1xx_Blinky), the compile-time library can be modified by  
opening the myLinkerOptions_p32.ld file in the project directory and changing the uncommented line.  
Figure 3. Using the myLinkerOptions_p32.ld File to Select the Project Library  
Each of the four lines in the file corresponds to a library:  
GROUP(libgcc.a libc.a libm.a libcr_newlib_nohost.a) (line 4): newlib nohosting  
GROUP(libgcc.a libc.a libm.a libcr_newlib_semihost.a) (line 5): newlib semihosting  
GROUP(libcr_semihost.a libcr_c.a libcr_eabihelpers.a) (line 6): redlib semihosting  
GROUP(libcr_nohost.a libcr_c.a libcr_eabihelpers.a) (line 7): redlib nohosting  
The none libraries do not have corresponding entries in this file. Add these lines to add support for none:  
GROUP(libgcc.a libc.a libm.a): newlib none  
GROUP(libcr_c.a libcr_eabihelpers.a): redlib none  
After setting the myLinkerOptions_p32.ld file to the correct setting, set the IDE to the same library using these  
steps:  
1. Right-click on the project_name in the Project Explorer view.  
2. Select Properties.  
3. Click on C/C++ Build → Settings → Tool Settings tab → MCU Linker → Target and select the desired  
library from the Use C library drop-down menu. Figure 4 shows this dialog in the Precision32 IDE.  
4. Clean and Build the project.  
AppBuilder projects do not have a myLinkerOptions_p32.ld file and can use the Quickstart view setting only.  
Figure 4. Using the Precision32 IDE to Select the Project Library  
Using the sim3u1xx_Blinky and demo_si32UsbAudio default examples in the si32HAL 1.0.1 software package,  
Table 1 and Table 2 show the relative Debug build sizes with the different toolchain library options. Table 3 shows  
the Debug build sizes for CoreMark, and Table 4 shows the relative CoreMark speed scores for each of these  
library options.  
For the newlib and redlib none libraries, see “5.4. Function Library Usage”.  
Table 1. Precision32 Toolchain Library Usage Comparison—sim3u1xx_Blinky Debug  
Library               Code + Read-Only Data (bytes)     Read-Write Data (bytes)   Zero-Initialized Data (bytes)  
newlib semihosting    35564                             2248                      124  
newlib nohosting      34864                             2248                      68  
newlib none           N/A (requires printf() removal)  
redlib semihosting    13080                             4                         344  
redlib nohosting      13136                             4                         344  
redlib none           N/A (requires printf() removal)  
Table 2. Precision32 Toolchain Library Usage Comparison—demo_si32UsbAudio Debug  
Library               Code + Read-Only Data (bytes)     Read-Write Data (bytes)   Zero-Initialized Data (bytes)  
newlib semihosting    108844                            6944                      11904  
newlib nohosting      108144                            6944                      11848  
newlib none           N/A (requires printf() removal)  
redlib semihosting    76176                             4704                      12124  
redlib nohosting      76120                             4704                      12124  
redlib none           N/A (requires printf() removal)  
Table 3. Precision32 Toolchain Library Usage Comparison—CoreMark Debug Size  
Library               Code + Read-Only Data (bytes)     Read-Write Data (bytes)   Zero-Initialized Data (bytes)  
newlib semihosting    46900                             2352                      2140  
newlib nohosting      46208                             2352                      2084  
newlib none           N/A (requires printf() removal)  
redlib semihosting    24400                             112                       2360  
redlib nohosting      24344                             112                       2360  
redlib none           N/A (requires printf() removal)  
Table 4. Precision32 Toolchain Library Usage Comparison—CoreMark Debug Speed  
Library               CoreMark Score  
newlib semihosting    CoreMark 1.0 : 37.571643 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
newlib nohosting      CoreMark 1.0 : 37.571643 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
newlib none           N/A (requires printf() removal)  
redlib semihosting    CoreMark 1.0 : 37.571643 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
redlib nohosting      CoreMark 1.0 : 37.571643 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
redlib none           N/A (requires printf() removal)  
5.4. Function Library Usage  
Function libraries such as floating point math and printf() can significantly increase the size of a project. If a project  
is constrained by size, a careful analysis of the usage of these large libraries may be required. For example,  
floating point can often be approximated well by fixed point math, eliminating the need for the floating point  
libraries.  
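As a minimal sketch of the fixed-point approach, the code below uses a Q16.16 format (16 integer bits, 16 fractional bits) stored in a 32-bit integer. The type and function names are hypothetical; a real project would choose the format to suit its own range and precision needs.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t q16_16_t;                 /* 16 integer bits, 16 fractional bits */
#define Q16_ONE ((q16_16_t)65536)         /* the value 1.0 in Q16.16 */

/* Convert an integer to Q16.16. The integer must fit in 16 signed bits. */
q16_16_t q16_from_int(int32_t x)
{
    assert(x >= -32768 && x <= 32767);
    return (q16_16_t)(x * Q16_ONE);
}

/* Multiply two Q16.16 values using a 64-bit intermediate product.
 * The caller must keep the result within the Q16.16 range. */
q16_16_t q16_mul(q16_16_t a, q16_16_t b)
{
    return (q16_16_t)(((int64_t)a * (int64_t)b) / Q16_ONE);
}

/* Return the integer part, truncated toward zero. */
int32_t q16_to_int(q16_16_t x)
{
    return x / Q16_ONE;
}
```

For example, q16_mul(q16_from_int(3), Q16_ONE / 2) represents 3 x 0.5 and returns 98304, which is 1.5 in Q16.16, without pulling in any floating point library code.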
The printf() library is often needed by projects for debugging or release code. If printf() is used for debugging  
purposes, using a defined symbol in the project to remove printf() when compiling a release build can dramatically  
reduce the size of a project. To define a symbol to differentiate between a Debug project and a Release project,  
see “5.7. Reset Sequence”. The code can then use #ifdef...#endif preprocessor statements to remove debugging  
code or printf() calls.  
The removal of debugging printf() statements can dramatically reduce the code size of a project. A simple way to  
do this is to redefine the printf function at the top of the file containing the printf() calls using the following  
statement:  
#define printf(args...)  
For si32Library examples such as demo_si32UsbAudio, define the statement at the top of myBuildOptions.h to  
remove all calls to printf() with higher optimization settings. Additionally, reduce the code size footprint by disabling  
logging in myBuildOptions.h:  
#define si32BuildOption_enable_logging 0  
This method preserves the printf() statements for later use, if needed. The printf() define can also be  
encapsulated with preprocessor #if statements to automatically include this define when building with a Release  
configuration.  
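The conditional form described above might look like the following sketch. RELEASE_BUILD is a hypothetical symbol that would be defined only in the Release configuration, and report_status() is an illustrative function. The sketch uses the standard C99 variadic macro form, printf(...), rather than the GNU args... form shown earlier.

```c
#include <assert.h>
#include <stdio.h>

/* RELEASE_BUILD is a hypothetical symbol; define it only in the Release
 * build configuration (for example with -DRELEASE_BUILD). */
#ifdef RELEASE_BUILD
#undef printf
#define printf(...) ((void)0)        /* debug output compiles to nothing */
#endif

/* Hypothetical helper that logs a status value during development. */
int report_status(int code)
{
    assert(code >= 0);               /* illustrative precondition only */
    printf("status = %d\n", code);   /* removed when RELEASE_BUILD is defined */
    return code;
}
```

Because the macro discards its arguments, any expression with side effects inside a printf() call is no longer evaluated in the Release build.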
When removing printf() for use with newlib none or redlib none, all references to printf() and stdio.h must be  
commented out of the project. The none libraries cannot be used with si32Library projects.  
To verify that all instances of printf() have been removed, search the project's map file for the printf library. In  
the sim3u1xx_Blinky example, removing printf() means adding the #define statement to both the main.c and gCpu.c files.  
Instead of using standard printf(), which can have a high library cost, use integer-only print functions like iprintf()  
for newlib projects. For redlib projects in the Precision32 IDE, create a define CR_INTEGER_PRINTF in the project  
properties to force an integer-only version of printf(). For instances of printf() with a fixed-string, using puts() can  
dramatically reduce code size.  
More information about redlib and printf() can be found on the Code Red website:  
http://support.code-red-tech.com/CodeRedWiki/UsingPrintf.  
If a project does not use any standard I/O functions, use the redlib or newlib none toolchain option to reduce code  
size as discussed in “5.3. Toolchain Library Usage”.  
Using the sim3u1xx_Blinky default example in the si32HAL 1.0.1 software package, Table 5 shows the relative  
build sizes with the different printf() settings. The demo_si32UsbAudio comparison is not included since printf()  
removal requires higher optimization settings or code modifications. This section also does not include the  
CoreMark tests since printf is not part of the CoreMark benchmark.  
Table 5. Precision32 printf() Comparison—sim3u1xx_Blinky Debug  
Library                                                  Code + Read-Only Data (bytes)   Read-Write Data (bytes)   Zero-Initialized Data (bytes)  
newlib semihosting with printf                           35564                           2248                      124  
newlib nohosting with printf                             34864                           2248                      68  
newlib nohosting with integer printf (iprintf)           19800                           2248                      68  
newlib nohosting with puts instead of printf             8784                            2120                      68  
newlib nohosting without printf                          2064                            4                         8  
newlib none with all calls to stdio and printf removed   2064                            4                         8  
redlib semihosting with printf                           12880                           4                         344  
redlib nohosting with printf                             12824                           4                         344  
redlib nohosting with integer printf (CR_INTEGER_PRINTF) 8111                            4                         344  
redlib nohosting with puts instead of printf             4004                            4                         344  
redlib nohosting without printf                          3868                            4                         344  
redlib none with all calls to stdio and printf removed   2068                            4                         8  
5.5. Toolchain Optimization Settings  
In addition to the library types, each toolchain has multiple optimization settings that can affect the resulting code  
size. With the Precision32 toolchain, code optimization can be set by following these steps:  
1. Right-click on the project_name in the Project Explorer view.  
2. Select Properties.  
3. In the C/C++ Build → Settings → Tool Settings tab → MCU C Compiler → Optimization options, select  
the desired optimization level.  
Figure 5 shows the optimization settings for the Precision32 IDE. Level -O0 has the least optimization, while -O3  
has the most optimization. An additional flag (-Os) allows for specific optimization for code size.  
More information on the optimization levels can be found on the Code Red website  
(http://support.code-red-tech.com/CodeRedWiki/CompilerOptimization) and the GCC website  
(http://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Optimize-Options.html). Declaring a variable as volatile will prevent the compiler from optimizing out the variable.  
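A software delay loop is a typical case where volatile matters at higher optimization levels. In the sketch below, the function name delay_loop is hypothetical; the volatile qualifier forces the compiler to perform every read and write of the counter, so the loop is not removed as dead code.

```c
#include <assert.h>
#include <stdint.h>

/* Busy-wait for a number of loop iterations. Without volatile, an
 * optimizing compiler may remove the loop because it has no visible effect. */
uint32_t delay_loop(uint32_t iterations)
{
    volatile uint32_t counter = 0;

    while (counter < iterations) {
        counter++;                    /* each access must actually be performed */
    }
    assert(counter == iterations);    /* post-condition of the loop */
    return counter;
}
```

The time taken per iteration still changes with the optimization level, so a loop like this is only suitable for coarse delays.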
Figure 5. Setting the Project Optimization in the Precision32 IDE  
The Precision32 IDE has two build configurations by default: Debug and Release. These build configurations have  
predefined optimization levels (None for Debug, -O2 for Release). To switch between the two configurations:  
1. Right-click on the project_name in the Project Explorer view.  
2. Select Build Configurations → Set Active and select between Debug and Release.  
Figure 6. Selecting the Active Build Configuration in the Precision32 IDE  
To change the settings of any build configuration:  
1. Right-click on the project_name in the Project Explorer view.  
2. Select Properties.  
3. In the C/C++ Build → Settings → Tool Settings tab options, select the build configuration at the top and  
the desired build configuration options.  
Using the sim3u1xx_Blinky and demo_si32UsbAudio default examples in the si32HAL 1.0.1 software package,  
Table 6 and Table 7 show the relative Debug build sizes with the different optimization level settings. Table 8 shows  
the CoreMark Debug build sizes, and Table 9 lists the CoreMark speed scores for these optimization levels.  
Table 6. Precision32 Toolchain Optimization Comparison—sim3u1xx_Blinky Debug  
Library                 Code + Read-Only Data (bytes)   Read-Write Data (bytes)   Zero-Initialized Data (bytes)  
newlib nohosting -O0    34864                           2248                      68  
newlib nohosting -O1    34032                           2248                      68  
newlib nohosting -O2    33960                           2248                      68  
newlib nohosting -O3    33960                           2248                      68  
newlib nohosting -Os    33808                           2248                      68  
redlib nohosting -O0    13080                           4                         344  
redlib nohosting -O1    12056                           4                         344  
redlib nohosting -O2    12096                           4                         344  
redlib nohosting -O3    12096                           4                         344  
redlib nohosting -Os    11768                           4                         344  
Table 7. Precision32 Toolchain Optimization Comparison—demo_si32UsbAudio Debug  
Library                 Code + Read-Only Data (bytes)   Read-Write Data (bytes)   Zero-Initialized Data (bytes)  
newlib nohosting -O0    108144                          6944                      11848  
newlib nohosting -O1    84400                           6944                      11852  
newlib nohosting -O2    83152                           6944                      11852  
newlib nohosting -O3    85136                           6944                      11856  
newlib nohosting -Os    76528                           6928                      11848  
redlib nohosting -O0    76120                           4704                      12124  
redlib nohosting -O1    52048                           4700                      12124  
redlib nohosting -O2    50752                           4700                      12124  
redlib nohosting -O3    52736                           4700                      12128  
redlib nohosting -Os    44128                           4688                      12120  
Table 8. Precision32 Toolchain Optimization Comparison—CoreMark Debug Size  
Library                   Code + Read-Only Data (bytes)   Read-Write Data (bytes)   Zero-Initialized Data (bytes)  
newlib semihosting -O0    46900                           2352                      2140  
newlib semihosting -O1    41812                           2256                      2140  
newlib semihosting -O2    42828                           2256                      2140  
newlib semihosting -O3    45948                           2256                      2140  
newlib semihosting -Os    40284                           2256                      2140  
redlib nohosting -O0      24344                           112                       2360  
redlib nohosting -O1      19160                           12                        2360  
redlib nohosting -O2      20176                           12                        2360  
redlib nohosting -O3      23296                           12                        2360  
redlib nohosting -Os      17624                           12                        2360  
Table 9. Precision32 Toolchain Optimization Comparison—CoreMark Debug Speed  
Library                   CoreMark Score  
newlib semihosting -O0    CoreMark 1.0 : 36.478654 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
newlib semihosting -O1    CoreMark 1.0 : 79.807436 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
newlib semihosting -O2    CoreMark 1.0 : 107.984518 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
newlib semihosting -O3    CoreMark 1.0 : 103.509985 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
newlib semihosting -Os    CoreMark 1.0 : 87.64509 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
redlib nohosting -O0      CoreMark 1.0 : 37.571643 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
redlib nohosting -O1      CoreMark 1.0 : 79.998784 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
redlib nohosting -O2      CoreMark 1.0 : 107.984518 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
redlib nohosting -O3      CoreMark 1.0 : 103.509985 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
redlib nohosting -Os      CoreMark 1.0 : 87.64509 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
5.6. Unused Code Removal  
Each file in a project becomes an object that is included. In other words, if any functions in a file are used, then the  
entire file is included by default. This can become an issue for a project using the si32HAL and only a few functions  
from each module.  
Removed (unused) functions can be viewed in the map files for the projects.  
For Precision32, the -ffunction-sections and -fdata-sections optimization flags place each function and data item  
into its own section in the object file before linking. This allows the linker to discard any unused functions and  
data. These flags are present in Example and AppBuilder projects by default and should be configured  
on a file-by-file basis. To add or remove these options for a file:  
1. Right-click on the file_name in the Project Explorer view.  
2. Select Properties.  
3. In the C/C++ Build → Settings → Tool Settings tab → MCU C Compiler → Miscellaneous options, add or  
remove the -ffunction-sections and -fdata-sections flags after the -fno-builtin flag in the Other flags  
text box.  
Figure 7. Modifying the Remove Unused Code Compiler Flags in the Precision32 IDE  
These flags must be used together with the --gc-sections linker option, which is enabled by default in the  
Precision32 IDE. It is recommended that this linker option always remain enabled. The compiler flags are  
beneficial only in some cases and may cause larger code size and slower execution in others.  
Using the sim3u1xx_Blinky and demo_si32UsbAudio default examples in the si32HAL 1.0.1 software package,  
Table 10 and Table 11 show the relative Debug build sizes with different unused code removal settings. For no  
unused code removal, the projects were compiled without -ffunction-sections and -fdata-sections and with  
--gc-sections. For the examples with unused code removal, the projects were compiled with -ffunction-sections,  
-fdata-sections, and --gc-sections. Table 12 shows the CoreMark build sizes, and Table 13 shows the CoreMark  
scores for the different unused code removal settings.  
Table 10. Precision32 Unused Code Removal Comparison—sim3u1xx_Blinky Debug  
Library                                         Code + Read-Only Data (bytes)   Read-Write Data (bytes)   Zero-Initialized Data (bytes)  
newlib nohosting with no unused code removal    35504                           2248                      68  
newlib nohosting with unused code removal       35112                           2248                      68  
redlib nohosting with no unused code removal    13472                           4                         344  
redlib nohosting with unused code removal       13080                           4                         344  
Table 11. Precision32 Unused Code Removal Comparison—demo_si32UsbAudio Debug  
Library                                         Code + Read-Only Data (bytes)   Read-Write Data (bytes)   Zero-Initialized Data (bytes)  
newlib nohosting with no unused code removal    122424                          7240                      12116  
newlib nohosting with unused code removal       108144                          6944                      11848  
redlib nohosting with no unused code removal    90288                           5000                      12392  
redlib nohosting with unused code removal       76120                           4704                      12124  
Table 12. Precision32 Unused Code Removal Comparison—CoreMark Debug Size  
Library                                           Code + Read-Only Data (bytes)   Read-Write Data (bytes)   Zero-Initialized Data (bytes)  
newlib semihosting with no unused code removal    47188                           2368                      2140  
newlib semihosting with unused code removal       46900                           2352                      2140  
redlib nohosting with no unused code removal      24656                           124                       2360  
redlib nohosting with unused code removal         24344                           112                       2360  
Table 13. Precision32 Unused Code Removal Comparison—CoreMark Debug Speed  
Library                                           CoreMark Score  
newlib semihosting with no unused code removal    CoreMark 1.0 : 37.452232 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
newlib semihosting with unused code removal       CoreMark 1.0 : 37.571643 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
redlib nohosting with no unused code removal      CoreMark 1.0 : 37.875848 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
redlib nohosting with unused code removal         CoreMark 1.0 : 37.571643 / GCC4.6.2 20110921 (release) [ARM/embedded-4_6-branch revision 182083] Iterations=3000 / STACK  
5.7. Reset Sequence  
The speed of the reset sequence of a device can be an important factor, especially for devices like the SiM3U1xx/  
SiM3C1xx that require a reset to exit the lowest power mode.  
After the hardware jumps to the reset vector and loads the stack pointer address, the core must initialize the  
memory of the device. This involves copying data from flash to RAM and zero-filling any zero-initialized segments.  
Then, the reset code typically calls a system initialization function and jumps to main.  
This reset sequence may take different times based on the library used with the project. The startup code should  
always be compiled with the fastest speed optimization to ensure it takes as little time as possible.  
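The memory initialization step described above can be sketched as two small loops. This is not the si32HAL startup code: the function names are hypothetical, and the regions are passed as plain pointers so the sketch can be tried on a host. In a real startup file, the addresses and lengths normally come from symbols defined in the linker script.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy initialized data from its load address (flash) to its run
 * address (RAM), one 32-bit word at a time. */
void startup_copy_data(uint32_t *dst, const uint32_t *src, size_t words)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < words; i++) {
        dst[i] = src[i];
    }
}

/* Zero-fill the zero-initialized region. */
void startup_zero_bss(uint32_t *dst, size_t words)
{
    assert(dst != NULL);
    for (size_t i = 0; i < words; i++) {
        dst[i] = 0u;
    }
}
```

Both loops scale with the amount of initialized and zero-initialized data, which is one reason the reset time depends on the library and on how much static data a project carries.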
The si32HAL examples add a ~500 ms delay to a pin reset event to prevent code from switching to a  
non-existent clock source and disabling the device. This delay can be removed by defining the  
si32HalOption_disable_pin_reset_delay symbol in the project.  
To define a symbol in the Precision32 IDE:  
1. Right-click on the project_name in the Project Explorer view.  
2. Select Properties.  
3. In the C/C++ Build → Settings → Tool Settings tab → MCU C Compiler → Settings options, add or  
remove the symbol in the Defined symbols (-D) area.  
Figure 8. Adding a Project Define Symbol in the Precision32 IDE  
Table 14 shows the reset time comparison for the toolchain libraries using the fastest speed optimization on the  
startup code. This time was measured on an oscilloscope using the sim3u1xx_Blinky example in Debug mode, from the  
fall of a port pin at the beginning of the Reset IRQ handler to the fall of a port pin at the beginning of main().  
This test requires modification of the si32HAL startup sequence file startup_<device>_p32.c.  
Table 14. Precision32 Toolchain Library Usage Comparison—sim3u1xx_Blinky Debug Reset  
Sequence  
Library                              Reset Time (µs)  
newlib semihosting with printf()     242  
newlib nohosting with printf()       236  
newlib none with printf() removed    9.4  
redlib semihosting with printf()     90  
redlib nohosting with printf()       90  
redlib none with printf() removed    9.4  
6. ARM/Keil µVision  
This section discusses ways to optimize projects using the Keil or ARM toolchain in the µVision IDE. The Keil  
µVision tools used for the code size and execution speed testing discussed in this document are version  
v4.1.0.894.  
6.1. Reading the Map File  
The map file is an output of the linker that shows the size of each function and variable and their positions in  
memory. This map file is located in the build files for a project. In addition to the functions, the map file includes  
information on variables and other symbols, including unused functions that are removed.  
Figure 9 shows an excerpt from the sim3u1xx_Blinky map file from the Keil toolchain. The functions are listed with  
a base address and size. In this case, the my_rtc_alarm0_handler function occupies 50 bytes at address 0x0000_03A5.  
Figure 9. sim3u1xx_Blinky µVision Map File Example  
6.2. Determining a Project’s Code Size  
The Keil µVision IDE automatically displays the code size information at the end of a successful build. After  
building the si32HAL 1.0.1 sim3u1xx_Blinky example, the IDE outputs:  
Program Size: Code=1968 RO-data=296 RW-data=24 ZI-data=1536  
".\build\BlinkyApp.axf" - 0 Error(s), 0 Warning(s).  
The areas of memory are:  
Code: all program code in decimal  
RO-data: read-only data located in flash in decimal  
RW-data: read-write initialized data located in RAM in decimal  
ZI-data: zero-initialized data located in RAM in decimal  
6.3. Toolchain Library Usage  
Some toolchains have multiple libraries or settings that can change the size or execution speed of code.  
The Keil µVision tools have two options: standard and MicroLIB. To switch between the two:  
1. Right-click on the project_name in the Project window and select Options for Target ‘project_name’ or  
go to Project → Options for Target ‘project_name’.  
2. Select the Target tab.  
3. Use the Use MicroLIB checkbox to select the library.  
Figure 10 shows this dialog in the µVision IDE.  
Figure 10. Using the µVision IDE to Select the Project Library  
Using the sim3u1xx_Blinky and demo_si32UsbAudio default examples in the si32HAL 1.0.1 software package,  
Table 15 and Table 16 show the relative Debug build sizes with the different toolchain library options. Table 17  
shows the Debug build sizes for CoreMark, and Table 18 shows the relative CoreMark speed scores for each of  
these library options.  
Table 15. Keil Toolchain Library Usage Comparison—sim3u1xx_Blinky Debug  
Library           Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision standard  2296          312                     24                       1632
µVision MicroLIB  2068          296                     24                       1536
Table 16. Keil Toolchain Library Usage Comparison—demo_si32UsbAudio Debug  
Library           Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision standard  51176         4388                    5196                     18068
µVision MicroLIB  47264         3832                    5208                     17972
Table 17. Keil Toolchain Library Usage Comparison—CoreMark Debug Size  
Library           Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision standard  13860         868                     156                      3632
µVision MicroLIB  11276         636                     156                      3536
Table 18. Keil Toolchain Library Usage Comparison—CoreMark Debug Speed  
Library           CoreMark Score
µVision standard  CoreMark 1.0 : 65.602324 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
µVision MicroLIB  CoreMark 1.0 : 69.402323 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
6.4. Function Library Usage  
The removal of debugging printf() statements can dramatically reduce the code size of a project. A simple way to  
do this is to redefine the printf function at the top of the file containing the printf() calls using the following  
statement:  
#define printf(args...)  
For si32Library examples such as demo_si32UsbAudio, define the statement at the top of myBuildOptions.h to  
remove all calls to printf(). Additionally, reduce the footprint by disabling logging in myBuildOptions.h:  
#define si32BuildOption_enable_logging 0  
This method preserves the printf() statements for later use, if needed. The printf() define can also be  
encapsulated with preprocessor #if statements to automatically include this define when building with a Release  
configuration.  
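A minimal sketch of that conditional define is shown below. The RELEASE_BUILD symbol is hypothetical (the si32HAL may use a different name), and the macro uses the C99 variadic form with a no-op expansion instead of the GNU-style form shown above. The helper function exists only to show one side effect of the technique: arguments to a removed printf() call are never evaluated.

```c
#include <stdio.h>

/* Hypothetical symbol: in a real project it would be defined only in the
   Release configuration (for example in the IDE's Define text box), not here. */
#define RELEASE_BUILD 1

#if defined(RELEASE_BUILD) && RELEASE_BUILD
/* C99 variadic form of the macro above; each call becomes a harmless no-op. */
#define printf(...) ((void)0)
#endif

/* Returns how many times the counter was incremented by the printf argument. */
int count_printf_side_effects(void)
{
    int counter = 0;
    printf("counter=%d\n", ++counter); /* removed entirely when RELEASE_BUILD is set */
    return counter;
}
```

Because the whole call disappears, expressions with side effects should not be placed inside printf() arguments when this approach is used.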
To verify that all instances of printf() have been removed, search the project's map file for the printf library
functions. In the sim3u1xx_Blinky example, removing all instances requires adding the #define statement to both the
main.c and gCpu.c files.
Using the sim3u1xx_Blinky and demo_si32UsbAudio default examples in the si32HAL 1.0.1 software package,  
Table 19 and Table 20 show the relative build sizes with the different printf() settings. This section does not include  
the CoreMark tests since printf is not part of the CoreMark benchmark.  
Table 19. Keil printf() Comparison—sim3u1xx_Blinky Debug  
Library                          Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision MicroLIB with printf     2068          296                     24                       1536
µVision MicroLIB without printf  1392          296                     12                       1536
Table 20. Keil printf() Comparison—demo_si32UsbAudio Debug  
Library                          Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision MicroLIB with printf     47264         3832                    5208                     17972
µVision MicroLIB without printf  39760         4312                    5196                     17972
6.5. Toolchain Optimization Settings  
In addition to the library types, each toolchain has multiple optimization settings that can affect the resulting code  
size. In Keil µVision, the optimization settings are set using the following steps:  
1. Right-click on the project_name in the Project window and select Options for Target ‘project_name’ or
go to Project → Options for Target ‘project_name’.
2. Select the C/C++ tab.  
3. Use the Optimization drop-down menu to set the project optimization setting.  
Figure 11 shows the optimization settings in the IDE.  
The available options are:
Level 0: minimum optimization
Level 1: restricted optimization, removing unused inline functions and unused static functions
Level 2: high optimization
Level 3: maximum optimization, which aims to produce faster or smaller code than Level 2,
depending on the options used
In addition to the levels, µVision also has an Optimize for Time selection available below the Optimization
drop-down menu. Declaring a variable as volatile will prevent the compiler from optimizing out the variable.
More information on these optimization levels can be found on the Keil website
(http://www.keil.com/support/man/docs/uv4/uv4_dg_adscc.htm).
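The effect of volatile can be illustrated with a simple software delay loop. This is only a sketch with a hypothetical function name: without the volatile qualifier, a compiler at the higher optimization levels may decide that the empty loop has no observable effect and remove it.

```c
#include <stdint.h>

/* Crude software delay. The volatile qualifier forces the compiler to
   perform every read and write of the counter, so the loop is kept even
   at high optimization levels. */
uint32_t delay_loops(uint32_t count)
{
    volatile uint32_t i;

    for (i = 0u; i < count; i++) {
        /* intentionally empty */
    }
    return i; /* equals count once the loop has finished */
}
```

The same qualifier is commonly applied to variables shared with interrupt handlers, where the compiler cannot see the second writer.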
Figure 11. Setting the Project Optimization in the µVision IDE  
Using the sim3u1xx_Blinky and demo_si32UsbAudio default examples in the si32HAL 1.0.1 software package,  
Table 21 and Table 22 show the relative Debug build sizes with the different optimization level settings. Table 23  
shows the CoreMark Debug build sizes, and Table 24 lists the CoreMark speed scores for these optimization  
levels.  
Table 21. Keil Toolchain Optimization Comparison—sim3u1xx_Blinky Debug  
Library                                        Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision MicroLIB -O0                           2068          296                     24                       1536
µVision MicroLIB -O0 (with Optimize for Time)  2068          296                     24                       1536
µVision MicroLIB -O1                           1704          296                     20                       1536
µVision MicroLIB -O1 (with Optimize for Time)  1648          296                     20                       1536
µVision MicroLIB -O2                           1616          296                     20                       1536
µVision MicroLIB -O2 (with Optimize for Time)  1600          296                     20                       1536
µVision MicroLIB -O3                           1604          296                     20                       1536
µVision MicroLIB -O3 (with Optimize for Time)  1596          296                     20                       1536
Table 22. Keil Toolchain Optimization Comparison—demo_si32UsbAudio Debug  
Library                                        Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision MicroLIB -O0                           47264         3832                    5208                     17972
µVision MicroLIB -O0 (with Optimize for Time)  47264         3832                    5208                     17972
µVision MicroLIB -O1                           38816         3832                    5132                     17952
µVision MicroLIB -O1 (with Optimize for Time)  39924         3832                    5132                     17952
µVision MicroLIB -O2                           36540         3832                    5132                     17952
µVision MicroLIB -O2 (with Optimize for Time)  39840         3832                    5132                     17952
µVision MicroLIB -O3                           36468         3832                    5132                     17952
µVision MicroLIB -O3 (with Optimize for Time)  41532         3832                    5132                     17952
Table 23. Keil Toolchain Optimization Comparison—CoreMark Debug Size  
Library                                        Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision MicroLIB -O0                           11276         636                     156                      3536
µVision MicroLIB -O0 (with Optimize for Time)  11276         636                     156                      3536
µVision MicroLIB -O1                           9788          616                     140                      3536
µVision MicroLIB -O1 (with Optimize for Time)  10136         616                     140                      3536
µVision MicroLIB -O2                           9640          616                     140                      3536
µVision MicroLIB -O2 (with Optimize for Time)  10684         616                     140                      3536
µVision MicroLIB -O3                           9680          616                     140                      3536
µVision MicroLIB -O3 (with Optimize for Time)  11500         616                     140                      3536
Table 24. Keil Toolchain Optimization Comparison—CoreMark Debug Speed  
Library                                        CoreMark Score
µVision MicroLIB -O0                           CoreMark 1.0 : 69.402323 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
µVision MicroLIB -O0 (with Optimize for Time)  CoreMark 1.0 : 69.402323 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
µVision MicroLIB -O1                           CoreMark 1.0 : 75.279256 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
µVision MicroLIB -O1 (with Optimize for Time)  CoreMark 1.0 : 75.206352 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
µVision MicroLIB -O2                           CoreMark 1.0 : 74.247855 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
µVision MicroLIB -O2 (with Optimize for Time)  CoreMark 1.0 : 87.277701 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
µVision MicroLIB -O3                           CoreMark 1.0 : 79.520321 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
µVision MicroLIB -O3 (with Optimize for Time)  CoreMark 1.0 : 102.697150 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
6.6. Unused Code Removal  
Each source file in a project is compiled into an object, and by default the linker includes each object as a
whole. In other words, if any function in a file is used, then the entire file is included by default. This can
become an issue for a project that uses the si32HAL but calls only a few functions from each module.
Removed (unused) functions can be viewed in the map files for the projects.  
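The situation can be pictured with a small module in which only one of two functions is referenced. All names here are hypothetical and are not part of the si32HAL. With the default per-file behavior described above, both functions would be linked; with one section per function, the linker is free to drop the unreferenced one.

```c
#include <stdint.h>

/* Hypothetical driver module: both functions live in the same source file. */

/* Referenced by the application, so it is always kept. */
uint32_t led_mask(uint32_t pin)
{
    return (uint32_t)1u << (pin & 31u);
}

/* Never referenced by the application. It is linked anyway when the whole
   object is included, and becomes a removal candidate once each function
   is placed in its own section. */
uint32_t led_blink_period_ms(uint32_t rate_hz)
{
    return (rate_hz == 0u) ? 0u : 1000u / rate_hz;
}

/* Application code that uses only one of the two functions. */
uint32_t app_led_setup(void)
{
    return led_mask(3u);
}
```

In a build like this, led_blink_period_ms would be expected to appear among the removed functions in the map file once unused code removal is enabled.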
The unused code removal feature is not automatically enabled in the Keil µVision IDE. To enable this feature:  
1. Right-click on the project_name in the Project window and select Options for Target ‘project_name’ or
go to Project → Options for Target ‘project_name’.
2. Select the C/C++ tab.  
3. Use the One ELF Section per Function checkbox to enable or disable unused code removal.  
Figure 12. Setting the Remove Unused Code Option in the µVision IDE  
Using the sim3u1xx_Blinky and demo_si32UsbAudio default examples in the si32HAL 1.0.1 software package,  
Table 25 and Table 26 show the relative Debug build sizes with different unused code removal settings. Table 27  
shows the CoreMark build sizes, and Table 28 shows the CoreMark scores for the different unused code removal  
settings.  
Table 25. Keil Unused Code Removal Comparison—sim3u1xx_Blinky Debug  
Library                                       Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision MicroLIB with no unused code removal  1392          296                     12                       1536
µVision MicroLIB with unused code removal     1184          296                     12                       1536
Table 26. Keil Unused Code Removal Comparison—demo_si32UsbAudio Debug  
Library                                       Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision MicroLIB with no unused code removal  47264         3832                    5208                     17972
µVision MicroLIB with unused code removal     43464         3772                    5060                     17780
Table 27. Keil Unused Code Removal Comparison—CoreMark Debug Size  
Library                                       Code (bytes)  Read-Only Data (bytes)  Read-Write Data (bytes)  Zero-Initialized Data (bytes)
µVision MicroLIB with no unused code removal  11276         636                     156                      3536
µVision MicroLIB with unused code removal     11012         636                     156                      3536
Table 28. Keil Unused Code Removal Comparison—CoreMark Debug Speed  
Library                                       CoreMark Score
µVision MicroLIB with no unused code removal  CoreMark 1.0 : 69.402324 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
µVision MicroLIB with unused code removal     CoreMark 1.0 : 67.374626 / ARM4.2 (EDG gcc mode) Iterations=3000 / STACK
6.7. Reset Sequence  
The speed of the reset sequence of a device can be an important factor, especially for devices like the
SiM3U1xx/SiM3C1xx that require a reset to exit the lowest power mode.
After the hardware jumps to the reset vector and loads the stack pointer address, the core must initialize the  
memory of the device. This involves copying data from flash to RAM and zero-filling any zero-initialized segments.  
Then, the reset code typically calls a system initialization function and jumps to main.  
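The sequence above can be sketched in C as follows. This is a host-runnable approximation rather than actual startup code: the arrays stand in for the linker-defined flash and RAM regions, and every name (init_memory, system_init, app_main, reset_handler) is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for linker-defined regions. On a real device these locations
   come from the linker script or scatter file, not from C arrays. */
static const uint32_t flash_data_image[4] = { 1u, 2u, 3u, 4u }; /* initial values stored in flash */
static uint32_t ram_data[4];                                    /* initialized (RW) data in RAM   */
static uint32_t ram_zero[8];                                    /* zero-initialized (ZI) data     */

/* Copy initialized data from flash to RAM, then zero-fill the ZI segment. */
void init_memory(void)
{
    memcpy(ram_data, flash_data_image, sizeof ram_data);
    memset(ram_zero, 0, sizeof ram_zero);
}

/* Placeholder for the system initialization function (clocks, watchdog, ...). */
void system_init(void)
{
}

/* Placeholder for main(); returns a value derived from the initialized memory. */
int app_main(void)
{
    return (int)(ram_data[0] + ram_data[3] + ram_zero[7]);
}

/* Order of operations after the hardware has loaded the stack pointer. */
int reset_handler(void)
{
    init_memory();
    system_init();
    return app_main();
}
```

The time spent in the copy and zero-fill steps grows with the amount of RW and ZI data, which is one reason the reset time differs between builds.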
The duration of this reset sequence depends on the library used with the project. The startup code should
always be compiled with the fastest speed optimization to ensure it takes as little time as possible.
The si32HAL examples add a ~500 ms delay after a pin reset event to prevent code from switching to a
non-existent clock source and disabling the device. This delay can be removed by defining the
si32HalOption_disable_pin_reset_delay symbol in the project.
To define a symbol in Keil µVision:  
1. Right-click on the project_name in the Project window and select Options for Target ‘project_name’ or
go to Project → Options for Target ‘project_name’.
2. Select the C/C++ tab.  
3. Use the Define text box to add or remove project symbols.  
Figure 13. Adding a Project Define Symbol in the µVision IDE  
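A symbol entered in the Define text box behaves like a #define that is visible to every source file. The sketch below shows how code might test for the si32HalOption_disable_pin_reset_delay symbol. The helper function is hypothetical and is not the si32HAL implementation; the symbol is defined in the source here only so that the example is self-contained.

```c
/* Normally supplied through the Define text box rather than in source. */
#define si32HalOption_disable_pin_reset_delay

/* Hypothetical helper: returns the delay, in milliseconds, that startup
   code would apply after a pin reset. */
unsigned pin_reset_delay_ms(void)
{
#ifdef si32HalOption_disable_pin_reset_delay
    return 0u;   /* symbol defined: skip the delay */
#else
    return 500u; /* approximate default delay described above */
#endif
}
```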
Table 29 shows the reset time comparison for the toolchain libraries using the fastest speed optimization on the
startup code. This time was measured on an oscilloscope using the sim3u1xx_Blinky example in Debug mode, from the
rise of RESETb to the fall of a port pin at the beginning of main().
Table 29. Keil Toolchain Library Usage Comparison—sim3u1xx_Blinky Debug Reset Sequence  
Library           Reset Time (µs)
µVision standard  52
µVision MicroLIB  48
CONTACT INFORMATION  
Silicon Laboratories Inc.  
400 West Cesar Chavez  
Austin, TX 78701  
Tel: 1+(512) 416-8500  
Fax: 1+(512) 416-9669  
Toll Free: 1+(877) 444-3032  
Please visit the Silicon Labs Technical Support web page:  
https://www.silabs.com/support/pages/contacttechnicalsupport.aspx  
and register to submit a technical support request.  
Patent Notice  
Silicon Labs invests in research and development to help our customers differentiate in the market with innovative low-power, small size,
analog-intensive mixed-signal solutions. Silicon Labs' extensive patent portfolio is a testament to our unique approach and world-class engineering team.
The information in this document is believed to be accurate in all respects at the time of publication but is subject to change without notice.
Silicon Laboratories assumes no responsibility for errors and omissions, and disclaims responsibility for any consequences resulting from
the use of information included herein. Additionally, Silicon Laboratories assumes no responsibility for the functioning of undescribed features
or parameters. Silicon Laboratories reserves the right to make changes without further notice. Silicon Laboratories makes no warranty,
representation or guarantee regarding the suitability of its products for any particular purpose, nor does Silicon Laboratories assume any liability
arising out of the application or use of any product or circuit, and specifically disclaims any and all liability, including without limitation
consequential or incidental damages. Silicon Laboratories products are not designed, intended, or authorized for use in applications intended to
support or sustain life, or for any other application in which the failure of the Silicon Laboratories product could create a situation where
personal injury or death may occur. Should Buyer purchase or use Silicon Laboratories products for any such unintended or unauthorized
application, Buyer shall indemnify and hold Silicon Laboratories harmless against all claims and damages.
Silicon Laboratories and Silicon Labs are trademarks of Silicon Laboratories Inc.  
Other products or brandnames mentioned herein are trademarks or registered trademarks of their respective holders.  