  1. TI PRU C/C++ CODE GENERATION TOOLS
  2. 2.1.5
  3. May 2017
  4. ===============================================================================
  5. Contents:
  6. 1. New defect history files in 2.1.4
  7. 2. Features
  8. 3. Changes from previous releases
  9. 4. Example usage
  10. 5. Common options
  11. 6. Linking with an ARM executable
  12. 7. Assembly language
  13. 8. Calling Convention
  14. 9. Language extensions
  15. 10. Sections in the linker command file
  16. ===============================================================================
  17. New defect history files in 2.1.4
  18. As of the 2.1.4 release, the DefectHistory.txt file has been replaced with the
  19. two files Open_defects.html and Closed_defects.html. These show the open and
  20. closed defects for the 2.1.x compiler branch. For open bugs, a status of Open
  21. or Accepted means that the bug has not been examined yet, whereas a status of
  22. Planned means that an evaluation or fix is in progress.
  23. ===============================================================================
  24. Features
  25. 1. Full support of C/C++.
  26. 2. Generates ELF relocatable object files and executables.
  27. 3. Complete support of the PRU instruction sets (v0, v1, v2, v3).
  28. 4. Constant table accesses from C
  29. 5. Intrinsics for XFER instructions in C
  30. 6. Little and big endian modes
  31. ===============================================================================
  32. Changes from 1.1.0B1
  33. 1. __delay_cycles intrinsic
  34. 2. Boot routine specialization to reduce code size
  35. 3. Generation of the LOOP instruction from C
  36. 4. Improved performance when accessing cregister symbols
  37. 5. Improved support for -o4
  38. 6. Improved disassembler output
  39. 7. Addition of near and far data qualifiers
  40. ===============================================================================
  41. Changes from 1.0.0B1
  42. 1. Performance improvements
  43. 2. Big endian support
  44. 3. Ability to link with an ARM executable
  45. 4. New support for accessing the constant table registers from C
  46. ===============================================================================
  47. Example Usage
  48. If installed with the default options, the C compiler is installed at the
  49. following path:
  50. C:\Program Files (x86)\Texas Instruments\PRU Code Generation Tools 1.1.0B1
  51. The first thing you need to do is add the bin folder of the toolchain to your
  52. path. If working from a DOS command window, type:
  53. set path=%path%;C:\Program Files (x86)\Texas Instruments\PRU Code Generation Tools 1.1.0B1\bin
  54. An example command line for compiling/linking your code for PRU0 of the AM335x:
  55. clpru --silicon_version=3 -o1 main.c tests.c -z AM3359_PRU.cmd -o PRU_tests.out -m PRU_tests.map
  56. If you have CCS 6.0 you should be able to load the executable using the CCS
  57. loader. If the version of CCS you have does not support direct loading, please
  58. follow the steps below.
  59. The next step is to generate the data and program binary files for loading in
  60. the CCS debugger:
  61. hexpru bin.cmd PRU_tests.out
  62. This generates 2 files: data.bin (containing the data sections) and
  63. text.bin (containing the .text sections).
  64. The example AM3359_PRU.cmd and bin.cmd files are included in this directory
  65. for reference.
  66. In Code Composer Studio, connect to PRU0 of the AM3359.
  67. In the Memory Browser, right-click -> Load memory. Select text.bin, then click on
  68. Next. Select:
  69. - Start address 0
  70. - Memory page: Program_Memory
  71. - Type-size: 32-bits
  72. Click on Finish. Code sections get loaded in the PRU program memory.
  73. In the Memory Browser, right-click -> Load memory. Select data.bin, then click on
  74. Next. Select:
  75. - Start address 0
  76. - Memory page: Data_Memory
  77. - Type-size: 32-bits
  78. Click on Finish. Data sections get loaded in the PRU data memory.
  79. You are now ready to execute code. Set the PC to the address of _c_int00 and run.
  80. Note: label addresses in the linker-generated map file are byte (8-bit)
  81. addresses. To get the 32-bit word address used and observed in the CCS
  82. debugger, divide the map file address by 4.
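The division described in the note can be sketched in C as follows. This is only an illustration: the helper name map_to_ccs_address is hypothetical and is not part of the toolchain.

```c
#include <stdint.h>

/* Hypothetical helper (not a toolchain API): convert a byte address taken
   from the linker map file into the 32-bit word address shown by the CCS
   debugger for PRU program memory. */
static uint32_t map_to_ccs_address(uint32_t map_byte_address)
{
    return map_byte_address / 4u;   /* four bytes per 32-bit word */
}
```

Under this rule, a label listed at 0x12C in the map file would be observed at 0x4B in the debugger. The value 0x12C is an illustrative number, not one taken from a real map file.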
  83. ===============================================================================
  84. Common options
  85. A list of all compiler options can be obtained with clpru --help. To get
  86. detailed information about an option run clpru <option> --help.
  87. --silicon_version=0,1,2,3
  88. Select the silicon version for the core. The default is 3.
  89. -O, --opt_level=off,0,1,2,3
  90. Select the optimization level to use for compilation. If the option
  91. is not specified, no optimization is performed. Specifying -O with no
  92. level is the same as specifying -O2.
  93. --keep_asm
  94. Keeps the generated assembly language (.asm) file.
  95. --c_src_interlist
  96. Interlists the C source with the assembly.
  97. ===============================================================================
  98. Linking with an ARM executable
  99. The hexpru tool provides the capability for converting a PRU executable into
  100. an ARM object file that can be linked in with an ARM project. The PRU
  101. code and data are converted to ARM data. The ARM application can reference
  102. symbols in the PRU file. This is useful for bare metal projects that are
  103. not running a high level operating system like Linux.
  104. The files for this example are in the "example" directory.
  105. > clpru -o3 test.c -z AM3359_PRU.cmd -o pru.out
  106. The executable for this will expose the shared_buf and local_data symbols.
  107. Now we need to convert the executable to an ARM object file. This will
  108. be done using the included PRU_to_ARM.cmd file.
  109. > hexpru pru.out -o pru.obj PRU_to_ARM.cmd
  110. Now we can link the pru.obj into our dummy ARM application:
  111. > armcl arm.c pru.obj -z dummy_ARM.cmd -o arm.out
  112. ===============================================================================
  113. Assembler
  114. -------------------------------------------------------------------------------
  115. Instructions
  116. -------------------------------------------------------------------------------
  117. The syntax for instruction mnemonics is very similar to that of the existing PASM
  118. assembler. Exceptions are listed here:
  119. 1. MOV instruction
  120. MOV is only supported for register to register moves. The LDI instruction
  121. must be used to load a literal into a register.
  122. 2. LDI instruction
  123. LDI can only be used to load a 16-bit constant into a register. To load a
  124. 32-bit constant, you must use the LDI32 instruction.
  125. 3. MVI instruction
  126. MVI is only supported on core revisions 2 and 3. The existing PASM assembler
  127. supports the instruction in a limited form for v1 using pseudo operations.
  128. 4. ZERO instruction
  129. ZERO is only supported on v2 and v3 cores. For v1, the user should use
  130. LDI r0, 0.
  131. 5. LFC, SFC, and SCAN
  132. These instructions are not supported. If support is needed we can add them.
  133. 6. Operands with a '&' prefix
  134. The existing PASM assembler accepts operands with or without the &
  135. symbol: LBBO &r0 or LBBO r0. The assembler in this release requires the
  136. & for these operands.
  137. -------------------------------------------------------------------------------
  138. Directives
  139. -------------------------------------------------------------------------------
  140. The PRU assembler supports all standard directives in TI SDO assemblers.
  141. Documentation can be found in the manuals for other architectures for now.
  142. These will be completely documented in the PRU manuals once available.
  143. Here is a discussion of directives in the PASM assembler and how they map
  144. to SDO assemblers.
  145. 1. .origin
  146. Not supported in the assembler. The assembler produces relocatable object
  147. files which start at address 0.
  148. 2. .entrypoint
  149. Not supported in the assembler. The entrypoint can be specified at link time
  150. using the --entrypoint option.
  151. 3. .setcallreg
  152. Not supported in the assembler. The default register R30.w0 is used by
  153. the RET and CALL instructions.
  154. 4. .macro, .mparam, and .endm
  155. Supported using the .macro and .endm directives.
  156. 5. .enter, .leave, .using
  157. Not supported in the assembler. The toolset supports separate compilation
  158. so namespaces are not as important in assembly.
  159. 6. .struct and related directives
  160. Supported using the .struct directive. An example would be:
  161. tag .struct
  162. field1 .int
  163. field2 .short
  164. field3 .char
  165. length .endstruct
  166. The label defined before the .endstruct directive will evaluate to the
  167. size of the struct.
  168. 7. .assign
  169. The .assign functionality is supported through the .sassign directive.
  170. Syntax:
  171. label .sassign r0.b1, tag
  172. ADD label.field1, label.field1, label.field2
  173. LBBO &label, r0, 0, length
  174. The directive does checking to ensure that all field members can be accessed
  175. through the register file. The struct must be defined using directives with
  176. predefined types (.int, .short, .char, ...). The .field directive will cause
  177. an error.
  178. ===============================================================================
  179. C Calling Convention
  180. This is the description of the calling convention for C callable routines.
  181. It is subject to change in future releases.
  182. Special registers:
  183. Link register: R3.w2
  184. Stack pointer: R2
  185. Registers R14-R29 are used for passing by value. Values are packed as tightly
  186. as possible into the registers. For instance:
  187. void foo(short a, int b, short c)
  188. a -> R14.w0
  189. b -> R15
  190. c -> R14.w2
  191. Return values are returned through R14.
  192. Save on call registers are:
  193. R0-R1, R14-R29
  194. Save on entry registers are:
  195. R3.w2-R13
  196. ===============================================================================
  197. Data types
  198. Type                          Size (in bits)
  199. ----                          --------------
  200. (un)signed char               8
  201. plain char (unsigned char)    8
  202. (un)signed short              16
  203. (un)signed int                32
  204. (un)signed long               32
  205. (un)signed long long          64
  206. float                         32
  207. double                        64
  208. long double                   64
  209. data pointers                 32
  210. code pointers                 16
  211. All data types are byte aligned.
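One likely consequence of byte alignment is that no padding is needed between struct members. The sketch below models that layout on a host compiler. It is an assumption-laden illustration: the struct name is hypothetical, fixed-width types stand in for the PRU sizes in the table, and the packed attribute (a GCC/Clang extension) is used only to imitate byte alignment on a host.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side model of a byte-aligned layout. The packed attribute is a
   GCC/Clang extension used here only to imitate byte alignment. */
struct pru_layout_model {
    int32_t a;   /* PRU int:   32 bits */
    int8_t  c;   /* PRU char:   8 bits */
    int16_t d;   /* PRU short: 16 bits */
} __attribute__((packed));

/* Size and member offsets when no padding is inserted. */
enum {
    MODEL_SIZE     = sizeof(struct pru_layout_model),       /* 4 + 1 + 2 */
    MODEL_OFFSET_C = offsetof(struct pru_layout_model, c),  /* after a   */
    MODEL_OFFSET_D = offsetof(struct pru_layout_model, d)   /* after c   */
};
```

In this model the struct occupies 7 bytes and d starts at offset 5. A host compiler without the packed attribute would typically pad the same struct to 8 bytes.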
  212. ===============================================================================
  213. C/C++ Language Extensions for PRU
  214. -------------------------------------------------------------------------------
  215. Intrinsics
  216. -------------------------------------------------------------------------------
  217. void __xin (unsigned int device_id, unsigned int base_register,
  218. unsigned int use_remapping, void& object)
  219. void __xout (unsigned int device_id, unsigned int base_register,
  220. unsigned int use_remapping, void& object)
  221. void __xchg (unsigned int device_id, unsigned int base_register,
  222. unsigned int use_remapping, void& object)
  223. void __sxin (unsigned int device_id, unsigned int base_register,
  224. unsigned int use_remapping, void& object)
  225. void __sxout(unsigned int device_id, unsigned int base_register,
  226. unsigned int use_remapping, void& object)
  227. void __sxchg(unsigned int device_id, unsigned int base_register,
  228. unsigned int use_remapping, void& object)
  229. These intrinsics are used to generate the XFER instructions on the PRU.
  230. The parameters are defined as:
  231. device_id: Literal 0-255 corresponding to the first parameter of the
  232. assembly instruction.
  233. base_register: Literal 0-32 corresponding to the register that must be used
  234. as the base register for the transfer.
  235. use_remapping: Boolean value (zero is false, non-zero is true) that
  236. specifies whether a register file shift amount is used
  237. to move the registers to the appropriate base register.
  238. An example of this is the bank shift supported for the
  239. scratchpad memory on ICSS.
  240. object: Any object with a size less than 44 bytes.
  241. -------------------------------------------------------------------------------
  242. unsigned int __lmbd(unsigned int src, unsigned int pattern)
  243. The __lmbd intrinsic can be used to generate the LMBD instruction.
  244. The LMBD instruction scans src for the bit pattern in position 0 of
  245. pattern and returns the first position where it is found.
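The scan can be modeled on a host machine as shown below. This is a sketch, not the intrinsic itself: the function name lmbd_model is hypothetical, and it assumes the LMBD ("left-most bit detect") behavior of scanning from bit 31 downward and yielding 32 when no bit matches, which the text above does not spell out.

```c
#include <stdint.h>

/* Host-side model of __lmbd. Assumptions: the scan starts at bit 31 and
   32 is returned when no bit of src matches bit 0 of pattern. */
static unsigned int lmbd_model(uint32_t src, uint32_t pattern)
{
    uint32_t wanted = pattern & 1u;        /* only bit 0 of pattern is used */
    for (int bit = 31; bit >= 0; --bit) {
        if (((src >> bit) & 1u) == wanted) {
            return (unsigned int)bit;      /* left-most matching position */
        }
    }
    return 32u;                            /* no matching bit found */
}
```

With these assumptions, lmbd_model(0x10, 1) gives 4 and lmbd_model(0, 1) gives 32.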
  246. -------------------------------------------------------------------------------
  247. void __halt()
  248. The __halt intrinsic is used to generate the HALT instruction. This intrinsic
  249. is a barrier.
  250. -------------------------------------------------------------------------------
  251. void __delay_cycles(unsigned int cycles)
  252. The __delay_cycles intrinsic will delay CPU execution for the specified
  253. number of cycles. The number of cycles must be a constant.
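Because the cycle count must be a constant, a delay expressed in time units has to be converted at compile time. The macros below are a sketch of one way to do that. PRU_CLOCK_HZ and US_TO_CYCLES are hypothetical names, and the 200 MHz clock is an assumption that should be checked against the device in use.

```c
#include <stdint.h>

/* Assumption: 200 MHz PRU core clock. Adjust for the actual device. */
#define PRU_CLOCK_HZ 200000000u

/* Converts microseconds to core cycles. With a literal argument the result
   is a constant expression, as __delay_cycles requires. */
#define US_TO_CYCLES(us) ((uint32_t)((us) * (PRU_CLOCK_HZ / 1000000u)))
```

On the PRU this might be used as __delay_cycles(US_TO_CYCLES(10)); under the assumed clock, that is a delay of 2000 cycles.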
  254. -------------------------------------------------------------------------------
  255. Constant Table Registers
  256. -------------------------------------------------------------------------------
  257. The cregister mechanism supported in the 1.0.0B1 release is no longer supported.
  258. The new mechanism is described here.
  259. Support for using the constant registers is provided through the new cregister
  260. and peripheral variable attributes. The syntax is:
  261. int x __attribute__((cregister("MEM", [near|far]), peripheral));
  262. The name "MEM" can be any name, although it will need to correspond to a
  263. memory range name in your linker command file (described below). The
  264. near or far parameter tells the compiler if it can use the immediate
  265. addressing mode (near) or register indirect addressing mode (far). If the
  266. data will be completely contained within the first 256 bytes from the top
  267. of the constant register pointer then it should be near, otherwise it should
  268. be far. If the parameter is omitted, near will be chosen.
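The near/far rule above amounts to a range check on the object's offset from the constant register pointer. A minimal sketch, assuming hypothetical names (CREG_NEAR_WINDOW, cregister_can_be_near) that are not part of the compiler:

```c
#include <stdbool.h>
#include <stddef.h>

/* Bytes reachable from the constant register pointer with the immediate
   addressing mode, per the description above. */
#define CREG_NEAR_WINDOW 256u

/* Returns true when the object at [offset, offset + size) is completely
   contained in the first 256 bytes, so that "near" is a valid choice. */
static bool cregister_can_be_near(size_t offset, size_t size)
{
    return offset <= CREG_NEAR_WINDOW && size <= CREG_NEAR_WINDOW - offset;
}
```

For instance, a 4-byte object at offset 252 fits and can be near, while the same object at offset 254 crosses the 256-byte boundary and should be far.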
  269. The peripheral attribute can only be used with the cregister attribute and has
  270. two effects. First, it puts the object in a section that will not be loaded
  271. onto the device. This will prevent initialization of the peripheral at runtime.
  272. Second, it allows the same object to be defined in multiple source files
  273. without a linker error. The intent is that peripherals can be completely
  274. described in a header file.
  275. The linker command file must be modified for the cregister attribute to work,
  276. otherwise you will get relocation errors. The updated linker command file
  277. syntax is:
  278. MEMORY
  279. {
  280. MEM: o=xxx l=xxx CREGISTER=4
  281. }
  282. This tells the linker that register C4 points to the top of MEM. Note that the
  283. name MEM in the linker command file must be the same name used in the cregister
  284. attribute. The linker will automatically place all objects declared to
  285. be in MEM into the memory range.
  286. The assembly code for a near load will look like:
  287. LBCO &R0, __PRU_CREG_MEM, $CSBREL(x), 4
  288. A far load will look like:
  289. LDI R0.w0, $CSBREL(x)
  290. LBCO &R0, __PRU_CREG_MEM, R0.w0, 4
  291. The __PRU_CREG_MEM symbol will be defined at link time with the value of the
  292. register to be used. The $CSBREL modifier will cause a relocation entry
  293. relative to the memory pointed to by the constant register.
  294. -------------------------------------------------------------------------------
  295. The near and far keywords
  296. -------------------------------------------------------------------------------
  297. The PRU can load a 16-bit address into a register with a single instruction.
  298. However, pointers are 32 bits and load/store instructions can access the
  299. full 32-bit address space. In releases prior to 2.0.0B1 all data addresses
  300. were assumed to be 32 bits, so two instructions were used to load the address.
  301. In the 2.0.0B1 release the near and far keywords were introduced to allow
  302. more efficient loading of data. The near and far keywords can be applied to
  303. any data symbol. The near keyword asserts that the symbol's address fits in
  304. 16 bits (the lower 64 KB of memory). The far keyword asserts that the
  305. symbol's address does not fit in 16 bits.
  306. By default all symbols are near. This is because the PRU local memory is
  307. always in the lower 16 bits and most accesses to the upper memory range
  308. will be for peripheral accesses.
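The distinction can be illustrated with a small host-side check. This is a sketch under the description above: the function names are hypothetical, and the instruction counts simply restate the one-instruction and two-instruction cases from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the address fits in 16 bits, which is what "near" asserts. */
static bool address_is_near(uint32_t address)
{
    return address <= 0xFFFFu;
}

/* Instructions needed to load the address into a register: one for a
   16-bit (near) address, two for a full 32-bit (far) address. */
static unsigned int address_load_instructions(uint32_t address)
{
    return address_is_near(address) ? 1u : 2u;
}
```

An address such as 0x2000 in PRU local memory is near under this check, whereas an address such as 0x4A310000 is far. Both values are illustrative only.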
  309. The __near and __far keywords are also accepted and are available even
  310. when --strict_ansi is specified. These are guaranteed to not conflict with
  311. user symbols named "near" and "far".
  312. All symbols that will reside in the upper 16 bits of memory must be declared
  313. using far, even if they have the cregister attribute. A cregister symbol
  314. can have a far qualifier and be a near cregister access. As an example:
  315. __far int peripheral __attribute__((cregister("PERIPH", near)));
  316. This means that peripheral is a near access relative to the cregister,
  317. but a far access if an absolute address is used. This is
  318. important because the compiler may choose not to access peripheral using
  319. a cregister access.
  320. -------------------------------------------------------------------------------
  321. Structures in Registers
  322. -------------------------------------------------------------------------------
  323. The compiler has support to allocate aggregate types to registers. Today this
  324. is limited to structs of 44 bytes or smaller. The compiler will automatically
  325. allocate to registers when it is profitable.
  326. struct s
  327. {
  328. int a;
  329. int b;
  330. char c;
  331. short d;
  332. };
  333. struct s global_struct;
  334. void foo()
  335. {
  336. struct s x;
  337. x = global_struct;
  338. x.a++;
  339. x.d++;
  340. global_struct = x;
  341. }
  342. ===============================================================================
  343. Sections in the linker command file
  344. Executable sections:
  345. .text Executable code
  346. User data sections:
  347. .bss Uninitialized near data
  348. .data Initialized near data
  349. .rodata Constant read only near data
  350. .farbss Uninitialized far data
  351. .fardata Initialized far data
  352. .rofardata Constant read only far data
  353. Special data sections:
  354. .sysmem Heap for dynamic memory allocation (size is controlled by -heap option)
  355. .stack Stack space (size is controlled by -stack option)
  356. .init_array Table of constructors to be called at startup
  357. .cinit Tables for initializing global data at runtime
  358. .args Section for passing arguments in the argv array to main
  359. All data sections should be allocated on page 1 and all executable sections should be
  360. allocated on page 0.