Integrate online documentation in repository.

2024-12-18 06:46:23 -05:00 · 2024-09-19 10:45:46 +08:00 · 8c464ae090
commit 8c464ae090
parent efdc683bfb
51 changed files with 6497 additions and 0 deletions
--- a/docs/11_toolchain.html
+++ b/docs/11_toolchain.html
@ -0,0 +1,288 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>1.1 GNU Arm Embedded Toolchain</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>1.1 GNU Arm Embedded Toolchain</h1>
 Arm maintains a GCC cross-compiler available as binaries that run on Linux and
 Windows.<br>It is available on their
 <a href="https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-rm">
 arm developer site</a>.
 <h2>Installation on Windows</h2>
 So far there are only win32 versions available for download either as
 <b>.exe</b> installers or <b>.zip</b> archives. File naming convention helps to
 identify the type of release (preview, major or update).
 <p>
 By default each release installs to its own directory instead of upgrading the
 previous one. This way several releases can coexist and you can select which
 one you use for a specific project. One downside to this is that the directory
 and filename convention is heavy. For practical use, you need to configure an
 IDE or encapsulate those paths and names in <b>Makefile</b> variables.
 <pre>
 ### Build environment selection
 GCCDIR = "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi"
 BINPFX  = $(GCCDIR)/bin/arm-none-eabi-
 CC      = $(BINPFX)gcc
 ### Build rules
 .PHONY: version
 version:
    $(CC) --version
 </pre>
 <ul>
 <li> <b>GCCDIR</b> holds the path to the folder where the toolchain is installed.
 When I install a new release, I need to update this path.
 <li> All the commands to build are located in one <b>bin</b> subfolder and they
 share the same name prefix <b>arm-none-eabi-</b>. So I have created a
 <b>BINPFX</b> to easily identify the commands.
 </ul>
 <pre>
 $ make
 "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-m
 ingw-w64-i686-arm-none-eabi"/bin/arm-none-eabi-gcc --version
 arm-none-eabi-gcc.exe (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 13.3.1 202
 40614
 Copyright (C) 2023 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 </pre>
 <h2>Installation on Linux</h2>
 Installation on Linux means downloading the Linux x86_64 tarball and extracting
 it. I use the <b>~/Packages</b> folder for this type of distribution. This
 means that the <b>Makefile</b> on Linux will be the same as the Windows one
 except for the value of the <b>GCCDIR</b> variable.
 <pre>
 GCCDIR = $(HOME)/Packages/arm-gnu-toolchain-13.3.rel1-x86_64-arm-none-eabi
 </pre>
 By selecting the path based on the development environment, there is no need
 to make changes while switching between OS. <b>Gmake</b> has the built-in
 variable <b>MAKE_HOST</b> that can be tested for this.
 <pre>
 ### Build environment selection
 ifeq (linux, $(findstring linux, $(MAKE_HOST)))
 GCCDIR = $(HOME)/Packages/arm-gnu-toolchain-13.3.rel1-x86_64-arm-none-eabi
 else
 GCCDIR = "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi"
 endif
 BINPFX  = $(GCCDIR)/bin/arm-none-eabi-
 CC      = $(BINPFX)gcc
 ### Build rules
 .PHONY: version
 version:
    $(CC) --version
 </pre>
 I use the path prefix <b>$(HOME)/Packages</b> instead of <b>~/Packages</b> when
 defining <b>GCCDIR</b> as some sub-processes called by <i>gmake</i> may have
 issues with <b>~</b> expansion (in this case <i>ld</i>). This way <i>gmake</i>
 will handle the expansion before calling the sub-processes.
 <h2>Toolchain’s chain of events</h2>
 In order to generate a file that can be loaded in the micro-controller, I
 need to sketch the chain of events that will take place.
 <ol>
 <li> <b>Compile</b> source files (<b>.c</b>) to object modules (<b>.o</b>)
 <li> <b>Link</b> all the object modules (<b>.o</b>) together into an
 executable (<b>.elf</b>)
 <li> <b>Convert</b> the executable (<b>.elf</b>) into a format suitable for
 loading or flashing (<b>.bin</b> or <b>.hex</b>).
 </ol>
 <h3>1. Compile</h3>
 <b>Gmake</b> has default rules to build <b>.o</b> files out of <b>.c</b> files.
 As I have already defined with <b>CC</b> the command to compile, I can make a
 simple test of this step by creating an empty <b>.c</b> file and checking what
 happens when I try to compile it.
 <pre>
 $ touch empty.c
 $ make empty.o
 "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-m
 ingw-w64-i686-arm-none-eabi"/bin/arm-none-eabi-gcc    -c -o empty.o empty.c
 </pre>
 Compilation is successful and the <b>empty.o</b> file is generated.
 <h3>2. Link</h3>
 To link the object module generated in the first step, I need to
 <ul>
 <li> specify the command to link (<b>ld</b>)
 <li> provide the name of a link script
 <li> create a rule that calls the linker to generate an executable <b>.elf</b>
     file from the object module <b>.o</b> file.
 </ul>
 <p>
 The toolchain comes with sample link scripts, located
 in the subfolder <b>share/gcc-arm-none-eabi/samples/ldscripts</b>. For now I can
 use the simplest script: <b>mem.ld</b>.
 <pre>
 ### Build environment selection
 ifeq (linux, $(findstring linux, $(MAKE_HOST)))
 GCCDIR = $(HOME)/Packages/arm-gnu-toolchain-13.3.rel1-x86_64-arm-none-eabi
 else
 GCCDIR = "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi"
 endif
 BINPFX  = $(GCCDIR)/bin/arm-none-eabi-
 CC      = $(BINPFX)gcc
 LD      = $(BINPFX)ld
 LD_SCRIPT = $(GCCDIR)/share/gcc-arm-none-eabi/samples/ldscripts/mem.ld
 ### Build rules
 %.elf: %.o
    $(LD) -T$(LD_SCRIPT) -o $@ $<
 </pre>
 <pre>
 $ make empty.elf
 "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-m
 ingw-w64-i686-arm-none-eabi"/bin/arm-none-eabi-ld -T"D:/Program Files (x86)/GNU 
 Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi"
 /share/gcc-arm-none-eabi/samples/ldscripts/mem.ld -o empty.elf empty.o
 </pre>
 Link terminates successfully and creates <b>empty.elf</b>.
 <h3>3. Convert</h3>
 Finally, I use the command <b>objcopy</b> to convert the executable <b>.elf</b>
 file into binary or Intel hex format suitable to load in the micro-controller.
 <pre>
 ### Build environment selection
 ifeq (linux, $(findstring linux, $(MAKE_HOST)))
 GCCDIR = $(HOME)/Packages/arm-gnu-toolchain-13.3.rel1-x86_64-arm-none-eabi
 else
 GCCDIR = "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi"
 endif
 BINPFX  = $(GCCDIR)/bin/arm-none-eabi-
 CC      = $(BINPFX)gcc
 LD      = $(BINPFX)ld
 OBJCOPY = $(BINPFX)objcopy
 LD_SCRIPT = $(GCCDIR)/share/gcc-arm-none-eabi/samples/ldscripts/mem.ld
 ### Build rules
 %.elf: %.o
    $(LD) -T$(LD_SCRIPT) -o $@ $<
 %.bin: %.elf
    $(OBJCOPY) -O binary $< $@
 %.hex: %.elf
    $(OBJCOPY) -O ihex $< $@
 </pre>
 Now, if I start in a directory that contains only this <b>Makefile</b> and an
 empty <b>empty.c</b> file, I can successfully build.
 <pre>
 $ make empty.bin empty.hex
 "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-m
 ingw-w64-i686-arm-none-eabi"/bin/arm-none-eabi-gcc    -c -o empty.o empty.c
 "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-m
 ingw-w64-i686-arm-none-eabi"/bin/arm-none-eabi-ld -T"D:/Program Files (x86)/GNU 
 Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi"
 /share/gcc-arm-none-eabi/samples/ldscripts/mem.ld -o empty.elf empty.o
 "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-m
 ingw-w64-i686-arm-none-eabi"/bin/arm-none-eabi-objcopy -O binary empty.elf empty
 .bin
 "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-m
 ingw-w64-i686-arm-none-eabi"/bin/arm-none-eabi-objcopy -O ihex empty.elf empty.h
 ex
 rm empty.o empty.elf
 </pre>
 Notice that <b>gmake</b> automatically removes the intermediary <b>.o</b> and
 <b>.elf</b> files on completion.
 <p>
 The generated <b>empty.bin</b> is empty.
 <h2>Cleaning up</h2>
 I want to keep the output of the build easy to understand without the clutter
 of the long command names. Also I need a way to clean the working directory
 back to its initial state.
 <ul>
 <li>By prefixing <b>BINPFX</b> with <i>@</i>, commands will not be displayed by
 <i>gmake</i> when they are executed. Adding an <b>echo</b> of the command
 target in the rules helps to keep track of the build progression.
 <li>A new <b>clean</b> rule will remove all generated files.
 </ul>
 <pre>
 ### Build environment selection
 ifeq (linux, $(findstring linux, $(MAKE_HOST)))
 GCCDIR = $(HOME)/Packages/arm-gnu-toolchain-13.3.rel1-x86_64-arm-none-eabi
 else
 GCCDIR = "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi"
 endif
 BINPFX  = @$(GCCDIR)/bin/arm-none-eabi-
 CC      = $(BINPFX)gcc
 LD      = $(BINPFX)ld
 OBJCOPY = $(BINPFX)objcopy
 LD_SCRIPT = $(GCCDIR)/share/gcc-arm-none-eabi/samples/ldscripts/mem.ld
 ### Build rules
 .PHONY: clean
 clean:
    @echo CLEAN
    @rm -f *.o *.elf *.bin *.hex
 %.elf: %.o
    @echo $@
    $(LD) -T$(LD_SCRIPT) -o $@ $<
 %.bin: %.elf
    @echo $@
    $(OBJCOPY) -O binary $< $@
 %.hex: %.elf
    @echo $@
    $(OBJCOPY) -O ihex $< $@
 </pre>
 <pre>
 $ make clean
 CLEAN
 $ make empty.bin empty.hex
 empty.elf
 empty.bin
 empty.hex
 rm empty.o empty.elf
 </pre>
 <h2>Checkpoint</h2>
 At this stage, I have a working toolchain and I am able to build from an empty
 source file (<b>empty.c</b>) to an empty binary file (<b>empty.bin</b>).
 <p>
 <a href="12_bootstrap.html">Next</a>, I will select a micro-controller
 from the STM32 family and generate a binary file that it can execute.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/12_bootstrap.html
+++ b/docs/12_bootstrap.html
@ -0,0 +1,306 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>1.2 Bootstrap</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>1.2 Bootstrap</h1>
 <h2>Revising the link script</h2>
 To validate the toolchain, I picked up <b>mem.ld</b>, the simplest sample link
 script, and used it as it is.
 <pre>
 /* Linker script to configure memory regions.
 * Need modifying for a specific board.
 *   FLASH.ORIGIN: starting address of flash
 *   FLASH.LENGTH: length of flash
 *   RAM.ORIGIN: starting address of RAM bank 0
 *   RAM.LENGTH: length of RAM bank 0
 */
 MEMORY
 {
  FLASH (rx) : ORIGIN = 0x0, LENGTH = 0x20000 /* 128K */
  RAM (rwx) : ORIGIN = 0x10000000, LENGTH = 0x2000 /* 8K */
 }
 </pre>
 It needs to be modified with actual flash and ram locations and sizes.
 <p>
 Also this link script does not contain any information for the linker to know
 how to locate the output of the C compiler. Code, constant data and initial
 value of variables need to be located in flash, variables and stack need to be
 located in ram. I need a better link script that specifies all of that.
 <b>nokeep.ld</b> in the sample scripts folder is the one I need.
 <pre>
 /* Linker script to configure memory regions.
 * Need modifying for a specific board.
 *   FLASH.ORIGIN: starting address of flash
 *   FLASH.LENGTH: length of flash
 *   RAM.ORIGIN: starting address of RAM bank 0
 *   RAM.LENGTH: length of RAM bank 0
 */
 MEMORY
 {
  FLASH (rx) : ORIGIN = 0x0, LENGTH = 0x20000 /* 128K */
  RAM (rwx) : ORIGIN = 0x10000000, LENGTH = 0x2000 /* 8K */
 }
 /* Linker script to place sections and symbol values. Should be used together
 * with other linker script that defines memory regions FLASH and RAM.
 * It references following symbols, which must be defined in code:
 *   Reset_Handler : Entry of reset handler
 *
 * It defines following symbols, which code can use without definition:
 *   __exidx_start
 *   __exidx_end
 ...
 ...
 *   __StackTop
 *   __stack
 */
 ENTRY(Reset_Handler)
 SECTIONS
 {
 ...
 ...
 </pre>
 From this snippet I can see that not only flash and ram parameters but also
 the entry point for code execution, <b>Reset_Handler</b>, needs to be provided.
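 <p>
 To relate the placement rules described earlier to C source, here is a
 minimal sketch of which kind of object usually ends up in which area. The
 names <b>lookup</b>, <b>counter</b>, <b>total</b> and <b>accumulate</b> are
 hypothetical, and the section names in the comments are the usual GCC
 defaults, which I assume rather than read from the sample script.

```c
#include <stdint.h>

/* Constant data: usually emitted in .rodata, expected to stay in flash. */
static const uint32_t lookup[4] = { 1, 2, 4, 8 };

/* Initialized variable: lives in ram (.data), but its initial value has to
 * be stored in flash and copied to ram by startup code. */
static uint32_t counter = 100;

/* Uninitialized variable: lives in ram (.bss), expected to be zeroed by
 * startup code. */
static uint32_t total;

/* Code: emitted in .text, located in flash. The loop index i is a local,
 * so it lives in a register or on the stack. */
uint32_t accumulate(void) {
    for (unsigned i = 0; i < 4; i++)
        total += lookup[i];

    return total + counter;
}
```

 On a hosted system the C runtime performs the copy and the zeroing before
 the program starts, so a first call to <b>accumulate()</b> returns 115. On a
 bare micro-controller this only holds once some startup code does the same
 work.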
 <p>
 As a check, let’s change the link script to <b>nokeep.ld</b> in <b>Makefile</b>
 and generate an executable <b>.elf</b> from the empty source code file
 <b>empty.c</b>:
 <pre>
 LD_SCRIPT = $(GCCDIR)/share/gcc-arm-none-eabi/samples/ldscripts/nokeep.ld
 </pre>
 <pre>
 $ make empty.elf
 empty.elf
 D:\Program Files (x86)\GNU Arm Embedded Toolchain\arm-gnu-toolchain-13.3.rel1-mi
 ngw-w64-i686-arm-none-eabi\bin\arm-none-eabi-ld.exe: warning: cannot find entry 
 symbol Reset_Handler; defaulting to 00000000
 rm empty.o
 </pre>
 The linker gives a warning and falls back on a default address as entry point.
 <p>
 So let’s create <b>boot.c</b> with an idle loop as <b>Reset_Handler</b>:
 <pre>
 void Reset_Handler( void) {
    for( ;;) ;
 }
 </pre>
 In order to better understand the output of the link phase, I make the
 following changes to the <b>Makefile</b>:
 <ul>
 <li> Add debug information by passing the <b>-g</b> flag to the C compiler.
 <li> Call the linker with the flags <b>-Map=$*.map -cref</b> to produce a link
 map that also includes a cross reference table.
 <li> List the size of the sections by using the <b>size</b> command.
 <li> Call <b>objdump -hS</b> to generate a disassembly listing that includes a
 list of the sections and makes use of the debug info to mix C source with
 assembly code.
 <li> Ensure that the <b>clean</b> rule removes the newly generated <b>.map</b> and
 <b>.lst</b> files.
 </ul>
 <pre>
 OBJDUMP = $(BINPFX)objdump
 SIZE    = $(BINPFX)size
 CFLAGS = -g
 clean:
    @echo CLEAN
    @rm -f *.o *.elf *.map *.lst *.bin *.hex
 %.elf: %.o
    @echo $@
    $(LD) -T$(LD_SCRIPT) -Map=$*.map -cref -o $@ $<
    $(SIZE) $@
    $(OBJDUMP) -hS $@ > $*.lst
 </pre>
 <pre>
 $ make boot.elf
 boot.elf
   text    data     bss     dec     hex filename
     12       0       0      12       c boot.elf
 rm boot.o
 </pre>
 I can see that this build results in 12 bytes of code in the text section.
 More details can be found in the <b>boot.map</b> and <b>boot.lst</b> files.
 <h2>Targeting the STM32F030F4P6</h2>
 To be able to build a boot code that could bootstrap a board equipped with a
 STM32F030F4P6, I need to know the following about this micro-controller:
 <ul>
 <li> <b>Core</b>: Arm 32 bit <b>Cortex-M0</b> CPU.
 <li> 16KB <b>Flash</b> located at 0x08000000.
 <li> 4KB <b>Ram</b> located at 0x20000000.
 </ul>
 I make a copy of the sample <b>nokeep.ld</b> link script in my working folder
 under the name <b>f030f4.ld</b> and change the <b>MEMORY</b> region accordingly.
 <pre>
 MEMORY
 {
    FLASH (rx)  : ORIGIN = 0x08000000, LENGTH = 16K
    RAM   (rwx) : ORIGIN = 0x20000000, LENGTH =  4K
 }
 </pre>
 The <b>Makefile</b> needs the following changes:
 <ul>
 <li> Specify <b>f030f4.ld</b> as the link script
 <li> C compiler flags to generate thumb Cortex-M0 code.
 <li> Request the C compiler to generate extra warnings and optimize for size.
 </ul>
 <pre>
 CPU = -mthumb -mcpu=cortex-m0
 CFLAGS = $(CPU) -g -Wall -Wextra -Os
 LD_SCRIPT = f030f4.ld
 </pre>
 At boot time, the Arm core fetches the initial address of the stack pointer
 and the address where to start execution from the first two entries of its
 interrupt routine table. I have to modify <b>boot.c</b> to initialize such a
 table in accord with the symbols defined in the link script.
 <pre>
 /* Memory locations defined by linker script */
 extern long __StackTop ;        /* &__StackTop points after end of stack */
 void Reset_Handler( void) ;     /* Entry point for execution */
 /* Interrupt vector table:
 * 1  Stack Pointer reset value
 * 15 System Exceptions
 * NN Device specific Interrupts
 */
 typedef void (*isr_p)( void) ;
 isr_p const isr_vector[ 2] __attribute__((section(".isr_vector"))) = {
    (isr_p) &__StackTop,
 /* System Exceptions */
    Reset_Handler
 } ;
 void Reset_Handler( void) {
    for( ;;) ;
 }
 </pre>
 <b>__StackTop</b> is defined by the linker script and is located after the end
 of the RAM. I use the GNU C extension <b>__attribute__()</b> to name the section
 where I want the interrupt vector to be included. If you check the linker
 script you will see that it places <b>.isr_vector</b> at the start of the
 <b>text</b> area which is located at the beginning of the flash memory. I chose
 to name the interrupt vector table <b>isr_vector</b> to match the section name
 <b>.isr_vector</b>, but it is really the section name that matters here.
 <pre>
 $ make boot.hex
 boot.elf
   text    data     bss     dec     hex filename
     10       0       0      10       a boot.elf
 boot.hex
 rm boot.elf boot.o
 </pre>
 A build produces 10 bytes of code. I can check the disassembly in the
 <b>boot.lst</b> file.
 <pre>
 Disassembly of section .text:
 08000000 &lt;isr_vector>:
 8000000:   00 10 00 20 09 00 00 08                             ... ....
 08000008 &lt;Reset_Handler>:
 /* System Exceptions */
    Reset_Handler
 } ;
 void Reset_Handler( void) {
    for( ;;) ;
 8000008:   e7fe        b.n 8000008 &lt;Reset_Handler>
 </pre>
 <ul>
 <li> The interrupt vector table is at address <b>0x08000000</b>, the beginning
 of the flash.
 <li> Its first entry, the initial stack pointer value, is the address
 <b>0x20001000</b>, which is the first location after the end of the 4K RAM.
 <li> Its next entry is <b>0x08000009</b>. I expect <b>0x08000008</b> here, as
 this is the first location after the small interrupt vector table I created.
 The lowest bit is set to 1 to mark that it is indeed the address of an interrupt
 routine. So the value is correct even if an odd address value is not possible
 for an opcode location.
 <li> Finally, I find the code for our idle loop at <b>0x08000008</b> as
 expected.
 </ul>
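 <p>
 The reading of the listing above can be double checked with a small host
 side sketch that decodes the eight bytes of the table as two little-endian
 32 bit words. The helper names <b>read_le32</b>, <b>vector_thumb_bit</b>,
 <b>vector_target</b> and the array <b>table_bytes</b> are hypothetical, they
 are not part of the toolchain.

```c
#include <stdint.h>

/* Read a 32 bit little-endian word from a byte buffer. */
uint32_t read_le32(const uint8_t *p) {
    return (uint32_t) p[0]
         | (uint32_t) p[1] << 8
         | (uint32_t) p[2] << 16
         | (uint32_t) p[3] << 24;
}

/* Lowest bit of a vector entry. */
unsigned vector_thumb_bit(uint32_t entry) {
    return entry & 1u;
}

/* Address of the first opcode of the handler: the entry with bit 0 cleared. */
uint32_t vector_target(uint32_t entry) {
    return entry & ~(uint32_t) 1;
}

/* The eight bytes found at 0x08000000 in the disassembly listing. */
const uint8_t table_bytes[8] = {
    0x00, 0x10, 0x00, 0x20,     /* initial stack pointer value */
    0x09, 0x00, 0x00, 0x08      /* Reset_Handler entry */
};
```

 Decoding <b>table_bytes</b> gives 0x20001000 for the first word and
 0x08000009 for the second. Clearing bit 0 of the second word gives
 0x08000008, the location of the idle loop. The 8 bytes of table plus the 2
 bytes of the branch opcode also account for the 10 bytes of text reported
 by <b>size</b>.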
 <h2>Checkpoint</h2>
 I have built a first executable targeting a member of the STM32 family.
 <p>
 <a href="13_flash.html">Next</a>, I will take a board with a STM32F030F4P6 and
 check if the code generated behaves as expected.
 <p>
 Below is the <b>Makefile</b> for reference. If you happen to cut and paste from this
 web page to a file, remember that <b><i>gmake</i></b> expects rules to be tab
 indented.
 <pre>
 ### Build environment selection
 ifeq (linux, $(findstring linux, $(MAKE_HOST)))
 GCCDIR = $(HOME)/Packages/arm-gnu-toolchain-13.3.rel1-x86_64-arm-none-eabi
 else
 GCCDIR = "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi"
 endif
 BINPFX  = @$(GCCDIR)/bin/arm-none-eabi-
 CC      = $(BINPFX)gcc
 LD      = $(BINPFX)ld
 OBJCOPY = $(BINPFX)objcopy
 OBJDUMP = $(BINPFX)objdump
 SIZE    = $(BINPFX)size
 CPU = -mthumb -mcpu=cortex-m0
 CFLAGS = $(CPU) -g -Wall -Wextra -Os
 LD_SCRIPT = f030f4.ld
 ### Build rules
 .PHONY: clean
 clean:
    @echo CLEAN
    @rm -f *.o *.elf *.map *.lst *.bin *.hex
 %.elf: %.o
    @echo $@
    $(LD) -T$(LD_SCRIPT) -Map=$*.map -cref -o $@ $<
    $(SIZE) $@
    $(OBJDUMP) -hS $@ > $*.lst
 %.bin: %.elf
    @echo $@
    $(OBJCOPY) -O binary $< $@
 %.hex: %.elf
    @echo $@
    $(OBJCOPY) -O ihex $< $@
 </pre>
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/13_flash.html
+++ b/docs/13_flash.html
@ -0,0 +1,163 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>1.3 Flash – Boot – Debug</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>1.3 Flash – Boot – Debug</h1>
 Now that I have an executable bootstrap, I need to flash an actual board
 with it to check if it works as expected. On a member of the STM32F030
 family, there are two options for flashing:
 <ul>
 <li> Use the Serial Wire Debug (<b>SWD</b>) interface with a ST-Link adapter and a
 software utility to flash.
 <li> Use the serial interface to communicate with the boot loader in System
 Memory via a USB to serial adapter.
 </ul>
 As the bootstrap code I want to test does nothing except giving control
 to an idle loop, I will need extra debugging on top of the flashing
 functionality, so the SWD interface is a must for what I want to achieve
 here.
 <h2>ST-Link</h2>
 <h3>1. SWD interface</h3>
 Arm Serial Wire Debug Port (<b>SW-DP</b>) is provided as a two wire interface.
 On the STM32F030, the functionality is activated at reset on two pins
 (PA13=SWDIO, PA14=SWCLK). Most boards available online have pre-soldered
 pins for Vcc=3.3V, Gnd, SWCLK and SWDIO.
 <h3>2. ST-Link v2</h3>
 ST-Link is an in-circuit debugger/programmer for the STM8 and STM32
 chipsets. There are three versions of the product as well as mini
 versions. STM32 Discovery and Nucleo boards have an onboard ST-Link v2.
 I am using ST-Link v2 mini clones. For simple use cases, the ST-Link can
 provide power to the board to flash or test.
 <p>
 <img src="img/13_stlink.png"
 alt="ST-Link v2 mini clone connected to STM32F030F4P6 based board">
 <h3>3. STM32 Cube Programmer</h3>
 Referenced as
 <a href="https://www.st.com/content/st_com/en/products/development-tools/software-development-tools/stm32-software-development-tools/stm32-programmers/stm32cubeprog.html">
 STM32CubeProg</a>
 on STMicroelectronics website, the STM32 Cube Programmer comes with USB
 drivers and a firmware upgrade utility for the ST-Link. It’s a Java-based
 application with distributions available for Win32, Win64, Mac and Linux.
 There are regular updates to support the latest chipsets. I am currently
 using version v2.17.0.
 <h2>Roadtesting the Bootstrap</h2>
 First I activate the connection in the programmer.
 <pre>
  14:37:31 : STM32CubeProgrammer API v2.17.0 | Windows-64Bits 
  14:37:38 : UR connection mode is defined with the HWrst reset mode
  14:37:38 : ST-LINK SN  : 55FF6B065177495619420887
  14:37:38 : ST-LINK FW  : V2J45S7
  14:37:38 : Board       : --
  14:37:38 : Voltage     : 3.27V
  14:37:38 : SWD freq    : 4000 KHz
  14:37:38 : Connect mode: Hot Plug
  14:37:38 : Reset mode  : Software reset
  14:37:38 : Device ID   : 0x444
  14:37:38 : Revision ID : Rev 1.0
  14:37:38 : Debug in Low Power mode is not supported for this device.
  14:37:39 : UPLOADING OPTION BYTES DATA ...
  14:37:39 :   Bank          : 0x00
  14:37:39 :   Address       : 0x1ffff800
  14:37:39 :   Size          : 16 Bytes
  14:37:39 : UPLOADING ...
  14:37:39 :   Size          : 1024 Bytes
  14:37:39 :   Address       : 0x8000000
  14:37:39 : Read progress:
  14:37:39 : Data read successfully
  14:37:39 : Time elapsed during the read operation is: 00:00:00.007
 </pre>
 Then program and verify the bootstrap code. Binary, Intel hex and Motorola
 S-record formats are supported. Our <b>Makefile</b> has rules for binary
 and Intel hex; <b><i>objcopy</i></b> also supports Motorola S-record as an output
 format. The last build produced <b>boot.hex</b>.
 <pre>
  14:40:24 : Memory Programming ...
  14:40:24 : Opening and parsing file: boot.hex
  14:40:24 :   File          : boot.hex
  14:40:24 :   Size          : 10.00 B 
  14:40:24 :   Address       : 0x08000000 
  14:40:24 : Erasing memory corresponding to segment 0:
  14:40:24 : Erasing internal memory sector 0
  14:40:24 : Download in Progress:
  14:40:24 : File download complete
  14:40:24 : Time elapsed during download operation: 00:00:00.130
  14:40:24 : Verifying ...
  14:40:24 : Read progress:
  14:40:24 : Download verified successfully 
  14:40:24 : RUNNING Program ... 
  14:40:24 :   Address:      : 0x08000000
  14:40:24 : Application is running, Please Hold on...
  14:40:24 : Start operation achieved successfully
 </pre>
 Finally check the registers in the MCU Core Panel:
 <pre>
 MSP: 0x20001000
 PC:  0x8000008
 </pre>
 After reset, the stack pointer has been initialized and the program
 counter is on the idle loop under execution.
 <p>
 If I check the Programming Manual
 <a href="https://www.st.com/content/st_com/en/search.html#q=PM0215-t=resources-page=1">
 PM0215 <i>STM32F0 series Cortex-M0 programming manual</i></a>, I can
 read the following about the registers <b>MSP</b> and <b>PC</b>:
 <pre>
 Stack pointer (SP) register R13
 In Thread mode, bit[1] of the CONTROL register indicates the stack
 pointer to use:
 ● 0: Main Stack Pointer (MSP)(reset value). On reset, the processor
 loads the MSP with the value from address 0x00000000.
 ● 1: Process Stack Pointer (PSP).
 Program counter (PC) register R15
 Contains the current program address. On reset, the processor loads the PC
 with the value of the reset vector, which is at address 0x00000004.
 Bit[0] of the value is loaded into the EPSR T-bit at reset and must be 1.
 </pre>
 - According to this, initial values for <b>MSP</b> and <b>PC</b> registers are
 fetched from address <b>0x00000000</b> and <b>0x00000004</b> respectively, but
 I have located the isr table at the beginning of the Flash memory at
 address <b>0x08000000</b>! This works because the memory space at address 0
 is a mirror of another memory area. Which area is mirrored depends on
 the state of the <b>BOOT0</b> pin. On the board I am testing, there is a
 jumper to select either Flash or System memory by setting the state of
 the <b>BOOT0</b> pin to low or high, respectively.
 <p>
 - The <b>EPSR T-bit</b>, mentioned in the description of the <b>PC</b>
 register is the Thumb bit. As I highlighted before when I checked the
 output of our first build, bit 0 of the second entry in our isr table is
 set to 1 as requested by this specification.
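 <p>
 The mirroring can be pictured with a small host side sketch. It assumes
 that Flash is the area selected for boot and that the mirrored window is no
 larger than the 16K of Flash declared in the link script. The names
 <b>FLASH_BASE</b>, <b>FLASH_SIZE</b> and <b>alias_to_flash</b> are
 hypothetical.

```c
#include <stdint.h>

#define FLASH_BASE  0x08000000u     /* FLASH origin from the link script */
#define FLASH_SIZE  (16u * 1024u)   /* FLASH length from the link script */

/* Translate an address in the window at address 0 into the Flash address it
 * mirrors. Addresses outside the window are returned unchanged. */
uint32_t alias_to_flash(uint32_t addr) {
    if (addr < FLASH_SIZE)
        return FLASH_BASE + addr;

    return addr;
}
```

 With this model, the fetch of <b>MSP</b> from 0x00000000 reads the table
 entry at 0x08000000 and the fetch of <b>PC</b> from 0x00000004 reads the
 entry at 0x08000004.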
 <h2>Checkpoint</h2>
 I used the Serial Wire Debug (SWD) interface to flash and debug our bootstrap
 in an actual board using a ST-Link hardware adapter and STM32 Cube Programmer
 application.
 <p>
 <a href="14_ledon.html">Next</a>, I will provide feedback of execution directly
 through the board.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/14_ledon.html
+++ b/docs/14_ledon.html
@ -0,0 +1,133 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>1.4 User LED ON</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>1.4 User LED ON</h1>
 Turning the user LED on is the simplest direct feedback you can get from
 a board. Most boards you can buy online have two LEDs: power and user.
 In order to know where the user LED is connected and how to drive it on,
 you need to know the board you are developing for.
 <h2>Meet your board</h2>
 The board I am using is made by VCC-GND Studio and its specifications
 can be found
 <a href="https://stm32-base.org/boards/STM32F030F4P6-VCC-GND.html">here</a>.
 Taking a look at the schematics I can see that the user LED is connected to
 GPIO B1 and will turn on when that pin is driven low.
 <p>
 <img alt="user LED connected to GPIO B1" src="img/14_ledpb1.png">
 <p>
 This board is based on the micro-controller STM32F030F4P6, so let's
 learn about its implementation of GPIOs.
 <h2>Know your chipset</h2>
 From the datasheet
 <a href="https://www.st.com/content/st_com/en/search.html#q=DS9773-t=resources-page=1">
 DS9773</a>,
 I learn that pin 14 of the 20 pin package defaults as PB1 after reset.
 It's a 3.3V tolerant I/O pin and can be configured either as an output,
 an input or one of several alternate functions. The GPIO peripherals are
 connected on the AHB2 Bus which means that they can be controlled
 through their registers visible in the AHB2 sub-range of the Peripherals
 range of address. For GPIO B, that means the address range 0x48000400 to
 0x480007FF.
 <p>
 Diving in the reference manual
 <a href="https://www.st.com/content/st_com/en/search.html#q=%20RM0360-t=resources-page=1">
 RM0360</a>,
 I find the layout of the GPIO B registers and their initial state plus
 the info that peripheral clocks need to be enabled through the Reset and
 Clock Controller (<b>RCC</b>) connected on the AHB1 bus. So I need to activate
 the clocks of GPIO B through the RCC before I can access its registers.
 <p>
 To turn the user LED on, I need to
 <ul>
 <li> Enable the clock of GPIO B
 <li> Configure B1 mode as an output by selecting mode 01
 <li> Configure B1 type as a push-pull, which is the reset value (0)
 <li> Set B1 value to 0, which is the reset value (0)
 </ul>
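 <p>
 The mode selection step can be sketched on the host as an operation on a
 plain 32 bit value standing in for the mode register. Each pin owns a two
 bit field, and 01 selects output as listed above. The other three encodings
 in the enum are my reading of RM0360 and should be checked there. The
 helper name <b>moder_with_mode</b> is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Two bit mode encodings (01 is from the text, the others are assumed). */
enum { MODE_INPUT = 0, MODE_OUTPUT = 1, MODE_ALTERNATE = 2, MODE_ANALOG = 3 };

/* Return moder with the two bit field of the given pin replaced by mode. */
uint32_t moder_with_mode(uint32_t moder, unsigned pin, unsigned mode) {
    assert(pin < 16 && mode < 4);

    moder &= ~((uint32_t) 3 << (pin * 2));  /* clear the field first */
    moder |= (uint32_t) mode << (pin * 2);  /* then set the new mode */
    return moder;
}
```

 Starting from a register value of 0, selecting output for pin 1 gives
 0x00000004. Clearing the field before setting it only matters when the
 field is not already 00.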
 <h2>Code and Build</h2>
 I start by making a copy of <b>boot.c</b> into <b>ledon.c</b> and rework the
 <b>Reset_Handler</b> to light up the LED before entering the idle loop.
 <pre>
 /* Memory locations defined by linker script */
 extern long __StackTop ;        /* &__StackTop points after end of stack */
 void Reset_Handler( void) ;     /* Entry point for execution */
 /* Interrupt vector table:
 * 1  Stack Pointer reset value
 * 15 System Exceptions
 * NN Device specific Interrupts
 */
 typedef void (*isr_p)( void) ;
 isr_p const isr_vector[ 2] __attribute__((section(".isr_vector"))) = {
    (isr_p) &__StackTop,
 /* System Exceptions */
    Reset_Handler
 } ;
 #define RCC                 ((volatile long *) 0x40021000)
 #define RCC_AHBENR          RCC[ 5]
 #define RCC_AHBENR_IOPBEN   0x00040000  /*  18: I/O port B clock enable */
 #define GPIOB               ((volatile long *) 0x48000400)
 #define GPIOB_MODER         GPIOB[ 0]
 void Reset_Handler( void) {
 /* User LED ON */
    RCC_AHBENR |= RCC_AHBENR_IOPBEN ;   /* Enable IOPB periph */
    GPIOB_MODER |= 1 << (1 * 2) ;       /* PB1 Output [01], over default 00 */
    /* OTYPER Push-Pull by default */
    /* PB1 output default LOW at reset */
    for( ;;) ;
 }
 </pre>
 - I use the C preprocessor to specify the mapping of the peripheral
 registers.
 <p>
 - The naming convention is from the Reference Manual, the address
 locations from the Data Sheet.
 <p>
 - Registers are declared <b>volatile</b> as they may change outside the code's
 control; this way the compiler will avoid optimizations based on known states.
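 <p>
 The array indexes used in the macros can be cross checked against byte
 offsets with a short host side sketch. It assumes 32 bit registers, which
 matches <b>long</b> on this target but not on every host. The helper names
 <b>reg_index</b>, <b>reg_address</b> and <b>bit_mask</b> are hypothetical.

```c
#include <stdint.h>

#define RCC_BASE    0x40021000u
#define GPIOB_BASE  0x48000400u

/* Array index of a register given its byte offset from the peripheral base,
 * assuming every register is 32 bits wide. */
unsigned reg_index(uint32_t byte_offset) {
    return byte_offset / 4;
}

/* Address reached when indexing a peripheral base as an array of 32 bit
 * words. */
uint32_t reg_address(uint32_t base, unsigned index) {
    return base + 4u * index;
}

/* Mask with a single bit set. */
uint32_t bit_mask(unsigned bit) {
    return (uint32_t) 1 << bit;
}
```

 Index 5 corresponds to byte offset 0x14, so <b>RCC[ 5]</b> designates
 address 0x40021014, and bit 18 gives the mask 0x00040000 used for
 <b>RCC_AHBENR_IOPBEN</b>. Declaring the registers with <b>uint32_t</b> from
 <b>stdint.h</b> instead of <b>long</b> would make the 32 bit assumption
 explicit.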
 <p>
 To build I just request the format I need, either <b>.bin</b> or <b>.hex</b>.
 <pre>
 $ make ledon.hex
 ledon.elf
   text    data     bss     dec     hex filename
     40       0       0      40      28 ledon.elf
 ledon.hex
 rm ledon.elf ledon.o
 </pre>
 <h2>Test</h2>
 Once the board has been flashed with this code, the user LED lights up
 at reset. It turns out to be blue. &#x1F60E;
 <p>
 <img alt="User LED on" src="img/14_ledon.png">
 <h2>Checkpoint</h2>
 I covered the basics of GPIO output programming by turning the user LED
 on.
 <p>
 <a href="15_blink.html">Next</a>, I will implement the classic blinking
 LED.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/15_blink.html
+++ b/docs/15_blink.html
@ -0,0 +1,101 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>1.5 Blinking user LED</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>1.5 Blinking user LED</h1>
 Turning the user LED on shows that the board is alive, making it blink
 shows that it has a pulse!&nbsp; A steady power LED with a pulsing user LED is
 easy to interpret.
 <p>
 As I already managed to turn the LED on, making it blink is quite
 straightforward.
 <h2>Code and build</h2>
 I make a copy of <b>ledon.c</b> into <b>blink.c</b> and modify it as follows:
 <ul>
 <li>Introduce the Output Data Register (<b>ODR</b>) of <b>GPIOB</b> peripheral
 to the compiler as <b>GPIOB_ODR</b>. This is where the pin state is programmed.
 <li>Modify the idle loop to toggle PB1 after a delay. The delay is coded
 as an active wait loop on decrementing a counter.
 </ul>
 <pre>
 /* Memory locations defined by linker script */
 extern long __StackTop ;        /* &__StackTop points after end of stack */
 void Reset_Handler( void) ;     /* Entry point for execution */
 /* Interrupt vector table:
 * 1  Stack Pointer reset value
 * 15 System Exceptions
 * NN Device specific Interrupts
 */
 typedef void (*isr_p)( void) ;
 isr_p const isr_vector[ 2] __attribute__((section(".isr_vector"))) = {
    (isr_p) &__StackTop,
 /* System Exceptions */
    Reset_Handler
 } ;
 #define RCC                 ((volatile long *) 0x40021000)
 #define RCC_AHBENR          RCC[ 5]
 #define RCC_AHBENR_IOPBEN   0x00040000  /*  18: I/O port B clock enable */
 #define GPIOB               ((volatile long *) 0x48000400)
 #define GPIOB_MODER         GPIOB[ 0]
 #define GPIOB_ODR           GPIOB[ 5]
 void Reset_Handler( void) {
    int delay ;
 /* User LED ON */
    RCC_AHBENR |= RCC_AHBENR_IOPBEN ;   /* Enable IOPB periph */
    GPIOB_MODER |= 1 << (1 * 2) ;       /* PB1 Output [01], over default 00 */
    /* OTYPER Push-Pull by default */
    /* PB1 output default LOW at reset */
 /* User LED blink */
    for( ;;) {
        for( delay = 1000000 ; delay ; delay--) ;   /* delay between toggling */
        GPIOB_ODR ^= 1 << 1 ;                       /* toggle PB1 (User LED) */
    }
 }
 </pre>
 I set the value of the delay counter at one million. By default the
 internal clock is set to 8MHz at reset, which means the delay will still
 be less than one second.
 <pre>
 $ make blink.hex
 blink.elf
   text    data     bss     dec     hex filename
     68       0       0      68      44 blink.elf
 blink.hex
 rm blink.o blink.elf
 </pre>
 <h2>Test</h2>
 As the <a href="vid/15_blink.mp4">video</a> shows, the delay is roughly
 600ms. I captured three on/off transitions in this three second video;
 stepping through the frames gives me a better estimate.
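 <p>
 As a rough cross-check, the measured delay can be turned into a cost per
 loop iteration. The sketch below is a host-side calculation, not part of
 <b>blink.c</b>. The function name is hypothetical and it assumes the core runs
 from the 8 MHz internal clock during the whole delay.

```c
/* Estimate the CPU cycles spent per iteration of an active wait loop.
 * clock_hz:   core clock frequency (8000000 by default at reset)
 * iterations: initial value of the delay counter
 * delay_ms:   measured duration of one delay, in milliseconds
 */
double cycles_per_iteration( double clock_hz, double iterations,
                                                        double delay_ms) {
    double total_cycles = clock_hz * delay_ms / 1000.0 ;

    return total_cycles / iterations ;
}
```

 With the values of this step, <code>cycles_per_iteration( 8e6, 1e6, 600)</code>
 gives about 4.8 cycles per iteration.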
 <h2>Checkpoint</h2>
 This is just a small increment on my previous step, but that’s iterative
 development in a nutshell. Also, I didn't come up with a reasonable value
 for the delay counter at first: it's easy to underestimate how fast
 micro-controllers are.
 <p>
 <a href="16_ledtick.html">Next</a>, interrupt driven blinking, because
 active wait delay is just not cool.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/16_ledtick.html
+++ b/docs/16_ledtick.html
@ -0,0 +1,169 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>1.6 The Tick</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>1.6 The Tick</h1>
 <img alt="It’s blue! It blinks! It’s the Tick!" src="img/16_tick.png">
 <p>
 In the previous iteration, I made the user LED blink using an active delay
 loop. I have two issues with this implementation:
 <ul>
 <li> It’s hard to control the delay timing accurately
 </ul><ul>
 <li> Active loops are not cool
 </ul>
 So I am going to call in some serious reinforcement, which means one of
 the Arm Core Peripherals: the System Tick.
 <p>
 What the System Tick does is very similar to my active delay loop as
 can be seen from the following pseudo-code.
 <pre>
 while( enabled) {
    if( --current_value == 0) {
        current_value = reload_value ;
        countflag = true ;
        if( interrupt_enabled)
            SysTick_Handler() ;
    }
 }
 </pre>
 It’s an auto decremented counter that reloads and sets a flag when
 reaching zero. It can trigger a system interrupt if requested to. Its
 default clock is the processor clock and can be switched to external
 clock. Details can be found in the Programming Manual as this is part of
 Arm Core.
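 <p>
 The counting behaviour can be tried on a host before touching the hardware.
 This is a minimal model, not the real peripheral: the structure and function
 names are hypothetical, and it assumes the counter wraps to the reload value on
 the clock that follows zero, so a reload value of N gives a period of N + 1
 clocks.

```c
#include <stdint.h>

/* Host-side model of the SysTick counter, for illustration only. */
typedef struct {
    uint32_t reload ;       /* stands in for the reload value register */
    uint32_t current ;      /* stands in for the current value register */
    unsigned interrupts ;   /* number of times the counter reached zero */
} systick_model ;

/* Advance the model by one clock: reload when at zero, else count down. */
void systick_clock( systick_model *st) {
    if( st->current == 0)
        st->current = st->reload ;
    else if( --st->current == 0)
        st->interrupts += 1 ;
}

/* Run the model for a number of clocks, return the interrupt count. */
unsigned systick_run( systick_model *st, uint32_t clocks) {
    while( clocks--)
        systick_clock( st) ;

    return st->interrupts ;
}
```

 Starting from a cleared counter with a reload value of 9, the model reaches
 zero for the first time on the tenth clock and then once every ten clocks.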
 <h2>Code, build and test</h2>
 I copy <b>blink.c</b> into <b>ledtick.c</b> to make the following
 modifications:
 <ul>
 <li> Expand the interrupt vector to make room for all 15 System Exceptions
 from <b>Reset_Handler</b> to <b>SysTick_Handler</b>.
 </ul><ul>
 <li> Introduce the <b>SysTick</b> core peripheral to the compiler, using
 pre-processor macros to give the location of SysTick registers. As this
 is a core peripheral, it’s in a different address space than the
 peripherals I have seen so far.
 </ul><ul>
 <li> Start the <b>Reset_Handler</b> by initializing and enabling the System
 Tick. The reload value register is initialized with a constant based on
 the internal clock value divided by 8. As I want one tick per second and
 the default internal clock is 8 MHz, SysTick will have to count one
 million steps, from 999999 to 0. So the reload value is 999999.
 </ul><ul>
 <li> Create the <b>SysTick_Handler</b> interrupt routine which toggles GPIO
 B1.
 </ul><ul>
 <li> Replace the previous active delay loop by a <b>cool</b> idle loop.
 Instead of doing nothing actively, I instruct the processor to wait for
 interrupt, which means it will suspend until woken up by the SysTick.
 This is one way to lower power consumption. Besides, waiting for SysTick
 interrupt is really the only thing left to do.
 </ul>
 <pre>
 /* Memory locations defined by linker script */
 extern long __StackTop ;        /* &__StackTop points after end of stack */
 void Reset_Handler( void) ;     /* Entry point for execution */
 void SysTick_Handler( void) ;
 /* Interrupt vector table:
 * 1  Stack Pointer reset value
 * 15 System Exceptions
 * NN Device specific Interrupts
 */
 typedef void (*isr_p)( void) ;
 isr_p const isr_vector[ 16] __attribute__((section(".isr_vector"))) = {
    (isr_p) &__StackTop,
 /* System Exceptions */
    Reset_Handler,
    0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
    SysTick_Handler
 } ;
 #define SYSTICK             ((volatile long *) 0xE000E010)
 #define SYSTICK_CSR         SYSTICK[ 0]
 #define SYSTICK_RVR         SYSTICK[ 1]
 #define SYSTICK_CVR         SYSTICK[ 2]
 #define RCC                 ((volatile long *) 0x40021000)
 #define RCC_AHBENR          RCC[ 5]
 #define RCC_AHBENR_IOPBEN   0x00040000  /*  18: I/O port B clock enable */
 #define GPIOB               ((volatile long *) 0x48000400)
 #define GPIOB_MODER         GPIOB[ 0]
 #define GPIOB_ODR           GPIOB[ 5]
 void Reset_Handler( void) {
 /* By default SYSCLK == HSI [8MHZ] */
 /* SYSTICK */
    SYSTICK_RVR = 1000000 - 1 ;     /* AHB / 8 */
    SYSTICK_CVR = 0 ;
    SYSTICK_CSR = 3 ;               /* AHB / 8, Interrupt ON, Enable */
    /* SysTick_Handler will execute every 1s from now on */
 /* User LED ON */
    RCC_AHBENR |= RCC_AHBENR_IOPBEN ;   /* Enable IOPB periph */
    GPIOB_MODER |= 1 << (1 * 2) ;       /* PB1 Output [01], over default 00 */
    /* OTYPER Push-Pull by default */
    /* PB1 output default LOW at reset */
    for( ;;)
        __asm( "WFI") ; /* Wait for interrupt */
 }
 void SysTick_Handler( void) {
    GPIOB_ODR ^= 1 << 1 ;   /* Toggle PB1 (User LED) */
 }
 </pre>
 I didn’t initialize the GPIO B before enabling the SysTick as I have a
 whole second before the first interrupt will tick in.
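 <p>
 The reload value follows from the clock, the divider and the tick rate. The
 helper below is a sketch with a hypothetical name, meant to be tried on a host.
 It assumes the reload register is 24 bits wide and returns 0 when the request
 does not fit.

```c
#include <stdint.h>

#define SYSTICK_MAX_RELOAD  0x00FFFFFF  /* reload register is 24 bits wide */

/* Compute the SysTick reload value for a given tick rate.
 * Returns 0 when the request cannot be met (the caller has to check).
 */
uint32_t systick_reload( uint32_t clock_hz, uint32_t divider,
                                                        uint32_t ticks_per_s) {
    uint32_t steps ;

    if( divider == 0 || ticks_per_s == 0)
        return 0 ;

    steps = clock_hz / divider / ticks_per_s ;  /* counter steps per tick */
    if( steps < 2 || steps - 1 > SYSTICK_MAX_RELOAD)
        return 0 ;

    return steps - 1 ;      /* counting from steps - 1 down to 0 */
}
```

 <code>systick_reload( 8000000, 8, 1)</code> gives the 999999 used above. A one
 second tick from a 48 MHz clock without the divider would not fit in 24 bits,
 so the helper returns 0 in that case.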
 <p>
 Build is straightforward.
 <pre>
 $ make ledtick.hex
 ledtick.elf
   text    data     bss     dec     hex filename
    148       0       0     148      94 ledtick.elf
 ledtick.hex
 rm ledtick.o ledtick.elf
 </pre>
 If I compare with blink.hex, 56 bytes of the 80-byte code increase are
 due to the expansion of the interrupt vector from 2 to 16 entries of 4 bytes
 each.
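 <p>
 The split between table growth and code growth can be worked out as follows.
 This is only a host-side arithmetic sketch with hypothetical function names; it
 assumes every vector entry is a 32 bit address.

```c
/* Size in bytes of a vector table, each entry being a 32 bit address. */
unsigned vector_table_size( unsigned entries) {
    return entries * 4 ;
}

/* Part of a text size increase that is not due to the vector table. */
unsigned code_only_growth( unsigned old_text, unsigned new_text,
                            unsigned old_entries, unsigned new_entries) {
    unsigned table_growth =
        vector_table_size( new_entries) - vector_table_size( old_entries) ;

    return (new_text - old_text) - table_growth ;
}
```

 Going from 2 to 16 entries adds 56 bytes, which leaves
 <code>code_only_growth( 68, 148, 2, 16)</code>, that is 24 bytes, for the
 SysTick setup and the new handler.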
 <p>
 Once flashed in the board I can see that the LED changes state every second.
 <h2>Checkpoint</h2>
 I now have the foundation for timing and a first taste of shifting
 execution between the main loop and an interrupt routine.
 <p>
 Code size has been growing steadily since the first bootstrap. On the
 other hand, except for the stack, I have not used RAM so far.
 <pre>
 │        │ text │ data  │ bss   │
 ├────────┼──────┼───────┼───────┤
 │<b>boot</b>    │ 10   │ 0     │ 0     │
 │<b>ledon</b>   │ 40   │ 0     │ 0     │
 │<b>blink</b>   │ 68   │ 0     │ 0     │
 │<b>ledtick</b> │ 148  │ 0     │ 0     │
 </pre>
 <a href="17_cstartup.html">Next</a>, I will focus on RAM initialization.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/17_cstartup.html
+++ b/docs/17_cstartup.html
@ -0,0 +1,250 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>1.7 C Startup</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>1.7 C Startup</h1>
 The C compiler uses a memory model where the RAM is divided into four
 contiguous sections. The linker provides the symbols needed to make sure
 the initial state meets the requirements of this memory model. So I need
 to write a piece of code to use those symbols to initialize or clear the
 RAM accordingly.
 <pre>
 │ <b>Section</b> │ <b>Description</b>                                  │
 ├─────────┼──────────────────────────────────────────────┤
 │ data    │ static initialized, initial values in flash  │
 │ bss     │ static unassigned, cleared                   │
 │ heap    │ dynamically allocated, user managed          │
 │ stack   │ automatically allocated, stack frame managed │
 </pre>
 My bootstrap since the first <b>boot.c</b> already initializes the stack. I
 now need to copy the initial values from flash to the <b>data</b> section and
 clear the <b>bss</b> section.
 <p>
 You can check your understanding of the C memory model by looking at the
 C test code below and figuring where the linker will allocate the
 variables.
 <pre>
 /** Test code: main.c *********************************************************/
 const char hexa[] = "0123456789abcdef" ;
 long first = 1 ;
 long i ;
 int main( void) {
    static char c = 'a' ;
    char *cp = &c ;
    *cp += i ;
    i += hexa[ 13] - c + first++ ;
    return 0 ;
 }
 </pre>
 <ul>
 <li> <b>data</b> section holds <code>first</code> and <code>c</code>, for a
 total of 8 bytes as sections are word aligned.
 </ul><ul>
 <li> <b>bss</b> section holds <code>i</code> for a total of 4 bytes.
 </ul><ul>
 <li> <b>text</b> section holds <code>hexa</code> with all the const data
 located after the code. As it is a zero terminated string, it occupies 17 bytes
 and is padded with 3 zero bytes for word alignment.
 </ul><ul>
 <li> <b>text</b> section holds the initial value of <code>first</code> and
 <code>c</code> for a total of 8 bytes located after the const data.
 </ul><ul>
 <li> <b>stack</b> section holds <code>cp</code>, it is dynamically managed by
 the code generated by the C compiler.
 </ul><ul>
 <li> after executing <code>main()</code>, <code>hexa</code> and <code>c</code>
 are unchanged, <code>first</code> has the value 2, <code>i</code> has the
 value 4 and <code>cp</code> has been deallocated.
 </ul>
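 <p>
 The predicted values can be checked on a host before running on the board.
 In this sketch the test code is kept as is, except that <code>main()</code> is
 renamed to the hypothetical <code>run_once()</code> and returns <code>c</code>
 so that a harness can inspect it. Section sizes are not checked here, as
 <code>long</code> may be wider on a host.

```c
const char hexa[] = "0123456789abcdef" ;    /* read only, 17 bytes with NUL */
long first = 1 ;                            /* initialized data */
long i ;                                    /* cleared at startup */

/* Same statements as the test code, returns the final value of c. */
char run_once( void) {
    static char c = 'a' ;   /* initialized data, local scope */
    char *cp = &c ;         /* automatic, lives on the stack */

    *cp += i ;              /* i is 0 on first call, c is unchanged */
    i += hexa[ 13] - c + first++ ;  /* 'd' - 'a' + 1 */
    return c ;
}
```

 A single call leaves <code>c</code> at 'a', <code>first</code> at 2 and
 <code>i</code> at 4.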
 <h2>Evolving the bootstrap</h2>
 First I make a copy of <b>boot.c</b> into <b>cstartup.c</b>.
 <p>
 I add the symbols defined by the linker script:
 <ul>
 <li> <code>__etext</code>, start of initial value copy in FLASH.
 <li> <code>__data_start__</code>, start of initialized data in RAM.
 <li> <code>__bss_start__</code>, start of uninitialized data in RAM, it is the
 same location as <code>__data_end__</code>.
 <li> <code>__bss_end__</code>, first location after the bss section.
 </ul>
 I rework <code>Reset_Handler()</code> to:
 <ul>
 <li> Initialize the <b>data</b> section with the initial values stored in flash.
 <li> Clear the <b>bss</b> section.
 <li> Call the <code>main()</code> C function.
 <li> Fall back to an idle loop after <code>main()</code> has been executed.
 </ul>
 Finally I append the test code for validation.
 <pre>
 /* Memory locations defined by linker script */
 extern long __StackTop ;        /* &__StackTop points after end of stack */
 void Reset_Handler( void) ;     /* Entry point for execution */
 extern const long __etext[] ;   /* start of initialized data copy in flash */
 extern long __data_start__[] ;
 extern long __bss_start__[] ;
 extern long __bss_end__ ;       /* &__bss_end__ points after end of bss */
 /* Interrupt vector table:
 * 1  Stack Pointer reset value
 * 15 System Exceptions
 * NN Device specific Interrupts
 */
 typedef void (*isr_p)( void) ;
 isr_p const isr_vector[ 2] __attribute__((section(".isr_vector"))) = {
    (isr_p) &__StackTop,
 /* System Exceptions */
    Reset_Handler
 } ;
 extern int main( void) ;
 void Reset_Handler( void) {
    const long  *f ;    /* from, source constant data from FLASH */
    long    *t ;        /* to, destination in RAM */
 /* Assume:
 **  __bss_start__ == __data_end__
 **  All sections are 4 bytes aligned
 */
    f = __etext ;
    for( t = __data_start__ ; t < __bss_start__ ; t += 1)
        *t = *f++ ;
    while( t < &__bss_end__)
        *t++ = 0 ;
    main() ;
    for( ;;) ;
 }
 /** Test code: main.c *********************************************************/
 const char hexa[] = "0123456789abcdef" ;
 long first = 1 ;
 long i ;
 int main( void) {
    static char c = 'a' ;
    char *cp = &c ;
    *cp += i ;
    i += hexa[ 13] - c + first++ ;
    return 0 ;
 }
 </pre>
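 <p>
 The copy and clear loops can be exercised on a host by passing the section
 boundaries as parameters instead of taking them from the linker. This is a
 sketch under that assumption; <code>init_ram()</code> is a hypothetical name
 and plain arrays stand in for flash and RAM.

```c
/* Copy initial values to the data section, then clear the bss section.
 * from:       initial values, stands in for __etext
 * data_start: stands in for __data_start__
 * bss_start:  stands in for __bss_start__ (same as end of data)
 * bss_end:    first location after bss
 */
void init_ram( const long *from, long *data_start, long *bss_start,
                                                            long *bss_end) {
    long *t ;   /* to, destination in RAM */

    for( t = data_start ; t < bss_start ; t += 1)
        *t = *from++ ;

    while( t < bss_end)
        *t++ = 0 ;
}
```

 With a two word data section followed by a one word bss section, the first two
 words receive the initial values, the third is cleared and anything beyond is
 left untouched.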
 <h2>Build</h2>
 Building a binary, I can see that the <b>data</b> and <b>bss</b> sections are
 not empty anymore and their sizes match my estimations.
 <pre>
 $ make cstartup.bin
 cstartup.elf
   text    data     bss     dec     hex filename
    121       8       4     133      85 cstartup.elf
 cstartup.bin
 rm cstartup.o cstartup.elf
 </pre>
 Looking further into the <b>cstartup.map</b> generated by the linker, I can
 check where each symbol has been placed.
 <pre>
 .text           0x08000000       0x79
 *(.isr_vector)
 .isr_vector    0x08000000        0x8 cstartup.o
                0x08000000                isr_vector
 *(.text*)
 .text          0x08000008       0x34 cstartup.o
                0x08000008                Reset_Handler
 .text.startup  0x0800003c       0x2c cstartup.o
                0x0800003c                main
 *(.rodata*)
 .rodata        0x08000068       0x11 cstartup.o
                0x08000068                hexa
 .data           0x20000000        0x8 load address 0x0800007c
                0x20000000                __data_start__ = .
 *(.data*)
 .data          0x20000000        0x8 cstartup.o
                0x20000004                first
                0x20000008                . = ALIGN (0x4)
                0x20000008                __data_end__ = .
 .bss            0x20000008        0x4 load address 0x08000084
                0x20000008                . = ALIGN (0x4)
                0x20000008                __bss_start__ = .
 *(.bss*)
 .bss           0x20000008        0x0 cstartup.o
 *(COMMON)
 COMMON         0x20000008        0x4 cstartup.o
                0x20000008                i
                0x2000000c                . = ALIGN (0x4)
                0x2000000c                __bss_end__ = .
 *(.stack*)
                0x20001000                __StackTop = (ORIGIN (RAM) + LENGTH (RAM))
 </pre>
 <ul>
 <li> <code>hexa</code> is located in <b>.rodata</b> at 0x08000068
 <li> <code>first</code> is located in <b>.data</b> at 0x20000004
 <li> <code>i</code> is located in <b>.bss</b> at 0x20000008
 <li> <code>c</code> is not listed as it doesn’t have global scope, but I can
 guess it’s located at 0x20000000.
 <li> Initial values for the <b>.data</b> section are located at 0x0800007c.
 </ul>
 A hexadecimal dump of <b>cstartup.bin</b> confirms that the initial value
 of <code>c</code> is at offset 0x7c, which also means that <code>c</code> has
 been located at 0x20000000.
 <pre>
 $ hexdump -C cstartup.bin
 00000000  00 10 00 20 09 00 00 08  10 b5 08 4a 08 4b 09 49  |... .......J.K.I|
 00000010  8b 42 06 d3 00 21 08 4a  93 42 05 d3 00 f0 0e f8  |.B...!.J.B......|
 00000020  fe e7 01 ca 01 c3 f3 e7  02 c3 f5 e7 7c 00 00 08  |............|...|
 00000030  00 00 00 20 08 00 00 20  0c 00 00 20 30 b5 08 49  |... ... ... 0..I|
 00000040  08 48 0a 78 04 68 4b 68  12 19 d2 b2 5d 1c 9b 1a  |.H.x.hKh....]...|
 00000050  64 33 1b 19 4d 60 03 60  0a 70 00 20 30 bd c0 46  |d3..M`.`.p. 0..F|
 00000060  00 00 00 20 08 00 00 20  30 31 32 33 34 35 36 37  |... ... 01234567|
 00000070  38 39 61 62 63 64 65 66  00 00 00 00 61 00 00 00  |89abcdef....a...|
 00000080  01 00 00 00                                       |....|
 00000084
 </pre>
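 <p>
 Words in the dump are stored least significant byte first. A small host-side
 helper, with the hypothetical name <code>le32()</code>, makes it easier to
 compare the dump with the map file.

```c
#include <stdint.h>

/* Read a 32 bit little-endian word from a byte buffer. */
uint32_t le32( const unsigned char *p) {
    return (uint32_t) p[ 0]
        | ((uint32_t) p[ 1] << 8)
        | ((uint32_t) p[ 2] << 16)
        | ((uint32_t) p[ 3] << 24) ;
}
```

 The first word of the dump decodes to 0x20001000, the value of
 <code>__StackTop</code>. The second decodes to 0x08000009, which is
 <code>Reset_Handler</code> at 0x08000008 with bit 0 set to mark Thumb code.
 The words at offsets 0x7c and 0x80 decode to 0x61 ('a') and 1, the initial
 values of <code>c</code> and <code>first</code>.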
 <h2>Debug</h2>
 I use the STM32 Cube Programmer to check that the code behaves as
 expected by checking the RAM content after <code>main()</code> has been
 executed.
 <p>
 <img alt="RAM display in STM32 Cube Programmer" src="img/17_cube.png"
 width="1024">
 <h2>Checkpoint</h2>
 I now have a sample bootstrap that puts the RAM in the state
 required for a C startup.
 <p>
 <a href="18_3stages.html">Next</a>, I will merge the C startup initialization with
 the <b>ledtick</b> code.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/18_3stages.html
+++ b/docs/18_3stages.html
@ -0,0 +1,265 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>1.8 Three-stage Rocket</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>1.8 Three-stage Rocket</h1>
 As I merge the cstartup with the ledtick code, I split the
 functionalities between three files according to the three phases: boot,
 initialization and main execution.
 <ul>
 <li> <b>startup.c</b> is mainly concerned with the early set up of the board:
 initializing the memory, calling system initialization and then
 executing the main C function if initialization is successful.
 </ul><ul>
 <li> <b>init.c</b> holds the middleware implementation and will abstract the
 peripherals into higher level interfaces. Its entry point is the
 <code>init()</code> function.
 </ul><ul>
 <li> the <code>main()</code> C function will be the focus of the last file and
 it should be written in a subset of standard C. As an example I will use
 <b>success.c</b>.
 </ul>
 <pre>
 /* success.c -- success does nothing, successfully */
 #include &lt;stdlib.h>
 int main( void) {
    return EXIT_SUCCESS ;
 }
 </pre>
 <h2>startup.c</h2>
 Besides the interrupt vector and the <code>Reset_Handler()</code> that calls
 <code>init()</code> and <code>main()</code>, I have created stubs for all the
 System Exceptions listed in the Programming Manual. For those, if there is no
 implementation available in <b>init.c</b>, the linker will use the
 <code>Default_Handler()</code> provided with the <code>__attribute__()</code>
 GNU C extension.
 <pre>
 /* Memory locations defined by linker script */
 extern long __StackTop ;        /* &__StackTop points after end of stack */
 void Reset_Handler( void) ;     /* Entry point for execution */
 extern const long __etext[] ;   /* start of initialized data copy in flash */
 extern long __data_start__[] ;
 extern long __bss_start__[] ;
 extern long __bss_end__ ;       /* &__bss_end__ points after end of bss */
 /* Stubs for System Exception Handler */
 void Default_Handler( void) ;
 #define dflt_hndlr( fun) void fun##_Handler( void) \
                                __attribute__((weak,alias("Default_Handler")))
 dflt_hndlr( NMI) ;
 dflt_hndlr( HardFault) ;
 dflt_hndlr( SVCall) ;
 dflt_hndlr( PendSV) ;
 dflt_hndlr( SysTick) ;
 /* Interrupt vector table:
 * 1  Stack Pointer reset value
 * 15 System Exceptions
 * NN Device specific Interrupts
 */
 typedef void (*isr_p)( void) ;
 isr_p const isr_vector[ 16] __attribute__((section(".isr_vector"))) = {
    (isr_p) &__StackTop,
 /* System Exceptions */
    Reset_Handler,
    NMI_Handler,
    HardFault_Handler,
    0,  0,  0,  0,  0,  0,  0,
    SVCall_Handler,
    0,  0,
    PendSV_Handler,
    SysTick_Handler
 } ;
 extern int init( void) ;
 extern int main( void) ;
 void Reset_Handler( void) {
    const long  *f ;    /* from, source constant data from FLASH */
    long    *t ;        /* to, destination in RAM */
 /* Assume:
 **  __bss_start__ == __data_end__
 **  All sections are 4 bytes aligned
 */
    f = __etext ;
    for( t = __data_start__ ; t < __bss_start__ ; t += 1)
        *t = *f++ ;
    while( t < &__bss_end__)
        *t++ = 0 ;
    if( init() == 0)
        main() ;
    for( ;;)
        __asm( "WFI") ; /* Wait for interrupt */
 }
 void Default_Handler( void) {
    for( ;;) ;
 }
 </pre>
 Except for the future addition of stubs to handle the device specific
 interrupts, this file will not grow much anymore.
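 <p>
 The stub mechanism can be observed on a host. This sketch relies on the same
 GNU extension, so it assumes GCC or a compatible compiler on a target that
 supports aliases, such as ELF. The call counter is added for illustration
 only.

```c
/* Count the calls so that the effect can be observed on a host. */
int default_calls ;

void Default_Handler( void) {
    default_calls += 1 ;
}

/* Same stub macro as in startup.c */
#define dflt_hndlr( fun) void fun##_Handler( void) \
                                __attribute__((weak,alias("Default_Handler")))
dflt_hndlr( NMI) ;
dflt_hndlr( HardFault) ;
```

 As no other definition of <code>NMI_Handler()</code> or
 <code>HardFault_Handler()</code> is linked in, calling either of them ends up
 in <code>Default_Handler()</code>. A regular definition in another object file
 would take precedence over the weak one.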
 <h2>init.c</h2>
 This is the embryo of a hardware abstraction layer where most of the
 device specific code will be added. The current code is the peripherals
 part of <b>ledtick.c</b>.
 <pre>
 #define SYSTICK             ((volatile long *) 0xE000E010)
 #define SYSTICK_CSR         SYSTICK[ 0]
 #define SYSTICK_RVR         SYSTICK[ 1]
 #define SYSTICK_CVR         SYSTICK[ 2]
 #define RCC                 ((volatile long *) 0x40021000)
 #define RCC_AHBENR          RCC[ 5]
 #define RCC_AHBENR_IOPBEN   0x00040000  /*  18: I/O port B clock enable */
 #define GPIOB               ((volatile long *) 0x48000400)
 #define GPIOB_MODER         GPIOB[ 0]
 #define GPIOB_ODR           GPIOB[ 5]
 int init( void) {
 /* By default SYSCLK == HSI [8MHZ] */
 /* SYSTICK */
    SYSTICK_RVR = 1000000 - 1 ;     /* AHB / 8 */
    SYSTICK_CVR = 0 ;
    SYSTICK_CSR = 3 ;               /* AHB / 8, Interrupt ON, Enable */
    /* SysTick_Handler will execute every 1s from now on */
 /* User LED ON */
    RCC_AHBENR |= RCC_AHBENR_IOPBEN ;   /* Enable IOPB periph */
    GPIOB_MODER |= 1 << (1 * 2) ;       /* PB1 Output [01], over default 00 */
    /* OTYPER Push-Pull by default */
    /* PB1 output default LOW at reset */
    return 0 ;
 }
 void SysTick_Handler( void) {
    GPIOB_ODR ^= 1 << 1 ;   /* Toggle PB1 (User LED) */
 }
 </pre>
 <h2>Makefile</h2>
 As I now build from multiple source files, I have modified the
 <b>Makefile</b> to list the sources that combine together. All steps I have
 done so far can be found in the commented <code>SRCS</code> lines. Single file
 steps can be built explicitly (<code>make ledon.hex</code>) or implicitly
 (<code>make</code>) after removing the comment on the corresponding
 <code>SRCS</code> line. Multiple file steps can only be built implicitly when
 their <code>SRCS</code> line is uncommented.
 <pre>
 ### Build environment selection
 ifeq (linux, $(findstring linux, $(MAKE_HOST)))
 GCCDIR = $(HOME)/Packages/arm-gnu-toolchain-13.3.rel1-x86_64-arm-none-eabi
 else
 GCCDIR = "D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi"
 endif
 BINPFX  = @$(GCCDIR)/bin/arm-none-eabi-
 CC      = $(BINPFX)gcc
 LD      = $(BINPFX)ld
 OBJCOPY = $(BINPFX)objcopy
 OBJDUMP = $(BINPFX)objdump
 SIZE    = $(BINPFX)size
 ### STM32F030F4P6 based board
 PROJECT = f030f4
 #SRCS = boot.c
 #SRCS = ledon.c
 #SRCS = blink.c
 #SRCS = ledtick.c
 #SRCS = cstartup.c
 SRCS = startup.c init.c success.c
 OBJS = $(SRCS:.c=.o)
 CPU = -mthumb -mcpu=cortex-m0
 CFLAGS = $(CPU) -g -Wall -Wextra -Os
 LD_SCRIPT = $(PROJECT).ld
 ### Build rules
 .PHONY: clean all
 all: $(PROJECT).hex $(PROJECT).bin
 clean:
    @echo CLEAN
    @rm -f *.o *.elf *.map *.lst *.bin *.hex
 $(PROJECT).elf: $(OBJS)
    @echo $@
    $(LD) -T$(LD_SCRIPT) -Map=$(PROJECT).map -cref -o $@ $(OBJS)
    $(SIZE) $@
    $(OBJDUMP) -hS $@ > $(PROJECT).lst
 %.elf: %.o
    @echo $@
    $(LD) -T$(LD_SCRIPT) -Map=$*.map -cref -o $@ $<
    $(SIZE) $@
    $(OBJDUMP) -hS $@ > $*.lst
 %.bin: %.elf
    @echo $@
    $(OBJCOPY) -O binary $< $@
 %.hex: %.elf
    @echo $@
    $(OBJCOPY) -O ihex $< $@
 </pre>
 A successful build will generate the files <b>f030f4.hex</b>, <b>f030f4.bin</b>,
 <b>f030f4.map</b>, <b>f030f4.lst</b>.
 <h2>Build and Test</h2>
 Even if <b>stdlib.h</b> is included in <b>success.c</b>, no C
 library is needed to complete the build as only the constant
 <code>EXIT_SUCCESS</code> from that header is used. Furthermore, the default
 location of the header files is derived by the compiler from the location of
 gcc.
 <pre>
 $ make
 f030f4.elf
   text    data     bss     dec     hex filename
    216       0       0     216      d8 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 Building shows an increase in code, still no data.
 <p>
 Once <b>f030f4.hex</b> is loaded into the board, the behavior is the same
 as <b>ledtick.hex</b>. The new file structure and data initialization
 didn’t introduce any <del>bugs</del> changes, just code overhead.
 <h2>Checkpoint</h2>
 This step was mainly to achieve a better structure for future evolution.
 <p>
 <a href="19_publish.html">Next</a>, I will make the code available in a
 public git repository.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/19_publish.html
+++ b/docs/19_publish.html
@ -0,0 +1,27 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>1.9 Publish</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>1.9 Publish</h1>
 As I have reached a stable point in the previous step, I create a git
 repository <b>stm32bringup</b>, publish it on
 <a href="https://github.com/rfivet/stm32bringup">github.com</a> with a mirror
 on <a href="https://git.sdf.org/rfivet/stm32bringup">git.sdf.org</a>.
 <p>
 I chose the MIT license as the open source license for the code.
 <p>
 The only rework of the C sources is the addition of a copyright notice
 and conversion of tabs to spaces for portability.
 <p>
 <a href="index.html#part2">Next</a>, I will start working with USART
 peripherals: select a USB adapter, flash the board using the UART
 communication and say hello to the world.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/21_uart.html
+++ b/docs/21_uart.html
@ -0,0 +1,89 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>2.1 UART Validation</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>2.1 UART Validation</h1>
 Before I start writing some code that communicates over the Universal
 Synchronous Asynchronous Receiver Transmitter (<b>USART</b>) peripheral, I need
 to validate that I have working hardware and software tools.
 <h2>Board Connectivity</h2>
 Even if the peripheral is capable of doing synchronous communication
 (that’s the S in USART), asynchronous communication (that’s the A), which
 only needs 3 wires (GND, TX and RX, no clock), is usually what is needed
 in non specialized cases.
 <p>
 Boards sold online often have dedicated pre-soldered pins for UART
 connectivity similar to what I have seen before for the SWD interface.
 The VCC-GND board I used previously doesn’t have such dedicated pins but the
 functionality is wired on the pins PA9 (<b>TX</b>) and PA10 (<b>RX</b>).
 <p>
 I will use a board with dedicated pins (GND, TX, RX, VCC 3.3V). Board
 specifications can be found
 <a href="https://stm32-base.org/boards/STM32F030F4P6-STM32F030F4-V2.00">
 here</a>.
 <p>
 <img alt="STM32F030F4-V2.00" src="img/21_boardv200.png">
 <h2>USB to UART adapter</h2>
 An adapter is needed to connect to a PC, either due to a difference in
 voltage (<b>RS232</b>) or in serial protocol (<b>USB</b>). Pins PA9 and PA10
 are 5V tolerant, so you could interface an Arduino Uno to an STM32 board to use
 it as a USB to serial adapter if you happen to have a spare Arduino Uno.
 <p>
 I use an adapter based on the <b>Silicon Labs CP2102</b> chipset.
 Windows has a USB driver available for the Silicon Labs CP210x chipset family.
 The adapter enumerates as <b>COM4</b> on my Windows PC.
 <p>
 I connect the adapter to the board to provide 3.3V and make sure to cross
 RX and TX wires (STM32 RX &lt;-> Adapter TX, STM32 TX &lt;-> Adapter RX).
 <h2>STM32 Cube Programmer UART connection</h2>
 So far I have been using the ST-Link interface with STM32 Cube
 Programmer to flash and debug. The application also supports the UART
 interface.
 <h2>Embedded Boot Loader</h2>
 A reset of the board while jumper <b>BOOT0</b> is removed will select the
 System memory instead of the flash memory for execution. This is where
 the serial flash loader protocol is implemented on the chipset side.
 <p>
 <img alt="BOOT0 Jumper Selection" src="img/21_boot0.png">
 <h2>Testing</h2>
 The checklist goes like this:
 <ul>
 <li> Board connected to USB adapter
 <li> USB driver installed on Windows PC
 <li> USB adapter plugged in and enumerates as a COM port
 <li> STM32 Cube Programmer lists the COM port in device selection menu
 <li> BOOT0 jumper removed and board reset to start the embedded flash
 loader.
 <li> Board flash memory can be erased, written or read with the programmer.
 </ul>
 <h2>Checkpoint</h2>
 I now have working hardware and software that communicate through the
 serial link.
 <p>
 <a href="22_board.html">Next</a>, I will make sure the code I wrote so far is
 working on the new board.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/22_board.html
+++ b/docs/22_board.html
@ -0,0 +1,146 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>2.2 Another Day, Another Board</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>2.2 Another Day, Another Board</h1>
 As I switched to a new board, I need to check that the code I wrote so
 far is still working as expected. Unfortunately the user LED on the new
 board is wired differently than the first one.
 <h2>Board Schematics</h2>
 <img src="img/22_ledv200.png">
 <p>
 The new board uses <b>PA4</b> to turn the LED ON when <b>high</b>. Also, by
 taking off a jumper, <b>PA4</b> can be used for other purposes.
 <p>
 The first board uses <b>PB1</b> to turn the LED ON when <b>low</b> as I have
 seen previously <a href="14_ledon.html">here</a>.
 <p>
 I can adapt the code by adding the base address of <b>GPIOA</b> and
 shifting pin location accordingly, but this type of variation is so
 common that I want to make sure adaptations can be done easily without
 errors.
 <h2>Board Description</h2>
 I want to be able to capture the board variations through simple definitions:
 <p>
 For the new board:
 <pre>
 #define LED_IOP A
 #define LED_PIN 4
 #define LED_ON  1
 </pre>
 And for the vcc-gnd board:
 <pre>
 #define LED_IOP B
 #define LED_PIN 1
 #define LED_ON  0
 </pre>
 <h2>Implementation</h2>
 I make a copy of <b>init.c</b> into <b>board.c</b> and rework the preprocessor
 macros.
 <p>
 As I have several GPIO peripherals, GPIOA .. GPIOF, I switch notation:
 instead of writing, say, <code>GPIOA_MODER</code>, I will write either
 <code>GPIOA[ MODER]</code> or <code>GPIO( A)[ MODER]</code>. This way I can
 refer directly to <code>GPIO( LED_IOP)[ MODER]</code>.
 <p>
 I use conditional compilation based on <code>LED_ON</code>. If
 <code>LED_ON</code> is high, I need an extra step during initialization
 compared to <code>LED_ON</code> low. On the other hand, if <code>LED_ON</code>
 is undefined, the code would be removed for a board that doesn’t have a user
 LED.
 <pre>
 #define SYSTICK             ((volatile long *) 0xE000E010)
 #define SYSTICK_CSR         SYSTICK[ 0]
 #define SYSTICK_RVR         SYSTICK[ 1]
 #define SYSTICK_CVR         SYSTICK[ 2]
 #define CAT( a, b) a##b
 #define HEXA( a) CAT( 0x, a)
 #define RCC                 ((volatile long *) 0x40021000)
 #define RCC_AHBENR          RCC[ 5]
 #define RCC_AHBENR_IOP( h)  (1 << (17 + HEXA( h) - 0xA))
 #define GPIOA               ((volatile long *) 0x48000000)
 #define GPIOB               ((volatile long *) 0x48000400)
 #define GPIO( x) CAT( GPIO, x)
 #define MODER   0
 #define ODR     5
 /* user LED ON when PA4 is high */
 #define LED_IOP A
 #define LED_PIN 4
 #define LED_ON  1
 void SysTick_Handler( void) {
 #ifdef LED_ON
    GPIO( LED_IOP)[ ODR] ^= 1 << LED_PIN ;   /* Toggle User LED */
 #endif
 }
 int init( void) {
 /* By default SYSCLK == HSI [8MHZ] */
 /* SYSTICK */
    SYSTICK_RVR = 1000000 - 1 ;     /* AHB / 8 */
    SYSTICK_CVR = 0 ;
    SYSTICK_CSR = 3 ;               /* AHB / 8, Interrupt ON, Enable */
    /* SysTick_Handler will execute every 1s from now on */
 #ifdef LED_ON
 /* User LED ON */
    RCC_AHBENR |= RCC_AHBENR_IOP( LED_IOP) ;        /* Enable IOPx periph */
    GPIO( LED_IOP)[ MODER] |= 1 << (LED_PIN * 2) ;  /* LED_IO Output [01],
                                                    ** over default 00 */
    /* OTYPER Push-Pull by default */
    /* Pxn output default LOW at reset */
 # if LED_ON
    SysTick_Handler() ;
 # endif
 #endif
    return 0 ;
 }
 </pre>
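 <p>
 The macro expansions can be verified on a host by replacing the peripheral
 addresses with ordinary arrays. This is a sketch under that assumption: the
 <code>fake_gpioa</code> and <code>fake_gpiob</code> arrays and the
 <code>led_toggle()</code> function are hypothetical, the macros are the ones
 from <b>board.c</b>.

```c
#define CAT( a, b) a##b
#define HEXA( a) CAT( 0x, a)

#define RCC_AHBENR_IOP( h)  (1 << (17 + HEXA( h) - 0xA))

/* Arrays standing in for the memory mapped GPIO registers */
volatile long fake_gpioa[ 16] ;
volatile long fake_gpiob[ 16] ;
#define GPIOA   fake_gpioa
#define GPIOB   fake_gpiob
#define GPIO( x) CAT( GPIO, x)
#define MODER   0
#define ODR     5

/* user LED ON when PA4 is high */
#define LED_IOP A
#define LED_PIN 4

/* Expands to fake_gpioa[ 5] ^= 1 << 4 with the definitions above */
void led_toggle( void) {
    GPIO( LED_IOP)[ ODR] ^= 1 << LED_PIN ;
}
```

 <code>RCC_AHBENR_IOP( LED_IOP)</code> expands to bit 17 for port A, and
 <code>RCC_AHBENR_IOP( B)</code> to 0x00040000, the constant used for port B in
 the previous steps.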
 <h2>Build and Test</h2>
 I just need to add the <code>SRCS</code> definition in <b>Makefile</b>.
 <pre>SRCS = startup.c board.c success.c</pre>
 and build.
 <pre>
 $ make
 f030f4.elf
   text    data     bss     dec     hex filename
    224       0       0     224      e0 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 Once I have flashed the board with this new binary, I put back the BOOT0
 jumper and press reset. This board's user LED is red. &#x1F60E;
 <h2>Checkpoint</h2>
 I made sure the code I have developed so far works on the board with the
 serial connection.
 <p>
 <a href="23_hello.html">Next</a>, I will do some serial transmission.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/23_hello.html
+++ b/docs/23_hello.html
@ -0,0 +1,160 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>2.3 Hello There!</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>2.3 Hello There!</h1>
 Looking for a <i>“you had me at hello”</i> moment?&nbsp; Let’s see how serial
 transmission works for you.
 <h2>Implementation</h2>
 I make a copy of <b>board.c</b> into <b>usart1tx.c</b> to add support for the
 USART1 peripheral.
 <p>
 In order to make a first transmission, the peripherals have to be
 initialized. As the TX and RX signals of USART1 are mapped to pins PA9 and
 PA10, I need to configure GPIOA first.
 <ul>
 <li> GPIOA needs to be enabled via the RCC AHB Enable Register as GPIOs are on
 the AHB bus.
 <li> PA9 and PA10 set to alternate mode.
 <li> Alternate function USART1 selected for PA9 and PA10.
 </ul>
 Then USART1 can be configured:
 <ul>
 <li> USART1 enabled via the RCC APB2 Enable Register as USARTs are on the APB
 bus.
 <li> Baud rate set to 9600 baud.
 <li> The USART itself and its transmitter need to be enabled via the Control
 Register (CR1).
 </ul>
 By default the transmission format is 8N1: 8 bit data, no parity and 1
 stop bit.
 <pre>
 /* USART1 9600 8N1 */
    RCC_AHBENR |= RCC_AHBENR_IOP( A) ;  /* Enable GPIOA periph */
    GPIOA[ MODER] |= 0x0A << (9 * 2) ;  /* PA9-10 ALT 10, over default 00 */
    GPIOA[ AFRH] |= 0x110 ;             /* PA9-10 AF1 0001, over default 0000 */
    RCC_APB2ENR |= RCC_APB2ENR_USART1EN ;
    USART1[ BRR] = 8000000 / 9600 ;     /* PCLK [8MHz] */
    USART1[ CR1] |= USART_CR1_UE | USART_CR1_TE ;   /* Enable USART & Tx */
 </pre>
 Sending data is done by writing to the Transmit Data Register (<b>TDR</b>).
 To check if it is ready for transmission you must check the state of the
 TX Empty (<b>TXE</b>) bit in the Interrupt &amp; Status Register (<b>ISR</b>).
 <p>
 I write a basic <code>kputc()</code> function that busy waits while the
 <b>TDR</b> is not empty and ensures that LF is mapped to CR LF. The ‘k’ in
 kputc refers to ‘kernel’, as kputc is a low level function that will be used
 mostly for debugging. With the busy wait and the recursive code this
 implementation is definitely not optimal, but it’s functional and
 that’s what matters most at this stage.
 <pre>
 void kputc( unsigned char c) {
    static unsigned char lastc ;
    if( c == '\n' && lastc != '\r')
        kputc( '\r') ;
 /* Active wait while transmit register is full */
    while( (USART1[ ISR] & USART_ISR_TXE) == 0) ;
    USART1[ TDR] = c ;
    lastc = c ;
 }
 </pre>
 The high level C function I need for this simple test is <code>puts()</code>.
 I make my own implementation but I keep the same declaration as the standard
 header that comes with the C compiler.
 <pre>
 int puts( const char *s) {
    while( *s)
        kputc( *s++) ;
    kputc( '\n') ;
    return 0 ;
 }
 </pre>
 Finally I use a standard C implementation for <b>hello.c</b>.
 <pre>
 /* hello.c -- hello there */
 #include &lt;stdio.h>
 #include &lt;stdlib.h>
 int main( void) {
    puts( "hello, world") ;
    return EXIT_SUCCESS ;
 }
 </pre>
 <h2>Build</h2>
 To build I update the software composition in <b>Makefile</b> by adding a new
 <code>SRCS</code> line.
 <pre>SRCS = startup.c usart1tx.c hello.c</pre>
 Calling make, I can see that there is now a variable in the <b>BSS</b> section
 of the RAM. It is <code>lastc</code>, local to <code>kputc()</code>. Because
 of word alignment <code>BSS</code> occupies 4 bytes.
 <pre>
 $ make
 f030f4.elf
   text    data     bss     dec     hex filename
    413       0       4     417     1a1 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 <h2>Testing</h2>
 After flashing the board with the new executable, I put back the
 <b>BOOT0</b> jumper and press the reset button. The board user LED blinks
 as usual, but I can also see the RX LED on the USB to UART adapter flash
 briefly when I release the reset button.
 <p>
 On a Windows PC, if I use PuTTY or Arduino IDE to open <b>COM4</b> at 9600
 baud, every time I press and release the reset button I can see ‘hello,
 world’ displayed on a new line in the terminal window.
 <p>
 On Linux, when I plug in the USB to UART adapter, it enumerates as
 <b>/dev/ttyUSB0</b>, so it is compatible with the USB driver for serial
 ports. If I try to open it with Arduino IDE, I get an error message as I
 need to belong to the <b>dialout</b> group to open that TTY for reading and
 writing.
 <pre>sudo usermod -a -G dialout $USER</pre>
 Once added to <b>dialout</b>, I can open <b>/dev/ttyUSB0</b> at 9600 baud in
 Arduino IDE. Each time I press and release the board RESET button, I can see
 ‘hello, world’ displayed on a new line in the Serial Monitor window.
 <h2>Checkpoint</h2>
 I now have a functional serial transmission channel through <b>USART1</b>. I
 have only a first implementation for <code>puts()</code>, but I will add
 support for other stdio functions when needed.
 <p>
 <a href="24_stm32flash.html">Next</a>, I will switch to an open source tool
 for flashing over serial connection that works on both Windows and Linux.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/24_stm32flash.html
+++ b/docs/24_stm32flash.html
@ -0,0 +1,141 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>2.4 stm32flash</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>2.4 stm32flash</h1>
 So far I have been flashing boards via UART or SWD interface using STM32
 Cube Programmer. An open source alternative to flash via UART is
 <b>stm32flash</b>.
 <h2>Linux Build and Install</h2>
 The <b>stm32flash</b> project is hosted on
 <a href="https://sourceforge.net/projects/stm32flash/">SourceForge</a>
 and the git repository is mirrored on
 <a href="https://gitlab.com/stm32flash/stm32flash">GitLab</a>.
 <p>
 I clone the repository from <b>SourceForge</b> into my <b>Projects</b> folder.
 <pre>
 $ cd ~/Projects
 $ git clone https://git.code.sf.net/p/stm32flash/code stm32flash-code
 Cloning into 'stm32flash-code'...
 remote: Enumerating objects: 1357, done.
 remote: Counting objects: 100% (1357/1357), done.
 remote: Compressing objects: 100% (682/682), done.
 remote: Total 1357 (delta 912), reused 996 (delta 671)
 Receiving objects: 100% (1357/1357), 1.04 MiB | 74.00 KiB/s, done.
 Resolving deltas: 100% (912/912), done.
 </pre>
 The build on Linux doesn’t show any warnings.
 <pre>
 $ cd stm32flash-code
 $ make
 cc -Wall -g   -c -o dev_table.o dev_table.c
 cc -Wall -g   -c -o i2c.o i2c.c
 cc -Wall -g   -c -o init.o init.c
 cc -Wall -g   -c -o main.o main.c
 cc -Wall -g   -c -o port.o port.c
 cc -Wall -g   -c -o serial_common.o serial_common.c
 cc -Wall -g   -c -o serial_platform.o serial_platform.c
 cc -Wall -g   -c -o stm32.o stm32.c
 cc -Wall -g   -c -o utils.o utils.c
 cd parsers && make parsers.a
 make[1]: Entering directory '~/Projects/stm32flash-code/parsers'
 cc -Wall -g   -c -o binary.o binary.c
 cc -Wall -g   -c -o hex.o hex.c
 ar rc parsers.a binary.o hex.o
 make[1]: Leaving directory '~/Projects/stm32flash-code/parsers'
 cc  -o stm32flash dev_table.o i2c.o init.o main.o port.o serial_common.o serial_
 platform.o stm32.o utils.o parsers/parsers.a
 </pre>
 I test the newly compiled command first by calling it without arguments
 (<code>./stm32flash</code>), then with the serial port where the USB to UART
 adapter is plugged in.
 <p>
 <code>./stm32flash</code> prints detailed help for the command.
 <p>
 Calling it with the serial port argument where the board is plugged in
 and set in bootloader mode gives a description of the chipset detected.
 <pre>
 $ ./stm32flash /dev/ttyUSB0
 stm32flash 0.7
 http://stm32flash.sourceforge.net/
 Interface serial_posix: 57600 8E1
 Version      : 0x31
 Option 1     : 0x00
 Option 2     : 0x00
 Device ID    : 0x0444 (STM32F03xx4/6)
 - RAM        : Up to 4KiB  (2048b reserved by bootloader)
 - Flash      : Up to 32KiB (size first sector: 4x1024)
 - Option RAM : 16b
 - System RAM : 3KiB
 </pre>
 I install the command by moving the executable to my local bin directory.
 <pre>$ mv stm32flash ~/bin</pre>
 If everything goes well, I will later <code>strip</code> and compress (with
 <code>upx</code>) the executable.
 <h2>Regression Testing</h2>
 As my board has already been flashed using STM32 Cube Programmer, I can
 perform a simple regression test.
 <ul>
 <li> Read the content of the chipset memory as previously flashed.
 <li> Flash the same executable using the Linux version of stm32flash.
 <li> Read back the newly programmed chipset memory.
 <li> Compare the two read-outs.
 </ul>
 Reading 1 KB with stm32flash.
 <pre>$ stm32flash -r read.bin -S 0x08000000:1024 /dev/ttyUSB0</pre>
 Writing the executable in hex format.
 <pre>$ stm32flash -w f030f4.hex /dev/ttyUSB0</pre>
 Comparing the two memory read-outs using <code>od</code> shows no difference.
 <h2>Build and Install on Windows</h2>
 There is a Windows binary that can be downloaded from the <b>stm32flash</b>
 project page on <b>SourceForge</b>, but I cloned and built it myself using
 both <b>Cygwin</b> and <b>MSYS2 64bit</b> environments on Windows.
 <p>
 The build phase gave more warnings than on Linux, mostly due to stricter
 warnings in the GCC compiler versions of those environments.
 <p>
 Usage of <b>stm32flash</b> only differs in the name of the serial device, in my
 case <b>COM4</b> instead of <b>/dev/ttyUSB0</b>.
 <h2>Checkpoint</h2>
 There are several other Windows applications available on ST.com for
 flashing STM32 chipsets: STM32 ST-Link Utility, STM32 Flash Loader
 Demonstrator, ST Visual Programmer STM32. They have been marked as <b>NRND</b>
 (Not Recommended for New Design), which means they won’t support the latest
 chipsets as they are replaced by STM32 Cube Programmer.
 <p>
 <a href="25_prototype.html">Next</a>, I will write an application which makes
 better use of transmission than <b>hello</b>.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/25_prototype.html
+++ b/docs/25_prototype.html
@ -0,0 +1,230 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>2.5 uptime prototype</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>2.5 uptime prototype</h1>
 With the basic functionality available so far, I can write something in
 the vein of the Unix <code>uptime</code> command.
 <pre>
 $ man -k uptime
 uptime (1)           - Tell how long the system has been running.
 </pre>
 I am going to make a quick prototype first to validate the concept.
 <h2>Implementation</h2>
 I already have a one second based System Tick interrupt routine, so I
 just need to make sure it updates a count of seconds. I make a copy of
 <b>usart1tx.c</b> as <b>uplow.1.c</b> to make the changes. I use a number
 suffix for the filename when I anticipate making several revisions.
 <pre>
 volatile unsigned uptime ;  /* seconds elapsed since boot */
 #ifdef LED_ON
 static void userLEDtoggle( void) {
    GPIO( LED_IOP)[ ODR] ^= 1 << LED_PIN ;   /* Toggle User LED */
 }
 #endif
 void SysTick_Handler( void) {
    uptime += 1 ;
 #ifdef LED_ON
    userLEDtoggle() ;
 #endif
 }
 </pre>
 The global variable <code>uptime</code> is marked <code>volatile</code>. The
 compiler needs this information so that it reads the variable from memory on
 every access instead of optimizing the reads away, as the value changes
 asynchronously whenever the interrupt is triggered.
 <p>
 I move the user LED toggling code to a dedicated local function
 <code>userLEDtoggle()</code> as this is not the only task of
 <code>SysTick_Handler()</code> anymore and a call to toggle the LED is needed
 during initialization. I adjust the initialization code accordingly.
 <p>
 I write a first <b>uptime.1.c</b> to print the count of seconds every time
 the <code>uptime</code> counter value changes.
 <pre>
 /* uptime.1.c -- tells how long the system has been running */
 #include &lt;stdio.h>
 extern volatile unsigned uptime ;
 extern void kputc( unsigned char c) ;
 void kputu( unsigned u) {
    unsigned r = u % 10 ;
    u /= 10 ;
    if( u)
        kputu( u) ;
    kputc( '0' + r) ;
 }
 int main( void) {
    static unsigned last ;
    for( ;;)
        if( last != uptime) {
            last = uptime ;
            kputu( last) ;
            puts( " sec") ;
        } else
            __asm( "WFI") ; /* Wait for System Tick Interrupt */
 }
 </pre>
 As before for <code>kputc()</code>, the implementation of <code>kputu()</code>
 to print an unsigned integer in decimal format is not optimal but still
 functional.
 <h2>Build</h2>
 I update <b>Makefile</b> with the composition.
 <pre>SRCS = startup.c uplow.1.c uptime.1.c</pre>
 Unfortunately, when I try to build an executable, the link phase fails.
 <pre>
 $ make
 f030f4.elf
 D:\Program Files (x86)\GNU Arm Embedded Toolchain\arm-gnu-toolchain-13.3.rel1-mi
 ngw-w64-i686-arm-none-eabi\bin\arm-none-eabi-ld.exe: uptime.1.o: in function `kp
 utu':
 D:\Projects\stm32bringup\docs/uptime.1.c:13:(.text+0x6): undefined reference to 
 `__aeabi_uidivmod'
 D:\Program Files (x86)\GNU Arm Embedded Toolchain\arm-gnu-toolchain-13.3.rel1-mi
 ngw-w64-i686-arm-none-eabi\bin\arm-none-eabi-ld.exe: D:\Projects\stm32bringup\do
 cs/uptime.1.c:14:(.text+0x14): undefined reference to `__aeabi_uidiv'
 make: *** [Makefile:45: f030f4.elf] Error 1
 </pre>
 The compiler has generated code that references two functions
 <code>__aeabi_uidivmod</code> and <code>__aeabi_uidiv</code> when compiling
 the lines 13 and 14 of <b>uptime.1.c</b>.
 <pre>
    unsigned r = u % 10 ;
    u /= 10 ;
 </pre>
 This happens because the compiler generates code for Cortex-M0, which has
 no hardware support for integer division, so division has to be implemented
 in software.
 I need to pass the linker a reference to the GNU Arm Embedded Toolchain
 library for Cortex-M0. The library file is <b>libgcc.a</b>. The options -l and
 -L of the linker tell what the library name is (-lgcc => libgcc.a) and
 where to look for it.
 <pre>
 LIBDIR  = $(GCCDIR)/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp
 LIB_PATHS = -L$(LIBDIR)
 LIBS = -lgcc
 $(PROJECT).elf: $(OBJS)
    @echo $@
    $(LD) -T$(LD_SCRIPT) $(LIB_PATHS) -Map=$(PROJECT).map -cref -o $@ $^ $(LIBS)
    $(SIZE) $@
    $(OBJDUMP) -hS $@ > $(PROJECT).lst
 </pre>
 Once the Makefile has been updated, the build finishes successfully.
 <pre>
 $ make
 f030f4.elf
   text    data     bss     dec     hex filename
    769       0       8     777     309 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 Checking the map file produced by the linker, <b>f030f4.map</b>, I can see
 which library (<b>libgcc.a</b>) but also which modules in the library
 (<b>_udivsi3.o</b> and <b>_dvmd_tls.o</b>) have been used to resolve the
 symbols (<code>__aeabi_uidiv</code> and <code>__aeabi_idiv0</code>).
 <pre>
 Archive member included to satisfy reference by file (symbol)
 D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mi
 ngw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a
 (_udivsi3.o)
                              uptime.1.o (__aeabi_uidiv)
 D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mi
 ngw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a
 (_dvmd_tls.o)
                              D:/Program Files (x86)/GNU Arm Embedded Toolchain/
 arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/1
 3.3.1/thumb/v6-m/nofp\libgcc.a(_udivsi3.o) (__aeabi_idiv0)
 </pre>
 <h2>Testing</h2>
 I flash the board and start execution. The output works as expected: the
 first line “<b>1 sec</b>” appears one second after reset, with a new line
 following every second after that.
 <p>
 <img alt="uptime v1 output" src="img/25_uptimev1.png">
 <h2>Library management</h2>
 With the Cortex-M0 version of <b>libgcc.a</b> available, I have some extra
 flexibility in handling usage of the library.
 <ol>
 <li> Work with a local copy of the <b>gcc</b> library.
 <ul>
 <li> copy libgcc.a locally
 <li> LIB_PATHS = -L.
 <li> LIBS = -lgcc
 </ul>
 <li> Work with a local copy of the modules extracted from the <b>gcc</b>
 library.
 <ul>
 <li> ar x libgcc.a _udivsi3.o _dvmd_tls.o
 <li> LIB_PATHS = -L.
 <li> LIBS = _udivsi3.o _dvmd_tls.o
 </ul>
 <li> Work with my own library made from the needed modules extracted from
 the <b>gcc</b> library.
 <ul>
 <li> ar x libgcc.a _udivsi3.o _dvmd_tls.o
 <li> ar qc libstm32.a _udivsi3.o _dvmd_tls.o
 <li> LIB_PATHS = -L.
 <li> LIBS = -lstm32
 </ul>
 </ol>
 The <code>ar</code> command distributed by the GNU Arm embedded toolchain is
 the same <b>GNU ar</b> as the Linux or Cygwin and MSYS2 distributions on
 Windows. So I use my native environment implementation for convenience.
 This is true for the utility commands (<code>ar</code>, <code>objcopy</code>,
 <code>objdump</code> and <code>size</code>) but not for <code>gcc</code> and
 <code>ld</code>.
 <h2>Checkpoint</h2>
 I have hacked a quick prototype of <code>uptime</code> and found an extra
 dependency on the GNU Arm Embedded Toolchain: some modules included in
 <b>libgcc.a</b> have to be included at link time as the chipset I am using has
 no support for integer division. At this stage I will reuse the library as it
 is, but I know where to look in the map file generated by the linker to find
 which modules are included. If I ever need better control of the link phase,
 I can use <code>ar</code> to extract those modules locally from the library.
 <p>
 <a href="26_uptime.html">Next</a>, I will write <code>uptime</code> with a
 better structure.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/26_uptime.html
+++ b/docs/26_uptime.html
@ -0,0 +1,339 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>2.6 uptime</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>2.6 uptime</h1>
 It’s time to throw away the prototype and write a production version of
 <code>uptime</code>. There are several things I want to straighten up in
 <b>uptime.1.c</b>:
 <ul>
 <li> <code>main()</code> is using assembly code to wait for an interrupt.
 Definitely not high level C.
 <li> The lines <code>kputu( last) ; puts( " sec") ;</code> are not high level
 C either. I should have <code>printf( "%u sec\n", last) ;</code> instead.
 <li> <code>kputc()</code> function prototype and the external variable
 declaration for <code>uptime</code> should be included as a C header file.
 </ul>
 Similar to what I did when I split the functionalities between files
 according to the three stages of execution (boot, init, main), I will
 reorganize the code according to three categories (system, library,
 application).
 <h2>System</h2>
 First I make clear what the system interface is by writing the header
 <b>system.h</b>. The external declarations of global variables and the
 function prototypes belong here.
 <pre>
 /* system.h -- system services */
 extern volatile unsigned uptime ;   /* seconds elapsed since boot */
 int init( void) ;           /* System initialization, called once at startup */
 void kputc( unsigned char c) ;      /* character output */
 int  kputs( const char s[]) ;       /* string output */
 void yield( void) ;                 /* give way */
 </pre>
 Next, I make a revision of <b>uplow.1.c</b> by making a copy into
 <b>uplow.2.c</b>.
 <p>
 I include <b>system.h</b> which is the interface that <b>uplow.2.c</b>
 implements. I will have several implementations of the same interface,
 so <b>system.h</b> is not just the interface published by <b>uplow.2.c</b>:
 rather, <b>uplow.2.c</b> is one implementation of <b>system.h</b>.
 <pre>
 #include "system.h" /* implements system.h */
 </pre>
 I extract the code for <b>puts()</b> as it is a library function that doesn’t
 really belong to the system.
 <p>
 I add the implementation of <b>kputs()</b> and <b>yield()</b>.
 <pre>
 int kputs( const char s[]) {    /* string output */
    int cnt = 0 ;
    int c ;
    while( (c = *s++) != 0) {
        kputc( c) ;
        cnt += 1 ;
    }
    return cnt ;
 }
 void yield( void) {             /* give way */
    __asm( "WFI") ; /* Wait for System Tick Interrupt */
 }
 </pre>
 <h2>Library</h2>
 I create the implementation of <code>printf()</code> in <b>printf.c</b>.
 <ul>
 <li> It uses the system interface <b>system.h</b>.
 <li> I have eliminated the recursion from my previous <code>kputu()</code>
 version by building the string backward, adding each digit in front of the
 previous ones.
 <li> <code>kputu()</code> takes one additional divider parameter, so it can be
 used to print unsigned integers in various formats like octal, decimal and
 hexadecimal. The current implementation works for bases 8 to 16. It won’t
 work for binary, which needs up to 32 digits where the buffer holds 11, or
 for base 36, which needs a longer digit table.
 <li> <code>kputi()</code> outputs a signed integer.
 <li> <code>printf()</code> implements a subset of the format interpreter: %%,
 %c, %d, %i, %o, %s, %u, %x, %X.
 </ul>
 <pre>
 /* printf.c -- format and print data */
 #include &lt;stdarg.h>
 #include &lt;stdio.h>
 #include "system.h" /* kputc(), kputs() */
 static int kputu( unsigned u, unsigned d) {
    char s[ 12] ;                   /* room for 11 octal digits + EOS */
    char *p = &s[ sizeof s - 1] ;   /* point to last byte */
    *p = 0 ;                        /* null terminated string */
    do {
        unsigned r = u % d ;
        u /= d ;
        *--p = "0123456789ABCDEF"[ r] ;
    } while( u) ;
    return kputs( p) ;
 }
 static int kputi( int i) {
    int flag = i < 0 ;
    unsigned u = i ;    /* negate as unsigned: -i would overflow for INT_MIN */
    if( flag) {
        u = -u ;
        kputc( '-') ;
    }
    return flag + kputu( u, 10) ;
 }
 int printf( const char *fmt, ...) {
    va_list ap ;
    int cnt = 0 ;
    int c ; /* current char in format string */
    va_start( ap, fmt) ;
    while( ( c = *fmt++) != 0)
        if( c != '%') {
            cnt += 1 ; kputc( c) ;
        } else if( ( c = *fmt++) == 0) {
            cnt += 1 ; kputc( '%') ;
            break ;
        } else
            switch( c) {
            case 'c':
                cnt += 1 ; kputc( va_arg( ap, int /* char */)) ;
                break ;
            case 'o':
                cnt += kputu( va_arg( ap, unsigned), 8) ;
                break ;
            case 'u':
                cnt += kputu( va_arg( ap, unsigned), 10) ;
                break ;
            case 'x':
            case 'X':
                cnt += kputu( va_arg( ap, unsigned), 16) ;
                break ;
            case 'i':
            case 'd':
                cnt += kputi( va_arg( ap, int)) ;
                break ;
            case 's':
                cnt += kputs( va_arg( ap, char *)) ;
                break ;
            default:
                cnt += 1 ; kputc( '%') ;
                /* fallthrough */
            case '%':
                cnt += 1 ; kputc( c) ;
            }
    va_end( ap) ;
    return cnt ;
 }
 </pre>
 <h2>Application</h2>
 I write my final version of uptime in <b>uptime.c</b>.
 <ul>
 <li> It uses the system interface and standard library.
 <li> Instead of a count of seconds elapsed it displays a breakdown in weeks,
 days, hours, minutes and seconds.
 </ul>
 <pre>
 /* uptime.c -- tells how long the system has been running */
 #include &lt;stdio.h>
 #include "system.h" /* uptime, yield() */
 static void display( unsigned u, const char *s) {
    if( u)
        printf( " %u %s%s", u, s, &"s"[ u <= 1]) ;
 }
 int main( void) {
    static unsigned last ;
    for( ;;)
        if( last != uptime) {
            unsigned w, d, h, m ,s ;
            last = uptime ;
            d = h = m = 0 ;
            s = last % 60 ;
            w = last / 60 ;
            if( w) {
                m = w % 60 ;
                w /= 60 ;
                if( w) {
                    h = w % 24 ;
                    w /= 24 ;
                    if( w) {
                        d = w % 7 ;
                        w /= 7 ;
                    }
                }
            }
            printf( "up") ;
            display( w, "week") ;
            display( d, "day") ;
            display( h, "hour") ;
            display( m, "minute") ;
            display( s, "second") ;
            printf( "\n") ;
        } else
            yield() ;   /* Wait for System Tick Interrupt */
 }
 </pre>
 <h2>Build</h2>
 To build I add the composition in <b>Makefile</b>.
 <pre>SRCS = startup.c uplow.2.c uptime.c printf.c</pre>
 Unfortunately, the build fails at the link phase.
 <pre>
 $ make
 f030f4.elf
 D:\Program Files (x86)\GNU Arm Embedded Toolchain\arm-gnu-toolchain-13.3.rel1-mi
 ngw-w64-i686-arm-none-eabi\bin\arm-none-eabi-ld.exe: uptime.o: in function `main
 ':
 D:\Projects\stm32bringup\docs/uptime.c:41:(.text.startup+0xa4): 
 undefined reference to `putchar'
 make: *** [Makefile:49: f030f4.elf] Error 1
 </pre>
 The linker found a reference to <code>putchar()</code> at line 41 of
 <b>uptime.c</b>.
 <pre>
            printf( "\n") ;
 </pre>
 I haven’t used <code>putchar()</code> in my code, but line 41 is a
 <code>printf( "\n")</code>, which gcc can replace with the equivalent
 <code>putchar( '\n')</code>. This is gcc optimizing a call to a standard
 library function that it recognizes as a built-in.
 <p>
 I add the code for <code>putchar()</code> in <b>putchar.c</b> as it is a
 standard library function.
 <pre>
 /* putchar.c -- write a character to stdout */
 #include &lt;stdio.h>
 #include "system.h" /* kputc() */
 int putchar( int c) {
    kputc( c) ;
    return c ;
 }
 </pre>
 I update <b>Makefile</b> by adding <code>putchar.c</code> to the composition.
 <pre>SRCS = startup.c uplow.2.c uptime.c printf.c putchar.c</pre>
 The build now completes successfully.
 <pre>
 $ make
 f030f4.elf
   text    data     bss     dec     hex filename
   1797       0      12    1809     711 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 By checking the map file provided by the linker, I can see that the
 number of low level modules referenced by the code generated by the
 compiler has increased. Both signed and unsigned integer division, but also
 some code to handle the <code>switch()</code> statement, are now referenced.
 <pre>
 Archive member included to satisfy reference by file (symbol)
 D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a(_thumb1_case_sqi.o)
                              printf.o (__gnu_thumb1_case_sqi)
 D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a(_udivsi3.o)
                              uptime.o (__aeabi_uidiv)
 D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a(_divsi3.o)
                              uptime.o (__aeabi_idiv)
 D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a(_dvmd_tls.o)
                              D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a(_udivsi3.o) (__aeabi_idiv0)
 </pre>
 <h2>Test</h2>
 I flash the board and start execution. The output works as expected.
 <p>
 <img alt="uptime" src="img/26_uptime.png">
 <p>
 It will take a while to see the days and weeks counts appear, so I will
 need to power the board independently from its serial interface. For
 testing purposes I fast forward the execution by using a bigger value for
 the increment of <code>uptime</code> in <code>SysTick_Handler()</code>.
 <h2>Checkpoint</h2>
 Rereading the code while writing this web page, I found a typo in the
 week calculation. After that I retested with a bigger time increment to
 make sure days and weeks values are correct. It’s also clear that the
 test coverage for the printf format interpreter is not sufficient as I have
 coded more than is necessary to implement <b>uptime</b>.
 <p>
 I didn’t expect gcc to optimize calls to high level C functions,
 replacing <code>printf()</code> by <code>putchar()</code>, thus forcing me to
 write additional code. So far I am not concerned by execution speed, so this
 type of optimization is a bit counterproductive.
 <p>
 <a href="27_library.html">Next</a>, I will make sure that what belongs to the
 library category fits in an actual library file.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/27_library.html
+++ b/docs/27_library.html
@ -0,0 +1,164 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>2.7 C Library</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>2.7 C Library</h1>
 So far I have used three Standard C library functions for output:
 <code>printf()</code>, <code>putchar()</code> and <code>puts()</code>. It’s
 time to bundle them as a library. This will give me more flexibility as I will
 not have to give a full list of the modules to link, the linker will handle
 the missing dependencies by looking into the libraries.
 <h2>puts()</h2>
 I have already packaged <code>printf()</code> and <code>putchar()</code> in
 standalone modules. As I have removed my previous implementation of
 <code>puts()</code> from the system, I need to create <b>puts.c</b>.
 <pre>
 /* puts.c -- write a string to stdout   */
 #include &lt;stdio.h>
 #include "system.h" /* kputc(), kputs() */
 int puts( const char *s) {
    kputs( s) ;
    kputc( '\n') ;
    return 0 ;
 }
 </pre>
 <h2>Updating Makefile</h2>
 I need to tell <b>GNU make</b> how to manage and use the library, which
 means updating <b>Makefile</b>.
 <p>
 First, the name, the content and the rule to maintain the library:
 <pre>
 AR      = $(BINPFX)ar
 LIBOBJS = printf.o putchar.o puts.o
 LIBSTEM = stm32
 lib$(LIBSTEM).a: $(LIBOBJS)
    $(AR) rc $@ $?
 </pre>
 Where to look for and which libraries to use in the link phase:
 <pre>
 LIBS = -l$(LIBSTEM) -lgcc
 LIB_PATHS = -L. -L$(LIBDIR)
 $(PROJECT).elf: $(OBJS) lib$(LIBSTEM).a
    @echo $@ from $(OBJS)
    $(LD) -T$(LD_SCRIPT) $(LIB_PATHS) -Map=$(PROJECT).map -cref -o $@ $(OBJS) $(LIBS)
    $(SIZE) $@
    $(OBJDUMP) -hS $@ > $(PROJECT).lst
 </pre>
 Library modules are implicitly part of the composition, so it’s not
 necessary to list them anymore.
 <pre>
 #SRCS = startup.c uplow.2.c uptime.c printf.c putchar.c
 SRCS = startup.c uplow.2.c uptime.c
 </pre>
 I include libraries in the list of files to delete when doing a make
 clean.
 <pre>
 clean:
    @echo CLEAN
    @rm -f *.o *.elf *.map *.lst *.bin *.hex *.a
 </pre>
 <h2>Building uptime</h2>
 The build terminates successfully, producing the same executable as before.
 <pre>
 $ make
 f030f4.elf from startup.o uplow.2.o uptime.o
   text    data     bss     dec     hex filename
   1797       0      12    1809     711 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 Checking the map produced by the linker I can see that it fetched the
 necessary modules for <code>printf()</code> and <code>putchar()</code> from the
 newly created library.
 <pre>
 Archive member included to satisfy reference by file (symbol)
 .\libstm32.a(printf.o)        uptime.o (printf)
 .\libstm32.a(putchar.o)       uptime.o (putchar)
 D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mi
 ngw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a
 (_thumb1_case_sqi.o)
                              .\libstm32.a(printf.o) (__gnu_thumb1_case_sqi)
 D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mi
 ngw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a
 (_udivsi3.o)
                              uptime.o (__aeabi_uidiv)
 D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mi
 ngw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a
 (_divsi3.o)
                              uptime.o (__aeabi_idiv)
 D:/Program Files (x86)/GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mi
 ngw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/13.3.1/thumb/v6-m/nofp\libgcc.a
 (_dvmd_tls.o)
                              D:/Program Files (x86)/GNU Arm Embedded Toolchain/
 arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi/lib/gcc/arm-none-eabi/1
 3.3.1/thumb/v6-m/nofp\libgcc.a(_udivsi3.o) (__aeabi_idiv0)
 </pre>
 <h2>Building hello</h2>
 I can rebuild my <b>hello</b> application using the latest system
 implementation and the newly made library.
 <pre>SRCS = startup.c uplow.2.c hello.c</pre>
 Build terminates successfully; the changes in size are due to the
 difference in the system implementation.
 <pre>
 $ make
 f030f4.elf from startup.o uplow.2.o hello.o
   text    data     bss     dec     hex filename
    445       0       8     453     1c5 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 Checking the map file produced in the link phase, I can see that only
 <b>puts.o</b> has been fetched from my local library.
 <pre>
 Archive member included to satisfy reference by file (symbol)
 .\libstm32.a(puts.o)          hello.o (puts)
 </pre>
 <h2>Checkpoint</h2>
 I had to deal with linking with the <b>gcc</b> library
 <a href="25_prototype.html">before</a>, so introducing my own library
 implementation of the standard C library output functions is a simple step.
 <p>
 <a href="28_clocks.html">Next</a>, I will continue on the topic of asynchronous serial
 transmission and look into baud rate and clock configuration.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/28_clocks.html
+++ b/docs/28_clocks.html
@ -0,0 +1,333 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>2.8 Baud Rate and Clocks</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>2.8 Baud Rate and Clocks</h1>
 <blockquote> “The time has come,” the walrus said, “to talk of many things: Of baud
  rates – and clocks – and quartz.”<br>
  -- Les huit scaroles --
 </blockquote>
 One thing to consider in any kind of transmission is the speed, how fast
 or how slowly you can transmit data. I have configured <b>USART1</b> at
 <b>9600&nbsp;baud</b>, keeping the other settings at default value (<b>8N1</b>),
 so how fast is that?
 <h2>A bit of theory</h2>
 Let’s interpret asynchronous serial transmission, 9600 baud, 8 bits, no
 parity, 1 stop bit.
 <ul>
 <li> Serial transmission means transmission on one wire, with each bit sent
 one after the other, usually least significant bit first.
 <li> Asynchronous means there is no extra wire for a clock, so transmitter
 and receiver must agree on a bit rate and a transmission pattern.
 <li> Transmission pattern, <b>8N1</b> in my case, is composed of a start bit,
 a data length (<b>8</b> bits), parity (Even, Odd or <b>N</b>one) and <b>1</b>,
 1.5 or 2 stop bits.
 <li> Because there is no common transmitted clock, the receiver
 resynchronizes based on the start bit/stop bit framing of the data. It
 samples the line at 16 times the frequency of the agreed clock to detect
 the change in the wire state.
 </ul>
 In my case, <b>8N1</b> means that, because of the framing pattern, every
 byte of data is sent with one extra start bit and one extra stop bit, so
 it’s ten bits per byte of data. At 9600 baud that means 960 bytes per
 second, fast enough to transmit every character of an 80×24 terminal
 screen in two seconds.
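 <p>
 As a quick check of these figures, here is a small sketch in C. The function
 names are mine, for illustration only, they are not part of the project.
 <pre>
 /* Bits on the wire per data byte: start + data + parity + stop */
 unsigned frame_bits( unsigned databits, unsigned paritybits,
                                                     unsigned stopbits) {
    return 1 + databits + paritybits + stopbits ;
 }

 /* Data bytes transmitted per second at a given baud rate */
 unsigned bytes_per_second( unsigned baud, unsigned framebits) {
    return baud / framebits ;
 }

 /* 8N1 at 9600 baud:
  *  frame_bits( 8, 0, 1) is 10
  *  bytes_per_second( 9600, 10) is 960
  *  an 80x24 screen is 1920 characters, 1920 / 960 is 2 seconds
  */
 </pre>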
 <h2>Baud rate accuracy</h2>
 It sounds like a pretty robust transmission scheme: sampling at 16 times
 the transmission clock isn’t called oversampling for nothing. Am I
 overdoing something here or just compensating for something I missed?
 <p>
 The thing is, I didn’t program USART1 to transmit at exactly 9600 baud. As my
 default clock is 8 MHz, I had to write in the USART1 baud rate register a
 value close to 8000000/9600 or 2500/3. 833 is close enough, but my actual
 transmission speed is closer to 9604, slightly faster than 9600 baud.
 <p>
 The error is small (4/10000) and the transmission works fine. Still,
 common baud rates are 300, 1200, 2400, 9600, 19200, 38400, 57600 and 115200.
 It would be better if my clock frequency was 6 MHz or 12 MHz if I
 want to work at a higher baud rate.
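 <p>
 The arithmetic above can be sketched as follows. This is only an
 illustration: the helper names are hypothetical and I assume the divider is
 truncated, as integer division does in C.
 <pre>
 /* Divider to write in the baud rate register (oversampling by 16) */
 unsigned baud_divider( unsigned clock, unsigned baud) {
    return clock / baud ;
 }

 /* Baud rate actually generated for a given divider, rounded down */
 unsigned actual_baud( unsigned clock, unsigned divider) {
    return clock / divider ;
 }

 /* Relative error of the generated baud rate in parts per million */
 unsigned baud_error_ppm( unsigned clock, unsigned baud) {
    unsigned div = baud_divider( clock, baud) ;
    unsigned rem = clock - div * baud ;     /* lost by the truncation */

    return (unsigned) (rem * 1000000ULL / ((unsigned long long) div * baud)) ;
 }

 /* At 8 MHz and 9600 baud: divider 833, about 9603 baud, 400 ppm error
  * At 24 MHz and 9600 baud: divider 2500, no error
  */
 </pre>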
 <h2>Clocks</h2>
 Looking at the clock tree in the datasheet can be intimidating, it’s
 definitely about several clocks.
 <p>
 <img alt="Clock Tree" src="img/28_clocktree.png">
 <p>
 The default configuration I have been using so far goes like this.
 <ul>
 <li> HSI is the output of an 8 MHz High Speed Internal RC oscillator.
 <li> HSI is the default source of SYSCLK.
 <li> HCLK, the clock of AHB domain, is SYSCLK divided by HPRE, a pre-scaler
 which defaults to 1.
 <li> The system tick clock default configuration is HCLK/8.
 <li> PCLK, the clock of APB domain, is HCLK divided by PPRE, a pre-scaler
 which defaults to 1.
 <li> GPIO peripherals are on AHB domain bus, they use HCLK.
 <li> USART1 is also on APB domain bus, but its input clock can be selected,
 PCLK is the default input.
 </ul>
 From the peripherals point of view.
 <ul>
 <li> SysTick Clock = HCLK/8 = SYSCLK/1/8 = HSI/1/8 = 8/1/8 MHz
 <li> GPIOx Clock = HCLK = SYSCLK/1 = HSI/1 = 8/1 MHz
 <li> USART1 Clock = PCLK = … = 8 MHz
 </ul>
 As I want to have a clock frequency different than 8 MHz as input for
 USART1, I can configure the Phase-Locked Loop (PLL) and switch SYSCLK to
 take its input from the PLL instead of HSI.
 <p>
 The PLL output frequency must be in the range 16-48 MHz. As I am looking
 for a frequency that can be divided by 3 to match most of the baud rates,
 I will use 24 MHz.
 <ul>
 <li> Select PLL input as HSI/2.
 <li> Set PLLMUL to 6.
 <li> Enable PLL and wait for it to stabilize.
 <li> Select SYSCLK input as PLL.
 <li> Wait for the switch to complete.
 </ul>
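 The frequency obtained from these settings can be checked with a few lines
 of C. This is a sketch with hypothetical helper names, the range limits are
 the 16-48 MHz stated above.
 <pre>
 /* PLL output frequency: input clock, pre divider then multiplier */
 unsigned pll_output( unsigned input, unsigned prediv, unsigned mul) {
    return input / prediv * mul ;
 }

 /* True if the frequency is in the 16-48 MHz range of the PLL output */
 int pll_in_range( unsigned freq) {
    return freq >= 16000000 && freq <= 48000000 ;
 }

 /* HSI/2 * 6: pll_output( 8000000, 2, 6) is 24000000, in range
  * HSI/2 * 2: pll_output( 8000000, 2, 2) is 8000000, below 16 MHz
  */
 </pre>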
 <h2>Quartz</h2>
 I can also activate the quartz if there is one soldered on the board.
 It’s usually the case, but especially for STM32F030F4 which has only 20
 pins, a quartz-less design that frees up two GPIO pins can be a day
 saver. Quartz values from 4 to 32 MHz are supported and most designs use 8
 MHz.
 <p>
 To set a 24 MHz clock with an 8 MHz High Speed External Oscillator (HSE):
 <ul>
 <li> Enable HSE and wait for it to stabilize.
 <li> Select HSE as input for the PLL with a pre divider of 2.
 <li> Set PLLMUL to 6.
 <li> Enable PLL and wait for it to stabilize.
 <li> Select SYSCLK input as PLL.
 <li> Wait for the switch to complete.
 <li> Disable HSI.
 </ul>
 I can use different values for the pre divider and post multiplier of
 the PLL (/4, *12 or /1, *3 instead of /2, *6) but here I want to stay
 aligned with the HSI/2 input selection when HSE quartz value is 8 MHz.
 <h2>Implementation</h2>
 I make a copy of <b>uplow.2.c</b> into <b>clocks.c</b> to make the changes.
 <p>
 I expand the board description part by adding <code>HSE</code>,
 <code>PLL</code> and <code>BAUD</code> macro definitions. Based on those I can
 handle four clock configurations: HSI, HSE, PLL HSI and PLL HSE.
 <pre>
 /* user LED ON when PA4 is high */
 #define LED_IOP A
 #define LED_PIN 4
 #define LED_ON  1
 /* 8MHz quartz, configure PLL at 24MHz */
 #define HSE     8000000
 #define PLL     6
 #define BAUD    9600
 #ifdef PLL
 # ifdef HSE
 #  define CLOCK (HSE / 2 * PLL)
 # else /* HSI */
 #  define CLOCK (8000000 / 2 * PLL)
 # endif
 # if CLOCK < 16000000
 #  error PLL output below 16MHz
 # endif
 #else
 # ifdef HSE
 #  define CLOCK (HSE)
 # else /* HSI */
 #  define CLOCK 8000000
 # endif
 #endif
 </pre>
 At compilation time, a check verifies that the targeted clock is in
 the supported range of the chipset and a warning is given if the baud rate
 generation is not accurate.
 <pre>
 #if CLOCK > 48000000
 # error clock frequency exceeds 48MHz
 #endif
 #if CLOCK % BAUD
 # warning baud rate not accurate at that clock frequency
 #endif
 </pre>
 I expand the definition of the Reset and Clock Control (RCC) peripheral
 to add the necessary bit fields.
 <pre>
 #define CAT( a, b) a##b
 #define HEXA( a) CAT( 0x, a)
 #define RCC                     ((volatile long *) 0x40021000)
 #define RCC_CR          RCC[ 0]
 #define RCC_CR_HSION    0x00000001  /*  0: Internal High Speed clock enable */
 #define RCC_CR_HSEON    0x00010000  /* 16: External High Speed clock enable */
 #define RCC_CR_HSERDY   0x00020000  /* 17: External High Speed clock ready flag */
 #define RCC_CR_PLLON    0x01000000  /* 24: PLL enable */
 #define RCC_CR_PLLRDY   0x02000000  /* 25: PLL clock ready flag */
 #define RCC_CFGR            RCC[ 1]
 #define RCC_CFGR_SW_MSK     0x00000003  /* 1-0: System clock SWitch Mask */
 #define RCC_CFGR_SW_HSE     0x00000001  /* 1-0: Switch to HSE as system clock */
 #define RCC_CFGR_SW_PLL     0x00000002  /* 1-0: Switch to PLL as system clock */
 #define RCC_CFGR_SWS_MSK    0x0000000C  /* 3-2: System clock SWitch Status Mask */
 #define RCC_CFGR_SWS_HSE    0x00000004  /* 3-2: HSE used as system clock */
 #define RCC_CFGR_SWS_PLL    0x00000008  /* 3-2: PLL used as system clock */
 #define RCC_CFGR_PLLSRC         0x00010000
 #define RCC_CFGR_PLLSRC_HSI     0x00000000      /* HSI / 2 */
 #define RCC_CFGR_PLLSRC_HSE     0x00010000      /* HSE */
 #define RCC_CFGR_PLLXTPRE       0x00020000
 #define RCC_CFGR_PLLXTPRE_DIV1  0x00000000  /* HSE */
 #define RCC_CFGR_PLLXTPRE_DIV2  0x00020000  /* HSE / 2 */
 #define RCC_CFGR_PLLMUL_MSK     (0x00F << 18)
 #define RCC_CFGR_PLLMUL( v)     (((v) - 2) << 18)
 #define RCC_AHBENR              RCC[ 5]
 #define RCC_AHBENR_IOP( h)      (1 << (17 + HEXA( h) - 0xA))
 #define RCC_APB2ENR             RCC[ 6]
 #define RCC_APB2ENR_USART1EN    0x00004000  /* 14: USART1 clock enable */
 </pre>
 The code to configure the clocks follows the steps I have described
 before. The conditional compilation allows the generation of the four
 possible cases: HSI, HSE, PLL HSI and PLL HSE.
 <pre>
 /* By default SYSCLK == HSI [8MHZ] */
 #ifdef HSE
 /* Start HSE clock (8 MHz external oscillator) */
    RCC_CR |= RCC_CR_HSEON ;
 /* Wait for oscillator to stabilize */
    do {} while( (RCC_CR & RCC_CR_HSERDY) == 0) ;
 #endif
 #ifdef PLL
 /* Setup PLL HSx/2 * 6 [24MHz] */
    /* Default 0: PLL HSI/2 src, PLLMUL * 2 */
 # ifdef HSE
    RCC_CFGR = RCC_CFGR_PLLSRC_HSE | RCC_CFGR_PLLXTPRE_DIV2 ;
 # endif
    RCC_CFGR |= RCC_CFGR_PLLMUL( PLL) ;
    RCC_CR |= RCC_CR_PLLON ;
    do {} while( (RCC_CR & RCC_CR_PLLRDY) == 0) ;   /* Wait for PLL */
 /* Switch to PLL as system clock SYSCLK == PLL [24MHz] */
    RCC_CFGR = (RCC_CFGR & ~RCC_CFGR_SW_MSK) | RCC_CFGR_SW_PLL ;
    do {} while( (RCC_CFGR & RCC_CFGR_SWS_MSK) != RCC_CFGR_SWS_PLL) ;
 #else
 # ifdef HSE
 /* Switch to HSE as system clock SYSCLK == HSE [8MHz] */
    RCC_CFGR = (RCC_CFGR & ~RCC_CFGR_SW_MSK) | RCC_CFGR_SW_HSE ;
    do {} while( (RCC_CFGR & RCC_CFGR_SWS_MSK) != RCC_CFGR_SWS_HSE) ;
 # endif
 #endif
 #ifdef HSE
 /* Switch off HSI */
    RCC_CR &= ~RCC_CR_HSION ;
 #endif
 </pre>
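 As a worked example of the bit fields involved, the sketch below computes the
 value I expect to find in <code>RCC_CFGR</code> for the PLL HSE case, leaving
 aside the read only status bits. The macro names are shortened copies of the
 definitions above and <code>cfgr_pll_hse()</code> is a hypothetical helper,
 used here only for illustration.
 <pre>
 /* Shortened copies of the RCC_CFGR bit fields defined above */
 #define PLLSRC_HSE      0x00010000      /* 16: PLL source is HSE */
 #define PLLXTPRE_DIV2   0x00020000      /* 17: HSE / 2 */
 #define PLLMUL( v)      (((v) - 2) << 18)   /* 21-18: multiplier - 2 */
 #define SW_PLL          0x00000002      /* 1-0: PLL as system clock */

 /* RCC_CFGR value for SYSCLK == HSE / 2 * mul, status bits ignored */
 unsigned cfgr_pll_hse( unsigned mul) {
    return PLLSRC_HSE | PLLXTPRE_DIV2 | PLLMUL( mul) | SW_PLL ;
 }

 /* PLLMUL( 6) is 4 << 18 or 0x00100000
  * cfgr_pll_hse( 6) is 0x00130002
  */
 </pre>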
 SysTick reload value is calculated based on the <code>CLOCK</code> constant value.
 <pre>
    SYSTICK_RVR = CLOCK / 8 - 1 ;   /* HCLK / 8 */
 </pre>
 Similarly, USART1 baud rate register is calculated based on <code>CLOCK</code>
 and <code>BAUD</code> constant value.
 <pre>
    USART1[ BRR] = CLOCK / BAUD ;       /* PCLK is default source */
 </pre>
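 One thing worth checking is that the one second reload value still fits in
 the SysTick reload register, which is 24 bits wide on Cortex-M0. The sketch
 below uses hypothetical helper names to do that check for the clock
 frequencies discussed here.
 <pre>
 /* SysTick reload value for a one second period, SysTick clock is HCLK/8 */
 unsigned systick_reload( unsigned clock) {
    return clock / 8 - 1 ;
 }

 /* True if the value fits in the 24 bit reload value register */
 int fits_24bits( unsigned value) {
    return value <= 0x00FFFFFF ;
 }

 /*  8 MHz: systick_reload(  8000000) is  999999
  * 24 MHz: systick_reload( 24000000) is 2999999
  * 48 MHz: systick_reload( 48000000) is 5999999, still below 16777215
  */
 </pre>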
 I add a debug print at the end of <code>init()</code> to display which clock
 configuration has been set.
 <pre>
    kputs(
 #ifdef PLL
        "PLL"
 #endif
 #ifdef HSE
        "HSE"
 #else
        "HSI"
 #endif
        "\n") ;
 </pre>
 <h2>Build and test</h2>
 To build, I first update the composition in Makefile.
 <pre>SRCS = startup.c clocks.c uptime.c</pre>
 The build completes successfully; this is for the PLL HSE board configuration.
 <pre>
 $ make
 f030f4.elf from startup.o clocks.o uptime.o
   text    data     bss     dec     hex filename
   1901       0      12    1913     779 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 I use a board with an 8 MHz quartz soldered on and test the four clock
 configurations.
 <h2>Checkpoint</h2>
 I have tuned the baud rate setting by using a higher frequency for the
 system clock.&nbsp; The clock tree is complex and I have only looked at a part
 of it.&nbsp; Nevertheless the implementation for the clock configuration gives
 me some flexibility and ease of setup.
 <p>
 <a href="29_interrupt.html">Next</a>, I will implement interrupt driven
 transmission.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/29_interrupt.html
+++ b/docs/29_interrupt.html
@ -0,0 +1,279 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>2.9 Interrupt Driven Transmission</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>2.9 Interrupt Driven Transmission</h1>
 It’s time to revise the implementation of <code>kputc()</code>, remove the
 recursive call to handle CR LF transmission and avoid the busy wait
 loop. USART1 can trigger an interrupt when the Transmit Data Register
 (<b>TDR</b>) is empty which is all I need to implement interrupt driven
 transmission.
 <h2>Extending the interrupt vector</h2>
 I need to add the device specific interrupts to the interrupt vector. So
 far I have only mapped the initial Stack pointer and the 15 Core System
 Exceptions. The STM32F0x0 chipsets have 32 device specific interrupts which
 are listed in the <b><i>Reference Manual</i> RM0360</b>.
 <p>
 I make a copy of <b>startup.c</b> into <b>startup.txeie.c</b> to do the
 changes. The name txeie refers to Transmit Data Register Empty Interrupt
 Enabled.
 <pre>
 /* Stubs for System Exception Handler */
 void Default_Handler( void) ;
 #define dflt_hndlr( fun) void fun##_Handler( void) \
                                __attribute__((weak,alias("Default_Handler")))
 dflt_hndlr( NMI) ;
 dflt_hndlr( HardFault) ;
 dflt_hndlr( SVCall) ;
 dflt_hndlr( PendSV) ;
 dflt_hndlr( SysTick) ;
 dflt_hndlr( WWDG) ;
 dflt_hndlr( RTC) ;
 dflt_hndlr( FLASH) ;
 dflt_hndlr( RCC) ;
 dflt_hndlr( EXTI0_1) ;
 dflt_hndlr( EXTI2_3) ;
 dflt_hndlr( EXTI4_15) ;
 dflt_hndlr( DMA_CH1) ;
 dflt_hndlr( DMA_CH2_3) ;
 dflt_hndlr( DMA_CH4_5) ;
 dflt_hndlr( ADC) ;
 dflt_hndlr( TIM1_BRK_UP_TRG_COM) ;
 dflt_hndlr( TIM1_CC) ;
 dflt_hndlr( TIM3) ;
 dflt_hndlr( TIM6) ;
 dflt_hndlr( TIM14) ;
 dflt_hndlr( TIM15) ;
 dflt_hndlr( TIM16) ;
 dflt_hndlr( TIM17) ;
 dflt_hndlr( I2C1) ;
 dflt_hndlr( I2C2) ;
 dflt_hndlr( SPI1) ;
 dflt_hndlr( SPI2) ;
 dflt_hndlr( USART1) ;
 dflt_hndlr( USART2) ;
 dflt_hndlr( USART3_4_5_6) ;
 dflt_hndlr( USB) ;
 /* Interrupt vector table:
 * 1  Stack Pointer reset value
 * 15 System Exceptions
 * 32 Device specific Interrupts
 */
 typedef void (*isr_p)( void) ;
 isr_p const isr_vector[ 16 + 32] __attribute__((section(".isr_vector"))) = {
    (isr_p) &__StackTop,
 /* System Exceptions */
    Reset_Handler,
    NMI_Handler,
    HardFault_Handler,
    0,  0,  0,  0,  0,  0,  0,
    SVCall_Handler,
    0,  0,
    PendSV_Handler,
    SysTick_Handler,
 /* STM32F030xx specific Interrupts cf RM0360 */
    WWDG_Handler,
    0,
    RTC_Handler,
    FLASH_Handler,
    RCC_Handler,
    EXTI0_1_Handler,
    EXTI2_3_Handler,
    EXTI4_15_Handler,
    0,
    DMA_CH1_Handler,
    DMA_CH2_3_Handler,
    DMA_CH4_5_Handler,
    ADC_Handler,
    TIM1_BRK_UP_TRG_COM_Handler,
    TIM1_CC_Handler,
    0,
    TIM3_Handler,
    TIM6_Handler,
    0,
    TIM14_Handler,
    TIM15_Handler,
    TIM16_Handler,
    TIM17_Handler,
    I2C1_Handler,
    I2C2_Handler,
    SPI1_Handler,
    SPI2_Handler,
    USART1_Handler,
    USART2_Handler,
    USART3_4_5_6_Handler,
    0,
    USB_Handler
 } ;
 </pre>
 <h2>kputc() and USART1_Handler()</h2>
 I make a copy of <b>clocks.c</b> into <b>txeie.c</b> to make the changes to my
 system layer.
 <p>
 I add the description of the TX Empty Interrupt Enable bit in the
 Configuration Register of USART1:
 <pre>#define USART_CR1_TXEIE (1 << 7)    /* 7: TDR Empty Interrupt Enable */</pre>
 I use a Round Robin buffer to synchronize <code>kputc()</code> and
 <code>USART1_Handler()</code> making sure they don’t write to the same location.
 <pre>
 static unsigned char txbuf[ 8] ; // best if size is a power of 2 for cortex-M0
 #define TXBUF_SIZE (sizeof txbuf / sizeof txbuf[ 0])
 static unsigned char            txbufin ;
 static volatile unsigned char   txbufout ;
 </pre>
 <ul>
 <li> <code>kputc()</code> writes in <code>txbuf[]</code> while
 <code>USART1_Handler()</code> only reads from it.
 <li> <code>txbufin</code> is the index of the position where
 <code>kputc()</code> will insert a character, it’s written by
 <code>kputc()</code> and read by <code>USART1_Handler()</code>.
 <li> <code>txbufout</code> is the index of the position where
 <code>USART1_Handler()</code> will fetch a character, it’s written by
 <code>USART1_Handler()</code> and read by <code>kputc()</code>. The value of
 <code>txbufout</code> will change under interrupt, so it is marked as
 <code>volatile</code> to make sure the compiler will not optimize the code in a
 conflicting way.
 <li> As the index calculation in the Round Robin buffer uses integer
  division, it is best that the size of the buffer be a power of two on
  Cortex-M0 chipset. This way the compiler will optimize <code>(idx+1) %
  size</code> into <code>(idx+1) & (size-1)</code>, which is more efficient on
  a chipset with no support for integer division.
 <li> I kept the buffer size small to be sure this code is well tested. With
  a value of 8, this means the buffer can hold up to 7 characters.
 </ul>
 <pre>
 void USART1_Handler( void) {
    if( txbufout == txbufin) {
    /* Empty buffer => Disable TXEIE */
        USART1[ CR1] &= ~USART_CR1_TXEIE ;
    } else {
        static unsigned char lastc ;
        unsigned char c ;
        c = txbuf[ txbufout] ;
        if( c == '\n' && lastc != '\r')
            c = '\r' ;
        else
            txbufout = (txbufout + 1) % TXBUF_SIZE ;
        USART1[ TDR] = c ;
        lastc = c ;
    }
 }
 void kputc( unsigned char c) {  /* character output */
    int nextidx ;
 /* Wait if buffer full */
    nextidx = (txbufin + 1) % TXBUF_SIZE ;
    while( nextidx == txbufout)
        yield() ;
    txbuf[ txbufin] = c ;
    txbufin = nextidx ;
 /* Trigger transmission by enabling interrupt */
    USART1[ CR1] |= USART_CR1_TXEIE ;
 }
 </pre>
 <ul>
 <li> <code>kputc()</code> enables the interrupt generation after a new
 character is inserted in the buffer.
 <li> <code>USART1_Handler()</code> disables the interrupt generation when the
 buffer is empty.
 <li> The conversion of LF to CR LF is done by the interrupt handler.
 <li> <code>kputc()</code> now yields when the buffer is full.
 </ul>
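 To convince myself of the behavior described above, the same logic can be
 exercised on a host PC. This is only a sketch: <code>put()</code> and
 <code>get()</code> are hypothetical stand-ins for <code>kputc()</code> and
 <code>USART1_Handler()</code>, with the hardware accesses and the call to
 <code>yield()</code> removed.
 <pre>
 #define SIZE 8                  /* power of 2, same as txbuf[] */

 unsigned char buf[ SIZE] ;
 unsigned char in, out ;         /* play the role of txbufin and txbufout */
 unsigned char lastc ;

 /* Same test as kputc(), but returns 0 when full instead of yielding */
 int put( unsigned char c) {
    unsigned char next = (in + 1) % SIZE ;

    if( next == out)
        return 0 ;

    buf[ in] = c ;
    in = next ;
    return 1 ;
 }

 /* Same logic as USART1_Handler(): returns the character that would be
  * written to TDR, or -1 when the buffer is empty */
 int get( void) {
    unsigned char c ;

    if( out == in)
        return -1 ;

    c = buf[ out] ;
    if( c == '\n' && lastc != '\r')
        c = '\r' ;              /* send CR first, LF stays in the buffer */
    else
        out = (out + 1) % SIZE ;

    lastc = c ;
    return c ;
 }
 </pre>
 Feeding <code>'a'</code>, <code>'\n'</code>, <code>'b'</code> through
 <code>put()</code> and draining with <code>get()</code> gives back
 <code>'a'</code>, <code>'\r'</code>, <code>'\n'</code>, <code>'b'</code>. From
 an empty buffer, the eighth consecutive <code>put()</code> is the first one to
 fail, which matches the capacity of 7 characters.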
 <h2>Unmasking USART1 interrupt</h2>
 I have configured USART1 peripheral to generate an interrupt when the
 transmit data register is empty, now I have to tell the Core to pay
 attention to USART1 specific interrupt line.
 <p>
 The 32 device specific interrupts are enabled through the Nested
 Vectored Interrupt Controller (NVIC). <b>NVIC</b> is a core peripheral so its
 description is in the Programming Manual. Enabling is done through the
 Interrupt Set-Enable Register (ISER).
 <p>
 Set-Enable means writing a 1 enables while writing a 0 does nothing.
 Reading reports the current settings. There is a corresponding
 Clear-Enable register, to disable interrupts.
 <pre>
 #define NVIC                    ((volatile long *) 0xE000E100)
 #define NVIC_ISER               NVIC[ 0]
 #define unmask_irq( idx)        NVIC_ISER = 1 << (idx)
 #define USART1_IRQ_IDX          27
 </pre>
 I add a call to the macro unmask_irq() after USART1 initialization.
 <pre>
 /* Unmask USART1 irq */
    unmask_irq( USART1_IRQ_IDX) ;
 </pre>
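 The index 27 ties together the bit written in <code>NVIC_ISER</code> and the
 position of <code>USART1_Handler</code> in <code>isr_vector[]</code>. The
 sketch below, with hypothetical helper names, makes that relation explicit.
 <pre>
 /* Bit to set in NVIC_ISER for a device interrupt index, 0 to 31 */
 unsigned irq_mask( unsigned idx) {
    return 1u << idx ;
 }

 /* Position in isr_vector[]: the stack pointer and the 15 system
  * exceptions come before the device specific interrupts */
 unsigned vector_slot( unsigned idx) {
    return 16 + idx ;
 }

 /* USART1: irq_mask( 27) is 0x08000000
  *         vector_slot( 27) is 43, at byte offset 43 * 4 or 0xAC
  */
 </pre>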
 <h2>Build and test</h2>
 I add the composition into Makefile.
 <pre>SRCS = startup.txeie.c txeie.c uptime.c</pre>
 Build completes successfully.
 <pre>
 $ make
 f030f4.elf from startup.txeie.o txeie.o uptime.o
   text    data     bss     dec     hex filename
   2097       0      20    2117     845 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 Checking the map and lst files I can verify that
 <ul>
 <li> This code has grown by 128 bytes due to the 32 extra interrupt vector
 entries (4 bytes each).
 <li> Previous implementation of <code>kputc()</code> was using 4 bytes of bss
 to hold lastchar (1 byte). The new version uses 12 bytes to hold the round
 robin buffer (8 bytes), its in and out indexes (2 bytes) and lastchar (1 byte).
 <li> The compiler optimizes the modulo instruction in <code>% size</code> to
 bit masking <code>& (size - 1)</code> as the size 8 is a power of 2.
 </ul>
 Flashing a device with the new executable, <b>uptime</b> works as the previous
 version.
 <h2>Checkpoint</h2>
 There is no obvious benefit in doing transmission under interrupt at this
 stage: my most complex application, <b>uptime</b>, prints one line every
 second, so there is plenty of idle time.
 <p>
 <a href="index.html#part3">Next</a>, I will use an external sensor to do some
 measurement.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/31_dht11.html
+++ b/docs/31_dht11.html
@ -0,0 +1,434 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>3.1 DHT11 Humidity & Temperature</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>3.1 DHT11 Humidity & Temperature</h1>
 The DHT11 is a low cost humidity and temperature sensor from Aosong
 which is easy to buy online. It is not popular as it has a non standard
 communication protocol and its precision is ±5% for humidity and ±2°C
 for temperature, so it’s often overlooked in favor of more expensive
 solutions.
 <h2>Hardware consideration</h2>
 The DHT11 comes in a 4 pin package where only 3 pins are used: vcc, gnd
 and data io. It can be powered at 5V or 3.3V. At 3.3V I can connect its
 data io pin to any of the STM32 GPIO pins. If I want to test how it
 behaves when powered at 5V, I will have to use one of the 5V tolerant
 pins of the STM32.
 <p>
 The io data line when idle needs to be at high level, so a pull up
 resistor is necessary.
 <p>
 I will use 3.3V, connect DHT11 data io pin to STM32 GPIOA0. The small
 DHT11 boards I use all have a pull up resistor between vcc and io data.
 <p>
 <img alt="DHT11 Boards" src="img/31_dht11.png">
 <h2>Measurement frequency</h2>
 The DHT11 needs one second to settle after power up, after that it can
 be queried no more often than every two seconds.
 <p>
 This requirement is easy to implement based on uptime 1 second counter.
 <h2>Communication protocol</h2>
 In idle state, the io data line is kept high by the pull up resistor and
 DHT11 is waiting for a request.
 <p>
 To request data, the STM32 needs to keep the io data line at low level
 for more than 18 ms.
 <p>
 Once the STM32 releases the line, the pull up will bring the level back to
 high and the DHT11 will assert the beginning of transmission by first pulling
 it down for 80 µs then up for 80 µs.
 <p>
 The DHT11 will transmit 40 bits (5 bytes), high bit first, encoding a
 zero as 50 µs low followed by 26-28 µs high and a one as 50 µs low
 followed by 70 µs high.
 <p>
 The last bit is followed by 50 µs low to signal the end of transmission
 and the return to idle state.
 <p>
 To implement this protocol on STM32 side:
 <ul>
 <li> GPIO pin as output low during 18 ms.
 <li> GPIO pin as input to sample the line at a frequency that allows
  differentiation of a 26-28 µs high (encoding a zero) versus a 70 µs
  high (encoding a one).
 </ul>
 <h2>Data encoding</h2>
 The 5 transmitted bytes hold humidity, temperature and checksum.
 <ul>
 <li> Byte 0: Integer part of humidity value.
 <li> Byte 1: Fractional part of humidity value. Equal to zero.
 <li> Byte 2: Integer part of temperature value.
 <li> Byte 3: One digit fractional part of temperature value. 0-9, bit 7
  indicates if temperature is below zero.
 <li> Byte 4: Checksum of bytes 0-3.
 </ul>
 The STM32F030 has no support for floating point representation so I will
 use ad hoc representation of the temperature value.
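 <p>
 To make the encoding concrete, here is a sketch that validates and decodes a
 frame using integers only. The function names and the sample frames are
 hypothetical, and I assume the checksum is the low 8 bits of the sum of bytes
 0 to 3. The temperature is returned in tenths of a degree as one possible ad
 hoc representation.
 <pre>
 /* True if byte 4 matches the low 8 bits of the sum of bytes 0 to 3 */
 int frame_is_valid( const unsigned char f[ 5]) {
    return (unsigned char) (f[ 0] + f[ 1] + f[ 2] + f[ 3]) == f[ 4] ;
 }

 /* Temperature in tenths of °C, negative when bit 7 of byte 3 is set */
 int frame_decitemp( const unsigned char f[ 5]) {
    int t = f[ 2] * 10 + (f[ 3] & 0x7F) ;

    return (f[ 3] & 0x80) ? -t : t ;
 }

 /* Made up frames:
  *  { 45, 0, 23, 5, 73}       45 %RH,  23.5 °C, decitemp is 235
  *  { 45, 0, 5, 0x82, 180}    45 %RH,  -5.2 °C, decitemp is -52
  */
 </pre>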
 <h2>Implementing Low Level API</h2>
 I need to implement the following low level functionalities:
 <ul>
 <li> Configure a GPIOA pin as <b>input</b>. This is the default mode of most
  GPIOA pins. In input mode the line is floating and the level can be
  read.
 <li> Configure a GPIOA pin as <b>output</b>. The level will be asserted by
  STM32. The default output value is LOW.
 <li> <b>Read</b> a GPIOA pin level, either HIGH or LOW.
 <li> <b>Sleep</b> for a duration specified in µs.
 </ul>
 This is a minimal set of functions based on the known state of the
 system: <b>GPIOA</b> has already been enabled as I am using serial
 transmission and the pins output level defaults to <b>LOW</b>, so I only need
 to configure the pin as an output to pull down the line.
 <p>
 The sleep granularity is <b>1 µs</b>, I need to assert the line LOW for at
 least 18 ms, so <b>1 ms</b> or even <b>10 ms</b> granularity would be fine.
 <p>
 I add the following lines to <b>system.h</b> to declare the interface I am
 going to implement.
 <pre>
 /* GPIOA low level API ********************************************************/
 typedef enum {
        LOW = 0,
        HIGH
 } iolvl_t ;
 void gpioa_input( int pin) ;        /* Configure GPIOA pin as input */
 void gpioa_output( int pin) ;       /* Configure GPIOA pin as output */
 iolvl_t  gpioa_read( int pin) ;     /* Read level of GPIOA pin */
 void usleep( unsigned usecs) ;      /* wait at least usec µs */
 </pre>
 I make a copy of <b>txeie.c</b> into <b>gpioa.c</b> to implement the new API.
 <pre>
 #define IDR     4
 /* GPIOA low level API ********************************************************/
 void gpioa_input( int pin) {        /* Configure GPIOA pin as input */
    GPIOA[ MODER] &= ~(3 << (pin * 2)) ;    /* Apin as input [00] */
 }
 void gpioa_output( int pin) {       /* Configure GPIOA pin as output */
    GPIOA[ MODER] |= 1 << (pin * 2) ;       /* Apin output (over [00]) */
 }
 iolvl_t gpioa_read( int pin) {      /* Read level of GPIOA pin */
    return LOW != (GPIOA[ IDR] & (1 << pin)) ;
 }
 </pre>
 I didn’t use the GPIO Input Data Register (<b>IDR</b>) until now, so I add it
 to the registers description.
 <p>
 <code>gpioa_output()</code> implementation is minimal. I know I am switching
 only between input and output mode, so I don’t need to mask the bit field
 first.
 <p>
 I use the System Tick to implement <code>usleep()</code>.
 <pre>
 void usleep( unsigned usecs) {      /* wait at least usec µs */
 #if CLOCK / 8000000 < 1
 # error HCLK below 8 MHz
 #endif
    usecs = SYSTICK_CVR - (CLOCK / 8000000 * usecs) ;
    while( SYSTICK_CVR > usecs) ;
 }
 </pre>
 The System Tick generates an interrupt every second but I can read the
 Current Value Register (<b>CVR</b>) to pause for smaller time period.
 <p>
 As I will read the sensor just after a new second count, I know that the
 <b>CVR</b> value is close to maximum and I don’t need to care for a roll
 over.
 <p>
 SysTick input clock is <b>HCLK/8</b>, this implementation will work for
 <b>HCLK</b> equal to a multiple of 8 MHz (8, 16, 24, 32, 40, 48).
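 <p>
 The conversion from µs to SysTick counts can be checked in isolation. In this
 sketch <code>usecs_to_ticks()</code> is a hypothetical helper that repeats the
 expression used in <code>usleep()</code>; it also shows why a clock that is
 not a multiple of 8 MHz gives a delay that is too short.
 <pre>
 /* SysTick counts to wait for usecs µs, same expression as in usleep() */
 unsigned usecs_to_ticks( unsigned clock, unsigned usecs) {
    return clock / 8000000 * usecs ;
 }

 /* 24 MHz: usecs_to_ticks( 24000000, 18000) is 54000, SysTick runs at 3 MHz
  *  8 MHz: usecs_to_ticks(  8000000, 18000) is 18000, SysTick runs at 1 MHz
  * 12 MHz: usecs_to_ticks( 12000000, 1000) is 1000 but SysTick runs at
  *         1.5 MHz, so 1500 counts would be needed for 1000 µs
  */
 </pre>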
 <h2>DHT11 API</h2>
 I create the header file <b>dht11.h</b> with the following interface.
 <pre>
 /* dht11.h -- DHT11 API */
 typedef enum {
    DHT11_SUCCESS,
    DHT11_FAIL_TOUT,
    DHT11_FAIL_CKSUM
 } dht11_retv_t ;
 /* 5 .. 95 %RH, -20 .. 60 °C */
 extern unsigned char dht11_humid ;  /* 5 .. 95 %RH */
 extern   signed char dht11_tempc ;  /* -20 .. 60 °C */
 extern unsigned char dht11_tempf ;  /* .0 .. .9 °C */
 void dht11_init( void) ;
 dht11_retv_t dht11_read( void) ;
 </pre>
 Usage:
 <ul>
 <li> Initialization: <code>dht11_init()</code>, once at startup.
 <li> Call: <code>dht11_read()</code>, not more often than every two seconds,
 starting one second after voltage stabilizes.
 <li> Test for error: transmission protocol is based on strict timing and
  data integrity is ensured by checksum, so Timeout and Checksum errors
  need to be checked.
 <li> Measurement available through global variables holding humidity,
  integer part of temperature and one digit fractional part of
  temperature.
 </ul>
 Based on this API, I write <b>dht11main.c</b>.
 <pre>
 /* dht11main.c -- sample DHT11 sensor */
 #include &lt;stdio.h>
 #include "system.h"
 #include "dht11.h"
 int main() {
    static unsigned last ;
    dht11_init() ;
    for( ;;)
        if( last == uptime)
            yield() ;
        else {
            last = uptime ;
            if( last & 1)   /* every 2 seconds starting 1 second after boot */
                switch( dht11_read()) {
                case DHT11_SUCCESS:
                    printf( "%u%%RH, %d.%uC\n", dht11_humid, dht11_tempc,
                                                                dht11_tempf) ;
                    break ;
                case DHT11_FAIL_TOUT:
                    puts( "Timeout") ;
                    break ;
                case DHT11_FAIL_CKSUM:
                    puts( "Cksum error") ;
                }
        }
 }
 </pre>
 <h2>DHT11 API implementation</h2>
 I first translate the specs into <b>pseudocode</b>.
 <pre>
 dht11_retv_t dht11_read( void) {
    unsigned char values[ 5] ;
 /* Host START: pulls line down for > 18ms then release line, pull-up raises to HIGH */
    dht11_output() ;
    usleep( 18000) ;
    dht11_input() ;
 /* DHT START: takes line, 80µs low then 80µs high */
    wait_level( LOW) ;  /* HIGH -> LOW, starts 80µs low */
    wait_level( HIGH) ; /* LOW -> HIGH, ends 80µs low, starts 80µs high */
 /* DHT transmits 40 bits, high bit first
 *  0 coded as 50µs low then 26~28µs high
 *  1 coded as 50µs low then 70µs high
 */
    wait_level( LOW) ; /* HIGH -> LOW, ends 80µs high, starts 50µs low */
    unsigned char sum = 0 ;
    unsigned char v = 0 ;
    for( int idx = 0 ; idx <= 4 ; idx += 1) {
        sum += v ;
        v = 0 ;
        for( unsigned char curbit = 128 ; curbit ; curbit >>= 1) {
        /* Measure duration of HIGH level */
            wait_level( HIGH) ; /* LOW -> HIGH, ends 50µs low, starts timed high */
            wait_level( LOW) ;  /* HIGH -> LOW, timed high ends, starts 50µs low */
        /* Set bit based on measured HIGH duration */
            if( duration is 70µs)  /* 0 == 26~28µs, 1 == 70µs */
                v |= curbit ;
        }
        values[ idx] = v ;
    }
 /* DHT STOP: releases line after 50µs, pull-up raises to HIGH */
    wait_level( HIGH) ; /* LOW -> HIGH, ends 50µs low, dht has released the line */
    if( sum != values[ 4])
        return DHT11_FAIL_CKSUM ;
    dht11_humid = values[ 0] ;
    dht11_tempc = values[ 2] ;
    dht11_tempf = values[ 3] ;
    if( dht11_tempf & 0x80) {
        dht11_tempc *= -1 ;
        dht11_tempf &= 0x7F ;
    }
    return DHT11_SUCCESS ;
 }
 </pre>
 To turn this pseudocode into real implementation I need to code
 <ul>
 <li> <code>wait_level()</code>: waits for a line transition and triggers a
 timeout if there is none.
 <li> a way to measure the duration of a HIGH level.
 </ul>
 I implement <code>wait_level()</code> as a macro, triggering a timeout when the
 number of retries reaches a limit. Originally, I set MAX_RETRIES to 999,
 later I tuned it to be large enough for the highest frequency supported
 by STM32F030 (48 MHz).
 <pre>
 #define MAX_RETRIES 200         /* at 48 MHz, 160 retries for 80 µs HIGH */
 #define is_not_LOW( a) a != LOW
 #define is_not_HIGH( a) a == LOW
 #define wait_level( lvl) \
    retries = MAX_RETRIES ; \
    while( is_not_##lvl( dht11_bread())) \
        if( retries-- == 0) \
            return DHT11_FAIL_TOUT
 </pre>
 <code>wait_level()</code> allows me to measure the duration of a wait in retries
 unit. As DHT11 starts transmission by 80µs LOW followed by 80µs HIGH, I
 can measure 80µs in retries unit. This is all I need to calibrate the timing
 measurement.
 <p>
 I can do this calibration every time the DHT11 starts transmission, this
 way I don’t need to update some constant if I change the frequency of my
 system clock.
 <pre>
 /* DHT transmits 40 bits, high bit first
 *  0 coded as 50µs low then 26~28µs high
 *  1 coded as 50µs low then 70µs high
 */
    wait_level( LOW) ; /* HIGH -> LOW, ends 80µs high, starts 50µs low */
    int threshold = (MAX_RETRIES + retries) / 2 ;
 </pre>
 Based on the measured duration of 80µs, I can define a threshold at
 40µs. Later, to identify whether a transmitted bit was a 0 (26~28µs) or
 a 1 (70µs), I will check if its duration is below or above the
 threshold.
 <pre>
            wait_level( LOW) ;  /* HIGH -> LOW, timed high ends, starts 50µs low */
        /* Set bit based on measured HIGH duration */
            if( retries < threshold)    /* false == 26~28µs, true == 70µs */
                v |= curbit ;
 </pre>
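 The calibration and the bit decision can be checked off target. The sketch
 below is a host-side illustration, not part of <b>dht11.c</b>: the helper
 names <code>dht11_threshold()</code> and <code>dht11_is_one()</code> are
 hypothetical, and the retry counts in the note assume the 160 retries per
 80µs quoted above for 48 MHz.

```c
#define MAX_RETRIES 200

/* retries counts down from MAX_RETRIES, so a longer wait leaves a smaller
 * value. left80 is what remains after the 80µs HIGH of the start sequence. */
static int dht11_threshold( int left80) {
    return (MAX_RETRIES + left80) / 2 ;     /* count remaining after ~40µs */
}

/* A HIGH longer than ~40µs leaves fewer retries than the threshold: bit 1 */
static int dht11_is_one( int left, int threshold) {
    return left < threshold ;
}
```

 With 160 retries per 80µs, 40 retries remain after the calibration pulse and
 the threshold is 120. A 27µs HIGH would leave about 146 retries (bit 0), a
 70µs HIGH about 60 (bit 1).
 <p>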
 To finalize <code>dht11_read()</code>, I declare retries before the first
 <code>wait_level()</code>.
 <pre>
 /* DHT START: takes line, 80µs low then 80µs high */
    int retries ;       /* retry counter */
    wait_level( LOW) ;  /* HIGH -> LOW, starts 80µs low */
 </pre>
 There is still a bit of pseudocode left as I need to map
 <code>dht11_input()</code>, <code>dht11_output()</code> and
 <code>dht11_bread()</code> to actual GPIO peripheral, pin and low level
 functions. I am using GPIOA pin 0.
 <pre>
 /* dht11.c -- DHT11 humidity and temperature sensor reading */
 #include "dht11.h"      /* implements DHT11 API */
 #include "system.h"     /* usleep(), gpioa_*() */
 #define DIO 0
 #define dht11_input()   gpioa_input( DIO)
 #define dht11_output()  gpioa_output( DIO)
 #define dht11_bread()   gpioa_read( DIO)
 /* 5 .. 95 %RH, -20 .. 60 °C */
 unsigned char dht11_humid ; /* 5 .. 95 %RH */
  signed char dht11_tempc ; /* -20 .. 60 °C */
 unsigned char dht11_tempf ; /* .0 .. .9 °C */
 void dht11_init( void) {
    dht11_input() ;
 }
 </pre>
 After adding the includes, global variables declarations and the
 implementation of <code>dht11_init()</code>, I can build and test.
 <h2>Build and basic test</h2>
 I add the new composition to <b>Makefile</b>
 <pre>SRCS = startup.txeie.c gpioa.c dht11main.c dht11.c</pre>
 Build completes successfully
 <pre>
 $ make
 f030f4.elf from startup.txeie.o gpioa.o dht11main.o dht11.o
   text    data     bss     dec     hex filename
   1877       0      24    1901     76d f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 Flashing the board and starting execution, I can see a new output every
 two seconds.
 <p>
 <img alt="DHT11 output" src="img/31_output.png">
 <p>
 The humidity value seems off the mark, so I need to investigate what
 the issue is.
 <h2>Checkpoint</h2>
 I have implemented DHT11 protocol using polling and auto timing calibration. I
 can read the values reported by the DHT11 sensor.
 <p>
 <a href="32_errata.html">Next</a>, I will investigate if the values read are
 correct.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/32_errata.html
+++ b/docs/32_errata.html
@ -0,0 +1,244 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>3.2 DHT11 Errata</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>3.2 DHT11 Errata</h1>
 I only did basic testing so far, checking that the values read were
 displayed properly. But the humidity values didn’t seem correct, so I
 need to do some extra verifications. There are many possible causes and
 they can combine: bugs, like humans, are social animals. When you find
 one, look for its mates.
 <p>
 Some candidates for investigation:
 <ul>
 <li> Coding or wiring mistakes.
 <li> Wrong interpretation of the datasheet.
 <li> Mistake in the datasheet.
 <li> Bad sample of the DHT11.
 <li> Deviation from the specifications.
 <li> Sensitivity to power supply, either the accuracy of the voltage or the
  fact I am using 3.3V instead of 5V.
 </ul>
 On top of this I need to do extra testing
 <ul>
 <li> Test at temperature below 0°C.
 <li> Check if the readings are stable or jittery.
 </ul>
 <h2>Coding and wiring check</h2>
 My basic test shows that I manage to read values without errors. A loose
 wire would generate Timeout errors or Checksum errors. The temperature
 values look reasonable, only humidity seems way too high.
 <p>
 I double checked both code and wiring to be on the safe side.
 <p>
 I am confident that the protocol implementation matches the transmission
 of the DHT11. So next I need to recheck whether the setup time (one second)
 and the reading interval (every two seconds) match the requirements.
 <h2>Datasheet check</h2>
 I rechecked the Aosong website for the latest version of the DHT11
 datasheet. There is only a Chinese version v1.3 branded ASAIR. I have
 the previous version v1.3 branded AOSONG which seems identical except for
 a revision page.
 <p>
 I also have an English Aosong DHT11 datasheet that seems outdated.
 Aosong doesn’t seem to allow redistribution of their datasheet so most
 online vendors have made their own based on that English version.
 <p>
 The English datasheet states the temperature range as 0~50℃ and the
 humidity range as 20~95%RH, and recommends a reading interval greater
 than 5 seconds.
 <p>
 The Chinese datasheet states the temperature range as -20~60°C and the
 humidity range as 0~95%RH, and recommends a reading interval greater
 than 2 seconds. It also explains the encoding of negative temperature.
 <p>
 My implementation is based on the latest Chinese version. I can retest
 with a longer interval between readings in case I am using a chip that
 follows the older specification (5 seconds interval instead of 2).
 <p>
 In <b>dht11main.c</b> I just need to change the test <code>if( 1 & last)</code>
 to <code>if( 2 == (last % 15))</code> in order to read every 15 seconds after
 a setup time of 2 seconds. If this works better, I can retry for shorter
 intervals by changing the modulo value.
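 <p>
 The test is a plain modulo on the uptime counter. A minimal sketch of the
 same idea as a standalone predicate, assuming an uptime counted in seconds;
 the function <code>is_sampling_time()</code> and its parameters are
 hypothetical names, not part of <b>dht11main.c</b>.

```c
/* True once per period, offset seconds after each multiple of period.
 * With period 15 and offset 2 it is true at uptime 2, 17, 32, ...
 * The original test if( 1 & last) corresponds to period 2 and offset 1. */
static int is_sampling_time( unsigned uptime, unsigned period, unsigned offset) {
    return offset == (uptime % period) ;
}
```

 Trying a shorter interval is then only a matter of changing the period
 argument.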
 <p>
 When I test with readings every 15 seconds, I get a stable humidity value
 around 25~26%RH. As I have changed both interval and setup time, I need
 to confirm that it is the interval time that matters.
 <p>
 With intervals of 5 and 6 seconds, the reading jumps above 37%RH. So
 it’s clearly a problem with the interval.
 <p>
 I want to make a round number of samples per minute, so I need to retest
 to check if 10 and 12 seconds work the same as 15. But before I do fine
 tuning, I had better check that this is not just a problem with that
 particular DHT11.
 <h2>Product quality</h2>
 It’s clear that the DHT11 I am testing is not behaving according to old
 or new specifications for the sampling interval.
 <p>
 Defects happen! No production line output is 100% perfect, so I may just
 have a defective or damaged chip. As this particular product is low
 cost, I have several boards on hand that I bought from different suppliers.
 I would be very unlucky if they all came from the same bad production
 batch or were all damaged by mishandling.
 <p>
 Testing the four DHT11 I have, I find that three of them are working in
 their precision range when sampled every five seconds. Understanding
 that humidity precision is ±5%RH and temperature precision is ±2℃, there
 is still room for quite some variation between readings from different
 devices.
 <p>
 I select the most accurate chip at my current environment humidity and
 temperature to do further tests.
 <h2>Voltage tolerance</h2>
 When it comes to measurement, precision is often related to the value of
 the reference voltage. I want to check the difference in measurement of
 the same chip when powered at 5V compared to 3.3V.
 <p>
 I need to use a 5V tolerant GPIO pin for this test, so I switch to GPIOA
 13 (SWDIO). By default that pin is configured as ALT SWDIO, floating
 input with weak pull-up, similar to the initial state for DHT11 Data IO
 pin.
 <p>
 The board needs to be powered by the 5V source of its USB connector instead
 of directly by the USB to Serial adapter. I make sure board and adapter are
 powered independently, the two being connected only by TX, RX and GND.
 I can then select the voltage of the DHT11 according to what I want to
 test, 3.3V or 5V.
 <p>
 There is a difference in measurement, 3.3V giving slightly higher values
 than 5V. For this particular test: 2%RH more for humidity and 0.3℃ for
 temperature. There is no clear advantage to using 5V over 3.3V.
 <p>
 I am not doing precise voltage test as the precision of DHT11 and the
 variation between the chips I have would make the interpretation of the
 results irrelevant.
 <h2>Temperature below 0℃</h2>
 The Chinese version of the datasheet gives the encoding for temperature
 below zero ℃ and a measurement range of -20~60℃.
 <p>
 I have implemented <code>dht11_read()</code> accordingly so I just need to
 test at below zero ℃.
 <p>
 From my test, I can see that the values reported are negative but I
 found a difference versus the datasheet.
 <p>
 According to the datasheet, the temperature values are encoded as 1 byte
 for the integer part and 1 byte for the one decimal digit fractional
 part, the highest bit of the fractional part indicating the sign.
 <p>
 So when temperature crosses zero, I expect to see
 <pre>
 0 + 2 => 0.2
 0 + 1 => 0.1
 0 + 0 => 0.0
 0 - 1 => -0.1
 0 - 2 => -0.2
 ...
 0 - 8 => -0.8
 0 - 9 => -0.9
 1 - 0 => -1.0
 1 - 1 => -1.1
 ...
 </pre>
 Instead the values transmitted are
 <pre>
 0 + 2 => 0.2
 0 + 1 => 0.1
 0 + 0 => 0.0
 0 - 9 => ???
 0 - 8 => ???
 ...
 0 - 2
 0 - 1
 0 - 0 => !!!
 1 - 9
 ...
 </pre>
 I have to modify my original implementation
 <pre>
    dht11_tempc = values[ 2] ;
    dht11_tempf = values[ 3] ;
    if( dht11_tempf & 0x80) {
        dht11_tempc *= -1 ;
        dht11_tempf &= 0x7F ;
    }
 </pre>
 And retest after the following modification.
 <pre>
    dht11_tempc = values[ 2] ;
    dht11_tempf = values[ 3] ;
    if( dht11_tempf & 0x80) {
        dht11_tempc *= -1 ;
        dht11_tempf = 10 - ( dht11_tempf & 0x7F) ;
        if( dht11_tempf == 10) {
            dht11_tempc -= 1 ;
            dht11_tempf = 0 ;
        }
    }
 </pre>
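 As <code>dht11_tempc</code> is an integer, the code above cannot carry the
 sign of temperatures between -0.9℃ and -0.1℃ (minus zero is just zero). One
 possible way around it, sketched below, is to work in tenths of a degree. The
 function <code>dht11_decicelsius()</code> is a hypothetical helper, and the
 decoding is an assumption based on the values observed on my chips, not on
 the datasheet.

```c
/* Decode DHT11 temperature bytes into deci-degrees Celsius (1 == 0.1 C).
 * Assumes the behaviour observed above: when bit 7 of the fractional byte
 * is set, the temperature is -(integer + (10 - fraction) / 10). */
static short dht11_decicelsius( unsigned char integer, unsigned char fraction) {
    if( fraction & 0x80)
        return -( integer * 10 + 10 - ( fraction & 0x7F)) ;

    return integer * 10 + fraction ;
}
```

 With this decoding the sequence transmitted around zero (0 + 1, 0 + 0, 0 - 9,
 ..., 0 - 0, 1 - 9) maps to 1, 0, -1, ..., -10, -11 deci℃.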
 <h2>Stability</h2>
 During my tests I didn’t notice big surges in measurements but the time
 to get to the actual value is quite long. The interval between readings
 affects the measurement and initially it takes a long time for the
 readings to converge to the actual temperature or humidity.
 <h2>Conclusions</h2>
 I didn’t find any English version of the latest version of the
 datasheet.
 <p>
 I found some differences between the Chinese datasheet and the behavior
 of the chips I have when it comes to the representation of temperature
 below 0℃.
 <p>
 Cost and one pin transmission interface are the two main advantages of
 this chipset.
 <p>
 I could use the DHT11 for monitoring temperature and humidity
 variations. For fast and accurate measurements, this is not the droid I
 am looking for.
 <h2>Checkpoint</h2>
 I did multiple checks and found several issues. I can secure stable
 readings by controlling the reading interval, which means I can tune the
 timing to suit a specific chip. There are variations in readings between
 chips and due to the loose precision it is hard to understand how good
 or how bad the measurements are.
 <p>
 <a href="33_ds18b20.html">Next</a>, I will use another digital thermometer as a
 reference.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/33_ds18b20.html
+++ b/docs/33_ds18b20.html
@ -0,0 +1,401 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>3.3 DS18B20 Digital Thermometer</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>3.3 DS18B20 Digital Thermometer</h1>
 The DS18B20 chip from <b>Maxim Integrated</b> is a digital thermometer able to
 do measurement from -55℃ to 125℃ with a precision of ±0.5℃ in the range
 -10℃ to 85℃.
 <h2>Hardware considerations</h2>
 The DS18B20 comes in several packages where only 3 pins are used: vcc,
 gnd and data io. It can be powered at 5V or 3.3V.
 <p>
 The data io line needs to be at high level when idle, so a pull-up
 resistor is necessary. The small DS18B20 board I use has a pull-up
 resistor between vcc and data io.
 <p>
 <img alt="DS18B20 Board" src="img/33_ds18b20.png">
 </p>
 It is possible to power the chip using data io and gnd only (no vcc) in
 Parasitic Power Mode if a two wire only interface is needed. I won’t use
 this feature for now.
 <h2>Communication protocol</h2>
 The data io line is a 1-Wire bus on which several 1-Wire devices can be
 connected. So there is a scheme to address multiple devices, but in the
 simple case where there is only the host and one device, commands can be
 broadcast without specifying addresses.
 <p>
 A typical transaction sequence goes like this
 <ul>
 <li> Initialization
 <li> ROM Command (followed by any required data exchange)
 <li> Function Command (followed by any required data exchange)
 </ul>
 The <b>initialization</b> is a simple timed handshake where the host
 triggers a response from the device by pulling the line LOW for 480µs,
 then waits for the device to assert it LOW to confirm its presence.
 <pre>
 static ds18b20_retv_t initialization() {
 /* Reset */
    output() ;          /* Wire LOW */
    usleep( 480) ;
    input() ;           /* Wire floating, HIGH by pull-up */
 /* Presence */
    int retries ;
    wait_level( HIGH) ; /* Pull-up LOW -> HIGH, T1 */
    wait_level( LOW) ;  /* DS18B20 asserts line to LOW, T2, T2 - T1 = 15~60us */
    wait_level( HIGH) ; /* DS18B20 releases lines, Pull-up LOW -> HIGH, T3
                        **  T3 - T2 = 60~240us */
    usleep( 405) ;      /* 480 = 405 + 15 + 60 */
    return DS18B20_SUCCESS ;
 }
 </pre>
 The <b>ROM Command</b> is how the host selects the device for communication.
 Writing a <b>Skip ROM</b> command addresses all devices connected.
 <p>
 The <b>Function Command</b> is the request to the device selected by the ROM
 Command:
 <ul>
 <li> Read the device memory
 <li> Write the device memory
 <li> Start a temperature conversion
 </ul>
 To write a command or data, the host sends a timed pulse for each bit. There
 is no acknowledgement from the device and no error detection.
 <pre>
 static void write( unsigned char uc) {
 /* Transmit byte, least significant bit first */
    for( unsigned char curbit = 1 ; curbit ; curbit <<= 1) {
    /* Transmit a bit takes 60us + 1us between transmit */
    /* Write 1: <15us LOW */
    /* Write 0:  60us LOW */
        unsigned t = uc & curbit ? 13 : 60 ;
        output() ;      /* Wire LOW */
        usleep( t) ;
        input() ;       /* Wire floating, HIGH by pull-up */
        usleep( 61 - t) ;
    }
 }
 </pre>
 When the host expects to read some data, it can trigger a 1 bit
 transmission from the device by first pulling the line LOW for 1µs then
 reading the state asserted by the device.
 <pre>
 static iolvl_t poll( void) {
    output() ;  /* Wire LOW */
    usleep( 1) ;
    input() ;   /* Wire floating, HIGH by pull-up */
    usleep( 5) ;
    iolvl_t bit = bread() ;
    usleep( 55) ;
    return bit ;
 }
 </pre>
 Integrity of the data transmitted by the device is checked with an 8 bit
 Cyclic Redundancy Check (CRC).
 <pre>
 static unsigned char read( unsigned char *p, int size) {
    unsigned char crc = 0 ;
    while( size--) {
    /* Receive byte, least significant bit first */
        unsigned char uc = 0 ;
        for( unsigned char curbit = 1 ; curbit ; curbit <<= 1) {
        /* read bit */
            int v = poll() ;
            if( v)
                uc |= curbit ;
        /* update CRC */
            v ^= crc ;
            crc >>= 1 ;
            if( v & 1)
                crc ^= 0x119 >> 1 ; /* reverse POLY = x^8 + x^5 + x^4 + 1 */
        }
    /* store byte */
        *p++ = uc ;
    }
    return crc ;
 }
 </pre>
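 The CRC update is easier to test when separated from the bit polling. The
 sketch below is a host-side version of the same bitwise algorithm working on
 a buffer instead of the wire; the name <code>crc8_maxim()</code> is mine and
 is not part of <b>ds18b20.c</b>. One property worth noting: as the initial
 value is 0, running the CRC over the data followed by its CRC byte yields 0,
 which is why a non-zero return value of <code>read()</code> over data plus
 CRC signals an error.

```c
/* Bitwise CRC-8 as used on 1-Wire: x^8 + x^5 + x^4 + 1, least significant
 * bit first, initial value 0. */
static unsigned char crc8_maxim( const unsigned char *p, int size) {
    unsigned char crc = 0 ;
    while( size--) {
        unsigned char uc = *p++ ;
        for( int i = 0 ; i < 8 ; i++) {
            unsigned char mix = (crc ^ uc) & 1 ;
            crc >>= 1 ;
            if( mix)
                crc ^= 0x8C ;   /* 0x119 >> 1, reversed polynomial */

            uc >>= 1 ;
        }
    }

    return crc ;
}
```

 <p>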
 Based on this, complex transaction sequences can be coded.
 <p>
 The transaction to read the eight byte scratchpad (device memory) plus
 CRC:
 <pre>
 static ds18b20_retv_t read_scratchpad( unsigned char scratchpad[]) {
    ds18b20_retv_t ret = initialization() ;
    if( ret != DS18B20_SUCCESS)
        return ret ;
    write( 0xCC) ;  /* Skip ROM */
    write( 0xBE) ;  /* Read Scratchpad */
    return read( scratchpad, 9) ? DS18B20_FAIL_CRC : DS18B20_SUCCESS ;
 }
 </pre>
 <h2>Temperature conversion and encoding</h2>
 The DS18B20 converts the measured temperature into a 12 bit signed
 value, 8 bit integer part and 4 bit fractional part. As the conversion
 time depends on the resolution, it is possible to select a resolution
 from 9 to 12 significant bits. The maximum conversion time ranges from
 93.75ms (9 bits) to 750ms (12 bits).
 <p>
 The host requests the conversion, waits for the conversion to end, then
 fetches the device memory to read the measurement.
 <p>
 The host can <code>poll()</code> the device to check if the conversion is
 finished.
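 <p>
 The maximum conversion time halves with each bit of resolution removed. A
 small sketch of that relation, for a host that prefers to sleep for a fixed
 time rather than poll; the function name
 <code>ds18b20_conversion_us()</code> is hypothetical.

```c
/* Maximum conversion time in µs for a resolution of 9 to 12 bits:
 * 750 ms at 12 bits, halved for each bit of resolution removed.
 * The caller must pass a value in the range 9..12. */
static unsigned ds18b20_conversion_us( unsigned res) {
    return 750000u >> (12 - res) ;
}
```

 The result is expressed in µs because the 9 bit value, 93.75ms, is not a
 whole number of milliseconds.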
 <h2>DS18B20 API</h2>
 I create the header file <b>ds18b20.h</b> with the following interface.
 <pre>
 /* ds18b20.h -- 1-Wire temperature sensor */
 typedef enum {
    DS18B20_SUCCESS,
    DS18B20_FAIL_TOUT,
    DS18B20_FAIL_CRC
 } ds18b20_retv_t ;
 void ds18b20_init( void) ;
 ds18b20_retv_t ds18b20_resolution( unsigned res) ;  /* 9..12 bits  */
 ds18b20_retv_t ds18b20_convert( void) ;
 ds18b20_retv_t ds18b20_fetch( short *deciCtemp) ;/* -550~1250 = -55.0~125.0 C */
 ds18b20_retv_t ds18b20_read( short *deciCtemp) ; /* -550~1250 = -55.0~125.0 C */
 </pre>
 Usage:
 <ul>
 <li> Initialization: <code>ds18b20_init()</code>, once at startup.
 <li> Use <code>ds18b20_resolution()</code> to select the resolution. It can be
 done before starting a conversion. The value will be kept until the ds18b20
 is powered down.
 <li> To measure the temperature, use <code>ds18b20_read()</code>, which will
 start a conversion, wait until it finishes, fetch the value from the device
 memory and convert it to deci℃ (1 deci℃ = 0.1 ℃).
 <li> Alternatively, to avoid the blocking <code>ds18b20_read()</code>, call
 <code>ds18b20_convert()</code> followed by <code>ds18b20_fetch()</code> once
 enough time has elapsed to complete the conversion.
 </ul>
 Below is an application to print the temperature every second.
 <pre>
 /* ds18b20main.c -- sample temperature using 1-Wire temperature sensor */
 #include &lt;stdio.h>
 #include "system.h"     /* uptime */
 #include "ds18b20.h"    /* ds18b20_() */
 int main( void) {
    unsigned last = 0 ;
    ds18b20_init() ;
    ds18b20_resolution( 12) ;   /* Set highest resolution: 12 bits */
    ds18b20_convert() ;         /* start temperature conversion */
    for( ;;)
        if( last == uptime)
            yield() ;
        else {
            short val ;
            last = uptime ;
            switch( ds18b20_fetch( &val)) {
            case DS18B20_SUCCESS:
                printf( "%i.%i\n", val / 10, val % 10) ;
                break ;
            case DS18B20_FAIL_TOUT:
                puts( "Timeout") ;
                break ;
            case DS18B20_FAIL_CRC:
                puts( "CRC Error") ;
            }
            ds18b20_convert() ; /* start temperature conversion */
        }
 }
 </pre>
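 In C, both <code>/</code> and <code>%</code> truncate toward zero, so for a
 negative <code>val</code> the remainder is negative too: -105 gives -10 and
 -5, and -5 gives 0 and -5, which loses the sign of the integer part. A
 possible way to format signed deci℃ is sketched below with the standard
 <code>snprintf()</code> on a host. The helper <code>format_deciC()</code> is a
 hypothetical name, and the sketch assumes a printf family that supports
 <code>%s</code>.

```c
#include <stdio.h>

/* Format a temperature given in deci-degrees Celsius, e.g. -105 -> "-10.5".
 * The sign is printed separately so values between -0.9 and -0.1 keep it. */
static int format_deciC( char *buf, size_t size, short val) {
    int v = val ;
    const char *sign = "" ;
    if( v < 0) {
        sign = "-" ;
        v = -v ;
    }

    return snprintf( buf, size, "%s%i.%i", sign, v / 10, v % 10) ;
}
```
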
 <h2>DS18B20 API implementation</h2>
 I create <b>ds18b20.c</b>, starting with the GPIO mapping and initialization.
 <pre>
 /* ds18b20.c -- 1-Wire digital thermometer */
 #include "ds18b20.h"    /* implements DS18B20 API */
 #include "system.h"     /* gpioa_(), usleep() */
 #define DIO 13
 #define input()     gpioa_input( DIO)
 #define output()    gpioa_output( DIO)
 #define bread()     gpioa_read( DIO)
 #define MAX_RETRIES 999
 #define wait_level( lvl) \
    retries = MAX_RETRIES ; \
    while( bread() != lvl) \
        if( retries-- == 0) \
            return DS18B20_FAIL_TOUT
 void ds18b20_init( void) {
    input() ;           /* Wire floating, HIGH by pull-up */
 }
 </pre>
 I add the local functions that are the building blocks for the
 transactions (<code>initialization()</code>, <code>write()</code>,
 <code>poll()</code> and <code>read()</code>) and
 the <code>read_scratchpad()</code> transaction I explained before.
 <p>
 Start conversion transaction:
 <pre>
 ds18b20_retv_t ds18b20_convert( void) {
    ds18b20_retv_t ret ;
    ret = initialization() ;
    if( ret != DS18B20_SUCCESS)
        return ret ;
    write( 0xCC) ;  /* Skip ROM */
    write( 0x44) ;  /* Convert T */
    return DS18B20_SUCCESS ;
 }
 </pre>
 Fetch temperature, to be called after conversion is done.
 <pre>
 ds18b20_retv_t ds18b20_fetch( short *deciCtemp) { /* -550~1250 = -55.0~125.0 C */
    ds18b20_retv_t ret ;
    unsigned char vals[ 9] ;    /* scratchpad */
    ret = read_scratchpad( vals) ;
    if( ret != DS18B20_SUCCESS)
        return ret ;
    *deciCtemp = *((short *) vals) * 10 / 16 ;
    return DS18B20_SUCCESS ;
 }
 </pre>
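 The cast in <code>ds18b20_fetch()</code> reads the first two scratchpad bytes
 as a little-endian <code>short</code>. A more portable variant, sketched here
 as an alternative rather than as the code I use, assembles the value from the
 two bytes; <code>ds18b20_decode()</code> is a hypothetical name. The raw
 value counts sixteenths of a degree, so multiplying by 10 and dividing by 16
 gives deci℃, truncated toward zero.

```c
#include <stdint.h>

/* Convert scratchpad bytes 0 (LSB) and 1 (MSB) into deci-degrees Celsius.
 * The raw reading is a 16 bit two's complement count of 1/16 C. The cast to
 * int16_t assumes the usual two's complement wrap-around (as done by GCC). */
static short ds18b20_decode( unsigned char lsb, unsigned char msb) {
    int16_t raw = (int16_t) (uint16_t) ((msb << 8) | lsb) ;
    return (short) (raw * 10 / 16) ;
}
```

 The function can be checked on a host against the sample values of the
 DS18B20 datasheet, for example 0x07D0 for 125℃ and 0xFC90 for -55℃.
 <p>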
 Blocking temperature read, which polls the device for end of conversion.
 <pre>
 ds18b20_retv_t ds18b20_read( short *deciCtemp) { /* -550~1250 = -55.0~125.0 C */
    ds18b20_retv_t ret ;
    ret = ds18b20_convert() ;
    if( ret != DS18B20_SUCCESS)
        return ret ;
    do
        usleep( 4000) ;
    while( poll() == LOW) ; /* up to 93.75ms for 9 bits, 750ms for 12 bits */
    return ds18b20_fetch( deciCtemp) ;
 }
 </pre>
 Set resolution.
 <pre>
 ds18b20_retv_t ds18b20_resolution( unsigned res) {  /* 9..12 bits  */
    ds18b20_retv_t ret ;
    unsigned char vals[ 9] ;    /* scratchpad */
    unsigned char curres ;
 /* read scratchpad */
    ret = read_scratchpad( vals) ;
    if( ret != DS18B20_SUCCESS)
        return ret ;
 /* update resolution if current value is different than requested */
    res = (res - 9) & 3 ;
    curres = vals[ 4] >> 5 ;
    if( curres != res) {
        vals[ 4] = (vals[ 4] & 0x1F) | (res << 5) ;
        ret = initialization() ;
        if( ret != DS18B20_SUCCESS)
            return ret ;
        write( 0xCC) ;  /* Skip ROM */
        write( 0x4E) ;  /* Write Scratchpad */
        write( vals[ 2]) ;
        write( vals[ 3]) ;
        write( vals[ 4]) ;
    }
    return DS18B20_SUCCESS ;
 }
 </pre>
 There is no error check when writing to the device, so it would make
 sense to read back the device memory after the set to make sure there
 was no error when writing in the first place.
 <h2>Build and test</h2>
 I add the new composition to Makefile.
 <pre>SRCS = startup.txeie.c gpioa.c ds18b20main.c ds18b20.c</pre>
 Build completes successfully.
 <pre>
 $ make
 f030f4.elf from startup.txeie.o gpioa.o ds18b20main.o ds18b20.o
   text    data     bss     dec     hex filename
   2530       0      16    2546     9f2 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 Flashing the board and starting execution, I can see a new output every
 second.
 <p>
 <img alt="DS18B20 output" src="img/33_output.png">
 <h2>Checkpoint</h2>
 <a href="34_adcvnt.html">Next</a>, I will read the internal Voltage and
 Temperature sensors using Analog to Digital Conversion (<b>ADC</b>).
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/34_adcvnt.html
+++ b/docs/34_adcvnt.html
@ -0,0 +1,371 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>3.4 Internal Voltage and Temperature</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>3.4 Internal Voltage and Temperature</h1>
 The STM32 chipsets have an internal temperature sensor and a voltage
 reference connected to two channels of the Analog to Digital Converter
 (ADC) peripheral. I will take some readings to see how they compare to
 other temperature sensors.
 <p>
 The STM32F030 family has a suffix 6 (STM32F030F4P<b>6</b>) for the ambient
 operating temperature: -40℃ to 85℃. It should be fine to use not only
 inside rooms but also outdoors (maybe not on the dashboard of a car
 parked under direct tropical sunlight at noon in summer).
 <h2>Analog to Digital Converter</h2>
 The STM32 ADC is very versatile which also means it has many options. I
 am looking for a minimal setup to read both voltage and temperature by
 issuing an ADC conversion command for each value I need to read.
 <p>
 The initialization steps:
 <ul>
 <li> Enable ADC peripheral.
 <li> Select a sampling clock.
 <li> Recalibrate after a change of sampling clock.
 <li> Enable command mode.
 <li> Select Voltage reference and Temperature sensor as inputs.
 <li> Configure acquisition direction and mode.
 </ul>
 <pre>
 static void adc_init( void) {
 /* Enable ADC peripheral */
    RCC_APB2ENR |= RCC_APB2ENR_ADCEN ;
 /* Setup ADC sampling clock */
 #ifdef HSI14
    RCC_CR2 |= RCC_CR2_HSI14ON ;                    /* Start HSI14 clock */
    do {} while( !( RCC_CR2 & RCC_CR2_HSI14RDY)) ;  /* Wait for stable clock */
 /* Select HSI14 as sampling clock for ADC */
 //  ADC_CFGR2 &= ~ADC_CFGR2_CKMODE ;    /* Default 00 == HSI14 */
 #else
 /* Select PCLK/2 as sampling clock for ADC */
    ADC_CFGR2 |= ADC_CFGR2_PCLK2 ;          /* 01 PCLK/2 Over default 00 */
 //  ADC_CFGR2 |= ADC_CFGR2_PCLK4 ;          /* 10 PCLK/4 Over default 00 */
 #endif
 /* Calibration */
    ADC_CR |= ADC_CR_ADCAL ;
    do {} while( ADC_CR & ADC_CR_ADCAL) ;   /* Wait end of calibration */
 /* Enable Command (below Work Around from Errata necessary with PCLK/4) */
    do {
        ADC_CR |= ADC_CR_ADEN ;
    } while( !( ADC_ISR & ADC_ISR_ADRDY)) ;
 /* Select inputs and precision */
    ADC_CHSELR = 3 << 16 ;  /* Channel 16: temperature, Channel 17: Vrefint */
    ADC_CCR |= ADC_CCR_TSEN | ADC_CCR_VREFEN ;
    ADC_SMPR = 7 ;
 /* Select acquisition direction and mode */
 /* Default scan direction (00) is Temperature before Voltage */
 //  ADC_CFGR1 &= ~ADC_CFGR1_SCANDIR ;   /* Default 0 is low to high */
    ADC_CFGR1 |= ADC_CFGR1_DISCEN ;     /* Enable Discontinuous mode */
 }
 </pre>
 The ADC characteristics in the STM32F030 datasheet state that the
 maximum ADC sampling clock is 14MHz. It is possible to select either an
 internal 14 MHz clock (HSI14) or PCLK divided by 2 or 4. I want to check
 how the clock affects the readings (HSI14, 28/2, 24/2, 48/4). At first I
 couldn’t manage to make PCLK/4 work until I found the note on the ADC
 calibration workaround in the errata
 <a href="https://www.st.com/content/st_com/en/search.html#q=%20ES0219-t=resources-page=1"
 >ES0219 2.5.3</a>.
 <p>
 After this initialization, I can trigger an ADC conversion by issuing a
 command, wait until completion and fetch the result in the ADC Data
 Register.
 <pre>
 static unsigned adc_convert( void) {
 /* Either only one channel in sequence or Discontinuous mode ON */
    ADC_CR |= ADC_CR_ADSTART ;              /* Start ADC conversion */
    do {} while( ADC_CR & ADC_CR_ADSTART) ; /* Wait for start command cleared */
    return ADC_DR ;
 }
 </pre>
 There are two values to fetch, temperature and voltage in that order.
 <pre>
 temperature = adc_convert() ;
 voltage = adc_convert() ;
 </pre>
 The values resulting from an ADC conversion are raw. At the 12 bit
 resolution I am using, each conversion gives me a value between 0
 and 4095.
 <h2>Factory Calibration</h2>
 The raw data readings are relative to the analog operative voltage VDDA.
 <p>
 <b>Vmeasured = VDDA * VRAW / 4095</b>
 <p>
 This is why I measure both voltage and temperature. Ideally, VDDA should
 be 3.3V, in practice it’s highly dependent on the power supply.
 <p>
 Every chip is calibrated at the factory during production: ADC conversions
 are done at 3.3V and 30℃ and the resulting values are stored in system
 memory.
 <p>
 <img alt="VREFINT_CAL" src="img/34_vrefint.png">
 <p>
 <img alt="TS_CAL1" src="img/34_tscal1.png">
 <pre>
 /* STM32F030 calibration addresses (at 3.3V and 30C) */
 #define TS_CAL1                 ((unsigned short *) 0x1FFFF7B8)
 #define VREFINT_CAL             ((unsigned short *) 0x1FFFF7BA)
 </pre>
 VREFINT is the embedded reference voltage measured on channel 17 of the
 ADC. According to the STM32F030 datasheet it is in the range 1.2V to
 1.25V. I have a chip whose VREFINT calibration value is 1526, which means
 its embedded reference voltage VREFINT is 3.3 * 1526 / 4095, about 1.23V,
 close to the middle of the range and equal to the given typical value.
 <p>
 I can read VREFINT raw data after ADC conversion and I know its factory
 calibrated value, so I can calculate VDDA.
 <p>
 <b>VREFINT = 3.3 * VCAL / 4095 = VDDA * VRAW / 4095</b>
 <p>
 <b>VDDA = 3.3 * VCAL / VRAW</b>
 <p>
 On my chip whose VCAL is 1526, if I measure VRAW at 1534, this will put
 VDDA at 3.28V, less than 1% off the ideal 3.3V voltage.
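 <p>
 The same calculations in integer arithmetic, as a sketch: the helper names
 <code>vdda_centivolt()</code> and <code>vrefint_millivolt()</code> are
 hypothetical, and 1526 and 1534 are the example values used above.

```c
/* VDDA in centiV (330 == 3.3V) from the factory calibration value and the
 * raw ADC reading of VREFINT: VDDA = 3.3 * VCAL / VRAW */
static int vdda_centivolt( int vcal, int vraw) {
    return 330 * vcal / vraw ;
}

/* VREFINT in mV from its calibration value taken at VDDA = 3.3V */
static int vrefint_millivolt( int vcal) {
    return 3300 * vcal / 4095 ;
}
```

 With VCAL at 1526 and VRAW at 1534, the first function returns 328 (3.28V)
 and the second 1229 (1.229V after truncation).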
 <p>
 There is only one temperature calibration value stored for STM32F030
 chips. You need two reference values to be able to calculate the
 temperature reliably, either the ADC readings taken at two temperatures
 (30℃ and 110℃ for other STM32 families) or one ADC reading plus the
 value of the temperature slope characteristic for that particular chip.
 <p>
 The STM32F030 datasheet gives the range 4~4.6 mV/℃ and a typical value
 of 4.3 mV/℃ for the average slope. The reference manual uses 4.3 mV/℃ in
 the temperature calculation code example.
 <p>
 When it comes to temperature measurement using the temperature sensor of
 STM32F030, you need to do your own two point calibration.
 <h2>Voltage and Temperature Conversion API</h2>
 I need to be able to
 <ul>
 <li> Initialize the ADC for voltage and temperature conversion and fetch
  the calibration values.
 <li> Fetch the calibration values.
 <li> Convert and read the raw ADC conversion values.
 <li> Convert and read the calculated voltage and temperature.
 </ul>
 I add the following API to <b>system.h</b>
 <pre>
 typedef enum {
    VNT_INIT,
    VNT_CAL,
    VNT_RAW,
    VNT_VNC
 } vnt_cmd_t ;
 void adc_vnt( vnt_cmd_t cmd, short *ptrV, short *ptrC) ;
 </pre>
 The voltage return value is in centiV (330 == 3.3V), the temperature
 return value is in deci℃ (300 == 30℃).
 <p>
 I make a copy of <b>gpioa.c</b> into <b>adc.c</b> adding the code I just
 explained for <code>adc_init()</code>, <code>adc_convert()</code>, the
 calibration value addresses and the following implementation of
 <code>adc_vnt()</code>.
 <pre>
 void adc_vnt( vnt_cmd_t cmd, short *ptrV, short *ptrC) {
    if( cmd == VNT_INIT)
        adc_init() ;
    if( cmd <= VNT_CAL) {
    /* Calibration Values */
        *ptrV = *VREFINT_CAL ;
        *ptrC = *TS_CAL1 ;
        return ;
    }
 /* ADC Conversion */
    *ptrC = adc_convert() ;
    *ptrV = adc_convert() ;
    if( cmd == VNT_VNC) {
        *ptrC = 300 + (*TS_CAL1 - *ptrC * *VREFINT_CAL / *ptrV) * 10000 / 5336 ;
        *ptrV = 330 * *VREFINT_CAL / *ptrV ;
    }
 }
 </pre>
 The calculation for the temperature is based on the code example from
 the reference manual (RM0360 A.7.16).
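 <p>
 The constant 5336 comes from the typical slope: 4.3 mV/℃ at VDDA = 3.3V is
 4.3 * 4095 / 3300, about 5.336 ADC counts per ℃. The sketch below isolates
 the calculation so it can be tried on a host;
 <code>temperature_deciC()</code> is a hypothetical name and the TS_CAL1
 value of 1760 used in the note is made up for illustration.

```c
/* Temperature in deci-degrees Celsius (300 == 30 C).
 * ts_cal1, vcal: factory calibration values taken at 3.3V and 30 C
 * rawc, rawv:    raw ADC readings of temperature sensor and VREFINT
 * The raw temperature is first scaled to what it would read at 3.3V,
 * then the typical slope of 5.336 counts per C is applied. */
static int temperature_deciC( int ts_cal1, int vcal, int rawc, int rawv) {
    return 300 + (ts_cal1 - rawc * vcal / rawv) * 10000 / 5336 ;
}
```

 With TS_CAL1 at 1760 and VDDA at exactly 3.3V (raw VREFINT equal to its
 calibration value), a raw temperature reading of 1707 is 53 counts below the
 calibration point, which gives 399, or 39.9℃.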
 <p>
 The only thing missing is the description of the newly used registers
 and bitfields.
 <p>
 ● The extra RCC bitfields and register for enabling the ADC peripheral and activating HSI14 clock.
 <pre>
 #define RCC_APB2ENR_ADCEN       0x00000200  /*  9: ADC clock enable */
 #define RCC_CR2                 RCC[ 13]
 #define RCC_CR2_HSI14ON         0x00000001  /*  0: HSI14 clock enable */
 #define RCC_CR2_HSI14RDY        0x00000002  /*  1: HSI14 clock ready */
 </pre>
 ● The ADC registers and bitfields.
 <pre>
 #define ADC                     ((volatile long *) 0x40012400)
 #define ADC_ISR                 ADC[ 0]
 #define ADC_ISR_ADRDY           1   /* 0: ADC Ready */
 #define ADC_ISR_EOC             4   /* 2: End Of Conversion flag */
 #define ADC_CR                  ADC[ 2]
 #define ADC_CR_ADEN             1   /* 0: ADc ENable command */
 #define ADC_CR_ADSTART          4   /* 2: ADC Start Conversion command */
 #define ADC_CR_ADCAL            (1 << 31)   /* 31: ADC Start Calibration cmd */
 #define ADC_CFGR1               ADC[ 3]     /* Configuration Register 1 */
 #define ADC_CFGR1_SCANDIR       4           /*  2: Scan sequence direction */
 #define ADC_CFGR1_DISCEN        (1 << 16)   /* 16: Enable Discontinuous mode */
 #define ADC_CFGR2               ADC[ 4]     /* Configuration Register 2 */
 #define ADC_CFGR2_CKMODE        (3 << 30)   /* 31-30: Clock Mode Mask   */
                                            /* 31-30: Default 00 HSI14  */
 #define ADC_CFGR2_PCLK2         (1 << 30)   /* 31-30: PCLK/2 */
 #define ADC_CFGR2_PCLK4         (2 << 30)   /* 31-30: PCLK/4 */
 #define ADC_SMPR                ADC[ 5]     /* Sampling Time Register */
 #define ADC_CHSELR              ADC[ 10]    /* Channel Selection Register */
 #define ADC_DR                  ADC[ 16]    /* Data Register */
 #define ADC_CCR                 ADC[ 194]   /* Common Configuration Register */
 #define ADC_CCR_VREFEN          (1 << 22)   /* 22: Vrefint Enable */
 #define ADC_CCR_TSEN            (1 << 23)   /* 23: Temperature Sensor Enable */
 </pre>
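 <p>
 Each register macro indexes an array of 32 bit words from the peripheral base
 address, so the array index is the byte offset of the register divided by 4.
 The number that leads each bitfield comment is the bit position of the mask.
 Below is a minimal host-side sketch to cross-check both conventions. The
 helpers <code>reg_index()</code> and <code>bit_mask()</code> are hypothetical
 names that are not part of the project, and the offsets mentioned in the note
 that follows are my reading of the reference manual, to be verified there.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: array index of a 32 bit register, given its byte
   offset from the peripheral base address. */
unsigned reg_index( uint32_t byte_offset) {
    assert( byte_offset % 4 == 0) ;     /* registers are word aligned */
    return byte_offset / 4 ;
}

/* Hypothetical helper: mask of a single bit from its position, so the
   number leading a bitfield comment can be checked against the value. */
uint32_t bit_mask( unsigned position) {
    assert( position < 32) ;
    return (uint32_t) 1 << position ;
}
```

 <p>
 With an offset of 0x308 for the common configuration register,
 <code>reg_index( 0x308)</code> gives the index 194 used by
 <code>ADC_CCR</code>, and <code>bit_mask( 9)</code> gives the 0x00000200 of
 <code>RCC_APB2ENR_ADCEN</code>. On the target, <code>long</code> is 32 bits
 wide, which is why indexing a <code>volatile long *</code> steps by 4 bytes.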
 <h2>Application</h2>
 I create <b>adcmain.c</b> to take readings every second.
 <pre>
 /* adcmain.c -- ADC reading of reference voltage and temperature sensor */
 #include &lt;stdio.h>
 #include "system.h"
 #define RAW
 int main( void) {
    unsigned last = 0 ;
    short calV, calC ;
 /* Initialize ADC and fetch calibration values */
    adc_vnt( VNT_INIT, &calV, &calC) ;
 #ifdef RAW
    printf( "%u, %u\n", calV, calC) ;
 #endif
    for( ;;)
        if( uptime == last)
            yield() ;
        else {
            short Vsample, Csample ;
            last = uptime ;
 #ifdef RAW
            adc_vnt( VNT_RAW, &Vsample, &Csample) ;
            printf( "%i, %i, %i, %i, ", calV, Vsample, calC, Csample) ;
            Csample = 300 + (calC - (int) Csample * calV / Vsample)
                                                                * 10000 / 5336 ;
            Vsample = 330 * calV / Vsample ;
 #else
            adc_vnt( VNT_VNC, &Vsample, &Csample) ;
 #endif
            printf( "%i.%02i, %i.%i\n", Vsample / 100, Vsample % 100,
                                                Csample / 10, Csample % 10) ;
        }
 }
 </pre>
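 <p>
 The RAW branch works in integers only: the voltage is kept in hundredths of
 a volt and the temperature in tenths of a degree. The sketch below isolates
 the two expressions so they can be tried on a host. The function names are
 hypothetical, and the readings used in the note after the code are only
 illustrative values.

```c
#include <assert.h>

/* Supply voltage in hundredths of a volt, from the factory Vrefint
   calibration (calV) and the current raw Vrefint reading (Vraw). */
int vdda_centivolts( int calV, int Vraw) {
    assert( Vraw > 0) ;
    return 330 * calV / Vraw ;
}

/* Temperature in tenths of a degree, assuming a single calibration point
   at 30.0 degrees and a slope of 5.336 ADC steps per degree.
   Multiplications come before divisions to limit truncation errors. */
int temp_decidegrees( int calV, int calC, int Vraw, int Craw) {
    assert( Vraw > 0) ;
    return 300 + (calC - Craw * calV / Vraw) * 10000 / 5336 ;
}
```

 <p>
 With calibration values 1526 and 1721 and raw readings 1538 and 1718, these
 expressions give 327 (3.27 V) and 331 (33.1 ℃). Each integer division
 truncates, so the result can be slightly lower than a floating point
 evaluation of the same formula.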
 <h2>Build and test</h2>
 I want to test using maximum ADC clock at 14 MHz provided by PLLHSI/2,
 as I am using a board with no external quartz.
 <pre>
 /* No quartz, configure PLL at 28MHz */
 //#define HSE     8000000
 #define PLL     7
 #define BAUD    9600
 //#define HSI14 1
 </pre>
 I add the composition in <b>Makefile</b>
 <pre>SRCS = startup.txeie.c adc.c adcmain.c</pre>
 The build gives some warnings, as 28MHz is not a perfect match for a
 baudrate of 9600 or for the current implementation of <code>usleep()</code>.
 This will not affect my application.
 <pre>
 $ make
 adc.c:155:3: warning: #warning baud rate not accurate at that clock frequency [-
 Wcpp]
  155 | # warning baud rate not accurate at that clock frequency
      |   ^~~~~~~
 adc.c: In function 'usleep':
 adc.c:232:3: warning: #warning HCLK is not multiple of 8 MHz [-Wcpp]
  232 | # warning HCLK is not multiple of 8 MHz
      |   ^~~~~~~
 f030f4.elf from startup.txeie.o adc.o adcmain.o
   text    data     bss     dec     hex filename
   2464       0      16    2480     9b0 f030f4.elf
 f030f4.hex
 f030f4.bin
 </pre>
 Flashing the board and starting execution, I can see the results of the
 ADC conversion and the calculated values.
 <p>
 <img alt="Raw ADC conversion readings" src ="img/34_output.png">
 </p>
 The temperature readings are roughly 5℃ higher than room temperature.
 <h2>Checkpoint</h2>
 I have set up a simple ADC configuration to read the voltage reference and
 the temperature sensor.
 <p>
 Using the factory calibration data I can convert the raw ADC measurement
 into actual Vref. This will help in adjusting the ADC readings. But for
 temperature, the provided calibration is insufficient: there is only one
 point measured in factory for the STM32F030 family members.
 <p>
 <a href="35_calibrate.html">Next</a>, I will do temperature calibration, which
 means taking two measurements as far apart as possible in the working range I
 want to use.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/35_calibrate.html
+++ b/docs/35_calibrate.html
@ -0,0 +1,229 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>3.5 Internal Temperature Sensor Calibration</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>3.5 Internal Temperature Sensor Calibration</h1>
 When it comes to temperature measurement, the documented factory
 calibration of STM32F030 family members does not provide enough
 information to make a valid estimation. The datasheet tells us that the
 value stored is the raw data of ADC conversion of the temperature sensor
 reading at 3.3V (±10mV) and 30℃ (±5℃).
 <p>
 <img alt="TS_CAL1" src="img/34_tscal1.png">
 <p>
 Only one calibration point is documented and its reference
 temperature is known with a precision of ±5℃. That’s not enough to
 calculate temperature but it shows that the sensor was tested in
 production.
 <p>
 Notice that the calibration value name is <b>TS_CAL1</b>: some other STM32
 chipset families do have a second temperature factory calibration point
 <b>TS_CAL2</b>, and some don’t have any factory calibration stored at all.
 So you have to refer to the datasheet that matches the chip you are
 targeting when you port your code to a new chipset.
 <p>
 <img alt="Temperature Sensor Characteristics" src="img/35_tschar.png">
 <p>
 The sensor linearity with temperature is at worst ±2℃. So I am curious
 to see how it performs over a range of temperature.
 <p>
 If you are so lucky that you have picked a STM32F030 chip that has the
 typical average slope of <b>4.3 mV/℃</b> and was calibrated in factory at
 exactly <b>3.3V</b> and <b>30℃</b>, then you could use the following formula
 to calculate the temperature.
 <p>
 <b>T = 30 + (TS_CAL1 * 3.3 – TS_RAW * VDDA) / 4095 / 0.0043</b>
 <p>
 with V_CAL the factory value <b>VREFINT_CAL</b> and V_RAW the raw ADC reading
 of the internal reference
 <p>
 <b>VDDA = 3.3 * V_CAL / V_RAW</b>
 <p>
 that gives
 <p>
 <b>T = 30 + 3.3 * (TS_CAL1 – TS_RAW * V_CAL / V_RAW) / 4095 / 0.0043</b>
 <p>
 If I express the average slope in raw ADC units per ℃ instead of mV/℃
 <p>
 <b>5.336 = 4095 * 0.0043 / 3.3</b>
 <p>
 the final formula is
 <p>
 <b>T = 30 + (TS_CAL1 – TS_RAW * V_CAL / V_RAW) * 1000 / 5336</b>
 <p>
 which matches the sample code for temperature computation given in the
 reference manual (RM0360 A.7.16).
 <pre>
 /* Temperature sensor calibration value address */
 #define TEMP30_CAL_ADDR ((uint16_t*) ((uint32_t) 0x1FFFF7B8))
 #define VDD_CALIB ((uint32_t) (3300))
 #define VDD_APPLI ((uint32_t) (3000))
 #define AVG_SLOPE ((uint32_t) (5336)) /* AVG_SLOPE in ADC conversion step
                                      (@3.3V)/°C multiplied by 1000 for
                                      precision on the division */
 int32_t temperature; /* will contain the temperature in degrees Celsius */
 temperature = ((uint32_t) *TEMP30_CAL_ADDR
            - ((uint32_t) ADC1->DR * VDD_APPLI / VDD_CALIB)) * 1000;
 temperature = (temperature / AVG_SLOPE) + 30;
 </pre>
 If I use the raw ADC readings from my last run
 <p>
 <b>VDDA = 3.3 * 1526 / 1538 = 3.274V</b>
 <p>
 <b>t = 30 + (1721 – 1718 * 1526 / 1538) * 1000 / 5336 = 33.07℃</b>
 <p>
 I confirm the voltage with a voltmeter (measured 3.282V versus 3.274V
 computed). The computed internal temperature value is roughly 5℃ higher
 than the room temperature.
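 <p>
 The two computations above can be replayed on a host with floating point
 arithmetic. This is a minimal sketch under the same assumptions as the
 formula (typical slope of 4.3 mV/℃, calibration at exactly 3.3V and 30℃).
 The function names are hypothetical.

```c
#include <assert.h>

/* Supply voltage in volts from the factory Vrefint calibration value
   (v_cal) and the current raw Vrefint reading (v_raw). */
double vdda_volts( double v_cal, double v_raw) {
    assert( v_raw > 0) ;
    return 3.3 * v_cal / v_raw ;
}

/* Temperature in degrees Celsius from the single factory calibration
   point, assuming the typical slope holds for this particular chip. */
double temp_typical_slope( double ts_cal1, double ts_raw,
                           double v_cal, double v_raw) {
    const double slope = 4095 * 0.0043 / 3.3 ;  /* about 5.336 steps per ℃ */
    assert( v_raw > 0) ;
    return 30 + (ts_cal1 - ts_raw * v_cal / v_raw) / slope ;
}
```

 <p>
 Calling <code>vdda_volts( 1526, 1538)</code> and
 <code>temp_typical_slope( 1721, 1718, 1526, 1538)</code> reproduces the
 3.274V and 33.07℃ figures to the precision shown.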
 <h2>Undocumented calibration data</h2>
 For the STM32F030Fx, the 5KB space before the RAM contains the System
 Memory (3KB) and the option bytes.
 <pre>
 | Content       | Start Address | Size |
 |---------------|---------------|------|
 | System Memory | 0x1FFFEC00    | 3KB  |
 | Option Bytes  | 0x1FFFF800    | 2KB  |
 | RAM Memory    | 0x20000000    | 4KB  |
 </pre>
 The calibration data are saved in the last 96 bytes of the System
 Memory, starting at address 0x1FFFF7A0. So it’s simple to dump the
 content of that zone and compare the values for multiple chips.
 <pre>
 $ stm32flash -r - -S 0x1FFFF7a0:96 COM3 2>/dev/null | hexdump -C
 00000000  ff ff ff ff 31 00 10 00  ff ff ff ff 1c 00 3a 00  |....1.........:.|
 00000010  12 57 34 41 38 32 30 20  b9 06 f6 05 f0 ff ff ff  |.W4A820 ........|
 00000020  ff ff 11 05 ff ff ff ff  fc ff ff ff 10 00 ff ff  |................|
 00000030  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
 00000040  ff ff ff ff ff ff ff ff  f3 ff ff ff ff ff ff ff  |................|
 00000050  ff ff ff ff ff ff ff ff  68 97 52 ad 3b c4 3f c0  |........h.R.;.?.|
 00000060
 </pre>
 <pre>
 | Location   | Content         | Size | F030 | F0x1/2/8 |
 |------------|-----------------|------|------|----------|
 | 0x1FFFF7AC | Unique ID       | 12   |      | √        |
 | 0x1FFFF7B8 | TS_CAL1         | 2    | √    | √        |
 | 0x1FFFF7BA | VREFINT_CAL     | 2    | √    | √        |
 | 0x1FFFF7C2 | TS_CAL2         | 2    |      | √        |
 | 0x1FFFF7CC | Flash size (KB) | 2    | √    | √        |
 </pre>
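 <p>
 The locations in the table can be checked against the dump by reading the
 16 bit values in little endian order. The sketch below embeds the first 48
 bytes of the dump shown earlier. <code>CAL_BASE</code>,
 <code>cal_dump</code> and <code>cal_half()</code> are hypothetical names
 used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAL_BASE 0x1FFFF7A0u    /* chip address of the first dumped byte */

/* First three rows (48 bytes) of the hexdump output */
const uint8_t cal_dump[ 48] = {
    0xff, 0xff, 0xff, 0xff, 0x31, 0x00, 0x10, 0x00,
    0xff, 0xff, 0xff, 0xff, 0x1c, 0x00, 0x3a, 0x00,
    0x12, 0x57, 0x34, 0x41, 0x38, 0x32, 0x30, 0x20,
    0xb9, 0x06, 0xf6, 0x05, 0xf0, 0xff, 0xff, 0xff,
    0xff, 0xff, 0x11, 0x05, 0xff, 0xff, 0xff, 0xff,
    0xfc, 0xff, 0xff, 0xff, 0x10, 0x00, 0xff, 0xff
} ;

/* Read the little endian 16 bit value stored at a chip address that
   falls inside the dumped zone. */
unsigned cal_half( uint32_t address) {
    assert( address >= CAL_BASE) ;
    size_t off = address - CAL_BASE ;
    assert( off + 1 < sizeof cal_dump) ;
    return cal_dump[ off] | ((unsigned) cal_dump[ off + 1] << 8) ;
}
```

 <p>
 For this chip, <code>cal_half( 0x1FFFF7B8)</code> returns 1721 (TS_CAL1),
 <code>cal_half( 0x1FFFF7BA)</code> 1526 (VREFINT_CAL),
 <code>cal_half( 0x1FFFF7C2)</code> 1297 (TS_CAL2) and
 <code>cal_half( 0x1FFFF7CC)</code> 16 (flash size in KB).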
 This layout is the same as the one documented in <b>RM0091 <i>Reference Manual
 STM32F0x1/STM32F0x2/STM32F0x8</i></b> which includes the following sample code
 for temperature computation.
 <pre>
 /* Temperature sensor calibration value address */
 #define TEMP110_CAL_ADDR ((uint16_t*) ((uint32_t) 0x1FFFF7C2))
 #define TEMP30_CAL_ADDR ((uint16_t*) ((uint32_t) 0x1FFFF7B8))
 #define VDD_CALIB ((uint16_t) (330))
 #define VDD_APPLI ((uint16_t) (300))
 int32_t temperature; /* will contain the temperature in degrees Celsius */
 temperature = ((int32_t) ADC1->DR * VDD_APPLI / VDD_CALIB)
            - (int32_t) *TEMP30_CAL_ADDR;
 temperature *= (int32_t)(110 - 30);
 temperature /= (int32_t)(*TEMP110_CAL_ADDR - *TEMP30_CAL_ADDR);
 temperature += 30;
 </pre>
 Factoring in the actual measured voltage, this gives
 <p>
 <b>T = 30 + (TS_CAL1 – TS_RAW * V_CAL / V_RAW) * 80 / (TS_CAL1 – TS_CAL2)</b>
 <p>
 If I use the raw ADC readings from my last run
 <pre>
 TSCAL2_SLOPE = (1721 - 1297) / 80 = 5.3 ADC step/℃
             = 3.3 * 5.3 / 4095 = 4.271 mV/℃
 </pre>
 <b>t = 30 + (1721 – 1718 * 1526 / 1538) * 80 / (1721 – 1297) = 33.09℃</b>
 <p>
 This is only 0.02℃ higher than the previous result based on the more
 generic formula. Because the temperature measured is close to the
 calibration temperature, the correction is negligible. For this
 particular chip, to see a difference of 0.1℃ between the value computed
 by the two formulas, you need a delta of 15℃ from the calibration
 temperature.
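 <p>
 The two-point formula and the distance at which it departs from the typical
 slope formula can be sketched as follows. The function names are
 hypothetical, and the second function assumes that the slope derived from
 TS_CAL1 and TS_CAL2 is the true slope of the sensor.

```c
#include <assert.h>

/* Temperature in degrees Celsius using the two calibration points
   TS_CAL1 (30 degrees) and TS_CAL2 (110 degrees). */
double temp_two_point( double ts_cal1, double ts_cal2, double ts_raw,
                       double v_cal, double v_raw) {
    assert( v_raw > 0 && ts_cal1 != ts_cal2) ;
    return 30 + (ts_cal1 - ts_raw * v_cal / v_raw) * 80
                                            / (ts_cal1 - ts_cal2) ;
}

/* Distance from 30 degrees at which the typical slope formula and the
   two-point formula differ by `gap` degrees. */
double delta_for_gap( double ts_cal1, double ts_cal2, double gap) {
    const double typical = 4095 * 0.0043 / 3.3 ;        /* steps per ℃ */
    const double measured = (ts_cal1 - ts_cal2) / 80 ;  /* steps per ℃ */
    assert( measured != typical) ;
    double ratio = 1 - measured / typical ;
    return gap / (ratio < 0 ? -ratio : ratio) ;
}
```

 <p>
 For the chip measured here, <code>delta_for_gap( 1721, 1297, 0.1)</code>
 comes out just under 15, in line with the estimate above.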
 <h2>Tuning</h2>
 The factory calibration temperature is defined as 30℃ (±5℃). I can store
 a reference temperature value in the first User Data Option Byte. This
 way I don't need to modify the code for each chip.
 <p>
 I update the application to use <b>TS_CAL2</b> based temperature calculation
 and access the tuned reference temperature from the option bytes.
 <pre>
 /* adcmain.c -- ADC reading of reference voltage and temperature sensor */
 #include &lt;stdio.h>
 #include "system.h"
 #define RAW
 #define TS_CAL2 ((const short *) 0x1FFFF7C2)
 #define USER0   ((const unsigned char *) 0x1FFFF804)
 int main( void) {
    unsigned last = 0 ;
    short calV, calC ;
 /* Initialize ADC and fetch calibration values */
    adc_vnt( VNT_INIT, &calV, &calC) ;
 #ifdef RAW
    printf( "%u, %u\n", calV, calC) ;
    int baseC = 300 ;
 # ifdef USER0
    if( 0xFF == (USER0[ 0] ^ USER0[ 1]))
        baseC = USER0[ 0] * 10 ;
 # endif
 #endif
    for( ;;)
        if( uptime == last)
            yield() ;
        else {
            short Vsample, Csample ;
            last = uptime ;
 #ifdef RAW
            adc_vnt( VNT_RAW, &Vsample, &Csample) ;
            printf( "%i, %i, %i, %i, ", calV, Vsample, calC, Csample) ;
            Csample = baseC + (calC - (int) Csample * calV / Vsample)
 # ifdef TS_CAL2
                                                    * 800 / (calC - *TS_CAL2) ;
 # else
                                                                * 10000 / 5336 ;
 # endif
            Vsample = 330 * calV / Vsample ;
 #else
            adc_vnt( VNT_VNC, &Vsample, &Csample) ;
 #endif
            printf( "%i.%02i, %i.%i\n", Vsample / 100, Vsample % 100,
                                                Csample / 10, Csample % 10) ;
        }
 }
 </pre>
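 <p>
 The test on <code>USER0</code> relies on the option byte being stored next
 to its bitwise complement, so a tuned value is used only when the pair is
 consistent. The sketch below isolates that check so it can be tried on a
 host. <code>base_temp_tenths()</code> and the sample arrays are hypothetical
 and stand in for the real option byte area.

```c
#include <assert.h>
#include <stddef.h>

/* Sample contents: a byte tuned to 25 with its complement, and an erased
   pair as found when nothing has been programmed. */
const unsigned char ob_tuned[ 2]  = { 0x19, 0xE6} ;
const unsigned char ob_erased[ 2] = { 0xFF, 0xFF} ;

/* Reference temperature in tenths of a degree: the stored value if the
   data byte and its complement agree, 30.0 degrees otherwise. */
int base_temp_tenths( const unsigned char *ob) {
    assert( ob != NULL) ;
    if( 0xFF == (ob[ 0] ^ ob[ 1]))
        return ob[ 0] * 10 ;

    return 300 ;
}
```

 <p>
 <code>base_temp_tenths( ob_tuned)</code> gives 250 while
 <code>base_temp_tenths( ob_erased)</code> falls back to 300. An unsigned
 byte in whole degrees limits the tuned reference to 0..255 ℃ in steps of
 1 ℃.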
 <h2>Checkpoint</h2>
 I have added an <a href="AA_factory.html">appendix</a> to track the factory
 written content.
 <p>
 <a href="36_update.html">Next</a>, I will cover the toolchain update that I
 made while working on the temperature sensors.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/36_update.html
+++ b/docs/36_update.html
@ -0,0 +1,179 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>3.6 Toolchain Update</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>3.6 Toolchain Update</h1>
 When a new release of GNU ARM Embedded Toolchain comes out, I have to do some
 adaptations when switching to it.
 <p>
 For example, to switch from release 9 update to release 10 major, I made three
 changes to <b>Makefile</b>.
 <p>
 ● Update the Linux base directory location:
 <pre>
 #GCCDIR = $(HOME)/Packages/gcc-arm-none-eabi-9-2020-q2-update
 GCCDIR = $(HOME)/Packages/gcc-arm-none-eabi-10-2020-q4-major
 </pre>
 ● Update the Windows base directory location:
 <pre>
 #GCCDIR = D:/Program Files (x86)/GNU Arm Embedded Toolchain/9 2020-q2-update
 GCCDIR = D:/Program Files (x86)/GNU Arm Embedded Toolchain/10 2020-q4-major
 </pre>
 ● Update the library subdirectory location:
 <pre>
 #LIBDIR  = $(GCCDIR)/lib/gcc/arm-none-eabi/9.3.1/thumb/v6-m/nofp
 LIBDIR  = $(GCCDIR)/lib/gcc/arm-none-eabi/10.2.1/thumb/v6-m/nofp
 </pre>
 In the case of release 10 major, regression testing by recompiling the
 projects so far unfortunately showed that the new release further optimizes
 the C startup clearing of BSS data by calling <code>memset()</code> from the
 distribution libraries.
 <pre>
 $ make
 f030f4.elf from startup.txeie.o adc.o adcmain.o
 D:\Program Files (x86)\GNU Arm Embedded Toolchain\10 2020-q4-major\bin\arm-none-
 eabi-ld.exe: startup.txeie.o: in function `Reset_Handler':
 D:\Renau\Documents\Projects\stm32bringup\docs/startup.txeie.c:126: undefined ref
 erence to `memset'
 make: *** [mk8:61: f030f4.elf] Error 1
 </pre>
 So I had to add <b>libc.a</b> and its location on top of <b>libgcc.a</b>
 to the list of libraries.
 <pre>
 LIBS = -l$(LIBSTEM) -lc -lgcc
 LIB_PATHS = -L. -L$(GCCDIR)/arm-none-eabi/lib/thumb/v6-m/nofp -L$(LIBDIR)
 </pre>
 With <b>libc.a</b>, the link phase completes successfully.
 <pre>
 f030f4.elf from startup.txeie.o adc.o adcmain.o
   text    data     bss     dec     hex filename
   2644       0      16    2660     a64 f030f4.elf
 </pre>
 <p>
 As I don’t want to turn off size optimization and I am not willing to
 always pay the full 180 bytes for a production ready <code>memset()</code>
 when it is called only once at startup to clear a few bytes, I ended up adding
 my own version of <code>memset()</code> to my local library.
 <pre>
 #include &lt;string.h>
 void *memset( void *s, int c, size_t n) {
    char *p = s ;
    while( n--)
        *p++ = c ;
    return s ;
 }
 </pre>
 <pre>
 LIBOBJS = printf.o putchar.o puts.o memset.o
 </pre>
 The link succeeds, with a reduction of 152 bytes of code.
 <pre>
 f030f4.elf from startup.txeie.o adc.o adcmain.o
   text    data     bss     dec     hex filename
   2492       0      16    2508     9cc f030f4.elf
 </pre>
 <h2>GCC front end handling the libraries selection</h2>
 As I was investigating the compilation flags to find if there was a
 better way to solve this issue, I figured out I could let <b>gcc</b> handle
 the distribution libraries selection and their location based on the CPU
 type. So I changed the linker invocation accordingly and got rid of LD,
 LIBDIR and LIB_PATHS definitions.
 <pre>
 #   $(LD) -T$(LD_SCRIPT) $(LIB_PATHS) -Map=$(PROJECT).map -cref -o $@ $(OBJS) $(LIBS)
    $(CC) $(CPU) -T$(LD_SCRIPT) -L. -Wl,-Map=$(PROJECT).map,-cref \
        -nostartfiles -o $@ $(OBJS) -l$(LIBSTEM)
 </pre>
 As the compiler front end is now controlling the libraries selection it is
 possible to give it a hint how to select a better optimized memset().  The
 libc library comes in two flavors: regular and nano.
 <pre>
 OBJS = $(SRCS:.c=.o)
 LIBOBJS = printf.o putchar.o puts.o # memset.o
 CPU = -mthumb -mcpu=cortex-m0 --specs=nano.specs
 </pre>
 <code>memset()</code> included in the nano version of libc occupies the same
 space as my own implementation.
 <pre>
 f030f4.elf from startup.txeie.o adc.o adcmain.o
   text    data     bss     dec     hex filename
   2492       0      16    2508     9cc f030f4.elf
 </pre>
 <h2>PATH based command selection</h2>
 Finally, I revised the way I specify the commands location by updating the
 PATH environment variable in the Makefile instead of giving the full path of
 each command.<br>
 On Windows, I make sure that drive specification matches the development
 environment in use (Cygwin, MSYS2 and other).
 <pre>
 ### Build environment selection
 ifeq (linux, $(findstring linux, $(MAKE_HOST)))
 INSTALLDIR = $(HOME)/Packages
 #REVDIR = gcc-arm-none-eabi-10-2020-q4-major
 REVDIR = arm-gnu-toolchain-13.3.rel1-x86_64-arm-none-eabi
 else
 DRIVE = d
 ifeq (cygwin, $(findstring cygwin, $(MAKE_HOST)))
 OSDRIVE = /cygdrive/$(DRIVE)
 else ifeq (msys, $(findstring msys, $(MAKE_HOST)))
 OSDRIVE = /$(DRIVE)
 else
 OSDRIVE = $(DRIVE):
 endif
 INSTALLDIR = $(OSDRIVE)/Program Files (x86)
 #REVDIR = GNU Arm Embedded Toolchain/10 2020-q4-major
 REVDIR = GNU Arm Embedded Toolchain/arm-gnu-toolchain-13.3.rel1-mingw-w64-i686-arm-none-eabi
 endif
 GCCDIR = $(INSTALLDIR)/$(REVDIR)
 export PATH := $(GCCDIR)/bin:$(PATH)
 BINPFX  = @arm-none-eabi-
 AR      = $(BINPFX)ar
 CC      = $(BINPFX)gcc
 OBJCOPY = $(BINPFX)objcopy
 OBJDUMP = $(BINPFX)objdump
 SIZE    = $(BINPFX)size
 </pre>
 Switching back to the latest version of the toolchain at the time of writing this,
 the link shows further improvement of the code size. The optimization via
 <code>memset()</code> has been skipped by the compiler.
 <pre>
 f030f4.elf from startup.txeie.o adc.o adcmain.o
   text    data     bss     dec     hex filename
   2464       0      16    2480     9b0 f030f4.elf
 </pre>
 <h2>Checkpoint</h2>
 Invoking the compiler instead of the linker gives more flexibility in
 case the toolchain directory structure changes or if I target a
 different core. The compiler is aware of the location of the toolchain
 libraries while the linker needs explicit parameters to handle those
 changes.
 <p>
 <a href="37_inram.html">Next</a>, I will (re)build to execute code in RAM
 instead of FLASH.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/37_inram.html
+++ b/docs/37_inram.html
@ -0,0 +1,453 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>3.7 In RAM Execution</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>3.7 In RAM Execution</h1>
 <h2>Selecting which memory is mapped at address 0x0</h2>
 So far, I have been executing either my own code from flash or the
 bootloader from system memory depending on the state of the BOOT0 pin at
 reset.
 <p>
 Using stm32flash I can request the bootloader to transfer execution to
 the code in flash memory.
 <pre>stm32flash -g 0 COM6</pre>
 With my current code, this works fine as long as I don’t use interrupt
 routines. <b>ledon</b> and <b>blink</b> both work, but <b>ledtick</b> will
 reset once the <code>SysTick_Handler()</code> interrupt routine is triggered
 for the first time. This is because the system memory is still mapped at
 address 0x0, where my interrupt vector should be. To fix this, I need to
 ensure the flash is mapped at address 0x0 before I enable interrupts.
 <p>
 The memory mapping is managed through the System Configuration
 controller <b>SYSCFG</b>, so I need to activate it and reconfigure the mapping
 before my <b>SysTick</b> initialization code in <code>init()</code>.
 <pre>
 /* Make sure FLASH Memory is mapped at 0x0 before enabling interrupts */
    RCC_APB2ENR |= RCC_APB2ENR_SYSCFGEN ;      /* Enable SYSCFG */
    SYSCFG_CFGR1 &= ~3 ;                       /* Map FLASH at 0x0 */
 </pre>
 and add the SYSCFG peripheral description.
 <pre>
 #define RCC_APB2ENR_SYSCFGEN    0x00000001  /*  0: SYSCFG clock enable */
 #define SYSCFG              ((volatile long *) 0x40010000)
 #define SYSCFG_CFGR1        SYSCFG[ 0]
 </pre>
 With this in place, I can now switch easily from bootloader to flash
 code by sending a go command via stm32flash.
 <h2>Sharing the RAM with the Bootloader</h2>
 Before I can ask the bootloader to transfer execution to code in RAM, I
 need first to ask it to write code there. As the bootloader data are
 located in RAM too, I have to avoid overwriting them. Where is it safe
 to write in RAM?
 <p>
 The answer is in the application note <b>AN2606 <i>STM32 microcontroller
 system memory boot mode</i></b>. Section 5 covers <b>STM32F03xx4/6 devices
 bootloader</b> and it states in <b>5.1 Bootloader Configuration</b>:
 <blockquote>2 Kbyte starting from address 0x20000000 are used by the bootloader
 firmware.</blockquote>
 <p>
 I am using a STM32F030F4P6, which has 4KB RAM and the bootloader
 firmware is using the first 2KB. That means I have only 2KB left to use
 starting from address 0x20000800.
 <p>
 Actually, I have only 2KB left to use until the bootloader firmware
 transfers execution to my code in RAM. Once my code executes, I can
 reclaim the first 2KB. This is exactly what I have to tell the linker.
 <p>
 I just create a new linker script <b>f030f4.ram.ld</b> by copying
 <b>f030f4.ld</b> and changing the memory configuration.
 <pre>
 /* FLASH means code, read only data and data initialization */
    FLASH (rx)  : ORIGIN = 0x20000800, LENGTH =  2K
    RAM   (rwx) : ORIGIN = 0x20000000, LENGTH =  2K
 </pre>
 I can build <b>ledon</b> or <b>blink</b> with that new linker script and check
 the resulting <b>f030f4.map</b> file.
 <ul>
 <li> isr vector, code, const data and data initialization are located from
  0x20000800.
 <li> Stack Pointer initial value is 0x20000800.
 <li> .data and .bss are located from 0x20000000.
 </ul>
 Let’s write this code in RAM and execute it!
 <pre>
 stm32flash -w blink.bin -S 0x20000800 COM6
 stm32flash -g 0x20000800 COM6
 </pre>
 This works just fine but of course the executable of <b>ledon</b> or
 <b>blink</b> doesn’t use interrupt routines.
 <h2>ISR Vector in RAM</h2>
 As with FLASH, we need to make sure that RAM is mapped at address 0x0 and
 starts with the ISR vector.
 <ul>
 <li> Use <b>SYSCFG</b> controller to map RAM at address 0x0
 <li> Tell the linker to reserve space at beginning of RAM before locating
  <b>.data</b> section.
 <li> Make a copy of <code>isr_vector[]</code> to the beginning of RAM in the
 space reserved by the linker.
 </ul>
 To select the RAM mapping, the <b>MEM_MODE</b> bits need to be set in
 <b>SYSCFG_CFGR1</b>.
 <pre>
 /* Make sure SRAM Memory is mapped at 0x0 before enabling interrupts */
    RCC_APB2ENR |= RCC_APB2ENR_SYSCFGEN ;        /* Enable SYSCFG */
    SYSCFG_CFGR1 |= 3 ;                          /* Map RAM at 0x0 */
 </pre>
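 <p>
 The two snippets, <code>&amp;= ~3</code> for FLASH and <code>|= 3</code> for
 RAM, both rewrite the two low bits of <code>SYSCFG_CFGR1</code> and leave
 the other bits alone. A host-side model of that field update could look like
 the sketch below. The names are hypothetical and the field encoding (x0 main
 flash, 01 system memory, 11 SRAM) is my reading of the reference manual, to
 be checked against the manual of the chip in use.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_MODE_MASK   3u      /* SYSCFG_CFGR1 bits 1-0 */

/* Assumed encoding of the MEM_MODE field */
enum mem_mode { MAP_FLASH = 0, MAP_SYSTEM = 1, MAP_SRAM = 3 } ;

/* Return the register value with MEM_MODE replaced, other bits kept */
uint32_t set_mem_mode( uint32_t cfgr1, enum mem_mode mode) {
    assert( ((uint32_t) mode & ~MEM_MODE_MASK) == 0) ;
    return (cfgr1 & ~MEM_MODE_MASK) | (uint32_t) mode ;
}
```

 <p>
 Clearing the field before setting it also covers the case where the
 bootloader left the value 01 behind, which a plain OR with 1 or 2 would not
 handle. OR-ing 3 and AND-ing ~3 as done in the startup code are safe because
 they force both bits at once.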
 The ISR vector will have at most 16 + 32 entries for STM32F030xx, that
 means 192 bytes need to be reserved. I add a new section before <b>.data</b> in
 the link script.
 <pre>
    .isrdata :
    {
        ram_vector = . ;
        . = . + 192 ;
    } > RAM
    .data : AT (__etext)
    {
 ...
 </pre>
 In the startup code, I add the code to copy the <code>isr_vector[]</code> to
 the location reserved at the beginning of RAM.
 <pre>
 #define ISRV_SIZE (sizeof isr_vector / sizeof *isr_vector)
 extern isr_p ram_vector[] ;
 /* Copy isr vector to beginning of RAM */
    for( unsigned i = 0 ; i < ISRV_SIZE ; i++)
        ram_vector[ i] = isr_vector[ i] ;
 </pre>
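 <p>
 The 192 bytes reserved in the linker script and the copy loop can be
 exercised on a host. The sketch below assumes that <code>isr_p</code> is a
 pointer to a function taking and returning nothing, which the excerpt does
 not show. The handlers and <code>copy_vector()</code> are hypothetical
 stand-ins.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*isr_p)( void) ;  /* assumed definition of the entry type */

enum {
    CORE_ENTRIES = 16,  /* initial stack pointer, reset, system exceptions */
    IRQ_ENTRIES  = 32,  /* peripheral interrupt lines */
    ENTRY_BYTES  = 4,   /* one 32 bit address per entry on the target */
    RAM_VECTOR_BYTES = (CORE_ENTRIES + IRQ_ENTRIES) * ENTRY_BYTES
} ;

/* Two placeholder handlers standing in for the real ones */
void handler_a( void) {}
void handler_b( void) {}

/* Same loop as in the startup code, with explicit arguments */
void copy_vector( isr_p *to, const isr_p *from, size_t count) {
    assert( to != NULL && from != NULL) ;
    for( size_t i = 0 ; i < count ; i++)
        to[ i] = from[ i] ;
}
```

 <p>
 <code>RAM_VECTOR_BYTES</code> evaluates to 192, that is 0xC0. On a 64 bit
 host a function pointer is wider than the 4 bytes of the target, which is
 why the size is computed from <code>ENTRY_BYTES</code> instead of
 <code>sizeof( isr_p)</code>.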
 RAM initialization now consists of
 <ul>
 <li> Stack pointer initialization
 <li> ISR vector copy
 <li> .data initialization
 <li> .bss clearing
 </ul>
 I can now rebuild <b>ledtick</b> or <b>uptime prototype</b> for execution in
 RAM.&nbsp; <b>f030f4.map</b> now shows that .data starts at 0x200000C0, after
 <code>ram_vector[]</code>.
 <pre>
 .isrdata        0x20000000       0xc0
                0x20000000                ram_vector = .
                0x200000c0                . = (. + 0xc0)
 *fill*         0x20000000       0xc0
 .data           0x200000c0        0x0 load address 0x20000c88
                0x200000c0                __data_start__ = .
 </pre>
 I can now use <b>stm32flash</b> to write those executables in RAM and request
 execution.
 <h2>Memory Models</h2>
 I now have the choice between four memory models when I build.
 <pre>
 | Model   | ISRV Location      | Load address (word aligned)             |
 |---------|--------------------|-----------------------------------------|
 |BOOTFLASH| Beginning of FLASH | Beginning of FLASH                      |
 | BOOTRAM | Beginning of RAM   | Beginning of RAM                        |
 | GOFLASH | Beginning of RAM   | In FLASH                                |
 | GORAM   | Beginning of RAM   | In RAM, after bootloader reserved space |
 </pre>
 <ul>
 <li> <b>BOOTFLASH</b>: Executed at reset depending on BOOT0 pin level otherwise
  triggered by a (boot)loader.
 <li> <b>BOOTRAM</b>: Executed at reset depending on BOOT0/BOOT1 configuration
  otherwise triggered by a (boot)loader through SWD.
 <li> <b>GOFLASH</b>: Executed at reset if located at beginning of FLASH
  otherwise triggered by a (boot)loader. (Spoiler: IAP and multi boot)
 <li> <b>GORAM</b>: Triggered by a (boot)loader. Useful for development if RAM
  size allows it. (Spoiler: external storage)
 </ul>
 To avoid having to edit multiple files when switching between models or
 introducing a new chipset family, I make the following changes.
 <ol>
 <li> Use a generic linker script.
 <li> Let the startup code handle the isr vector initialization and the
   memory mapping.
 <li> Maintain the FLASH and RAM information and isr vector position in the
   Makefile.
 </ol>
 <h3>1. Generic Linker Script</h3>
 To turn f030f4.ram.ld into a generic linker script, I need to
 <ul>
 <li> abstract the memory part.
 <li> remove the RAM isr vector hardcoded size.
 </ul>
 <pre>
 MEMORY
 {
 /* FLASH means code, read only data and data initialization */
    FLASH (rx) : ORIGIN = DEFINED(FLASHSTART) ? FLASHSTART : 0x08000000,
        LENGTH =  DEFINED(FLASHSIZE) ? FLASHSIZE : 16K
    RAM  (rwx) : ORIGIN = DEFINED(RAMSTART) ? RAMSTART : 0x20000000,
        LENGTH =  DEFINED(RAMSIZE) ? RAMSIZE : 4K
 }
 </pre>
 The Makefile will provide the necessary address and size information
 by passing parameters to the linker: <code>FLASHSTART</code>,
 <code>FLASHSIZE</code>, <code>RAMSTART</code>, <code>RAMSIZE</code>.
 <pre>
    /* In RAM isr vector reserved space at beginning of RAM */
    .isrdata (NOLOAD):
    {
        KEEP(*(.ram_vector))
    } > RAM
 </pre>
 The startup code will allocate <code>ram_vector[]</code> in <b>.ram_vector</b>
 section if needed.
 <h3>2. Startup Code</h3>
 I create the startup code startup.ram.c from a copy of startup.txeie.c,
 using conditionally compiled code selected by RAMISRV, whose definition
 will be passed as a parameter to the compiler.
 <pre>
 #if RAMISRV == 2
 # define ISRV_SIZE (sizeof isr_vector / sizeof *isr_vector)
 isr_p ram_vector[ ISRV_SIZE] __attribute__((section(".ram_vector"))) ;
 #endif
 int main( void) ;
 void Reset_Handler( void) {
    const long  *f ;    /* from, source constant data from FLASH */
    long    *t ;        /* to, destination in RAM */
 #if RAMISRV == 2
 /* Copy isr vector to beginning of RAM */
    for( unsigned i = 0 ; i < ISRV_SIZE ; i++)
        ram_vector[ i] = isr_vector[ i] ;
 #endif
 /* Assume:
 **  __bss_start__ == __data_end__
 **  All sections are 4 bytes aligned
 */
    f = __etext ;
    for( t = __data_start__ ; t < __bss_start__ ; t += 1)
        *t = *f++ ;
    while( t < &__bss_end__)
        *t++ = 0 ;
 /* Make sure active isr vector is mapped at 0x0 before enabling interrupts */
    RCC_APB2ENR |= RCC_APB2ENR_SYSCFGEN ;           /* Enable SYSCFG */
 #if RAMISRV
    SYSCFG_CFGR1 |= 3 ;                             /* Map RAM at 0x0 */
 #else
    SYSCFG_CFGR1 &= ~3 ;                            /* Map FLASH at 0x0 */
 #endif
    if( init() == 0)
        main() ;
    for( ;;)
        __asm( "WFI") ; /* Wait for interrupt */
 }
 </pre>
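 <p>
 The <b>.data</b> copy and <b>.bss</b> clearing loops depend on the two
 assumptions stated in the comment: <b>.bss</b> starts where <b>.data</b>
 ends, and everything is word aligned. The sketch below replays the same two
 loops on ordinary arrays so the behaviour can be checked on a host.
 <code>init_ram()</code> and its parameters are hypothetical stand-ins for
 the linker symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Copy the initialization image to [data_start, bss_start) then clear
   [bss_start, bss_end), one 32 bit word at a time. */
void init_ram( const uint32_t *load, uint32_t *data_start,
               uint32_t *bss_start, uint32_t *bss_end) {
    uint32_t *t ;

    assert( data_start <= bss_start && bss_start <= bss_end) ;
    for( t = data_start ; t < bss_start ; t++)
        *t = *load++ ;

    while( t < bss_end)
        *t++ = 0 ;
}
```

 <p>
 Because the second loop continues from where the first one stopped, a gap
 between the end of <b>.data</b> and the start of <b>.bss</b> would be
 filled with initialization data instead of being skipped, which is why the
 first assumption matters.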
 The SYSCFG controller definition is now included through a chipset
 specific header file. This way I can maintain all the chipset
 controllers and peripherals in one place.
 <pre>#include "stm32f030xx.h"</pre>
 <h3> 3. Makefile</h3>
 The Makefile now holds the memory model definition that is passed as
 parameters to the compiler and the linker.
 <pre>
 ### Memory Models
 # By default we use the memory mapping from linker script
 # In RAM Execution, load and start by USART bootloader
 # Bootloader uses first 2K of RAM, execution from bootloader
 #FLASHSTART = 0x20000800
 #FLASHSIZE  = 2K
 #RAMSTART   = 0x20000000
 #RAMSIZE    = 2K
 # In RAM Execution, load and start via SWD
 # 4K RAM available, execution via SWD
 #FLASHSTART = 0x20000000
 #FLASHSIZE  = 3K
 #RAMSTART   = 0x20000C00
 #RAMSIZE    = 1K
 # In Flash Execution
 # if FLASHSTART is not at beginning of FLASH: execution via bootloader or SWD
 #FLASHSTART = 0x08000000
 #FLASHSIZE  = 16K
 #RAMSTART   = 0x20000000
 #RAMSIZE    = 4K
 # ISR vector copied and mapped to RAM when FLASHSTART != 0x08000000
 ifdef FLASHSTART
 ifneq ($(FLASHSTART),0x08000000)
  ifeq ($(FLASHSTART),0x20000000)
   # Map isr vector in RAM
   RAMISRV := 1
  else
   # Copy and map isr vector in RAM
   RAMISRV := 2
  endif
 endif
 BINLOC  = $(FLASHSTART)
 else
 BINLOC  = 0x08000000
 endif
 </pre>
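 <p>
 The nested conditionals that select <code>RAMISRV</code> can be read as a
 small decision function. The C sketch below mirrors that logic for
 illustration only. <code>ramisrv_for()</code> is a hypothetical name, and the
 value 0 stands for the case where the Makefile leaves <code>RAMISRV</code>
 undefined, which the preprocessor test <code>#if RAMISRV</code> treats as 0.

```c
#include <assert.h>
#include <stdint.h>

#define FLASH_BASE  0x08000000u
#define RAM_BASE    0x20000000u

/* 0: isr vector stays in FLASH and FLASH is mapped at 0x0
   1: isr vector is already at the beginning of RAM, map RAM at 0x0
   2: isr vector must be copied to the beginning of RAM, then map RAM */
int ramisrv_for( uint32_t flashstart) {
    assert( flashstart % 4 == 0) ;      /* load address is word aligned */
    if( flashstart == FLASH_BASE)
        return 0 ;

    if( flashstart == RAM_BASE)
        return 1 ;

    return 2 ;
}
```

 <p>
 The four memory models map to 0 for BOOTFLASH, 1 for BOOTRAM and 2 for both
 GOFLASH and GORAM.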
 Compiler and linker have different syntax for defining symbols through
 command line parameters.
 <pre>
 CPU = -mthumb -mcpu=cortex-m0 --specs=nano.specs
 ifdef RAMISRV
 CDEFINES = -DRAMISRV=$(RAMISRV)
 endif
 WARNINGS=-pedantic -Wall -Wextra -Wstrict-prototypes
 CFLAGS = $(CPU) -g $(WARNINGS) -Os $(CDEFINES)
 LD_SCRIPT = generic.ld
 ifdef FLASHSTART
 LDOPTS  =--defsym FLASHSTART=$(FLASHSTART) --defsym FLASHSIZE=$(FLASHSIZE)
 LDOPTS +=--defsym RAMSTART=$(RAMSTART) --defsym RAMSIZE=$(RAMSIZE)
 endif
 LDOPTS +=-Map=$(subst .elf,.map,$@) -cref --print-memory-usage
 comma :=,
 space :=$() # one space before the comment
 LDFLAGS =-Wl,$(subst $(space),$(comma),$(LDOPTS))
 </pre>
 As I am revising the compilation flags, I have increased the level of
 warnings by adding -pedantic, -Wstrict-prototypes.
 <p>
 The build rules are updated with the new symbols for the linker.
 <pre>
 $(PROJECT).elf: $(OBJS) libstm32.a
 boot.elf: boot.o
 ledon.elf: ledon.o
 blink.elf: blink.o
 ledtick.elf: ledtick.o
 cstartup.elf: cstartup.o
 %.elf:
    @echo $@
    $(CC) $(CPU) -T$(LD_SCRIPT) $(LDFLAGS) -nostartfiles -o $@ $+
    $(SIZE) $@
    $(OBJDUMP) -hS $@ > $(subst .elf,.lst,$@)
 </pre>
 The project composition needs to be updated to use the new startup.
 <pre>SRCS = startup.ram.c txeie.c uptime.1.c</pre>
 Finally, to keep track of the memory model and the load location, I put
 the load address in the name of the binary file generated.
 <pre>all: $(PROJECT).$(BINLOC).bin $(PROJECT).hex</pre>
 This way if I build uptime prototype in GORAM memory model
 <pre>
 $ make
 f030f4.elf
   text    data     bss     dec     hex filename
   1164       0      20    1184     4a0 f030f4.elf
 f030f4.hex
 f030f4.0x20000800.bin
 </pre>
 The name of the file will remind me where to load the code.
 <pre>
 $ stm32flash -w f030f4.0x20000800.bin -S 0x20000800 COM6
 $ stm32flash -g 0x20000800 COM6
 </pre>
 <h2>Caveat: stm32flash v0.6 intel hex bug</h2>
 At the time of writing, <b>stm32flash</b> v0.6 has a bug that prevents
 writing intel hex files correctly at addresses other than the origin of
 the Flash. A bug fix and the possibility to directly read the base
 address from the intel hex file are planned to be included in v0.7.
 <p>
 Until v0.7 is out, I am using my own patched version of stm32flash or
 the binary files when I need to test GOFLASH and GORAM memory models.
 <p>
 As I branched off my own patched version of <b>stm32flash</b>, I added a
 <code>-x</code> option to write and execute an intel hex file:
 <pre>stm32flash -x file.hex COM#</pre>
 <h2>Testing</h2>
 I build all four memory models and check that they can be loaded and
 executed using both <b>stm32flash</b> and <b>STM32 Cube Programmer</b>.
 <p>
 Using the USART bootloader, I validate BOOTFLASH, GOFLASH and GORAM with
 <b>stm32flash</b> and <b>STM32 Cube Programmer</b>.
 <p>
 Using the SWD interface, I validate BOOTFLASH, GOFLASH, BOOTRAM and
 GORAM with <b>STM32 Cube Programmer</b>.
 <h2>Checkpoint</h2>
 <a href="38_crc32.html">Next</a>, I will add integrity check at startup by
 doing CRC32 validation of the code.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/38_crc32.html
+++ b/docs/38_crc32.html
@ -0,0 +1,333 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>3.8 CRC-32 Code Validation</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>3.8 CRC-32 Code Validation</h1>
 The STM32F030 family comes with a CRC calculation unit. It can be used
 during startup to validate the integrity of the code in memory.
 <p>
 A Cyclic Redundancy Check (CRC) is primarily an error detection code. I
 have already met CRC when dealing with the DS18B20 sensor, where CRC-8
 protects the transmission of the 9-byte scratchpad.
 <p>
 The STM32 CRC calculation unit has the following default characteristics:
 <ul>
 <li> POLY32 is 0x04C11DB7.
 <li> Initialisation 0xFFFFFFFF.
 <li> High bit first (left shift).
 <li> 32 bit word input, little endian.
 </ul>
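 The characteristics above can be sketched in software. This is a minimal
 host-side model, assuming the unit XORs each 32 bit word into the current
 value and then shifts it left 32 times; the names <code>crc32_word()</code>
 and <code>crc32_words()</code> are mine, not part of the project sources.

```c
#include <stddef.h>
#include <stdint.h>

#define POLY32 0x04C11DB7u  /* default polynomial */

/* Feed one 32 bit word into the CRC, high bit first (left shift). */
static uint32_t crc32_word(uint32_t crc, uint32_t word) {
    crc ^= word;
    for (int i = 0; i < 32; i++)
        crc = (crc & 0x80000000u) ? (crc << 1) ^ POLY32 : crc << 1;
    return crc;
}

/* CRC of an array of words, starting from the 0xFFFFFFFF initial value. */
static uint32_t crc32_words(const uint32_t *wp, size_t count) {
    uint32_t crc = 0xFFFFFFFFu;
    while (count--)
        crc = crc32_word(crc, *wp++);
    return crc;
}
```

 With this model, feeding the CRC of a block right after the block itself
 always gives 0, because the XOR of the value with itself clears the
 register before the shifts.
 <p>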
 I don't plan to write a self-signing executable, so on top of the STM32
 startup code validation, I will also write a <code>sign32</code> command
 to sign binary files during build.
 <h2>Implementation Steps</h2>
 <ol>
 <li> Update <b>stm32f030xx.h</b> with the CRC calculation unit definitions.
 <li> Update startup with <code>check_flash()</code>, to be called before
  <code>init()</code>.
 <li> Update <b>generic.ld</b> with a new section used as placeholder for the CRC
  sum at the end of the flashable content.
 <li> Write <code>sign32</code> command to sign a binary file.
 <li> Update <b>Makefile</b> to sign the binary executable and create an Intel
 HEX version out of it.
 </ol>
 <h3>1. stm32f030xx.h</h3>
 The CRC calculation unit is on the AHB bus and its clock needs to be
 enabled before use.
 <pre>
 #define RCC_AHBENR_CRCEN        (1 << 6)    /*  6: CRC clock enable */
 </pre>
 I will make use of the default setup so I only need to refer to the Data
 Register and the Control Register. I create all register definitions as
 there is a gap in the memory layout.
 <pre>
 #define CRC             ((volatile unsigned *) 0x40023000)
 #define CRC_DR          CRC[ 0]
 #define CRC_IDR         CRC[ 1]
 #define CRC_CR          CRC[ 2]
 #define CRC_INIT        CRC[ 4]
 </pre>
 <h3>2. startup.crc.c</h3>
 I make a copy of <b>startup.ram.c</b> into <b>startup.crc.c</b>.
 <p>
 I use conditional compilation: the build option <code>CRC32SIGN</code> will be
 defined in the Makefile.
 <p>
 The constant variable <code>crcsum</code> is a placeholder with the hexadecimal
 value <code>DEADC0DE</code> in byte order. This value will be overridden by the
 computed CRC value during build. The linker will put <code>crcsum</code> at the
 end of the used FLASH.
 <p>
 <code>check_flash()</code> uses the CRC calculation unit to compute the CRC value
 from the beginning of FLASH (<code>isr_vector</code>) to the end of the used
 FLASH (<code>crcsum</code>). If <code>crcsum</code> holds the correct CRC, the
 computed result will be 0.
 <pre>
 #ifdef CRC32SIGN
 const unsigned crcsum __attribute__((section(".crc_chk"))) = 0xDEC0ADDE ;
 static int check_flash( void) {
    int ret = 0 ;
 /* Flash CRC validation */
    RCC_AHBENR |= RCC_AHBENR_CRCEN ;  /* Enable CRC periph */
    CRC_CR = 1 ;                      /* Reset */
    if( CRC_DR == 0xFFFFFFFF) {       /* CRC periph is alive and reset */
        const unsigned *wp = (const unsigned *) isr_vector ;
        while( wp <= &crcsum)
            CRC_DR = *wp++ ;
        ret = CRC_DR == 0 ;
    }
    RCC_AHBENR &= ~RCC_AHBENR_CRCEN ; /* Disable CRC periph */
    return ret ;
 }
 #endif
 </pre>
 Flash content is checked before calling <code>init()</code>. This means the
 check is done using the default clock setup of HSI 8 MHz.
 <pre>
    if(
 #ifdef CRC32SIGN
      check_flash() &&
 #endif
      init() == 0)
        main() ;
 </pre>
 <h3>3. generic.ld</h3>
 I add a new section to hold the CRC value placeholder. This needs to be
 DWORD aligned and at the end of the used FLASH area.
 <pre>
    .crc __etext + SIZEOF(.data) :
    {
        KEEP(*(.crc_chk))
    } > FLASH
 </pre>
 <h3>4. sign32</h3>
 The command <code>sign32</code> creates a <b>signed.bin</b> file from its
 input file, specified as a parameter.
 <pre>
 $ touch empty.bin
 $ ./sign32 empty.bin
 FFFFFFFF empty.bin: 0, signed.bin: 4
 </pre>
 If the input file is already signed, the output signed.bin is identical
 to the input.
 <pre>
 $ mv signed.bin FFFFFFFF.bin
 $ ./sign32 FFFFFFFF.bin
 00000000 FFFFFFFF.bin: 4, signed.bin: 4
 </pre>
 Padding with null bytes is done on the input to ensure the calculation is
 DWORD aligned.
 <pre>
 $ echo > nl.bin
 $ ./sign32 nl.bin
 E88E0BAD nl.bin: 1, signed.bin: 8
 $ hexdump -C signed.bin
 00000000  0a 00 00 00 ad 0b 8e e8                           |........|
 00000008
 </pre>
 Calculation stops when the placeholder DEADC0DE is found or the end of
 the input file is reached.
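 <p>
 The behaviour shown in the examples above can be sketched on an in-memory
 buffer. This is not the actual <b>sign32.c</b>: the helper names
 (<code>get_word()</code>, <code>put_word()</code>,
 <code>sign_buffer()</code>) are hypothetical, and leaving the image
 unchanged when the CRC of the whole input is already 0 is my assumption,
 inferred from the <code>FFFFFFFF.bin</code> example.

```c
#include <stddef.h>
#include <stdint.h>

#define POLY32      0x04C11DB7u
#define PLACEHOLDER 0xDEC0ADDEu /* bytes DE AD C0 DE read as a little endian word */

/* One step of the CRC: XOR the word in, then 32 left shifts. */
static uint32_t crc32_word(uint32_t crc, uint32_t word) {
    crc ^= word;
    for (int i = 0; i < 32; i++)
        crc = (crc & 0x80000000u) ? (crc << 1) ^ POLY32 : crc << 1;
    return crc;
}

/* Little endian word at byte offset off, null padded past the end of input. */
static uint32_t get_word(const uint8_t *in, size_t len, size_t off) {
    uint32_t w = 0;
    for (int i = 3; i >= 0; i--)
        w = (w << 8) | (off + i < len ? in[off + i] : 0);
    return w;
}

/* Store a word as four little endian bytes. */
static void put_word(uint8_t *out, uint32_t w) {
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t) (w >> (8 * i));
}

/* Sign in[0..len) into out, which must have room for len rounded up to a
 * multiple of 4, plus 4 bytes. Returns the size of the signed image and
 * stores the CRC of the consumed input in *crc. */
static size_t sign_buffer(const uint8_t *in, size_t len,
                          uint8_t *out, uint32_t *crc) {
    size_t off = 0;
    int found = 0;
    *crc = 0xFFFFFFFFu;
    while (off < len) {
        uint32_t w = get_word(in, len, off);
        if (w == PLACEHOLDER) {     /* stop at the placeholder */
            found = 1;
            break;
        }
        *crc = crc32_word(*crc, w);
        put_word(out + off, w);
        off += 4;
    }
    if (found || *crc != 0) {       /* assumption: CRC 0 means already signed */
        put_word(out + off, *crc);
        off += 4;
    }
    return off;
}
```

 Signing a one byte input this way gives an 8 byte image: the byte, three
 null padding bytes, then the CRC stored in little endian order.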
 <p>
 I create a folder <b>crc32/</b> for <b>sign32.c</b> and its <b>Makefile</b>.
 <p>
 The core of sign32.c is customizable to do CRC calculation bitwise,
 unrolled bitwise, tablewise or to generate the CRC table.
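 <p>
 As an illustration of the tablewise variant, here is a minimal sketch that
 generates the 256 entry table and uses it to process a word in four byte
 steps instead of 32 bit steps. The function names are hypothetical and this
 is not the code of <b>sign32.c</b>; the bitwise version is included only
 for comparison.

```c
#include <stdint.h>

#define POLY32 0x04C11DB7u

static uint32_t crc_table[256];

/* Table entry i is the result of shifting byte i through the CRC, high
 * bit first. Must be called once before crc32_word_table(). */
static void crc32_make_table(void) {
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t r = i << 24;
        for (int b = 0; b < 8; b++)
            r = (r & 0x80000000u) ? (r << 1) ^ POLY32 : r << 1;
        crc_table[i] = r;
    }
}

/* Tablewise: one lookup per byte, four lookups per word. */
static uint32_t crc32_word_table(uint32_t crc, uint32_t word) {
    crc ^= word;
    for (int i = 0; i < 4; i++)
        crc = (crc << 8) ^ crc_table[crc >> 24];
    return crc;
}

/* Bitwise reference: one shift per bit, 32 shifts per word. */
static uint32_t crc32_word_bitwise(uint32_t crc, uint32_t word) {
    crc ^= word;
    for (int i = 0; i < 32; i++)
        crc = (crc & 0x80000000u) ? (crc << 1) ^ POLY32 : crc << 1;
    return crc;
}
```

 The table costs 1 KB, which is fine for a build tool on the host but
 would be a significant share of a 16 KB flash.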
 <h3>5. Makefile</h3>
 The build option <code>CRC32SIGN</code> controls the signature of the binary
 file and the generation of the Intel HEX version from the signed binary using
 <code>objcopy</code>.
 <pre>
 # build options
 CRC32SIGN := 1
 ifdef CRC32SIGN
 CDEFINES += -DCRC32SIGN=$(CRC32SIGN)
 endif
 %.$(BINLOC).bin: %.elf
    @echo $@
    $(OBJCOPY) -O binary $< $@
 ifdef CRC32SIGN
    crc32/sign32 $@
    mv signed.bin $@
 %.hex: %.$(BINLOC).bin
    @echo $@
    $(OBJCOPY) --change-address=$(BINLOC) -I binary -O ihex $< $@
 endif
 </pre>
 <h2>Building and testing</h2>
 If I build an executable, I can see that the binary file is CRC-32
 signed. In the example below, the CRC-32 signature is 0xBC689506 and the
 total binary image is 2680 bytes long.
 <pre>
 $ make
 f030f4.elf
 Memory region         Used Size  Region Size  %age Used
           FLASH:        2680 B        16 KB     16.36%
             RAM:          24 B         4 KB      0.59%
   text    data     bss     dec     hex filename
   2673       4      20    2697     a89 f030f4.elf
 f030f4.0x08000000.bin
 crc32/sign32 f030f4.0x08000000.bin
 BC689506 f030f4.0x08000000.bin: 2676, signed.bin: 2680
 mv signed.bin f030f4.0x08000000.bin
 f030f4.hex
 </pre>
 I can double check that the value at the end of the binary file matches.
 <pre>
 $ hexdump -C f030f4.0x08000000.bin | tail
 000009f0  01 46 63 46 52 41 5b 10  10 46 01 d3 40 42 00 2b  |.FcFRA[..F..@B.+|
 00000a00  00 d5 49 42 70 47 63 46  5b 10 00 d3 40 42 01 b5  |..IBpGcF[...@B..|
 00000a10  00 20 00 f0 05 f8 02 bd  00 29 f8 d0 16 e7 70 47  |. .......)....pG|
 00000a20  70 47 c0 46 50 4c 4c 48  53 49 0a 00 20 25 64 20  |pG.FPLLHSI.. %d |
 00000a30  25 73 25 73 00 75 70 00  77 65 65 6b 00 64 61 79  |%s%s.up.week.day|
 00000a40  00 68 6f 75 72 00 6d 69  6e 75 74 65 00 73 65 63  |.hour.minute.sec|
 00000a50  6f 6e 64 00 30 31 32 33  34 35 36 37 38 39 41 42  |ond.0123456789AB|
 00000a60  43 44 45 46 00 00 20 2b  2b 10 0a 02 08 00 00 00  |CDEF.. ++.......|
 00000a70  ef 00 00 00 06 95 68 bc                           |......h.|
 00000a78
 </pre>
 I can flash the resulting Intel HEX file and see that it executes.
 <pre>
 $ stm32flash -x f030f4.hex COM3
 stm32flash 0.6-patch-hex
 http://stm32flash.sourceforge.net/
 Using Parser : Intel HEX
 Location     : 0x8000000
 Size         : 2680
 Interface serial_w32: 57600 8E1
 Version      : 0x31
 Option 1     : 0x00
 Option 2     : 0x00
 Device ID    : 0x0444 (STM32F03xx4/6)
 - RAM        : Up to 4KiB  (2048b reserved by bootloader)
 - Flash      : Up to 32KiB (size first sector: 4x1024)
 - Option RAM : 16b
 - System RAM : 3KiB
 Write to memory
 Erasing memory
 Wrote address 0x08000a78 (100.00%) Done.
 Starting execution at address 0x08000000... done.
 </pre>
 I can use stm32flash to compute the CRC-32 checksum on the first 2680
 bytes of FLASH. The result is 0, as this covers both the payload AND
 the CRC-32 checksum value.
 <pre>
 $ stm32flash -C -S 0x08000000:2680 COM3
 stm32flash 0.6-patch-hex
 http://stm32flash.sourceforge.net/
 Interface serial_w32: 57600 8E1
 Version      : 0x31
 Option 1     : 0x00
 Option 2     : 0x00
 Device ID    : 0x0444 (STM32F03xx4/6)
 - RAM        : Up to 4KiB  (2048b reserved by bootloader)
 - Flash      : Up to 32KiB (size first sector: 4x1024)
 - Option RAM : 16b
 - System RAM : 3KiB
 CRC computation
 CRC address 0x08000a78 (100.00%) Done.
 CRC(0x08000000-0x08000a78) = 0x00000000
 </pre>
 If I ask stm32flash to compute the CRC-32 checksum on the first 2676
 bytes (payload excluding CRC-32 checksum value), it returns 0xbc689506,
 which is the value computed at build time.
 <pre>
 $ stm32flash -C -S 0x08000000:2676 COM3
 stm32flash 0.6-patch-hex
 http://stm32flash.sourceforge.net/
 Interface serial_w32: 57600 8E1
 Version      : 0x31
 Option 1     : 0x00
 Option 2     : 0x00
 Device ID    : 0x0444 (STM32F03xx4/6)
 - RAM        : Up to 4KiB  (2048b reserved by bootloader)
 - Flash      : Up to 32KiB (size first sector: 4x1024)
 - Option RAM : 16b
 - System RAM : 3KiB
 CRC computation
 CRC address 0x08000a74 (100.00%) Done.
 CRC(0x08000000-0x08000a74) = 0xbc689506
 </pre>
 Because the STM32F030 USART bootloader is v3.1, it doesn't implement the
 CRC checksum command included in v3.3. This means that in this case
 stm32flash computes the CRC checksum value on its own. You can check the
 sources of stm32flash for its implementation of the CRC-32 calculation.
 <h2>Checkpoint</h2>
 There is variation in the functionality of the CRC calculation unit
 among the different STM32 chipset families. The <code>check_flash()</code>
 implementation I just made relies only on the default settings for polynomial,
 initial value, polynomial length and shift direction, so it should be common
 to all of them.
 <p>
 <a href="39_resistor.html">Next</a>, I will use the ADC to read a resistor
 value.
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/39_resistor.html
+++ b/docs/39_resistor.html
@ -0,0 +1,58 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>3.9 Reading a Resistor Value</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>3.9 Reading a Resistor Value</h1>
 I used the ADC previously to read the internal sensors, so it’s simple
 to move on to external ones. Once you can read the value of a resistor,
 there is a wide choice of analog applications: thermistors, photocells,
 potentiometers, sliders, joysticks …
 <h2>Voltage Divider Circuit</h2>
 <img alt="Voltage Divider Circuit" src="img/39_vdivider.png">
 <p>
 <b>Vout = Vin * Rref / (Rref + Rmeasured)</b>
 <h2>ADC Readings</h2>
 Assuming the ADC is configured for 12 bit precision and returns a value
 in the range [0 … 4095]
 <p>
 <b>Vout = VADC = VDDA * ADCRAW / 4095</b>
 <p>
 <b>Vin = VDDA</b>
 <p>
 <b><s>VDDA</s> * ADCRAW / 4095 = <s>VDDA</s> * Rref / (Rref + Rmeasured)</b>
 <p>
 <b>ADCRAW * Rmeasured = Rref * (4095 – ADCRAW)</b>
 <p>
 <b>Rmeasured = Rref * (4095 – ADCRAW) / ADCRAW</b>
 <p>
 <b>Rmeasured = Rref * (4095 / ADCRAW – 1)</b>
 <h2>Devil in the whatnots</h2>
 <ul>
 <li> Avoiding division by zero.
 <li> Integer versus floating point calculation.
 <li> Choosing the reference resistor.
 <li> Calibration of the reference resistor.
 </ul>
 <pre>
 #include &lt;limits.h>
 int R = ADCraw ? Rref * (4095 / ADCraw - 1) : INT_MAX ;
 </pre>
 <pre>int R = ADCraw ? Rref * 4095 / ADCraw - Rref : INT_MAX ;</pre>
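 <p>
 The two expressions above differ because of integer division: in the first
 one, <code>4095 / ADCraw</code> is truncated before the multiplication. A
 minimal sketch of a helper that multiplies first, guards against division
 by zero and rounds to the nearest ohm could look like this. The name
 <code>resistor_ohms()</code> and the choice of a wider intermediate type
 are mine.

```c
#include <limits.h>

#define ADC_FULL_SCALE 4095     /* 12 bit conversion */

/* Rmeasured = Rref * (4095 - ADCraw) / ADCraw, rounded to nearest ohm.
 * Returns INT_MAX when ADCraw is 0 (open circuit or no reading). */
static int resistor_ohms(int rref, int adcraw) {
    if (adcraw <= 0)
        return INT_MAX;
    if (adcraw > ADC_FULL_SCALE)
        adcraw = ADC_FULL_SCALE;
    /* Multiply before dividing, in a type wide enough not to overflow. */
    long long r = ((long long) rref * (ADC_FULL_SCALE - adcraw)
                   + adcraw / 2) / adcraw;
    return r > INT_MAX ? INT_MAX : (int) r;
}
```

 For example, with a 10000 ohm reference and a raw reading of 2048, the
 first expression gives 0 while the second one and this helper both
 give 9995.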
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/AA_factory.html
+++ b/docs/AA_factory.html
@ -0,0 +1,60 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>Appendix A: Factory written content</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>Appendix A: Factory written content</h1>
 <h2>Acquisition</h2>
 <pre>
 $ stm32flash -r - -S 0x1FFFF7a0:96 COM6 2>/dev/null | hexdump -C
 00000000  ff ff ff ff 31 00 10 00  ff ff ff ff 14 80 20 00  |....1......... .|
 00000010  13 57 33 52 31 37 31 20  ce 06 f5 05 f0 ff ff ff  |.W3R171 ........|
 00000020  ff ff 27 05 ff ff ff ff  fc ff ff ff 10 00 ff ff  |................|
 00000030  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
 00000040  ff ff ff ff ff ff ff ff  f3 ff ff ff ff ff ff ff  |................|
 00000050  ff ff ff ff ff ff ff ff  68 97 64 9b 3c c3 3f c0  |........h.d.<.?.|
 00000060
 </pre>
 <h2>Layout</h2>
 <pre>
 | Location   | Content         | Size | Format                   | Reference   |
 |------------|-----------------|------|--------------------------|-------------|
 | 0x1FFFF7A0 | Protocol 3 Ver. | 2    | xFFFF => unsupported     |             |
 | 0x1FFFF7A2 | Protocol 2 Ver. | 2    | xFFFF => unsupported     |             |
 | 0x1FFFF7A4 | Protocol 1 Ver. | 2    | x31 => 3.1               | AN2606 4.2  |
 | 0x1FFFF7A6 | Bootloader ID   | 2    | x10 => 1 USART, 1st ver. | AN2606 4.2  |
 | 0x1FFFF7AC | UID.X           | 2    | bit signed, x8014 => -20 |             |
 | 0x1FFFF7AE | UID.Y           | 2    |                          |             |
 | 0x1FFFF7B0 | UID.WAF_NUM     | 1    | unsigned                 |             |
 | 0x1FFFF7B1 | UID.LOT_NUM     | 7    | ASCII                    |             |
 | 0x1FFFF7B8 | TS_CAL1         | 2    |                          | DS9773 3.10 |
 | 0x1FFFF7BA | VREFINT_CAL     | 2    |                          | DS9773 3.10 |
 | 0x1FFFF7C2 | TS_CAL2         | 2    |                          |             |
 | 0x1FFFF7CC | Flash size (KB) | 2    | x10 => 16 KB             | RM0360 27.1 |
 </pre>
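 The layout above can be applied to the 96 bytes acquired earlier. This is a
 minimal host-side sketch, assuming the dump starts at 0x1FFFF7A0, that
 multi-byte fields are little endian and that UID.X uses a sign bit plus
 magnitude as the table suggests. The structure and function names are
 hypothetical.

```c
#include <stdint.h>
#include <string.h>

struct factory {
    unsigned protocol1;     /* 0x1FFFF7A4 */
    unsigned bootloader_id; /* 0x1FFFF7A6 */
    int x;                  /* 0x1FFFF7AC, sign bit + magnitude */
    unsigned y;             /* 0x1FFFF7AE */
    unsigned wafer;         /* 0x1FFFF7B0 */
    char lot[8];            /* 0x1FFFF7B1, 7 ASCII characters + terminator */
    unsigned ts_cal1;       /* 0x1FFFF7B8 */
    unsigned vrefint_cal;   /* 0x1FFFF7BA */
    unsigned ts_cal2;       /* 0x1FFFF7C2 */
    unsigned flash_kb;      /* 0x1FFFF7CC */
};

/* 16 bit little endian value at p. */
static unsigned le16(const uint8_t *p) {
    return (unsigned) p[0] | ((unsigned) p[1] << 8);
}

/* Decode a dump whose first byte is the content of 0x1FFFF7A0. */
static void decode_factory(const uint8_t *dump, struct factory *f) {
    unsigned x = le16(dump + 0x0C);
    f->protocol1 = le16(dump + 0x04);
    f->bootloader_id = le16(dump + 0x06);
    f->x = (x & 0x8000) ? -(int) (x & 0x7FFF) : (int) x;
    f->y = le16(dump + 0x0E);
    f->wafer = dump[0x10];
    memcpy(f->lot, dump + 0x11, 7);
    f->lot[7] = '\0';
    f->ts_cal1 = le16(dump + 0x18);
    f->vrefint_cal = le16(dump + 0x1A);
    f->ts_cal2 = le16(dump + 0x22);
    f->flash_kb = le16(dump + 0x2C);
}
```

 Applied to the dump of the Acquisition section, this gives the values of
 the first row of the table below.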
 <h2>Sampled values</h2>
 <pre>
 | BootID | X     | Y   | Wafer | Lot       | TS_CAL1 | VREFINT_CAL | TS_CAL2 | Flash | TBD     |
 |--------|-------|-----|-------|-----------|---------|-------------|---------|-------|---------|
 | x10    | x8014 | x20 | x13   | ‘W3R171 ‘ | x6CE    | x5F5        | x527    | 16    | hd&lt;?    |
 | x10    | x8011 | x49 | x0E   | ‘W3U795 ‘ | x6D1    | x5F0        | x523    | 16    | hbF?    |
 | x10    | x8015 | x21 | x13   | ‘W4A195 ‘ | x6DA    | x5EE        | x52A    | 16    | h^D?    |
 | x10    | x1C   | x3A | x12   | ‘W4A820 ‘ | x6B9    | x5F6        | x511    | 16    | hR;?    |
 | x10    | x09   | x4D | x0B   | ‘W4C593 ‘ | x6E6    | x5F4        | x53C    | 16    | haG?    |
 | x10    | x1D   | x41 | x12   | ‘W4R342 ‘ | x6E4    | x5F1        | x535    | 16    | hZJ?    |
 | x10    | x10   | x17 | x05   | ‘WAA390 ‘ | x6E9    | x5F8        | x523    | 16    | hY=?    |
 | x10    | x800A | x41 | x08   | ‘QMY687 ‘ | x6E6    | x5F1        | x53E    | 16    | hZE?    |
 | x10    | xDDC1 |xDDC1| x0D   | ‘       ‘ | x703    | x5E9        | x535    | 32    | hNb\xFF |
 | x10    | xDDC1 |xDDC1| x0F   | ‘       ‘ | x70B    | x5EF        | x537    | 32    | hHX\xFF |
 | x10    | xDDC1 |xDDC1| x10   | ‘       ‘ | x6FE    | x5E9        | x539    | 32    | hZa\xFF |
 | x21    | x4D   | x47 | x0C   | ‘QMT476 ‘ | x6D8    | x5F3        | x52D    | 64    | hlR\xBF |
 | x21    | x36   | x47 | x0A   | ‘QRW813 ‘ | x6DF    | x5F3        | x539    | 64    | hhF\xFF |
 | x21    | x41   | x1E | x12   | ‘SNG712 ‘ | x6EB    | x5F0        | x526    | 64    | hjS\xFF |
 </pre>
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/img/13_stlink.png
+++ b/docs/img/13_stlink.png
--- a/docs/img/14_ledon.png
+++ b/docs/img/14_ledon.png
--- a/docs/img/14_ledpb1.png
+++ b/docs/img/14_ledpb1.png
--- a/docs/img/16_tick.png
+++ b/docs/img/16_tick.png
--- a/docs/img/17_cube.png
+++ b/docs/img/17_cube.png
--- a/docs/img/21_boardv200.png
+++ b/docs/img/21_boardv200.png
--- a/docs/img/21_boot0.png
+++ b/docs/img/21_boot0.png
--- a/docs/img/22_ledv200.png
+++ b/docs/img/22_ledv200.png
--- a/docs/img/25_uptimev1.png
+++ b/docs/img/25_uptimev1.png
--- a/docs/img/26_uptime.png
+++ b/docs/img/26_uptime.png
--- a/docs/img/28_clocktree.png
+++ b/docs/img/28_clocktree.png
--- a/docs/img/31_dht11.png
+++ b/docs/img/31_dht11.png
--- a/docs/img/31_output.png
+++ b/docs/img/31_output.png
--- a/docs/img/33_ds18b20.png
+++ b/docs/img/33_ds18b20.png
--- a/docs/img/33_output.png
+++ b/docs/img/33_output.png
--- a/docs/img/34_output.png
+++ b/docs/img/34_output.png
--- a/docs/img/34_tscal1.png
+++ b/docs/img/34_tscal1.png
--- a/docs/img/34_vrefint.png
+++ b/docs/img/34_vrefint.png
--- a/docs/img/35_tschar.png
+++ b/docs/img/35_tschar.png
--- a/docs/img/39_vdivider.png
+++ b/docs/img/39_vdivider.png
--- a/docs/index.html
+++ b/docs/index.html
@ -0,0 +1,128 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <title>STM32 Bringup</title>
    <link type="text/css" rel="stylesheet" href="style.css">
 </head>
 <body>
 <h1>STM32 Bringup</h1>
 <h2>Introduction</h2>
 Getting started with a micro-controller usually means picking up a board,
 an IDE, some RTOS or a set of libraries. Depending on your level of experience,
 your budget and the solutions you select, the learning curve may be a steep
 one and what you will learn can be very limited if you end up cornered in a
 sandbox with no understanding of what’s going on under the hood.
 <p>
 Commercial solutions and mature open source projects are a must if you want to
 develop products with some level of quality. Unfortunately their complexity is
 high because they have to satisfy complex requirements. Their documentation
 and source code when available are often hard to navigate, out of date or just
 not addressing what you need to learn.
 <p>
 Starting from scratch, on the other hand, is not something often documented and
 when it is, it is usually after the fact. So if you want to learn how to do it
 you need to catch the opportunity to watch someone going through the steps and
 explaining what’s going on.
 <p>
 I will try to capture here my own “STM32 bring up” journey using a step by step
 approach, writing down the problems faced and decisions taken while evolving
 simple projects.
 <h2>Part I: Bring it up!</h2>
 I proceed by small incremental steps that are easy to reproduce and simple
 enough to adapt to a variant of the micro-controller or a different board
 layout.
 <ul>
 <li> Pick up a <a href="11_toolchain.html">toolchain</a>, install it and check
 that it can build an executable.
 </ul><ul>
 <li> Write a minimal <a href="12_bootstrap.html">bootstrap</a> for a target
 micro-controller and build a first executable.
 </ul><ul>
 <li> <a href="13_flash.html">Flash</a> the first executable in an actual board
 and verify that it boots.
 </ul><ul>
 <li> Provide feedback by turning the <a href="14_ledon.html">user LED ON</a>
 and making it <a href="15_blink.html">blink</a>.
 </ul><ul>
 <li> Use the System <a href="16_ledtick.html">Tick</a> to handle the blinking.
 </ul><ul>
 <li> Ensure that RAM is initialized as expected for a
 <a href="17_cstartup.html">C startup</a>.
 </ul><ul>
 <li> Structure the code according to the
 <a href="18_3stages.html">three stages</a>: boot, initialization and main
 execution.
 </ul><ul>
 <li> <a href="19_publish.html">Publish</a> the code to a web git repository
 for further evolution.
 </ul>
 <h2><a id="part2">Part II: Let's talk!</a></h2>
 It’s time to move to a more talkative interface so that the board not
 only winks but also speaks. Again I will go through several steps to get
 to a working asynchronous serial communication.
 <ul>
 <li> <a href="21_uart.html">Validate</a> the serial connection by wiring a
 board with a USB to UART adapter and using a Serial Flash loader application
 to read the chipset flash memory.
 </ul><ul>
 <li> Make sure that the code evolved so far works on the
 <a href="22_board.html">board</a> with a serial connection.
 </ul><ul>
 <li> Say <a href="23_hello.html">hello</a> as first transmission.
 </ul><ul>
 <li> Use <a href="24_stm32flash.html">stm32flash</a> as flashing tool on both
 Windows and Linux.
 </ul><ul>
 <li> <a href="25_prototype.html">Prototype</a> an application that tells how
 long the system has been running.
 </ul><ul>
 <li> Write a production version of <a href="26_uptime.html">uptime</a> application.
 </ul><ul>
 <li> Bundle the standard C library output functions into an actual
  <a href="27_library.html">library</a>.
 </ul><ul>
 <li> <a href="28_clocks.html">Configure</a> baud rate and clocks.
 </ul><ul>
 <li> Handle the transmission with <a href="29_interrupt.html">interrupt</a>.
 </ul>
 <h2><a id="part3">Part III: Sensors! So hot! So wet!</a></h2>
 <ul>
 <li> Implement <a href="31_dht11.html">DHT11</a> humidity and temperature
 sensor reading.
 </ul><ul>
 <li> <a href="32_errata.html">Investigate</a> the quality of the DHT11
 measurements.
 </ul><ul>
 <li> Use <a href="33_ds18b20.html">DS18B20</a> digital thermometer for accurate
 temperature reading.
 </ul><ul>
 <li> Trigger <a href="34_adcvnt.html">ADC</a> conversion to read the internal
 voltage and temperature sensors.
 </ul><ul>
 <li> <a href="35_calibrate.html">Calibrate</a> the internal temperature sensor.
 </ul><ul>
 <li> <a href="36_update.html">Update</a> toolchain to latest.
 </ul><ul>
 <li> Build for <a href="37_inram.html">In RAM Execution</a>.
 </ul><ul>
 <li> Perform <a href="38_crc32.html">CRC-32</a> flash content validation
 during startup.
 </ul><ul>
 <li> Read a <a href="39_resistor.html">Resistor</a> Value.
 </ul>
 <h2>Appendices</h2>
 <ul>
 <li> <a href="AA_factory.html">Factory-programmed</a> values.
 </ul>
    <hr>© 2020-2024 Renaud Fivet
 </body>
 </html>
--- a/docs/style.css
+++ b/docs/style.css
@ -0,0 +1,24 @@
 body {
    width: 1024px ;
    margin-left: auto ;
    margin-right: auto ;
    font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
    font-size: 110% ;
 }
 pre {
    background-color: #F3F6FA ;
    margin-left: 1% ;
    margin-right: 25% ;
    font-size: 120% ;
    overflow-x: auto ;
 }
 code {
    background-color: #E0E0E0 ;
    font-size: 120% ;
 }
 a {
    font-weight: bold ;
 }
--- a/docs/vid/15_blink.mp4
+++ b/docs/vid/15_blink.mp4