comparison doc/Loadtools-performance @ 618:6824c4d55848

doc/Loadtools-performance: program-m0 slowness documented
author Mychaela Falconia <falcon@freecalypso.org>
date Tue, 25 Feb 2020 18:40:00 +0000
parents 39b74c39d914
children 8c6e7b7e701c
617:97fe41e9242a 618:6824c4d55848
17 run times do depend on the host system and USB-serial adapter or other serial 17 run times do depend on the host system and USB-serial adapter or other serial
18 port hardware - this host system dependency exists because of the way these 18 port hardware - this host system dependency exists because of the way these
19 operations are implemented in our architecture. 19 operations are implemented in our architecture.
20 20
21 Here are some examples of expected flash programming times, all obtained on the 21 Here are some examples of expected flash programming times, all obtained on the
22 Mother's Slackware 14.2 host system: 22 Mother's Slackware 14.2 host system, using the flash program-bin command as
23 opposed to program-m0 or program-srec:
23 24
24 Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware 25 Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware
25 image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s 26 image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s
26 27
27 Flashing the same OM GTA02 modem with the same fw image, using a CP2102 28 Flashing the same OM GTA02 modem with the same fw image, using a CP2102
44 * The time it takes for the bits to be transferred over the serial link; 45 * The time it takes for the bits to be transferred over the serial link;
45 * The time it takes for the flash programming operation to complete on the 46 * The time it takes for the flash programming operation to complete on the
46 target (physics inside the flash chip); 47 target (physics inside the flash chip);
47 * The overhead of command-response exchanges between fc-loadtool and loadagent. 48 * The overhead of command-response exchanges between fc-loadtool and loadagent.
48 49
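The first component in the list above, the serial bit transfer time, can be bounded from below with simple arithmetic. The sketch below is illustrative only and is not part of loadtools. It assumes 8N1 framing (10 bit times per byte) and that image bytes travel raw on the wire. This document states neither, so the real on-wire volume may well be larger. The function name is hypothetical.

```python
def serial_transfer_seconds(num_bytes, baud, bits_per_byte=10):
    """Lower bound on wire time, assuming 8N1 framing (start + 8 data + stop bits)."""
    return num_bytes * bits_per_byte / baud

# GTA02 image from the examples above: 2376448 bytes at 115200 baud.
gta02_seconds = serial_transfer_seconds(2376448, 115200)
print(f"{gta02_seconds:.1f} s")
```

Under these assumptions the bound comes to about 206 s (roughly 3m26s), well short of the measured 7m35s. The rest would have to come from the other two components and from any protocol encoding overhead.
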
replaced:
49 XRAM loading via fc-xram is similar to flash programming in that fc-xram sends
50 a separate ML command to loadagent for each S-record, thus the total XRAM image
51 loading time is not only the serial bit transfer time, but also the overhead of
52 command-response exchanges between fc-xram and loadagent. The flash programming
53 times listed above include flashing an FC Magnetite fw image into an FCDEV3B,
54 which took 2m11s; doing an fc-xram load of the same FC Magnetite fw image (built
55 as ramimage.srec) into the same FCDEV3B via the same FT2232D adapter at 812500
56 baud takes 2m54s.
with:
50 If you are starting out with a firmware image in m0 format, converting it to
51 binary with mokosrec2bin (like our FC Magnetite build system always does) and
52 then flashing via program-bin is faster than flashing the original m0 image
53 directly via program-m0. Following the last example above of flashing a
54 Magnetite hybrid fw image into an FCDEV3B, the flashing operation via
55 program-bin took 2m11s; flashing the same image via program-m0 took 3m54s.
56
57 Flashing via program-bin is faster than program-m0 or program-srec because the
58 program-bin operation uses a larger unit size internally. fc-loadtool
59 implements all flash programming operations by sending AMFW or INFW commands to
60 loadagent; each AMFW or INFW command carries a string of 16-bit words to be
61 programmed. Our program-bin operation programs 256 bytes at a time, i.e.,
62 sends one AMFW or INFW command per 256 bytes of image payload; our program-m0
63 and program-srec operations program one S-record at a time, i.e., each S-record
64 in the source image turns into its own AMFW or INFW command to loadagent. In
65 the case of m0 images produced by TI's hex470 post-linker, each S-record carries
66 30 bytes of payload, thus flashing that m0 image directly with program-m0 will
67 proceed in 30-byte units, whereas converting it to binary and then flashing with
68 program-bin will proceed in 256-byte units. The smaller unit size slows down
69 the overall operation by increasing the overhead of command-response exchanges.
70
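To make the unit-size effect concrete, the following sketch counts how many AMFW or INFW commands each approach would need for an image of a given size. It is a rough illustration, not code from fc-loadtool. The helper name is hypothetical, and it assumes a contiguous image in which every S-record carries a full 30 bytes. The GTA02 image size is used only because it is the one size quoted in this document.

```python
def commands_needed(image_bytes, unit_bytes):
    """Number of programming commands when each carries up to unit_bytes (ceiling division)."""
    return -(-image_bytes // unit_bytes)

IMAGE_BYTES = 2376448  # GTA02 fw image size quoted earlier in this document

bin_cmds = commands_needed(IMAGE_BYTES, 256)  # program-bin: 256-byte units
m0_cmds = commands_needed(IMAGE_BYTES, 30)    # program-m0: one 30-byte S-record per command
print(bin_cmds, m0_cmds)
```

Under these assumptions the counts are 9283 commands for program-bin against 79215 for program-m0, or about 8.5 times as many command-response exchanges for the same payload.
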
71 XRAM loading via fc-xram is similar to flash program-m0 and program-srec in that
72 fc-xram sends a separate ML command to loadagent for each S-record, thus the
73 total XRAM image loading time is not only the serial bit transfer time, but also
74 the overhead of command-response exchanges between fc-xram and loadagent. Going
75 back to the same FC Magnetite fw image that can be flashed into an FCDEV3B in
76 2m11s via program-bin or in 3m54s via program-m0, doing an fc-xram load of that
77 same fw image (built as ramimage.srec) into the same FCDEV3B via the same
78 FT2232D adapter at 812500 baud takes 2m54s - thus we can see that fc-xram
79 loading is faster than flash program-m0 or program-srec, but slower than flash
80 program-bin.
57 81
58 Why does XRAM loading take longer than flashing? Shouldn't it be faster because 82 Why does XRAM loading take longer than flashing? Shouldn't it be faster because
59 the flash programming step on the target is replaced with a simple memcpy()? 83 the flash programming step on the target is replaced with a simple memcpy()?
60 Answer: fc-xram is currently slower than flash program-bin because the latter 84 Answer: fc-xram is currently slower than flash program-bin because the latter
61 sends 256 bytes at a time to loadagent, whereas fc-xram sends one S-record at a 85 sends 256 bytes at a time to loadagent, whereas fc-xram sends one S-record at a