comparison doc/Loadtools-performance @ 615:39b74c39d914

doc/Loadtools-performance: complete for now
author Mychaela Falconia <falcon@freecalypso.org>
date Tue, 25 Feb 2020 01:33:23 +0000
parents ab4021fb1c66
children 6824c4d55848
614:02bdb2f366bc 615:39b74c39d914
16 programming and fc-xram loading operations are quite different in that their 16 programming and fc-xram loading operations are quite different in that their
17 run times do depend on the host system and USB-serial adapter or other serial 17 run times do depend on the host system and USB-serial adapter or other serial
18 port hardware - this host system dependency exists because of the way these 18 port hardware - this host system dependency exists because of the way these
19 operations are implemented in our architecture. 19 operations are implemented in our architecture.
20 20
replaced (old lines 21-33):
21 Here is one example of expected flash programming time: flashing a FreeCalypso
22 Magnetite hybrid fw image (2378084 bytes) into an FCDEV3B board (S71PL129N
23 flash chip) via an FT2232D adapter at 812500 baud takes 2m11s on the Mother's
24 Slackware 14.2 system. This time is just for the flash program-bin operation,
25 not counting the flash erase which must be done first. Flash erase times are
26 determined entirely by physical processes inside the flash chip and are not
27 affected by software design or the serial link: for each sector to be erased,
28 fc-loadtool issues the sector erase command to the flash chip and then polls
29 the chip for operation completion status; the polling is done over the serial
30 link and thus may seem very slow, but the extra bit of latency added by the
31 finite polling speed is still negligible compared to the time of the actual
32 sector erase operation inside the flash chip. In contrast, the execution time
33 of a flash program-bin operation is a sum of 3 components:
with (new lines 21-42):
21 Here are some examples of expected flash programming times, all obtained on the
22 Mother's Slackware 14.2 host system:
23
24 Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware
25 image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s
26
27 Flashing the same OM GTA02 modem with the same fw image, using a CP2102
28 USB-serial cable at 812500 baud: 1m52s
29
30 Flashing a Magnetite hybrid fw image (2378084 bytes) into an FCDEV3B board
31 (S71PL129N flash chip) via an FT2232D adapter at 812500 baud: 2m11s
32
33 These times are just for the flash program-bin operation, not counting the
34 flash erase which must be done first. Flash erase times are determined
35 entirely by physical processes inside the flash chip and are not affected by
36 software design or the serial link: for each sector to be erased, fc-loadtool
37 issues the sector erase command to the flash chip and then polls the chip for
38 operation completion status; the polling is done over the serial link and thus
39 may seem very slow, but the extra bit of latency added by the finite polling
40 speed is still negligible compared to the time of the actual sector erase
41 operation inside the flash chip. In contrast, the execution time of a flash
42 program-bin operation is a sum of 3 components:
34 43
35 * The time it takes for the bits to be transferred over the serial link; 44 * The time it takes for the bits to be transferred over the serial link;
36 * The time it takes for the flash programming operation to complete on the 45 * The time it takes for the flash programming operation to complete on the
37 target (physics inside the flash chip); 46 target (physics inside the flash chip);
38 * The overhead of command-response exchanges between fc-loadtool and loadagent. 47 * The overhead of command-response exchanges between fc-loadtool and loadagent.
39 48
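To get a feel for how the first of the three components compares with the measured totals, here is a rough sketch (not part of loadtools) that computes a lower bound on the serial bit transfer time for the three flashing examples quoted in the new text. It assumes 8N1 framing (10 bit times per byte) and that every image byte crosses the wire exactly once as a raw byte; any protocol encoding used between fc-loadtool and loadagent is not modeled, so the real wire time can only be longer. The function and variable names are made up for this illustration.

```python
def serial_transfer_seconds(num_bytes, baud, bits_per_byte=10):
    """Lower bound on wire time; bits_per_byte=10 assumes 8N1 framing."""
    return num_bytes * bits_per_byte / baud


def mmss_to_seconds(minutes, seconds):
    """Convert a time written as XmYYs into seconds."""
    return minutes * 60 + seconds


# (label, image size in bytes, baud rate, measured program-bin time in seconds),
# all taken from the flashing examples in the document.
EXAMPLES = [
    ("GTA02 via PL2303", 2376448, 115200, mmss_to_seconds(7, 35)),
    ("GTA02 via CP2102", 2376448, 812500, mmss_to_seconds(1, 52)),
    ("FCDEV3B via FT2232D", 2378084, 812500, mmss_to_seconds(2, 11)),
]


def breakdown(examples=EXAMPLES):
    """Return (label, wire-time lower bound, remaining time) for each example."""
    rows = []
    for label, size, baud, measured in examples:
        wire = serial_transfer_seconds(size, baud)
        # Whatever is left over is flash programming time on the target plus
        # command-response overhead (plus any encoding overhead not modeled here).
        rows.append((label, wire, measured - wire))
    return rows


if __name__ == "__main__":
    for label, wire, rest in breakdown():
        print(f"{label}: wire >= {wire:.1f} s, everything else <= {rest:.1f} s")
```

Under these assumptions the raw bit transfer accounts for well under half of the measured time in every example, which is consistent with the other two components being significant.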
replaced:
40 [To be continued]
with:
49 XRAM loading via fc-xram is similar to flash programming in that fc-xram sends
50 a separate ML command to loadagent for each S-record, thus the total XRAM image
51 loading time is not only the serial bit transfer time, but also the overhead of
52 command-response exchanges between fc-xram and loadagent. The flash programming
53 times listed above include flashing an FC Magnetite fw image into an FCDEV3B,
54 which took 2m11s; doing an fc-xram load of the same FC Magnetite fw image (built
55 as ramimage.srec) into the same FCDEV3B via the same FT2232D adapter at 812500
56 baud takes 2m54s.
57
58 Why does XRAM loading take longer than flashing? Shouldn't it be faster because
59 the flash programming step on the target is replaced with a simple memcpy()?
60 Answer: fc-xram is currently slower than flash program-bin because the latter
61 sends 256 bytes at a time to loadagent, whereas fc-xram sends one S-record at a
62 time; the division of the image into S-records is determined by the tool that
63 generates the SREC image, but TI's hex470 post-linker generates images with 30
64 bytes of payload per S-record. Having the operation proceed in smaller chunks
65 increases the overhead of command-response exchanges and thus increases the
66 overall time.
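The effect of chunk size can be illustrated by counting command-response exchanges. The sketch below is only an approximation: it assumes the ramimage.srec payload is the same size as the 2378084-byte flash image (the actual RAM image size is not given here) and that each exchange carries a full chunk except possibly the last. The function name is made up for this illustration.

```python
def exchanges(image_bytes, chunk_bytes):
    """Number of command-response exchanges needed to move image_bytes
    when each command carries at most chunk_bytes of payload."""
    return (image_bytes + chunk_bytes - 1) // chunk_bytes  # integer ceiling


IMAGE_BYTES = 2378084      # Magnetite hybrid fw image size quoted above
FLASH_CHUNK = 256          # bytes per command for flash program-bin
SREC_CHUNK = 30            # payload bytes per S-record from hex470

if __name__ == "__main__":
    flash_count = exchanges(IMAGE_BYTES, FLASH_CHUNK)
    xram_count = exchanges(IMAGE_BYTES, SREC_CHUNK)
    print("flash program-bin exchanges:", flash_count)
    print("fc-xram exchanges:          ", xram_count)
    print(f"ratio: {xram_count / flash_count:.1f}x")
```

With roughly eight to nine times as many exchanges, even a small fixed cost per exchange adds up, which fits the observation that fc-xram is slower despite skipping the flash programming step.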