comparison doc/Loadtools-performance @ 618:6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Tue, 25 Feb 2020 18:40:00 +0000 |
parents | 39b74c39d914 |
children | 8c6e7b7e701c |
617:97fe41e9242a | 618:6824c4d55848 |
---|---|
17 run times do depend on the host system and USB-serial adapter or other serial | 17 run times do depend on the host system and USB-serial adapter or other serial |
18 port hardware - this host system dependency exists because of the way these | 18 port hardware - this host system dependency exists because of the way these |
19 operations are implemented in our architecture. | 19 operations are implemented in our architecture. |
20 | 20 |
21 Here are some examples of expected flash programming times, all obtained on the | 21 Here are some examples of expected flash programming times, all obtained on the |
22 Mother's Slackware 14.2 host system: | 22 Mother's Slackware 14.2 host system, using the flash program-bin command as |
| | 23 opposed to program-m0 or program-srec: |
23 | 24 |
24 Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware | 25 Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware |
25 image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s | 26 image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s |
26 | 27 |
27 Flashing the same OM GTA02 modem with the same fw image, using a CP2102 | 28 Flashing the same OM GTA02 modem with the same fw image, using a CP2102 |
44 * The time it takes for the bits to be transferred over the serial link; | 45 * The time it takes for the bits to be transferred over the serial link; |
45 * The time it takes for the flash programming operation to complete on the | 46 * The time it takes for the flash programming operation to complete on the |
46 target (physics inside the flash chip); | 47 target (physics inside the flash chip); |
47 * The overhead of command-response exchanges between fc-loadtool and loadagent. | 48 * The overhead of command-response exchanges between fc-loadtool and loadagent. |
48 | 49 |
49 XRAM loading via fc-xram is similar to flash programming in that fc-xram sends | 50 If you are starting out with a firmware image in m0 format, converting it to |
50 a separate ML command to loadagent for each S-record, thus the total XRAM image | 51 binary with mokosrec2bin (like our FC Magnetite build system always does) and |
51 loading time is not only the serial bit transfer time, but also the overhead of | 52 then flashing via program-bin is faster than flashing the original m0 image |
52 command-response exchanges between fc-xram and loadagent. The flash programming | 53 directly via program-m0. Following the last example above of flashing a |
53 times listed above include flashing an FC Magnetite fw image into an FCDEV3B, | 54 Magnetite hybrid fw image into an FCDEV3B, the flashing operation via |
54 which took 2m11s; doing an fc-xram load of the same FC Magnetite fw image (built | 55 program-bin took 2m11s; flashing the same image via program-m0 took 3m54s. |
55 as ramimage.srec) into the same FCDEV3B via the same FT2232D adapter at 812500 | 56 |
56 baud takes 2m54s. | 57 Flashing via program-bin is faster than program-m0 or program-srec because the |
| | 58 program-bin operation uses a larger unit size internally. fc-loadtool |
| | 59 implements all flash programming operations by sending AMFW or INFW commands to |
| | 60 loadagent; each AMFW or INFW command carries a string of 16-bit words to be |
| | 61 programmed. Our program-bin operation programs 256 bytes at a time, i.e., |
| | 62 sends one AMFW or INFW command per 256 bytes of image payload; our program-m0 |
| | 63 and program-srec operations program one S-record at a time, i.e., each S-record |
| | 64 in the source image turns into its own AMFW or INFW command to loadagent. In |
| | 65 the case of m0 images produced by TI's hex470 post-linker, each S-record carries |
| | 66 30 bytes of payload, thus flashing that m0 image directly with program-m0 will |
| | 67 proceed in 30-byte units, whereas converting it to binary and then flashing with |
| | 68 program-bin will proceed in 256-byte units. The smaller unit size slows down |
| | 69 the overall operation by increasing the overhead of command-response exchanges. |
| | 70 |
| | 71 XRAM loading via fc-xram is similar to flash program-m0 and program-srec in that |
| | 72 fc-xram sends a separate ML command to loadagent for each S-record, thus the |
| | 73 total XRAM image loading time is not only the serial bit transfer time, but also |
| | 74 the overhead of command-response exchanges between fc-xram and loadagent. Going |
| | 75 back to the same FC Magnetite fw image that can be flashed into an FCDEV3B in |
| | 76 2m11s via program-bin or in 3m54s via program-m0, doing an fc-xram load of that |
| | 77 same fw image (built as ramimage.srec) into the same FCDEV3B via the same |
| | 78 FT2232D adapter at 812500 baud takes 2m54s - thus we can see that fc-xram |
| | 79 loading is faster than flash program-m0 or program-srec, but slower than flash |
| | 80 program-bin. |
57 | 81 |
58 Why does XRAM loading take longer than flashing? Shouldn't it be faster because | 82 Why does XRAM loading take longer than flashing? Shouldn't it be faster because |
59 the flash programming step on the target is replaced with a simple memcpy()? | 83 the flash programming step on the target is replaced with a simple memcpy()? |
60 Answer: fc-xram is currently slower than flash program-bin because the latter | 84 Answer: fc-xram is currently slower than flash program-bin because the latter |
61 sends 256 bytes at a time to loadagent, whereas fc-xram sends one S-record at a | 85 sends 256 bytes at a time to loadagent, whereas fc-xram sends one S-record at a |