FreeCalypso > hg > freecalypso-tools
annotate doc/Loadtools-performance @ 619:f82551c77e58
libserial-newlnx: ASYNC_LOW_LATENCY patch reverted
Reports from Das Signal indicate that loadtools performance on Debian
is about the same as on Slackware, and that including or omitting the
ASYNC_LOW_LATENCY patch from Serg makes no difference. Because the
patch in question does not appear to be necessary, it is being reverted
until and unless someone other than Serg reports an actual real-world
system on which loadtools operation times are slowed compared to the
Mother's Slackware reference and on which Slackware-like performance
can be restored by setting the ASYNC_LOW_LATENCY flag.
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Thu, 27 Feb 2020 01:09:48 +0000 |
parents | 6824c4d55848 |
children | 8c6e7b7e701c |
rev | line source |
---|---|
611
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
1 Here are the expected run times for the flash dump2bin operation of dumping the |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
2 entire flash content of a Calypso GSM device: |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
3 |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
4 Dump of 4 MiB flash (e.g., Openmoko GTA01/02 or Mot C139/140) at 115200 baud: |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
5 12m53s |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
6 |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
7 The same 4 MiB flash dump at 812500 baud: 1m50s |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
8 |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
9 Dump of 8 MiB flash (e.g., Mot C155/156) at 812500 baud: 3m40s |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
10 |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
11 Because of the architecture of fc-loadtool and its loadagent back-end, the run |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
12 time of a flash dump operation depends only on the serial baud rate and the |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
13 size of the flash area to be dumped; it should not depend on the USB-serial |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
14 adapter type or any host system properties, as long as the host system and |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
15 serial adapter combination supports the desired baud rate. In contrast, flash |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
16 programming and fc-xram loading operations are quite different in that their |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
17 run times do depend on the host system and USB-serial adapter or other serial |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
18 port hardware - this host system dependency exists because of the way these |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
19 operations are implemented in our architecture. |
c847d742ab38
doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff
changeset
|
20 |
615
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
21 Here are some examples of expected flash programming times, all obtained on the |
618
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
22 Mother's Slackware 14.2 host system, using the flash program-bin command as |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
23 opposed to program-m0 or program-srec: |
615
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
24 |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
25 Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
26 image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
27 |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
28 Flashing the same OM GTA02 modem with the same fw image, using a CP2102 |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
29 USB-serial cable at 812500 baud: 1m52s |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
30 |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
31 Flashing a Magnetite hybrid fw image (2378084 bytes) into an FCDEV3B board |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
32 (S71PL129N flash chip) via an FT2232D adapter at 812500 baud: 2m11s |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
33 |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
34 These times are just for the flash program-bin operation, not counting the |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
35 flash erase which must be done first. Flash erase times are determined |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
36 entirely by physical processes inside the flash chip and are not affected by |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
37 software design or the serial link: for each sector to be erased, fc-loadtool |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
38 issues the sector erase command to the flash chip and then polls the chip for |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
39 operation completion status; the polling is done over the serial link and thus |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
40 may seem very slow, but the extra bit of latency added by the finite polling |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
41 speed is still negligible compared to the time of the actual sector erase |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
42 operation inside the flash chip. In contrast, the execution time of a flash |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
43 program-bin operation is a sum of 3 components: |
613
ab4021fb1c66
doc/Loadtools-performance: flash programming added
Mychaela Falconia <falcon@freecalypso.org>
parents:
611
diff
changeset
|
44 |
ab4021fb1c66
doc/Loadtools-performance: flash programming added
Mychaela Falconia <falcon@freecalypso.org>
parents:
611
diff
changeset
|
45 * The time it takes for the bits to be transferred over the serial link; |
ab4021fb1c66
doc/Loadtools-performance: flash programming added
Mychaela Falconia <falcon@freecalypso.org>
parents:
611
diff
changeset
|
46 * The time it takes for the flash programming operation to complete on the |
ab4021fb1c66
doc/Loadtools-performance: flash programming added
Mychaela Falconia <falcon@freecalypso.org>
parents:
611
diff
changeset
|
47 target (physics inside the flash chip); |
ab4021fb1c66
doc/Loadtools-performance: flash programming added
Mychaela Falconia <falcon@freecalypso.org>
parents:
611
diff
changeset
|
48 * The overhead of command-response exchanges between fc-loadtool and loadagent. |
ab4021fb1c66
doc/Loadtools-performance: flash programming added
Mychaela Falconia <falcon@freecalypso.org>
parents:
611
diff
changeset
|
49 |
618
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
50 If you are starting out with a firmware image in m0 format, converting it to |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
51 binary with mokosrec2bin (like our FC Magnetite build system always does) and |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
52 then flashing via program-bin is faster than flashing the original m0 image |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
53 directly via program-m0. Following the last example above of flashing a |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
54 Magnetite hybrid fw image into an FCDEV3B, the flashing operation via |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
55 program-bin took 2m11s; flashing the same image via program-m0 took 3m54s. |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
56 |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
57 Flashing via program-bin is faster than program-m0 or program-srec because the |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
58 program-bin operation uses a larger unit size internally. fc-loadtool |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
59 implements all flash programming operations by sending AMFW or INFW commands to |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
60 loadagent; each AMFW or INFW command carries a string of 16-bit words to be |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
61 programmed. Our program-bin operation programs 256 bytes at a time, i.e., |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
62 sends one AMFW or INFW command per 256 bytes of image payload; our program-m0 |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
63 and program-srec operations program one S-record at a time, i.e., each S-record |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
64 in the source image turns into its own AMFW or INFW command to loadagent. In |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
65 the case of m0 images produced by TI's hex470 post-linker, each S-record carries |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
66 30 bytes of payload, thus flashing that m0 image directly with program-m0 will |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
67 proceed in 30-byte units, whereas converting it to binary and then flashing with |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
68 program-bin will proceed in 256-byte units. The smaller unit size slows down |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
69 the overall operation by increasing the overhead of command-response exchanges. |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
70 |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
71 XRAM loading via fc-xram is similar to flash program-m0 and program-srec in that |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
72 fc-xram sends a separate ML command to loadagent for each S-record, thus the |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
73 total XRAM image loading time is not only the serial bit transfer time, but also |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
74 the overhead of command-response exchanges between fc-xram and loadagent. Going |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
75 back to the same FC Magnetite fw image that can be flashed into an FCDEV3B in |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
76 2m11s via program-bin or in 3m54s via program-m0, doing an fc-xram load of that |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
77 same fw image (built as ramimage.srec) into the same FCDEV3B via the same |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
78 FT2232D adapter at 812500 baud takes 2m54s - thus we can see that fc-xram |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
79 loading is faster than flash program-m0 or program-srec, but slower than flash |
6824c4d55848
doc/Loadtools-performance: program-m0 slowness documented
Mychaela Falconia <falcon@freecalypso.org>
parents:
615
diff
changeset
|
80 program-bin. |
615
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
81 |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
82 Why does XRAM loading take longer than flashing? Shouldn't it be faster because |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
83 the flash programming step on the target is replaced with a simple memcpy()? |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
84 Answer: fc-xram is currently slower than flash program-bin because the latter |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
85 sends 256 bytes at a time to loadagent, whereas fc-xram sends one S-record at a |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
86 time; the division of the image into S-records is determined by the tool that |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
87 generates the SREC image, but TI's hex470 post-linker generates images with 30 |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
88 bytes of payload per S-record. Having the operation proceed in smaller chunks |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
89 increases the overhead of command-response exchanges and thus increases the |
39b74c39d914
doc/Loadtools-performance: complete for now
Mychaela Falconia <falcon@freecalypso.org>
parents:
613
diff
changeset
|
90 overall time. |