annotate doc/Loadtools-performance @ 619:f82551c77e58

libserial-newlnx: ASYNC_LOW_LATENCY patch reverted

Reports from Das Signal indicate that loadtools performance on Debian is about the same as on Slackware, and that including or omitting the ASYNC_LOW_LATENCY patch from Serg makes no difference. Because the patch in question does not appear to be necessary, it is being reverted until and unless someone other than Serg reports an actual real-world system on which loadtools operation times are slowed compared to the Mother's Slackware reference and on which Slackware-like performance can be restored by setting the ASYNC_LOW_LATENCY flag.
author Mychaela Falconia <falcon@freecalypso.org>
date Thu, 27 Feb 2020 01:09:48 +0000
parents 6824c4d55848
children 8c6e7b7e701c
611
c847d742ab38 doc/Loadtools-performance: article started
Mychaela Falconia <falcon@freecalypso.org>
parents:
diff changeset
Here are the expected run times for the flash dump2bin operation of dumping the
entire flash content of a Calypso GSM device:

Dump of 4 MiB flash (e.g., Openmoko GTA01/02 or Mot C139/140) at 115200 baud:
12m53s

The same 4 MiB flash dump at 812500 baud: 1m50s

Dump of 8 MiB flash (e.g., Mot C155/156) at 812500 baud: 3m40s

Because of the architecture of fc-loadtool and its loadagent back-end, the run
time of a flash dump operation depends only on the serial baud rate and the
size of the flash area to be dumped; it should not depend on the USB-serial
adapter type or any host system properties, as long as the host system and
serial adapter combination supports the desired baud rate.  In contrast, flash
programming and fc-xram loading operations are quite different in that their
run times do depend on the host system and USB-serial adapter or other serial
port hardware - this host system dependency exists because of the way these
operations are implemented in our architecture.

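The claim that dump time depends only on baud rate and flash size can be
checked against the three dump figures quoted earlier.  The following is only
a sketch: it assumes 8N1 framing (10 bit times per serial character), and the
names MEASURED, char_times_per_byte and predict_seconds are made up for this
illustration, not taken from loadtools.

```python
MIB = 1024 * 1024

# (flash size in bytes, baud rate, measured run time in seconds),
# copied from the dump figures quoted in this article
MEASURED = [
    (4 * MIB, 115200, 12 * 60 + 53),
    (4 * MIB, 812500, 1 * 60 + 50),
    (8 * MIB, 812500, 3 * 60 + 40),
]

BITS_PER_CHAR = 10  # assumption: 8N1 framing (start + 8 data + stop)


def char_times_per_byte(size, baud, seconds):
    """Serial character times that elapse per dumped flash byte."""
    return (baud / BITS_PER_CHAR) * seconds / size


def predict_seconds(size, baud, ref=MEASURED[0]):
    """Scale a reference run linearly with size and inversely with baud."""
    ref_size, ref_baud, ref_seconds = ref
    return ref_seconds * (size / ref_size) * (ref_baud / baud)


for size, baud, seconds in MEASURED:
    print(f"{size // MIB} MiB @ {baud}: measured {seconds}s, "
          f"predicted {predict_seconds(size, baud):.1f}s, "
          f"{char_times_per_byte(size, baud, seconds):.2f} char times per byte")
```

Under these assumptions, both 812500 baud figures come out within about a
second of the values scaled from the 115200 baud run, and all three runs spend
a little over 2 character times per dumped byte.
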
Here are some examples of expected flash programming times, all obtained on the
Mother's Slackware 14.2 host system, using the flash program-bin command as
opposed to program-m0 or program-srec:

Flashing an Openmoko GTA02 modem (K5A3281CTM flash chip) with a new firmware
image (2376448 bytes), using a PL2303 USB-serial cable at 115200 baud: 7m35s

Flashing the same OM GTA02 modem with the same fw image, using a CP2102
USB-serial cable at 812500 baud: 1m52s

Flashing a Magnetite hybrid fw image (2378084 bytes) into an FCDEV3B board
(S71PL129N flash chip) via an FT2232D adapter at 812500 baud: 2m11s

These times are just for the flash program-bin operation, not counting the
flash erase which must be done first.  Flash erase times are determined
entirely by physical processes inside the flash chip and are not affected by
software design or the serial link: for each sector to be erased, fc-loadtool
issues the sector erase command to the flash chip and then polls the chip for
operation completion status; the polling is done over the serial link and thus
may seem very slow, but the extra bit of latency added by the finite polling
speed is still negligible compared to the time of the actual sector erase
operation inside the flash chip.  In contrast, the execution time of a flash
program-bin operation is a sum of 3 components:

* The time it takes for the bits to be transferred over the serial link;
* The time it takes for the flash programming operation to complete on the
  target (physics inside the flash chip);
* The overhead of command-response exchanges between fc-loadtool and loadagent.

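The first of these three components can be bounded from below with simple
arithmetic.  The sketch below assumes 8N1 framing and, as a deliberately
optimistic floor, one serial character per image byte; how the image data is
actually encoded on the wire is not described here, so the real transfer time
is presumably larger.  The function and variable names are hypothetical.

```python
# (label, image size in bytes, baud rate, measured program-bin seconds),
# copied from the flash programming examples quoted in this article
EXAMPLES = [
    ("GTA02 via PL2303 @ 115200", 2376448, 115200, 7 * 60 + 35),
    ("GTA02 via CP2102 @ 812500", 2376448, 812500, 1 * 60 + 52),
    ("FCDEV3B via FT2232D @ 812500", 2378084, 812500, 2 * 60 + 11),
]


def serial_floor_seconds(image_bytes, baud, chars_per_byte=1.0, bits_per_char=10):
    """Lower bound on pure bit-transfer time.

    chars_per_byte=1.0 and bits_per_char=10 (8N1) are assumptions made for
    this sketch, not facts about the loadagent protocol.
    """
    return image_bytes * chars_per_byte * bits_per_char / baud


for label, size, baud, measured in EXAMPLES:
    floor = serial_floor_seconds(size, baud)
    print(f"{label}: measured {measured}s, serial floor {floor:.1f}s, "
          f"remainder {measured - floor:.1f}s")
```

Between the first two examples the baud rate goes up by a factor of about 7,
but the measured time drops only by a factor of about 4, which is what one
would expect if the other two components do not shrink with the baud rate.
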
If you are starting out with a firmware image in m0 format, converting it to
binary with mokosrec2bin (like our FC Magnetite build system always does) and
then flashing via program-bin is faster than flashing the original m0 image
directly via program-m0.  Following the last example above of flashing a
Magnetite hybrid fw image into an FCDEV3B, the flashing operation via
program-bin took 2m11s; flashing the same image via program-m0 took 3m54s.

Flashing via program-bin is faster than program-m0 or program-srec because the
program-bin operation uses a larger unit size internally.  fc-loadtool
implements all flash programming operations by sending AMFW or INFW commands to
loadagent; each AMFW or INFW command carries a string of 16-bit words to be
programmed.  Our program-bin operation programs 256 bytes at a time, i.e.,
sends one AMFW or INFW command per 256 bytes of image payload; our program-m0
and program-srec operations program one S-record at a time, i.e., each S-record
in the source image turns into its own AMFW or INFW command to loadagent.  In
the case of m0 images produced by TI's hex470 post-linker, each S-record carries
30 bytes of payload, thus flashing that m0 image directly with program-m0 will
proceed in 30-byte units, whereas converting it to binary and then flashing with
program-bin will proceed in 256-byte units.  The smaller unit size slows down
the overall operation by increasing the overhead of command-response exchanges.

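To put rough numbers on the unit size effect, one can count how many commands
each method needs for the Magnetite image from the example above.  This is a
back-of-the-envelope sketch under loud assumptions: the m0 image is taken to
carry the same number of payload bytes as the binary, every S-record is taken
to be full, and the whole time difference is attributed to per-command
overhead.  The names used are invented for the illustration.

```python
IMAGE_BYTES = 2378084    # Magnetite hybrid image size quoted above
BIN_UNIT = 256           # bytes per command with program-bin
SREC_UNIT = 30           # payload bytes per S-record from hex470

T_BIN = 2 * 60 + 11      # program-bin run time in seconds (2m11s)
T_M0 = 3 * 60 + 54       # program-m0 run time in seconds (3m54s)


def commands_needed(image_bytes, unit):
    """Number of commands if every command carries `unit` bytes (ceiling)."""
    return (image_bytes + unit - 1) // unit


bin_cmds = commands_needed(IMAGE_BYTES, BIN_UNIT)
m0_cmds = commands_needed(IMAGE_BYTES, SREC_UNIT)

# Assumption: the extra time is entirely per-command exchange overhead.
extra_ms_per_cmd = (T_M0 - T_BIN) / (m0_cmds - bin_cmds) * 1000

print(f"program-bin: {bin_cmds} commands")
print(f"program-m0:  {m0_cmds} commands")
print(f"implied cost per additional command: {extra_ms_per_cmd:.2f} ms")
```

With these assumptions program-bin needs 9290 commands and program-m0 needs
79270, and the 103 extra seconds work out to roughly 1.5 ms per additional
command.
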
XRAM loading via fc-xram is similar to flash program-m0 and program-srec in that
fc-xram sends a separate ML command to loadagent for each S-record, thus the
total XRAM image loading time is not only the serial bit transfer time, but also
the overhead of command-response exchanges between fc-xram and loadagent.  Going
back to the same FC Magnetite fw image that can be flashed into an FCDEV3B in
2m11s via program-bin or in 3m54s via program-m0, doing an fc-xram load of that
same fw image (built as ramimage.srec) into the same FCDEV3B via the same
FT2232D adapter at 812500 baud takes 2m54s - thus we can see that fc-xram
loading is faster than flash program-m0 or program-srec, but slower than flash
program-bin.

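For readers who want to compare the three figures side by side, here is a small
sketch that parses the XmYYs notation used throughout this article and ranks
the operations relative to program-bin.  The helper name parse_duration is
hypothetical; the times are the ones quoted above for the FCDEV3B with the
FT2232D adapter at 812500 baud.

```python
import re


def parse_duration(text):
    """Convert a string like '2m11s' into a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"not in XmYYs form: {text!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


# Same Magnetite image, same FCDEV3B, same FT2232D adapter at 812500 baud
TIMES = {
    "flash program-bin": "2m11s",
    "fc-xram": "2m54s",
    "flash program-m0": "3m54s",
}

baseline = parse_duration(TIMES["flash program-bin"])
ranked = sorted(TIMES, key=lambda op: parse_duration(TIMES[op]))

for op in ranked:
    seconds = parse_duration(TIMES[op])
    print(f"{op:18s} {seconds:4d}s  {seconds / baseline:.2f}x program-bin")
```
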
Why does XRAM loading take longer than flashing?  Shouldn't it be faster because
the flash programming step on the target is replaced with a simple memcpy()?
Answer: fc-xram is currently slower than flash program-bin because the latter
sends 256 bytes at a time to loadagent, whereas fc-xram sends one S-record at a
time; the division of the image into S-records is determined by the tool that
generates the SREC image, but TI's hex470 post-linker generates images with 30
bytes of payload per S-record.  Having the operation proceed in smaller chunks
increases the overhead of command-response exchanges and thus increases the
overall time.