view MEMIF-wait-states @ 108:e26623146358 default tip

new article DSP-speech-decoder
author Mychaela Falconia <falcon@freecalypso.org>
date Tue, 29 Oct 2024 22:11:41 +0000
parents c01155dec65b

The Calypso chip's MEMIF (ARM memory interface) block has a few configuration
registers; most settings in these registers are quite straightforward, but the
WS setting (number of wait states to be inserted for external memory access)
requires some non-trivial analysis.

Calypso MEMIF timings are described on pages 7 through 11 of this TI document:

ftp://ftp.freecalypso.org/pub/GSM/Calypso/cal000_a.pdf

as well as this more recently discovered newer version:

ftp://ftp.freecalypso.org/pub/GSM/Calypso/cal000_a_v0.8.pdf

When running on a Calypso C035 target, our TCS211 reference fw as well as most
vendor firmwares we've examined run the ARM7 core at its maximum clock frequency
of 52 MHz.  These same firmwares typically configure WS=3 for both flash and
XRAM.  Most Calypso-based phones and modems have flash and RAM chips with 70 ns
access time, and for a long time it seemed that this combination of ARM7 at
52 MHz and WS=3 was OK for 70 ns memories: one ARM7 clock cycle at 52 MHz is
19.23 ns, WS=3 means 4 cycles total per access (it's an N+1 arrangement),
19.23 ns * 4 = 76.92 ns, thus it should be OK for 70 ns memories, right?  Not
so fast: as shown by the formula on cal000_a.pdf page 11 and as can be seen
from the timing diagrams, two other timing parameters (tda and tsu) must also
be subtracted from this total.  The sum of tda+tsu for 2.8V MEMIF as given in
the CAL000/A v0.2 document is 10.5 ns, thus if we run the ARM7 core at 52 MHz
and set WS=3, the available safe window for memory access time is only
76.92 ns - 10.5 ns = 66.42 ns, which is about 3.6 ns short of the 70 ns flash
and RAM access time specs.

The more recently discovered version 0.8 of this same CAL000/A document
indicates that the tables for 2.8V and 1.8V MEMIF were erroneously swapped in
the older version, and the new correct tda+tsu number for 2.8V MEMIF now appears
to be 8.0 ns rather than 10.5 ns.  The available safe window for memory access
time with WS=3 thus becomes 68.92 ns - this new figure is much closer to 70 ns,
but it is still a negative margin, short by 1.08 ns.
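
The arithmetic above can be sketched as a small C helper.  This is only a
simplified model inferred from the numbers in this article (total access
duration of WS+1 clock cycles, minus tda+tsu); it is not a transcription of the
formula in the TI document, and the function and parameter names are made up
for illustration.

```c
#include <assert.h>

/*
 * Simplified model (an assumption based on this article's arithmetic):
 *
 *   window = (WS + 1) * Tclk - (tda + tsu)
 *
 * arm_clk_mhz      ARM7 clock frequency in MHz
 * ws               WS field value (the access takes WS+1 cycles)
 * tda_plus_tsu_ns  combined tda+tsu figure from the datasheet, in ns
 *
 * Returns the time window left for the memory chip itself, in ns.
 */
static double memif_access_window_ns(double arm_clk_mhz, unsigned ws,
                                     double tda_plus_tsu_ns)
{
    assert(arm_clk_mhz > 0.0);
    double cycle_ns = 1000.0 / arm_clk_mhz;   /* one ARM7 clock period */
    return (ws + 1) * cycle_ns - tda_plus_tsu_ns;
}
```

Calling memif_access_window_ns(52.0, 3, 10.5) reproduces the figure of about
66.4 ns derived from the older document, and memif_access_window_ns(52.0, 3,
8.0) reproduces the 68.92 ns figure from version 0.8; both fall below 70 ns.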

TI's reference fw setting of WS=3 in conjunction with ARM7 running at 52 MHz has
made its way into the official firmwares of Openmoko devices and several Compal
phones, including Mot C11x/12x, Mot C139/140 and Sony Ericsson J100.  At least
in the case of Openmoko we know that the hardware features a flash chip with
70 ns access time (the combined flash+RAM chip is K5A3281CTM-D755, with the
suffix meaning 70 ns access time for flash and 55 ns for RAM), and in the case
of Compal phones it is highly unlikely that they used flash chips faster than
70 ns, thus we have strong evidence that the access time spec is being violated
by about 1.1 ns.  It works in practice because the official specs are guaranteed
worst-case numbers and the shortcoming is very small, but it is still wrong in
the strict sense.

We have strong evidence that this WS=3 setting comes from TI's mainline
reference fw, as opposed to being customized by or for Openmoko or Compal.
The evidence is in the following instruction sequence which appears verbatim-
identical across Openmoko's, Mot C11x/12x and C139/140 firmware versions:

      ldr	r1, =0xFFFFFB00
      mov	r0, #0xA3
      strh	r0, [r1, #0]
      strh	r0, [r1, #2]
      mov	r2, #0xA5
      strh	r2, [r1, #4]
      strh	r0, [r1, #6]
      mov	r0, #0x80
      strh	r0, [r1, #0xA]
      mov	r0, #0xC0
      strh	r0, [r1, #0xC]
      mov	r0, #0x40
      strh	r0, [r1, #8]

(The SE J100 version differs only in the nCS2 configuration; apparently this
SE J100 phone has its ringtone melody generator chip hooked up to nCS2, whereas
on both OM's modem and Mot C11x/12x/139/140 this chip select is unused and
unconnected, meaning that its setting is a dummy just like nCS3 and nCS4.)

The above instruction sequence has been reconstructed into the following
sequence of C macro calls:

      MEM_INIT_CS0(3, MEM_DVS_16, MEM_WRITE_EN, 0);
      MEM_INIT_CS1(3, MEM_DVS_16, MEM_WRITE_EN, 0);
      MEM_INIT_CS2(5, MEM_DVS_16, MEM_WRITE_EN, 0);
      MEM_INIT_CS3(3, MEM_DVS_16, MEM_WRITE_EN, 0);
      MEM_INIT_CS4(0, MEM_DVS_8,  MEM_WRITE_EN, 0);

      MEM_INIT_CS6(0, MEM_DVS_32, MEM_WRITE_EN, 0);
      MEM_INIT_CS7(0, MEM_DVS_32, MEM_WRITE_DIS, 0);

(The last two lines setting nCS6 and nCS7 don't need to be considered, as those
are internal to the Calypso chip itself.)

Thus we see that what appears to be TI's mainline code sets WS=3 for both nCS0
and nCS1 (flash and XRAM, respectively), and then sets what appears to be a
dummy config for the unused nCS2, nCS3 and nCS4.  I say "appears to be" because
we have no original source with comments, only a COFF binary object which our
reconstructed recompilable C code has been made to match.
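
To make the correspondence between the register values in the disassembly and
the macro arguments easier to check, here is a minimal sketch of how such a
value could be composed.  The bit layout (WS in the low bits, DVS starting at
bit 5, write enable at bit 7, dummy cycles starting at bit 9) and the numeric
codes for the MEM_DVS_* and MEM_WRITE_* symbols are assumptions inferred from
the values quoted in this article, not copied from TI's header; the function
name memif_cs_value() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED numeric codes, inferred from the register values in the article */
enum { MEM_DVS_8 = 0, MEM_DVS_16 = 1, MEM_DVS_32 = 2 };
enum { MEM_WRITE_DIS = 0, MEM_WRITE_EN = 1 };

/*
 * Compose a 16-bit MEMIF chip select configuration value.
 * ASSUMED field positions: WS at bit 0, DVS at bit 5, WE at bit 7,
 * DC (dummy cycles) at bit 9.
 */
static uint16_t memif_cs_value(unsigned ws, unsigned dvs, unsigned we,
                               unsigned dc)
{
    /* keep each field from spilling into its neighbour or past 16 bits */
    assert(ws < 32 && dvs < 4 && we < 2 && dc < 128);
    return (uint16_t)(ws | (dvs << 5) | (we << 7) | (dc << 9));
}
```

Under these assumptions memif_cs_value(3, MEM_DVS_16, MEM_WRITE_EN, 0) yields
0xA3 and memif_cs_value(5, MEM_DVS_16, MEM_WRITE_EN, 0) yields 0xA5, matching
the constants loaded in the instruction sequence above.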

We may never know the truth unless we miraculously find a surviving copy of the
original (not reconstructed from disassembly) init.c source from TCS211, but my
(Mother Mychaela's) current working hypothesis is that the above MEMIF settings
were originally made for the D-Sample board and never changed for Leonardo.
The D-Sample board has flash on nCS0, main XRAM bank on nCS1, an additional
XRAM bank (typically unused) on nCS2 and peripherals (principally the LCD) on
nCS3.  Furthermore, the original D-Sample boards had Calypso C05 chips populated
on them, and that chip version has no nCS4, only CS4 which is muxed with ADD22
and used for the latter on the D-Sample.

I further hypothesize that the above MEMIF settings were likely cast into code
in the days of Calypso C05, and that the WS=3 setting was computed when the
ARM7 core ran at 39 MHz.  The combination of ARM7 at 39 MHz, WS=3 and the more
generous tda+tsu = 10.5 ns adjustment from the older cal000_a.pdf document
(officially corresponding to Calypso C035 F751774) gives an access time window
of about 92 ns (4 * 25.64 ns - 10.5 ns), which is very sensible.  The hypothesis
further goes that later TI moved to Calypso C035 silicon and started running
the ARM7 core at 52 MHz, but the WS setting was never changed (overlooked), and
the 92 ns access time window turned into a mere 68.92 ns.  The latter works
with 70 ns memories in practice despite being strictly incorrect (negative
margin), and so the error escaped notice.

Solution adopted for FreeCalypso
================================

Pirelli's firmware on the DP-L10 sets WS=4 for both flash and XRAM, and we have
always used the same setting in FreeCalypso when running on this target.  When
we made our FCDEV3B hardware using the same Spansion flash+RAM chip copied from
the Pirelli DP-L10, we adopted the same WS=4 setting for our own FreeCalypso
hardware family on the reasoning that it is needed for this chip.  But now we
have a better theoretical foundation: the flash+RAM chip in question has 70 ns
access time for both flash and pSRAM parts, same as most other flash and RAM
chips used in most Calypso devices, and the WS=4 setting should really be used
for all Calypso C035 targets (ARM7 at 52 MHz) with 70 ns memories.  Thus the
new FreeCalypso strategy is to treat WS=4 as the generic default for Calypso
C035 platforms unless explicitly overridden for specific targets, and to stop
treating TI's reconstructed setup with WS=3 as canonical.
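
The choice of WS=4 can be cross-checked with a small helper that searches for
the smallest WS value meeting a given memory access time.  As before, this is a
sketch built on the simplified model used in this article (WS+1 cycles per
access, minus tda+tsu), and the function name and parameters are illustrative
rather than anything found in TI's code.

```c
#include <assert.h>

/*
 * Return the smallest WS such that
 *
 *   (WS + 1) * Tclk - (tda + tsu) >= mem_access_ns
 *
 * This inequality is the simplified model assumed in this article,
 * not the exact datasheet formula.
 */
static unsigned memif_min_ws(double arm_clk_mhz, double mem_access_ns,
                             double tda_plus_tsu_ns)
{
    assert(arm_clk_mhz > 0.0);
    double cycle_ns = 1000.0 / arm_clk_mhz;
    double needed_ns = mem_access_ns + tda_plus_tsu_ns;
    unsigned cycles = 1;          /* WS=0 still takes one cycle */

    while (cycles * cycle_ns < needed_ns)
        cycles++;
    return cycles - 1;            /* N+1 arrangement: WS = cycles - 1 */
}
```

For a 70 ns memory with the ARM7 core at 52 MHz, this returns 4 with either
tda+tsu figure (8.0 ns or 10.5 ns), while at 39 MHz with tda+tsu = 10.5 ns it
returns 3.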

When running on Openmoko GTA01/02, Mot C11x/12x, Mot C139/140 and SE J100
targets (this specific list), we are going to keep WS=3 for nCS0 and nCS1 and
the dummies for nCS2, nCS3 and nCS4 unchanged for now, i.e., run with exactly
the same MEMIF settings as each manufacturer's respective original official fw.
The reason is political: we are not the product manufacturer of record, and the
error of negative design margin in the memory access timings is the liability
of FIC/Openmoko and Compal/Motorola/SE, not us.  If we change from WS=3 to WS=4
on these targets, our firmware will necessarily run a little slower, and given
that the original official fw "works just fine", we may be accused of needlessly
or artificially slowing down our aftermarket fw.  But when we market our own
handset or modem products under the FreeCalypso trademark, then the full
responsibility for the entire product (hw+fw) falls on us, hence we use the
correct WS=4 setting.

Interim WS setting during boot
==============================

There is one more complication to this picture.  The MEMIF settings discussed
above for the operational phase with Calypso DPLL producing fast clocks are
made in the Init_Target() function, but there is another interim setting
established early on in assembly code, used prior to DPLL enabling, when the
ARM7 core runs at unmultiplied 13 MHz or 26 MHz as fed to the Calypso by the
board.  This interim setting is first set in bootloader.s, then again in int.s
(with the definition residing in the included init.asm file), and the registers
are set to 0x2A1, meaning WS=1 and 1 dummy cycle.

Unlike the situation with the censored init.c source file, we have the original
source for the assembly modules in question, and the only preprocessor
conditionals found therein are based on BOARD and CHIPSET symbols.  Remember
that TI's Leonardo board never got its own BOARD number, instead it shares
BOARD=41 with D-Sample, yet the two boards have different Calypso clock inputs:
13 MHz on the DS, 26 MHz on the Leonardo.  The C code in init.c (this part
survived in the LoCosto source) uses a preprocessor conditional on the RF_FAM
symbol to differentiate between 13 MHz and 26 MHz input clock arrangements, but
there is no conditional of any such sort in the assembly code.  Thus it is my
(Mother Mychaela's) educated guess that the WS=1 setting was chosen assuming a
13 MHz clock, and when Leonardo came along with its 26 MHz clock, the problem
spot was once again overlooked.

WS=1 at 13 MHz is equivalent to WS=7 at 52 MHz, thus there is plenty of margin.
But WS=1 at 26 MHz is equivalent to WS=3 at 52 MHz, once again putting us in
the troubled territory of negative margin with 70 ns flash and RAM chips.
Except that this case is even more difficult for firmware engineers to spot:
Pirelli's fw still has the same 0x2A1 setting in its early boot path, i.e.,
their fw engineers have changed WS=3 to WS=4 for the main body of the fw, but
missed the early boot code.

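The solution adopted for FreeCalypso is to change the early MEMIF setting from
0x2A1 to 0x2A2, i.e., set WS=2 for the interim boot phase.  At 26 MHz this
gives 3 clock cycles or 115.38 ns per access, which leaves a comfortable margin
for 70 ns memories even after tda+tsu is subtracted.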
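
The clock frequency equivalences used in this section follow from the N+1
arrangement: a setting of WS at one clock lasts as long as (WS+1) * f2/f1
cycles at another clock.  A minimal sketch of that conversion, with a
hypothetical function name and restricted to cases where the result is a whole
number of cycles:

```c
#include <assert.h>

/*
 * Given a WS value used at from_mhz, return the WS value at to_mhz
 * that produces the same total access duration.  Only valid when the
 * duration is a whole number of cycles at the target clock.
 */
static unsigned memif_equivalent_ws(unsigned ws, unsigned from_mhz,
                                    unsigned to_mhz)
{
    assert(from_mhz > 0);
    unsigned scaled = (ws + 1) * to_mhz;   /* cycles at target * from_mhz */
    assert(scaled >= from_mhz && scaled % from_mhz == 0);
    return scaled / from_mhz - 1;
}
```

With this helper, WS=1 at 13 MHz maps to WS=7 at 52 MHz, WS=1 at 26 MHz maps
to WS=3 at 52 MHz, and the WS=2 setting at 26 MHz maps to WS=5 at 52 MHz.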