diff MEMIF-wait-states @ 17:3d65bdaf00da
MEMIF-wait-states article written
author | Mychaela Falconia <falcon@freecalypso.org> |
---|---|
date | Sun, 16 Jun 2019 23:30:33 +0000 |
parents | |
children | c01155dec65b |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/MEMIF-wait-states Sun Jun 16 23:30:33 2019 +0000
@@ -0,0 +1,166 @@
+The Calypso chip's MEMIF (ARM memory interface) block has a few configuration
+registers; most settings in these registers are quite straightforward, but the
+WS setting (number of wait states to be inserted for external memory access)
+requires some non-trivial analysis.
+
+Calypso MEMIF timings are described on pages 7 through 11 of this TI document:
+
+ftp://ftp.freecalypso.org/pub/GSM/Calypso/cal000_a.pdf
+
+When running on a Calypso C035 target, our TCS211 reference fw as well as most
+vendor firmwares we've examined run the ARM7 core at its maximum clock frequency
+of 52 MHz. These same firmwares typically configure WS=3 for both flash and
+XRAM. Most Calypso-based phones and modems have flash and RAM chips with 70 ns
+access time, and for a long time it seemed that this combination of ARM7 at
+52 MHz and WS=3 was OK for 70 ns memories: one ARM7 clock cycle at 52 MHz is
+19.23 ns, WS=3 means 4 cycles total per access (it's an N+1 arrangement),
+19.23 ns * 4 = 76.92 ns, thus it should be OK for 70 ns memories, right? Not
+so fast: as shown by the formula on cal000_a.pdf page 11 and as can be seen from
+the timing diagrams, two other timing parameters (tda and tsu) also need to be
+factored in. The sum of tda+tsu for 2.8V MEMIF as given in the only document
+we have available is 10.5 ns, thus if we run the ARM7 core at 52 MHz and set
+WS=3, the available safe window for memory access time is only about 66 ns,
+which is 4 ns short of the 70 ns flash and RAM access time specs.
+
+TI's reference fw setting of WS=3 in conjunction with ARM7 running at 52 MHz has
+made its way into the official firmwares of Openmoko devices and several Compal
+phones, including Mot C11x/12x, Mot C139/140 and Sony Ericsson J100.
+At least in the case of Openmoko we know that the hardware features a flash
+chip with 70 ns access time (the combined flash+RAM chip is K5A3281CTM-D755,
+with the suffix meaning 70 ns access time for flash and 55 ns for RAM), and in
+the case of Compal phones it is highly unlikely that they used flash chips
+faster than 70 ns, thus we have strong evidence that the access time spec is
+being violated by about 4 ns. It works in practice because the official specs
+are guaranteed worst-case numbers, but it is still wrong in the strict sense.
+
+We have strong evidence that this WS=3 setting comes from TI's mainline
+reference fw, as opposed to being customized by or for Openmoko or Compal.
+The evidence is in the following instruction sequence which appears
+verbatim-identical across Openmoko's, Mot C11x/12x and C139/140 firmware
+versions:
+
+        ldr     r1, =0xFFFFFB00
+        mov     r0, #0xA3
+        strh    r0, [r1, #0]
+        strh    r0, [r1, #2]
+        mov     r2, #0xA5
+        strh    r2, [r1, #4]
+        strh    r0, [r1, #6]
+        mov     r0, #0x80
+        strh    r0, [r1, #0xA]
+        mov     r0, #0xC0
+        strh    r0, [r1, #0xC]
+        mov     r0, #0x40
+        strh    r0, [r1, #8]
+
+(The SE J100 version differs only in the nCS2 configuration; apparently this
+SE J100 phone has its ringtone melody generator chip hooked up to nCS2, whereas
+on both OM's modem and Mot C11x/12x/139/140 this chip select is unused and
+unconnected, meaning that its setting is a dummy just like nCS3 and nCS4.)
+
+The above instruction sequence has been reconstructed into the following
+sequence of C macro calls:
+
+        MEM_INIT_CS0(3, MEM_DVS_16, MEM_WRITE_EN, 0);
+        MEM_INIT_CS1(3, MEM_DVS_16, MEM_WRITE_EN, 0);
+        MEM_INIT_CS2(5, MEM_DVS_16, MEM_WRITE_EN, 0);
+        MEM_INIT_CS3(3, MEM_DVS_16, MEM_WRITE_EN, 0);
+        MEM_INIT_CS4(0, MEM_DVS_8, MEM_WRITE_EN, 0);
+
+        MEM_INIT_CS6(0, MEM_DVS_32, MEM_WRITE_EN, 0);
+        MEM_INIT_CS7(0, MEM_DVS_32, MEM_WRITE_DIS, 0);
+
+(The last two lines setting nCS6 and nCS7 don't need to be considered, as those
+are internal to the Calypso chip itself.)
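
The correspondence between the immediate values in the disassembly and the
macro arguments can be checked with a small decoder. What follows is only a
sketch: the field layout (WS in bits 0-4, DVS in bits 5-6, write enable in
bit 7, dummy cycles from bit 9 upward) is an assumption inferred from the
register values and macro arguments quoted in this article, and the names
memif_cs_cfg, memif_decode and memif_dvs_bits are hypothetical, not taken
from TI's headers.

```c
#include <stdint.h>

/* Hypothetical decoded view of one nCS configuration register. */
struct memif_cs_cfg {
    unsigned ws;   /* wait states (an access takes ws + 1 cycles) */
    unsigned dvs;  /* device size code: 0 = 8-bit, 1 = 16-bit, 2 = 32-bit */
    unsigned we;   /* 1 = writes enabled, 0 = writes disabled */
    unsigned dc;   /* dummy cycles */
};

/* Split a 16-bit register value into fields.
 * ASSUMPTION: bit positions are inferred from the values in the article. */
static struct memif_cs_cfg memif_decode(uint16_t reg)
{
    struct memif_cs_cfg c;

    c.ws  = reg & 0x1F;
    c.dvs = (reg >> 5) & 0x3;
    c.we  = (reg >> 7) & 0x1;
    c.dc  = (unsigned)reg >> 9;
    return c;
}

/* Convert the DVS code into a bus width in bits (8, 16 or 32). */
static unsigned memif_dvs_bits(unsigned dvs)
{
    return 8u << dvs;
}
```

Under this layout, 0xA3 decodes to WS=3, 16-bit, write enabled, no dummy
cycles, matching the nCS0, nCS1 and nCS3 macro calls; 0xA5 gives WS=5 for
nCS2; 0x80, 0xC0 and 0x40 decode to the nCS4, nCS6 and nCS7 lines
respectively.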
+
+Thus we see that what appears to be TI's mainline code sets WS=3 for both nCS0
+and nCS1 (flash and XRAM, respectively), and then sets what appears to be a
+dummy config for the unused nCS2, nCS3 and nCS4. I say "appears to be" because
+we have no original source with comments, only a COFF binary object which our
+reconstructed recompilable C code has been made to match.
+
+We may never know the truth unless we miraculously find a surviving copy of the
+original (not reconstructed from disassembly) init.c source from TCS211, but my
+(Mother Mychaela's) current working hypothesis is that the above MEMIF settings
+were originally made for the D-Sample board and never changed for Leonardo.
+The D-Sample board has flash on nCS0, main XRAM bank on nCS1, an additional
+XRAM bank (typically unused) on nCS2 and peripherals (principally the LCD) on
+nCS3. Furthermore, the original D-Sample boards had Calypso C05 chips populated
+on them, and that chip version has no nCS4, only CS4 which is muxed with ADD22
+and used for the latter on the D-Sample.
+
+I further hypothesize that the above MEMIF settings were likely cast into code
+in the days of Calypso C05, and that the WS=3 setting was computed when the
+ARM7 core ran at 39 MHz. The combination of ARM7 at 39 MHz, WS=3 and the same
+tda+tsu = 10.5 ns adjustment from the available cal000_a.pdf document
+(officially corresponding to Calypso C035 F751774) gives an access time of
+92 ns, which is very sensible. The hypothesis further goes that later TI moved
+to Calypso C035 silicon and started running the ARM7 core at 52 MHz, but the WS
+setting was never changed (overlooked), and the 92 ns access time turned into a
+mere 66 ns. The latter works with 70 ns memories in practice despite being
+strictly incorrect (negative margin), and so the error escaped notice.
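
The 92 ns and 66 ns figures can be reproduced with a few lines of C. This is
a minimal sketch that assumes the access window is simply (WS+1) clock
cycles minus the tda+tsu sum of 10.5 ns, which is how the numbers in this
article work out; the exact formula is the one on cal000_a.pdf page 11. The
function name memif_access_window_ns is hypothetical.

```c
/* tda + tsu for the 2.8V MEMIF, as quoted in this article from cal000_a.pdf */
#define MEMIF_TDA_TSU_NS 10.5

/* Memory access time window in ns for a given ARM7 clock and WS setting.
 * ASSUMPTION: window = (WS + 1) * cycle time - (tda + tsu). */
static double memif_access_window_ns(double arm_clk_mhz, unsigned ws)
{
    double cycle_ns = 1000.0 / arm_clk_mhz;   /* one ARM7 clock period */

    return (ws + 1) * cycle_ns - MEMIF_TDA_TSU_NS;
}
```

With these assumptions, memif_access_window_ns(39.0, 3) comes out at about
92.06 ns and memif_access_window_ns(52.0, 3) at about 66.42 ns, i.e., the
same WS=3 setting goes from a comfortable margin to a negative one for 70 ns
memories when only the clock is raised.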
+
+Solution adopted for FreeCalypso
+================================
+
+Pirelli's firmware on the DP-L10 sets WS=4 for both flash and XRAM, and we have
+always used the same setting in FreeCalypso when running on this target. When
+we made our FCDEV3B hardware using the same Spansion flash+RAM chip copied from
+the Pirelli DP-L10, we adopted the same WS=4 setting for our own FreeCalypso
+hardware family on the reasoning that it is needed for this chip. But now we
+have a better theoretical foundation: the flash+RAM chip in question has 70 ns
+access time for both flash and pSRAM parts, same as most other flash and RAM
+chips used in most Calypso devices, and the WS=4 setting should really be used
+for all Calypso C035 targets (ARM7 at 52 MHz) with 70 ns memories. Thus the
+new FreeCalypso strategy is to treat WS=4 as the generic default for Calypso
+C035 platforms unless explicitly overridden for specific targets, and to stop
+treating TI's reconstructed setup with WS=3 as canonical.
+
+When running on Openmoko GTA01/02, Mot C11x/12x, Mot C139/140 and SE J100
+targets (this specific list), we are going to keep WS=3 for nCS0 and nCS1 and
+the dummies for nCS2, nCS3 and nCS4 unchanged for now, i.e., run with exactly
+the same MEMIF settings as each manufacturer's respective original official fw.
+The reason is political: we are not the product manufacturer of record, and the
+error of negative design margin in the memory access timings is the liability
+of FIC/Openmoko and Compal/Motorola/SE, not us. If we change from WS=3 to WS=4
+on these targets, our firmware will necessarily run a little slower, and given
+that the original official fw "works just fine", we may be accused of needlessly
+or artificially slowing down our aftermarket fw. But when we market our own
+handset or modem products under the FreeCalypso trademark, then the full
+responsibility for the entire product (hw+fw) falls on us, hence we use the
+correct WS=4 setting.
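
The choice of WS=4 can also be derived rather than copied: pick the smallest
WS whose access window covers the memory's access time spec. The following
is a sketch under the same assumption used for the numbers in this article,
namely window = (WS+1) cycles minus tda+tsu = 10.5 ns; the helper name
memif_min_ws is hypothetical and is not part of any FreeCalypso or TI code.

```c
#include <assert.h>

/* tda + tsu for the 2.8V MEMIF, as quoted in this article from cal000_a.pdf */
#define MEMIF_TDA_TSU_NS 10.5

/* Smallest WS setting for which
 *   (WS + 1) * cycle time - (tda + tsu) >= memory access time.
 * ASSUMPTION: simplified form of the formula on cal000_a.pdf page 11. */
static unsigned memif_min_ws(double arm_clk_mhz, double mem_access_ns)
{
    double cycle_ns;
    unsigned ws = 0;

    assert(arm_clk_mhz > 0.0);
    cycle_ns = 1000.0 / arm_clk_mhz;
    while ((ws + 1) * cycle_ns - MEMIF_TDA_TSU_NS < mem_access_ns)
        ws++;
    return ws;
}
```

For 70 ns memories this gives WS=4 at 52 MHz and WS=3 at 39 MHz, which is
consistent both with the new FreeCalypso default and with the hypothesis
that TI's WS=3 was originally computed for a 39 MHz ARM7 clock.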
+
+Interim WS setting during boot
+==============================
+
+There is one more complication to this picture. The MEMIF settings discussed
+above for the operational phase with Calypso DPLL producing fast clocks are
+made in the Init_Target() function, but there is another interim setting
+established early on in assembly code, used prior to DPLL enabling, when the
+ARM7 core runs at unmultiplied 13 MHz or 26 MHz as fed to the Calypso by the
+board. This interim setting is first set in bootloader.s, then again in int.s
+(with the definition residing in the included init.asm file), and the registers
+are set to 0x2A1, meaning WS=1 and 1 dummy cycle.
+
+Unlike the situation with the censored init.c source file, we have the original
+source for the assembly modules in question, and the only preprocessor
+conditionals found therein are based on BOARD and CHIPSET symbols. Remember
+that TI's Leonardo board never got its own BOARD number, instead it shares
+BOARD=41 with D-Sample, yet the two boards have different Calypso clock inputs:
+13 MHz on the DS, 26 MHz on the Leonardo. The C code in init.c (this part
+survived in the LoCosto source) uses a preprocessor conditional on the RF_FAM
+symbol to differentiate between 13 MHz and 26 MHz input clock arrangements, but
+there is no conditional of any such sort in the assembly code. Thus it is my
+(Mother Mychaela's) educated guess that the WS=1 setting was chosen assuming a
+13 MHz clock, and when Leonardo came along with its 26 MHz clock, the problem
+spot was once again overlooked.
+
+WS=1 at 13 MHz is equivalent to WS=7 at 52 MHz, thus there is plenty of margin.
+But WS=1 at 26 MHz is equivalent to WS=3 at 52 MHz, once again putting us in
+the troubled territory of negative margin with 70 ns flash and RAM chips.
+Except that this case is even more difficult for firmware engineers to spot:
+Pirelli's fw still has the same 0x2A1 setting in its early boot path, i.e.,
+their fw engineers have changed WS=3 to WS=4 for the main body of the fw, but
+missed the early boot code.
+
+The solution adopted for FreeCalypso is to change the early MEMIF setting from
+0x2A1 to 0x2A2, i.e., set WS=2 for the interim boot phase.
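
To see that 0x2A1 and 0x2A2 differ only in the WS field, the register value
can be composed from its fields. This is a sketch only: the bit positions
(WS in bits 0-4, DVS in bits 5-6, write enable in bit 7, dummy cycles from
bit 9) are an assumption inferred from the values quoted in this article,
and memif_encode is a hypothetical helper, not one of TI's MEM_INIT_CSx
macros.

```c
#include <assert.h>
#include <stdint.h>

/* Compose an nCS configuration register value from its fields.
 * ASSUMPTION: field positions are inferred from the article's values;
 * dvs is a size code (0 = 8-bit, 1 = 16-bit, 2 = 32-bit). */
static uint16_t memif_encode(unsigned ws, unsigned dvs, unsigned we,
                             unsigned dc)
{
    assert(ws <= 0x1F && dvs <= 0x3 && we <= 0x1);
    return (uint16_t)(ws | (dvs << 5) | (we << 7) | (dc << 9));
}
```

Under this layout memif_encode(1, 1, 1, 1) yields 0x2A1 and
memif_encode(2, 1, 1, 1) yields 0x2A2. Taking the access window as (WS+1)
cycles minus tda+tsu = 10.5 ns, WS=2 at 26 MHz gives roughly 104.9 ns, as
opposed to about 66.4 ns with WS=1, so 70 ns memories are covered with
positive margin during the interim boot phase.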