view CMU200-maintenance-notes @ 101:916488f7a8e0

Linux-DTR-RTS-flaw: link to current location of patches
author Mychaela Falconia <falcon@freecalypso.org>
date Mon, 11 Sep 2023 06:26:36 +0000
parents 2ac06a49dfbc

The Rohde & Schwarz CMU200 instrument is an absolutely essential piece of test
equipment for anyone in the business (or hobby) of designing and building his
or her own personal cellphones of 2G and/or 3G variety.  I (Mother Mychaela)
currently only work with GSM, but depending on installed hw and sw options,
CMU200 instruments also support AMPS, IS-136, IS-95 (CDMA 2G) and both WCDMA
and CDMA2000 varieties of 3G.

Having owned and maintained a CMU200 instrument since 2017, having had to repair
it twice (as of 2022-01), and having conversed with another CMU200 owner who had
to repair his instrument in the same way, I have observed a pattern: many of
these instruments are now failing in the field in exactly the same ways.  All of
these failures happen in the RXTX board, and the purpose of this article is to
educate other CMU instrument owners about these failures and, most importantly,
how to repair them.

Credit attribution
==================

I sincerely thank Michael Katzmann, NV3Z / VK2BEA / G4NYV, for his invaluable
help in reverse-engineering the insides of the culprit RXTX board, identifying
various critical components on that board, including the ones that habitually
fail, and identifying Eccosorb-caused galvanic corrosion as the root cause of
these failures.  Without his help, I would not have made it this far!

What is this RXTX board
=======================

This board is common among CMU200, CMU300 and CRTU-RU instruments from R&S - or
at least these are the ones I know - maybe there are others I don't know about.
This board encapsulates the instrument's main RF Rx and Tx chains: on the Rx
side it takes RF input from the front end and performs triple (or quadruple,
explained later in this article) IF downconversion to 10.7 MHz IF3, and on the
Tx side it takes 13.85 MHz IF3 input and upconverts it to RF output, going
through IF2 and IF1 in the process - triple or quadruple IF in both directions,
as explained in more detail later in this article.

Every CMU200 instrument always has one RXTX board - it is an absolutely required
component irrespective of option configurations.  The hardware architecture of
this instrument also has a place for an optional second RXTX board, providing a
complete second Rx and Tx channel - however, as far as I can tell, CMU200
software won't do anything with it, i.e., there are no test modes or
applications in CMU200 software repertoire that can make use of a second RXTX
board.  Instead it seems that configurations with two RXTX boards are better
supported on the CRTU-RU platform - but I know next to nothing about that one.

Also note: if your CMU200 is equipped with Aux Tx model B96 (as opposed to B95),
there is an output from that B96 add-on that goes to the front end input that
was originally meant for second RXTX.

RXTX board failures
===================

In terms of externally visible symptoms, almost all CMU200 units are now failing
in the same ways:

1) If Tx side fails, the visible symptom is completely absent or extremely weak
output, and the internal loopback test fails with no signal detected at any of
the frequencies in the test sequence.  A key point is that this failure mode is
independent of the selected output frequency.

2) If Rx side fails, different frequency ranges are affected differently.  As I
shall explain momentarily, there are two different IF1 Rx paths inside the RXTX
board: one handles the frequency range from > 1200 to <= 2200 MHz, and the other
handles lower (<= 1200 MHz) and higher (> 2200 MHz) input frequencies.  When a
given RXTX board develops Rx path failure, this failure happens separately in
each of these two IF1 Rx paths.  The resulting symptoms vary: if only one of
the two IF1 Rx paths fails, then only that frequency range will be affected,
or if both fail, the observed loss will typically be different between the two
frequency ranges.  The failure symptom is unexpected large attenuation:
sometimes around 5 to 6 dB of loss, other times as much as 25 dB of loss.

The internal loopback test invoked from the Maintenance menu is a good first
step in diagnosis.  In this test the instrument software configures both Rx and
Tx chains to connect to RF1 (and then RF2 if you press Continue), and then it
tests a longish list of different frequencies in sequence, spanning the full
range from 10 to 2700 MHz.  For each test frequency, the instrument software
configures both the signal generator and the Rx chain, and it reports what was
measured on Rx vs. what was put out on Tx.  The test is considered a failure if
nothing was received or if the Rx signal level was too far from the expected
value, otherwise the test is declared as passed.

If the loopback test fails at every frequency with no signal detected, the test
alone cannot tell you whether Rx or Tx is at fault, and you will need to test Rx
and Tx separately (using an external spectrum analyzer and an external signal
generator) in order to figure out what is broken.  That said, out of the
commonly observed failure modes, dead Tx is the one that produces this symptom.

If the Tx side is OK but Rx IF1 filters (one or both paths) have gone bad, the
visible symptom in the loopback test will be Rx signal level that is lower than
it should be.  The instrument software may declare the test as either passed or
failed depending on the magnitude of the error: in this Mother's experience,
when one of the two IF1 paths on my CMU200 developed a loss of some 5.8 dB, the
loopback test was reported as passing - but a closer look at the numbers in the
report window showed the unexpected attenuation.

Because different input frequency ranges are handled via different Rx paths as
explained in the following section, when Rx IF1 filters fail, the loss behaviour
will be frequency-dependent.  In the internal loopback test, you will see one
behaviour for frequencies from 10 to 1200 MHz, then a marked change for
frequencies from 1205 to 2200 MHz, and then another change (most likely a
reversion to low frequency behaviour) at the highest frequencies above 2200 MHz.
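
The band-by-band reading of loopback results described above can be sketched in
a few lines of Python.  This is only an illustration: the function names and the
sample readings are hypothetical (invented here, not taken from any real
instrument), and the band edges are the Rx IF1 cutover points given in this
article.

```python
from statistics import mean

def rx_band(freq_mhz):
    """Name the Rx IF1 path that handles a given input frequency (MHz)."""
    if freq_mhz <= 1200:
        return "up to 1200 MHz (high IF1)"
    if freq_mhz <= 2200:
        return "1200 to 2200 MHz (low IF1)"
    return "above 2200 MHz (high IF1)"

def mean_error_by_band(readings):
    """readings: iterable of (freq_mhz, expected_dbm, measured_dbm) tuples."""
    groups = {}
    for freq, expected, measured in readings:
        groups.setdefault(rx_band(freq), []).append(measured - expected)
    return {band: mean(errors) for band, errors in groups.items()}

# HYPOTHETICAL readings, invented for illustration only
sample = [
    (10, -20.0, -20.2), (900, -20.0, -20.1), (1200, -20.0, -20.3),
    (1205, -20.0, -25.9), (1800, -20.0, -25.7), (2200, -20.0, -25.8),
    (2205, -20.0, -20.2), (2700, -20.0, -20.4),
]
result = mean_error_by_band(sample)
for band, error in result.items():
    print(f"{band}: mean error {error:+.1f} dB")
```

With made-up numbers like these, a mean error that is several dB worse in the
middle band than in the other two would point at the low IF1 path, while the
reverse pattern would point at the high IF1 path.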

We do not currently know if there are any other failure modes elsewhere in the
CMU200 instrument that can also cause a stepwise change in behaviour at these
frequency cutover points.  It is my (Mother Mychaela's) suspicion that the front
end may have some filters too, each covering a wide frequency swath, with
instrument software switching these filters depending on the configured
listening frequency - but we don't know for certain if any such additional
filters are there or not.  If you find yourself wondering whether the problem
you are seeing is in the RXTX board or the front end, the best way to narrow it
down would be to remove the semi-rigid coax pieces that carry RF between the two
and use an external spectrum analyser to look at the Tx output from the RXTX
board and/or the Rx output from the front end.

RXTX board architecture explained
=================================

Unfortunately R&S' official service manual for CMU200 instruments is only a part
swapper guide: it tells you which boards do what in general terms and tells you
how to remove and replace each part, but it provides no schematics and no
detailed explanation of what happens inside each board.  They do provide a
little bit of info: I direct the reader to the block diagram on page 3.2 of this
manual.  This block diagram is an important starting point for understanding
what happens inside the RXTX board, but it is simplified and incomplete.

In the Tx direction, 13.85 MHz IF3 comes in from the digital board - or from
B68 board in WCDMA test modes.  This Tx IF3 is mixed with Tx LO3 to produce
Tx IF2.  This Tx IF2 is fixed at 487.52 MHz, thus one would think that Tx LO3
frequency ought to be fixed as well - but it seems to be a synthesized variable
frequency, and the manual describes it as "LO3TX with small tuning range".
Calculations done by Michael VK2BEA put Tx LO3 at 473.67 MHz (needs
confirmation), but it is still not clear why it is a synthesized frequency
"with small tuning range", as opposed to simply fixed.

Tx IF2 of 487.52 MHz is then passed through a pair of identical SAW filters,
Sawtek 855272 - two cascaded identical filters, with an amplifier in between.
This SAW filter has a center frequency of 479.75 MHz with 20 MHz bandwidth,
thus the passband spans from 469.75 to 489.75 MHz.  Notice how Tx IF2 of
487.52 MHz stands just 2.23 MHz away from the edge of the passband - is it
intentional?  What are they filtering?  Without original design notes, we can
only guess.  As I shall explain later in this article, one of these two Tx IF2
SAW filters is a component prone to failure.

[Note from Michael VK2BEA: "The LO frequency is only 13.85 MHz from the IF.  It
makes sense to shift this to the edge of the passband to help the suppression
of LO feed through.  Also explains the use of SAW filters (sharp skirts) and
that there are two."]
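
The passband arithmetic above is easy to reproduce.  The sketch below uses only
numbers stated in this article (the Rx IF2 figure of 486.515 MHz is the one
given further down); the helper names are made up for illustration, and the
passband is treated as an ideal 20 MHz window, which a real SAW filter is not.

```python
from decimal import Decimal as D

SAW_CENTER = D("479.75")   # MHz, Sawtek 855272 center frequency
SAW_BANDWIDTH = D("20")    # MHz

def passband_edges(center, bandwidth):
    """Lower and upper edge of an idealized passband."""
    half = bandwidth / 2
    return center - half, center + half

def margin_to_nearest_edge(freq, center, bandwidth):
    """Distance to the nearest passband edge; negative if outside."""
    lower, upper = passband_edges(center, bandwidth)
    return min(freq - lower, upper - freq)

lower, upper = passband_edges(SAW_CENTER, SAW_BANDWIDTH)
tx_margin = margin_to_nearest_edge(D("487.52"), SAW_CENTER, SAW_BANDWIDTH)
rx_margin = margin_to_nearest_edge(D("486.515"), SAW_CENTER, SAW_BANDWIDTH)
print(f"passband: {lower} to {upper} MHz")
print(f"Tx IF2 sits {tx_margin} MHz inside the nearest edge")
print(f"Rx IF2 sits {rx_margin} MHz inside the nearest edge")
```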

After these cascaded SAW filters, Tx IF2 is mixed with LO2.  Unlike LO1 and LO3,
there is only one LO2 for both Rx and Tx, and it is fixed at 1329.6 MHz.  When
Tx IF2 at fixed 487.52 MHz is mixed with LO2 at fixed 1329.6 MHz, the output of
this mixer will always contain two frequencies: 842.08 MHz and 1817.12 MHz.
These are the two possible Tx IF1 frequencies, and there is a frequency-
selective filter for each of these two Tx IF1 modes.  Based on the final output
frequency to be generated, instrument control software selects either low or
high Tx IF1, controlling switches before and/or after the filters.  I have not
investigated to see if the frequency ranges for high vs. low Tx IF1 are the same
as on the Rx side or not - maybe they are the same, maybe they are different.
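
The two Tx IF1 frequencies follow directly from the sum and difference products
of an ideal mixer.  A minimal sketch, using only the frequencies stated above
(the function name is invented for illustration):

```python
from decimal import Decimal as D

def mixer_products(f_a, f_b):
    """Return the (difference, sum) products of an ideal mixer, in MHz."""
    return abs(f_a - f_b), f_a + f_b

TX_IF2 = D("487.52")   # MHz
LO2 = D("1329.6")      # MHz

low_if1, high_if1 = mixer_products(TX_IF2, LO2)
print(f"low Tx IF1:  {low_if1} MHz")
print(f"high Tx IF1: {high_if1} MHz")
```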

After Tx IF1 output is combined or switched from the two filters, it is mixed
with Tx LO1 to produce an output that may or may not be final RF.  The mixer
that does this job is MACOM SM4T, which is one of the larger, prominently
visible components on the board.  Tx LO1 has "large tuning range and very fine
frequency resolution used for setting the desired transmitter frequency" - quote
from the manual; by doing some frequency arithmetic, we can see that this Tx
LO1 tuning range needs to span from 1827.12 to 3042.08 MHz in order to produce
output frequencies from 10 to 2200 MHz starting from 842.08 MHz or 1817.12 MHz
IF1.  (LO1 - IF1 is the desired output frequency, whereas the sum will be a
much higher frequency above 2.7 GHz - I presume that the latter must be
suppressed by some LPF somewhere.)
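
The LO1 arithmetic above can be checked with a short script.  One assumption is
labeled in the code: the split of output frequencies between high and low Tx
IF1 is taken from the Tx conversion table in this article, and as noted earlier
it has not been verified on the instrument.  The function names are
illustrative only.

```python
from decimal import Decimal as D

TX_IF1_HIGH = D("1817.12")   # MHz
TX_IF1_LOW = D("842.08")     # MHz

# ASSUMPTION: which IF1 serves which RF range follows this article's Tx table
BANDS = [
    (D("10"), D("1200"), TX_IF1_HIGH),
    (D("1200"), D("2200"), TX_IF1_LOW),
]

def tx_lo1(rf, if1):
    """The wanted product is LO1 - IF1 = RF, hence LO1 = RF + IF1."""
    return rf + if1

def unwanted_sum(rf, if1):
    """The other mixer product, LO1 + IF1."""
    return tx_lo1(rf, if1) + if1

# Both quantities grow with RF, so checking the band edges is sufficient
lo1_values = [tx_lo1(rf, if1) for lo, hi, if1 in BANDS for rf in (lo, hi)]
sum_values = [unwanted_sum(rf, if1) for lo, hi, if1 in BANDS for rf in (lo, hi)]
print(f"LO1 tuning range: {min(lo1_values)} to {max(lo1_values)} MHz")
print(f"lowest unwanted sum product: {min(sum_values)} MHz")
```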

The "RF" output from Tx SM4T mixer (LO1-IF1 as explained above) is indeed the
final RF output going to the front end for output frequencies below 2200 MHz.
In the uppermost frequency range of 2200 to 2700 MHz, a fourth mixer and LO
stage come into play - NOT shown on the block diagram in the manual!  In this
highest frequency range, the output from SM4T mixer should be considered a
fourth IF - but because it is not covered at all in the manual and not named,
we have to invent our own name for it.  I (Mother Mychaela) propose that we
call it IF0, and refer to the corresponding LO as LO0 - this way we remain
consistent with official naming that puts IF1 closest to RF and IF3 closest to
digital.

The preliminary analysis by Michael VK2BEA is that Tx LO0 frequency is fixed at
3318.46 MHz (same as its counterpart on the Rx side), with IF0 (taking the place
of lower RF) ranging from 1118.46 to 618.46 MHz (reverse range) to produce final
output frequencies of 2200 to 2700 MHz.  However, these numbers have NOT been
confirmed by actual measurements yet.
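
For reference, here is the same arithmetic as a small sketch.  It simply
restates the unconfirmed figures above (LO0 fixed at 3318.46 MHz, high Tx IF1
used in this range); if measurements later show different numbers, only the
constants need to change.  The function names are invented for illustration.

```python
from decimal import Decimal as D

TX_LO0 = D("3318.46")        # MHz, UNCONFIRMED for the Tx side
TX_IF1_HIGH = D("1817.12")   # MHz

def tx_if0(rf):
    """IF0 needed so that LO0 - IF0 = RF."""
    return TX_LO0 - rf

def tx_lo1_high_band(rf):
    """LO1 - IF1 = IF0, hence LO1 = IF0 + IF1."""
    return tx_if0(rf) + TX_IF1_HIGH

for rf in (D("2200"), D("2700")):
    print(f"RF {rf} MHz: IF0 {tx_if0(rf)} MHz, LO1 {tx_lo1_high_band(rf)} MHz")
```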

On the Rx side the same process happens in reverse, but the specific frequencies
used for IF1, IF2 and IF3 are slightly different.  At first there is a stage
that only kicks in for frequencies above 2200 MHz (bypassed otherwise), and
then there is an SM4T mixer (identical to the one on Tx side) that takes in RF
and Rx LO1 to produce Rx IF1.  High-side injection is used, i.e., Rx LO1 is
programmed to generate frequency equal to the external RF of interest PLUS the
desired Rx IF1 output.

Rx LO1 is programmed as follows by the instrument control software:

* Rx IF1 will be at 1816.115 MHz (call it high) if the listening frequency is
  <= 1200 MHz or > 2200 MHz;

* Rx IF1 will be at 843.085 MHz (call it low) if the listening frequency is in
  the intermediate range, i.e., 1200 MHz < RF <= 2200 MHz.

In addition to programming Rx LO1 to produce the desired IF1 per the logic
above, the software also controls switches that select one or the other IF1
filter: either the filter that passes low IF1 or the one that passes high IF1.
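
The selection logic above can be written down as a pair of small functions.
This is a sketch of the rules as described in this article, not a dump of the
instrument's firmware; the function names are invented, and the handling of
frequencies above 2200 MHz uses the LO0 figure of 3318.46 MHz from Michael's
analysis.

```python
from decimal import Decimal as D

RX_IF1_HIGH = D("1816.115")   # MHz
RX_IF1_LOW = D("843.085")     # MHz
RX_LO0 = D("3318.46")         # MHz, extra stage used above 2200 MHz

def rx_if1(rf):
    """IF1 selected for a given listening frequency (MHz)."""
    if D("1200") < rf <= D("2200"):
        return RX_IF1_LOW
    return RX_IF1_HIGH

def rx_lo1(rf):
    """High-side injection: LO1 = frequency at the mixer input + IF1."""
    mixer_input = RX_LO0 - rf if rf > D("2200") else rf
    return mixer_input + rx_if1(rf)

for rf in ("10", "1200", "1205", "2200", "2205", "2700"):
    f = D(rf)
    print(f"RF {rf:>5} MHz: IF1 {rx_if1(f)} MHz, LO1 {rx_lo1(f)} MHz")
```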

The filters used for low and high IF1 modes are the same on both Rx and Tx
sides.  (The actual frequencies are slightly different, but in each case they
fit within the passband of the common filter parts.)  The filter for low IF1 is
Murata DFC3R836P025HHD, package marking 836 CD, and the one for high IF1 is
DFC31R84P075HHA, package marking CR.  The two filter packages are NOT the same
mechanically: the low IF1 filter is physically larger.  Both parts are ceramic
monoblock filters from the same family, and it seems that these filter parts
were originally made for mobile phones, not for RF metrology instruments: the
"836 CD" filter is for AMPS uplink band, and the "CR" filter is for DCS downlink
band.

On the Tx side of the board there are only two IF1 filters: one for low Tx IF1
and one for high Tx IF1.  However, on the Rx side there are 3 of these ceramic
filters in total: two for high IF1 (two cascaded identical filters with an
amplifier in between) and just one for low IF1.  Why am I covering these filters
in so much detail?  You probably guessed it: they are components that fail, as
will be covered shortly.

After the selection of either low or high IF1 filter, Rx IF1 coming out of the
selected filter (either 843.085 MHz or 1816.115 MHz) is mixed with LO2, which is
shared between Rx and Tx sides and fixed at 1329.6 MHz.  The output of this
mixer is Rx IF2 at 486.515 MHz.  This Rx IF2 then passes through a pair of
cascaded Sawtek 855272 filters, two identical filters with an amplifier in
between, exactly the same as on the Tx side.  Then there is Rx LO3 and the final
mixer, producing Rx IF3 at 10.7 MHz that goes to the digital board, to the rear
panel BNC output and to the WCDMA board (B68) if the latter is present.
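
A quick check shows that both Rx IF1 choices land on the same IF2 after mixing
with LO2.  The sketch below also derives an LO3 value under the assumption that
Rx LO3 sits above IF2; that assumption is not from the manual, it is merely the
choice that reproduces Michael's unverified figure of 497.215 MHz.

```python
from decimal import Decimal as D

LO2 = D("1329.6")     # MHz, shared between Rx and Tx
RX_IF3 = D("10.7")    # MHz

def rx_if2(if1):
    """Difference product of IF1 and LO2."""
    return abs(if1 - LO2)

if2_from_high = rx_if2(D("1816.115"))
if2_from_low = rx_if2(D("843.085"))
lo3 = if2_from_high + RX_IF3   # ASSUMPTION: LO3 is above IF2

print(f"IF2 from high IF1: {if2_from_high} MHz")
print(f"IF2 from low IF1:  {if2_from_low} MHz")
print(f"LO3 for a {RX_IF3} MHz IF3: {lo3} MHz")
```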

Frequency conversion tables
===========================

Michael VK2BEA worked out a pair of frequency conversion tables, one for Rx and
one for Tx.  Here are these tables, with further corrections by Mother Mychaela:

Rx frequency conversion, RF to IF1:

RF (MHz)    LO0 (MHz)   IF0 (MHz)        LO1 (MHz)           IF1 (MHz)
----------------------------------------------------------------------
  10-1200                                1826.115-3016.115   1816.115
1200-2200                                2043.085-3043.085    843.085
2200-2700   3318.46     1118.46-618.46   2934.575-2434.575   1816.115

Rx frequency conversion, IF1 to IF3:

IF1 (MHz)   LO2 (MHz)   IF2 (MHz)   LO3 (MHz)   IF3 (MHz)
--------------------------------------------------------
1816.115    1329.6      486.515     497.215     10.7
 843.085    1329.6      486.515     497.215     10.7

Tx frequency conversion, IF3 to IF1:

IF3 (MHz)   LO3 (MHz)   IF2 (MHz)   LO2 (MHz)   IF1 (MHz)
---------------------------------------------------------
 13.85      473.67      487.52      1329.6      1817.12
 13.85      473.67      487.52      1329.6       842.08

Tx frequency conversion, IF1 to RF:

IF1 (MHz)   LO1 (MHz)         IF0 (MHz)        LO0 (MHz)   RF (MHz)
-------------------------------------------------------------------
1817.12     1827.12-3017.12                                10-1200
 842.08     2042.08-3042.08                              1200-2200
1817.12     2935.58-2435.58   1118.46-618.46   3318.46   2200-2700

Notes:

* In Michael's original version each table covered the full chain from RF on
  one end to IF3 on the other end, but I (Mychaela) had to split each table
  into two in order to fit within 80 columns.

* The numbers for LO3 (473.67 MHz for Tx, 497.215 MHz for Rx) are from Michael;
  I (Mychaela) have not verified them.

* All details for the IF0/LO0 stage (upper frequency range) are from Michael;
  his notes indicate that the numbers are confirmed for Rx, but not for Tx.
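
Since several of these numbers are still unverified, it is at least worth
confirming that the tables are internally consistent.  The sketch below
transcribes the table values and checks the arithmetic of each row at both ends
of its range.  It only proves that the numbers agree with each other, not that
they match the hardware; the function names are invented for illustration.

```python
from decimal import Decimal as D

LO2 = D("1329.6")

# (RF range, LO0, IF0 range, LO1 range, IF1), transcribed from the tables
RX_ROWS = [
    (("10", "1200"), None, None, ("1826.115", "3016.115"), "1816.115"),
    (("1200", "2200"), None, None, ("2043.085", "3043.085"), "843.085"),
    (("2200", "2700"), "3318.46", ("1118.46", "618.46"),
     ("2934.575", "2434.575"), "1816.115"),
]
TX_ROWS = [
    (("10", "1200"), None, None, ("1827.12", "3017.12"), "1817.12"),
    (("1200", "2200"), None, None, ("2042.08", "3042.08"), "842.08"),
    (("2200", "2700"), "3318.46", ("1118.46", "618.46"),
     ("2935.58", "2435.58"), "1817.12"),
]

def row_is_consistent(row):
    """Check LO0 - RF = IF0 (if present) and LO1 = mixer-side frequency + IF1."""
    rf, lo0, if0, lo1, if1 = row
    for i in (0, 1):   # both ends of each range
        mixer_side = D(rf[i])
        if lo0 is not None:
            if D(lo0) - D(rf[i]) != D(if0[i]):
                return False
            mixer_side = D(if0[i])
        if mixer_side + D(if1) != D(lo1[i]):
            return False
    return True

def back_end_is_consistent(if1, if2, lo3, if3):
    """Check |IF1 - LO2| = IF2 and |LO3 - IF2| = IF3."""
    return abs(D(if1) - LO2) == D(if2) and abs(D(lo3) - D(if2)) == D(if3)

checks = [row_is_consistent(r) for r in RX_ROWS + TX_ROWS]
checks += [
    back_end_is_consistent("1816.115", "486.515", "497.215", "10.7"),
    back_end_is_consistent("843.085", "486.515", "497.215", "10.7"),
    back_end_is_consistent("1817.12", "487.52", "473.67", "13.85"),
    back_end_is_consistent("842.08", "487.52", "473.67", "13.85"),
]
print("all rows consistent:", all(checks))
```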

How these RXTX boards fail
==========================

There are 3 specific components on this RXTX board that have been seen to fail
over and over in the field:

* The second of the two cascaded IF2 SAW filters (Sawtek 855272) on the Tx side
  often fails, breaking the Tx chain (output totally gone or extremely weak)
  for all frequencies.  Note that there are a total of 4 identical Sawtek 855272
  filters on this board (2 on Rx side, 2 on Tx side), and only one of the four
  fails: Tx side, second filter in the cascade.

* The "836 CD" filter on the Rx side is prone to failure.  When it fails, the
  visible symptom is severe attenuation in measured Rx signal levels for input
  frequencies in the 1200 MHz < RF <= 2200 MHz range.  Only the Rx side filter
  fails, not the identical one on the Tx side!

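* One of the two cascaded "CR" filters on the Rx side likewise fails - this time
  it is the first one in the cascade.  Because these filters sit in the high
  IF1 path, the resulting attenuation affects input frequencies at or below
  1200 MHz and above 2200 MHz.  The other two identical "CR" filters on the
  same board (the second in cascade for Rx and the one for Tx) are NOT seen to
  fail.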
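
The mapping from symptoms to the three usual suspects can be summarized as a
small lookup.  This is only a sketch with invented names: it lists the
components named above, and it cannot rule out other faults that might produce
similar symptoms.

```python
def usual_suspects(tx_dead, rx_loss_mid_band, rx_loss_outer_bands):
    """Return the failure-prone parts matching the observed symptoms.

    tx_dead:              Tx output absent or extremely weak at all frequencies
    rx_loss_mid_band:     Rx attenuation for 1200 MHz < RF <= 2200 MHz
    rx_loss_outer_bands:  Rx attenuation for RF <= 1200 MHz or RF > 2200 MHz
    """
    suspects = []
    if tx_dead:
        suspects.append("Tx IF2 SAW filter (Sawtek 855272), second in cascade")
    if rx_loss_mid_band:
        suspects.append('Rx low IF1 filter ("836 CD")')
    if rx_loss_outer_bands:
        suspects.append('Rx high IF1 filter ("CR"), first in cascade')
    return suspects

for part in usual_suspects(tx_dead=False, rx_loss_mid_band=True,
                           rx_loss_outer_bands=True):
    print("check:", part)
```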

The root cause of all 3 component failures has been traced to galvanic corrosion
caused by direct contact between these components and Eccosorb RF absorber foam.
The complete RXTX board assembly consists of the traditional PCBA plus heavy
metal shields on both sides; the front and back metal shield pieces are custom-
made for this board, with individually shielded cavities matching different
sections of the board.  Some (not all) of these cavities are filled with a
special black foam called Eccosorb - it is an RF absorber, presumably added to
lower the Q of these cavities to prevent parasitic oscillations.  Trouble occurs
when this Eccosorb foam comes into direct contact with metal surfaces of
components on the board: the result is galvanic corrosion, a process that takes
many years before it results in component failure.  The reason why only 3
particular filter components fail is that they had the bad luck of residing in
cavities with Eccosorb - the other identical components that don't fail reside
in cavities without Eccosorb.

[Note from Michael VK2BEA: "The copper surface of these filters form an integral
part of the component.  It is this copper that forms the cavity of the combline
filter.  When this is compromised by corrosion, the filter is detuned and there
is leakage causing excessive loss."]

It appears that R&S only noticed this design flaw toward the end of "product
life" of these instruments, probably because failures occur only after many
years.  Some of the newer boards have had modifications to prevent contact
between Eccosorb and the two troubled Rx filters, either by way of thinner
Eccosorb fill or by way of an added plastic barrier.  It is not clear if these
modifications were applied to newly produced RXTX boards from the start, or if
they are a result of field service repairs.

How to repair failed boards
===========================

All 3 of the failing filter components (one SAW filter part and two ceramic
monoblock filter parts) are now unobtainium.  However, because so many of these
RXTX boards fail in exactly the same ways, our community at large is now
accumulating a very substantial "graveyard" of failed boards, and here is the
good news: we can make one good board out of every two failed ones.  Suppose
that every RXTX board in our community's collective inventory has fully failed,
leaving no failure-free boards - what now?  Here is the recipe for making one
good RXTX board out of two fully failed ones:

1) Out of the two failed boards, choose one to be the part donor and the other
   to be the part recipient.

2) Take the part donor board and harvest 3 parts from it: one of the 3 Sawtek
   855272 filters that aren't subject to corrosion, and the two IF1 filters
   (one 836 CD and one CR) from the Tx side.  Tx side IF1 filters aren't in
   contact with Eccosorb and thus don't corrode, and 3 out of the 4 SAW filters
   are likewise safe - hence we expect that every "dead" RXTX board can still
   serve as a donor of good parts in this manner.

3) Take the part recipient board and transplant the donor parts onto it,
   replacing all 3 corroded filters.

4) Before putting the repaired board back into its metal casing, cover all
   corrosion-prone components with Kapton tape, preventing direct galvanic
   contact with Eccosorb - this way the newly transplanted uncorroded components
   won't suffer the same fate.
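
The "one good board out of every two" figure can be seen from a simple parts
count.  The sketch below assumes fully failed boards as in the recipe above;
the spare counts per donor are derived from the failure pattern described in
this article (3 of 4 SAW filters, 1 of 2 "836 CD" filters and 2 of 3 "CR"
filters stay clear of Eccosorb), and the names are invented for illustration.

```python
# Parts one recipient board needs replaced
NEEDED = {"Sawtek 855272": 1, "836 CD": 1, "CR": 1}

# ASSUMPTION: uncorroded parts available on one fully failed donor board
SPARE_PER_DONOR = {"Sawtek 855272": 3, "836 CD": 1, "CR": 2}

def recipients_per_donor():
    """How many recipients one donor can supply; the scarcest part decides."""
    return min(SPARE_PER_DONOR[p] // NEEDED[p] for p in NEEDED)

def boards_repairable(n_failed):
    """Good boards obtainable from n fully failed ones."""
    per_donor = recipients_per_donor()
    # each group consists of one donor plus the recipients it can supply
    groups = n_failed // (per_donor + 1)
    return groups * per_donor

print("recipients per donor:", recipients_per_donor())
for n in (2, 5, 10):
    print(f"{n} failed boards -> {boards_repairable(n)} repaired")
```

The single spare "836 CD" filter per donor is the bottleneck in this count.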

RXTX disassembly instructions
=============================

Before you can start working on an individual RXTX board, you first need to pull
it out of your CMU.  Disassembly instructions are provided in the official part
swapper guide from R&S (which they call "service manual"), but here is the gist:

* Using a Torx T20 screwdriver, remove the 4 rear feet and lift the sleeve part
  of the instrument case.

* Remove two small Phillips screws that secure the cover over the main board
  cage, and lift that cover off.

* Unhook all of the little MMCX coax connections from the RXTX board: 3 on the
  top side (IF3 interface) and one on the bottom (netclock input).

* Loosen and remove the two semi-rigid coax pieces that connect RF between the
  RXTX board and the front end.  In this Mother's opinion, this step is the
  least pleasant of all, but it is unavoidable.

* After ensuring that nothing remains connected to the RXTX board on the bottom
  side, pull the board out from the top.

Once you have the complete RXTX board assembly out, how do you extract the
actual board out of the metal casing?  The not-immediately-obvious answer is
that you don't need to remove all of the screws.  Instead, there are shortcuts
that will save you a lot of pain:

* There are two smooth thin metal plates, one on the front side of the board
  (facing toward the front of the CMU when installed) and one on the back side.
  Each is secured with a small Phillips screw.  You only need to remove the one
  on the front side.  You don't need to remove the thin metal plate from the
  back side of RXTX assembly - doing so will only add more clutter and loose
  parts to your lab bench while the board is disassembled.

* Once you remove the thin metal plate from the *front* side of your RXTX
  assembly, you will see all of the many screws that hold together the sandwich
  of two heavy metal pieces with the board in the middle.  These screws are
  Torx T8.

* Put the board down on your bench so that the side that faces the front of the
  CMU when installed (the side with the T8 screw heads) will become the top,
  with the rear side becoming bottom.

* Each of the T8 screws passes through thread in the top metal piece, a hole in
  the PCB, and then thread in the bottom metal piece.  As you loosen these
  screws, you don't need to remove them all the way - instead loosen each screw
  so that its far end comes out of the thread in the bottom metal piece, but
  let it remain captive in the top metal piece.  Letting the screws remain
  captive in the top metal piece will reduce bench clutter while the board is
  disassembled, and there is a lot less screwing and unscrewing work to be done,
  as there is no need to work through the thread in the top metal piece.

Once you loosen all of the T8 screws, the top metal piece should lift off,
leaving just the bottom metal piece and the PCBA.  The bottom metal piece has
two thin metal pins sticking out of it; both the PCBA and the top metal piece
align on these two pins.

When you lift the top metal piece (the one with the screws), the side of the
board that will be immediately exposed to you is the side that faces the front
of the CMU when the board is installed.  It is the Rx side, and you can confirm
that you are looking at the Rx side by noting that there are two "CR" filters
for high IF1, as opposed to just one.  And chances are, right here at this step
in the disassembly process you will see the galvanic corrosion or the lead-up
to it.

As you lift the top metal piece from the board, look at its inside and note the
many individual cavities.  Also note how some of these cavities are filled with
some black foam - that's the Eccosorb.  And note how only some of the cavities
have Eccosorb in them, not all.

Now look at the ceramic IF1 filters on the Rx side of the board.  The one "CR"
filter that is NOT in contact with Eccosorb will be bright copper-colored (it
actually is copper), whereas the two filters that are in contact with Eccosorb
(one 836 CD, one CR) will often be green instead of copper-colored on their top
surface - that's patinated copper!  Furthermore, there will typically be some
black Eccosorb material directly adhered to the corroding top surfaces of those
two unlucky filters.

Now lift the PCBA off the two metal pins, separating it from the bottom metal
piece.  Like you did with the top metal piece, observe the inside of the bottom
metal piece: note which cavities have Eccosorb in them and which don't.  Then
flip the board over and look at its Tx side.  You will see that there are only
two ceramic IF1 filters on this side (one 836 CD and one CR), and both should
be in pristine shape, bright copper-colored, no corrosion - these two are not
in contact with Eccosorb!

Now look at the two Sawtek 855272 filters on the Tx side.  The one closer to
the middle of the board will often appear in worse physical condition than the
other 3 on the board - and the culprit is once again contact with Eccosorb.

MACOM SM4T mixer corrosion
==========================

Neither I nor my collaborator on this project has seen an RXTX board on which
either the Rx SM4T mixer or the Tx one went bad - i.e., we haven't seen a
failure in this part *yet*.  However, this mixer *is* in contact with Eccosorb,
and looking visually at the collection of RXTX boards in my possession, I
(Mother Mychaela) see definite signs of corrosion - the metal surface of this
SM4T mixer component is beginning to corrode.  Therefore, as a preventative
measure, I recommend cleaning off any Eccosorb that is adhered to this component
and then covering the component with Kapton tape before putting the board back
into its metal casing.

Unlike the failing filters, this MACOM SM4T mixer is still available new - but
it's an expensive component, so let's protect these mixers from corrosion.

[Michael VK2BEA notes: "The case of the mixer is purely for shielding and is
much thicker than the thin copper of the filter that is essential for
operation."  Mother Mychaela's response: it may be so, but if you are going to
take the RXTX board out of your CMU, take it out of its metal casing and either
replace filters or at least cover them with Kapton tape for protection, it
won't hurt to put the same Kapton tape on the mixers too - and the signs of
corrosion are very real.]