Questions on Reference Planes for DDR3 signals

Source: Time:2014/9/10
Hello all,
I've been looking at DDR3 layout guides from Micron and from some manufacturers of DDR3 controller devices and haven't found a clear recommendation for how to deal with the issue of reference planes for DDR3 interfaces in a stripline configuration. Also, there are some other "reference-plane" issues which I hope someone can clarify.
So here are the questions I haven't found a clear answer for:
(1) Almost all recommendations that I've found are to route data-group signals "adjacent to a solid GND Layer". Can I, therefore, route them adjacent to both a GND and a Power plane in a stripline configuration? Does it matter if the stripline is symmetric (relative to the 2 ref planes) or not?
(2) Almost all recommendations that I've found are to route address/command/control signals "adjacent to a solid Power or GND Layer". Can I route them adjacent to a Power plane with a voltage different than the DDR3-1.5Volt VDD?
(3) If DDR-VDD is required, can I use another solid Power plane as the other plane in stripline configurations (provided the 1st plane is VDD)?
(4) What reference plane is recommended for the Clock pair? Micron, for example, doesn't say anything on this matter, but some other manufacturers recommend using only a GND plane. Is this true, as far as you know?

Thanks in advance for your help!
Itzhak Hirshtal
Elta Systems

Hi Itzhak,
Please see below my comments:
1. This recommendation is meant to reduce XTALK issues. The closer the traces are to the GND planes, the lower the XTALK. It shouldn't matter if the stripline is asymmetric as long as the ref planes are GND. (Since the rise times are very short, XTALK can be problematic if not treated.)
2. The important thing is maintaining a solid return path. Even if the return path is not GND, it will get there via eddy currents.
3. Didn't quite understand (you can call me on that - please see below my contact)
4. We can discuss this as well over the phone
Best Regards,
Dgtronix Ltd. | Founder & CEO | Dudi Tash
eFax: +972-3-7256490 | Mobile: +972-54-6345629 | Office: +972-9-9660967

Hi Itzhak,
        Your core question is: can a power plane serve as the reference plane? In my opinion, yes. You can use any plane as the current return path, power or GND. But the impedance between the two planes should be very small, so that if the return path changes planes, the impedance discontinuity is as small as possible.

Zhenwei(Jason) Wang
Cisco Systems (Shanghai) Video Technology Co., Ltd.

Hirshtal, this all amounts to reducing signal disturbance.  
Unfortunately, this is another case of:  "It all depends."   The data 
signals run at the higher data rates and have the greater number of 
SSOs, so they deserve more care than the A/C signals.  However, if you 
want reliable operation at speed, reasonable care must be employed with 
both.  You have a few basic options:
1. Follow a proven topology exactly as though your life depends on it.  
Pray that the gods of reference designs will reward your dedicated 
obedience by blessing your effort.

2. Develop and evaluate design rules that are suitable to your 
particular situation.  Tool sets are available to evaluate different 
topologies, such as from Si-Soft.  If you don't have the tools, the 
budget, or the time, then you can lean on an outside service to do this 
for you.

3. Try to develop rules ad-hoc without checking and hope that Murphy 
does not decide to amuse himself at your expense.  The more conservative 
the rules you set, the more likely you can save yourself the wrath of 
Slides 14 and 15 of this presentation available on my web-site 
illustrate what you face with different routing topologies:
A couple of comments:  Memory packages and DIMMs reference data lines to Vss.  DIMMs reference A/C to VDDQ.  This was done to support low layer 
count PCBs.  It works.  You will avoid introducing extra noise and 
cross-talk by extending those references in your channel end to end.  
What your memory controllers reference varies.  Any reference change you introduce will inject energy between the references used.  This can be handled, but requires work and more analysis.  And unless the design is low performance, it usually imposes more cost for equal performance.
A final note:  Always treat clock with the utmost care.  Barring some 
terrible cost impact a continuous Vss reference is a good way to treat 

Follow Steve's advice.
The only thing I might add is to be sure to extend the return planes along the entire data/address/control pathway. Avoid crossing gaps or jumping from one reference plane to another along the return path. We designers often forget that it's the signal -current- that's important. Where does the return current want to go as it returns to the source? High frequency (> 10 kHz!) currents tend to flow on the path of -least impedance-. We need to make it easy by avoiding the crossing of gaps/slots in reference planes and jumping from one reference to another. 

Also, avoid running clock lines along the edge of your PC board - keep them internal. You might also lay them out first, so they are direct and -short-.
Cheers, Ken.

Have you verified that memory packages reference data lines only to Vss?
The memory pinout suggests that data lines are referenced to both Vss 
and Vddq.

Has anyone else tried the experiment of using VDDQ as the reference plane 
instead of ground on an actual motherboard design?  If so, can you share your results? (Any practical experience with using VDDQ instead of ground as the reference layer?)
When we tried it on a microstrip channel, we saw margins degraded dramatically, but not for the reasons cited below.  Noise on the VDDQ plane was the only predictor of degraded margin.  No amount of TDR'ing, TDT'ing, measuring xtalk, etc., could show a dramatic difference between channels routed w/ GND referencing vs. those w/ VDDQ referencing.  I could see measurable differences, but not enough to account for the dramatic margin degradation.
What was dramatic was the noise on the VDDQ plane that was then introduced on the data lines, degrading their margin appreciably. 
    Simulations backed up this finding - w/o power supply noise, margins weren't degraded significantly.   With power supply noise, margin degradation on a per-bit basis mimicked our measurements extremely well. 
 The biggest question for me was whether we'd introduced an abnormally noisy VDDQ plane, since it had to be "kludged" into the design (pinouts are tailored towards GND referencing).  But, we're not sure this was unique to our design - we think you would do that if you tried to use VDDQ referencing on most actual designs.  The pinouts favor GND referencing - you end up introducing a large floating VDDQ plane if you try to use it as 
reference.  I'm afraid this is hard to explain w/o pictures - sorry.
Bottom line is that I think that you'll degrade your margin appreciably 
(dramatically) by using VDDQ referencing on microstrip, but not for the reasons cited.  I'd love to see data from others who have actually performed the experiment.
Similar findings for stripline: margins were reduced dramatically if VDDQ-only referencing was used. We didn't perform the same intense analysis to find the root cause, but I suspect noise on the VDDQ planes as the primary culprit.  Again, the pinout forced the introduction of VDDQ referencing to be an awkward (non-ideal) design, so I'm not sure this is avoidable in real designs.
Margins were reduced on asymmetric stripline when the far reference plane was changed from GND to VDDQ, though not near as dramatically; here too I suspect noise on the VDDQ plane as the primary culprit.

I look forward to hearing what work others have done to validate this design rule on actual designs, vs. theoretical studies.
Also, could someone please clarify the comments about "clock"?  Are you talking about the strobes?  If so, I don't know how you would treat those significantly different (reference-wise) than the data lines, since they are positioned in the middle of the data lines.  To cut out a dedicated reference plane for those would seem to be extremely problematic (if not pathological), or am I missing something?  Aren't you forced to treat them very similarly to the data lines?
Thanks in advance,
Jeff Loyer

Yep, attempted it. On low power products, like flash, we got some results; on high power ones, like CPUs, very poor results. -Lyndell

Vinu this was the practice with the last packages I looked at several 
years ago.
Absent a 128 bit TDR I don't know how a TDR would show the problem.

A network DRAM package we looked at had signal lines referenced to both 
Vddq and Vss. It also had a pinout where the data pins were interspersed 
with Vddq and Vss.
If the DDR3 DRAM package routing is symmetrically referenced to 
Vddq/Vss, it enables construction of a signal path that is symmetrically 
referenced end-to-end (non-DIMM applications). Along with the fact that 
data lines are thevenin terminated to Vddq/Vss, it creates a special 
signaling configuration where theoretically SSO would be zero.

When VDDQ is used as the reference throughout, it is a common requirement to 
have a lot of GND transition caps (VDDQ to GND) at the source and destination 
areas of the trace,
and also near any location where there is a reference layer transition.

Vinu, a completely balanced transmission path can theoretically divide 
SSO in half.  It does not cancel it.  Let us suppose that we have a 
completely balanced transmission path using a symmetrical stripline all 
the way back to the die launch.  For a given di/dt * N bits 
transitioning from low to high, relative to BOTH planes the di/dt is 
positive.  If the interconnect is completely symmetrical then we have 
half the total inductance and half the SSO amplitude.  We do not cancel 
the SSO.  The only way to cancel SSO is to code such that the summation 
of transitions multiplied by polarity equals zero.
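
As a rough numerical sketch of the argument above: the bounce across a shared return inductance scales with L * di/dt * N, a perfectly symmetrical pair of references halves the effective inductance, and only a transition pattern whose signed sum is zero removes the net di/dt. The inductance, edge rate, and bit patterns below are hypothetical values chosen for illustration, not figures from this thread.

```python
# Sketch only: all numeric values below are assumed for illustration.

def sso_noise_v(l_shared_h, di_dt_per_bit, signed_transitions):
    """Bounce across a shared inductance: V = L * di/dt * (net signed transitions).

    signed_transitions: +1 for low-to-high, -1 for high-to-low, 0 for no change.
    """
    return l_shared_h * di_dt_per_bit * sum(signed_transitions)

L_RAIL_H = 200e-12          # assumed shared inductance of one return rail (200 pH)
DI_DT = 10e-3 / 100e-12     # assumed 10 mA edge in 100 ps, per bit (A/s)

all_rising = [+1] * 8                           # 8 bits switching low to high
balanced_code = [+1, -1, +1, -1, +1, -1, +1, -1]  # signed sum is zero

single_ref = sso_noise_v(L_RAIL_H, DI_DT, all_rising)
# Two equal rails sharing the return current behave like L/2.
symmetric_ref = sso_noise_v(L_RAIL_H / 2, DI_DT, all_rising)
coded = sso_noise_v(L_RAIL_H, DI_DT, balanced_code)

print(f"single reference:    {single_ref * 1e3:.0f} mV")
print(f"symmetric reference: {symmetric_ref * 1e3:.0f} mV")
print(f"balanced coding:     {coded * 1e3:.0f} mV")
```

The symmetric case comes out at half the single-reference case rather than zero, which is the distinction being drawn in the post.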

One can view it as two t-lines - signal/Vdd and signal/Vss - both with 
the same characteristic impedance and both terminated at their characteristic impedance - say 100 ohm. For a low-high transition, the 
pull-up structure discharges the signal/Vdd t-line and charges the 
signal/Vss t-line with ~10mA current drawn from the supply. For a 
high-low transition, the pull-down structure discharges the 
signal/Vss t-line and the same 10mA from the supply charges the 
signal/Vdd t-line. A constant current from the supply is steered to 
charge the signal/Vss or signal/Vdd t-line. Since there is no switched 
current, di/dt is 0.
Power noise would be limited to crowbar current or a gap in conduction 
between pull-up/pull-down structures. For high performance buffer 
designs, this should already be small.
In the case of a Vss only referenced t-line, the power supply current 
switches between 0-20mA resulting in SSN.
The bus needs to be unidirectional to take full advantage of 
symmetrically referenced signaling.
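
The claim in this post can be restated as a toy model. The sketch below encodes only the idealized supply currents stated above (a constant 10 mA for symmetric referencing, 0 to 20 mA for a Vss-only reference) and ignores package inductance, crowbar current, and bus turnaround. Which logic level draws the 20 mA is an assumption made here for illustration, as is the data pattern.

```python
# Sketch of the supply-current model described in this post; the 10 mA and
# 0-20 mA figures come from the text, everything else is illustrative.

def supply_current_ma(bits, scheme):
    """Supply current drawn for each bit under the stated idealized model."""
    if scheme == "srs":          # symmetric referencing: constant, steered current
        return [10 for _ in bits]
    if scheme == "vss_only":     # Vss-only reference: current follows the data
        return [20 if b else 0 for b in bits]   # assumed: logic high draws 20 mA
    raise ValueError(f"unknown scheme: {scheme}")

def largest_step_ma(currents):
    """Largest bit-to-bit change in supply current, a crude proxy for di/dt."""
    return max((abs(b - a) for a, b in zip(currents, currents[1:])), default=0)

pattern = [0, 1, 1, 0, 1, 0, 0, 1]   # hypothetical data pattern
for scheme in ("srs", "vss_only"):
    step = largest_step_ma(supply_current_ma(pattern, scheme))
    print(f"{scheme}: largest supply-current step = {step} mA")
```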

> Steve,
> One can view it as two t-lines - signal/Vdd and signal/Vss - both with
> the same characteristic impedance and both terminated at their
> characteristic impedance - say 100 ohm.
Yes, and each of these lines couples into the driver through the package 
inductance.  The inductance of the Vss network is lumped as Lvss, and 
the inductance of the Vdd network as Lvdd
>   For a low-high transition, the
> pull-up structure discharges the signal/Vdd t-line and charges the
> signal/Vss t-line with ~10mA current drawn from the supply.
No, a low to high transition imposes di/dt of the same polarity into 
both structures.  The signal line is going high:  It is drawing positive 
convention current through the die PDN and back to the PCB PDN through 
both Lvdd and Lvss.

> For a high-low transition, the pull-down structure discharges the
> signal/Vss t-line and the same 10mA from the supply charges the
> signal/Vdd t-line. A constant current from the supply is steered to
> charge the signal/Vss or signal/Vdd t-line. Since there is no switched
> current, di/dt is 0.
No, externally the switched current is divided in two.  It is 
still switched.  All you need to see this is to draw a black box around 
the IC.  What the signal line does is complemented by the PDN lines.  In order to cancel so that there is net zero current in the PDN 
attachments, you have to reduce the net signal current to zero, such as trivially with differential or  less trivially with an Nb(N+M)q coding scheme.

This is how I see single referencing working.
Even for a push pull driver, you can pick a reference, I usually pick ground and make sure the signals reference it along the path. From package planes to PCB reference plane.
In the case of asymmetric stripline, the reference plane closest to the signal is the main reference plane.
The other reference plane can have its image current return through the main reference plane through the plane capacitance and the reference plane switching vias whenever the signals transition through the reference planes.
The key assumption for this to work is in most high speed I/O at the driver and terminator/receiver there are on die decoupling caps for the image currents to switch from ground to power at terminator (for high to low transitions) or driver (for low to high transitions).
One observation I had with actual measurements of SSO induced "crosstalk" is that it tends to saturate relatively quickly with fast edges such as DDR3 in a package. You don't need to switch the entire 108 bits of DDR3 to get the saturated effects on the victim line. Simply switching the nearest four or five neighbors will get you close enough if you have a well-designed package that spreads the power/ground pins evenly. That suggests running a TDT experiment from each aggressor to the victim and arithmetically adding the noise, assuming linearity and superposition hold.
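
A minimal sketch of that superposition step, assuming each aggressor's coupled noise on the victim has already been captured as a sampled waveform on a common time base. The sample values below are made up for illustration and are not measurements.

```python
# Sketch only: waveform samples are hypothetical, in millivolts.

def superpose(waveforms):
    """Sum per-aggressor noise waveforms sample by sample (linear superposition)."""
    return [sum(samples) for samples in zip(*waveforms)]

# One list per aggressor, nearest neighbor first; all share the same time base.
aggressor_noise_mv = [
    [0, 12, 5, -3],
    [0, 10, 4, -2],
    [0, 6, 3, -1],
    [0, 4, 2, -1],
    [0, 2, 1, 0],
]

total_mv = superpose(aggressor_noise_mv)
peak_mv = max(abs(v) for v in total_mv)
print("summed noise (mV):", total_mv)
print("peak magnitude (mV):", peak_mv)
```

The sum is only meaningful to the extent the system is linear, which is the assumption stated in the post.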
Chris Cheng
Distinguished Technologist , Electrical
Hewlett-Packard Company

Chris, sure all of that is true.  The point of dispute was whether by 
symmetrically referencing both Vdd and Vss that one could cancel SSO.  
My answer is that it definitely does nothing to cancel SSO as the di/dt 
between the package and each plane has the same sign.
Best Regards,

Hi  Steve:
    There are two points I don't quite understand:
1. For a given di/dt * N bits transitioning from low to high, relative to BOTH planes the di/dt is positive.  If the interconnect is completely symmetrical then we have half the total inductance and half the SSO amplitude.
 ----- Does this mean that a VDDIO-DQ-GND stackup can halve the SSO noise compared to GND-DQ-GND? If so, how should one trade this off, given that, as Jeff Loyer mentioned, the noisy VDDQ may "eat" the DQ margin?
2. The signal line is going high:  It is drawing positive 
convention current through the die PDN and back to the PCB PDN through 
both Lvdd and Lvss.
 --- Are the Lvdd and Lvss here located at the receiver side? And if the signal is ADD or CMD with VTT, is it better to decouple to both GND and VDDIO?

Hi Chris:
   Rambus published a paper a few years ago that separates the SSTL current loop into two types by signal frequency: a high-frequency type and a medium-frequency type. The high-frequency type goes through the on-die cap, but the low- and medium-frequency types do not; instead they go through the SMT capacitors on the board. But they didn't mention where the frequency boundary lies. What do you think about it?
   I also got a similar result: when I measure a DQ signal, setting just the nearest 5 DQ active gives nearly the same jitter/noise as having all 32 bits active.
But a strange result is that the odd-mode pattern is much worse than the even-mode one; that is, when the victim is 01010001 and the 5 aggressor DQ are 101011110, we get the worst eye diagram.
I don't know why, as we used to think that all DQ switching in the same direction would give the worst SSO noise, and therefore the worst signal jitter/noise.
   LIU Luping

There are a couple of different strategies:
1) Couple about half the lines to Vss and the remainder to Vdd. On the
PCB route each line to the rail referenced end to end. This is the idea
used by memories today: Data to Vss, A/C to Vdd.
2) Couple everything to both rails. Then route stripline in proportion
to the coupling in the package. This theoretically can work, but is very
restrictive on PCB routing.
Lvdd and Lvss are most important at the driving end. With ODT they also
matter almost as much at the Rx end. Vnoise = Lcommon*di/dt still
defines common signal noise. Limit Lcommon, di/dt, or both to get
acceptable Vnoise.
Vtt should track the AC reference(s). If only one reference is used for
a given signal line as in 1) then the decoupling is simpler.
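
The relation Vnoise = Lcommon*di/dt can also be turned around to ask how much common inductance a given noise budget tolerates. The sketch below does that; the budget, per-bit edge rate, and number of switching bits are assumed values for illustration only.

```python
# Sketch only: the budget, edge rate, and bit count are assumed values.

def max_common_inductance_h(v_noise_budget_v, di_dt_per_bit, n_switching):
    """Largest Lcommon that keeps Vnoise = Lcommon * di/dt within the budget."""
    total_di_dt = di_dt_per_bit * n_switching
    return v_noise_budget_v / total_di_dt

V_BUDGET = 0.075            # assumed 75 mV noise allowance
DI_DT_PER_BIT = 1.0e8       # assumed 10 mA in 100 ps per bit (A/s)
N_SWITCHING = 8             # assumed bits switching in the same direction

l_max = max_common_inductance_h(V_BUDGET, DI_DT_PER_BIT, N_SWITCHING)
print(f"Lcommon must stay below roughly {l_max * 1e12:.0f} pH")
```

Halving di/dt (slower edges) or halving the number of lines sharing the inductance doubles the allowable Lcommon, which is the "limit Lcommon, di/dt, or both" trade stated above.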

In terms of saturating noise with the immediate aggressors, we 
demonstrated similar SSO results in our 2007 DesignCon paper.  The 
aggressors switching with or against the victim yield push-out versus 
pull-in timing.  Resonances in the PDNs will determine what patterns 
cause the worst problems.

Hi Steve:
  Thank you for your quick reply; your explanation is very helpful to me. Regarding strategy 2), coupling everything to both rails: you mentioned that we have half the total inductance and half the SSO amplitude. Is that compared to a stripline coupled to two Vss planes, or to a microstrip coupled to one Vss plane? Furthermore, for DDR4 Pseudo-Open-Drain Logic (PODL), all signals should couple to Vss, right?
  LIU Luping

There seems to be some confusion going around about the channel on the
PCB, and the package launch.
2) can only reduce the SSO in half under a special set of conditions.
That's the best it can do.
You can view 2) as taking N 50 Ohm lines and converting them to 2N
paired 100 Ohm lines, one line for each that references Vss and another
that references Vdd. If Lvdd and Lvss between the PCB and die are equal,
then the di/dt through Lvdd and Lvss are each half what it would have
been with N 50 Ohm lines referencing only Vdd or Vss with the same Lvss
or Lvdd. The assumption of equal Lvss and Lvdd is questionable. It
relies on a package designer either naively choosing the number of balls
for each rail, or planning for this balanced stripline PCB routing.
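
To put numbers on the equal versus unequal case, here is a small sketch. It treats Lvdd and Lvss as two simple parallel inductors sharing one total di/dt, which is a simplifying assumption made here and not a model taken from the thread; the inductance and di/dt values are hypothetical.

```python
# Sketch only: inductance values are hypothetical, and the split assumes the
# two rails act as simple parallel inductors sharing one total di/dt.

def parallel(a, b):
    """Parallel combination of two impedances or two inductances."""
    return a * b / (a + b)

def rail_di_dt(total_di_dt, l_this, l_other):
    """di/dt carried by one rail of an inductive current divider."""
    return total_di_dt * l_other / (l_this + l_other)

# Two 100 ohm references seen together by one signal line.
print(f"combined impedance: {parallel(100.0, 100.0):.0f} ohm")

TOTAL_DI_DT = 8.0e8   # assumed aggregate di/dt in A/s
for l_vdd, l_vss in [(200e-12, 200e-12), (600e-12, 200e-12)]:
    vdd_share = rail_di_dt(TOTAL_DI_DT, l_vdd, l_vss) / TOTAL_DI_DT
    bounce_v = parallel(l_vdd, l_vss) * TOTAL_DI_DT
    print(f"Lvdd={l_vdd * 1e12:.0f} pH, Lvss={l_vss * 1e12:.0f} pH: "
          f"Vdd carries {vdd_share:.0%}, bounce {bounce_v * 1e3:.0f} mV")
```

Under these assumptions the even split only appears when the two inductances match; with a larger Lvdd most of the di/dt shifts to Vss and the bounce rises above the balanced case.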

A smart package designer planning for the method 1) would allocate
enough Vdd and Vss connections to realize appropriate Lvdd and Lvss for
the number of lines that will reference each. So, if a memory controller
had 24 A/C that the designer intends should reference PCB Vdd and 128
data lines that should reference PCB Vss, then the ball pattern should
have a ratio of about 6 Vss for every Vdd. The key thing to remember is that SSO builds across shared impedance. If the shared impedance is very low relative to the di/dt, and there are not resonance problems, then the SSO will be small. A very conservative package would carry one
signal return ball for each single ended signal ball.
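
The ball ratio in this example follows from simple proportion. A minimal sketch, using the 128 data and 24 A/C line counts from the text (the function name is made up for illustration):

```python
import math

def vss_per_vdd(n_vss_referenced, n_vdd_referenced):
    """Ratio of Vss to Vdd return connections, proportional to the lines each serves."""
    return n_vss_referenced / n_vdd_referenced

ratio = vss_per_vdd(128, 24)   # 128 data lines on Vss, 24 A/C lines on Vdd
print(f"raw ratio: {ratio:.2f} Vss per Vdd, rounded up: {math.ceil(ratio)}")
```

The raw proportion is a little over 5; rounding up gives the "about 6" quoted above.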

Signals should only make absolutely necessary reference changes.
Differential or not, I recommend routing high speed data lines against a contiguous Vss reference.

No disagreement here. SSO can never be canceled even with both vdd/vss planes. Loop inductance can never be zero.
Chris Cheng
Distinguished Technologist , Electrical
Hewlett-Packard Company

I have included a drawing which will hopefully clear things up. I have 
added an inductor in series with the supply to emphasize that the supply 
can be AC high impedance. All currents in green according to my analysis 
are DC. Blue represents logic low currents and red logic high.
In differential signaling, a constant current drawn from the supply is 
steered into the true or complement lines. In symmetrically referenced 
signaling (SRS), the signal/Vdd line can be viewed as the true and the 
signal/Vss line can be viewed as the complement. In SRS, as in differential signaling, a constant current from the supply is steered 
into one or the other line.
There is no current injected into Vdd that needs to return on Vss. This 
means the Vdd/Vss cavity is not excited.

The problem of finding the boundary of high/medium frequency degenerates to a core power/gnd type analysis between the I/O power/gnd distribution. What is interesting here is silicon designers had a hard time understanding the I/O power distribution around the pad ring (IBM used to call it redistribution area).
I/O designers intuitively run their power/gnd grid parallel to the edge of the die because that's what good for on die distribution.  That turns out to be the worst you can do for package design.
Compound that with the fact that most ASICs I've seen are pad limited on die size, so people 
double stack their I/O ring. When all is said and done, you've got your modern 21st century equivalent of a bond wire.
The net result is that your on-package decoupling caps become ineffective.
As for the odd mode SSO being worst case: that is expected, especially for a source synchronous bus like DDR.  Odd mode impedance will create reflections impacting your waveform on top of the odd mode SSO "pull in", which will impact your hold w.r.t. the strobes. SSO noise and its reflections will probably not settle within one bit time at DDR3 or above speed, so simulation is important to account for all the effects. Calling IBIS models with SSO, are we there yet ???????
Chris Cheng
Distinguished Technologist , Electrical
Hewlett-Packard Company

A few questions about that picture.
a) shouldn't that inductor and battery be on the right side of the pmos, with a mutual term between it and the signal path ?
b) shouldn't there be another inductor between the battery -ve and the nmos source terminator, also on the right side and with a mutual term between it and the signal path ?
c) shouldn't there be a low impedance cap between the sources of the p/nmos and the high and low sides of the terminators on the right ?
d) shouldn't a) and b) also be applied to the high and low sides of the terminators on the right ?
Now just for fun if we make the batteries different between a/b and d or make the inductance really large on one side (e.g. like someone cheating on DIMM pins and have only 1/3 power pin vs. ground)
How does the SRS work now ?
Chris Cheng

a) Since there is only DC going through the battery and the inductor, 
they can be connected to the right of the transistors or even to the 
right of the terminations without affecting circuit operation.
b) SRS is about moving from a switched current to a steered current 
configuration that relaxes PDN design requirements. The inductors and 
mutual terms you refer to are, I think, signal impedance discontinuities. 
Signal path impedance discontinuities have to be dealt with, SRS or not.
c) No, that is the key feature of SRS. High impedance PDN is ok.
d) Same comments as (a,b).
"Now just for fun if we make the batteries different between a/b and d or make the inductance really large on one side (e.g. like someone 
cheating on DIMM pins and have only 1/3 power pin vs. ground)"
Once symmetry is broken, it is no longer SRS!

Vinu, using two return planes does not cancel the current, it just 
divides it.   The cavity is not only excited, but because it must 
maintain DC isolation, we have a much harder time stitching it, making 
it harder to avoid resonance problems than a cavity that is Vss only.

Best Regards,

Hi steve and all:
  Thank you for your reply. Can we consider that strategy 2) just 
splits the SSO noise across two rails instead of one, 
reducing the noise on each rail, while the sum of the noise on the two rails equals that of strategy 1)?
  To make things clearer, let's consider a practical design case (here I have collected some earlier discussion on the SI-List; thanks to everyone): a controller with 10 pcs of DDR3 @ 2133 Mbps (NOT DIMM), with 160-bit DQ, 20 differential DQS, 80-bit address, 20 differential CLK, and other signals.
To simplify the analysis, assume the pin map has been designed, as Steve suggests, with equal Lvdd and Lvss (at the package).
The board has 14 layers, and there are two possible solutions:
Solution A:
The stackup is :

Key notes:
1) All DDR3 signals are routed on ART03/05/10/12, coupled to two GND planes.
2) VDDIO is on PWR07; the distance between GND06 and PWR07 is about 3 mil.
3) VTT is decoupled to GND only.
4) CLK VTT is the same as 3).
5) Vref is decoupled to both VDDIO and GND.

Advantage: all signals couple to a "clean" GND, and VDDIO couples closely to GND, which may improve SSO noise decoupling.
Disadvantage: double the SSO noise compared to solution B?

Solution B:
The stackup is :

Key notes:
1) All DDR3 signals are routed on ART03/05/10/12, coupled to BOTH the VDDIO and GND planes.
3) VTT is decoupled to both VDDIO and GND.
4) CLK uses a 36 ohm floating termination, as in the DIMM spec, and is decoupled to VDDIO only.
5) Vref is decoupled to both VDDIO and GND.

Advantage: all signals couple to both VDDIO and GND, which may halve the SSO noise on the VDDIO plane?
Disadvantage: 1) A noisy VDDIO will reduce the signal margin;
            2) VDDIO is not closely coupled to GND, which may worsen the VDDIO noise?
I would choose solution A. What are your opinions?
Thanks in advance,
LIU Luping

Liuluping, SSO noise is almost exclusively the result of di/dt through
common inductance. Minimizing SSO requires that we minimize the product
of series inductance and di/dt from the die launch through the entire
channel. If you have X*N Vddq balls and Y*N Vss balls, the ideal
distribution is X SSO signals referenced to Vddq for every Y SSO signals
referenced to Vss. Only if X = Y will you get the lowest noise by evenly
splitting SSO di/dt between PCB Vss and PCB Vddq. In that case you have
the option of referencing half of the signals to X and half to Y or
alternately referencing all signals evenly to X and to Y. That gets
us out of the package.
Once we are in the PCB, every time that a signal traverses a cavity, ANY cavity it excites that cavity. If the cavity has the same DC rail on both planes, then we can button down the cavity with stitch vias.
Otherwise, we have to use bypass capacitors. Barring embedded capacitors, the deeper the cavity is in the board, the harder it will be for bypass capacitors to be effective due to the large loop inductance imposed by the long distance between the cavity and the capacitor mounting surface.
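
To see why the loop inductance dominates, a mounted bypass capacitor can be approximated as a series R-L-C. The sketch below uses that textbook model; the capacitance, ESR, and the two loop inductances (standing in for a shallow and a deep cavity) are assumed values, not data from this thread.

```python
import math

def bypass_impedance_ohm(freq_hz, cap_f, loop_l_h, esr_ohm=0.0):
    """|Z| of a series R-L-C model of a mounted bypass capacitor."""
    w = 2 * math.pi * freq_hz
    reactance = w * loop_l_h - 1.0 / (w * cap_f)
    return math.hypot(esr_ohm, reactance)

CAP_F = 100e-9   # assumed 0.1 uF capacitor
ESR = 0.02       # assumed equivalent series resistance, ohms

# Assumed total loop inductance: short path to a shallow cavity vs long path
# to a cavity deep in the stackup.
for label, loop_l in (("shallow cavity", 0.5e-9), ("deep cavity", 2.0e-9)):
    for f in (100e6, 500e6, 1e9):
        z = bypass_impedance_ohm(f, CAP_F, loop_l, ESR)
        print(f"{label}, {f / 1e6:.0f} MHz: |Z| = {z:.2f} ohm")
```

Well above the capacitor's self-resonance the impedance is essentially 2*pi*f*L, so every added nanohenry of via and mounting loop costs several ohms near 1 GHz regardless of the capacitance value.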

So the best strategy is to: a) Avoid exciting ANY cavity, and b) Where
we excite a cavity minimize what is required to hold the disturbance to an acceptable level. a) and b) will take us to the five types of route
classes, or as I like to say the five rings of EMC/signal integrity hell
as outlined by authors like Bruce Archambeault:
a) Route against one surface of one plane end to end.
b) Route against the surfaces of one plane end to end.
c) Route against multiple planes that are stitched together with vias.
d) Route against multiple planes that are stitched together with bypass
capacitors, IE the two voltages of a given rail.
e) Route against multiple planes that are stitched together with series
bypass capacitors, IE unrelated voltages.

The problem with strategy #2 is that it begins life at the fourth ring
of EMC / signal integrity hell. Whereas #1 never gets deeper than the
third ring. Within the PCB, strategy #2 will never be able to outperform
strategy #1. Noted EMC and signal integrity specialist and former First
Lady Nancy Reagan advises: "Just say no to mixed PCB return references."

To excite the cavity there must be a current in Vddq with an image 
current on Vss in the opposite direction. In SRS, the image current for 
the signal is of equal magnitude and direction on both planes. So the 
cavity cannot be excited. As for via transitions exciting the cavity, an SRS signal via would need a Vddq via and Vss via (but no bypass 
capacitors) to minimize excitation.

Solution B will provide SRS benefits with the following
1) Controller package should have DQ referenced symmetrically to both
Vddq and Vss.
2) Address/Control do not benefit because they use VTT termination. SRS
needs symmetric thevenin termination.
3) First cycle after tristate will not see SRS benefit.

They are proportioning constants. The results that one gets with method
2 versus certain variations of method 3 depend highly on geometry of the cavities and the stitch via pattern.
Best Regards,

Have you ever looked at the number of ground vs Vcc bumps on a die?  Start
there and you'll see that the die itself is unbalanced.  From that point
fan out from the local die bumps on an I/O driver, and you'll almost always find more ground bumps in the local vicinity.  Then look at the point on the die that the signal is inserted into the package, you'll almost always find Vcc/Vss asymmetry.  You will not be guaranteed that the current flowing to/from the I/O driver Vdd/Vss nodes continues to flow along the same path as the signal.  The signal is constrained by its trace, but the Vss and Gnd current flow is constrained by the extent of the package planes, bumps and balls.  In most packages this generally looks like a slice of pie ( a sector of a circle) that is narrow at the die and wide at the outside extent of the package.  Due to the magic of spreading inductance current supplying the driver may take a different path than the current traveling with the signal wavefront.  Since there are almost always more Ground balls than power, you will necessarily see a divergence of the current flow.  This is just a different way of saying the inductance is different.  The probability that the currents flowing on the Vcc and Vss planes are identical and balanced at the point of signal insertion through the via, is about as close to zero as I can imagine.

Vinu, no that is not correct.  You need to lose the electron hose model 
and use moving wavefronts.  All one needs to excite a cavity, even a 
closed one like a beer can is:  A cavity with conductive walls, 
dielectric volume, and a launch mechanism.  A signal conductor in the 
dielectric provides the launch mechanism on digital PCBs.  If we obtain 
a beer can and pierce it with an ice pick ( being careful to preserve 
the important contents for later "disposal" ), and then insert a small 
insulating grommet into each hole and then thread a wire through, we can then connect one port of a VNA to one end of the wire and the can, and another port to the other, and then entertain ourselves observing the S11 and S21 plots while properly disposing of the saved can contents.  
The plots will show the resonant frequency of the beer can.  They can 
only do that because we successfully excite the cavity.
Best Regards,

"Have you ever looked at the number of ground vs Vcc bumps on a die?"
This discussion started with that question. On a DDR3 DRAM, in the DQ 
pin field there are roughly equal number of Vddq and Vssq balls.

Signal entering/exiting a cavity can excite a cavity and that happens 
SRS or not. One would need return vias to minimize that in both cases.

Vinu, please see my earlier response to Luping Liu.  In the best case 
where the inductance between each rail and each signal exactly match, 
the best that one could do trying to maintain equally split references 
through the channel is no better, and in practice will always be worse 
than referencing half of the signals to Vddq and half to Vss.
Best Regards,

How about on the controller package?

If you want to use a controller with SRS then you have to do the same.

That would be quite correct, Vinu.  You now understand.  In a Vss/Vcc 
referenced stripline, there is no effective way to stitch the planes.  
In a Vss/Vss reference stripline there is.  In one we can stitch with 
vias.  In the other, we gotta go through caps.  For a DDR signal 
switching in the vicinity of 1 GHz, there ain't no bypass cap that can 
effectively help with Vss/Vcc referencing.  Cavity injection occurs, is not suppressed, and tends to rattle around, resonate, and cause worse 
SSO than just inductive bounce.  Steve and I showed this in our 
DesignCon paper from several years ago.
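
For a feel of where plane-pair resonances sit relative to signaling near 1 GHz, the sketch below uses the standard formula for a thin rectangular parallel-plate cavity with open edges, f = c / (2*sqrt(er)) * sqrt((m/a)^2 + (n/b)^2). The plane dimensions and dielectric constant are assumed values; real boards with cutouts, stitching, and capacitors will differ.

```python
import math

C0 = 299_792_458.0   # speed of light in vacuum, m/s

def cavity_resonance_hz(a_m, b_m, er, m, n):
    """Resonance of mode (m, n) for a thin rectangular plane pair with open edges."""
    return C0 / (2.0 * math.sqrt(er)) * math.hypot(m / a_m, n / b_m)

A_M, B_M, ER = 0.20, 0.10, 4.0   # assumed 200 mm x 100 mm planes, er about 4

for m, n in [(1, 0), (0, 1), (1, 1), (2, 0)]:
    f = cavity_resonance_hz(A_M, B_M, ER, m, n)
    print(f"mode ({m},{n}): {f / 1e6:.0f} MHz")
```

With these assumed dimensions the first few modes land in the hundreds of MHz, inside the spectrum of a DDR3 edge.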

A DDR bus cuts out a section of the package that looks like a slice of 
pie.  The best you can do is to balance the inductance between Vcc and 
Vss as well as possible, but there are always additional grounds for 
other "stuff".  You can choose to isolate your DDR ground from the 
remainder of the grounds in your package, to force driver currents to 
flow only on a slice, but this can cause more issues than it solves.  
If gnd is unified in the package, then there will always be more 
available ground balls and bumps than Vcc.  This causes inductive 
imbalance.  In any case, in "slice of pie" DDR ground and power planes, 
driver current will spread out amongst all the balls, whereas signal 
current will remain localized to the signal trace.  Your "balancing 
currents" are somewhere else.  Thus, the signal wavefront, and its 
associated instantaneous image currents, will most definitely excite 
the plane cavity of the PCB that the device is attached to.

Thanks for the simple explanation Steve.  Somehow I can never get my wife to resonate with me when I start drinking beer?

For a signal going from one Vddq/Vss cavity to another, you only need a Vddq via and a Vss via. A capacitor is not needed.

Sorry, wrong answer.
So, one Vss via dangling in the cavity and one Vdd via dangling in the cavity: how is the AC short necessary to suppress cavity waves made?
You have wrong ideas regarding what a via can do when it is only connected to one net.

Vinu, when you use mixed references you don't get to stitch with vias 
that run the cavity height.  You have to stitch with capacitors, which 
means a loop all the way from the cavity to the closest capacitor plates 
and back.  It's a recipe that only your capacitor vendor and contract 
assembly house could love for all the extra money you will have to pay 
them to try and make it work at high speed by turning the board black 
with bypass capacitors.
Best Regards,
Steve, can't localized inter-plane capacitance and BC help stitch the VDD/VSS planes in the 100-500 MHz range?

Vinu, the resonances that we are trying to suppress are within each 
cavity.  The additional problems of switching between multiple cavities 
only aggravate the situation.

Michael, for signal cavities we want to propagate signal energy and 
minimize the amount of energy that bounces around in the cavity proper.  
Even if we have arbitrary choice of dielectric materials, the interplane 
capacitance in a signal cavity is limited by the trace target impedance 
and how thin we can reliably etch the trace.  That shunt capacitance of the cavity can only be treated as lumped up to the modal frequencies of the structure as modified by the bypass network attached to it.  
Assuming we did not go crazy for bypass caps, the first approximation of the modal frequencies will be close to that of the structure without any bypass caps at all.  The irony is that for power distribution, large 
extents help tame resonances.  But for signal cavities we want small 
extents that exhibit high resonant frequencies.  You can think of it 
crudely as:  Power favors large area reservoirs.  Signals favor 
containment in tiny beer cans.
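
A minimal sketch of why small extents give high resonant frequencies. It assumes an ideal, lossless rectangular plane pair with open edges and no bypass capacitors, for which the mode frequencies are f = c / (2 * sqrt(er)) * sqrt((m/a)^2 + (n/b)^2). The dielectric constant of 4.0 and the two cavity sizes are illustrative assumptions, not values from this thread, and treating a stitched cell as simply a smaller cavity of the same kind is a crude approximation.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def cavity_mode_hz(a_m, b_m, er, m=1, n=0):
    """Resonance of mode (m, n) for an ideal a x b parallel-plane cavity."""
    return C0 / (2.0 * math.sqrt(er)) * math.hypot(m / a_m, n / b_m)


# Assumed, illustrative numbers: er = 4.0 and square cavities.
large_hz = cavity_mode_hz(0.100, 0.100, 4.0)  # 100 mm x 100 mm plane pair
small_hz = cavity_mode_hz(0.010, 0.010, 4.0)  # 10 mm x 10 mm "beer can"

print(f"100 mm cavity, first mode: {large_hz / 1e9:.2f} GHz")
print(f" 10 mm cavity, first mode: {small_hz / 1e9:.2f} GHz")
```

Under these assumptions, shrinking the cavity extent by a factor of ten raises the first modal frequency by the same factor.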

I was talking about reducing injection by providing return vias for both 
planes, not about suppression.

Vinu, there seems to be some discussion at cross purposes going on.  If 
we reference a signal to two different planes then the coupled energy 
has to make it end to end.  At the die launch we can rely on die 
capacitance to provide the necessary driver coupling.  Then under a 
first assumption that we approximate equal coupling between each signal 
and both rails, and that we introduce only equal inductance between the 
die Vddq, and Vss through the package and into the PCB, we can then turn 
our attention to the PCB part of the channel.  And this is where it 
looks like we are having multiple conversations.
Let's start with a simple case where there is a single stripline 
cavity.  The signal energy that we launch into that cavity will excite 
it.  The cavity once excited will resonate at modal frequencies.  If we 
want to drive those frequencies up, then we can break the cavity up 
into smaller effective cavities, smaller beer cans if you will, by 
stitching.  
Because the two rails require DC isolation, we cannot stitch with vias 
that connect the two planes together.  We will have to stitch through 
capacitors, which today means returning all the way to the surface and 
back with vias.  If the cavity is in the middle of a 0.062" PCB and we 
use regular MLCCs then we are talking about 1nH or more loop inductance 
per capacitor.  For signals with 100ps Tr, which has an Fknee near 3GHz, 
those capacitors look like 20 Ohms or so.  In order to look like a low 
impedance compared to the cavity, as is necessary to affect the 
resonances, we are going to need a lot of capacitors densely packed.  
If we don't tame the resonances, then signals that excite them get the 
favor returned by the resulting voltages coupling back into the 
signals, as well as setting up EMI headaches.
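
The "20 Ohms or so" figure can be checked with a few lines of arithmetic. This is a sketch only: the 0.35/Tr knee-frequency rule of thumb and the 1 Ohm target impedance are assumptions introduced here, and the capacitor count ignores mutual inductance, via sharing and the capacitors' own ESR.

```python
import math


def knee_frequency_hz(rise_time_s, k=0.35):
    """Rule-of-thumb knee frequency; k = 0.35 is an assumed convention."""
    return k / rise_time_s


def inductive_reactance_ohm(freq_hz, inductance_h):
    """Magnitude of the impedance of an ideal inductor: |Z| = 2*pi*f*L."""
    return 2.0 * math.pi * freq_hz * inductance_h


f_knee = knee_frequency_hz(100e-12)          # 100 ps rise time
z_cap = inductive_reactance_ohm(3e9, 1e-9)   # 1 nH loop at 3 GHz, as in the post

# Hypothetical target: ideal parallel capacitors needed to reach 1 Ohm.
target_ohm = 1.0
caps_needed = math.ceil(z_cap / target_ohm)

print(f"knee frequency: {f_knee / 1e9:.1f} GHz")
print(f"|Z| of one capacitor loop at 3 GHz: {z_cap:.1f} Ohm")
print(f"ideal parallel capacitors for {target_ohm} Ohm: {caps_needed}")
```

Above its series resonance a mounted capacitor is dominated by its loop inductance, so adding capacitance does not help; only more parallel paths or a shorter loop does.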

Now, if we take the same stripline and make both planes Vss, then we can stitch together with vias.  The resulting impedance  will be much lower, as well as the required real estate per short.  One via effects a short instead of a via pair throughout the PCB, in addition to the surface area of the bypass caps.  It's a completely different and far more manageable problem.
Best Regards,

Hi Chris:
   Sorry for resending.
   What puzzles me is this: when DDR3 runs at up to 2133 Mbps, we will certainly face noise around 1 GHz on the power plane. Should the current loop at this frequency go through the die caps instead of the package caps? Does that mean there is nothing we can do to lower the noise at this frequency, including the reference plane optimization being discussed now?
   As for the odd-mode pattern problem, I also did some calculations. With 
Zuncoupled = (L/C)^0.5, Zodd = ((L-Lm)/(C+Cm))^0.5 and Zeven = ((L+Lm)/(C-Cm))^0.5, if Zuncoupled = 50 ohm (microstrip), then Zodd ~ 49 ohm and Zeven ~ 71 ohm respectively. It seems the odd-mode pattern should be better with respect to reflections due to impedance mismatch, but what we see is that the odd-mode pattern has more noise (~100 mV at DDR3) than the even-mode pattern.  
   Thanks and regards,
  LIU Luping
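
The three impedance expressions above can be evaluated directly. The sketch below uses the same sign convention as the message (C + Cm for odd mode, C - Cm for even mode). The per-unit-length values are illustrative assumptions chosen only so that the uncoupled impedance comes out at 50 ohm; they are not the values behind the 49 ohm and 71 ohm figures quoted above.

```python
import math


def mode_impedances(l, c, lm, cm):
    """Return (Zuncoupled, Zodd, Zeven) from per-unit-length L, C, Lm, Cm."""
    z_uncoupled = math.sqrt(l / c)
    z_odd = math.sqrt((l - lm) / (c + cm))
    z_even = math.sqrt((l + lm) / (c - cm))
    return z_uncoupled, z_odd, z_even


# Assumed, illustrative per-unit-length parameters (H/m and F/m).
L_SELF = 300e-9
C_SELF = 120e-12
L_MUT = 30e-9
C_MUT = 6e-12

z0, z_odd, z_even = mode_impedances(L_SELF, C_SELF, L_MUT, C_MUT)
print(f"Zuncoupled = {z0:.1f} ohm, Zodd = {z_odd:.1f} ohm, Zeven = {z_even:.1f} ohm")
```

With any positive Lm and Cm in this convention, Zodd falls below Zuncoupled and Zeven rises above it.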

Maybe we should temporarily set aside practical considerations. If you 
had an SRS configuration with three wires, Vddq/signal/Vss, do you agree 
that it will steer current and only draw DC from the supply? In contrast, 
any other configuration would draw pulsed current from the supply.

Vinu, the problem is this:  At the boundary of the component to the PCB, 
the signal forms two transmission lines that each carry the same 
polarity transitions with respect to each of the image planes.  That 
transition carries through the shared impedance of each of the 
respective rails between the package and the PCB.  Voltage developed 
across those shared impedances is the stuff that SSO is made of.
I think that you have confused this configuration with the notion of a continuous transmission line signaling system.  In a continuous 
transmission line signaling system the DC current is constant and 
hypothetically, the AC current approaches zero.  An ideal differential 
signaling system has that behavior.  In such a system, signaling is done 
by changing the current distribution between lines, but not the total 
current.  If the lines are all close together then the approximation to the ideal can be pretty good.  But that is a very different beast than what I believe you have been describing.
If we want to approach zero AC current in the power distribution 
interconnect, then the even mode signal current must approach zero.
Best Regards,

Hello Itzhak,
One can also play with 1mil or less thick buried capacitance layers to add in the ground plane reference for the stripline and make a high frequency transition from power Vdd ref to Vss coming out of the package.  It adds cost, but it also includes additional high frequency decoupling for the power plane.  The 3D-EM simulators always provide great visualization for optimizing this type of transition or the placement of capacitors.
Heidi Barnes

> Steve,
> Let's leave out the component/PCB interface for now and consider a
> simple 3-conductor flat ribbon cable for the transmission line
> connecting the TX to the terminations. The center conductor is the
> signal, Vdd and Vss one on each side of the signal. Signal/Vss and
> Signal/Vdd each having a characteristic impedance of 100 ohm.
> Do you agree that the above transmission line when used in the SRS
> configuration I described will result only in DC draw from the supply?
No, I do not.
> I am not sure what you mean by  continuous transmission line signaling
> system but the properties you describe seem to apply to what I described
> as SRS.
This has been expressed in different ways over the past 25 years.  There 
was a DesignCon 2012 paper that dealt with this old idea that some 
think is new again.
> If we define SSN as noise caused when multiple buffers/terminations
> share a common PDN impedance, SRS will avoid this noise.
Again, I disagree for the reasons I have stated:  When the AC signal 
current transitioning an interface is non-zero, then the return current 
through the common impedance of that interface is also non-zero, and by definition the result is SSN.
> If we now compare the practical implementation of conventional signaling and SRS, the question becomes how much of the margin we gained from the lack of SSN do we have to give up. The answer will determine if SRS is useful in practice.
Again, if we look at the five rings of EMC hell, we find that using 
multiple transmission lines in parallel, each coupling to planes on 
independent DC rails and therefore coupled through capacitors will never 
be better than using single transmission lines that reference a single 
rail, and will usually be much worse.
> In conventional signaling we try to keep a constant signal/Vss impedance end-to-end. In SRS we have to do that for Signal/Vss and Signal/Vdd.
Yes, but as I have attempted to explain, all that does for you in the 
best case is halve the common impedance across which the SSN develops.  
In practice, depending on the distribution of impedance in each 
rail, it can make things much worse than the alternatives.
> Thanks,
> steve

Sorry for the delay, work has been busy. 
I do believe the high speed current loop will always come from the on-die decoupling, but that doesn't mean reference plane optimization is not important.
How tightly you control the reference plane w.r.t. the signal path determines how strong a mutual term you can get to lower your overall signal path loop inductance. That is a di/dt issue, not a decoupling issue.
As for the odd mode vs. even mode SSO, you are assuming the ringing is settled within one cycle and the perfect sampling point is in the middle of the UI. In simulations, as in real life, we observed that the optimal sampling point is slightly later than 1/2 UI. That makes an odd mode "pull in" more problematic than an even mode "push out".
Chris Cheng
Distinguished Technologist , Electrical
Hewlett-Packard Company
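
A minimal sketch of the mutual-term argument above, using the standard two-conductor relations L_loop = L_signal + L_return - 2*M and V = L_loop * di/dt. All component values below are illustrative assumptions, not numbers from this thread.

```python
def loop_inductance(l_signal, l_return, mutual):
    """Loop inductance of a signal/return pair with mutual inductance M."""
    return l_signal + l_return - 2.0 * mutual


def didt_noise(l_loop, delta_i, rise_time):
    """Voltage across the loop inductance for a linear current ramp."""
    return l_loop * delta_i / rise_time


# Assumed values: 2 nH partial inductance per path, 20 mA step in 100 ps.
L_SIG = 2e-9
L_RET = 2e-9
DELTA_I = 20e-3
T_RISE = 100e-12

loose = loop_inductance(L_SIG, L_RET, 0.5e-9)  # reference far from the signal
tight = loop_inductance(L_SIG, L_RET, 1.5e-9)  # reference close to the signal

v_loose = didt_noise(loose, DELTA_I, T_RISE)
v_tight = didt_noise(tight, DELTA_I, T_RISE)
print(f"loose coupling: {loose * 1e9:.1f} nH -> {v_loose:.2f} V")
print(f"tight coupling: {tight * 1e9:.1f} nH -> {v_tight:.2f} V")
```

The on-die decoupling does not appear anywhere in this calculation, which is the sense in which the effect is a di/dt issue rather than a decoupling issue.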

Thanks Chris, your explanation is very helpful, especially on the sampling point problem.
 Thanks also to Steve, Vinu, Scott and all the experts. Though I do not fully 
understand everything, your discussion is wonderful. Maybe symmetrically referencing both Vdd and Vss in a push-pull driver system can't cancel the SSO to zero because of the unequal inductance,
but it may reduce the SSO noise close to zero. I think that was the original purpose of a demo board design from a top IC vendor, whose application notes strongly recommend that DDR3 signals symmetrically reference both Vdd and Vss.
I also searched the web. The oldest paper on this topic that I could find is:  
Modeling of simultaneous switching noise in high speed systems 
Sungjun Chun; Swaminathan, M.; Smith, L.D.; Srinivasan, J.; Zhang Jin; Iyer, 

It is referenced in Madhavan Swaminathan and A. Ege Engin's "Power Integrity Modeling and Design for Semiconductors and Systems", chapter 3.
 It shows that, for an unterminated signal, the return current flows in the Vdd plane when the signal transitions from low to high; when the signal transitions from high to low, the current in the signal line flows into the GND plane.
So an intuitive thought was: why not symmetrically reference both Vdd and Vss to reduce the reference plane transitions, and so reduce the SSO noise? 
 Thanks again to all the experts, and sorry to Hirshtal for borrowing your 
questions :)

"That's all fine and well but does nothing for SSO as seen by the transmitter die, or the noise injected into the cavity."
v(Vddq,Vssq) is the transmitter die noise (SSN), and it is only a few mVpp with SRS and much larger with a Vss-only reference. v(Vddq,Vssq) is also the noise injected into the cavity (a few mVpp). One can observe the other end of the cavity, v(termVddq,termVssq), and also see only a few mVpp with SRS.
For a Vss-only referenced transmission line, a Thevenin termination would require a low impedance PDN to work. In SRS you may have noted that there is no low impedance PDN between termVddq and termVssq. Do you have a topology in mind that uses Thevenin termination alone without SRS and can achieve constant current draw from the PDN? You could post a modification to the spice deck if that helps avoid any misunderstanding.

Vinu, any examination of the voltage potential at the die side of Vss, 
or the die side of Vdd will show that each suffers SSO.  Take any planar 
cross-section through a schematic representation of the die to the 
package or anywhere in the interconnect and KCL still applies:  The net 
current into the slice is the same as out of the slice.  When the signal 
bus is not by itself balanced to exhibit zero switching current, the 
returns into the package will, no matter what the configuration, divide 
the corresponding complementary current between them.  In order to 
cancel SSO, the return currents have to cancel.  They do not cancel.  
They divide between the rail, or rails, used, based on the number and 
effective inductance of the interconnects between the board and the 
package or, going up the chain, between the package and the die.
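
A minimal sketch of the current-division point, under a strong simplifying assumption introduced here: the Vss and Vdd return paths are modeled as two ideal inductors in parallel (that is, perfectly AC-coupled at both ends), so the return current splits in inverse proportion to inductance. The inductance values are illustrative only.

```python
def split_return_current(i_total, l_vss, l_vdd):
    """Split a return current between two parallel inductive paths.

    Each path carries a share inversely proportional to its inductance.
    Returns (i_vss, i_vdd).
    """
    total = l_vss + l_vdd
    return i_total * l_vdd / total, i_total * l_vss / total


def parallel_inductance(l_a, l_b):
    """Equivalent inductance of two uncoupled inductors in parallel."""
    return l_a * l_b / (l_a + l_b)


# Assumed case: more Vss balls than Vdd balls, so Vss has half the inductance.
i_vss, i_vdd = split_return_current(1.0, 0.5e-9, 1.0e-9)
print(f"Vss share: {i_vss:.3f}, Vdd share: {i_vdd:.3f}, sum: {i_vss + i_vdd:.3f}")

# Perfectly balanced rails: the best case only halves the common inductance.
l_balanced = parallel_inductance(1.0e-9, 1.0e-9)
print(f"balanced equivalent inductance: {l_balanced * 1e9:.2f} nH")
```

The shares always sum to the total, as KCL requires: the return current is redistributed between the rails, not cancelled.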

Consider the following two cases:
a) A device with plenty of on-die Vddq/Vssq capacitance and a poor 
pinout with say 1 return pin for every 20 signal pins.
b) A device with zero on-die capacitance but with good signal return, 
say one return pin for every signal pin.
I would classify (a) as a device with a crosstalk problem and (b) as one with an SSN problem. Perhaps some people classify both as SSN. You seem to be describing (a).
SRS will fix (b) but will neither help nor hurt (a).