MAPLD 2009
Today's FPGAs
[Figure: block diagram of an FPGA design integrating a Leon CPU, AMBA AHB and AMBA APB buses, Boot PROM, Memory Controller, Timers, IRQCtrl, I/O port, UART, SpaceWire Controller, two HDLC Controllers (Up/Downlink) and a CAN Controller, with external SDRAM, Configuration PROM, CLK Generator, +3.3 V supply, 1.1 V Linear Regulator, PPS in, PPS out and JTAG; external links labeled "RS333", SpaceWire, HDLC Up/Downlink and CAN]
Fabrication advances provide more available silicon area
More functionality can weigh less and take up less space
Integrating/reusing capabilities lowers cost
Image Processing
James W. Ramsey
Communications
Such integration usually involves multiple independent clock domains, which leads to clock-domain crossings and metastability errors!
Metastability errors:
  corrupt control and data signals
  are subtle, intermittent, unpredictable
  are the 2nd major cause of respins
  are difficult to reproduce and debug
  are temperature, voltage, and process sensitive
  will only occur in hardware; often in the final design
Traditional verification techniques do not work for CDC signals
A CDC Verification methodology is needed to reduce the risk of CDC related data errors
Periodic pulsing signal
[Figure: clock waveform; labels: Vee, Vss: GND: 0V]
Digital logic uniformly connected to this signal
Acts as the Symphony Conductor - keeps logic in sync
Action happens across the logic at one specific point
  Typically the rising edge
(Also known as a latch, flip-flop, etc.)
Contain transistors that trap the input value at the appropriate time
  E.g. rising edge of the clock
-- simple D-type flip-flop
process (CLK)
begin
  -- Q captures D on each rising edge of CLK and holds it otherwise
  if rising_edge(CLK) then
    Q <= D;
  end if;
end process;
[Figure: animation of the flip-flop with inputs D and CLK; successive frames show the value pairs 0 0, 1 0, 1 1, 0 0]
Transistor Model of a D flip-flop
Only works if D has a good value at the rising edge of the clock (no Set-up/hold time violations)
[Figure: D, CLK and Q waveforms]
MTBF = [expression in fclk, fin and td; layout lost in extraction]
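For reference, a commonly quoted textbook form of the synchronizer MTBF is sketched below. It is not necessarily the expression shown on the slide: $t_r$ (time allowed for a metastable output to resolve), $\tau$ (the flip-flop's resolution time constant) and $T_0$ (the metastability window) are symbols assumed here for illustration, while $f_{clk}$ and $f_{in}$ follow the slide's terms.

```latex
\mathrm{MTBF} = \frac{e^{t_r/\tau}}{T_0 \, f_{clk} \, f_{in}}
\qquad\Longrightarrow\qquad
\frac{\mathrm{MTBF}(t_r + T_{clk})}{\mathrm{MTBF}(t_r)} = e^{T_{clk}/\tau}
```

Under this model, every extra clock period $T_{clk}$ of resolution time multiplies the MTBF by $e^{T_{clk}/\tau}$, while a faster clock or a faster-changing input lowers it.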
The clocks will continually skew, guaranteeing setup/hold violations
Signals from one design to another are Clock Domain Crossings (CDCs)
[Figure: Clock Domain Crossing signal between a Sensor System and a Guidance System; labels: Clock A, Tx, Clock B, Setup/hold window]
Signals that cross asynchronous clock domains (CDC signals) WILL violate setup and hold conditions
Signals crossing a clock domain will violate set-up/hold
Impact: Control/data signals will be dropped/corrupted
  Loss of Data
Approaches:
Designers add synchronizers to reduce the probability of metastable signals
Synchronizers are sub-circuits that can prevent metastable values from being sampled across clock domains
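A minimal sketch of one common synchronizer, the two-flop synchronizer, in the same VHDL style as the flip-flop example. The entity name `sync_2ff` and its port names are hypothetical, and the sketch assumes a single-bit input that is driven directly by a register in the sending domain.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Two-flop synchronizer: brings a single-bit signal into the clk_b domain.
-- Entity and port names are hypothetical, chosen for this sketch.
entity sync_2ff is
  port (
    clk_b   : in  std_logic;  -- receiving-domain clock
    d_async : in  std_logic;  -- registered output of the sending domain
    q_sync  : out std_logic   -- version of d_async usable in the clk_b domain
  );
end entity sync_2ff;

architecture rtl of sync_2ff is
  signal sync1 : std_logic := '0';  -- first stage: may go metastable
  signal sync2 : std_logic := '0';  -- second stage: samples the settled value
begin
  process (clk_b)
  begin
    if rising_edge(clk_b) then
      sync1 <= d_async;  -- may violate setup/hold, by design
      sync2 <= sync1;    -- sync1 has had one clk_b period to resolve
    end if;
  end process;

  q_sync <= sync2;
end architecture rtl;
```

The second stage gives the first a full clock period to settle. A change on the input can still appear at the output one receiving-clock cycle earlier or later than expected, which is why downstream logic must tolerate variable latency.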
[Figure: timing diagram of data values i-1, i, i+1, i+2, i+3 passing from Tx to Rx under Clock B]
Designs need logic to correctly handle reconvergence
Can occur on single-bit or multiple-bit signals
[Figure: reconvergence example; labels: states S1, S2, S3, S4, Gray Encoder, Gray Decoder, FSM Input; sample bit sequences]
Data is lost
Simulation may not show a failure
Silicon will eventually show a functional error
Reconvergence problem
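The Gray encoder and decoder named in this example can be sketched as a pair of VHDL functions. The package name `gray_pkg` and the function names are hypothetical. Gray coding helps only when the value moves one step at a time (a counter or FIFO pointer, for example), so that at most one bit is changing when the receiving clock samples the bus.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package: binary <-> Gray conversion for multi-bit CDC.
package gray_pkg is
  function bin_to_gray (b : std_logic_vector) return std_logic_vector;
  function gray_to_bin (g : std_logic_vector) return std_logic_vector;
end package gray_pkg;

package body gray_pkg is

  -- Encoder: each Gray bit is the XOR of two adjacent binary bits.
  function bin_to_gray (b : std_logic_vector) return std_logic_vector is
    variable bv : std_logic_vector(b'length - 1 downto 0) := b;
    variable g  : std_logic_vector(b'length - 1 downto 0);
  begin
    g(g'high) := bv(bv'high);           -- MSB passes through unchanged
    for i in g'high - 1 downto 0 loop
      g(i) := bv(i + 1) xor bv(i);
    end loop;
    return g;
  end function bin_to_gray;

  -- Decoder: rebuild binary from the MSB down, one XOR per bit.
  function gray_to_bin (g : std_logic_vector) return std_logic_vector is
    variable gv : std_logic_vector(g'length - 1 downto 0) := g;
    variable b  : std_logic_vector(g'length - 1 downto 0);
  begin
    b(b'high) := gv(gv'high);
    for i in b'high - 1 downto 0 loop
      b(i) := b(i + 1) xor gv(i);
    end loop;
    return b;
  end function gray_to_bin;

end package body gray_pkg;
```

In a typical arrangement the sending domain registers the Gray-coded value, each bit passes through its own synchronizer, and the receiving domain decodes the result. If the synchronizers disagree by a cycle, the decoded value is either the old or the new count, never an unrelated one.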
Signals crossing a clock domain will violate set-up/hold
Impact: Control/data signals will be dropped/corrupted
Approaches:
  Avoid having systems that have multiple clocks
  Designer can add synchronizers to the design
  Designer-added synchronizers + full CDC verification
    Assures synchronizers are present and used correctly
Recommendations
During design planning
1. Create systems/designs using 1 clk, 1 edge when possible
2. If multiple clocks are required, try to use 1 designer for both clock domains, and use coding guidelines
3. When multi-clock design is required, plan for proper verification
How do we accomplish this?
1. Use signal naming conventions
   Example: append _A_reg to signals leaving an A-clk register, _A for A-clk combo signals
   Leverage during code reviews - help identify missing synchronizers
   Make sure ONLY _A_reg signals go to synchronizers (no combo logic)
2. Many clock domain errors come from design changes, not the initial design
3. Limit clock domain crossings to specific areas or blocks in the design, when possible.
NOTE: These techniques can help assure synchronizers are present, but are unlikely to help identify reconvergence or CDC protocol issues.
[Figure: example signal names: My_sig_Tx, My_data_Tx_Reg, My_data_Rx_sync1, My_data_Rx_sync2, Comb_sig_Rx]
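A short sketch of how the naming convention might look in VHDL. All signal names and the entity name are hypothetical. The `_A` and `_A_reg` suffixes follow the convention described here; the `_B_sync1` and `_B_sync2` suffixes for the receiving side are an assumption modeled on the `_Rx_sync1` and `_Rx_sync2` example names.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example of suffix-based CDC naming.
entity cdc_naming_example is
  port (
    clk_a    : in  std_logic;
    clk_b    : in  std_logic;
    req_A    : in  std_logic;  -- A-clk domain inputs
    enable_A : in  std_logic;
    start_B  : out std_logic   -- B-clk domain output
  );
end entity cdc_naming_example;

architecture rtl of cdc_naming_example is
  signal start_A       : std_logic;         -- _A: A-clk combinational signal
  signal start_A_reg   : std_logic := '0';  -- _A_reg: A-clk register output
  signal start_B_sync1 : std_logic := '0';  -- first synchronizer stage (B-clk)
  signal start_B_sync2 : std_logic := '0';  -- second synchronizer stage (B-clk)
begin
  -- Combinational logic stays inside the A domain.
  start_A <= req_A and enable_A;

  -- Register the signal before it leaves the A domain.
  process (clk_a)
  begin
    if rising_edge(clk_a) then
      start_A_reg <= start_A;
    end if;
  end process;

  -- Only an _A_reg signal feeds the synchronizer, never start_A directly.
  process (clk_b)
  begin
    if rising_edge(clk_b) then
      start_B_sync1 <= start_A_reg;
      start_B_sync2 <= start_B_sync1;
    end if;
  end process;

  start_B <= start_B_sync2;
end architecture rtl;
```

With names like these, a reviewer can spot a suspect crossing by reading a single assignment: any `_A` signal without `_reg` on the right-hand side of a clk_b process is a candidate missing or misused synchronizer.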
Missing synchronizers will create metastability
Correctly placed but misused synchronizers won't work
Reconvergence of synchronized signals can create unexpected behavior
Approaches:
  Simulation
    Digital logic simulators do NOT model transistor behavior
    Do not model metastability
For example
[Figure: D, CLK and Q waveforms for a Setup Violation and a Hold Violation, comparing Q in simulation with Q in silicon]
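One way to make this gap visible in an ordinary RTL simulation is a behavioral flip-flop that drives 'X' when D changes too close to the clock edge. The sketch below is simulation-only and hypothetical: the entity name and the `T_SETUP` and `T_HOLD` values are assumptions, not taken from any datasheet or tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Simulation-only D flip-flop that outputs 'X' on a setup or hold violation.
-- Names and timing values are assumptions made for this sketch.
entity dff_x_on_violation is
  generic (
    T_SETUP : time := 1 ns;  -- D must be stable this long before the edge
    T_HOLD  : time := 1 ns   -- D must be stable this long after the edge
  );
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic
  );
end entity dff_x_on_violation;

architecture sim of dff_x_on_violation is
begin
  process (clk, d)
  begin
    if rising_edge(clk) then
      if d'last_event < T_SETUP then
        q <= 'X';  -- D changed inside the setup window
      else
        q <= d;    -- normal capture
      end if;
    elsif d'event and clk = '1' and clk'last_event < T_HOLD then
      q <= 'X';    -- D changed inside the hold window
    end if;
  end process;
end architecture sim;
```

An 'X' is a pessimistic stand-in: real silicon settles to 0 or 1 after an unpredictable delay, so this model flags that a violation occurred but does not show how the downstream logic would actually behave.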
Missing synchronizers will create metastability
Correctly placed but misused synchronizers won't work
Reconvergence of synchronized signals: control logic bugs
Approaches:
  Simulation
    Won't model CDCs correctly to detect errors
For Example
Trivial Reconvergence Error
Reconverging synchronized CDC signals - timing is unpredictable.
Need to verify the downstream logic can handle variations
Manually identifying the reconvergence is very hard
Manually identifying all possible behaviors is harder
Manually assuring logic will behave correctly is typically intractable
Missing synchronizers will create metastability
Correctly placed but misused synchronizers won't work
Reconvergence of synchronized signals: control logic bugs
Approaches:
  Simulation - Won't model CDCs correctly to detect errors
  Timing Analysis - Identifies signals for review, but otherwise useless
  Manual Design Reviews - error prone, incomplete
  Lab Verification?
    Problem is intermittent, debug is impossible
Missing synchronizers will create metastability
Correctly placed but misused synchronizers won't work
Reconvergence of synchronized signals: control logic bugs
Approaches:
  So - we need a new method that reliably:
    Identifies ALL CDC signals, structures, reconvergence
    Assures ALL connected, functioning correctly
    Creates reports for manual reviews
  The EDA industry has responded
    6 commercial tools now available and counting
    But most won't identify all 3 of our CDC issues
Honeywell, Inc.
L-3 Communications
Lockheed Martin Co
Ministry of Aerospace & Aeronautics
Northrop Grumman Corp
Raytheon
Rockwell Collins Inc.
SAAB Group
Thales
Widely used in commercial space
Commercial
IEEE standard serial communications core
Used in 50-60 other COMMERCIAL ASIC products
Widely deployed (millions in use daily)
Found issues in the lab
  Debugged FPGA for weeks
  Suspected a CDC issue, but not sure
Results same day
  Found 199 serious CDC bugs!
    45 Missing Synchronizers
    83 Incorrect Synchronizers
    76 Reconverging Signals
    11 other problems
  Most resulting from more stressful usage
In production:
  Commercial ASIC: Customer issue - device is erratic, locks up
  Avionics: Could result in an Airworthiness Directive
Summary Recommendations
During design planning
1. Create systems/designs using 1 clk, 1 edge when possible
2. If multiple clocks are required, try to use 1 designer for all clock domains
3. When multi-clock design is required, plan for proper verification
During verification
1. Watch for multiple clocks in designs (Tip: Count PLLs)
2. Ask how CDC issues are mitigated (remember there are 3)
3. Use reports to aid manual reviews
4. Use CDC tools to support ROBUSTNESS
In Conclusion
Every multi-clock design is subject to metastability
Traditional verification methodologies CANNOT assure robustness
To properly mitigate the dangers of CDC, we strongly recommend a solution that:
  Supports Manual Reviews
  Automatically reports all sources of CDC problems
  Has a proven CDC verification methodology & customer success