Physical Design Flow III: Clock Tree Synthesis

I. NetlistIn & Floorplan
II. Placement

In synchronous designs, data transfer between functional elements is synchronized by clock signals. A top-level digital design will have one or more clock sources, such as PLLs or oscillators within the chip. You may also have an external clock source connected through an IO. A digital-only block will have a clock pin that acts as the clock source for that block. Clock balancing is important for meeting the design constraints, and clock tree synthesis (CTS) is done after placement to achieve the performance goals.

After placement, you have the positions of all the cells, including macros and standard cells. However, you still have an ideal clock. (For simplicity, we will assume that we are dealing with a single clock for the whole design.) At this stage, buffer insertion, gate sizing, and other optimization techniques are applied to the data paths, but no change is made to the clock net.

The same clock net connects all the synchronous elements in the design, irrespective of how many there are.

This is what your design’s clock network looks like at this point.

clock net before CTS


This is definitely not something we want. Think about the load on that one clock net: no driver can drive that many flops! But for a synchronising signal like the clock, load or fanout is not the only thing we are worried about. We also want a “balanced” tree, that is, the skew of the clock tree should be as close to zero as possible. After clock tree synthesis, the clock net will be buffered as below.

Clock Net After CTS.
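
To get a feel for why one net cannot drive every flop directly, here is a rough back-of-the-envelope sketch in Python. It assumes a purely fanout-based limit (the value 20 and the function name buffers_per_level are made up for illustration) and ignores wire capacitance, slew and placement, all of which a real CTS tool takes into account.

```python
def buffers_per_level(num_sinks, max_fanout):
    """Estimate how many buffers each tree level needs, from the leaf
    level up towards the clock source, so that no driver (including the
    source) sees more than max_fanout loads.

    This is a fanout-only estimate, not what a CTS tool actually builds.
    """
    if num_sinks < 1:
        raise ValueError("num_sinks must be at least 1")
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")

    counts = []
    loads = num_sinks
    # Keep adding a level of buffers until the remaining loads can be
    # driven directly by the clock source.
    while loads > max_fanout:
        loads = (loads + max_fanout - 1) // max_fanout  # integer ceiling
        counts.append(loads)
    return counts


# Hypothetical design: 10,000 flops, at most 20 loads per driver.
levels = buffers_per_level(10000, 20)
print(len(levels), levels)  # 3 [500, 25, 2]
```

Even with this crude model, 10,000 sinks already need three levels of buffering, which is why the clock net ends up as a tree rather than a single wire.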

The main concerns in CTS are:

  1. Skew – One of the major goals of CTS is to reduce clock skew.
    Let us look at some definitions before we go into clock skew.

    • Clock Source

      Clock sources may be external or internal to your chip/block. But for CTS, what we are concerned about is the point from which clock propagation starts for the digital circuitry. This can be an IO port, the output of a PLL or oscillator, or even the output of a gate further down the line (e.g. a mux output). A clock source for CTS may also be specified using the ‘create_generated_clock’ command. This defines an internally generated clock for which you want to build a separate tree, with its own skew, timing and inter-clock relations.

      You specify the clock source(s), using the command create_clock.

    • Clock Sinks
      Sinks, or clock stop points, are nodes which receive the clock. The default sinks are the clock pins of your synchronous elements, such as flip-flops.

    Now let us define skew as the maximum difference among the delays from the clock source to the clock sinks.


    In the picture above, the delays to the clock sinks are given. The skew in this case is the difference between the maximum delay and the minimum delay.
    Skew = 20 ns - 5 ns = 15 ns

    The goal of clock tree synthesis is to get the skew in the design as close to zero as possible, i.e. every clock sink should get the clock at the same time.

  2. Power – Clock is a major power consumer in your design. Clock power consumption depends on switching activity and wire length. Switching activity is high, since the clock toggles constantly. Clock gating is a common technique for reducing clock power by shutting off the clock to unused sinks. Clock gating per se is not done in layout; it should be incorporated in the design. However, clock tree synthesis tools can recognise the clock gates, and also do a power-aware CTS.

    In the picture above, FF1 gets the ungated clock CLK, and FF2 and any subsequent flop gets a gated clock. This clock is turned on only when the signal EN is present. (See ICG cells)
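
Going back to the skew definition under item 1, the calculation can be sketched in a few lines of Python. The sink names and the two middle delay values below are invented for illustration; only the 20 ns maximum and 5 ns minimum come from the example above.

```python
def clock_skew(sink_delays_ns):
    """Return the skew: largest minus smallest source-to-sink delay."""
    if not sink_delays_ns:
        raise ValueError("need at least one clock sink")
    delays = sink_delays_ns.values()
    return max(delays) - min(delays)


# Hypothetical source-to-sink delays in ns, keyed by sink name.
sink_delays_ns = {"FF1": 5.0, "FF2": 12.0, "FF3": 20.0, "FF4": 9.0}

skew = clock_skew(sink_delays_ns)
print(f"Skew = {skew} ns")  # Skew = 15.0 ns
```

A perfectly balanced tree would have all the delays equal, so the function would return 0.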

Make sure that you specify the clock as propagated at the CTS stage, i.e. instead of an ideal delay for the clock, you are now calculating the actual delay value for the clock. This will in turn give you a more realistic report of the timing of the design. You can propagate the clock using the command set_propagated_clock [all_clocks].
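
To see why propagated clocks can change the timing picture, here is a simplified setup-slack sketch in Python. The function name and all the numbers are assumptions made up for illustration, and the formula leaves out clock uncertainty, derates and other terms that a real timing report includes.

```python
def setup_slack(period, data_delay, setup_time,
                launch_latency=0.0, capture_latency=0.0):
    """Simplified setup slack for a single flop-to-flop path (all in ns).

    With an ideal clock both latencies are zero; with a propagated clock
    they are the actual clock tree delays to the launch and capture flops.
    """
    required = period + capture_latency - setup_time
    arrival = launch_latency + data_delay
    return required - arrival


# Hypothetical path: 2 ns period, 1.6 ns data path, 0.1 ns setup time.
ideal = setup_slack(2.0, 1.6, 0.1)
propagated = setup_slack(2.0, 1.6, 0.1,
                         launch_latency=0.6, capture_latency=0.2)

print(f"ideal slack      = {ideal:.2f} ns")
print(f"propagated slack = {propagated:.2f} ns")
```

In this made-up case the path looks comfortable with an ideal clock, but the 0.4 ns of skew between the launch and capture flops turns it into a violation once the clock is propagated.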

IV. Routing
V. Physical Verification



  1. yan

    August 7, 2013 at 4:58 pm


    Thank you so much. I am new to physical design.

    These are really helpful.

    Please keep doing it.



  2. yogesh

    December 16, 2013 at 10:57 am

    how to reduce the number of clock levels after initial clock tree synthesis?

    • mm


      December 24, 2013 at 8:28 pm

      I am not sure about your question, but some of the things I follow are:

      1. Your CTS tool probably has a clock browser or interactive clock tree browser. Use this to look at any unwanted buffer chains you have. This will help you pinpoint issues with your clock tree.

      An example..
      Two branches of the tree, but one clock sink has a very high insertion delay before CTS (maybe due to gates, muxes, etc.). CTS will then be required to match this delay at all sinks (plus any extra delay due to synchronising). This can then be addressed, preferably by changing the design, OR by making clock tree synthesis ignore this path.

  3. farjana

    February 5, 2014 at 9:31 pm

    thanks sir….


    i am doing my project in title ” NBTI induced clock skew reduction in clock trees” and i am referring IEEE paper ” Skew management of NBTI impacted gated clock trees”.  I need to know how to reduce the clock skew, by using NAND and NOR gates.  In this paper they mentioned NAND and NOR gate usage. Can u please give me a solution.


  4. srimanth

    April 14, 2014 at 11:17 pm

    Thanks for your kind effort.
    under clock sinks: Is it modes or nodes.

    • mm

      Sini Mukundan

      April 15, 2014 at 8:59 am

      Thanks for the catch. fixed.

  5. srimanth

    April 20, 2014 at 10:14 am

    Could please you include a blog specific email address so that if iam having a list of doubts it will not make cumbersome in the comment Section.

    • mm

      Sini Mukundan

      April 23, 2014 at 10:27 am

      You can ask questions in
      I will move your questions there.

      • vicky

        January 4, 2018 at 12:42 pm

        what is diff skew balanced techinic and skew group ?

  6. srimanth

    April 23, 2014 at 7:35 am

    Iam newbie to vlsi industry,Could you please address my doubts.
    1)What are the inputs for a tool to calculate power and how it calculates power.
    2)In STA for nanometer designs text book it is explained that in the name of power LUT we place internal Energy if so how to calculate internal Energy.
    3)How Does the Analog macros are interfaced with the Digital Counter parts in generating power and timing numbers.
    4)How the worst case slew and load are determined in generating the .lib file for standard cells.
    5)Roadmap for physical Design as a career choice in view of increased complexity in technology scaling.

  7. Achyuth

    December 9, 2014 at 7:47 am

    Why do we use clock buffers in CTS? Can’t we use normal buffers? I also came to know that for clock buffers the i/p and o/p transition time is the same. Could you explain why it is the same?

    • mm

      Sini Mukundan

      December 9, 2014 at 8:15 am

      Clock buffers are designed such that the rise time and fall time are equal. This is important since the clock needs to maintain its duty cycle. Schematically it is similar to a normal buffer, but the parameters and layout are such that these conditions are met. Consequently you may have a bigger (in terms of layout) buffer for the same drive strength.

  8. pramod

    December 22, 2014 at 2:09 pm

    In my design more timing violations are in In to reg only. And this is mainly due to clk latency. My vclk latency is more when compared to capture clk. My design clk has highest latency as 2.7 and this is taken as fixed latency for vclk everywhere. is it correct ? What should I do to meet my timing.

  9. ajaykumar

    January 5, 2015 at 8:54 pm

    I have one question: during CTS, does the tool use all the BUF/INV cells specified in the clock tree reference list? If not, why?

    • mm

      Sini Mukundan

      January 5, 2015 at 9:46 pm

      The tool can, but needn’t, use all of them. It will use any combination of the available cells to fix the skew and get as small an insertion delay as possible. So some user control is required, e.g. X0 drive and very big cells can be left out of your list.

  10. Sri

    January 29, 2015 at 12:43 pm

    What is clock divergent? I have heard people talking about it during timing violation during clock skewing.

  11. sneha

    July 10, 2015 at 12:08 pm

    I am new to PD can you guide me how to gain knowledge in PD

  12. Siddique

    August 4, 2015 at 2:17 pm

    why the clock routings are done before the signal routings ?

    • mm

      Sini Mukundan

      August 4, 2015 at 4:56 pm

      You are prioritizing clock routing to make sure it has the optimum utilization of the routing resources. Also clock nets may have non default routing rules, and you want to make sure the skew etc are not affected.

  13. Naveen Reddy

    September 1, 2015 at 10:47 am

    Hi sini,
    my design have some problem i am using icc tool,Floorplan i stated with 65% utilization.In placement i got 73.5 utilization but i run cts only skew balance but utilization i got 70.3 what are the reasons for getting low utilization from placement to cts?

  14. Mujtaba Ahmed

    September 23, 2015 at 2:36 pm

    Hi Sini,
    Could you please suggest me some VLSI backend topics for my Short seminar in a curriculum.


    • mm

      Sini Mukundan

      September 24, 2015 at 9:07 am

      Post graduate?
      1. Routing algorithms – Any improvements you can suggest?
      2. Low power designs

      This list looks interesting.

  15. Sarath chandra

    September 23, 2015 at 4:17 pm

    Hi sini,
    I have a doubt about how can we reduce Max Cap violations in cts stage.

    Thanks in advance

    • mm

      Sini Mukundan

      September 24, 2015 at 8:48 am

      If you are using EDI, there are some post-CTS and pre-CTS DRV fix capabilities in optDesign. You can also pick the nets you want to fix maxCap on, and then fix them specifically by giving a filelist. Please refer to Cadence support for the tool-specific commands. I suppose other tools will also have similar capabilities.

  16. hari

    October 9, 2015 at 6:51 pm


    How i can check if all the clocks in my design have propagated or not ?

    • mm

      Sini Mukundan

      October 9, 2015 at 7:04 pm

      1. report_timing
      If clock network delay has (Ideal ), it is not propagated. Else it will have (propagated delay).
      2. report_clocks (Or some such command for your tool) . It will list the clocks and specify whether propagated.

      • hari

        October 9, 2015 at 8:49 pm


  17. sandhya

    November 18, 2015 at 2:31 pm

    Is there any gui/command to view the clock network after the CTS?

    • sivakumar

      February 12, 2016 at 9:58 am

      displayClockTree in cadence soc encounter

  18. Sarala

    February 6, 2016 at 8:03 am

    If suppose I have two scenarios, say in one case insertion delay is 0.5ns and skew is 0.1ns, in the other case insertion delay is 0.29ns and skew is 0.25ns?? which one should I select? and why?

    • sivakumar

      February 12, 2016 at 10:14 am

      we prefer insertion delay is 0.29ns and skew is 0.25ns because of hold purpose.

  19. sivakumar

    February 12, 2016 at 9:56 am

    i have 5ps hold violation in one path how to fix it?

  20. chaitali

    March 3, 2016 at 4:31 pm

    Thank you so much. Your blogs are relly very lucid and helpful.

  21. Bijesh

    June 2, 2016 at 2:32 pm

    I have read that clock inverters have lesser delay compared to clock buffers of same drive strength. Clock inverters have better driving capacity compared to clock buffers too. Clock inverters provide symmetrical rise and fall times too. Then why do we use clock buffers in CTS? Why does standard cell vendors provide clock buffers? Only inverters will be sufficient and better for CTS, right??

  22. Bijesh

    June 9, 2016 at 1:59 pm

    Suppose in my design there are multiple clocks. In such a case how the tool decides priority for each clock tree??
    If I am having clk1, clk2, clk3, clk4 and clk5, which one will be synthesized and routed first? While optimization, which one will be most optimized?

  23. sandeep

    June 13, 2016 at 11:27 pm

    Hi Sini,
    How to decide upon which decaps should be used for DFM. Since decaps add leakage and capacitance to the design, which decaps meaning (decap4 or decap64) is preferable?

  24. sandeep

    June 14, 2016 at 3:53 pm

    How does different flavours of Decap cells(i.e. decap64,decap4 etc) contribute to leakage power and capacitance in any design.
    which cell is preferred in the design since different flavours have different cap and leakage power?

    • mm

      Sini Mukundan

      June 15, 2016 at 8:40 am

      A decap with a larger area will also have higher leakage, so it’s a trade-off between protection against IR-drop voltage variation and leakage power in your layout. The tools can assess the characteristics and place them accordingly. You can manually change the cell after analysis as well.

  25. usha

    July 25, 2016 at 5:38 pm

    How to use macro models delay numbers at macro pins in cts stage? I tried to set it to latency value but it dint work for me

  26. PRAMOD

    August 14, 2016 at 10:06 am

    Hi Sini,
    Your blogs about physical design are very helpful since i am new to physical design. Please write some blogs about how to fix setup and hold violations under different scenarios, different types of clock trees, industry standard for cts and tips for a beginner in the pd engineering.

    Thanks in advance

  27. Guru Marreddy

    September 3, 2016 at 2:59 pm

    Hi Sini,
    Why exactly a PD engineer goes for a X-Tree implementation over h-tree implementation for CTS. Considering the headache of having the non-rectilinear trunks in x-tree.

  28. Aneesh

    October 19, 2016 at 12:38 pm

    hi ,
    how skew balancing and clock grouping is done?

    • R.Rajalakshmi

      November 23, 2016 at 3:15 pm

      In Clock Tree synthesis, can you explain about the importance of buffer,insertion delay,clock skew,slew rate and minimization of power.

  29. saivijayabhaskar

    November 16, 2016 at 11:52 am

    madam ,
    i have a small doubt..
    If i add METALFILL M1 to M8 there after if i add VIAFILL{-mode connectbetweenfill}, is there any rule that
    1.) Every METAL layer should have a VIAFILL withrespect to next layer?
    2.) Does each METAL layers should get shorted through VIAFILL i.e M1
    to M7?

  30. shravan

    February 5, 2017 at 7:07 pm

    Thanks for giving this valuable informtion

  31. kishore

    May 15, 2017 at 4:53 pm


    if i have changed cts specification file …but i want output must be what iam changing in the specification file …how iam doing please help me

  32. Jay Reddy

    May 23, 2017 at 11:15 am

    Hi Sini,

    Which one is more important Global Skew or Local Skew? Why?

  33. Madhu

    September 19, 2017 at 11:37 pm

    Hi Sini,

    Have been following up on your articles and they are really helpful for freshers.

    I had a query, can you please help me understand how delay numbers are calculated while setting the balance points , as in set_clock_balance_points -delay $delay -consider_for_balancing true, if i am balancing certain non end point explicity with few 100 sinks?

  34. krishh

    September 28, 2017 at 2:42 pm

    How to clone a cell?
    what is command for the same?
    do we need to check LEC after cloning a cell?

  35. karthikeyan P

    October 8, 2017 at 2:13 am

    Thanks, Madam.
    I am self-learner, helpful to me to know about my passion and dream field and help to get the designer place in the company, I hope so.

  36. dolma

    January 6, 2018 at 4:32 am

    HI ,

    Can we manually do cts? like for small partitions

  37. D.vicky

    February 10, 2018 at 7:17 am

    hi,sini mam,
    i am asking one qns how to see the clock building and reports in innvoues tool .but report_clocks_timings -type skew jitter ,latncy but value are zero. so how can see tha reports and any options ?

    • mm

      Sini Mukundan

      February 10, 2018 at 12:59 pm

      set_propagated_clock [all_clocks]

      • D.vicky

        February 10, 2018 at 6:32 pm

        good evening mam,
        that one ok but report_clocks_timing -type summary like skew lantcy jitter reports values showing zero’s

  38. D.vicky

    February 14, 2018 at 5:52 pm

    good evening mam,
    how to solved DRC violations by using candence tool ?

    like space,metal density…….!

VLSI Pro is a professional network of VLSI engineers. Here you can find latest news, helpful articles and more on VLSI technology.

Copyright © 2016 VLSI Pro
