Wednesday, May 25, 2011

Debugging SystemVerilog/VMM Constrained Random Verification (CRV) Test Benches

As a rule of thumb, 70% of ASIC design effort goes into verification, and 70% of verification effort goes into debugging.

Planning for debugging goes a long way. As the test bench is architected feature by feature, pay some attention to how each feature will be debugged. This strategy pays back heavily.

One old principle is: don't forget the basics. Understand the ground rules well.

In verification, the ground rule is: generate the stimulus and check the response. That's it.

In the directed case, the stimulus is evident just from reading the test source code.

The same is not true of CRV test benches, although the ground rule is still the same.

Debugging CRV test benches is a slightly different ball game. One has to figure out the generated stimulus from the test logs; there is no test source code that spells it out, as there is in the directed case.

I am not going to talk about the technicalities of vmm_log. I will just put in a word or two about what customization can make the logs more effective.

0. Logs are grepped 90% of the time rather than read line by line, so design logging messages that are regular-expression friendly.

1. Just because logs are grepped does not give you a license to go wild and print the universe. Follow good logging etiquette. Well-formatted information is worth the time spent setting up the format. Address maps, transaction details, and configurations need to be formatted for easy reading.

2. Implement intent-driven logging macros. Such macros distinguish between messages that give information about the specification, the implementation, the test bench itself, and so on. This helps when debugging across teams. Consider the case where a unit test bench gets ported to the system level. The system team might be interested only in the messages that give spec information, not in the test-bench-specific messages, so it is good to have this control.

3. The VMM logging macros tend to print multi-line messages. Customize them to print single lines. Also, collect related lines that need to go together into a string and print it with a single call to the messaging macro. While the built-in logging of VMM components is useful, it can be a big distraction and can grow your log files beyond the size of wave dumps. So have knobs to turn this internal logging off and enable only your own TB logging, or both together.

4. It should be very straightforward to find the stimulus generated and the response given by the DUT. VMM test benches rely heavily on the concepts of transaction and transactor. Transactions travel through channels. The built-in logging of the channels reports transactions being added and removed. This can be very informative for extracting stimulus and response.

5. Debug messages cannot be added just for the sake of it. Two views need to be balanced. One is having information as complete as possible in the logs when the highest verbosity is enabled. The other is the ease of localizing an issue through question, answer, and elimination.

Provide a few simple, quick pieces of information that help rule out the basic issues, then proceed by elimination. The first question is: is it really an RTL issue or a test bench issue? If it is a test bench issue, the logs should answer whether it originates in the generator, transactor, BFM, scoreboard, driver, checker, etc.

6. For TB issues, once the problem is localized to a component, log enough information to figure out the state of the different threads in that component. If a thread is waiting for some event, a debug message saying what it is waiting for goes a long way.

7. End of test can itself be multi-phased. Log enough information to indicate which phase of end of test is being waited on.

8. Even when an issue is closed as an RTL issue, the messages need to be clear enough to convince the designer. They should make it easy to picture the scenario the way the designer would imagine it.

9. Build a set of frequently used regular expressions and use egrep to extract the complete sequence of events that took place. This bigger picture is vital.

10. Have an easy mechanism to match requests with their corresponding responses. For buses that allow multiple outstanding requests and out-of-order completion, building this identification mechanism goes a long way. Even though the bus may have its own transaction ID mechanism, those IDs get reused, which can make debugging tough. Go ahead and add a TB ID to each transaction that is unique throughout the simulation, and map the completions onto this ID. It greatly eases debugging.

11. Don't plan to debug everything using logs and therefore put everything in the logs. Plan on using the single-stepping and watch capabilities of simulators. Synopsys DVE works great for test bench debugging. This step takes extra time, but trying to meet all debugging needs through logs reduces the efficiency of the logging.

12. Put in enough messages at note verbosity to figure out the timestamp from which the dumps need to start. If they also let you decide whether dumps are needed at all, that would be great.
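
Points 0, 9 and 10 above can be sketched together on the log post-processing side. The Python below is only an illustration under assumptions: the single-line key=value format, the field names (tb_id, bus_id, addr), the intent tag, and the helpers fmt, new_request and pair_up are all made up for this sketch and are not part of VMM. It shows a grep-friendly line format, a reusable regular expression for it, and the pairing of out-of-order responses with their requests through a TB ID that is never reused.

```python
import re
from itertools import count

# Hypothetical testbench-wide ID source: never reused within a simulation.
_tb_ids = count(1)

def fmt(time_ns, intent, comp, event, tb_id, bus_id, addr):
    """Build one single-line, fixed-order, key=value log message."""
    return (f"@{time_ns}ns [{intent}] {comp} {event} "
            f"tb_id={tb_id} bus_id={bus_id} addr=0x{addr:08x}")

def new_request(time_ns, bus_id, addr):
    """Allocate a fresh TB ID and return it with the request log line."""
    tb_id = next(_tb_ids)
    return tb_id, fmt(time_ns, "SPEC", "drv", "REQ", tb_id, bus_id, addr)

# One entry of the "frequently used regular expressions" set.
LINE_RE = re.compile(
    r"@(?P<time>\d+)ns \[(?P<intent>\w+)\] (?P<comp>\w+) (?P<event>REQ|RSP) "
    r"tb_id=(?P<tb_id>\d+) bus_id=(?P<bus_id>\d+) addr=0x(?P<addr>[0-9a-f]{8})"
)

def pair_up(log_lines):
    """Map each tb_id to (request time, response time or None)."""
    pairs = {}
    for line in log_lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a transaction event, skip it
        tb_id, t = int(m["tb_id"]), int(m["time"])
        if m["event"] == "REQ":
            pairs[tb_id] = (t, None)
        else:
            # A KeyError here means a response was logged without a request.
            req_t, _ = pairs[tb_id]
            pairs[tb_id] = (req_t, t)
    return pairs

# Two outstanding requests complete out of order; a third reuses bus ID 3.
id1, req1 = new_request(100, bus_id=3, addr=0x1000)
id2, req2 = new_request(110, bus_id=5, addr=0x2000)
id3, req3 = new_request(200, bus_id=3, addr=0x3000)
log = [
    req1,
    req2,
    "@120ns [TB] sb waiting for event rsp_done",
    fmt(150, "SPEC", "mon", "RSP", id2, 5, 0x2000),
    fmt(180, "SPEC", "mon", "RSP", id1, 3, 0x1000),
    req3,
    fmt(240, "SPEC", "mon", "RSP", id3, 3, 0x3000),
]
for tb_id, (req_t, rsp_t) in pair_up(log).items():
    print(f"tb_id={tb_id} req@{req_t}ns rsp@{rsp_t}ns")
```

Because every field sits at a fixed position on a single line, the same pattern also works with plain egrep, for example searching for "tb_id=2 " to follow one transaction from request to completion. The first and third requests share bus ID 3 but remain distinguishable through tb_id.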


Test Bench Issue Prevention:

0. Go defensive and be paranoid when coding. The next time you are 100% sure that the element you are looking for will be in the queue where you track transaction completion, still add a fatal error statement for the case where it is not found. These checks go a long way toward catching issues at the root. Otherwise they can morph into very tricky failures.

1. Lots of failures with a similar error message but different causes indicate that more granular checks are needed. More granular checks make debugging easier.

2. Pay attention while copying and pasting, to avoid extra compile cycles and painful debug cycles.

3. One aspect that makes coding in SV different from C/C++ is the time dimension. Be aware that hardware interfaces are parallel: while something is being processed at one interface, there can be activity on the other interfaces as well. This mindset not only catches the ordinary dependency issues but also helps avoid race conditions.

4. Multiple threads accessing shared resources is one more issue. While writing the code it can be tough to imagine the concurrency of the threads. Build your own way of visualizing this concurrency and add the needed protection with semaphores.

5. Zero-time execution is another trap. Be aware that vmm_channel put and get are blocking. This means that on every channel put/get the scheduling coin gets tossed again: all the contending threads get a chance to compete and execute. In your head you may be thinking that only the two components connected by the channel are active, but that is not true; other components can also get a chance.
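
Prevention points 0 and 4 can be modeled in a few lines. This is a Python analogy, not SystemVerilog: the class names TbFatal and CompletionTracker are hypothetical, the exception stands in for a fatal error in the test bench, and threading.Lock plays the role that a one-key semaphore would play in SV. The sketch shows a completion tracker that raises a fatal error on the "cannot happen" lookup miss and guards its shared table against concurrent threads.

```python
import threading

class TbFatal(Exception):
    """Stand-in for a fatal test bench error that would end the simulation."""

class CompletionTracker:
    """Tracks outstanding transactions shared by several threads."""

    def __init__(self):
        self._pending = {}             # tb_id -> request description
        self._lock = threading.Lock()  # protects every access to _pending

    def add(self, tb_id, req):
        with self._lock:
            # The check and the insert must happen as one protected step.
            if tb_id in self._pending:
                raise TbFatal(f"duplicate tb_id={tb_id}")
            self._pending[tb_id] = req

    def complete(self, tb_id):
        with self._lock:
            if tb_id not in self._pending:
                # "Sure to be there" is exactly the case worth a fatal check.
                raise TbFatal(f"completion for unknown tb_id={tb_id}")
            return self._pending.pop(tb_id)

    def outstanding(self):
        with self._lock:
            return len(self._pending)

tracker = CompletionTracker()
tracker.add(1, "read 0x1000")
print(tracker.complete(1))
try:
    tracker.complete(1)  # second completion of the same ID
except TbFatal as err:
    print("fatal:", err)
```

Failing loudly at the lookup keeps the error message next to the root cause, instead of letting a missed completion surface much later as an unrelated-looking scoreboard mismatch.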