
Can C beat RTL?

Posted 2019-07-13 14:48

http://www.edn.com/article/457428-Can_C_beat_RTL_.php

With the appearance of higher speeds and more DSP macrocells in low-cost FPGAs, more and more design teams are seeing the configurable chips not as glue but as a way to accelerate the inner loops of numerical algorithms, either in conjunction with or in place of the traditional DSP chip.

There’s a problem, however. You code for a DSP chip in C, and you implement it using a conventional software tool chain with familiar software debugging tools. You configure an FPGA starting in Verilog or VHDL (very-high-speed-integrated-circuit hardware-description language), languages superficially similar to C but in practice profoundly different, and you implement it using a hardware design flow. The two approaches require different skills.

Enter ESL (electronic-system-level) tools. An ESL synthesis tool lets you write your code in C, automatically synthesize RTL (register-transfer-level) logic from the C, and then feed the RTL into your FPGA flow. In reality, such tools meet with skepticism because people suspect them of poor quality of results, unreliability, and other vices. Is that assessment fair, though?

BDTI (Berkeley Design Technology Inc) wanted to find out. The company last month released the first results of its certification program for high-level synthesis tools. The first evaluation covers AutoESL’s AutoPilot and Synfora’s Pico. The bottom line in BDTI’s findings was that both tools produced results in a reasonable amount of time and that both produced designs that performed much better than software running on a DSP chip. The tools’ results were comparable in density and performance to hand-coded RTL.

The fine print reveals a wealth of information below that level, however. Unsurprisingly, both ESL vendors produced designs with about 40 times the throughput of the best that BDTI engineers could achieve on a Texas Instruments DM6437 DSP chip.
Surprisingly, in a separate test with a smaller design, results from the ESL flows required essentially the same die area as a hand-coded RTL kernel.

BDTI’s method represents, as always, a compromise between realism and practicality. BDTI’s initial benchmark is a fully functional optical-flow design, delivered as a kit comprising a three-ring binder and a DVD, which together contain a text description of the algorithm, the algorithm in about 600 lines of ANSI C, and a Xilinx reference design. BDTI turns the kit over to the ESL vendor, which tunes the C code for the tool and produces a design. BDTI engineers then independently repeat the process. The optical-flow core attempts to achieve maximum throughput using all the resources available in the Spartan-3A FPGA.

The amount of work to do the FPGA design, from C to programming file, was similar to the work to program the DSP, according to Jeff Bier, BDTI president. Significant differences between the two tasks emerged, however. Optimizing the C code for one of the ESL tools caused the code to balloon from the original BDTI-supplied 559 lines to 1604 lines of C. Bier says that the work in the optimization was somewhat less than the work of optimizing the code for the DSP chip. “It turned out that the DSP had a serious memory bottleneck that we had to code around,” he explains. The synthesis tool then generated more than 38,000 lines of Verilog from the optimized C.

BDTI engineers, experienced DSP programmers, could handle the entire flow for the TI chip. A huge pile of Verilog and a stack of Xilinx tools stumped them, though. They ended up calling in an RTL-logic expert to shepherd the RTL through the Xilinx tool chain, debug it, and produce the configured FPGA.

Clearly, it is no longer prudent for design teams working with computationally oriented cores to ignore ESL synthesis tools.
The analogy to the days when RTL synthesis was just beginning to displace schematic capture and Karnaugh maps or, for that matter, when programmers began to write embedded software in C instead of assembler is irresistible. Stand by for change.