[This post was edited by the author on 2007-5-9 21:37:28]
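For details, see ug_nios2_custom_instruction.pdf, which explains everything thoroughly and is easy to find online.
The work splits into a hardware part and a software part. Some of the steps for the hardware part are as follows.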
This section walks you through the process of implementing Nios II
custom instructions in hardware, and also provides custom instruction
tool-flow explanations.
To implement the Nios II custom instruction for the leading-zeros design,
you must:
1. Open the custom instruction tutorial hardware design.
2. Add the leading-zeros custom instruction logic to the Nios II CPU.
3. Generate the SOPC Builder system and compile the design in
Quartus II.
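For readers who want something to compare the hardware against, here is a minimal software reference for a leading-zeros count. It is only a sketch: the function name leading_zeros32, the 32-bit operand width, and the choice to return 32 for an all-zero input are my assumptions, not taken from the user guide.

```c
#include <stdint.h>

/* Software reference model (hypothetical name): counts the zero bits
   above the most significant set bit of a 32-bit word.
   Assumption: an all-zero input returns 32. */
static unsigned leading_zeros32(uint32_t x)
{
    unsigned count = 0;

    if (x == 0)
        return 32;

    /* Shift left until the top bit is set, counting the shifts. */
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        count++;
    }
    return count;
}
```

A model like this is handy for checking the custom instruction's result on a few known inputs once the hardware is built.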
Altera Corporation Core Version a.b.c variable 3–7
December 2004 Nios II Custom Instruction User Guide
Implementing a Nios II Processor Custom Instruction
Open The Custom Instruction Hardware Design
1. Choose Programs > Altera > Quartus II <version> (Windows Start
menu).
2. Choose Open Project... (File menu).
3. Browse to the quartus_project directory for your board.
4. Choose the custom_instruction.qpf and click Open.
5. Choose SOPC Builder… (Tools menu) to start SOPC Builder.
Add The Leading-Zeros Custom Instruction Logic
This section walks you through the process of adding a custom
instruction to an SOPC Builder system, and also provides custom
instruction tool-flow explanations.
1. Select cpu_0 in the Altera SOPC Builder System Contents page.
See Figure 3–4.
Figure 3–4. SOPC Builder System Contents Page
2. Choose Edit… (Module menu). The Nios II Processor Configuration
wizard appears.
3. Click on the Custom Instructions tab.
4. Click Import… The Interface to User Logic wizard appears. See
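From the basic processor's point of view, an instruction only ever has two inputs. To implement your function, you can:
1. Use several custom instructions to break the 6-in/2-out operation into several 2-in/1-out operations.
2. Use on-chip FPGA resources to implement the 6-in/2-out operation, then pass the results into the Nios through memory or a component. The first approach is the more practical one, though, because the second requires you to define the communication protocol yourself.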
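To make "define the communication protocol yourself" in option 2 concrete, here is a rough sketch of what such a protocol could look like from the software side. Everything in it is hypothetical: the register layout accel_regs_t, the start/done handshake, and the placeholder operation (sum and XOR of the six inputs) are my assumptions for illustration. The hardware is simulated by a plain C function so the sketch can run on a host PC.

```c
#include <stdint.h>

/* Hypothetical register map for a 6-in/2-out component.
   On real hardware this struct would overlay the component's base
   address; here it is an ordinary variable so the sketch runs on a host. */
typedef struct {
    volatile uint32_t in[6];   /* operands written by the CPU        */
    volatile uint32_t start;   /* CPU writes 1 to start the operation */
    volatile uint32_t done;    /* reads 1 once the results are valid  */
    volatile uint32_t out[2];  /* results read back by the CPU        */
} accel_regs_t;

/* Stand-in for the FPGA logic. The operation is a placeholder
   (sum and XOR of the inputs), not the poster's real function. */
static void accel_hw_step(accel_regs_t *r)
{
    if (r->start) {
        uint32_t sum = 0, x = 0;
        for (int i = 0; i < 6; i++) {
            sum += r->in[i];
            x ^= r->in[i];
        }
        r->out[0] = sum;
        r->out[1] = x;
        r->start = 0;
        r->done = 1;
    }
}

/* Driver side: write six operands, start, wait for done, read two results. */
static void accel_run(accel_regs_t *r, const uint32_t in[6], uint32_t out[2])
{
    for (int i = 0; i < 6; i++)
        r->in[i] = in[i];
    r->done = 0;
    r->start = 1;
    while (!r->done)
        accel_hw_step(r);  /* on real hardware this loop would only poll */
    out[0] = r->out[0];
    out[1] = r->out[1];
}
```

The cost of this approach is visible in accel_run: every operand and result is a separate bus access, which is why it only pays off when the operation itself takes more than a few cycles.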
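I had thought about the first approach, decomposition, as well. It would be manageable if it were only a matter of inputs and outputs, but inside the CI I also need to perform logic operations on all 6 inputs at the same time. That kind of operation cannot be split across several CIs, so it is basically not feasible.
I have also heard of another method: read the 6 inputs in sequence, that is, call one CI three times and start the logic operation only after all 6 inputs have arrived. This requires the CI to hold the incoming data temporarily until the operation finishes. I have heard that reading and writing the internal register file can do this, but so far I have not managed to get it working. This method is also fairly inefficient: the more inputs there are, the more run time is wasted.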
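The sequential-load idea can be modeled in plain C to see how the calls would line up. This is a host-side sketch only, not real Nios II code: ci_model, its step index n, the storage arrays, and the placeholder operation (sum and XOR) are all hypothetical. The index n stands in for whatever the hardware uses to tell the steps apart.

```c
#include <stdint.h>

/* Host-side model of a hypothetical custom instruction that keeps
   its operands in internal storage between calls. */
static uint32_t ci_regs[6];    /* operands held inside the CI */
static uint32_t ci_result[2];  /* both results, held until read */

/* Each call carries two operands, like a 2-in/1-out instruction.
   n selects the step: 0 and 1 only load, 2 loads and computes,
   3 reads back the second result. */
static uint32_t ci_model(unsigned n, uint32_t a, uint32_t b)
{
    switch (n) {
    case 0:
        ci_regs[0] = a; ci_regs[1] = b;
        return 0;
    case 1:
        ci_regs[2] = a; ci_regs[3] = b;
        return 0;
    case 2: {
        uint32_t sum = 0, x = 0;
        ci_regs[4] = a; ci_regs[5] = b;
        /* Placeholder operation over all six inputs at once. */
        for (int i = 0; i < 6; i++) {
            sum += ci_regs[i];
            x ^= ci_regs[i];
        }
        ci_result[0] = sum;
        ci_result[1] = x;
        return ci_result[0];
    }
    case 3:
        return ci_result[1];
    default:
        return 0;
    }
}

/* Three calls deliver the six inputs; a fourth fetches the second output. */
static void six_in_two_out(const uint32_t in[6], uint32_t out[2])
{
    ci_model(0, in[0], in[1]);
    ci_model(1, in[2], in[3]);
    out[0] = ci_model(2, in[4], in[5]);
    out[1] = ci_model(3, 0, 0);
}
```

Note that with two outputs the sequence needs a fourth call, since each call can return only one value. That makes the efficiency concern above a little worse than "three reads" suggests.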
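I have heard of the second method the moderator mentioned too; that should be hardware acceleration. It is also a form of hardware acceleration, but unlike a CI, which is limited to 2-in/1-out, it can handle operations with multiple inputs and multiple outputs. As a general rule, if the function takes no more than 2-3 clock cycles to execute, use a CI; if it takes more, this method is the recommended one. Altera now also provides Nios II C2H, which can generate the code automatically, so it is quite convenient. I am just about to try this method. (Although that C2H software seems to be very, very expensive ^_^)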