Chisel Tutorial 06 - Interim Summary: Implementing an FIR Filter (a 4-tap FIR filter and a parameterized FIR filter in Chisel)
2022-07-03 18:35:00 【github-3rr0r】
Interim summary: implementing an FIR filter
Motivation
Now that we have covered the basics of Chisel, this section builds an FIR (Finite Impulse Response) filter module.
FIR filters are very common in digital signal processing and will come up repeatedly in later sections, so they are worth mastering.
Here is the definition of an FIR filter from Baidu Encyclopedia:
FIR (Finite Impulse Response) filter: a filter with a finite-length unit impulse response, also called a non-recursive filter. It is one of the most basic components of a digital signal processing system. It can provide a strictly linear phase-frequency characteristic while realizing an arbitrary amplitude-frequency characteristic, and because its unit sample response is finite in length, the filter is a stable system. FIR filters are therefore widely used in communications, image processing, pattern recognition, and other fields.
FIR filter
The FIR filter designed and implemented in this section needs to perform the following operation:

In effect, the module performs element-wise multiplication: the two operands are the filter coefficients and the input-signal samples, and the products are then summed. This operation is also called convolution.
The signal-based mathematical definition is:
$$y_n = b_0 x_n + b_1 x_{n-1} + b_2 x_{n-2} + \dots$$
where:
- $y_n$ is the output signal at time $n$;
- $x_n$ is the input signal at time $n$;
- $b_i$ are the filter coefficients, i.e. the impulse response;
- $n-1, n-2, \dots$ denote time $n$ delayed by 1, 2, ... cycles.
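To make the equation concrete, here is a minimal software sketch in plain Scala (no Chisel). The names FirModel and fir are illustrative and do not come from the tutorial. The sketch assumes that all samples before time 0 are zero, and it does not model any bit-width truncation.

```scala
// Software reference model of the FIR equation (plain Scala, no Chisel needed).
// FirModel and fir are hypothetical names used only for this illustration.
object FirModel {
  // coeffs(i) plays the role of b_i; samples(n) plays the role of x_n.
  // Assumption: samples before time 0 are taken as 0 (all-zero initial state).
  def fir(coeffs: Seq[Int], samples: Seq[Int]): Seq[Int] =
    samples.indices.map { n =>
      coeffs.zipWithIndex.map { case (b, i) =>
        val x = if (n - i >= 0) samples(n - i) else 0 // x_{n-i}, or 0 before the start
        b * x
      }.sum
    }
}
```

For example, FirModel.fir(Seq(1, 2, 3, 4), Seq(1, 4, 3, 2, 7, 0)) evaluates to the sequence 1, 6, 14, 24, 36, 32: at time 4 the sum is 7*1 + 2*2 + 3*3 + 4*4 = 36.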
Implementing an 8-bit, four-element FIR filter
Now let's build a four-element FIR filter module whose coefficients are parameters of the module.
Both the input and the output are 8-bit unsigned integers.
Hints:
- You need a shift-register-like structure to store the necessary state (for example, the delayed signal values);
- Each delay register can be implemented with ShiftRegister using a shift amount of 1, or with the RegNext construct;
- All registers are initialized to 0.
First, determine the input and output:
Input: one 8-bit unsigned integer input signal;
Output: one 8-bit unsigned integer output signal.
Then determine which state needs to be stored:
- The filter coefficients are hard-wired from the module parameters, so they need no storage;
- The delayed input signals need to be stored. The syntax is
RegNext(next_val, init_value);
The implementation is therefore as follows:
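// MyModule.scala
import chisel3._
import chisel3.util._

// Four-element FIR filter; the coefficients b0..b3 are fixed at elaboration time
class MyModule(b0: Int, b1: Int, b2: Int, b3: Int) extends Module {
  val io = IO(new Bundle {
    val in = Input(UInt(8.W))
    val out = Output(UInt(8.W))
  })
  // Delay line (registers reset to 0), followed by the weighted sum of the current and delayed samples
  val x_n1 = RegNext(io.in, 0.U)
  val x_n2 = RegNext(x_n1, 0.U)
  val x_n3 = RegNext(x_n2, 0.U)
  io.out := io.in * b0.U + x_n1 * b1.U + x_n2 * b2.U + x_n3 * b3.U
}

// Print the Verilog generated for the module
object MyModule extends App {
  println(getVerilogString(new MyModule(0, 0, 0, 0)))
}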
// MyModuleTest.scala
import chisel3._
import chiseltest._
import org.scalatest.flatspec.AnyFlatSpec
class MyModuleTest extends AnyFlatSpec with ChiselScalatestTester {
behavior of "MyModule"
it should "get right results" in {
// Simple sanity check: a filter with all-zero coefficients should always produce zero
test(new MyModule(0, 0, 0, 0)) {
c =>
c.io.in.poke(0.U)
c.io.out.expect(0.U)
c.clock.step(1)
c.io.in.poke(4.U)
c.io.out.expect(0.U)
c.clock.step(1)
c.io.in.poke(5.U)
c.io.out.expect(0.U)
c.clock.step(1)
c.io.in.poke(2.U)
c.io.out.expect(0.U)
}
// Simple 4-point moving average
test(new MyModule(1, 1, 1, 1)) {
c =>
c.io.in.poke(1.U)
c.io.out.expect(1.U) // 1, 0, 0, 0
c.clock.step(1)
c.io.in.poke(4.U)
c.io.out.expect(5.U) // 4, 1, 0, 0
c.clock.step(1)
c.io.in.poke(3.U)
c.io.out.expect(8.U) // 3, 4, 1, 0
c.clock.step(1)
c.io.in.poke(2.U)
c.io.out.expect(10.U) // 2, 3, 4, 1
c.clock.step(1)
c.io.in.poke(7.U)
c.io.out.expect(16.U) // 7, 2, 3, 4
c.clock.step(1)
c.io.in.poke(0.U)
c.io.out.expect(12.U) // 0, 7, 2, 3
}
// Nonsymmetric filter
test(new MyModule(1, 2, 3, 4)) {
c =>
c.io.in.poke(1.U)
c.io.out.expect(1.U) // 1*1, 0*2, 0*3, 0*4
c.clock.step(1)
c.io.in.poke(4.U)
c.io.out.expect(6.U) // 4*1, 1*2, 0*3, 0*4
c.clock.step(1)
c.io.in.poke(3.U)
c.io.out.expect(14.U) // 3*1, 4*2, 1*3, 0*4
c.clock.step(1)
c.io.in.poke(2.U)
c.io.out.expect(24.U) // 2*1, 3*2, 4*3, 1*4
c.clock.step(1)
c.io.in.poke(7.U)
c.io.out.expect(36.U) // 7*1, 2*2, 3*3, 4*4
c.clock.step(1)
c.io.in.poke(0.U)
c.io.out.expect(32.U) // 0*1, 7*2, 2*3, 3*4
}
println("SUCCESS!!")
}
}
The generated Verilog code is as follows:
module MyModule(
input clock,
input reset,
input [7:0] io_in,
output [7:0] io_out
);
`ifdef RANDOMIZE_REG_INIT
reg [31:0] _RAND_0;
reg [31:0] _RAND_1;
reg [31:0] _RAND_2;
`endif // RANDOMIZE_REG_INIT
reg [7:0] x_n1; // @[MyModule.scala 12:21]
reg [7:0] x_n2; // @[MyModule.scala 13:21]
reg [7:0] x_n3; // @[MyModule.scala 14:21]
wire [8:0] _io_out_T = io_in * 1'h0; // @[MyModule.scala 15:19]
wire [8:0] _io_out_T_1 = x_n1 * 1'h0; // @[MyModule.scala 15:33]
wire [8:0] _io_out_T_3 = _io_out_T + _io_out_T_1; // @[MyModule.scala 15:26]
wire [8:0] _io_out_T_4 = x_n2 * 1'h0; // @[MyModule.scala 15:47]
wire [8:0] _io_out_T_6 = _io_out_T_3 + _io_out_T_4; // @[MyModule.scala 15:40]
wire [8:0] _io_out_T_7 = x_n3 * 1'h0; // @[MyModule.scala 15:61]
wire [8:0] _io_out_T_9 = _io_out_T_6 + _io_out_T_7; // @[MyModule.scala 15:54]
assign io_out = _io_out_T_9[7:0]; // @[MyModule.scala 15:10]
always @(posedge clock) begin
if (reset) begin // @[MyModule.scala 12:21]
x_n1 <= 8'h0; // @[MyModule.scala 12:21]
end else begin
x_n1 <= io_in; // @[MyModule.scala 12:21]
end
if (reset) begin // @[MyModule.scala 13:21]
x_n2 <= 8'h0; // @[MyModule.scala 13:21]
end else begin
x_n2 <= x_n1; // @[MyModule.scala 13:21]
end
if (reset) begin // @[MyModule.scala 14:21]
x_n3 <= 8'h0; // @[MyModule.scala 14:21]
end else begin
x_n3 <= x_n2; // @[MyModule.scala 14:21]
end
end
// Register and memory initialization
... // Omit
endmodule
The test passed .
FIR filter generator
This part relies on material covered later, but let's first get the basic idea of building an FIR filter generator.
The generator has one parameter, length, which sets the number of taps of the filter. The coefficient of each tap is supplied through an input of the hardware module.
The generator therefore has three inputs:
- in, the input of the filter;
- valid, the valid bit of the input;
- consts, a vector holding the coefficient of each tap;
and one output:
- out, the output of the filter.
The implementation is therefore as follows:
import chisel3._
import chisel3.util._

class MyModule(length: Int) extends Module {
  val io = IO(new Bundle {
    val in = Input(UInt(8.W))
    val valid = Input(Bool())
    val consts = Input(Vec(length, UInt(8.W)))  // Usage mentioned later
    val out = Output(UInt(8.W))
  })

  // Delay line: io.in followed by length - 1 registers reset to 0 (Seq usage mentioned later)
  val taps = Seq(io.in) ++ Seq.fill(io.consts.length - 1)(RegInit(0.U(8.W)))
  taps.zip(taps.tail).foreach { case (a, b) => when (io.valid) { b := a } }

  io.out := taps.zip(io.consts).map { case (a, b) => a * b }.reduce(_ + _)
}

object MyModule extends App {
  println(getVerilogString(new MyModule(4)))
}
The generated Verilog code is as follows:
module MyModule(
input clock,
input reset,
input [7:0] io_in,
input io_valid,
input [7:0] io_consts_0,
input [7:0] io_consts_1,
input [7:0] io_consts_2,
input [7:0] io_consts_3,
output [7:0] io_out
);
`ifdef RANDOMIZE_REG_INIT
reg [31:0] _RAND_0;
reg [31:0] _RAND_1;
reg [31:0] _RAND_2;
`endif // RANDOMIZE_REG_INIT
reg [7:0] taps_1; // @[MyModule.scala 13:66]
reg [7:0] taps_2; // @[MyModule.scala 13:66]
reg [7:0] taps_3; // @[MyModule.scala 13:66]
wire [15:0] _io_out_T = io_in * io_consts_0; // @[MyModule.scala 16:56]
wire [15:0] _io_out_T_1 = taps_1 * io_consts_1; // @[MyModule.scala 16:56]
wire [15:0] _io_out_T_2 = taps_2 * io_consts_2; // @[MyModule.scala 16:56]
wire [15:0] _io_out_T_3 = taps_3 * io_consts_3; // @[MyModule.scala 16:56]
wire [15:0] _io_out_T_5 = _io_out_T + _io_out_T_1; // @[MyModule.scala 16:71]
wire [15:0] _io_out_T_7 = _io_out_T_5 + _io_out_T_2; // @[MyModule.scala 16:71]
wire [15:0] _io_out_T_9 = _io_out_T_7 + _io_out_T_3; // @[MyModule.scala 16:71]
assign io_out = _io_out_T_9[7:0]; // @[MyModule.scala 16:10]
always @(posedge clock) begin
if (reset) begin // @[MyModule.scala 13:66]
taps_1 <= 8'h0; // @[MyModule.scala 13:66]
end else if (io_valid) begin // @[MyModule.scala 14:64]
taps_1 <= io_in; // @[MyModule.scala 14:68]
end
if (reset) begin // @[MyModule.scala 13:66]
taps_2 <= 8'h0; // @[MyModule.scala 13:66]
end else if (io_valid) begin // @[MyModule.scala 14:64]
taps_2 <= taps_1; // @[MyModule.scala 14:68]
end
if (reset) begin // @[MyModule.scala 13:66]
taps_3 <= 8'h0; // @[MyModule.scala 13:66]
end else if (io_valid) begin // @[MyModule.scala 14:64]
taps_3 <= taps_2; // @[MyModule.scala 14:68]
end
end
// Register and memory initialization
... // Omit
endmodule
As you can see, the Vec in the IO bundle and the Seq used afterwards are both unrolled in the generated Verilog code. Their exact usage is covered later.
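As a preview of those Seq idioms, the following plain-Scala sketch (no Chisel) mirrors what the generator does with zip, tail, map and reduce, using Int values in place of hardware signals. TapsSketch, adjacentPairs, dot and step are hypothetical names introduced only for this illustration, and the sketch assumes non-empty sequences of equal length.

```scala
// Plain-Scala illustration of the Seq idioms used by the generator.
// All names here are hypothetical; Int values stand in for hardware signals.
object TapsSketch {
  // Pairs each element with its successor, like taps.zip(taps.tail) in the module.
  def adjacentPairs[A](xs: Seq[A]): Seq[(A, A)] = xs.zip(xs.tail)

  // Multiplies matching elements and adds the products, like the expression assigned to io.out.
  def dot(xs: Seq[Int], ys: Seq[Int]): Int =
    xs.zip(ys).map { case (a, b) => a * b }.reduce(_ + _)

  // One simulated clock cycle. regs holds the delay registers, newest first.
  // Returns (output during this cycle, register contents after the clock edge).
  def step(regs: Seq[Int], in: Int, valid: Boolean, consts: Seq[Int]): (Int, Seq[Int]) = {
    val taps = in +: regs                       // like Seq(io.in) ++ registers
    val out = dot(taps, consts) & 0xFF          // keep only the low 8 bits, as io.out does
    // For each pair (a, b) the hardware does b := a, so every register takes its predecessor.
    val next = if (valid) adjacentPairs(taps).map(_._1) else regs
    (out, next)
  }
}
```

Starting from regs = Seq(0, 0, 0) with all coefficients equal to 1, calling step with in = 1 and valid = true returns an output of 1 and the new registers Seq(1, 0, 0). A call with valid = false leaves the registers unchanged, which corresponds to the "else if (io_valid)" branches in the Verilog above.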
Using and testing a DSP block (DspBlock)
Integrating DSP components into a larger system can be challenging and error-prone. The repository dsptools/rocket at master · ucb-bar/dsptools (github.com) contains some useful generators that help with this kind of task.
One of its core abstractions is the DspBlock. A DspBlock has:
- AXI4-Stream input and output;
- memory-mapped status and control (AXI4 in this example).
Note: AXI (Advanced eXtensible Interface) is a bus protocol.

DspBlock uses Rocket's diplomatic interfaces. Diplomacy and TileLink from the Rocket Chip · lowRISC: Collaborative open silicon engineering gives an overview of the basics of diplomacy, but there is no need to worry about how it works for this example.
Diplomacy really shines when many different DspBlocks are connected together to form a complex SoC.
This example builds only a single peripheral. The StandaloneBlock trait is mixed in so that the diplomatic interfaces become top-level IOs. The StandaloneBlock trait is only needed when a DspBlock is used as a top-level module, without any diplomatic connections.
To use DspBlock, add the following dependency to build.sbt:
libraryDependencies += "edu.berkeley.cs" %% "rocket-dsptools" % "1.4.3"
The following example wraps the FIR filter in an AXI4 interface:
import chisel3._
import chisel3.util._
import dspblocks._
import freechips.rocketchip.amba.axi4._
import freechips.rocketchip.amba.axi4stream._
import freechips.rocketchip.config._
import freechips.rocketchip.diplomacy._
import freechips.rocketchip.regmapper._
class MyModule(length: Int) extends Module {
val io = IO(new Bundle {
val in = Input(UInt(8.W))
val valid = Input(Bool())
val consts = Input(Vec(length, UInt(8.W))) // Usage mentioned later
val out = Output(UInt(8.W))
})
// Usage mentioned later
val taps = Seq(io.in) ++ Seq.fill(io.consts.length - 1)(RegInit(0.U(8.W)))
taps.zip(taps.tail).foreach {
case (a, b) => when (io.valid) {
b := a } }
io.out := taps.zip(io.consts).map {
case (a, b) => a * b }.reduce(_ + _)
}
object MyModule extends App {
println(getVerilogString(new MyModule(4)))
}
//
// Base class for all FIRBlocks.
// This can be extended to make TileLink, AXI4, APB, AHB, etc. flavors of the FIR filter
//
abstract class FIRBlock[D, U, EO, EI, B <: Data](val nFilters: Int, val nTaps: Int)(implicit p: Parameters)
// HasCSR means that the memory interface will be using the RegMapper API to define status and control registers
extends DspBlock[D, U, EO, EI, B] with HasCSR {
// diplomatic node for the streaming interface
// identity node means the output and input are parameterized to be the same
val streamNode = AXI4StreamIdentityNode()
// define what hardware will be elaborated
lazy val module = new LazyModuleImp(this) {
// get streaming input and output wires from diplomatic node
val (in, _) = streamNode.in(0)
val (out, _) = streamNode.out(0)
require(in.params.n >= nFilters,
s"AXI-4 Stream port must be big enough for all the filters (need $nFilters, only have ${in.params.n})")
// make registers to store taps
val taps = Reg(Vec(nFilters, Vec(nTaps, UInt(8.W))))
// memory map the taps, plus the first address is a read-only field that says how many filter lanes there are
val mmap = Seq(
RegField.r(64, nFilters.U, RegFieldDesc("nFilters", "Number of filter lanes"))
) ++ taps.flatMap(_.map(t => RegField(8, t, RegFieldDesc("tap", "Tap"))))
// generate the hardware for the memory interface
// in this class, regmap is abstract (unimplemented). mixing in something like AXI4HasCSR or TLHasCSR
// will define regmap for the particular memory interface
regmap(mmap.zipWithIndex.map({
case (r, i) => i * 8 -> Seq(r)}): _*)
// make the FIR lanes and connect inputs and taps
val outs = for (i <- 0 until nFilters) yield {
val fir = Module(new MyModule(nTaps))
fir.io.in := in.bits.data((i+1)*8 - 1, i*8) // 8-bit slice for lane i; bit ranges are inclusive
fir.io.valid := in.valid && out.ready
fir.io.consts := taps(i)
fir.io.out
}
val output = if (outs.length == 1) {
outs.head
} else {
outs.reduce((x: UInt, y: UInt) => Cat(y, x))
}
out.bits.data := output
in.ready := out.ready
out.valid := in.valid
}
}
// make AXI4 flavor of FIRBlock
abstract class AXI4FIRBlock(nFilters: Int, nTaps: Int)(implicit p: Parameters) extends FIRBlock[AXI4MasterPortParameters, AXI4SlavePortParameters, AXI4EdgeParameters, AXI4EdgeParameters, AXI4Bundle](nFilters, nTaps) with AXI4DspBlock with AXI4HasCSR {
override val mem = Some(AXI4RegisterNode(
AddressSet(0x0, 0xffffL), beatBytes = 8
))
}
// running the code below will show what firrtl is generated
// note that LazyModules aren't really chisel modules- you need to call ".module" on them when invoking the chisel driver
// also note that AXI4StandaloneBlock is mixed in- if you forget it, you will get weird diplomacy errors because the memory
// interface expects a master and the streaming interface expects to be connected. AXI4StandaloneBlock will add top level IOs
// println(chisel3.Driver.emit(() => LazyModule(new AXI4FIRBlock(1, 8)(Parameters.empty) with AXI4StandaloneBlock).module))
Testing a DspBlock is slightly different, because the test now has to deal with the memory interface and with LazyModule. dsptools provides some features that help with testing DspBlocks.
One important feature is MemMasterModel. This trait defines generic functions such as memReadWord and memWriteWord for memory operations. This lets you write one generic test and then specify which memory interface it uses; for example, you can write a test once and specialize it for the TileLink and AXI4 interfaces.
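Before writing such a test it helps to know which addresses the memory operations will touch. The plain-Scala sketch below works out the layout implied by the regmap call in FIRBlock above, where field number i is placed at byte address i * 8, field 0 holds nFilters, and the taps follow lane by lane. FirRegMap, bytesPerSlot and tapAddr are hypothetical names for this illustration, not part of dsptools or rocket-chip.

```scala
// Address layout implied by regmap(mmap.zipWithIndex.map { case (r, i) => i * 8 -> Seq(r) }).
// FirRegMap and its members are hypothetical helpers, not a dsptools or rocket-chip API.
object FirRegMap {
  // Assumption taken from the FIRBlock listing: every field gets its own 8-byte slot.
  val bytesPerSlot = 8

  // Slot 0 is the read-only field holding the number of filter lanes.
  val nFiltersAddr = 0

  // Byte address of tap number `tap` in filter lane `lane`.
  // The taps are flattened lane by lane and start at slot 1.
  def tapAddr(lane: Int, tap: Int, nTaps: Int): Int =
    bytesPerSlot * (1 + lane * nTaps + tap)
}
```

With 8 taps per lane, FirRegMap.tapAddr(0, 0, 8) is 8 and FirRegMap.tapAddr(1, 0, 8) is 72, so writing every tap means visiting the addresses 8 + i * 8 for i from 0 until nFilters * nTaps.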
The following example tests FIRBlock in this way:
import dsptools.tester.MemMasterModel
import freechips.rocketchip.amba.axi4
import chisel3.iotesters._
abstract class FIRBlockTester[D, U, EO, EI, B <: Data](c: FIRBlock[D, U, EO, EI, B]) extends PeekPokeTester(c.module) with MemMasterModel {
// check that address 0 is the number of filters
require(memReadWord(0) == c.nFilters)
// write 1 to all the taps
for (i <- 0 until c.nFilters * c.nTaps) {
memWriteWord(8 + i * 8, 1)
}
}
// specialize the generic tester for axi4
class AXI4FIRBlockTester(c: AXI4FIRBlock with AXI4StandaloneBlock) extends FIRBlockTester(c) with AXI4MasterModel {
def memAXI = c.ioMem.get
}
// invoking testers on lazymodules is a little strange.
// note that the firblocktester takes a lazymodule, not a module (it calls .module in "extends PeekPokeTester()").
val lm = LazyModule(new AXI4FIRBlock(1, 8)(Parameters.empty) with AXI4StandaloneBlock)
chisel3.iotesters.Driver(() => lm.module) {
_ => new AXI4FIRBlockTester(lm) }
Unfortunately, I could not get this test example from the tutorial to run, and I did not find a fix, so I had to set it aside for now. I will look into it further when this topic comes up again.