Chisel tutorial - 06 Milestone summary: implementing an FIR filter (a four-tap FIR filter and a parameterized FIR filter in Chisel)
2022-07-03 18:35:00 【github-3rr0r】
Milestone summary: implementing an FIR filter
Motivation
So far we have covered the basics of Chisel. This section builds an FIR (Finite Impulse Response) filter module.
FIR filters are very common in digital signal processing and will appear repeatedly in later sections, so they are worth mastering.
Here is the definition of an FIR filter from Baidu Encyclopedia:
FIR (Finite Impulse Response) filter: a filter with a finite-length unit impulse response, also called a non-recursive filter. It is the most basic building block in digital signal processing systems. It can have a strictly linear phase response while allowing an arbitrary amplitude-frequency response, and because its unit sample response is finite, the filter is a stable system. FIR filters are therefore widely used in communications, image processing, pattern recognition, and other fields.
FIR filter
The FIR filter designed and implemented in this section computes a weighted sum of the current input sample and the previous input samples.
In effect, the module performs element-wise multiplication between the filter coefficients and the elements of the input signal, and then sums the products. This operation is also called convolution.
The signal-based mathematical definition is:
$$y_n = b_0 x_n + b_1 x_{n-1} + b_2 x_{n-2} + \dots$$
where:
- $y_n$ is the output signal at time $n$;
- $x_n$ is the input signal at time $n$;
- $b_i$ are the filter coefficients, also known as the impulse response;
- $n-1, n-2, \dots$ denote time $n$ delayed by 1, 2, ... cycles.
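The equation above can be checked with a small software reference model. The following is a minimal sketch in plain Scala (no Chisel involved); the object name FirModel and the function fir are hypothetical names chosen for illustration, and samples before the first input are assumed to be 0, matching registers that reset to 0.

```scala
// Software reference model of the FIR equation (plain Scala, no Chisel).
// FirModel and fir are illustrative names, not part of any library.
object FirModel {
  // y_n = sum over i of b(i) * x(n - i); x is treated as 0 before the first sample
  def fir(b: Seq[Int], x: Seq[Int]): Seq[Int] =
    x.indices.map { n =>
      b.indices.map { i =>
        if (n - i >= 0) b(i) * x(n - i) else 0
      }.sum
    }
}
```

For example, FirModel.fir(Seq(1, 2, 3, 4), Seq(1, 4, 3, 2, 7, 0)) yields 1, 6, 14, 24, 36, 32, which are the values expected by the asymmetric-filter test in this article.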
Implementing a four-tap FIR filter with 8-bit data
Now let's try to build a four-tap FIR filter module whose coefficients are parameters of the module.
Both the input and the output are 8-bit unsigned integers.
Tips:
- You need a structure similar to a shift register to hold the necessary state (for example, the delayed signal values);
- Delaying the input by one cycle can be done with ShiftRegister and a shift amount of 1, or with the RegNext construct;
- All registers are initialized to 0.
First, determine the inputs and outputs:
- Input: an 8-bit unsigned integer input signal;
- Output: an 8-bit unsigned integer output signal.
Then determine which state needs to be stored:
- The filter coefficients are hardwired from the module parameters, so they need no storage;
- The delayed input samples must be stored in registers, using the syntax RegNext(next_val, init_value).
The implementation is therefore as follows:
// MyModule.scala
import chisel3._
import chisel3.util._

class MyModule(b0: Int, b1: Int, b2: Int, b3: Int) extends Module {
  val io = IO(new Bundle {
    val in = Input(UInt(8.W))
    val out = Output(UInt(8.W))
  })

  // Delay line: x_n1, x_n2, x_n3 hold the input delayed by 1, 2 and 3 cycles
  val x_n1 = RegNext(io.in, 0.U)
  val x_n2 = RegNext(x_n1, 0.U)
  val x_n3 = RegNext(x_n2, 0.U)
  io.out := io.in * b0.U + x_n1 * b1.U + x_n2 * b2.U + x_n3 * b3.U
}

object MyModule extends App {
  println(getVerilogString(new MyModule(0, 0, 0, 0)))
}
// MyModuleTest.scala
import chisel3._
import chiseltest._
import org.scalatest.flatspec.AnyFlatSpec

class MyModuleTest extends AnyFlatSpec with ChiselScalatestTester {
  behavior of "MyModule"
  it should "get right results" in {
    // Simple sanity check: an element with all zero coefficients should always produce zero
    test(new MyModule(0, 0, 0, 0)) { c =>
      c.io.in.poke(0.U)
      c.io.out.expect(0.U)
      c.clock.step(1)
      c.io.in.poke(4.U)
      c.io.out.expect(0.U)
      c.clock.step(1)
      c.io.in.poke(5.U)
      c.io.out.expect(0.U)
      c.clock.step(1)
      c.io.in.poke(2.U)
      c.io.out.expect(0.U)
    }
    // Simple 4-point moving average
    test(new MyModule(1, 1, 1, 1)) { c =>
      c.io.in.poke(1.U)
      c.io.out.expect(1.U)  // 1, 0, 0, 0
      c.clock.step(1)
      c.io.in.poke(4.U)
      c.io.out.expect(5.U)  // 4, 1, 0, 0
      c.clock.step(1)
      c.io.in.poke(3.U)
      c.io.out.expect(8.U)  // 3, 4, 1, 0
      c.clock.step(1)
      c.io.in.poke(2.U)
      c.io.out.expect(10.U) // 2, 3, 4, 1
      c.clock.step(1)
      c.io.in.poke(7.U)
      c.io.out.expect(16.U) // 7, 2, 3, 4
      c.clock.step(1)
      c.io.in.poke(0.U)
      c.io.out.expect(12.U) // 0, 7, 2, 3
    }
    // Nonsymmetric filter
    test(new MyModule(1, 2, 3, 4)) { c =>
      c.io.in.poke(1.U)
      c.io.out.expect(1.U)  // 1*1, 0*2, 0*3, 0*4
      c.clock.step(1)
      c.io.in.poke(4.U)
      c.io.out.expect(6.U)  // 4*1, 1*2, 0*3, 0*4
      c.clock.step(1)
      c.io.in.poke(3.U)
      c.io.out.expect(14.U) // 3*1, 4*2, 1*3, 0*4
      c.clock.step(1)
      c.io.in.poke(2.U)
      c.io.out.expect(24.U) // 2*1, 3*2, 4*3, 1*4
      c.clock.step(1)
      c.io.in.poke(7.U)
      c.io.out.expect(36.U) // 7*1, 2*2, 3*3, 4*4
      c.clock.step(1)
      c.io.in.poke(0.U)
      c.io.out.expect(32.U) // 0*1, 7*2, 2*3, 3*4
    }
    println("SUCCESS!!")
  }
}
The generated Verilog code is as follows:
module MyModule(
input clock,
input reset,
input [7:0] io_in,
output [7:0] io_out
);
`ifdef RANDOMIZE_REG_INIT
reg [31:0] _RAND_0;
reg [31:0] _RAND_1;
reg [31:0] _RAND_2;
`endif // RANDOMIZE_REG_INIT
reg [7:0] x_n1; // @[MyModule.scala 12:21]
reg [7:0] x_n2; // @[MyModule.scala 13:21]
reg [7:0] x_n3; // @[MyModule.scala 14:21]
wire [8:0] _io_out_T = io_in * 1'h0; // @[MyModule.scala 15:19]
wire [8:0] _io_out_T_1 = x_n1 * 1'h0; // @[MyModule.scala 15:33]
wire [8:0] _io_out_T_3 = _io_out_T + _io_out_T_1; // @[MyModule.scala 15:26]
wire [8:0] _io_out_T_4 = x_n2 * 1'h0; // @[MyModule.scala 15:47]
wire [8:0] _io_out_T_6 = _io_out_T_3 + _io_out_T_4; // @[MyModule.scala 15:40]
wire [8:0] _io_out_T_7 = x_n3 * 1'h0; // @[MyModule.scala 15:61]
wire [8:0] _io_out_T_9 = _io_out_T_6 + _io_out_T_7; // @[MyModule.scala 15:54]
assign io_out = _io_out_T_9[7:0]; // @[MyModule.scala 15:10]
always @(posedge clock) begin
if (reset) begin // @[MyModule.scala 12:21]
x_n1 <= 8'h0; // @[MyModule.scala 12:21]
end else begin
x_n1 <= io_in; // @[MyModule.scala 12:21]
end
if (reset) begin // @[MyModule.scala 13:21]
x_n2 <= 8'h0; // @[MyModule.scala 13:21]
end else begin
x_n2 <= x_n1; // @[MyModule.scala 13:21]
end
if (reset) begin // @[MyModule.scala 14:21]
x_n3 <= 8'h0; // @[MyModule.scala 14:21]
end else begin
x_n3 <= x_n2; // @[MyModule.scala 14:21]
end
end
// Register and memory initialization
... // Omit
endmodule
The test passed.
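One detail visible in the generated Verilog is that io_out is assigned from the low 8 bits of the sum (the [7:0] slice), so results larger than 255 wrap around modulo 256. The sketch below models this in plain Scala; FirTruncModel and its width parameter are hypothetical names for illustration, and it assumes non-negative coefficients and inputs in the range 0 to 255.

```scala
// Software model of the FIR sum followed by truncation to `width` bits.
// FirTruncModel is an illustrative name, not part of Chisel.
object FirTruncModel {
  def fir(b: Seq[Int], x: Seq[Int], width: Int = 8): Seq[Int] = {
    val mask = (1 << width) - 1 // 0xFF for the 8-bit output port
    x.indices.map { n =>
      // Full-precision sum of products, with x treated as 0 before the first sample
      val sum = b.indices.map(i => if (n - i >= 0) b(i) * x(n - i) else 0).sum
      sum & mask // keep only the low bits, like the [7:0] slice on io_out
    }
  }
}
```

With coefficients (1, 1, 1, 1) and a constant input of 100, the full-precision sums are 100, 200, 300, 400, but the model returns 100, 200, 44, 144. A testbench for this module therefore has to either keep the expected values below 256 or widen the output port.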
FIR filter generator
This part relies on material covered later, but let's start with the basic idea of building an FIR filter generator.
The generator has a parameter, length, which is the number of taps of the filter. The coefficient of each tap is supplied through an input of the hardware module.
The generated module therefore has three inputs:
- in, the input of the filter;
- valid, the valid bit for the input;
- consts, a vector holding the coefficient of each tap;
and one output:
- out, the output of the filter.
The implementation is therefore as follows:
import chisel3._
import chisel3.util._

class MyModule(length: Int) extends Module {
  val io = IO(new Bundle {
    val in = Input(UInt(8.W))
    val valid = Input(Bool())
    val consts = Input(Vec(length, UInt(8.W))) // Usage mentioned later
    val out = Output(UInt(8.W))
  })

  // Usage mentioned later
  val taps = Seq(io.in) ++ Seq.fill(io.consts.length - 1)(RegInit(0.U(8.W)))
  taps.zip(taps.tail).foreach { case (a, b) =>
    when (io.valid) { b := a }
  }
  io.out := taps.zip(io.consts).map { case (a, b) => a * b }.reduce(_ + _)
}

object MyModule extends App {
  println(getVerilogString(new MyModule(4)))
}
The generated Verilog code is as follows:
module MyModule(
input clock,
input reset,
input [7:0] io_in,
input io_valid,
input [7:0] io_consts_0,
input [7:0] io_consts_1,
input [7:0] io_consts_2,
input [7:0] io_consts_3,
output [7:0] io_out
);
`ifdef RANDOMIZE_REG_INIT
reg [31:0] _RAND_0;
reg [31:0] _RAND_1;
reg [31:0] _RAND_2;
`endif // RANDOMIZE_REG_INIT
reg [7:0] taps_1; // @[MyModule.scala 13:66]
reg [7:0] taps_2; // @[MyModule.scala 13:66]
reg [7:0] taps_3; // @[MyModule.scala 13:66]
wire [15:0] _io_out_T = io_in * io_consts_0; // @[MyModule.scala 16:56]
wire [15:0] _io_out_T_1 = taps_1 * io_consts_1; // @[MyModule.scala 16:56]
wire [15:0] _io_out_T_2 = taps_2 * io_consts_2; // @[MyModule.scala 16:56]
wire [15:0] _io_out_T_3 = taps_3 * io_consts_3; // @[MyModule.scala 16:56]
wire [15:0] _io_out_T_5 = _io_out_T + _io_out_T_1; // @[MyModule.scala 16:71]
wire [15:0] _io_out_T_7 = _io_out_T_5 + _io_out_T_2; // @[MyModule.scala 16:71]
wire [15:0] _io_out_T_9 = _io_out_T_7 + _io_out_T_3; // @[MyModule.scala 16:71]
assign io_out = _io_out_T_9[7:0]; // @[MyModule.scala 16:10]
always @(posedge clock) begin
if (reset) begin // @[MyModule.scala 13:66]
taps_1 <= 8'h0; // @[MyModule.scala 13:66]
end else if (io_valid) begin // @[MyModule.scala 14:64]
taps_1 <= io_in; // @[MyModule.scala 14:68]
end
if (reset) begin // @[MyModule.scala 13:66]
taps_2 <= 8'h0; // @[MyModule.scala 13:66]
end else if (io_valid) begin // @[MyModule.scala 14:64]
taps_2 <= taps_1; // @[MyModule.scala 14:68]
end
if (reset) begin // @[MyModule.scala 13:66]
taps_3 <= 8'h0; // @[MyModule.scala 13:66]
end else if (io_valid) begin // @[MyModule.scala 14:64]
taps_3 <= taps_2; // @[MyModule.scala 14:68]
end
end
// Register and memory initialization
... // Omit
endmodule
As you can see, both the Vec in the IO bundle and the Seq used later are unrolled in the generated Verilog. Their detailed usage is covered later.
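The cycle-by-cycle behaviour of the generated module, including the effect of valid, can be sketched as a plain Scala model. This is an illustration only: FirGenModel, State, init and step are hypothetical names, and the model assumes the 8-bit output truncation seen in the Verilog above.

```scala
// Cycle-level software model of the parameterized FIR module (plain Scala).
// All names here are illustrative; none of them come from Chisel.
object FirGenModel {
  // regs holds the delayed samples x_{n-1}, x_{n-2}, ... (length - 1 registers)
  final case class State(regs: Vector[Int])

  // All registers reset to 0, like RegInit(0.U(8.W))
  def init(length: Int): State = State(Vector.fill(length - 1)(0))

  // One clock cycle: returns (output seen during this cycle, state after the clock edge)
  def step(s: State, in: Int, valid: Boolean, consts: Seq[Int]): (Int, State) = {
    val taps = in +: s.regs // corresponds to Seq(io.in) ++ registers
    val out = taps.zip(consts).map { case (a, b) => a * b }.sum & 0xFF
    val next = if (valid) State(taps.init) else s // registers shift only when valid
    (out, next)
  }
}
```

Note that the output is combinational in the current input, so it changes with in even when valid is low; only the register contents are frozen in that case.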
Application and testing of DSP blocks (DspBlock)
Integrating DSP components into a larger system is challenging and error-prone. The repository dsptools/rocket at master · ucb-bar/dsptools (github.com) contains some useful generators that help with this kind of task.
The core abstraction is called DspBlock. A DspBlock has:
- AXI4-Stream input and output;
- memory-mapped status and control (AXI4 in this example).
Note: AXI (Advanced eXtensible Interface) is a bus protocol.
DspBlock uses rocket's diplomatic interfaces. Diplomacy and TileLink from the Rocket Chip · lowRISC: Collaborative open silicon engineering summarizes the basics of diplomacy, but you do not need to understand how it works for this example.
Diplomacy really shines when many different DspBlocks are connected together to form a complex SoC.
In this example we only build a single peripheral. The StandaloneBlock trait is mixed in so that the diplomatic interfaces become top-level IO. You only need StandaloneBlock when a DspBlock is used without connecting its diplomatic interfaces to anything.
To use DspBlock, add the following dependency to build.sbt:
libraryDependencies += "edu.berkeley.cs" %% "rocket-dsptools" % "1.4.3"
The following example wraps the FIR filter in an AXI4 interface:
import chisel3._
import chisel3.util._
import dspblocks._
import freechips.rocketchip.amba.axi4._
import freechips.rocketchip.amba.axi4stream._
import freechips.rocketchip.config._
import freechips.rocketchip.diplomacy._
import freechips.rocketchip.regmapper._

class MyModule(length: Int) extends Module {
  val io = IO(new Bundle {
    val in = Input(UInt(8.W))
    val valid = Input(Bool())
    val consts = Input(Vec(length, UInt(8.W))) // Usage mentioned later
    val out = Output(UInt(8.W))
  })

  // Usage mentioned later
  val taps = Seq(io.in) ++ Seq.fill(io.consts.length - 1)(RegInit(0.U(8.W)))
  taps.zip(taps.tail).foreach { case (a, b) =>
    when (io.valid) { b := a }
  }
  io.out := taps.zip(io.consts).map { case (a, b) => a * b }.reduce(_ + _)
}

object MyModule extends App {
  println(getVerilogString(new MyModule(4)))
}

//
// Base class for all FIRBlocks.
// This can be extended to make TileLink, AXI4, APB, AHB, etc. flavors of the FIR filter
//
abstract class FIRBlock[D, U, EO, EI, B <: Data](val nFilters: Int, val nTaps: Int)(implicit p: Parameters)
// HasCSR means that the memory interface will be using the RegMapper API to define status and control registers
extends DspBlock[D, U, EO, EI, B] with HasCSR {
  // diplomatic node for the streaming interface
  // identity node means the output and input are parameterized to be the same
  val streamNode = AXI4StreamIdentityNode()

  // define what hardware will be elaborated
  lazy val module = new LazyModuleImp(this) {
    // get streaming input and output wires from diplomatic node
    val (in, _) = streamNode.in(0)
    val (out, _) = streamNode.out(0)

    require(in.params.n >= nFilters,
      s"""AXI-4 Stream port must be big enough for all
         |the filters (need $nFilters, only have ${in.params.n})""".stripMargin)

    // make registers to store taps
    val taps = Reg(Vec(nFilters, Vec(nTaps, UInt(8.W))))

    // memory map the taps, plus the first address is a read-only field that says how many filter lanes there are
    val mmap = Seq(
      RegField.r(64, nFilters.U, RegFieldDesc("nFilters", "Number of filter lanes"))
    ) ++ taps.flatMap(_.map(t => RegField(8, t, RegFieldDesc("tap", "Tap"))))

    // generate the hardware for the memory interface
    // in this class, regmap is abstract (unimplemented). mixing in something like AXI4HasCSR or TLHasCSR
    // will define regmap for the particular memory interface
    regmap(mmap.zipWithIndex.map({ case (r, i) => i * 8 -> Seq(r) }): _*)

    // make the FIR lanes and connect inputs and taps
    val outs = for (i <- 0 until nFilters) yield {
      val fir = Module(new MyModule(nTaps))
      // lane i occupies bits 8*i+7 down to 8*i of the stream data (an 8-bit slice)
      fir.io.in := in.bits.data((i + 1) * 8 - 1, i * 8)
      fir.io.valid := in.valid && out.ready
      fir.io.consts := taps(i)
      fir.io.out
    }

    val output = if (outs.length == 1) {
      outs.head
    } else {
      outs.reduce((x: UInt, y: UInt) => Cat(y, x))
    }

    out.bits.data := output
    in.ready := out.ready
    out.valid := in.valid
  }
}

// make AXI4 flavor of FIRBlock
abstract class AXI4FIRBlock(nFilters: Int, nTaps: Int)(implicit p: Parameters)
  extends FIRBlock[AXI4MasterPortParameters, AXI4SlavePortParameters, AXI4EdgeParameters, AXI4EdgeParameters, AXI4Bundle](nFilters, nTaps)
  with AXI4DspBlock with AXI4HasCSR {
  override val mem = Some(AXI4RegisterNode(
    AddressSet(0x0, 0xffffL), beatBytes = 8
  ))
}

// running the code below will show what firrtl is generated
// note that LazyModules aren't really chisel modules- you need to call ".module" on them when invoking the chisel driver
// also note that AXI4StandaloneBlock is mixed in- if you forget it, you will get weird diplomacy errors because the memory
// interface expects a master and the streaming interface expects to be connected. AXI4StandaloneBlock will add top level IOs
// println(chisel3.Driver.emit(() => LazyModule(new AXI4FIRBlock(1, 8)(Parameters.empty) with AXI4StandaloneBlock).module))
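The way FIRBlock splits the stream data into 8-bit lanes and packs the per-lane outputs back together can be illustrated in plain Scala. This is a sketch under the assumption that lane 0 sits in the least significant byte, which is what reduce((x, y) => Cat(y, x)) produces; LaneModel, lane and pack are hypothetical names.

```scala
// Software model of the byte-lane packing used for the stream data (plain Scala).
// LaneModel is an illustrative name, not part of dsptools or rocket-chip.
object LaneModel {
  // Extract 8-bit lane i from a packed word (lane 0 is the least significant byte)
  def lane(word: BigInt, i: Int): Int = ((word >> (8 * i)) & 0xFF).toInt

  // Pack lanes into one word with lane 0 lowest, like reduce((x, y) => Cat(y, x))
  def pack(lanes: Seq[Int]): BigInt =
    lanes.zipWithIndex.map { case (v, i) => BigInt(v & 0xFF) << (8 * i) }.sum
}
```

For instance, the word 0x0A0B carries 0x0B in lane 0 and 0x0A in lane 1, and packing Seq(0x0B, 0x0A) gives 0x0A0B again.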
Testing a DspBlock is slightly different, because we now have to deal with the memory interface and with LazyModule. dsptools provides some traits that help with testing a DspBlock.
An important one is MemMasterModel, which defines generic functions such as memReadWord and memWriteWord for memory operations. This lets you write a generic test and then specify which memory interface it uses. For example, you can write one test and specialize it for both the TileLink and AXI4 interfaces.
The following example tests FIRBlock in this way:
import dsptools.tester.MemMasterModel
import freechips.rocketchip.amba.axi4
import chisel3.iotesters._

abstract class FIRBlockTester[D, U, EO, EI, B <: Data](c: FIRBlock[D, U, EO, EI, B]) extends PeekPokeTester(c.module) with MemMasterModel {
  // check that address 0 is the number of filters
  require(memReadWord(0) == c.nFilters)
  // write 1 to all the taps
  for (i <- 0 until c.nFilters * c.nTaps) {
    memWriteWord(8 + i * 8, 1)
  }
}

// specialize the generic tester for axi4
class AXI4FIRBlockTester(c: AXI4FIRBlock with AXI4StandaloneBlock) extends FIRBlockTester(c) with AXI4MasterModel {
  def memAXI = c.ioMem.get
}

// invoking testers on lazymodules is a little strange.
// note that the firblocktester takes a lazymodule, not a module (it calls .module in "extends PeekPokeTester()").
val lm = LazyModule(new AXI4FIRBlock(1, 8)(Parameters.empty) with AXI4StandaloneBlock)
chisel3.iotesters.Driver(() => lm.module) { _ => new AXI4FIRBlockTester(lm) }
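The addresses used by the tester follow from the regmap call in FIRBlock, which places field number i at byte address i * 8: the read-only nFilters field is field 0, and the taps follow in filter-major order. The sketch below spells this out in plain Scala; FirAddrMap and tapAddr are hypothetical helper names, not part of dsptools.

```scala
// Address layout implied by regmap(mmap.zipWithIndex.map { case (r, i) => i * 8 -> Seq(r) })
// FirAddrMap is an illustrative helper, not part of dsptools or rocket-chip.
object FirAddrMap {
  val bytesPerField = 8 // each field is placed 8 bytes after the previous one
  val nFiltersAddr = 0  // field 0: read-only number of filter lanes

  // Tap `tap` of filter `filter` is field number 1 + filter * nTaps + tap
  def tapAddr(filter: Int, tap: Int, nTaps: Int): Int =
    bytesPerField * (1 + filter * nTaps + tap)
}
```

With nTaps = 8, the first tap of filter 0 is at address 8 and its last tap at address 64, which agrees with the tester's memWriteWord(8 + i * 8, 1) loop.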
Unfortunately, I could not get this test example from the tutorial to run, and I did not find a fix, so I had to give up for now. I will look into it further when this topic comes up again.