Data Structures#
Each structure within Multiplied aims to provide fine grain control of the algorithm’s design. From automatic generation, to controlling how bits move between stages and the ability to chose where arithmetic units are placed.
This page will cover the major data structures in Multiplied. For a brief introduction to Multiplied check out the quickstart guide.
Matrix#
Combinational multiplication generates partial products
which are reduced to a single product. The Matrix object contains the partial
product matrix(PPM) while also tracking which bits are important via it’s formatting
style.
let’s create 4-bit Matrix Object:
import multiplied as mp
matrix = mp.Matrix(4) # 4-bit partial product matrix
And here’s what it looks like:
print(matrix)
____0000
___0000_
__0000__
_0000___
Note
The width of the matrix is 2x the operand bit width. This is because the maximum output of a product of two x-bit numbers results in a 2x-bit value.
15 * 15 = 225 -> 0b1111 * 0b1111 = 0b11100001
The Matrix structure gives Multiplied objects fine grain access to
bits and partial products while keeping data human readable.
Wallace Tree#
instead of generating a zeroed matrix, we can generate a matrix with a two operands:
matrix = mp.Matrix(4, a=15, b=3)
print(matrix)
____1111
___1111_
__0000__
_0000___
By default the matrix generates a Wallace tree.
Dadda Tree#
With an extra step it can be adjusted to resemble the start of a Dadda tree.
matrix = mp.Matrix(4, a=15, b=3)
# maps bits to the top of the matrix
_ = mp.hoist(matrix) # discard returned map
print(matrix)
_0011111
__00111_
___000__
____0___
Important
A method of generating a Dadda tree Matrix object is planned:
matrix = mp.Matrix(4, a=15, b=3, dadda=True)
Template#
Templates defines the types of arithmetic units used and where they are placed within each algorithm stage.
The goal of a template is to “reduce” the number of partial products.
Multiplied allows two overarching methods to create templates:
Pattern Based Templates#
Patterns are arrays which define an arithmetic unit to be places across entire rows of the Matrix:
# template based on two arithmetic units
pattern = mp.Pattern(["a", "a", "a", "b", "b", "c", "c", "d"])
template = mp.Template(pattern)
print(pattern)
print(template)
[a
a
a
b
b
c
c
d]
________aAaAaAaA
_______AaAaAaAa_
______aAaAaAaA__
_____BbBbBbBb___
____bBbBbBbB____
___CcCcCcCc_____
__cCcCcCcC______
_dDdDdDdD_______
______aAaAaAaAaA
______AaAaAaAa__
________________
___bbBbBbBbBb___
________________
_ccCcCcCcCc_____
________________
_dDdDdDdD_______
Arithmetic Units#
As shown above each group or “run” of characters represents a single arithmetic unit, each of which falls into three categories:
CSAs#
Units spanning three rows define a CSA Each taking in 3 inputs and returning a sum and a carry.
[ Template ] [ Pattern ] [ Result ]
________aAaAaAaA a ______aAaAaAaAaA
_______AaAaAaAa_ a ______AaAaAaAa__
______aAaAaAaA__ a ________________
Note
The utility of a CSA is the size and speed of it’s circuit. Instead of wasting space using large full adders, many smaller CSAs will produce the same reduction, faster.
Adders#
Units spanning two rows define an adder, each taking in two values and returning the combined sum.
[ Template ] [ Pattern ] [ Result ]
_____BbBbBbBb___ b ___bbBbBbBbBb___
____bBbBbBbB____ b ________________
___CcCcCcCc_____ c _ccCcCcCcCc_____
__cCcCcCcC______ c ________________
NOOPs#
Units spanning a single row are considered as NOOP or no operation.
[ Template ] [ Pattern ] [ Result ]
_dDdDdDdD_______ d _dDdDdDdD_______
Note
Decoders are also planned, potentially reducing inputs by more than 1.
By encoding specialised operations based on bit position, count, unary count, etc. a decoder can also offer unique operations.
Complex Templates#
Pattern based templates can be used as a starting point for more complex templates:
raw_template = template.template
raw_result = template.result
for t in raw_template:
print(t)
print("----")
for r in raw_result:
print(r)
['_', '_', '_', '_', '_', '_', '_', '_', 'a', 'A', 'a', 'A', 'a', 'A', 'a', 'A']
['_', '_', '_', '_', '_', '_', '_', 'A', 'a', 'A', 'a', 'A', 'a', 'A', 'a', '_']
['_', '_', '_', '_', '_', '_', 'a', 'A', 'a', 'A', 'a', 'A', 'a', 'A', '_', '_']
['_', '_', '_', '_', '_', 'B', 'b', 'B', 'b', 'B', 'b', 'B', 'b', '_', '_', '_']
['_', '_', '_', '_', 'b', 'B', 'b', 'B', 'b', 'B', 'b', 'B', '_', '_', '_', '_']
['_', '_', '_', 'C', 'c', 'C', 'c', 'C', 'c', 'C', 'c', '_', '_', '_', '_', '_']
['_', '_', 'c', 'C', 'c', 'C', 'c', 'C', 'c', 'C', '_', '_', '_', '_', '_', '_']
['_', 'd', 'D', 'd', 'D', 'd', 'D', 'd', 'D', '_', '_', '_', '_', '_', '_', '_']
----
['_', '_', '_', '_', '_', '_', 'a', 'A', 'a', 'A', 'a', 'A', 'a', 'A', 'a', 'A']
['_', '_', '_', '_', '_', '_', 'A', 'a', 'A', 'a', 'A', 'a', 'A', 'a', '_', '_']
['_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_']
['_', '_', '_', 'b', 'b', 'B', 'b', 'B', 'b', 'B', 'b', 'B', 'b', '_', '_', '_']
['_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_']
['_', 'c', 'c', 'C', 'c', 'C', 'c', 'C', 'c', 'C', 'c', '_', '_', '_', '_', '_']
['_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_']
['_', 'd', 'D', 'd', 'D', 'd', 'D', 'd', 'D', '_', '_', '_', '_', '_', '_', '_']
Using the print out to build a new template and result:
complex_template = [
["_", "_", "_", "_", "_", "_", "_", "_", "a", "A", "a", "A", "a", "x", "x", "x"],
["_", "_", "_", "_", "_", "_", "_", "A", "a", "A", "a", "A", "a", "y", "y", "_"],
["_", "_", "_", "_", "_", "_", "a", "A", "a", "A", "a", "A", "a", "z", "_", "_"],
["_", "_", "_", "_", "_", "B", "b", "B", "b", "B", "b", "B", "b", "_", "_", "_"],
["_", "_", "_", "_", "b", "B", "b", "B", "b", "B", "b", "B", "_", "_", "_", "_"],
["_", "_", "_", "C", "c", "C", "c", "C", "c", "C", "c", "_", "_", "_", "_", "_"],
["_", "_", "c", "C", "c", "C", "c", "C", "c", "C", "_", "_", "_", "_", "_", "_"],
["_", "d", "D", "d", "D", "d", "D", "d", "D", "_", "_", "_", "_", "_", "_", "_"],
]
complex_result = [
["_", "_", "_", "_", "_", "_", "a", "A", "a", "A", "a", "A", "a", "x", "x", "x"],
["_", "_", "_", "_", "_", "_", "A", "a", "A", "a", "A", "a", "_", "y", "y", "_"],
["_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "z", "_", "_"],
["_", "_", "_", "b", "b", "B", "b", "B", "b", "B", "b", "B", "b", "_", "_", "_"],
["_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_"],
["_", "c", "c", "C", "c", "C", "c", "C", "c", "C", "c", "_", "_", "_", "_", "_"],
["_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_", "_"],
["_", "d", "D", "d", "D", "d", "D", "d", "D", "_", "_", "_", "_", "_", "_", "_"],
]
new_template = mp.Template(complex_template, result=complex_result)
print(new_template)
________aAaAaxxx
_______AaAaAayy_
______aAaAaAaz__
_____BbBbBbBb___
____bBbBbBbB____
___CcCcCcCc_____
__cCcCcCcC______
_dDdDdDdD_______
______aAaAaAaxxx
______AaAaAa_yy_
_____________z__
___bbBbBbBbBb___
________________
_ccCcCcCcCc_____
________________
_dDdDdDdD_______
Warning
Currently a resultant template must be supplied when creating a template without using a pattern.
Auto-resolution of resultant templates is planned.
Map#
Each step of an algorithm needs a pattern or a template, but it also needs to regroup their partial products before using the next template. This is where maps come in.
First, note that bits can only move vertically as moving horizontally changes it’s value. Therefore, each map value is a signed hexadecimal number.
For simple maps, the map value represents an entire row rather than a specific bit.
# Maps for each stage of first_alg
1st: 2nd: 3rd:
00 00 00
00 00 00
00 FF 00
FF 00 00
Note
Outputs of each units are packed to the top of their initial row.
Here’s the breakdown of this example:
1st stage - Move first 2 rows of result down by 1 [-1 = FF = down * 1]
2nd stage - Move middle 2 rows of result down by 1 [-1 = FF = down * 1]
3rd stage - No moves required
Algorithms use a template to produce a result, which is then “mapped” to the next template. Each Adder/CSA/etc. needs to know where it should output in relation to the next template. This means as long as outputs are mapped correctly to inputs, the placements of arithmetic units can be anywhere. The final output (x from the pattern example) can be anywhere within the matrix.
Note
In other words, these are also valid algorithms:
# [ Key ]
# char | r :: arithmetic unit | result
# Another valid algorithm # Another
1st: 2nd: 3rd: output: 1st: 2nd: 3rd: output:
a r a r c r
a r c r a r c r d r x
a c r d r a c d
b r c d x b r
# maps # maps
01 00 00 00 01 00
01 01 00 00 01 00
00 01 01 00 00 00
00 00 00 FF 00 00
Algorithm#
The Algorithm object bundle templates and maps to define a multiplication algorithm. A given stage contains:
"template" : mp.Template # source and resultant templates
"pseudo" : mp.Matrix # resultant matrix after reduction + map
"map" : mp.Map # modifies bit positions
This guide will use pattern based templates to demonstrate the structure of an Algorithm, for a more comprehensive overview check out the complex algorithm guide.
Population#
completely automatic generation…
auto_alg = mp.Algorithm(4)
auto_alg.auto_resolve_stage(recursive=True)
print(auto_alg)
0:{
template:{
____AaAa
___aAaA_
__AaAa__
_BbBb___
__AaAaAa
__aAaA__
________
_BbBb___
}
pseudo:{
__AaAaAa
__aAaA__
_BbBb___
________
}
map:{
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
FF FF FF FF FF FF FF FF
}
1:{
template:{
__AaAaAa
__AaAa__
_aAaA___
________
_aAaAaAa
_AaAa___
________
________
}
pseudo:{
_aAaAaAa
_AaAa___
________
________
}
map:{
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
}
2:{
template:{
_aAaAaAa
_aAaA___
________
________
AaAaAaAa
________
________
________
}
pseudo:{
AaAaAaAa
________
________
________
}
map:{
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
}
Push templates or patterns one at a time:
alg = mp.Algorithm(4)
alg.push(mp.Pattern(["a", "a", "b", "b"]))
print(alg)
0:{
template:{
____aAaA
___AaAa_
__bBbB__
_BbBb___
__aAaAaA
________
bBbBbB__
________
}
pseudo:{
__aAaAaA
bBbBbB__
________
________
}
map:{
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
FF FF FF FF FF FF FF FF
00 00 00 00 00 00 00 00
}
Note
Pushing a pattern will automatically generate a new stage based on the prior stage.
Automatically generate the rest:
alg.auto_resolve_stage(recursive=True)
print(alg)
0:{
template:{
____aAaA
___AaAa_
__bBbB__
_BbBb___
__aAaAaA
________
bBbBbB__
________
}
pseudo:{
__aAaAaA
bBbBbB__
________
________
}
map:{
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
FF FF FF FF FF FF FF FF
00 00 00 00 00 00 00 00
}
1:{
template:{
__AaAaAa
AaAaAa__
________
________
AaAaAaAa
________
________
________
}
pseudo:{
AaAaAaAa
________
________
________
}
map:{
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
}
Execution#
Use exec() to run the algorithm using two operands:
a = 15
b = 15
output = alg.exec(a, b)
for m in output.values():
print(m)
# convert result to decimal
print(int("".join(alg.matrix.matrix[0]), 2))
print(a * b)
____1111
___1111_
__1111__
_1111___
__101101
101101__
________
________
11100001
________
________
________
225
225
Manually provide starting matrix and step through an algorithm:
starting_ppm = mp.Matrix(4, a=5, b=9)
alg.reset(starting_ppm)
print(alg.matrix) # initial state
print(alg.step())
print(alg.step())
____0101
___0000_
__0000__
_0101___
__000101
001010__
________
________
00101101
________
________
________
Saturation#
Some workloads require operations clamped to the source bit width such as digital signal processing(DSP) and working with RGB pixel values. This clamping is called saturation.
11 * 12 -> 0b1011 * 0b1100 -> 0b10000100 -[Clamp=4b]-> 0b1111
Cout|1011
-----+-----
[____|0000] 0
[___0|000_] 0
[__10|11__] 1
[_101|1___] 1
-----+-----
IF !0 ->|1111 -> 15
alg.saturation = True
for m in alg.exec(11, 12).values():
print(m)
____0000
___0000_
__1011__
_1011___
00001111
________
________
________
00001111
________
________
________