Typo Fixes #66
Merged
Commits (27, all by hassan-elsheikha):

- 54156ad Lowercase loop variable
- 480f346 Fix typo
- e9e2724 Clarify definition of the size of a filter
- 9463611 Loop description typo fixes
- 08cbc6f Add comma for clarity
- 2ee806f Make variable name an inline code block
- defb0a6 Add C++ styling to code block
- 4b1d722 Inline-code many variable names
- 0caa2ce Inline volatile keyword
- 7b45ac0 Add C++ styling to code block
- 43269b5 Add C++ styling to code block
- 9223f06 Add C++ styling to code block
- 197c6fa Add C++ styling to code block
- a918e12 Add C++ styling to code block
- de50654 Incorrect usage of "dependent"
- dd7cbef Clarify tradeoff on one sentence
- 836c973 Reverse incorrect change
- 221b09b Update readme.md
- 6716bb2 Update readme.md
- 11842a5 Update readme.md
- 4af7301 Update readme.md
- cc7a6d1 Update readme.md
- e9d2931 Update readme.md
- e72a0b7 Update readme.md
- 8c45115 Update readme.md
- 2ce5469 Update readme.md
- 5bd61c6 Update readme.md
@@ -584,7 +584,7 @@ performance.
 34 // Pipeline for extra performance.
 35 #pragma HLS loop pipeline
 36 for (int i = 0; i < 100; i++)
-37 sum += buf[i\];
+37 sum += buf[i];
 38 done = false;
 39 output_fifo.write(sum);
 40 }
@@ -782,7 +782,7 @@ int main() {

 ## Verification: Co-simulation of Multi-threaded SmartHLS Code

-As mentioned before the `producer_consumer` project cannot be simulated
+As mentioned before, the `producer_consumer` project cannot be simulated
 with co-simulation. This is because the `producer_consumer` project has
 threads that run forever and do not finish before the top-level function
 returns. SmartHLS co-simulation supports a single call to the top-level
@@ -1126,11 +1126,11 @@ is defined by the number of rows, columns, and the depth.
 Convolution layers are good for extracting geometric features from an
 input tensor. Convolution layers work in the same way as an image
 processing filter (such as the Sobel filter) where a square filter
-(called a **kernel**) is slid across an input image. The **size** of
-filter is equal to the side length of the square filter, and the size of
-the step when sliding the filter is called the **stride**. The values of
-the input tensor under the kernel (called the **window**) and the values
-of the kernel are multiplied and summed at each step, which is also
+(called a **kernel**) is slid across an input image. The **size** of a
+filter is equal to its side length, and the size of the step when sliding
+the filter is called the **stride**. The values of the input tensor
+under the kernel (called the **window**) and the values of the
+kernel are multiplied and summed at each step, which is also
 called a convolution. Figure 13 shows an example of a convolution layer
 processing an input tensor with a depth of 1.
@@ -1579,16 +1579,16 @@ we show the input tensor values and convolution filters involved in the
 computation of the set of colored output tensor values (see Loop 3
 arrow).

-Loop 1 and Loop 2 the code traverses along the row and column dimensions
+For Loop 1 and Loop 2, the code traverses along the row and column dimensions
 of the output tensor. Loop 3 traverses along the depth dimension of the
-output tensor, each iteration computes a `PARALLEL_KERNELS` number of
+output tensor, and each iteration computes a total of `PARALLEL_KERNELS`
 outputs. The `accumulated_value` array will hold the partial
 dot-products. Loop 4 traverses along the row and column dimensions of
-the input tensor and convolution filter kernels. Then Loop 5 walks
-through each of the `PARALLEL_KERNELS` number of selected convolution
+the input tensor and convolution filter kernels. Then, Loop 5 walks
+through each of the `PARALLEL_KERNELS` selected convolution
 filters and Loop 6 traverses along the depth dimension of the input
 tensor. Loop 7 and Loop 8 add up the partial sums together with biases
-to produce `PARALLEL_KERNEL` number of outputs.
+to produce `PARALLEL_KERNEL` outputs.

 ```C
 const static unsigned PARALLEL_KERNELS = NUM_MACC / INPUT_DEPTH;
@@ -2202,7 +2202,7 @@ instructions that always run together with a single entry point at the
 beginning and a single exit point at the end. A basic block in LLVM IR
 always has a label at the beginning and a branching instruction at the
 end (br, ret, etc.). An example of LLVM IR is shown below, where the
-`body.0` basic block performs an addition (add) and subtraction (sub) and
+`body.0` basic block performs an addition (add) and subtraction (sub), and
 then branches unconditionally (br) to another basic block labeled
 `body.1`. Control flow occurs between basic blocks.
@@ -2230,7 +2230,7 @@ button (![](.//media/image28.png)) to build the design and generate the
 schedule.

 We can ignore the `printWarningMessageForGlobalArrayReset` warning message
-for global variable a in this example as described in the producer
+for global variable `a` in this example as described in the producer
 consumer example in the [section 'Producer Consumer Example'](#producer-consumer-example).

 The first example we will look at is the `no_dependency` example on line
@@ -2239,7 +2239,7 @@ The first example we will look at is the `no_dependency` example on line

 <p align="center"><img src=".//media/image19.png" /></p>

-```
+```c++
 8 void no_dependency() {
 9 #pragma HLS function noinline
 10 e = b + c;
@@ -2251,10 +2251,10 @@ The first example we will look at is the `no_dependency` example on line
 <p align="center">Figure 28: Source code and data dependency graph for no_dependency
 function.</p>

-In this example, values are loaded from b, c, and d and additions happen
-before storing to *e*, *f*, and *g*. None of the adds use results from
+In this example, values are loaded from `b`, `c`, and `d`, and additions happen
+before storing to `e`, `f`, and `g`. None of the adds use results from
 the previous adds and thus all three adds can happen in parallel. The
-*noinline* pragma is used to prevent SmartHLS from automatically
+`noinline` pragma is used to prevent SmartHLS from automatically
 inlining this small function and making it harder for us to understand
 the schedule. Inlining is when the instructions in the called function
 get copied into the caller, to remove the overhead of the function call
@@ -2289,17 +2289,17 @@ the store instruction highlighted in yellow depends on the result of the
 add instruction as we expect.

 We have declared all the variables used in this function as
-**volatile**. The volatile C/C++ keyword specifies that the variable can
+**volatile**. The `volatile` C/C++ keyword specifies that the variable can
 be updated by something other than the program itself, making sure that
 any operation with these variables do not get optimized away by the
 compiler as every operation matters. An example of where the compiler
 handles this incorrectly is seen in the [section 'Producer Consumer Example'](#producer-consumer-example), where we had to
 declare a synchronization signal between two threaded functions as
-volatile. Using volatile is required for toy examples to make sure each
+`volatile`. Using `volatile` is required for toy examples to make sure each
 operation we perform with these variables will be generated in hardware
 and viewable in the Schedule Viewer.

-```
+```c++
 4 volatile int a[5] = {0};
 5 volatile int b = 0, c = 0, d = 0;
 6 volatile int e, f, g;
@@ -2314,7 +2314,7 @@ code and SmartHLS cannot schedule all instructions in the first cycle.

 <p align="center"><img src=".//media/image68.png" /></p>

-```
+```c++
 15 void data_dependency() {
 16 #pragma HLS function noinline
 17 e = b + c;
@@ -2336,8 +2336,8 @@ second add is also used in the third add. These are examples of data
 dependencies as later adds use the data result of previous adds. Because
 we must wait for the result `e` to be produced before we can compute `f`,
 and then the result `f` must be produced before we can compute `g`, not all
-instructions can be scheduled immediately. They must wait for their
-dependent instructions to finish executing before they can start, or
+instructions can be scheduled immediately. They must wait for the instructions
+they depend on to finish executing before they can start, or
 they would produce the wrong result.

 <p align="center"><img src=".//media/image70.png" /></br>
@@ -2374,7 +2374,7 @@ memories.

 <p align="center"><img src=".//media/image72.png" /></p>

-```
+```c++
 22 void memory_dependency() {
 23 #pragma HLS function noinline
 24 volatile int i = 0;
@@ -2418,7 +2418,7 @@ resource cannot be scheduled in parallel due to a lack of resources.
 `resource_contention` function on line 30 of
 `instruction_level_parallelism.cpp`.

-```
+```c++
 30 void resource_contention() {
 31 #pragma HLS function noinline
 32 e = a[0];
@@ -2451,7 +2451,7 @@ when generating the schedule for a design.
 Next, we will see an example of how loops prevent operations from being
 scheduled in parallel.

-```
+```c++
 37 void no_loop_unroll() {
 38 #pragma HLS function noinline
 39 int h = 0;
@@ -2480,10 +2480,9 @@ has no unrolling on the loop and `loop_unroll` unrolls the loop
 completely. This affects the resulting hardware by removing the control
 signals needed to facilitate the loop and combining multiple loop bodies
 into the same basic block, allowing more instructions to be scheduled in
-parallel. The trade-off here is an unrolled loop does not reuse hardware
-resources and can potentially use a lot of resources. However, the
-unrolled loop would finish earlier depending on how inherently parallel
-the loop body is.
+parallel. The trade-off here is that an unrolled loop does not reuse hardware
+resources and can potentially use a lot of resources, however it will
+finish earlier depending on how inherently parallel the loop body is.

Review comment on this change: Original was correct: https://prowritingaid.com/comma-however

 ![](.//media/image3.png)To see the effects of this, open the Schedule
 Viewer and first click on the `no_loop_unroll` function shown in Figure
Review comment: Original was correct. https://www.grammarly.com/blog/comma-before-and/