Intel® High Level Synthesis Accelerator Functional Unit Design Example User Guide

ID 683025
Date 7/19/2019
Public

2.7. Loading AF Bitstream and Running the Host Application

To run the bitstream, ensure that your host system contains an Intel® FPGA PAC and that the Intel Acceleration Stack (including OPAE) is installed and configured. For details, see the Intel Acceleration Stack Quick Start Guide for Intel® PAC with Intel® Arria® 10 GX FPGA.
  1. Start a terminal session and navigate to the root of the project (the hls_afu directory).
  2. Configure your system to use appropriately sized hugepages:
    $ sudo sh -c "echo 20 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages"
  3. Load the AF into the FPGA:
    $ sudo fpgaconf hls_afu.gbs
  4. Navigate to the hls_afu/sw directory.
  5. Build and run the host application. Do not specify USE_ASE=1, because this run targets the FPGA hardware rather than the simulation environment:
    $ make
    $ sudo ./hls_afu_host
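If the host application cannot allocate its buffers, one thing worth checking is whether the kernel actually reserved the hugepages requested in step 2; it can reserve fewer than requested when memory is fragmented. The following is a minimal sketch, not part of the design example: check_hugepages is a hypothetical helper name, and its optional second argument (a path to read the count from) is an assumption added only so that the function can be exercised without root access.

```shell
# Hypothetical helper (not part of the design example).
# Usage: check_hugepages <pages_needed> [count_file]
# Reads the number of reserved 2 MB hugepages and compares it
# with the number requested in step 2.
check_hugepages() {
    want="$1"
    # Default to the same sysfs file that step 2 writes to.
    file="${2:-/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages}"

    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi

    have=$(cat "$file")
    if [ "$have" -ge "$want" ]; then
        echo "OK: $have hugepages reserved (need $want)"
    else
        echo "FAIL: only $have hugepages reserved (need $want)"
        return 1
    fi
}
```

Calling check_hugepages 20 after step 2 should report OK if all 20 pages were reserved; a FAIL result suggests repeating step 2, possibly after freeing memory.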
The expected output is:
Using Avalon Slave at offset 0x40
No vector size specified. Default to size 64 floats! run ./hls_afu_host <vectorsize> to specify a vector size at runtime.
Using test vector of size 64.
Running Test
AFU DFH REG = 1000010000000000
AFU ID LO = 944028430b016f3d
AFU ID HI = 5fa7fd4b867c484c
AFU NEXT = 00000000
AFU RESERVED = 00000000
end of output memory before executing kernel:
    [62] - -6259853398707798016.000000 (0xdeadbeef)
    [63] - -6259853398707798016.000000 (0xdeadbeef)
    [64] - -6259853398707798016.000000 (0xdeadbeef)
    [65] - 0.000000 (0x0)
Interrupt enabled = 00000000
Interrupt enabled = 00000001
AFU Latency: 0.01600 milliseconds
Poll success. Return = 1
check output memory:
output memory OK!
sum: Expected 715.000000, calculated 715.000000.

The FPGA writes a full 512-bit word (64 bytes) to host memory, so if the size 
of your test vector (in bytes) is not a multiple of 64, the FPGA will 
overwrite some space at the end of output memory. fpgaPrepareBuffer() 
allocates your host memory in a buffer that is a multiple of 64 bytes, so the 
FPGA behavior will not affect your application. You should expect to see a 
single 0xdeadbeef at the end of the output memory if and only if the size of 
your test vector (determined by vector_size, and the datatype) is a multiple 
of 64 bytes (that is, if vector_size is a multiple of 16). 

end of output memory after executing kernel:
    [62] - 22.333334 (0x41b2aaab)
    [63] - 22.666666 (0x41b55555)
    [64] - -6259853398707798016.000000 (0xdeadbeef)
    [65] - 0.000000 (0x0)
Vector size is 64 (256 bytes), so expect memory output at [64] = 0xdeadbeef
Finished Running Test.
Test PASSED
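
The note in the output about 512-bit writes comes down to simple arithmetic, which the following sketch works through. It assumes 4-byte floats (consistent with the statement that a vector_size that is a multiple of 16 gives a multiple of 64 bytes); the function names fpga_bytes_written and sentinel_survives are hypothetical and do not appear in the host application.

```shell
# Hypothetical helpers illustrating the 64-byte write granularity.
# Assumption: each vector element is a 4-byte float.

# Number of bytes the FPGA writes for a vector of N floats:
# the data size rounded up to the next multiple of 64 bytes.
fpga_bytes_written() {
    n="$1"
    bytes=$(( n * 4 ))
    echo $(( (bytes + 63) / 64 * 64 ))
}

# Succeeds (exit status 0) when the data size is an exact multiple
# of 64 bytes, so the 0xdeadbeef marker at index N is left untouched.
sentinel_survives() {
    n="$1"
    [ $(( n * 4 % 64 )) -eq 0 ]
}
```

For the default vector size of 64, fpga_bytes_written 64 prints 256 and sentinel_survives 64 succeeds, matching the 0xdeadbeef shown at [64] above. For a vector size of 60, the data occupies 240 bytes but the FPGA writes 256, so the 16 bytes after the vector are overwritten and sentinel_survives 60 fails.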