
hpcc_fpga's Introduction

HPCC FPGA


HPCC FPGA is an OpenCL-based FPGA benchmark suite with a focus on high-performance computing. It is based on the benchmarks of the well-established CPU benchmark suite HPCC. This repository contains the OpenCL kernels and host code for all benchmarks together with the build scripts and instructions. Visit the Online Documentation to find additional help, descriptions, and measurement results.

Overview

Every benchmark comes in a separate subfolder and with its own build scripts. This allows the individual configuration of every benchmark and also the use of different dependencies (e.g. the network benchmark b_eff might need a different BSP than the other benchmarks).

The included benchmarks are listed below. You can find more information on how to build the benchmarks in the appropriate subfolder.

  • b_eff: This application sends messages of varying sizes over the inter-FPGA network and measures the achieved bandwidth.
  • FFT: Executes multiple 1D FFTs of configurable size.
  • GEMM: Multiplies two square matrices similar to the GEMM routine implemented in BLAS.
  • LINPACK: Implementation of the LINPACK benchmark for FPGA without pivoting.
  • PTRANS: Transposes a square matrix.
  • RandomAccess: Executes updates on a data array following a pseudo-random number scheme.
  • STREAM: Implementation of the STREAM benchmark for FPGA.

The repository contains multiple submodules located in the extern folder.

General Build Setup

The build setup is very similar for all benchmarks in the suite. Every benchmark comes with a separate CMake project with host and device code.

General Dependencies

All benchmarks come with the following build dependencies:

  • CMake >= 3.13
  • C++ compiler with C++11 and <regex> support (GCC 4.9.0+)
  • Intel OpenCL FPGA SDK or Xilinx Vitis
  • Python 3 with jinja2 for code generation and pandas for the evaluation scripts.

Moreover, additional libraries are fetched by the build system during configuration. These dependencies will be downloaded automatically when configuring a benchmark for the first time. The exact versions used can be found in the CMakeLists.txt located in the extern directory, where all external dependencies are defined. Besides that, some benchmarks might need additional dependencies. More information can be found in the README located in the subfolder of each benchmark.

One key feature of all benchmarks in this suite is that they come with individual configuration options. They can be used to adjust the OpenCL base implementation of a benchmark for a specific FPGA architecture and to optimize performance and resource usage.

Configuration of a Benchmark

The configuration options are implemented as CMake build parameters and can be set when creating a new CMake build directory. We recommend creating a new build directory for each benchmark inside a folder build in the root directory of the project. You may want to create a folder hierarchy there; for example, to build the STREAM benchmark, create a folder build/STREAM and change into it. Initialize a new CMake build directory by calling

cmake PATH_TO_SOURCE_DIR

where PATH_TO_SOURCE_DIR would be ../../STREAM in the case of STREAM (the relative path to the source directory of the target benchmark). Some of the configuration options are the same for every benchmark and are given in the table below. FPGA_BOARD_NAME is particularly important because it specifies the target board. The DEFAULT_* options are used by the host code and can also be changed later at runtime. The given default values are used if no other values are set during configuration.

Name | Default | Description
DEFAULT_DEVICE | -1 | Index of the default device (-1 = ask)
DEFAULT_PLATFORM | -1 | Index of the default platform (-1 = ask)
DEFAULT_REPETITIONS | 10 | Number of times the kernel will be executed
FPGA_BOARD_NAME | p520_hpc_sg280l | Name of the target board

Additionally, the compile options for the Intel or Xilinx compiler have to be specified. For the Intel compiler these are:

Name | Default | Description
AOC_FLAGS | -fpc -fp-relaxed -no-interleaving=default | Additional Intel AOC compiler flags that are used for kernel compilation
INTEL_CODE_GENERATION_SETTINGS | "" | Path to the settings file that will be used as input for the code generator script. It may contain additional variables or functions.

For the Xilinx compiler, settings files also have to be specified for the compile and link steps. The available options are given in the following table:

Name | Default | Description
XILINX_COMPILE_FLAGS | -j 40 | Special compiler flags, such as the number of threads used for compilation
XILINX_COMPILE_SETTINGS_FILE | First settings.compile.xilinx.*.ini file found in the settings folder of the benchmark | Path to the file containing compile-time settings like the target clock frequency
XILINX_LINK_SETTINGS_FILE | First settings.link.xilinx.*.ini file found in the settings folder of the benchmark | Path to the file containing link settings like the mapping of the memory banks to the kernel parameters
XILINX_GENERATE_LINK_SETTINGS | Yes if the link settings file name ends with .generator.ini, No otherwise | Boolean flag indicating whether the link settings file will be used as a source to generate a link settings file, e.g. for a given number of kernel replications
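To illustrate what generating link settings for a given number of kernel replications can look like, here is a minimal Python sketch. It is not the code generator shipped with the suite. It assumes the v++ configuration-file syntax with a [connectivity] section and nk= / sp= entries. The kernel name my_kernel, the argument names, and the round-robin bank assignment are hypothetical.

```python
def generate_link_settings(kernel, replications, arguments, memory="DDR", num_banks=2):
    """Build link settings that map every kernel argument of every
    replication (compute unit) to a memory bank in round-robin order.

    The names and the bank assignment policy are illustrative assumptions.
    """
    lines = ["[connectivity]", f"nk={kernel}:{replications}"]
    port = 0
    # Compute units are assumed to be named <kernel>_1, <kernel>_2, ...
    for cu in range(1, replications + 1):
        for argument in arguments:
            bank = port % num_banks
            lines.append(f"sp={kernel}_{cu}.{argument}:{memory}[{bank}]")
            port += 1
    return "\n".join(lines)


settings = generate_link_settings("my_kernel", 2, ["in", "out"])
print(settings)
```

Changing the replication count only changes the arguments of the call, which is the benefit of generating the file rather than maintaining one hand-written file per configuration.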

When building a benchmark for Xilinx FPGAs, double-check the paths to the settings files and whether they match the target board. The settings files follow the naming convention:

settings.[compile|link].xilinx.KERNEL_NAME.[hbm|ddr](?.generator).ini

where KERNEL_NAME is the name of the target OpenCL kernel file and hbm or ddr is the type of global memory used.
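The naming convention above can be checked mechanically. The following Python sketch derives a regular expression from it and splits a file name into its parts; the function name and the example file name are illustrative assumptions, not part of the suite.

```python
import re

# Pattern derived from the naming convention quoted above. The optional
# ".generator" part marks files that serve as input for settings generation.
SETTINGS_RE = re.compile(
    r"^settings\.(?P<step>compile|link)\.xilinx\."
    r"(?P<kernel>.+)\.(?P<memory>hbm|ddr)"
    r"(?P<generator>\.generator)?\.ini$"
)


def parse_settings_name(filename):
    """Return the parts of a settings file name, or None if it does not match."""
    match = SETTINGS_RE.match(filename)
    if match is None:
        return None
    return {
        "step": match.group("step"),
        "kernel": match.group("kernel"),
        "memory": match.group("memory"),
        "is_generator": match.group("generator") is not None,
    }


# Hypothetical file name used only for illustration
info = parse_settings_name(
    "settings.link.xilinx.stream_kernels_single.hbm.generator.ini"
)
print(info)
```

A check like this can help to spot a settings file that targets the wrong memory type before a long synthesis run is started.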

All of these options can be passed to CMake with the -D flag:

cmake ../../RandomAccess -DFPGA_BOARD_NAME=my_board -D...

or after configuration using the UI with

ccmake ../../RandomAccess
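For scripted builds, the -D options can also be assembled programmatically. The Python sketch below only builds the argument list; the helper name cmake_command is an assumption, and the option values are taken from the defaults listed above.

```python
def cmake_command(source_dir, options):
    """Return a cmake argument list with one -DNAME=value entry per option."""
    args = ["cmake", source_dir]
    for name, value in options.items():
        args.append(f"-D{name}={value}")
    return args


command = cmake_command(
    "../../RandomAccess",
    {"FPGA_BOARD_NAME": "p520_hpc_sg280l", "DEFAULT_REPETITIONS": 10},
)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) from inside the build directory.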

Some benchmarks also provide prepared configurations for individual devices in the configs folder located in the source directory of each benchmark. These configurations can be used instead of manually setting every configuration option. To do this, call cmake as follows:

cmake -DHPCC_FPGA_CONFIG=path-to-config-file.cmake

This is also a way to contribute configuration best practices for specific devices.

For an overview of the current limitations of the benchmarks, refer to the subsection Notes on Vendor Compatibility. The following section shows the configuration and build steps with a more specific example.

Build and Test Example: STREAM for Intel OpenCL FPGA SDK and the Nallatech 520N

To configure and build the kernels of the STREAM benchmark, follow the steps below. The steps are very similar for all benchmarks of the suite.

Create a build directory to store the build configuration and intermediate files:

mkdir -p build/build-stream
cd build/build-stream

Configure the build using CMake and set the STREAM specific configuration options to match the target FPGA:

cmake ../../STREAM -DDEVICE_BUFFER_SIZE=8192 -DFPGA_BOARD_NAME=p520_hpc_sg280l \
    -DUSE_SVM=No -DNUM_REPLICATIONS=4

In this example, DEVICE_BUFFER_SIZE, USE_SVM and NUM_REPLICATIONS are configuration options specific to STREAM. Additional options can be found in the README for every benchmark.

The created build configuration can then be used to build and execute the tests and create a report for the OpenCL kernel:

# Build the tests, host code and emulation kernels
make all

# Execute the tests
make CL_CONTEXT_EMULATOR_DEVICE_INTELFPGA=1 test

# Create a report for the OpenCL kernel
make stream_kernels_single_report_intel

The report can be found within the build directory of the project e.g. under bin/stream_kernels_single/reports.

If the tests with the selected configuration succeed and the report shows a high resource utilization and no problems with the design, the kernel can be synthesized with:

# Synthesize the kernel
make stream_kernels_single

# Build the host code
make STREAM_FPGA_intel 

All artifacts can be found in the bin folder located in the current build directory.

Basic Functionality Testing

The subfolder scripts contains helper scripts that are used during the build and test process or for evaluation. When major changes are made to the code, the functionality should be checked by running all tests in the suite. To simplify this process, the script test_all.sh can be used to build all benchmarks with the default configuration and run all tests.

Code Documentation

The benchmark suite supports the generation of code documentation using Doxygen in HTML and LaTeX format. To generate the documentation, execute the following commands:

cd docs
doxygen doxy.config

The generated documentation will be placed in docs/html and docs/latex. To view the HTML documentation, open docs/html/index.html in a web browser.

More general documentation is maintained using Sphinx. It can be generated using the makefile provided in the docs/ folder. In this documentation, a more general description of the benchmarks and how to use them is given. The Sphinx documentation is also provided online. To generate the HTML documentation offline, execute the commands below:

cd docs
make html

The documentation will be created in the folder docs/source.

Custom Kernels

The benchmark suite also allows custom kernels to be used for measurements. Some basic setup is already done for all benchmarks to ease the integration of custom kernels into the build system. Every benchmark comes with a folder src/device/custom that is meant for customized or user-provided kernel designs. The folder already contains a CMakeLists.txt file that creates build targets for all OpenCL files (*.cl) in this folder and also creates tests for each custom kernel using CTest. Place the custom kernel design in this folder to use it within the build system of the benchmark suite. To enable custom kernel builds, the build environment has to be configured with the USE_CUSTOM_KERNEL_TARGETS flag. As an example, the following call will create a build environment that supports custom kernels for STREAM:

mkdir build; cd build
cmake ../STREAM -DUSE_CUSTOM_KERNEL_TARGETS=Yes

After adding a new custom kernel to the folder, rerun CMake to generate build targets and tests for this kernel. The same build commands that are used for the base implementations can then be used for the custom implementations. This also means that the custom kernels will be tested by the make test command.

This feature also makes it easy to add optimized kernel implementations for specific FPGA boards.

Notes on Vendor Compatibility

Current FPGA boards come with different types of memory that need specific support in the device or host code. The most common memory types that this overview is focusing on are:

  • Shared Virtual Memory (SVM): The FPGA directly accesses the data on the host's memory over the PCIe connection. No copying by the host is necessary, but the memory bandwidth of the FPGA is limited by the PCIe connection.
  • DDR: The FPGA board is equipped with one or more DDR memory banks and the host is in charge of copying the data back and forth over the PCIe connection. This allows higher memory bandwidths during kernel execution.
  • High Bandwidth Memory (HBM): The FPGA fabric itself is equipped with memory banks that can be accessed by the host to copy data. Compared to DDR, this memory type consists of more, but smaller memory banks so that the host needs to split the data between all memory banks to achieve the best performance. Still, the total achievable memory bandwidth is much higher compared to DDR.
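The need to split data between memory banks, mentioned for HBM above, can be sketched as follows. This is a simplified Python illustration of an even, contiguous split and not the host code of the suite; the function name and the bank count in the example are assumptions.

```python
def split_across_banks(data, num_banks):
    """Split data into num_banks contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(data), num_banks)
    chunks = []
    start = 0
    for bank in range(num_banks):
        # The first `extra` banks receive one additional element
        size = base + (1 if bank < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


chunks = split_across_banks(list(range(10)), 4)
print([len(chunk) for chunk in chunks])
```

Each chunk would then be copied to a buffer placed in its own memory bank, so that all banks contribute to the total bandwidth during kernel execution.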

The benchmarks LINPACK, PTRANS, and b_eff, which stress inter-FPGA communication, use MPI and PCIe for communication over the host to ensure compatibility with multi-FPGA systems without special requirements on the communication interfaces used.

The following three tables give an overview of the compatibility of all benchmarks that use global memory with the three memory types mentioned above. b_eff uses global memory only for validation; still, support for the different memory types needs to be implemented on the host side. Full support of a benchmark is indicated with Yes, functionally correct behavior with performance limitations is indicated with (Yes), and no support is indicated with No. For Xilinx, all benchmarks need compatible compile and link settings files to map the kernel memory ports to the available memory banks. LINPACK, PTRANS and b_eff are currently not working with Xilinx FPGAs because the implementations lack support for inter-FPGA communication on these devices. Support will be added subsequently.

DDR memory

Benchmark Intel Xilinx
STREAM Yes Yes
RandomAccess Yes Yes
PTRANS Yes Yes
LINPACK Yes Yes
GEMM Yes Yes
FFT Yes Yes*
b_eff Yes Yes

*only with XRT <=2.8 because of OpenCL pipe support

HBM

(Yes) indicates that the benchmarks can be executed with HBM, but not all available memory banks can be used. For Intel, the device code has to be modified to make it compatible with HBM.

Benchmark Intel Xilinx
STREAM Yes Yes
RandomAccess Yes Yes
PTRANS No No
LINPACK No No
GEMM Yes Yes
FFT Yes Yes
b_eff No No

SVM

SVM has not yet been tested with Xilinx-based boards. Thus, it is considered as not working there.

Benchmark Intel Xilinx
STREAM Yes No
RandomAccess Yes No
PTRANS No No
LINPACK No No
GEMM Yes No
FFT Yes No
b_eff No No

Publications

If you are using one of the benchmarks contained in the HPCC FPGA benchmark suite, please consider citing us.

Bibtex

@INPROCEEDINGS{hpcc_fpga,
    author={M. {Meyer} and T. {Kenter} and C. {Plessl}},
    booktitle={2020 IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC)}, 
    title={Evaluating FPGA Accelerator Performance with a Parameterized OpenCL Adaptation of Selected Benchmarks of the HPCChallenge Benchmark Suite}, 
    year={2020},
    pages={10-18},
    organization={IEEE},
    doi={10.1109/H2RC51942.2020.00007}
}


@article{hpcc_fpga_in_depth,
    author = {Marius Meyer and Tobias Kenter and Christian Plessl},
    doi = {10.1016/j.jpdc.2021.10.007},
    issn = {0743-7315},
    journal = {Journal of Parallel and Distributed Computing},
    keywords = {FPGA, OpenCL, High level synthesis, HPC benchmarking},
    pages = {79-89},
    title = {In-depth FPGA accelerator performance evaluation with single node benchmarks from the HPC challenge benchmark suite for Intel and Xilinx FPGAs using OpenCL},
    url = {https://www.sciencedirect.com/science/article/pii/S0743731521002057},
    volume = {160},
    year = {2022}
}

If the focus is on multi-FPGA execution and inter-FPGA communication, you may rather want to cite

@article{hpcc_multi_fpga, 
    author = {Meyer, Marius and Kenter, Tobias and Plessl, Christian},
    title = {Multi-FPGA Designs and Scaling of HPC Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks}, 
    year = {2023}, 
    publisher = {Association for Computing Machinery}, 
    address = {New York, NY, USA}, 
    issn = {1936-7406}, 
    url = {https://doi.org/10.1145/3576200}, 
    doi = {10.1145/3576200}
 }

hpcc_fpga's People

Contributors

dstansby, kenter, mellich, papeg

hpcc_fpga's Issues

Building on AWS F1 instance

Hello,

I'm trying to build HPCC_FPGA on AWS using an F1 instance.
I tried both with "FPGA Developer AMI (Centos 7)" and "FPGA Developer AMI (Amazon Linux 2)" with GCC 10 but during cmake execution I got some errors:

  • Looking for pthread_create in pthreads - not found
  • ERROR Xilinx Vitis or Intel FPGA OpenCL SDK required!

I sourced all required environment scripts so the vitis_hls command is found:
which vitis_hls
/opt/Xilinx/Vitis_HLS/2020.2/bin/vitis_hls

What OS did you use to build?

Thanks.

LINPACK on Xilinx U280: invalid port or argument name: m_axi_gmem0

I am attempting to build the torus kernel for the LINPACK benchmark, but the build errors out in the link stage due to an invalid port mapping. I'm not sure I understand why this issue is occurring, but my guess would be that the m_axi_gmemX ports are expected to be specified within the kernel. This error seems to imply the final kernel code is not being generated correctly. Is there a setting in my build that is missing? I expected the config file to take care of most of the gotchas, since U280s seem to be supported by the benchmark.

cd LINPACK
mkdir build
cd build
cmake .. -DVitis_INCLUDE_DIRS=/opt/software/FPGA/Xilinx/Vitis/2021.2/include -DVitis_FLOATING_POINT_LIBRARY=/opt/software/FPGA/Xilinx/Vitis_HLS/2021.2/lnx64/tools/fpo_v7_0/libIp_floating_point_v7_0_bitacc_cmodel.so -DHPCC_FPGA_CONFIG=../configs/Xilinx_U280_B8_SB3_R2.cmake -DMPI_C=$HOME/repos/mvapich2/install/lib/libmpi.so -DMPI_CXX=$HOME/repos/mvapich2/install/lib/libmpi.so
make hpl_torus_PCIE_xilinx
...
[ 50%] Generating ../../bin/hpl_torus_PCIE.xclbin
WARNING: [v++ 60-1600] The option 'jobs' was used directly on the command line, where its usage is deprecated. To ensure input line works for supported operating systems or shells, v++ supports specification for some options in a configuration file. As an alternative, please use options 'hls.jobs', 'vivado.synth.jobs' in a configuration file. 
Option Map File Used: '/opt/software/FPGA/Xilinx/Vitis/2021.2/data/vitis/vpp/optMap.xml'

****** v++ v2021.2 (64-bit)
  **** SW Build 3363252 on 2021-10-14-04:41:01
    ** Copyright 1986-2020 Xilinx, Inc. All Rights Reserved.

INFO: [v++ 60-1306] Additional information associated with this v++ link can be found at:
	Reports: /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/bin/xilinx_reports/link
	Log files: /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/bin/xilinx_reports/logs/link
Running Dispatch Server on port: 34327
INFO: [v++ 60-1548] Creating build summary session with primary output /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/bin/hpl_torus_PCIE.xclbin.link_summary, at Tue Dec 20 16:47:58 2022
INFO: [v++ 60-1316] Initiating connection to rulecheck server, at Tue Dec 20 16:47:58 2022
INFO: [v++ 60-1315] Creating rulecheck session with output '/upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/bin/xilinx_reports/link/v++_link_hpl_torus_PCIE_guidance.html', at Tue Dec 20 16:47:59 2022
INFO: [v++ 60-895]   Target platform: /opt/software/FPGA/Xilinx/platforms/xilinx_u280_xdma_201920_3_3246211/xilinx_u280_xdma_201920_3.xpfm
INFO: [v++ 60-1578]   This platform contains Xilinx Shell Archive '/opt/software/FPGA/Xilinx/platforms/xilinx_u280_xdma_201920_3_3246211/hw/xilinx_u280_xdma_201920_3.xsa'
INFO: [v++ 74-78] Compiler Version string: 2021.2
INFO: [v++ 60-1302] Platform 'xilinx_u280_xdma_201920_3.xpfm' has been explicitly enabled for this release.
INFO: [v++ 60-629] Linking for hardware target
INFO: [v++ 60-423]   Target device: xilinx_u280_xdma_201920_3
INFO: [v++ 60-1332] Run 'run_link' status: Not started
INFO: [v++ 60-1443] [16:48:13] Run run_link: Step system_link: Started
INFO: [v++ 60-1453] Command Line: system_link --xo /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/xilinx_tmp_compile/hpl_torus_PCIE.xo --config /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/int/syslinkConfig.ini --xpfm /opt/software/FPGA/Xilinx/platforms/xilinx_u280_xdma_201920_3_3246211/xilinx_u280_xdma_201920_3.xpfm --target hw --output_dir /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/int --temp_dir /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link
INFO: [v++ 60-1454] Run Directory: /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/run_link
INFO: [SYSTEM_LINK 60-1316] Initiating connection to rulecheck server, at Tue Dec 20 16:48:14 2022
INFO: [SYSTEM_LINK 82-70] Extracting xo v3 file /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/xilinx_tmp_compile/hpl_torus_PCIE.xo
INFO: [SYSTEM_LINK 82-53] Creating IP database /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/_sysl/.cdb/xd_ip_db.xml
INFO: [SYSTEM_LINK 82-38] [16:48:27] build_xd_ip_db started: /opt/software/FPGA/Xilinx/Vitis/2021.2/bin/build_xd_ip_db -ip_search 0  -sds-pf /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/xilinx_u280_xdma_201920_3.hpfm -clkid 0 -ip /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/iprepo/xilinx_com_hls_lu_1_0,lu -ip /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/iprepo/xilinx_com_hls_top_update_1_0,top_update -ip /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/iprepo/xilinx_com_hls_inner_update_mm0_1_0,inner_update_mm0 -ip /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/iprepo/xilinx_com_hls_left_update_1_0,left_update -o /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/_sysl/.cdb/xd_ip_db.xml
INFO: [SYSTEM_LINK 82-37] [16:48:32] build_xd_ip_db finished successfully
Time (s): cpu = 00:00:04 ; elapsed = 00:00:05 . Memory (MB): peak = 2369.379 ; gain = 0.000 ; free physical = 392220 ; free virtual = 422552
INFO: [SYSTEM_LINK 82-51] Create system connectivity graph
INFO: [SYSTEM_LINK 82-102] Applying explicit connections to the system connectivity graph: /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/cfgraph/cfgen_cfgraph.xml
INFO: [SYSTEM_LINK 82-38] [16:48:32] cfgen started: /opt/software/FPGA/Xilinx/Vitis/2021.2/bin/cfgen  -nk lu:1 -nk left_update:1 -nk top_update:1 -nk inner_update_mm0:2 -slr lu_1:SLR0 -slr left_update_1:SLR0 -slr top_update_1:SLR0 -slr inner_update_mm0_1:SLR1 -slr inner_update_mm0_2:SLR2 -sp lu_1.m_axi_gmem0:DDR[0] -sp lu_1.m_axi_gmem1:DDR[0] -sp lu_1.m_axi_gmem2:DDR[1] -sp top_update_1.m_axi_gmem0:DDR[0] -sp top_update_1.m_axi_gmem1:DDR[0] -sp top_update_1.m_axi_gmem2:DDR[0] -sp left_update_1.m_axi_gmem0:DDR[0] -sp left_update_1.m_axi_gmem1:DDR[1] -sp left_update_1.m_axi_gmem2:DDR[1] -sp inner_update_mm0_1.m_axi_gmem0:DDR[0] -sp inner_update_mm0_1.m_axi_gmem1:DDR[1] -sp inner_update_mm0_1.m_axi_gmem2:DDR[0] -sp inner_update_mm0_2.m_axi_gmem0:DDR[0] -sp inner_update_mm0_2.m_axi_gmem1:DDR[1] -sp inner_update_mm0_2.m_axi_gmem2:DDR[0] -dmclkid 0 -r /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/_sysl/.cdb/xd_ip_db.xml -o /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/cfgraph/cfgen_cfgraph.xml
INFO: [CFGEN 83-0] Kernel Specs: 
INFO: [CFGEN 83-0]   kernel: lu, num: 1  {lu_1}
INFO: [CFGEN 83-0]   kernel: left_update, num: 1  {left_update_1}
INFO: [CFGEN 83-0]   kernel: top_update, num: 1  {top_update_1}
INFO: [CFGEN 83-0]   kernel: inner_update_mm0, num: 2  {inner_update_mm0_1 inner_update_mm0_2}
INFO: [CFGEN 83-0] Port Specs: 
INFO: [CFGEN 83-0]   kernel: lu_1, k_port: m_axi_gmem0, sptag: DDR[0]
INFO: [CFGEN 83-0]   kernel: lu_1, k_port: m_axi_gmem1, sptag: DDR[0]
INFO: [CFGEN 83-0]   kernel: lu_1, k_port: m_axi_gmem2, sptag: DDR[1]
INFO: [CFGEN 83-0]   kernel: top_update_1, k_port: m_axi_gmem0, sptag: DDR[0]
INFO: [CFGEN 83-0]   kernel: top_update_1, k_port: m_axi_gmem1, sptag: DDR[0]
INFO: [CFGEN 83-0]   kernel: top_update_1, k_port: m_axi_gmem2, sptag: DDR[0]
INFO: [CFGEN 83-0]   kernel: left_update_1, k_port: m_axi_gmem0, sptag: DDR[0]
INFO: [CFGEN 83-0]   kernel: left_update_1, k_port: m_axi_gmem1, sptag: DDR[1]
INFO: [CFGEN 83-0]   kernel: left_update_1, k_port: m_axi_gmem2, sptag: DDR[1]
INFO: [CFGEN 83-0]   kernel: inner_update_mm0_1, k_port: m_axi_gmem0, sptag: DDR[0]
INFO: [CFGEN 83-0]   kernel: inner_update_mm0_1, k_port: m_axi_gmem1, sptag: DDR[1]
INFO: [CFGEN 83-0]   kernel: inner_update_mm0_1, k_port: m_axi_gmem2, sptag: DDR[0]
INFO: [CFGEN 83-0]   kernel: inner_update_mm0_2, k_port: m_axi_gmem0, sptag: DDR[0]
INFO: [CFGEN 83-0]   kernel: inner_update_mm0_2, k_port: m_axi_gmem1, sptag: DDR[1]
INFO: [CFGEN 83-0]   kernel: inner_update_mm0_2, k_port: m_axi_gmem2, sptag: DDR[0]
INFO: [CFGEN 83-0] SLR Specs: 
INFO: [CFGEN 83-0]   instance: inner_update_mm0_1, SLR: SLR1
INFO: [CFGEN 83-0]   instance: inner_update_mm0_2, SLR: SLR2
INFO: [CFGEN 83-0]   instance: left_update_1, SLR: SLR0
INFO: [CFGEN 83-0]   instance: lu_1, SLR: SLR0
INFO: [CFGEN 83-0]   instance: top_update_1, SLR: SLR0
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem0
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem1
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem2
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem0
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem1
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem2
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem0
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem1
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem2
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem0
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem1
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem2
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem0
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem1
ERROR: [CFGEN 83-2292] --sp tag applied to an invalid port or argument name: m_axi_gmem2
ERROR: [CFGEN 83-2298] Exiting due to previous error
ERROR: [SYSTEM_LINK 82-36] [16:48:35] cfgen failed
Time (s): cpu = 00:00:03 ; elapsed = 00:00:03 . Memory (MB): peak = 2369.379 ; gain = 0.000 ; free physical = 391997 ; free virtual = 422329
ERROR: [SYSTEM_LINK 82-62] Error generating design file for /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/cfgraph/cfgen_cfgraph.xml, command: /opt/software/FPGA/Xilinx/Vitis/2021.2/bin/cfgen  -nk lu:1 -nk left_update:1 -nk top_update:1 -nk inner_update_mm0:2 -slr lu_1:SLR0 -slr left_update_1:SLR0 -slr top_update_1:SLR0 -slr inner_update_mm0_1:SLR1 -slr inner_update_mm0_2:SLR2 -sp lu_1.m_axi_gmem0:DDR[0] -sp lu_1.m_axi_gmem1:DDR[0] -sp lu_1.m_axi_gmem2:DDR[1] -sp top_update_1.m_axi_gmem0:DDR[0] -sp top_update_1.m_axi_gmem1:DDR[0] -sp top_update_1.m_axi_gmem2:DDR[0] -sp left_update_1.m_axi_gmem0:DDR[0] -sp left_update_1.m_axi_gmem1:DDR[1] -sp left_update_1.m_axi_gmem2:DDR[1] -sp inner_update_mm0_1.m_axi_gmem0:DDR[0] -sp inner_update_mm0_1.m_axi_gmem1:DDR[1] -sp inner_update_mm0_1.m_axi_gmem2:DDR[0] -sp inner_update_mm0_2.m_axi_gmem0:DDR[0] -sp inner_update_mm0_2.m_axi_gmem1:DDR[1] -sp inner_update_mm0_2.m_axi_gmem2:DDR[0] -dmclkid 0 -r /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/_sysl/.cdb/xd_ip_db.xml -o /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/LINPACK/build/src/device/_x/link/sys_link/cfgraph/cfgen_cfgraph.xml
ERROR: [SYSTEM_LINK 82-96] Error applying explicit connections to the system connectivity graph
ERROR: [SYSTEM_LINK 82-79] Unable to create system connectivity graph
INFO: [v++ 60-1442] [16:48:35] Run run_link: Step system_link: Failed
Time (s): cpu = 00:00:12 ; elapsed = 00:00:23 . Memory (MB): peak = 2265.199 ; gain = 0.000 ; free physical = 392006 ; free virtual = 422334
ERROR: [v++ 60-661] v++ link run 'run_link' failed
ERROR: [v++ 60-626] Kernel link failed to complete
ERROR: [v++ 60-703] Failed to finish linking
INFO: [v++ 60-1653] Closing dispatch client.
make[3]: *** [src/device/CMakeFiles/hpl_torus_PCIE_xilinx.dir/build.make:75: bin/hpl_torus_PCIE.xclbin] Error 1
make[2]: *** [CMakeFiles/Makefile2:501: src/device/CMakeFiles/hpl_torus_PCIE_xilinx.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:508: src/device/CMakeFiles/hpl_torus_PCIE_xilinx.dir/rule] Error 2
make: *** [Makefile:283: hpl_torus_PCIE_xilinx] Error 2

Error with buffer creation in FFT benchmark

Hi,

When the FFT benchmark is configured with NUM_REPLICATIONS > 4, the make CL_CONTEXT_EMULATOR_DEVICE_INTELFPGA=1 test command fails with the following message:

Error: Invalid or unsupported flags
ERROR in OpenCL library detected! Aborting.
<root_dir>/FFT/src/host/execution_default.cpp:66: CL_INVALID_VALUE
An error occured while executing the benchmark:
An OpenCL error occured: CL_INVALID_VALUE

This is the code causing the error:

outBuffers.push_back(cl::Buffer(*config.context, CL_MEM_WRITE_ONLY | (config.programSettings->useMemoryInterleaving ? 0 : (((2 * r) + 2) << 16)), (1 << LOG_FFT_SIZE) * iterations_per_kernel * 2 * sizeof(HOST_DATA_TYPE), NULL, &err));
ASSERT_CL(err)

The error seems to arise from the flag value, CL_MEM_WRITE_ONLY | (config.programSettings->useMemoryInterleaving ? 0 : (((2 * r) + 2) << 16)). I am curious about the purpose of the ((2 * r) + 2) << 16 operation and how it relates to the memory flag.

Any inputs regarding the error and potential ways to fix it will be hugely appreciated.

Thank you.
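To make the arithmetic in the question concrete, the Python sketch below evaluates the flag expression from the quoted host code for the first few replication indices r. CL_MEM_WRITE_ONLY has the value 1 << 1 in the OpenCL headers. That the bits from position 16 upward select a vendor-specific memory bank is an assumption here and is not confirmed in the post.

```python
CL_MEM_WRITE_ONLY = 1 << 1  # value defined in the OpenCL headers


def out_buffer_flags(r, use_memory_interleaving=False):
    """Reproduce the flag expression from the quoted cl::Buffer call."""
    bank_bits = 0 if use_memory_interleaving else ((2 * r) + 2) << 16
    return CL_MEM_WRITE_ONLY | bank_bits


for r in range(6):
    flags = out_buffer_flags(r)
    # Print the replication index, the flag value, and the value stored in the upper bits
    print(r, hex(flags), flags >> 16)
```

For r = 4, the fifth replication, the upper bits hold the value 10. Whether the emulator rejects values in this range is not established by the post, so this only shows where the flag value changes with the replication count.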

STREAM benchmark: failed to load xclbin: Invalid argument

I am trying to get one of these benchmarks working on the Noctua2 system. When I build the emulated kernel and attempt to run, I get the following output:

-------------------------------------------------------------
General setup:
C++ high resolution clock is used.
The clock precision seems to be 1.00000e+01ns
-------------------------------------------------------------
Selected Platform: Xilinx
Multiple devices have been found. Select the device by typing a number:
0) xilinx_u280_xdma_201920_3
1) xilinx_u280_xdma_201920_3
2) xilinx_u280_xdma_201920_3
Enter device id [0-2]:0
-------------------------------------------------------------
Selection summary:
Platform Name: Xilinx
Device Name:   xilinx_u280_xdma_201920_3
-------------------------------------------------------------
-------------------------------------------------------------
FPGA Setup:./bin/stream_kernels_single_emulate.xclbin
XRT build version: 2.12.429
Build hash: 2180e838abe791cb1e90d9011bbc8b3676774172
Build date: 2022-04-08 11:43:35
Git branch: 2021.2_RHEL8.5
PID: 1159967
UID: 92395
[Mon Dec 19 17:29:04 2022 GMT]
HOST: n2fpga14
EXE: /upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/STREAM/build/bin/STREAM_FPGA_xilinx
[XRT] ERROR: See dmesg log for details. err=-2
[XRT] ERROR: failed to load xclbin: Invalid argument
ERROR in OpenCL library detected! Aborting.
/upb/departments/pc2/users/m/mpifpga2/repos/HPCC_FPGA/shared/setup/fpga_setup.cpp:168: CL_OUT_OF_HOST_MEMORY
An error occured while setting up the benchmark: 
	An OpenCL error occured: CL_OUT_OF_HOST_MEMORY
Benchmark execution started without successfully running the benchmark setup!

These are the steps I'm taking to build and run the benchmark:

cd STREAM
mkdir build
cd build
cmake .. -DVitis_INCLUDE_DIRS=/opt/software/FPGA/Xilinx/Vitis/2021.2/include -DVitis_FLOATING_POINT_LIBRARY=/opt/software/FPGA/Xilinx/Vitis_HLS/2022.1/lnx64/tools/fpo_v7_0/libIp_floating_point_v7_0_bitacc_cmodel.so  -DHPCC_FPGA_CONFIG=$PWD/../configs/Xilinx_U280_DP.cmake -DMPI_C=$HOME/repos/mvapich2/install/lib/libmpi.so -DMPI_CXX=$HOME/repos/mvapich2/install/lib/libmpi.so
make all
CL_CONTEXT_EMULATOR_DEVICE=1 srun -p fpga -N 1 --constraint=xilinx_u280_xrt2.12 -t 00:30:00 ./bin/STREAM_FPGA_test_xilinx -f ./bin/stream_kernels_single_emulate.xclbin

Software versions:
XRT: v2.12
Vitis: v21.2
Device Platform: u280_xdma_201920_3_3246211
HPCC_FPGA: v0.5.1

It seems as if the xclbin is not being generated correctly. Am I missing a build step or is this a bug in the build?
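One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the Xilinx runtime (XRT) selects emulation through the `XCL_EMULATION_MODE` environment variable, whose accepted values are `sw_emu` and `hw_emu`. The command above does not set it. The sketch below is not part of HPCC FPGA; the function names `classify_emulation_mode` and `current_emulation_mode` are hypothetical helpers that only report how that variable is set.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of HPCC FPGA or XRT).
// Classifies a raw XCL_EMULATION_MODE value:
//   "unset"   - variable missing or empty
//   "sw_emu"  - software emulation requested
//   "hw_emu"  - hardware emulation requested
//   "invalid" - any other value
std::string classify_emulation_mode(const char* value) {
    if (value == nullptr || *value == '\0') {
        return "unset";
    }
    const std::string mode(value);
    if (mode == "sw_emu" || mode == "hw_emu") {
        return mode;
    }
    return "invalid";
}

// Reads the variable from the current process environment.
std::string current_emulation_mode() {
    return classify_emulation_mode(std::getenv("XCL_EMULATION_MODE"));
}
```

Calling `current_emulation_mode()` in the host process before the xclbin is loaded would show whether the job actually sees the variable, which can differ from the login shell when the program is launched through `srun`.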

Supported devices

Hello,
while trying to compile the GEMM benchmark I get the following linking error:
INFO: [v++ 60-423] Target device: zcu106_base
ERROR: [v++ 60-1139] Failed to parse --slr option: SLRs are not available in the specified platform.
Is it possible that my platform is not supported? If so, is there something I can do about it?
Thanks

cl.hpp missing - OpenCL version mismatch?

Hello everyone,
I was trying to run the GEMM benchmark. The CMake configuration succeeds, but the build fails with the error "CL/cl.hpp: no such file or directory". My /usr/include/CL directory does not contain that file. Might this be due to a mismatch in our OpenCL versions? My cl2.hpp file mentions version 2.0.7.
I have also tried setting USE_DEPRECATED_CPP_HEADER to false; in that case the build fails with a syntax error in cl2.hpp, line 7841: expected ; at the end of method declaration.
Thanks
Pietro

Memory binding error with STREAM and RandomAccess

Hi,
I tried compiling and running the STREAM and RandomAccess benchmarks for the BittWare 520N-MX platform which has a Stratix 10 MX FPGA.
For both benchmarks, configuration options were set to their respective default values.
While running the benchmarks, I receive the following error:
acl_mem.cpp:333: int acl_bind_buffer_to_device(cl_device_id, cl_mem): Assertion 'mem' failed.

I couldn't find any resources online to debug this. Please help me with this.

Error while building RandomAccess benchmark

Hi,

I am trying to build and test the RandomAccess benchmark for the Nallatech 520N card. After following the steps outlined in the README and running make all, I get the following output:

Scanning dependencies of target hpcc_fpga_base
[ 4%] Building CXX object lib/hpccbase/CMakeFiles/hpcc_fpga_base.dir/setup/fpga_setup.cpp.o
In file included from /home/UFAD/g.ramesh/HPCC_FPGA/shared/include/setup/fpga_setup.hpp:36:0,
from /home/UFAD/g.ramesh/HPCC_FPGA/shared/setup/fpga_setup.cpp:5:
/tools/Intel/quartus_pro/19.4.0.64/hld/host/include/CL/cl.hpp:155:110: note: #pragma message: This version of the OpenCL Host API C++ bindings is deprecated, please use cl2.hpp instead.
#pragma message("This version of the OpenCL Host API C++ bindings is deprecated, please use cl2.hpp instead.")
^
In file included from /home/UFAD/g.ramesh/HPCC_FPGA/shared/setup/fpga_setup.cpp:5:0:
/home/UFAD/g.ramesh/HPCC_FPGA/shared/include/setup/fpga_setup.hpp:130:5: error: ‘unique_ptr’ in namespace ‘std’ does not name a type
std::unique_ptr<cl::Program>
^
/home/UFAD/g.ramesh/HPCC_FPGA/shared/include/setup/fpga_setup.hpp:156:5: error: ‘unique_ptr’ in namespace ‘std’ does not name a type
std::unique_ptr<cl::Device>
^
/home/UFAD/g.ramesh/HPCC_FPGA/shared/setup/fpga_setup.cpp:129:5: error: ‘unique_ptr’ in namespace ‘std’ does not name a type
std::unique_ptr<cl::Program>
^
/home/UFAD/g.ramesh/HPCC_FPGA/shared/setup/fpga_setup.cpp:213:5: error: ‘unique_ptr’ in namespace ‘std’ does not name a type
std::unique_ptr<cl::Device>
^
make[2]: *** [lib/hpccbase/CMakeFiles/hpcc_fpga_base.dir/setup/fpga_setup.cpp.o] Error 1
make[1]: *** [lib/hpccbase/CMakeFiles/hpcc_fpga_base.dir/all] Error 2
make: *** [all] Error 2

Consequently, I made the following changes to the file "fpga_setup.hpp" located in "HPCC_FPGA/shared/include/setup/":

  1. Added the following preprocessor macros:
    #define CL_HPP_TARGET_OPENCL_VERSION 120
    #define CL_HPP_MINIMUM_OPENCL_VERSION 120
  2. Changed the included header file from #include "CL/cl.hpp" to #include "CL/cl2.hpp"

I then ran make clean && make all and got this output:

[ 4%] Building CXX object lib/hpccbase/CMakeFiles/hpcc_fpga_base.dir/setup/fpga_setup.cpp.o
In file included from /home/UFAD/g.ramesh/HPCC_FPGA/shared/setup/fpga_setup.cpp:5:0:
/home/UFAD/g.ramesh/HPCC_FPGA/shared/include/setup/fpga_setup.hpp:130:5: error: ‘unique_ptr’ in namespace ‘std’ does not name a type
std::unique_ptr<cl::Program>
^
/home/UFAD/g.ramesh/HPCC_FPGA/shared/include/setup/fpga_setup.hpp:156:5: error: ‘unique_ptr’ in namespace ‘std’ does not name a type
std::unique_ptr<cl::Device>
^
/home/UFAD/g.ramesh/HPCC_FPGA/shared/setup/fpga_setup.cpp:129:5: error: ‘unique_ptr’ in namespace ‘std’ does not name a type
std::unique_ptr<cl::Program>
^
/home/UFAD/g.ramesh/HPCC_FPGA/shared/setup/fpga_setup.cpp:213:5: error: ‘unique_ptr’ in namespace ‘std’ does not name a type
std::unique_ptr<cl::Device>
^
make[2]: *** [lib/hpccbase/CMakeFiles/hpcc_fpga_base.dir/setup/fpga_setup.cpp.o] Error 1
make[1]: *** [lib/hpccbase/CMakeFiles/hpcc_fpga_base.dir/all] Error 2
make: *** [all] Error 2

I am unable to figure out the solution to this error. Please help me.

Relevant details:
OS: CentOS Linux 7
Cmake version: 3.17.3
Python3 not installed
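
For reference, the diagnostic "'unique_ptr' in namespace 'std' does not name a type" generally appears when the `<memory>` header is not included or when the compiler is not running in C++11 mode or newer. Which of the two applies here is not confirmed by the log. The sketch below is a standalone illustration, not the repository's code: `Program` and `make_program` are hypothetical stand-ins for an OpenCL wrapper type and a setup function.

```cpp
#include <memory>   // declares std::unique_ptr
#include <string>

// std::unique_ptr only exists from C++11 onwards.
#if __cplusplus < 201103L
#error "Compile with -std=c++11 or newer"
#endif

// Hypothetical stand-in for a wrapper type such as cl::Program.
struct Program {
    std::string name;
};

// Hypothetical factory that returns an owning pointer, mirroring the
// return type shown in the compiler output above.
// std::make_unique is avoided because it requires C++14.
std::unique_ptr<Program> make_program(const std::string& name) {
    return std::unique_ptr<Program>(new Program{name});
}
```

If this compiles with the same compiler and flags that CMake uses for the benchmark, the toolchain itself supports `std::unique_ptr`, and the missing piece is more likely an include than the language standard.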
