Intel® FPGA SDK for OpenCL™ Standard Edition: Programming Guide

ID 683342
Date 4/22/2019
Public

A.1.1. OpenCL 1.0 C Programming Language Implementation

The OpenCL™ C programming language is based on C99 with some limitations. Section 6 of the OpenCL Specification version 1.0 describes the language. The Intel® FPGA SDK for OpenCL™ conforms with the OpenCL C programming language, subject to the clarifications and exceptions listed here. The table below summarizes the support status of each feature in the SDK's implementation of the language. Features that are supported with no additional clarification are not shown.

Support Status column legend:

Symbol Description
The feature is supported. The Notes column might contain a clarification.
The feature is supported with exceptions identified in the Notes column.
X The feature is not supported.
Section Feature Support Status Notes
6.1.1 Built-in Scalar Data Types
double precision float Preliminary support for the double precision float built-in scalar data type. This feature might not conform with the OpenCL Specification version 1.0.

Currently, the following double precision floating-point operations and functions conform with the OpenCL Specification version 1.0:

add / subtract / multiply / divide / ceil / floor / rint / trunc / fabs / fmax / fmin / sqrt / rsqrt / exp / exp2 / exp10 / log / log2 / log10 / sin / cos / asin / acos / sinh / cosh / tanh / asinh / acosh / atanh / pow / pown / powr / atan / atan2 / ldexp / log1p / sincos

half precision float Support for scalar addition, subtraction and multiplication. Support for conversions to and from single-precision floating point. This feature might not conform with the OpenCL Specification version 1.0.

This feature is supported in the Emulator.

6.1.2 Built-in Vector Data Types

Preliminary support for vectors with three elements. Three-element vector support is a supplement to the OpenCL Specification version 1.0.

6.1.3 Other Built-in Data Types X
6.1.4 Reserved Data Types X
6.1.5 Alignment of Types All scalar and vector types are aligned as required (vectors with three elements are aligned as if they had four elements).
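For host code that fills buffers of three-element vectors, the alignment rule in Section 6.1.5 implies that each element occupies the footprint of a four-element vector. The C11 sketch below models that layout for a three-element float vector. The names float3_model and float3_buffer_bytes are hypothetical and are not SDK types; the sketch only illustrates the sizing arithmetic.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical host-side model of a three-element float vector.
   It is aligned as if it had four elements, so the compiler pads it
   from 12 bytes to 16 bytes. */
typedef struct {
    alignas(4 * sizeof(float)) float s[3];
} float3_model;

/* Compile-time check of the layout this sketch relies on. */
static_assert(sizeof(float3_model) == 4 * sizeof(float),
              "three-element vector should occupy four elements");

/* Bytes needed for a buffer that holds n three-element vectors. */
static size_t float3_buffer_bytes(size_t n) {
    return n * sizeof(float3_model);
}
```

Sizing a buffer as n * 3 * sizeof(float) would therefore be too small under this rule.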
6.2.1 Implicit Conversions Refer to Section 6.2.6: Usual Arithmetic Conversions in the OpenCL Specification version 1.2 for an important clarification of implicit conversions between scalar and vector types.
6.2.2 Explicit Casts The SDK allows scalar data casts to a vector with a different element type.
6.5 Address Space Qualifiers Function-scope __constant variables are not supported.
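One way to work within the Section 6.5 restriction is to declare constant data at program scope rather than inside the kernel body. The sketch below assumes a hypothetical kernel apply_taps and a hypothetical coefficient table taps. The #ifndef __OPENCL_VERSION__ shim is an assumption added only so that the same source also compiles as plain C for host-side checking; it is not part of the SDK.

```c
/* Shim (assumption, not part of the SDK): lets this kernel source also
   build as plain C so that its logic can be checked on the host. */
#ifndef __OPENCL_VERSION__
#include <assert.h>
#define __kernel
#define __global
#define __constant const
#endif

/* Program-scope __constant data. Declaring this table inside the
   kernel body would be the unsupported function-scope form. */
__constant int taps[4] = {1, 3, 3, 1};

/* Hypothetical single work-item kernel:
   out[i] = taps[0]*in[i] + ... + taps[3]*in[i+3].
   The input must hold at least n + 3 elements. */
__kernel void apply_taps(__global const int *in, __global int *out, int n) {
    for (int i = 0; i < n; i++) {
        int acc = 0;
        for (int k = 0; k < 4; k++) {
            acc += taps[k] * in[i + k];
        }
        out[i] = acc;
    }
}
```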
6.6 Image Access Qualifiers X
6.7 Function Qualifiers
6.7.2 Optional Attribute Qualifiers Refer to the Intel® FPGA SDK for OpenCL™ Best Practices Guide for tips on using reqd_work_group_size to improve kernel performance.

The SDK parses but ignores the vec_type_hint and work_group_size_hint attribute qualifiers.

6.9 Preprocessor Directives and Macros
#pragma directive: #pragma unroll The Intel® FPGA SDK for OpenCL™ Offline Compiler supports only #pragma unroll. You may assign an integer argument to the unroll directive to control the extent of loop unrolling.

For example, #pragma unroll 4 unrolls four iterations of a loop.

By default, an unroll directive with no unroll factor causes the offline compiler to attempt to unroll the loop fully.

Refer to the Intel® FPGA SDK for OpenCL™ Best Practices Guide for tips on using #pragma unroll to improve kernel performance.
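To make the effect of an unroll factor concrete, the plain C sketch below writes out by hand what #pragma unroll 4 requests. In a kernel, you write only the rolled loop and place the pragma on the line above it. The function names sum_rolled and sum_unrolled4 are hypothetical, the sketch assumes a trip count that is a multiple of four, and the hardware that the offline compiler actually generates is not modeled here.

```c
#include <assert.h>

/* Rolled loop. In a kernel, "#pragma unroll 4" would go on the line
   directly above this for statement. */
static int sum_rolled(const int *data, int n) {
    int acc = 0;
    for (int i = 0; i < n; i++) {
        acc += data[i];
    }
    return acc;
}

/* Hand-written model of the same loop unrolled by a factor of 4:
   each trip of the loop performs four of the original iterations. */
static int sum_unrolled4(const int *data, int n) {
    assert(n % 4 == 0); /* simplifying assumption: no remainder iterations */
    int acc = 0;
    for (int i = 0; i < n; i += 4) {
        acc += data[i];
        acc += data[i + 1];
        acc += data[i + 2];
        acc += data[i + 3];
    }
    return acc;
}
```

Both functions compute the same result; the unrolled form exposes four independent loads per trip, which is the kind of structure an unroll factor makes available to the compiler.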

__ENDIAN_LITTLE__ defined to be value 1 The target FPGA is little-endian.
__IMAGE_SUPPORT__ X __IMAGE_SUPPORT__ is undefined; the SDK does not support images.
6.10 Attribute Qualifiers—The offline compiler parses attribute qualifiers as follows:
6.10.2 Specifying Attributes of Functions—Structure-type kernel arguments X Convert structure arguments to a pointer to a structure in global memory.
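The conversion described for Section 6.10.2 can be sketched as follows. The structure params_t and the kernel scale_offset are hypothetical names chosen for illustration. The #ifndef __OPENCL_VERSION__ shim is an assumption added only so that the source also compiles as plain C for host-side checking; it is not part of the SDK.

```c
/* Shim (assumption, not part of the SDK): lets this kernel source also
   build as plain C so that its logic can be checked on the host. */
#ifndef __OPENCL_VERSION__
#include <assert.h>
#define __kernel
#define __global
#endif

/* Hypothetical parameter block. */
typedef struct {
    float scale;
    float offset;
    int   count;
} params_t;

/* Unsupported form: the structure passed by value, for example
     __kernel void scale_offset(params_t p, ...)
   Converted form: a pointer to the structure in global memory. */
__kernel void scale_offset(__global const params_t *p,
                           __global const float *in,
                           __global float *out) {
    for (int i = 0; i < p->count; i++) {
        out[i] = in[i] * p->scale + p->offset;
    }
}
```

With this form, the host writes the structure into a buffer object and passes that buffer as the kernel argument, so the structure layout must match between host and kernel code.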
6.10.3 Specifying Attributes of Variables: endian X
6.10.4 Specifying Attributes of Blocks and Control-Flow-Statements X
6.10.5 Extending Attribute Qualifiers The offline compiler can parse attributes on various syntactic structures. It reserves some attribute names for its own internal use.

Refer to the Intel® FPGA SDK for OpenCL™ Best Practices Guide for tips on how to optimize kernel performance using these kernel attributes.

6.11.2 Math Functions
built-in math functions Preliminary support for built-in math functions for double precision float. These functions might not conform with the OpenCL Specification version 1.0.
built-in half_ and native_ math functions Preliminary support for built-in half_ and native_ math functions for double precision float. These functions might not conform with the OpenCL Specification version 1.0.
6.11.5 Geometric Functions Preliminary support for built-in geometric functions for double precision float. These functions might not conform with the OpenCL Specification version 1.0.

Refer to Argument Types for Built-in Geometric Functions for a list of built-in geometric functions supported by the SDK.

6.11.8 Image Read and Write Functions X
6.11.9 Synchronization Functions—the barrier synchronization function Clarifications and exceptions:

If a kernel specifies the reqd_work_group_size or max_work_group_size attribute, barrier supports the corresponding number of work-items.

If neither attribute is specified, a barrier is instantiated with a default limit of 256 work-items.

The work-item limit is the maximum supported work-group size for the kernel; this limit is enforced by the runtime.
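The barrier rules for Section 6.11.9 reduce to a small piece of arithmetic, modeled in the plain C sketch below. The helper names barrier_work_item_limit and work_group_fits are hypothetical and do not correspond to any SDK or runtime function. The sketch checks only the upper limit; for reqd_work_group_size, OpenCL additionally requires the launched work-group size to match the attribute exactly.

```c
#include <assert.h>
#include <stddef.h>

/* Default limit stated in the notes for Section 6.11.9. */
#define DEFAULT_BARRIER_LIMIT 256

/* declared_limit: the work-item count implied by the kernel's
   reqd_work_group_size (X * Y * Z) or max_work_group_size attribute,
   or 0 if the kernel specifies neither attribute. */
static size_t barrier_work_item_limit(size_t declared_limit) {
    return declared_limit ? declared_limit : DEFAULT_BARRIER_LIMIT;
}

/* Returns 1 if a work-group of x * y * z work-items is within the limit. */
static int work_group_fits(size_t x, size_t y, size_t z,
                           size_t declared_limit) {
    assert(x > 0 && y > 0 && z > 0);
    return x * y * z <= barrier_work_item_limit(declared_limit);
}
```

For example, a 32 x 16 x 1 work-group (512 work-items) exceeds the default limit but fits a kernel that declares a limit of 512.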

6.11.11 Async Copies from Global to Local Memory, Local to Global Memory, and Prefetch The implementation is naive:

Work-item (0,0,0) performs the copy, and wait_group_events is implemented as a barrier.

If a kernel specifies the reqd_work_group_size or max_work_group_size attribute, wait_group_events supports the corresponding number of work-items.

If neither attribute is specified, wait_group_events is instantiated with a default limit of 256 work-items.