Revision History

The following table shows the revision history for this document.

Date        Version   Changes
11/30/2016  2016.3
www.xilinx.com
Send Feedback
Table of Contents
Revision History ......................................................................................................................................................2
Chapter 1 Overview .................................................................................................................................................5
Introduction ............................................................................................................................................................5
Understanding Partial Reconfiguration ..................................................................................................................7
Chapter 2 Device Planning .................................................................................................................................... 12
Introduction ......................................................................................................................................................... 12
Working with SSI Devices .................................................................................................................................... 12
Using Platform Board Files .................................................................................................................................. 14
Chapter 3 Logic Design using IP Integrator .......................................................................................................... 16
Creating the Design Hierarchy ............................................................................................................................. 16
Creating and Configuring the Logic Design.......................................................................................................... 19
Designing for Performance Profiling ................................................................................................................... 28
Chapter 4 Applying Physical Design Constraints .................................................................................................. 36
I/O and Clock Planning ........................................................................................................................................ 36
Applying Partial Reconfiguration Constraints...................................................................................................... 36
Chapter 5 Implementing the Hardware Platform Design .................................................................................... 41
Simulating the Design .......................................................................................................................................... 41
Implementation and Timing Validation ............................................................................................................... 41
Chapter 6 Generating DSA Files ........................................................................................................................... 42
Using Example DSAs ............................................................................................................................................ 42
Setting Vivado Properties for Generating a DSA ................................................................................................. 43
Generating the DSA File....................................................................................................................................... 45
Assembling the Platform ..................................................................................................................................... 46
Using Older DSA Versions .................................................................................................................................... 49
Chapter 7 Software Platform Design ..................................................................................................................... 50
Introduction ......................................................................................................................................................... 50
Using the Kernel Mode Driver (xcldma) .............................................................................................................. 51
Using the OpenCL Hardware Abstraction Layer (XCL HAL) Driver API ................................................................ 53
Using the OpenCL Runtime Flow ......................................................................................................................... 60
Chapter 8
Legal Notices............................................................................................................................................................ 71
Please Read: Important Legal Notices ................................................................................................................. 71
Chapter 1 Overview
Introduction
The Xilinx SDAccel Development Environment lets you compile OpenCL C, C and C++ programs to
run on Xilinx programmable platforms. The programmable platform is composed of the SDAccel Xilinx
Open Code Compiler (XOCC), a Device Support Archive (DSA) which describes the hardware platform, a
software platform, an accelerator board, and the SDAccel OpenCL runtime. The XOCC is a command
line compiler that takes user source code and runs it through the Xilinx implementation tools to
generate the bitstream and other files that are needed to program the FPGA based accelerator boards.
It supports kernels expressed in OpenCL C, C++, and RTL (SystemVerilog, Verilog, or VHDL).
This document describes the steps required to create hardware platform designs for use with the
SDAccel Development Environment. The hardware platform consists of a Vivado IP Integrator
subsystem design with all the required board interface IP cores configured and connected to the device
I/Os and the programmable region. The programmable region defines the programmable device logic
partition that accepts the software kernel from the SDAccel Development Environment.
This guide is written from the perspective of a hardware designer, using terminology common to the
Vivado Design Suite. An overview of the OpenCL Design flow to create the programmable region,
written from the perspective of a software developer, is provided in the SDAccel Development
Environment User Guide (UG1023).
The SDx software installation includes the compatible version of the Vivado Design Suite needed to create or modify the
hardware platform design. It can be launched from the Linux command line or the Windows Start
menu. For more information on using the Vivado Design Suite, refer to the documentation for that
compatible Vivado version.
The Xilinx PCIe hardware device consists of two regions, shown in Figure 1: the Static Region and the
Programmable Region. The Static Region provides the connectivity framework to the Programmable
Region, which executes the hardware functions defined in the software kernel. Figure 1 shows an
example of a standard partial reconfiguration hardware platform. An expanded partial
reconfiguration hardware platform would incorporate most of the static region into the programmable
region.
Terminology

Hardware Platform: Represents the physical board interface and the programmable region. The
hardware platform consists of a Vivado IP Integrator design with a target device and all interface IPs
configured and connected to the device I/Os and the programmable region. It also contains the
interface representation of the programmable region.

Software Platform: Consists of the runtime, drivers, and software stack that are needed to enable
interaction with the hardware platform.

Hardware Function: Blanket term for the OpenCL, C, or C++ kernels or RTL IP that
define the programmable region.

Static Region: Represents the fixed logic portion of the programmable device that manages the
design state before, during, and after partial reconfiguration of the device. This logic is not
re-implemented with the programmable region.

Programmable Region: Describes the partition region that accepts the hardware functions from
the SDAccel Development Environment. The term also describes the physical resources available on
the programmable device to implement the hardware functions. Special parameters and design
considerations are required for signals that cross between the static region and the programmable
region.

Device Support Archive (DSA): Contains all of the design and metadata needed for an SDAccel
hardware function to interact with the physical design. It is the output product of the hardware
platform design process described in this guide.
Chapter 2 Device Planning
Introduction
To ensure the best possible performance of the overall design, you should carefully plan the assignment
of the required device interfaces. You should study and understand the device package footprint, the
package pin assignments, and the internal die layout of the programmable device to determine the best
I/Os to use for the various hardware interfaces, especially the external processor and memory interfaces.
You should plan the intended logic flow of the hardware platform, while paying attention to clock
distribution, the expected logic size of the interfaces, and the hardened device resources such as PCIe,
BRAMs, and DSPs. This is true not only for the static region logic, but also for the programmable region
logic. You should focus on maximizing the available resources for the programmable region logic.
Optionally, a Platform Board file can be created to support the hardware platform design. The board file
defines I/O characteristics and the interface IP configuration options. A board file facilitates the use of
Connectivity Assistance in the IP Integrator. This can greatly improve the ease of platform design
creation. Also, if variants of the hardware platform are planned, consider using a Platform Board file to
enable faster configuration of IP interfaces.
As you define the hardware platform logic in the Vivado Design Suite, you can use the I/O planning
features to assign the I/O and clock constraints. You can also use floorplanning techniques to ensure
consistent results and optimal performance of the hardware platform design. Refer to the Vivado Design
Suite User Guide: I/O and Clock Planning (UG899) and the Vivado Design Suite User Guide: Design
Analysis and Closure Techniques (UG906) for more information.
Define the static region logic interconnect with careful floorplanning of IP logic and signal interfaces
and then create any necessary AXI Register SLR crossings in order to effectively utilize both SLRs.
Chapter 3 Logic Design using IP Integrator
You will use the Vivado IP Integrator feature to create the hardware platform logic design. This feature
of the Vivado Design Suite provides the necessary board information, interface IP, and the interface to
the programmable region which will be imported from the SDAccel Development Environment. This
chapter describes the process for creating the hardware platform. For more information on working
with the Vivado IP Integrator, refer to the Vivado Design Suite User Guide: Designing IP Subsystems
Using IP Integrator (UG994).
Figure 7: Define the Static and Programmable Regions on the Top-level of Design
The SDAccel OpenCL Programmable Region IP is contained inside the expanded region hierarchy.
Figure 9 below shows an example of the inside of the expanded region logic hierarchy.
1. Use the Add IP command to add the "SDAccel OpenCL Programmable Region" IP to the design
canvas. The instance name is ocl_block_0.
2. Right-click the IP and use the Customize Block command to open the Re-Customize IP dialog
box to see the features of the IP core, as shown in Figure 11.
The SDAccel OpenCL Programmable Region IP is a container for compiled OpenCL kernels from the
SDAccel Development Environment. You can specify the number of kernels the core contains.
When used with SDAccel, the OpenCL code and the XOCC determine the number of kernels in the
programmable region. The IP has one AXI slave interface, which is used to send commands to the
OpenCL kernels. It also has one or more AXI master interfaces, which are connected to device RAM. A
clock and a reset signal are also wired to the SDAccel OpenCL Programmable Region.
You can customize the address and data bit width for the AXI master interface, but you should utilize
the maximum value that the memory controller can support in order to maximize bandwidth and
minimize data width converters on the datapath.
Adding Custom IP
Custom RTL can be added to the design in the IP Integrator in one of two ways. RTL-only logic can
be added using the Add Module command. Designs that contain IP or other logic structures can be
packaged as IP and added to the design using the IP Packager.
One to control the programmable region (to configure the OCL kernels) and to control other
peripherals in the static region.
Add the DMA Subsystem for PCI Express (XDMA) IP to the IPI design. The configuration options are
displayed in the figure below.
The Global Options are used to set the general AXI interface and Decoupler IP configuration. The
Interface Options are used to create custom interfaces and to set the required Signal Options.
A shim for the XCL HAL driver API must be implemented. See Using the OpenCL Hardware
Abstraction Layer (XCL HAL) Driver API for more information.
The flow presented here is for an example hardware platform and may need to be adapted to meet
other design- or board-specific characteristics.
Figure 19 shows a block diagram for the device monitor framework. It is shown connected to the MIG
interface on an example device, but will be similar on other devices. An AXI Performance Monitor (APM)
is the IP core used to monitor the AXI transactions and provide performance statistics and events for
every monitored AXI connection. See the AXI Performance Monitor LogiCORE IP Product Guide (PG037)
for more information. For the APM shown in Figure 19, the monitored connections are the two slave
interfaces on the MIG memory controller, which communicates with the DDR DRAM.
The APM contains two modules: profile metric counters, and trace. The profile metric counters are
aggregated counters used to calculate average throughput and latency for both write and read
channels. A single AXI-Lite interface is used to configure the APM and read the metric counters. Xilinx
recommends that you use the sampled metric counters, which are guaranteed to provide samples on
the same clock cycle. This is done by first reading from the sample interval register, then reading the
appropriate sampled metric counters.
The APM trace module provides timestamps of AXI transaction events. These events are captured in a
collection of individual flags and are output by the APM using an AXI Stream in one of the following
ways:
AXI Lite
Figure 19 shows how this trace stream is stored using multiple AXI Stream FIFOs (slower offload
approach). An AXI Stream broadcaster is used to broadcast the stream to three AXI Stream subset
converters. These subset converters split the stream into three different 32-bit values, which are then
written to AXI Stream FIFOs. Note that the use of multiple AXI Stream FIFOs is specific to the currently
available IP: the AXI Stream FIFO IP has a 32-bit datapath, but the implementation needs a 96-bit
width.
Use the AXI4-Lite slave for register accesses such as reading the number of data bytes in the FIFO and
resetting the core. In addition, you can use the AXI4 slave interface to offload the data in the FIFO. The
data width of this interface can be set to a value up to 512 bits. For example, a PCIe interface such as an
Alpha Data card could use the maximum bit width of 256 or 512. This AXI4 slave would then be
connected to a DMA or similar master, enabling burst reads and thus a much faster trace offload path.
Number  Source                   Statistics
1       Device interface ports
2       Kernel start/done
In order to access this performance monitoring circuitry, it is important to know the address mapping
used for the APM and AXI Stream trace FIFOs.
Table 2: Address Mapping of Device Profile Monitor Framework with AXI Lite Offload

Base Address  Description
0x100000      AXI Performance Monitor (APM)
0x110000      AXI Stream FIFO 0
0x111000      AXI Stream FIFO 1
0x112000      AXI Stream FIFO 2
Table 2 lists the address mapping for the monitor framework connected to the MIG slave ports. The
APM slave interface is used both to configure the core and to read the profile counters. The
interfaces to the AXI Stream FIFOs are used to configure the cores, reset the FIFOs, and read/pop values
off the data queue.
The address mapping for this example project is as follows.
Address Mapping
In order to access this performance monitoring circuitry, it is important to know the address mapping
used for the APM and AXI Stream trace FIFO.
Table 3: Address Mapping of Device Profile Monitor Framework with AXI Full Offload

Base Address  Description
0x100000      AXI Performance Monitor (APM)
0x110000      AXI Stream FIFO
Table 3 lists the address mapping for the monitor framework connected to the MIG slave ports. The
APM slave interface is used both to configure the core and to read the profile counters. The
interface to the AXI Stream FIFO is used to configure the core and reset the FIFO. The
reading/popping of values off the data queue is performed using AXI full transfers.
Figure 23 shows the data structure used to store device counter results. For the AXI Performance
Monitor core, XAPM_MAX_NUMBER_SLOTS = 8. This data structure is populated using the function
xclPerfMonReadCounters as described in Using the OpenCL Hardware Abstraction Layer (XCL HAL)
Driver API.
typedef struct {
    unsigned char LogID;           /* 0: event flags, 1: host timestamp */
    unsigned char Overflow;        /* set when Timestamp exceeded 2^16 */
    unsigned short Timestamp;      /* device clock cycles since last event */
    unsigned int HostTimestamp;    /* valid only when LogID = 1 */
    unsigned char EventFlags[XAPM_MAX_NUMBER_SLOTS];    /* AXI transaction events */
    unsigned char ExtEventFlags[XAPM_MAX_NUMBER_SLOTS]; /* external events, e.g. kernel start/done */
    unsigned char WriteAddrLen[XAPM_MAX_NUMBER_SLOTS];
    unsigned char ReadAddrLen[XAPM_MAX_NUMBER_SLOTS];
    unsigned short WriteAddrId[XAPM_MAX_NUMBER_SLOTS];
    unsigned short ReadAddrId[XAPM_MAX_NUMBER_SLOTS];
} xclTraceResults;
typedef struct {
unsigned int mLength;
unsigned int mNumSlots;
xclTraceResults mArray[MAX_TRACE_NUMBER_SAMPLES];
} xclTraceResultsVector;
Figure 24: Data Structure and Vector Used to Store Device Profiling Trace Results
Figure 24 shows the data structure used to store device trace results. This data structure is
populated using the function xclPerfMonReadTrace as described in Using the OpenCL Hardware
Abstraction Layer (XCL HAL) Driver API.
LogID is a single bit that signals the type of data this event contains, while Timestamp contains the
number of device clock cycles since the last event. Overflow is active high only when Timestamp has
exceeded 2^16. HostTimestamp contains the host timestamp only if LogID = 1.
The event flags are a collection of single-bit flags that signal AXI transaction events, while the external
event flags are single-bit flags that signal external events (e.g., kernel start/done). These are both
further defined in the AXI Performance Monitor LogiCORE IP Product Guide (PG037).
The bitwise definition of the 96-bit trace word is:

Bits    Field
95:70   Zero padding
69:62   Slot 1 AWLEN
61:54   Slot 1 ARLEN
53:47   Slot 1 event flags
46:44   Slot 1 ext events
43:36   Slot 0 AWLEN
35:28   Slot 0 ARLEN
27:21   Slot 0 event flags
20:18   Slot 0 ext events
17      Overflow
16:1    Timestamp
0       Log ID
As previously mentioned, each trace word is stored across three AXI Stream FIFOs. The values from these
FIFOs are analyzed and then written into the trace results data structure. The table above lists the bitwise
definition of the entire trace word. The data shown is for the APM connected to the MIG ports, but can
be extrapolated to other connections and devices. The APM is connected to two ports and is configured
to output AXI lengths as well as the standard event flags and external (ext) events.
This data format is specified by the APM core and further described in the AXI Performance Monitor
LogiCORE IP Product Guide (PG037). The reporting infrastructure in the SDAccel OpenCL runtime parses
the trace data structure for AXI transaction and kernel events and then reports in the device trace log.
Chapter 4 Applying Physical Design Constraints
This section discusses the various types of physical constraints that are needed to support the
reconfigurable hardware platform. There are different requirements depending on whether you are
using the regular or expanded partial reconfiguration flow.
For more information about the partial reconfiguration design flow, and constraint requirements, refer
to the Vivado Design Suite User Guide: Partial Reconfiguration (UG909).
This causes the HD.RECONFIGURABLE property to be set to TRUE for the SDAccel OpenCL
Programmable Region IP. The partial reconfiguration capabilities of the Vivado Design Suite use this
property to identify the programmable region when processing the design.
IMPORTANT: Notice the DONT_TOUCH property is also set on the expanded region
hierarchical module to prevent optimization across the boundary of the module.
Floorplanning
When using Xilinx Partial Reconfiguration technology, floorplanning is required to separate the static
and programmable regions of the design. The floorplanning strategy can dramatically affect the
performance of the design. It can be an iterative process to find the optimal floorplan for your specific
design and device. Refer to the Vivado Design Suite User Guide: Design Analysis and Closure Techniques
(UG906) for more information on defining floorplan regions.
Floorplanning is performed by assigning logic modules or cells to physical block ranges (Pblocks) on
the device canvas. Floorplanning can be performed interactively by opening the Synthesized design in
the Vivado IDE or with constraints in the XDC file. Occasionally, in order to optimize performance or
device resources, multiple Pblocks are used.
Pblocks can be defined as rectangular regions, or by combining multiple rectangles to define a nonrectangular shape. Pblocks can also be nested to enable lower levels of logic to be further constrained
to specific regions of the device. If nested Pblocks are desired, ensure the lower level Pblock region is
completely within the upper level Pblock region.
TIP: Overlapping Pblocks should be avoided.
In both standard PR and expanded PR hardware platform designs, the programmable region must be
entirely contained within a Pblock to facilitate partial reconfiguration.
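As a sketch of such a floorplan constraint (the cell placeholder follows the document's <rp_cell> convention, and the site ranges are placeholders to be replaced with ranges appropriate to the target device):

```tcl
# Create a Pblock and assign the programmable region cell to it
create_pblock pblock_ocl
add_cells_to_pblock [get_pblocks pblock_ocl] [get_cells <rp_cell>]
# Rectangular ranges of device resources reserved for the region
resize_pblock [get_pblocks pblock_ocl] \
    -add {SLICE_X0Y0:SLICE_X95Y149 DSP48_X0Y0:DSP48_X5Y59 RAMB36_X0Y0:RAMB36_X5Y29}
```

Sizing the Pblock generously relative to the expected kernel logic leaves the placer room to close timing across the partition boundary.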
Case 2: Define multiple unique PartPin Ranges for various interface pins of the programmable region:
set_property HD.PARTPIN_RANGE {SLICE_Xx1Yy1:SLICE_Xx2Yy2} \
[get_pins <rp_cell>/AXIM_*]
set_property HD.PARTPIN_RANGE {SLICE_Xx1Yy1:SLICE_Xx2Yy2} \
[get_pins <rp_cell>/AXIS_*]
IMPORTANT: The assignment of partition pins is done automatically by the tool. You
should only manually define partition pin ranges or locations when necessary to improve
the quality of the routing results.
You can identify partition pins in the design using the example Tcl command below:
select_objects [get_pins -of [get_cells -hier -filter HD.RECONFIGURABLE]]
Refer to the Vivado Design Suite User Guide: Partial Reconfiguration (UG909) for more information on
working with partition pins.
Chapter 5 Implementing the Hardware Platform Design
The hardware platform design should be implemented and validated to ensure it works as expected in
the SDAccel flow. The first step in that validation process should be to ensure the hardware platform
design itself is performing as expected. This can be done using test kernel logic to populate the
programmable region.
The floorplan can be examined and modified if need be to optimize implementation results. Refer to
the Vivado Design Suite User Guide: Design Analysis and Closure Techniques (UG906) for more
information.
Chapter 6 Generating DSA Files
After completing your hardware platform design and generating a valid bitstream using the Vivado
Design Suite, you are ready to create a Device Support Archive (DSA) file for use with the SDAccel
Development Environment. A DSA is a single-file capture of the hardware platform, to be handed off by
the platform developer for use with the SDAccel Environment.
The following property can be used to override the SPI Flash type associated with a DSA, and influences
the generation of the MCS file from the bitstream. The default value is bpix16. For more information
on flash memory support and the MCS file, refer to the Vivado Design Suite User Guide: Programming
and Debug (UG908).
set_property dsa.flash_interface_type spix8 [current_project]
The following properties have default values but can be set by the DSA generator to specify the DSA
version, type of host, and synthesized or implemented with PR flows, etc.
set_property dsa.version "1.0" [current_project]
The following properties can be used to set the PCIeId and Board attributes for the Board section if a
board file is not yet available.
set_property
set_property
set_property
set_property
If defined, these property values will take precedence over the values that come from the board files.
The default value for these properties is an empty string. Using a local board repository can be avoided
by using these properties.
The parameter can be set in the Vivado IDE prior to opening the Vivado project for the hardware
platform, or it can be set in the init.tcl that is sourced each time the Vivado Design Suite is
launched. For more information on the init.tcl file refer to the Vivado Design Suite Tcl Command
Reference Guide (UG835).
where <filename> should be something meaningful that references the board and its characteristics.
However, the filename is optional; if omitted, a filename is constructed from the project's parameters as:
${DSA.VENDOR}_${DSA.BOARD_ID}_${DSA.NAME}_${DSA.VERSION}.DSA
TIP: The DSA file format is a zip file and can be opened with the appropriate tools in
order to check the contents and assist further development.
dsa.xml: Contains the DSA hardware platform settings and configuration captured when the write_dsa
Tcl command was run and the DSA file was created.

<platform_name>_clear.bit: For UltraScale architecture only, a file for clearing the full
bitstream from the device.

<platform_name>.png: Optional picture of the board to display in Vivado and SDx projects.
Multiple images at different resolutions can be included.

Code Samples: Contains sample dsa.xml files from other board files. Use the sample files to
cross-reference the created dsa.xml file with the DSA, and/or update the file to match the DSA
specifications as created above.
The ./driver subdirectory contains the driver configurations defined for the software platform, as
shown in Figure 28. Refer to the Software Platform Design section for more information on defining
drivers.
Chapter 7 Software Platform Design
Introduction
The SDAccel runtime software is layered on top of a common low-level software interface called the
Hardware Abstraction Layer (HAL). The HAL driver provides APIs to the runtime software that abstract the
xcldma driver details. The xcldma driver is a kernel mode DMA driver that interfaces to the
memory-mapped platform over PCIe.
Figure 31 shows the layers of the software platform:
The Linux Kernel Mode DMA Driver for PCI Express (xcldma)
Allow the clients to memory map the device registers which are exposed on PCIe Base Address
Register (BAR).
Provide access to sensors on the device via SysMon (temperature, voltage, etc.).
The DMA operations are provided by the Linux pread/pwrite system call interface. Other services are
provided by the Linux ioctl interface, which is defined in the header file xdma-ioctl.h included in
xcldma.zip. The archive can be found in platforms/<dsa>/sw/driver of the SDx installation
area.
The xcldma kernel mode driver, which is used with the Xilinx DMA Subsystem for PCI Express (XDMA) IP,
has the following standard system call interfaces which are used by the XDMA HAL driver:
// perform IO Control operations
int ioctl(int d, unsigned long request, ...);
//reading data from the device
ssize_t pread(int fd, void *buf, size_t count, off_t offset);
// sending data to the device
ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset);
c2h denotes a card-to-host transfer channel and h2c a host-to-card transfer channel; 0 and 1 are the
channel numbers.

The pread and pwrite system calls are used to DMA buffers to and from the device.

The xdma_control node exposes the DMA registers; note that these should not be programmed
directly from user space (applications should use pread and pwrite to perform DMA).

The xdma_user node is used to memory map the PCIe user BAR, which exposes the OCL region control
port, AXI gate, APMs, clock wizard, and other peripherals.
`-- xcldma
    |-- include
    |   `-- mcap_registers.h   Defines the MCAP registers used to program UltraScale devices
    `-- kernel
        |-- 10-xcldma.rules    Linux udev rule (typically resides in /etc/udev/rules.d/)
        |-- makefile           The makefile to build the driver
        |-- xdma-bit.c         Routines for downloading bitstreams
        |-- xdma-core.c        DMA programming routines
        |-- xdma-core.h        Header file for the DMA programming routines
        |-- xdma-ioctl.c       Defines the ioctl routines
        |-- xdma-sgm.c         DMA programming routines
        `-- xdma-sgm.h         DMA programming routines
unsigned xclProbe()
xclProbe should return a count of the physical devices present in the system that are supported by the
HAL driver. The same driver may optionally support devices of different kinds.
Buffer Management
Buffer management APIs are used for managing device memory. The DSA vendors are required to
provide a memory manager implementing the following four APIs.
Device Profiling
For additional details refer to Designing for Performance Profiling.
The following enum values define the type of device profiling that is requested for each function below:
enum xclPerfMonType {
XCL_PERF_MON_MEMORY = 0,
XCL_PERF_MON_HOST_INTERFACE = 1,
The following table shows how OpenCL runtime calls map to HAL driver APIs, XDMA kernel driver operations, and the hardware blocks they touch:

Runtime Library             HAL Driver                   XDMA Kernel Driver   Hardware
clGetPlatformId             xclProbe                     open                 --
clGetDeviceId               xclOpen                      open, mmap           --
clCreateContext             xclLockDevice                flock                --
clCreateProgramWithBinary   xclLoadXclBin, xclReclock2   ioctl                ICAP/MCAP, ClockWiz
clCreateBuffer              xclAllocDeviceBuffer         --                   --
clEnqueueMigrateMemObjects  xclCopyBufferHost2Device     pwrite               DMA, MIG, DDR
clEnqueueNDRangeKernel      xclWrite                     mmap IO              OCL Ctrl Port
clFinish                    xclRead                      mmap IO              OCL Ctrl Port
clEnqueueReadBuffer         xclCopyBufferDevice2Host     pread                DMA, MIG, DDR
clReleaseDevice             xclClose                     close                --
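The xclWrite and xclRead paths above use memory-mapped I/O against the OCL control port exposed through the xdma_user node. The sketch below shows only the mmap pattern, with a temporary file standing in for the user BAR so it can run anywhere; the function name fake_bar_roundtrip, the register offset 0x0, and the ap_start bit are all hypothetical.

```c
/* Sketch of the mmap-IO pattern the HAL uses for the OCL control port.
 * A real driver maps the PCIe user BAR through the xdma_user node; a
 * temporary file stands in here so the pattern is demonstrable. The
 * offset-0x0 "ap_start" control register is an assumption. */
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

uint32_t fake_bar_roundtrip(void)
{
    const size_t bar_size = 4096;
    char path[] = "/tmp/fake_bar_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return 0;
    if (ftruncate(fd, (off_t)bar_size) != 0) {
        close(fd);
        return 0;
    }
    /* Map the "BAR" read/write, as the HAL maps the user BAR. */
    void *m = mmap(NULL, bar_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (m == MAP_FAILED) {
        close(fd);
        return 0;
    }
    volatile uint32_t *regs = m;

    regs[0] = 0x1;               /* xclWrite-style register write     */
    uint32_t status = regs[0];   /* xclRead-style status readback     */

    munmap(m, bar_size);
    close(fd);
    unlink(path);
    return status;
}
```

With real hardware, the readback would poll a done bit rather than immediately observing the written value; the file-backed stand-in returns the write unchanged.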
Bring-up Tests
Bring-up tests are unit tests that help isolate issues during the bring-up phase of a new DSA. They test individual XCL HAL driver API implementations, hardware functionality, hardware efficiency, and the operation of the OpenCL runtime. Each hardware platform should deliver a set of bring-up tests to be performed after installation and setup.
With the board installed, DSA programmed, and SDx utilities indicating compatibility and readiness of
the device, it can be useful to compile and execute bring-up tests which utilize the SDAccel compilation
flow. This process provides the next level of confidence that the platform is compatible and operational
with partial reconfiguration, the SDAccel flow and its implemented kernels, and the software stack.
To test for basic compatibility of the host computer with the installed board, use the Linux lspci
command to see if the board is detected and registered.
Here is an example of output from a Xilinx supported board:
22:00.0 Memory controller: Xilinx Corporation Device 8438
To determine the present operational status of the platform in more detail, including the HAL driver version, PCIe IDs, global memory details, FPGA temperature and voltage, compute unit status, and more, use the xbsak utility included with the SDx installation. Note that the PCIe IDs (Vendor, Device, and SDevice/Subsystem) must match those indicated for host connectivity in order to be compatible with the xcldma driver.
To test basic compatibility of the host computer with the installed board, programmed device, and installed drivers, and to confirm that DMA access to DDR memory is enabled by the platform, use the sdxsyschk utility included with the SDx installation (it might need to be run with sudo). Currently, these utilities only work with DSAs delivered with SDx. Contact your Xilinx representative to inquire about obtaining source code for the xbsak tool. The sdxsyschk utility is a Python script and can be copied and modified as needed.
The xbsak tool is found in the runtime/bin directory of the SDx installation.
Platform Development Guide
UG1164 (v2016.3) November 30, 2016
Platform Development Guide
UG1164 (v2016.3) November 30, 2016
Command Options
The following are the xbsak command line format and details on the options:
xbsak <command> [options]
help
Print help messages
list
List all supported devices installed on the server in the format of [device_id]:device_name.
The following is example output, where the device ID is 0:
[0] xilinx:xil-accel-rd-ku115:4ddr-xpr:3.2
-d device_id : Specify the target device. Default=0 if not specified. (Optional)
-b blocksize : Specify the test block size in KB. Default=65536 (64 MB) if not specified. The
               block size can be given in decimal or hexadecimal format; e.g., both -b 1024
               and -b 0x400 set the block size to 1024 KB (1 MB). (Optional)
XIL-ACCEL-RD-KU115
DSAs:
  xilinx:adm-pcie-7v3:1ddr:3.0 or newer
  xilinx:adm-pcie-ku3:2ddr-xpr:3.2 or newer
  xilinx:adm-pcie-ku3:2ddr:3.2 or newer
  xilinx:adm-pcie-ku3:1ddr:3.2 or newer
  xilinx:xil-accel-rd-ku115:4ddr-xpr:3.2 or newer
-r region_id : Specify the target region. Default=0 if not specified. (Optional)
PCIe Platform Setup Diagnosis (critical status of a target Xilinx PCIe device will be analyzed)
Requirements:
The sdxsyschk script has some requirements which must be met:
1. Use Python version 2.7.5 or later to run the script.
2. Have the lspci utility installed under the /sbin or /usr/bin directory.
Example Usages:
To query status and redirect the output to a text file specified by an argument:
sudo python ./sdxsyschk.py -f <path_and_file_name>
To query basic system status plus environment variable info (must be run without sudo access):
python ./sdxsyschk.py -e
Code Samples
The SDx Environments installation includes several example header files for reference. They can easily be copied and modified for custom hardware platform designs. The code examples are located at:
<SDx_Install_Area>/SDx/2016.3/runtime/driver/include
Legal Notices