
Ensuring real-time predictability
Leveraging TI’s Sitara™ processors programmable real-time unit

Melissa Watkins,
Application Engineer
Texas Instruments

Carlos Betancourt,
Product Marketing Manager
Texas Instruments
System developers have discovered that many, if not most, contemporary applications require
a finely tuned combination of high-speed software processing and real-time hardware
performance. In other words, high performance is not the equivalent of real-time performance.
In fact, each has its own unique and often mutually exclusive set of requirements, which
prevents either one from performing the role of the other very well.

For example, high-performance processors like ARM® Cortex®-A cores have an entirely
different set of resources and processing capabilities than those of real-time processing
cores, like the Programmable Real-Time Unit (PRU) co-processor in TI’s Sitara processors.
The capabilities that make an ARM core so powerful at processing software could also impede
its real-time determinism and predictability. And, in many of today’s most sophisticated
applications, real-time capabilities are just as critical as high performance, if not more so.

Real-time requirements

Outside of the data center, many systems need the low-latency predictability of real-time
processing in one way or another. In fact, even many general-purpose systems that require a
high-level operating system (HLOS) often have a real-time component or subsystem, such as
communication protocol processing, audio processing, lighting control, sensor monitoring,
factory or home automation, motor control and others.

In these types of systems, a general-purpose processor (GPP), no matter how high its
performance, cannot deliver the guaranteed response time within the strict time constraints
that typify real-time applications and subsystems. Many of the features of a GPP, like
instruction pipelines and memory and interconnect architectures, which make it so effective
at running an HLOS (Figure 1), will often become counter-productive in real-time applications.
The typical architecture surrounding a GPP core on a system-on-a-chip (SoC) will include
several layers of memory, which may be internal to the processing core or external, and
shared or dedicated to one core. Moreover, the usual SoC architecture will also include
several layers of interconnects which link the various on-chip modules and peripherals and
eventually lead off the chip via specialized or general-purpose input/output (GPIO) pins.
All of these facilities and structures can get in the way of real-time processing.

[Block diagram elements: ARM® subsystem (Cortex®-A, L1 instruction cache, L1 data cache,
L2 data cache), L3 interconnect, shared memory, peripherals, L4 interconnect, peripherals,
GP I/O. Access latencies noted in the figure:]
• L1 D/I caches:
– Single-cycle access
• L2 cache
– Minimum latency of 8 cycles
• Access to on-chip SRAM:
– 20 cycles
• Access to shared memory over L3 interconnect
– 40 cycles
Figure 1. A typical Cortex-A-based SoC general-purpose processing architecture.

The architecture surrounding a GPP cannot provide the predictable low-latency response times
that must be guaranteed in a real-time application. Accessing data on any of the various
levels of memory and communicating over any or all of the several layers of interconnects
will add to the core’s response time and make the response unpredictable. The processor’s
response time to an incoming time-sensitive interrupt from a sensor, for example, will vary
according to a number of factors, such as where the needed data is stored, how many layers of
interconnects must be traversed to access or store data, the processing load currently
executing on the GPP core and others. One experiment on a certain GPP architecture showed
that a simple toggle on a GPIO pin could take as much as 200 nanoseconds (ns), whereas the
equivalent toggle on a real-time processor with direct GPIO pin access would have a response
time of five ns, or forty times faster. In addition to the often slower response typical of
GPPs, the actual response time is usually unpredictable. In other words, the response time
for a GPP to handle a real-time event will likely vary each time the event occurs.

To overcome the real-time limitations of GPP cores, certain capabilities such as real-time
co-processors are often integrated into an SoC architecture. In the example below (Figure 2),
the Texas Instruments (TI) Programmable Real-Time Units (PRUs) form the basis for a real-time
subsystem on Sitara processors. Such an architecture gives the SoC direct and fast access to
the outside world since each PRU has its own single-cycle I/O. Additionally, local memory and
peripherals dedicated to each real-time engine mean that each unit is able to guarantee
low-latency responsiveness. In addition, an incoming interrupt has direct access to a
real-time processing engine without encountering the delays caused by crossing several layers
of interconnects and memory.

Ensuring real-time predictability I 2 June 2017
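As a rough illustration of why the memory hierarchy in Figure 1 works against predictability, the sketch below takes the per-access latencies noted in that figure and compares the best and worst case for a handler that needs a few data accesses. The handler size (`accesses = 4`) and the function names are assumptions made for illustration, not measurements from this paper; the 200 ns and 5 ns toggle times are the ones quoted in the text.

```python
# Access latencies in CPU cycles, as noted in Figure 1.
ACCESS_CYCLES = {
    "L1 cache": 1,
    "L2 cache": 8,  # the figure gives this as a minimum
    "on-chip SRAM": 20,
    "shared memory over L3": 40,
}


def handler_cycle_bounds(accesses: int) -> tuple[int, int]:
    """Best/worst-case memory cycles if every access hits the fastest/slowest tier."""
    best = accesses * min(ACCESS_CYCLES.values())
    worst = accesses * max(ACCESS_CYCLES.values())
    return best, worst


def speedup(slow_ns: float, fast_ns: float) -> float:
    """How many times faster the fast response is than the slow one."""
    return slow_ns / fast_ns


if __name__ == "__main__":
    accesses = 4  # hypothetical number of data accesses in an interrupt handler
    best, worst = handler_cycle_bounds(accesses)
    print(f"{accesses} accesses: {best} to {worst} cycles (spread {worst - best})")
    # GPIO toggle times quoted in the text: 200 ns on a GPP, 5 ns with direct pin access.
    print(f"GPIO toggle speedup: {speedup(200, 5):.0f}x")
```

Even this simplified model, which ignores pipeline and interconnect contention effects, shows a wide gap between the best and worst case, which is the unpredictability the text describes.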

[Block diagram elements: ARM® subsystem (Cortex®-A, L1 instruction cache, L1 data cache,
L2 data cache); real-time coprocessor subsystem (PRU0 and PRU1 at 200 MHz, each with
instruction RAM, data RAM and its own I/O; shared RAM; interconnect; INTC; peripherals);
L3 interconnect; shared memory; peripherals; L4 interconnect; peripherals; GP I/O.]
Figure 2. A general-purpose core supplemented with real-time co-processors.



Programmable Real-Time Unit (PRU)

The PRU (Figure 3), which is deployed along with ARM cores in the Sitara AM335x, AM437x and
AM5x processors and AMIC110 SoCs, fulfills the role of a low-latency, deterministic real-time
subsystem. Each PRU subsystem is made up of two 200-MHz real-time cores (or PRUs), each with
a five-nanosecond (ns) cycle time per instruction. Since the real-time cores are not equipped
with an instruction pipeline, single-cycle instruction execution is ensured. The PRU’s small,
deterministic instruction set with multiple bit-manipulation instructions is easy to learn
and use.

[Block diagram elements: two PRU cores (PRU0 and PRU1), each with instruction RAM, 32 GPO and
30 GPI; Data RAM0, Data RAM1 and shared RAM; scratchpad; 32-bit interconnect bus; master I/F
(to SoC interconnect) and slave I/F (from SoC interconnect); Industrial Ethernet MII0 RX/TX,
MII1 RX/TX and MDIO; UART; IEP (timer); MPY/MAC; interrupt controller (INTC) with events from
peripherals + PRUs and events to ARM INTC.]
Figure 3. The TI Programmable Real-Time Unit (PRU) subsystem.

Shared memory as well as instruction and data memory dedicated to each real-time core allow
for flexible program execution among all of the real-time and GPP ARM cores that might make
up the SoC. One program or task might be better performed on one core while another could be
executed faster when the processing load is shared among several PRUs and ARM cores. Direct
access from the PRU to the ARM cores’ layers of interconnects enables either tightly coupled
execution or independent core operations.

Access to all of the system’s interconnects allows a PRU to call on any resource in the
system when needed for a particular program implementation. In addition, each PRU subsystem
comes with its own set of dedicated peripherals to ensure the unit’s responsiveness. These
peripherals avoid the data traffic on secondary and tertiary interconnects in the system by
directly accessing the PRU’s real-time cores. Several peripherals, such as Management Data
Input/Output (MDIO) and Media Independent Interface (MII), enable real-time Ethernet
capabilities. Other PRU subsystem peripherals include a UART interface.

Because of the interrupt controller and fast I/O pins dedicated to each dual-core PRU
subsystem, the unit is able to closely monitor external events and respond in a predictable
period of time. Each PRU has its own set of up to 30 inputs and 32 outputs that directly
access external pins on the device package.

The best of both worlds: Combining ARM and PRU cores

The Sitara line of multi-core processors features escalating combinations of ARM Cortex-A
cores and PRUs. With this architecture (Figure 4), the system designer can select the device
with the right combination of high-performance, general-purpose processing and real-time
performance to meet the specific needs of the application. ARM cores have all of the
resources and instruction support expected for high-performance operating system execution,
either for an HLOS or a real-time operating system (RTOS), or both at once.

[Block diagram elements: Cortex-A 0, RAM, I/O, PRU0, I/O.]
Figure 4. Base Sitara architecture with ARM cores and PRUs.
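Because the text describes each PRU instruction as taking one cycle at 200 MHz, a response deadline translates directly into an instruction budget. The following is a minimal sketch of that arithmetic. The 1 microsecond deadline is an arbitrary example, the function names are illustrative, and the calculation assumes every instruction really does complete in a single cycle.

```python
PRU_CLOCK_HZ = 200_000_000  # 200-MHz PRU core, per the text


def cycle_time_ps(clock_hz: int) -> int:
    """Clock period in picoseconds (integer arithmetic avoids float rounding)."""
    return 10**12 // clock_hz


def instruction_budget(deadline_ns: int, clock_hz: int = PRU_CLOCK_HZ) -> int:
    """Number of single-cycle instructions that fit within a deadline."""
    return (deadline_ns * 1000) // cycle_time_ps(clock_hz)


if __name__ == "__main__":
    print(cycle_time_ps(PRU_CLOCK_HZ) / 1000, "ns per instruction")
    print(instruction_budget(1_000), "instructions fit in 1 microsecond")
```

With a fixed cost per instruction, the budget is a constant that can be checked at design time rather than a distribution that has to be measured.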



At the same time, the five-nanosecond cycle time of PRU cores as well as their low-latency
data transfers and high-speed I/O accesses assure designers that transfers and data
modifications will be performed in a predictable period of time. The responsive simplicity
of a PRU core makes it a natural fit for bit-banged communication interfaces like the Serial
Peripheral Interface (SPI), the Inter-IC (I²C) bus interface and others, including many
industrial automation protocols.

Some end equipment architectures incorporate a field programmable gate array (FPGA) for
real-time processing. Unlike the integrated PRU co-processor (Figure 4), an FPGA that is
external to an ARM-based system processor would necessarily increase the bill of materials
(BOM) costs, require additional board space, add complexity to the design and increase the
power consumption of the system. With a PRU solution, system developers also benefit from a
common software code base, which simplifies feature upgrades and multi-protocol processing
across systems or on the same system. Development of successive generations of systems is
accelerated by simply migrating the code base to the next model in the product line.

The Sitara hardware structure in Figure 4 supports a software architecture (Figure 5) that
tightly couples ARM cores and PRUs to ensure seamless and high-speed application processing.
Typically, a Linux™ kernel running on an ARM core would include RemoteProc and rpmsg kernel
drivers for the PRU subsystem. RemoteProc is the basic control mechanism, allowing the ARM
core to load PRU firmware, enable PRU processing and perform other functions. Through the
rpmsg driver, user space applications and the PRU can pass messages (buffers) back and forth.

Applications

The simple implementation of a PRU combined with its full complement of resources makes it a
versatile processing engine for a wide range of real-time tasks, subsystems and application
modules. Designers have taken advantage of the PRU to deploy everything from straightforward
subsystems like stepper motor control units, bit-banging communications processors and
sensor interfaces to more complex tasks such as camera and LCD display interfaces, smart card
processing and even very complex applications like Ethernet industrial automation protocol
processing (Figure 6).
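Bit-banging means the firmware drives each clock and data edge itself instead of relying on a dedicated peripheral. The sketch below models that idea in Python for one SPI-style byte (mode 0, most significant bit first) and estimates the resulting bit rate from a per-bit instruction count. It is a host-side model, not PRU firmware; `INSTRUCTIONS_PER_BIT = 4` is a hypothetical figure, and the 5 ns instruction time is taken from the text.

```python
PRU_CYCLE_NS = 5          # per-instruction cycle time quoted in the text
INSTRUCTIONS_PER_BIT = 4  # hypothetical cost of one bit-bang loop iteration


def bitbang_spi_byte(value: int) -> list[tuple[int, int]]:
    """Return the (clock, data) pin states for one byte, MSB first, SPI mode 0.

    Data is presented while the clock is low, then the clock is raised so
    the receiver can sample it.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    states = []
    for bit_index in range(7, -1, -1):
        bit = (value >> bit_index) & 1
        states.append((0, bit))  # clock low: present the data bit
        states.append((1, bit))  # clock high: receiver samples
    return states


def sample_on_rising_edge(states: list[tuple[int, int]]) -> int:
    """Receiver model: rebuild the byte by sampling data on each rising clock edge."""
    value, previous_clock = 0, 0
    for clock, data in states:
        if clock == 1 and previous_clock == 0:
            value = (value << 1) | data
        previous_clock = clock
    return value


def bit_rate_mbps(instructions_per_bit: int, cycle_ns: int = PRU_CYCLE_NS) -> float:
    """Bit rate in Mbit/s when each bit costs a fixed number of instructions."""
    return 1000 / (instructions_per_bit * cycle_ns)


if __name__ == "__main__":
    waveform = bitbang_spi_byte(0xA5)
    print(waveform[:4])
    print(hex(sample_on_rising_edge(waveform)))
    print(bit_rate_mbps(INSTRUCTIONS_PER_BIT), "Mbit/s")
```

The bit rate follows directly from the instruction count only because the per-instruction time is fixed; on a core with variable latency the same loop would produce an uneven clock.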

[Block diagram elements: in user space, an application uses the /dev/rpmsg-pru interface to
send messages; in the kernel, the RemoteProc-PRU and rpmsg-PRU client drivers (specifically
for the PRU core) sit above the RemoteProc and rpmsg core Linux™ drivers, alongside virtio
and a DTB file passed from U-Boot; on the PRU side, the PRU rpmsg lib, PRU firmware, vring
and PRU data memory.]
Figure 5. Sitara ARM/PRU software architecture.
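Figure 5 shows a user-space application exchanging messages with the PRU through an rpmsg character device. A minimal sketch of that exchange might look like the following. The device path is supplied by the caller because the exact node name depends on the kernel and driver in use. The 496-byte payload limit is an assumption based on the common 512-byte rpmsg buffer with a 16-byte header and should be checked against the target system. The function name is illustrative.

```python
import os

MAX_PAYLOAD = 496  # assumed: 512-byte rpmsg buffer minus a 16-byte header


def exchange_with_pru(device_path: str, payload: bytes,
                      reply_size: int = MAX_PAYLOAD) -> bytes:
    """Write one message to an rpmsg character device and read back one reply.

    On a real device, the read would typically block until the PRU firmware
    answers; that behavior depends on the driver.
    """
    if len(payload) > MAX_PAYLOAD:
        raise ValueError(f"payload exceeds {MAX_PAYLOAD} bytes")
    fd = os.open(device_path, os.O_RDWR)
    try:
        os.write(fd, payload)
        return os.read(fd, reply_size)
    finally:
        os.close(fd)
```

On a target board, a call could look like `exchange_with_pru("/dev/rpmsg-pru", b"ping")`, using the interface name from Figure 5 as a placeholder. This only works once RemoteProc has loaded and started PRU firmware that services the channel.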



Figure 6. Factory automated systems with AMIC110.

The PRU is optimized for low latency and no jitter in real-time execution and is able to
process the MAC layer functionality of Industrial Ethernet standards. For example, since the
PRU-ICSS can support multiple protocols, it enables devices like the AMIC110 SoC to
seamlessly bridge legacy motor drive designs by adding support for Industrial Ethernet
protocols such as EtherCAT, PROFINET, Sercos III, Ethernet/IP and PowerLink.

One of the more interesting applications of the PRU involves a 3D printer (Figure 7). In this
case, designers took advantage of the BeagleBone development board with a Sitara AM335x
processor, in which the ARM Cortex-A8 core ran Linux, the user interface and model processing
while the PRU performed the real-time control of five stepper motors. A shared region of
memory was reserved for communications between the ARM core and the PRU.

Figure 7. Sitara AM335x with ARM Cortex-A8 core and PRU running a 3D printer.

Conclusions

In today’s complex application world, even the most straightforward general-purpose systems
will frequently need the predictability, determinism and low-latency responsiveness that only
real-time processing can deliver. Teaming the PRU with high-performance ARM Cortex-A cores in
the Sitara line of processors provides the best of both worlds. By offering the best of
general-purpose and real-time processor elements in a single package, these processors allow
end equipment to be optimized for cost and power consumption while remaining flexible for
feature upgrades through efficient reprogramming.

Important Notice: The products and services of Texas Instruments Incorporated and its subsidiaries described herein are sold subject to TI’s standard terms and
conditions of sale. Customers are advised to obtain the most current and complete information about TI products and services before placing orders. TI assumes
no liability for applications assistance, customer’s applications or product designs, software performance, or infringement of patents. The publication of information
regarding any other company’s products or services does not constitute TI’s approval, warranty or endorsement thereof.
The platform bar and Sitara are trademarks of Texas Instruments. All other trademarks are the property of their respective owners.

© 2017 Texas Instruments Incorporated SPRY264A


