Network of Excellence on High Performance and Embedded Architecture and Compilation
appears quarterly | October 2009
Message from the HiPEAC coordinator
HiPEAC Activity:
- HyperTransport Tutorial at Stanford University
- Rainer Leupers on his mini-sabbatical at ACE bv
- Joint Seminar: RWTH Aachen University visits FORTH
Community News:
- Gadgets could go greener with high-speed computer chip
- Newsletter spell checking transition
- Ozcan Ozturk received the IBM Faculty Award
Announcement:
- ALaRI institute invites to attend Doctoral School on Complexity Management in Embedded Systems
In the Spotlight:
- 9th International Forum on Embedded MPSoC and Multicore (MPSoC 2009)
PhD News
Upcoming Events
HiPEAC Computing Systems Week in Wrocław, 26-28
www.HiPEAC.net
Dear friends,
I hope all of you have enjoyed a relaxing holiday season this summer. At the personal level,
vacations are important to work on personal relationships, to enjoy hobbies, and to re-energize.
In short, to keep a balance in life. For a network of excellence, the situation is different: it never
takes a day off, not even in the summer.
In June, HiPEAC2 underwent its first review. The reviewers concluded that the project successfully kicked off, that we correctly managed the transition between HiPEAC1 and HiPEAC2, and that all activities are showing a healthy level of activity. The steering committee and the staff are now working hard to implement the recommendations formulated by the reviewers, in particular increasing the industrial involvement in HiPEAC, stimulating additional research interactions between the different research clusters and task forces, and further stimulating mobility through collaboration grants, internships and sabbaticals.

In July, more than 200 of us enjoyed the yearly ACACES summer school in La Mola, Barcelona. The facilities were stunning, the local organization and the courses were excellent, and the participants were enthusiastic about the whole event. Our last-minute change to a new location did not have an impact on the appreciation for the summer school. We are already preparing the ACACES 2010 summer school, which will be officially announced in the January newsletter.

In October we organize our fall computing systems week in Wroclaw. This is the very first time that HiPEAC organizes an industrial workshop and the co-located cluster meetings in a new member state. We hope that this event will help our colleagues in Poland to get familiar with our network, to get involved and to start collaborations. HiPEAC is strongly committed to building stronger links with colleagues in the new member states.

Finally, there is the HiPEAC Conference, currently being organized by our Italian colleagues in beautiful Pisa, Italy. The conference runs for three days in January 2010, and it is preceded by a very rich set of workshops and tutorials, of which several are directly linked to the HiPEAC research clusters. The whole event is expected to be a major networking event for our community.

The European Commission has recently started consultation meetings to prepare the next call in computing systems. The HiPEAC community is collaborating actively in this effort, through its roadmap process, and also through bilateral meetings with HiPEAC members. We hope that this joint effort will eventually lead to a better understanding of what is needed for the further development of the computing systems domain in Europe. Its conclusions should also inspire future calls that will fuel our research. I hope to meet you at one of our coming networking events,

Take care,
Koen De Bosschere
2 info20
Panos Tsarchopoulos
Panagiotis.Tsarchopoulos@ec.europa.eu
HiPEAC Activity
Community News
In the Spotlight
With an attendance of more than 50 world-class speakers, the ninth event of the MPSoC focused on research issues yet to be mastered. The 5-day forum gave an impressive overview of present and expected future challenges in the topics of applications, software and hardware. Examples of the broad range of topics are efficient hardware architectures for Software Defined Radios, 3D chip stacking and the ubiquitous quest for design space exploration of software and hardware.

The assembly of researchers from both industry and academia at MPSoC’09 provides a great platform for guiding academia to the relevant design challenges the industry is facing today. In turn, executives and senior managers are encouraged to explore new ideas and to rethink their strategies.

Apart from the brilliant technical contributions at MPSoC’09, plenty of social events brought people together and intensified networking. The wonderful dinner at the Savannah river made MPSoC’09 a conference to be remembered.

In a nutshell, MPSoC’09 was a memorable and fruitful conference with its unique character of in-depth discussion and information exchange among researchers from all over the world. I hope to visit the next MPSoC and to meet you there.

Torsten Kempf,
RWTH Aachen University, Germany
New Members
board are supposed to be produced in accordance with the fabless model, which implies no proprietary production line. Some core groups of future customers for the developed products have been identified: global and regional Internet search systems; corporate and state data processing centers; corporate search systems and other users.

About RuChip
Ru.Chip Llc. is a startup that was established in 2008 by a group of IT specialists and innovation managers. Initial financing was granted by the Russian Foundation for Assistance to Small Innovative Enterprises.
Ru.Chip is temporarily headquartered in Moscow. The core staff of the company acquired their original experience in the semiconductor and software industry, working for world-leading companies such as STMicroelectronics and Cadence Design Systems. Ru.Chip team members have also acquired extensive experience from participating in many outsourced software and hardware design projects in the US.

Contact: Anton Gerasimov (anton.gerasimov@ruchip.com),
123458, Russia, Moscow, Tvardovskogo str. 8 building 1, Office 608
HiPEAC Students
Thursday saw the classes go to a climax, when the teachers disseminated the highlights of their course. Finally on Friday, the courses were concluded with some more highlights and loads of interesting information. I think I learned quite a few things and refreshed some others. Sadly, one cannot attend all twelve courses: choices must be made. Hindsight is 20/20, but still I’d like to have seen some of the other teachers as well. The (formal part of the) day ended with Koen giving an overview of the school, inviting us for next year and thanking the teachers and speakers. And of course Nacho Navarro, who took upon himself a large part in helping to organize this year’s summer school after the relocation was decided upon.

After all was said and done, Koen invited us for the group photo, the tasty barbecue and ... the pool party. At the pool, a live cover band was setting up to entertain us, a feat in which they flawlessly succeeded. Until the power was cut and we were left standing while repairs were underway. After the successful restoration of electricity flow to the guitars, mikes and amplifiers, the (remaining) crowd danced some more and some of us took another dip. Finally, at around 2 AM, the party died. After some small talk with local Spanish students, I retired to my room and nodded off to the tunes of the wedding party being held at the poster venue.

[Photo: Kim Hazelwood discusses Super Pin]

After two hours of sleep, I rose on Saturday to share a taxi with several colleagues - our flight left too early for us to wait for the bus that would take most attendees to the airport. Luckily our plane was on time and it was with much joy that I rushed to my wife and two sons after landing at Brussels.

To conclude, I can heartily recommend attending the HiPEAC/ACACES summer school. It’s a great way to learn new things from top-notch experts in their field and to get in touch with fellow students. This was the second ACACES summer school I attended and once again, I thoroughly enjoyed it. The summer school is also a great way to start collaborations and to get word of your work out to people with similar interests.

Andy Georges (andy.georges@elis.ugent.be),
Ghent University, Belgium
significantly expensive. (2) Transparent change of memory association. The runtime provides data packing and unpacking when transferring data between host and FPGA device, since data transfer can be a bottleneck; and (3) a multithreaded FPGA library interface. The runtime avoids any FPGA management operation if the operation blocks application execution. The FPGA library interface is implemented using threads in order to keep the application from being blocked during FPGA management operations.

Our proof-of-concept was successful and is explained in “OpenMP extensions for FPGA accelerators” in the 2009 proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (IC-SAMOS), but there are still many things to do. We are currently working on supporting several FPGAs on SGI Altix Systems. Moreover, our future lines of work are the optimization of data movements such as data prefetching, new packing/unpacking techniques, and partial runtime reconfiguration.
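The idea of packing data before a transfer and hiding blocking device calls behind a thread can be illustrated with a minimal Python sketch. Everything in it is hypothetical: `fake_fpga_scale` merely stands in for a blocking device operation, and `AsyncFpgaInterface`, `pack_floats` and `unpack_floats` are illustrative names, not the API of the runtime described above.

```python
import struct
from concurrent.futures import ThreadPoolExecutor


def pack_floats(values):
    """Pack a list of floats into one contiguous little-endian float32 buffer."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_floats(buffer):
    """Inverse of pack_floats: turn a float32 buffer back into a list."""
    count = len(buffer) // 4
    return list(struct.unpack(f"<{count}f", buffer))


def fake_fpga_scale(buffer, factor):
    """Hypothetical stand-in for a blocking device call: scales each element."""
    data = unpack_floats(buffer)
    return pack_floats([x * factor for x in data])


class AsyncFpgaInterface:
    """Runs blocking 'device' calls on a worker thread (assumed design)."""

    def __init__(self):
        # One worker thread models a single device processing requests in order.
        self._pool = ThreadPoolExecutor(max_workers=1)

    def submit_scale(self, values, factor):
        # Pack once on the caller side, then hand the buffer to the worker.
        packed = pack_floats(values)
        return self._pool.submit(fake_fpga_scale, packed, factor)

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

A caller would invoke `submit_scale(...)`, continue with other work, and call `.result()` on the returned future only when the data is needed, at which point `unpack_floats` recovers the values.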
PhD News
fault tolerance with lower overhead than at the level of the interconnection network, which has to treat all messages alike with respect to reliability.

To demonstrate our approach, we design three fault-tolerant cache coherence protocols. First, we design FtTokenCMP, based on the token coherence framework. Secondly, we design FtDirCMP: a directory-based fault-tolerant cache coherence protocol with techniques inspired by the previous work in FtTokenCMP. Finally, the same ideas are used to design FtHammerCMP: a broadcast-based and snoopy-like fault-tolerant cache coherence protocol based on the cache coherence protocol used by AMD in their Opteron processors.

We evaluate these protocols using full-system simulation. The results of this evaluation show that, in the absence of faults, our techniques do not significantly increase the execution time of the applications and their major cost is an increase in network traffic due to acknowledgment messages that ensure the reliable transfer of ownership between coherence nodes, which are sent out of the critical path of cache misses. The results also show that a system using our protocols degrades gracefully when transient faults actually happen. Furthermore, we are able to support fault rates much higher than those expected in the real world with only small performance degradation.
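As a rough illustration of acknowledged ownership transfer over an unreliable network, the following Python sketch resends an ownership message until it gets through, and keeps a backup at the sender until then. It is a toy model under loud assumptions: the `Node` class, the `backup` flag, the retry limit and the `lost_attempts` fault model are all invented for illustration, ack loss is not modelled, and none of this is the actual FtTokenCMP, FtDirCMP or FtHammerCMP protocol.

```python
class Node:
    """A coherence node in a toy model (hypothetical, not from the thesis)."""

    def __init__(self, name, owns=False):
        self.name = name
        self.owner = owns     # True if this node currently owns the line
        self.backup = False   # True while it keeps a copy awaiting an ack


def transfer(src, dst, lost_attempts, max_attempts=5):
    """Move ownership from src to dst across a channel that drops messages.

    lost_attempts is a set of 1-based attempt numbers whose ownership
    message is lost in transit (the assumed transient faults).
    Returns the number of attempts used; raises TimeoutError on failure.
    """
    if not src.owner:
        raise ValueError(f"{src.name} does not own the line")
    # The sender gives up ownership but keeps a backup until acknowledged.
    src.owner, src.backup = False, True
    for attempt in range(1, max_attempts + 1):
        if attempt in lost_attempts:
            continue  # no ack arrives before the timeout, so resend
        dst.owner = True      # message delivered
        src.backup = False    # ack received: backup can be discarded
        return attempt
    # Every attempt was lost: fall back to the backup copy.
    src.owner, src.backup = True, False
    raise TimeoutError("ownership transfer not acknowledged")
```

In this model the data is never held by zero nodes: either the receiver owns it, or the sender still has its backup.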
architectural components designed from scratch in order to address the problem of scalability.
The third case study presents a DMA-based prefetching mechanism that complements the DTAs’ preload mechanism in order to achieve non-blocking accesses to global data stored in main memory. It is shown that this mechanism can greatly improve the execution time for several simple kernels (e.g. 13x in the case of matrix multiply).
system’s scheduler. This code can be compiled with a commodity C compiler resulting in a binary that is executable by any commodity operating system and processor. The layered design of TFlux has been tested on different Unix-based multiprocessor systems. Moreover, this design enabled the porting of TFlux to different machines with minimum effort.

In this work, two TFlux implementations are presented: TFluxHard and TFluxSoft. For TFluxHard the Thread Scheduler is implemented as a hardware unit whereas for TFluxSoft, the Scheduler’s functionality is provided at the software level. As such, TFluxHard is applicable to systems that offer the ability to augment the machine with a small hardware module while TFluxSoft is directly applicable to any existing, off-the-shelf system.

For the applications of the evaluation suite, TFlux implementations show remarkable speedup and scalability. Although for most applications the performance of the two implementations is close, TFluxHard shows an advantage over TFluxSoft arising from offloading the Scheduler’s functionality to the hardware module. In addition, the experimental results show that both implementations of TFlux are able to exploit more parallelism for applications with complex dependency graphs, compared to traditional parallel programming model approaches.

Overall, TFlux is a platform characterized by four key components: (1) it can be programmed using a specially developed tool chain; (2) it virtualizes the details of the underlying machine, which allows the applications to run on different TFlux implementations without any modification; (3) it is easily portable to systems that differ significantly compared to the original design; and (4) it delivers high performance through its dataflow-like Thread scheduling scheme.
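To give a feel for what a dataflow-like thread scheduler does with a dependency graph, here is a minimal Python sketch. It assumes a ready-count scheme, in which a thread becomes runnable once all of its producers have finished; the function name, the data layout and that scheme itself are assumptions for illustration and are not taken from the TFlux description.

```python
from collections import deque


def dataflow_schedule(consumers, ready_counts):
    """Return one valid execution order for a dependency graph.

    consumers:    dict mapping a thread to the threads that depend on it.
    ready_counts: dict mapping every thread to the number of producers
                  that must complete before it may run.
    """
    counts = dict(ready_counts)  # work on a copy
    # Threads with no pending producers can start immediately.
    ready = deque(sorted(t for t, c in counts.items() if c == 0))
    order = []
    while ready:
        thread = ready.popleft()
        order.append(thread)
        # Completing a thread notifies each of its consumers.
        for consumer in consumers.get(thread, []):
            counts[consumer] -= 1
            if counts[consumer] == 0:
                ready.append(consumer)
    if len(order) != len(counts):
        raise ValueError("dependency cycle or inconsistent ready counts")
    return order
```

With a graph in which `load` feeds `filter_a` and `filter_b`, which both feed `merge`, the two filters become ready as soon as `load` completes; in a real multiprocessor setting they could then run in parallel rather than in the sequential order this sketch returns.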
On the Road towards Robust and Ultra Low Energy CMOS Digital Circuits Using Sub/Near Threshold Power Supply
By Yu Pu (Y.Pu@tue.nl)
Advisors: Prof.dr. Jose Pineda de Gyvez and Prof.dr. Henk Corporaal
TU Eindhoven, The Netherlands
September 2009

This thesis presents our research work in the design of robust near/sub-threshold CMOS digital circuits. While previous research uses ultra-low voltage operation only for low-throughput applications, we achieve medium throughput using architectural-level parallelism. Several physical-level techniques are also proposed to mitigate yield loss due to process variations, such as balancing VT of n/pMOS transistors, using VT mismatch between parallel transistors to improve driving capability, selecting and modifying standard cells, etc. These ideas are demonstrated using SubJPEG, a state-of-the-art 65nm CMOS standard VT JPEG co-processor. In the sub-threshold, each DCT and Quantization engine dissipates only 0.75pJ per cycle with a 0.4V supply at 2.5MHz frequency, which leads to 8.3X energy reduction compared to using the 1.2V nominal supply. In the near-threshold, each engine dissipates only 1.0pJ per cycle with a 0.45V supply at 4.5MHz frequency, but the system throughput still meets the VGA standard requirement for 15 fps 640×480 pixel.
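The figures quoted for SubJPEG can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the numbers given in the abstract; the function names are illustrative, the implied nominal energy assumes that the 8.3X figure refers to energy per cycle, and the quadratic voltage scaling is the usual first-order dynamic-energy model rather than anything claimed by the thesis.

```python
# Figures quoted in the abstract.
SUB_ENERGY_PJ, SUB_FREQ_MHZ, SUB_VDD = 0.75, 2.5, 0.4
NEAR_ENERGY_PJ, NEAR_FREQ_MHZ, NEAR_VDD = 1.0, 4.5, 0.45
NOMINAL_VDD = 1.2
REPORTED_REDUCTION = 8.3


def engine_power_uw(energy_pj, freq_mhz):
    """Power of one engine in microwatts: pJ/cycle times MHz gives uW."""
    return energy_pj * freq_mhz


def implied_nominal_energy_pj():
    """Energy per cycle at 1.2 V implied by the reported reduction
    (assumes the 8.3X figure is a per-cycle energy ratio)."""
    return SUB_ENERGY_PJ * REPORTED_REDUCTION


def ideal_cv2_reduction(v_high, v_low):
    """First-order model (assumption): dynamic energy scales with V squared."""
    return (v_high / v_low) ** 2


def vga_pixel_rate(width=640, height=480, fps=15):
    """Pixels per second needed for the stated VGA requirement."""
    return width * height * fps
```

Under these assumptions each engine draws about 1.9 µW in sub-threshold and 4.5 µW in near-threshold operation, the implied energy at the nominal supply is roughly 6.2 pJ per cycle, and the 15 fps VGA target corresponds to 4,608,000 pixels per second. The idealized quadratic model would predict a 9X reduction from 1.2 V to 0.4 V, close to the reported 8.3X; the model ignores leakage, which is one plausible source of the gap.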
better solve the coalescing problem, so that aggressive splitting can be used beforehand.

This coalescing performs well in an aggressive compiler. However, the high number of splits and the increased compilation time required are prohibitive for just-in-time (JIT) compilation. So, we devised a heuristic, called “permutation motion,” that is intended to be used with SSA-based splitting in place of our more aggressive coalescing in a JIT context.

All those results led us to promote a better register allocation scheme. While previous solutions gave mitigated results, our better coalescing allowed us to cleanly separate register allocation into two independent phases: first, spilling to reduce register pressure, possibly by splitting a lot; then, coloring the variables and performing coalescing to remove most of the added copies.
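The second of these phases, coloring combined with coalescing, can be sketched in a few lines of Python. This is a textbook-style toy and not the thesis’s algorithm: it merges copy-related variables whenever they do not interfere and then colors the resulting graph greedily. The function names, the graph representation and the greedy ordering are all assumptions, and the spilling phase is taken to have already run.

```python
def coalesce(interference, copies):
    """Merge copy-related variables that do not interfere.

    interference: dict mapping each variable to the set of variables
                  live at the same time (assumed symmetric).
    copies:       list of (dst, src) copy instructions.
    Returns (merged_graph, alias) where alias maps every original
    variable to the representative it was merged into.
    """
    graph = {v: set(n) for v, n in interference.items()}
    parent = {v: v for v in graph}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for dst, src in copies:
        keep, drop = find(dst), find(src)
        if keep == drop or drop in graph[keep]:
            continue  # already merged, or the two values interfere
        # Redirect every neighbour of 'drop' to 'keep'.
        for n in graph.pop(drop):
            graph[n].discard(drop)
            graph[n].add(keep)
            graph[keep].add(n)
        parent[drop] = keep
    return graph, {v: find(v) for v in parent}


def color(graph, k):
    """Greedy coloring with k registers, highest-degree nodes first."""
    assignment = {}
    for v in sorted(graph, key=lambda v: -len(graph[v])):
        used = {assignment[n] for n in graph[v] if n in assignment}
        free = [r for r in range(k) if r not in used]
        if not free:
            raise ValueError(f"no register left for {v}; more spilling needed")
        assignment[v] = free[0]
    return assignment
```

A copy whose source and destination end up in the same merged node needs no instruction at all, which is how coalescing removes copies introduced by splitting. Merging this eagerly can make a graph harder to color, which is one reason the choice of coalescing strategy matters.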
Upcoming Events
22nd International Conference for High Performance Computing, Networking, Storage and Analysis (SC’2009)
November 14–20, 2009, Portland, USA, http://staff.science.uva.nl/~delaat/sc09/
Asia and South Pacific Design Automation Conference 2010 (ASP-DAC 2010)
January 18-21, 2010, Taipei, Taiwan, http://www.asp-dac.itri.org.tw/aspdac2010/index.html
5th International Conference on High Performance and Embedded Architectures and Compilers (HiPEAC 2010)
January 25-27, 2010, Pisa, Italy, http://www.hipeac.net/conference
8th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2010)
April 24-28, 2010, Toronto, Ontario, Canada, http://www.cgo.org
Contributions
If you are a HiPEAC member and would like to contribute to future HiPEAC newsletters,
please contact Rainer Leupers at leupers@iss.rwth-aachen.de