Novel Architectures and Accelerators

Guest Editor's Introduction • George K. Thiruvathukal • February 2011

A Quick Look at the NVIDIA CUDA 3.2 Toolkit
This video is aimed at helping you get started with the NVIDIA CUDA 3.2 toolkit. If you own a computer with a recent NVIDIA graphics card, you can run parallel programs as if you owned your own private supercomputer, even if you have no immediate interest in writing code for this platform. In this video, I show how to get started with the toolkit and run the basic bandwidth test and an n-body simulation (of interest in astrophysics).
A Look at the Shuttle Multicore/GPU Development System
These clips show the NVIDIA GPU running the same n-body simulation as in the previous video, along with a quick look inside the barebones computer that I built using the parts on the NewEgg wish list linked to in the main article.


Theme Articles

This month's theme includes the following featured articles:

Moving Scientific Codes to Multicore Multiprocessor CPUs
A restructuring method for implementing numerical algorithms for scientific computing can help them run efficiently on the IBM Cell processor and other multicore CPUs. More »

Computing Models for FPGA-Based Accelerators
Several case studies demonstrate the effectiveness of using FPGA-based accelerators in molecular modeling. More »

QPACE: Quantum Chromodynamics Parallel Computing on the Cell Broadband Engine
Learn about the architecture of the QPACE (Quantum Chromodynamics Parallel Computing on the Cell Broadband Engine) project and the challenges arising from the PowerXCell 8i's multicore nature and the use of an FPGA for the network processor. More »

Joint Forces: From Multithreaded Programming to GPU Computing
This overview of how best to use GPU compute power explains GPU hardware concepts and programming principles and discusses performance optimization strategies. More »

The GPU Computing Era
Explore the rapid evolution of GPU architectures—from graphics processors to massively parallel many-core multiprocessors, recent developments in GPU computing architectures, and how the enthusiastic adoption of CPU+GPU coprocessing is accelerating parallel applications. More »

Parallel Option Pricing with BSDE Method on GPU
GPU acceleration can be used effectively with the backward stochastic differential equation (BSDE) method to solve option pricing problems. More »

What else is new? »


Novel Architectures and Accelerators

In late 2008, I had the pleasure of working with Steve Gottlieb and Volodymyr Kindratenko on a special issue of Computing in Science & Engineering magazine. Entitled "Novel Architectures," the issue discussed how to incorporate accelerators into one's arsenal of programming techniques. Having spent most of my life working on more "conventional" multiprocessing/multithreading techniques for concurrent, parallel, and distributed computing applications, I found accelerators a new area at the time, although the techniques were largely familiar from my early days in parallel computing. Many computers I used as a graduate student (photo) in the late 1980s (systems from Cray and Thinking Machines, among others) used the vector and data-parallel models.

Today, the computing community is evaluating several types of accelerators, most notably field-programmable gate arrays (FPGAs), graphics processing units for general-purpose computations (GPGPUs), Sony-Toshiba-IBM's Cell Broadband Engine, and the ClearSpeed attached processor. Several high-performance computing vendors offer systems that include accelerators as an integral part of their product lines—for example, the SGI reconfigurable application-specific computing architecture uses FPGAs, the Cray XT5h uses vector processors and FPGA accelerators, and IBM's hybrid system architecture uses a PowerXCell as a coprocessor. Several Beowulf PC clusters are reportedly outfitted with FPGAs, ClearSpeed, and GPGPU accelerators.

The jury is still out on which of these computational accelerator technologies will dominate the field, because each one brings to the table a different mix of benefits and challenges. However, it's undeniable that since we published our special issue, GPGPU computing has made a very strong push forward, owing largely to the commodity nature of graphics chipsets (available from NVIDIA). It's possible to start writing your own GPGPU programs using any Windows or Linux PC outfitted with an appropriate NVIDIA-based graphics card, not to mention the MacBook line, which has been employing NVIDIA graphics for a while now.
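To give a flavor of what writing your own GPGPU program involves, here is a minimal CUDA vector-addition sketch. It is not taken from the toolkit; the kernel and variable names are my own illustrative choices, and error checking is omitted for brevity. It shows the mixed-mode pattern typical of GPGPU code: allocate memory on the device, copy data over, launch a kernel across many threads, and copy the result back to the host.

```cuda
// Minimal CUDA vector addition -- an illustrative sketch, not toolkit code.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                 // one million elements
    size_t bytes = n * sizeof(float);

    // Host-side buffers.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = (float)i; h_b[i] = 2.0f * i; }

    // Device-side buffers: the developer manages GPU memory explicitly.
    float *d_a, *d_b, *d_c;
    cudaMalloc((void **)&d_a, bytes);
    cudaMalloc((void **)&d_b, bytes);
    cudaMalloc((void **)&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch one thread per element, 256 threads per block.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

    // Copy the result back and spot-check it.
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[42] = %f\n", h_c[42]);   // expect 126.0 on a CUDA-capable machine

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

Compiled with nvcc, this runs the addition across one thread per element; the explicit host/device memory traffic is exactly the "mixed-mode" bookkeeping that higher-level tools are trying to hide.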

Since our 2008 special issue, it has become possible for anyone to build a "novel computer." In my case, I just finished building a small form factor computer that includes the latest quad-core Intel Core i7 processor, an NVIDIA Fermi GPU, and up to 16 Gbytes of RAM. (Check out my public wish list at NewEgg if you want to buy parts and build your own; assembly time is under 40 minutes.) In addition, I'm using my computer to develop sensing applications using the uber-cool Phidgets kit. The vision of the desktop (or laptop) supercomputer is truly coming to life and can be had for the modest sum of just under US$1,300.

Selected Articles

This Computing Now theme starts with a few articles from the 2008 special issue of CiSE, chosen to represent three popular alternatives for accelerators: GPGPUs, FPGAs, and the Cell processor. I've also included three articles that have appeared in other IEEE Computer Society publications since then. Six articles can't do complete justice to this topic, so I've chosen articles likely to interest the widest possible audience, including those who are completely new to the topic, as I was just a few short years ago.

"Moving Scientific Codes to Multicore Multiprocessor CPUs" describes a restructuring method for implementing numerical algorithms for scientific computing that can help them run efficiently on the IBM Cell processor and other multicore CPUs.

"Computing Models for FPGA-Based Accelerators" describes the critical phase of FPGA application development, which involves finding and mapping to an appropriate computing model. These models differ from those generally used in programming existing commodity CPUs. For example, whereas parallel computing models are often based on thread execution and interaction, FPGA computing can exploit more degrees of freedom than are available in software (such as fine-grained parallelism and communication).

The Quantum Chromodynamics Parallel Computing on the Cell Broadband Engine project is developing a massively parallel, scalable supercomputer for applications in lattice quantum chromodynamics (QCD). This is an example of how accelerators are increasingly becoming a part of conventional cluster/supercomputer designs, especially for specialist computations with high-throughput computing needs such as QCD. Learn about this project in the article "QPACE: Quantum Chromodynamics Parallel Computing on the Cell Broadband Engine."

A recent article in IEEE Software, "Joint Forces: From Multithreaded Programming to GPU Computing," considers the need for best CPU-GPU software development practices. The authors argue that we must not only consider new programming models but also be well-versed in the (parallel) methods required to achieve true performance gains.

"The GPU Computing Era" from IEEE Micro describes how GPU computing is now at a tipping point. It's being employed in demanding consumer applications and high-performance computing alike.

Numerous conference papers explore the use of GPGPUs, FPGAs, and cell processors. In "Parallel Option Pricing with BSDE Method on GPU," we see a practical example of the use of accelerators in option pricing (an area outside of computational science that has always used parallel methods, mostly in secret).

You might also be interested in the full version of the original guest editors' introduction for our "Novel Architectures" issue of CiSE. And for an excellent primer on GPU programming, check out "Getting Started with GPU Programming" (login is required to access the full text of this article).


George K. Thiruvathukal

George K. Thiruvathukal is an associate editor for Computing Now and an associate editor in chief for Computing in Science & Engineering magazine. Contact George at or visit his page at




Article Comments


Christian Timmerer
Interesting theme. What are the challenges for now and the next, let's say, ten years (or even beyond)?
Posted on 2/11/11 5:26 AM.
Dejan Milojicic
How do described technologies relate to the Exascale efforts pursued by DoE, DoD, as well as international teams?
Posted on 2/11/11 6:46 AM.
George K. Thiruvathukal
The biggest near-term challenge is to make it easier for all of us to develop programs for these architectures. There are presently limitations on the memory available on the GPGPU itself, and the need for application developers to manage resources on both the host CPU and the GPGPU requires a mixed-mode programming model. Higher-level language support is still not quite there. So I see this as adding to the challenge we already face in making effective use of multicore, while simultaneously offering us a more deterministic concurrent/parallel model.
Posted on 2/17/11 12:23 PM in reply to Christian Timmerer.
George K. Thiruvathukal
At some level I see it as both complementary and synergistic. As these technologies become more integrated with mainstream processor designs (already in the works), the sheer number of data processors per node in a cluster can grow by orders of magnitude, allowing us to reach exascale faster, especially for the problems of interest (e.g., climate models). It may also help us get there in a more power-efficient way. The Shuttle computer I built draws the same power as the computer I owned last year, yet has hundreds of processing cores. :-)
Posted on 2/17/11 12:32 PM in reply to Dejan Milojicic.
Christian Timmerer
Multicore systems have been available for several years now; one might think there would be high-level language support for them by this point. Or have I missed something?
Posted on 2/18/11 4:17 AM in reply to George Thiruvathukal.

What's New

President's Message: Planning for an Inevitable Future
As we celebrate our 65th birthday in 2011, the question is whether the IEEE Computer Society is prepared for the next 65, or 25, or even 5 years. More »

Government Access to Private-Sector Data
Government appetites for private enterprise data are becoming ever more voracious. We examine how legislators can establish frameworks to protect citizens. More »

Privacy Is Dead: Long Live Surveillance Symmetry
Despite the protests of consumers and consumer advocates, privacy continues to be encroached upon. Learn about an unconventional approach to handling this loss of privacy. More »

Android: Changing the Mobile Landscape
Experts predict the Android phone's attractive features—open source, several handset choices, multiple phone service options, and an open developer market—will push it past the iPhone in the near future. More »

Rethinking Digital Design: Why Design Must Change
A new approach to chip design creates chip generators—domain-specific templates that codify design knowledge and trade-offs—rather than chip instances to support cost-effective customization. More »

Are You Having Fun Yet?
Circuit designers in the "More than Moore" era are focusing on new problems and letting CAD programs solve those of the last generation. More »

The Promise and Peril of Social Computing
Social computing has the potential to fundamentally change the structure of human relationships. Will it succeed? More »

AR-Immersive Cinema at the Aula Natura Visitors Center
A novel augmented reality application for museum exhibitions lets users immerse themselves as fictional characters in historical scenes. More »

Live Theater on a Virtual Stage: Incorporating Soft Skills and Teamwork in Computer Graphics Education
An interdisciplinary set of courses focused on the production of theatrical performances in a virtual space provides collaborative experiences for students. More »

For Students By Students
Need to update laboratory instrumentation and courseware? Constrained by a tight budget? This article might be for you. More »

AI and Security Informatics
Facing the critical mission of international security, we must develop the science of security informatics. Intelligent systems have much to contribute in this emerging field. More »

Development Platforms for Mobile Applications: Status and Trends
A comparison of four popular runtime environments clarifies the options available for developing applications for resource-constrained mobile devices. More »

Open Source for Enterprise Geographic Information Systems
Noblis engineers describe their implementation of an end-to-end GIS using only open source components. More »