Researchers of the CRISP (Cutting edge Reconfigurable ICs for Stream Processing) consortium, a team of four companies and two universities in the Netherlands, Germany and Finland, demonstrate a self-testing and self-repairing chip at the DATE2011 conference in Grenoble. To attain the goal of self-repair, CRISP developed new concepts for run-time resource management: while in operation, the chip tests its cores and connections, and a resource manager dynamically assigns the chip's tasks to fault-free parts.

 

In itself, the downsizing of chip technology is good news, as it allows, for example, our mobile phones to become ever more powerful. The downside of extreme downscaling is that processes are starting to run into physical limitations, resulting in lower production yields and earlier breakdown of functional chips. “Because of the rapidly growing transistor density on chips, ensuring high system dependability has become a real challenge”, says Hans Kerkhoff, Associate Professor, CTIT, University of Twente.

 

Addressing the question of how to make future miniature chips more reliable rather than less robust, the CRISP consortium researched how chips can test and repair themselves. The method combines a test for faulty components and connections on the chip with a run-time resource manager that assigns tasks and communication channels to ‘known-good’ components and pathways. This allows multi-core chips with a few faulty cores to pass the production test, since they remain 100% functional without any compromise to reliability.

 

How can chips remain fully functional whilst having faulty components? “The solution is not to make non-degradable chips, but to make architectures that can degrade while they keep functioning, a process called graceful degradation. With the right dependability infrastructure multi-cores can be a solution”, says Hans Kerkhoff. The chips have many cores, each performing subtasks of a more complex application; satellite navigation, for instance, comprises many digital signal processing tasks. A run-time resource manager dynamically assigns tasks to cores. Because it does not matter which core does what, cores can swap tasks and take over the work of failing cores, so the chip can repair itself and extend its own lifetime. Bart Vermeulen, Senior Principal Scientist, NXP: “Combining testing for faulty components and a run-time resource manager forms the heart of a flexible reconfigurable chip that can handle changing tasks and failing components during its entire active lifetime”. The resource manager continuously determines the chip's optimum Quality of Service on fault-free components.
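The reassignment idea described above can be illustrated with a small sketch. It is not the CRISP resource manager: the class name, the method names, and the task names are hypothetical, and the sketch assumes that any task can run on any fault-free core and that a self-test reports faulty cores one at a time.

```python
class ResourceManager:
    """Toy run-time resource manager (hypothetical, not the CRISP design)."""

    def __init__(self, num_cores):
        # Cores that have passed the most recent self-test.
        self.healthy = set(range(num_cores))
        # Current mapping: task name -> core id.
        self.assignment = {}

    def free_cores(self):
        """Return fault-free cores that currently run no task, lowest id first."""
        busy = set(self.assignment.values())
        return sorted(self.healthy - busy)

    def assign(self, task):
        """Place a task on the first free, fault-free core."""
        free = self.free_cores()
        if not free:
            raise RuntimeError(f"no fault-free core available for {task!r}")
        self.assignment[task] = free[0]
        return free[0]

    def report_fault(self, core):
        """Mark a core as faulty and move its tasks to fault-free cores."""
        self.healthy.discard(core)
        displaced = [t for t, c in self.assignment.items() if c == core]
        for task in displaced:
            del self.assignment[task]
            self.assign(task)


# Hypothetical signal-processing subtasks on a four-core chip.
manager = ResourceManager(num_cores=4)
for name in ("acquire", "track", "decode"):
    manager.assign(name)

manager.report_fault(1)  # the self-test flags core 1 as faulty
print(manager.assignment)
```

In this run the task placed on core 1 moves to the spare core 3, while the other two tasks stay where they are. If no fault-free core is left, the sketch simply raises an error; a real manager would instead have to trade off Quality of Service among the remaining tasks.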

 

The resource manager works during the entire lifetime of the chip to keep it up and running. Its primary function is to dynamically assign new tasks to free resources. This makes it possible to truly benefit from the huge processing power of many-core chips and creates the much-desired flexibility to adapt to new tasks and standards during the functional life of the chip.