

# Project Acronym: Fun-COMP

<u>Project Title:</u> **Functionally scaled computing technology:** From novel devices to non-von Neumann architectures and algorithms for a connected intelligent world

# WP5 Dissemination and Exploitation (WP Leader UOXF)

# Deliverable D5.2: Public summary of mid-project results

Deliverable ID: D5.2

Deliverable title: Public summary of mid-project results

Revision level: FINAL

Partner(s) responsible: UOXF

Contributors: UNEXE (David Wright) and All Partners

Dissemination level: PU<sup>1</sup>

<sup>1</sup> CO: Confidential, only for members of the Fun-COMP consortium (including the Commission Services); PU: Public.


# 1. Introduction and background

The Fun-COMP project is developing a new wave of industry-relevant technologies that will extend the limits facing mainstream processing and storage approaches. We are doing this by delivering innovative nanoelectronic and nanophotonic devices and systems that fuse together the core information processing tasks of computing and memory, that incorporate in hardware the ability to learn, adapt and evolve, and that are designed from the bottom up to take advantage of the huge benefits, in terms of increased speed/bandwidth and reduced power consumption, promised by the emergence of silicon photonic systems.

We are also developing basic information processing building blocks that draw inspiration from biological approaches, providing computing primitives that can mimic the essential features of brain-like synapses and neurons to deliver a new foundation for fast, low-power, functionally-scaled computing. We are combining such computing primitives into integrated processing networks that can implement in hardware novel, intelligent, self-learning and adaptive computational approaches (including spiking neural networks, computing-in-memory and autonomous reservoir computing) capable of addressing complex real-world computational problems in fast, energy-efficient ways.

Put simply, Fun-COMP's technological ambitions are to:

- Break the von Neumann bottleneck by developing innovative beyond-CMOS integrated photonic/optoelectronic devices and systems that provide simultaneous memory and processing functionality
- Develop integrated photonic/optoelectronic devices and systems that incorporate computationally-realistic mimics of biological neurons and synapses, leveraging the performance advantages of brain-like processing, but at speeds up to a million times faster
- Develop the world's first integrated photonic/optoelectronic self-learning reservoir computer
- Develop the world's first integrated photonic/optoelectronic memcomputer

The pursuit and achievement of such technological ambitions will help to ensure that European industry can

- Play a leading role in the development of the next wave of information processing and storage technologies
- Be a leading player in the ever-growing silicon photonics technology arena
- Maintain and even increase its long-term capacity for advanced next-generation micro/nanoelectronic and micro/nanophotonic device design, development and manufacture

Our ambitions also match exceedingly well to the strategic interests of key industries, promising innovations that will expedite the ultimate development of important products and services in the areas of

- Cognitive computing
- Machine learning and artificial intelligence
- Image processing and data analysis
- Advanced beyond-CMOS device design and fabrication

In this report, we provide a short public summary of some of the most important results obtained during the first 24 months of the project.

# 2. Results and discussion

## 2.1 Binary and multilevel non-volatile memories

In **Figure 1** we show a schematic of the operating principle of the basic Fun-COMP non-von Neumann (N-vN) unit cell, along with a typical example of an as-fabricated SiN (Si<sub>3</sub>N<sub>4</sub>) device. Figure 1(a) shows typical write (amorphization) and erase (crystallization) pulses being sent down an optical waveguide that has a phase-change cell deposited on top of it (here the well-known alloy Ge<sub>2</sub>Sb<sub>2</sub>Te<sub>5</sub>, or GST for short). It also shows, schematically, the effect of amorphization and crystallization of the GST cell on the resulting optical transmission through the waveguide, as sensed by appropriate readout pulses. Figure 1(b) shows an optical microscope image of the waveguide and its couplers (for getting light in and out of the waveguide), along with a close-up of the phase-change cell deposited on top of the waveguide. Figure 1(c) shows a cross-sectional schematic of the device structure. (Devices were fabricated and tested by the Fun-COMP partners at the Universities of Oxford and Muenster.)



**Figure 1:** (a) Schematic of the basic N-vN unit cell device, operating in binary memory mode. (b) Optical microscope image of (left) as-fabricated N-vN unit cell, here using SiN rib waveguides, and (right) close-up of GST cell deposited on top of the waveguide. (c) Schematic of cross-section of device.

The capabilities of devices of the type shown above to deliver both binary and multilevel memory functionality were investigated in detail in the first half of the Fun-COMP project. A particularly effective approach to achieving multilevel storage was found to be the use of a double-step programming pulse, as shown in **Figure 2**(a). In this approach, the first, high-power part of the pulse is designed to amorphize any crystalline regions in the GST cell, while the second, lower-power part induces a controllable re-crystallization, with the fraction of crystallized material in the cell dependent on the duration of that lower-power segment. Figure 2(b) illustrates how the programmed state of the cell, and hence its optical transmission during readout, depends on the duration (width) of the second part of the double-step pulse (here the pulse comprises a 50 ns high-power segment followed by a low-power segment of varying width). Figure 2(c) shows the accumulated transmission statistics of 20 unique transmission levels after the application of 200 programming pulses with randomly assigned step widths; the standard deviation of the error (inset) is only 0.482% in this case. Interestingly, as exemplified by the results shown in Fig. 2(d), the multilevel states of this type of all-optical phase-change memory do not seem to suffer from the deleterious level drift effects that are observed in electrical phase-change memories. The lack of drift is a big advantage, since it makes the detection of multilevel states much easier in the optical domain than in the electrical domain.



**Figure 2:** (a) Schematic of the double-step type pulse approach for achieving multilevel memory functionality. (b) Experimental demonstration (using a SiN-based device) of unique programming of the optical transmission state of the N-vN cell by control of the duration of low-power part of the double-step pulse. (c) Experimental demonstration of the programming of 20 different transmission states (corresponding to > 4 bits per cell). (d) Readout of various multilevel states over long time periods (the break in the measurement around 350 seconds comprises a duration of around 2 hours), in which, contrary to the electrical memory case, there appears to be little drift in levels with time.
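The relationship between the number of distinguishable transmission levels and the storage capacity per cell can be made concrete with a short sketch. The Python below is illustrative only: the evenly spaced, normalised transmission levels and the nearest-level decoder are assumptions made for the example, not a description of the readout scheme used in the experiments.

```python
import math
from typing import List

def bits_per_cell(levels: int) -> float:
    """Information capacity of one cell with `levels` distinguishable states."""
    return math.log2(levels)

def make_levels(t_min: float, t_max: float, levels: int) -> List[float]:
    """Evenly spaced nominal transmission levels (an assumption for illustration)."""
    step = (t_max - t_min) / (levels - 1)
    return [t_min + i * step for i in range(levels)]

def decode(transmission: float, nominal: List[float]) -> int:
    """Return the index of the nominal level closest to a measured transmission."""
    return min(range(len(nominal)), key=lambda i: abs(nominal[i] - transmission))

# Hypothetical normalised transmission range [0, 1] split into 20 levels.
nominal = make_levels(0.0, 1.0, 20)
print(f"{bits_per_cell(20):.2f} bits per cell")  # 4.32 bits per cell
print(decode(0.27, nominal))                     # 5
```

With 20 levels the capacity is log2(20) ≈ 4.32 bits, which is why the caption quotes "> 4 bits per cell". A low readout error and the absence of drift matter because the decoder only works if each measured value stays closer to its own nominal level than to its neighbours.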

Having successfully demonstrated binary and multilevel memory operation using SiN waveguide devices, designs were transferred to the Silicon-on-Insulator (SOI) platform that is more widely used in industry and with which large-scale photonic devices can be fabricated by the Fun-COMP partner IMEC. A typical Si-waveguide device is shown in **Figure 3**. A schematic of the device and its cross-sectional structure are shown in Fig. 3(a) and (b) respectively, while Fig. 3(c) and (d) show an as-fabricated device. Figure 3(e) shows the transmission spectrum for the device of Fig. 3(c) with the GST cell in the fully crystalline and fully amorphous states (and with no GST cell at all); it is clear that a substantial contrast exists between states, which can be used to provide memory and processing functionality. Indeed, in Figure 4(a) we show the achievement of multilevel storage in an SOI device, here again using the double-step pulse with variable duration of the low-power segment to achieve the multilevel states. Although Si-waveguide-based devices have several attractive features compared with SiN devices, including a smaller device footprint, faster heating/cooling times (due largely to the higher thermal conductivity of Si cf. SiN) and compatibility with industrial fab lines, they have the drawback of requiring more energy to achieve a given change in optical transmission. This is shown in Fig. 4(b), where we compare the energy consumption of silicon and SiN devices.

In all the results discussed above, we have essentially considered single isolated devices. Of course, the aim is to bring such devices together to make complete systems. In terms of memory, we have achieved this in the first half of the Fun-COMP project by successfully designing, fabricating and testing a fully integrated 256 cell all-optical memory with 512 bit capacity – see **Figure 5**.



**Figure 3:** (a) 3D overview and (b) cross sectional view of a silicon waveguide phase change N-vN unit cell. (c) Optical image of an as-fabricated device. (d) SEM image of the GST cell deposited on the top of the waveguide (with a tilted 60° viewing angle). (e) Normalized transmission spectrum of a typical photonic device without GST, with as-deposited amorphous GST, and with annealed crystalline GST



**Figure 4:** (a) Programming of multilevel memory states into a Si-waveguide based N-vN device. (b) Comparison of the energy consumption (pulse energy vs. resulting change in optical transmission) of SiN and Si type waveguides. The length of the GST cell is $4\,\mu m$ on the silicon device and $5\,\mu m$ on the SiN device, with pulse widths from 20 ns to 100 ns. Every data point is averaged from 300 repeated measurements.




**Figure 5:** (a) Sketch describing the operation principle of the photonic memory. Several PCM cells are combined into rows and can be addressed through wavelength division multiplexing. (b) Close-up of a single memory cell indicating the important design parameters. (c) Optical micrograph of a photonic matrix memory with 16x16 memory cells. (d) Scanning-electron micrograph of a single memory cell within the array.


Further information on the above topics can be found in the following open-access journal publications:

Li X, Youngblood N, Ríos C, Cheng Z, Wright CD, Pernice WH, Bhaskaran H. (2019) Fast and reliable storage using a 5 bit, nonvolatile photonic memory cell, Optica, volume 6, no. 1, pages 1-6, DOI:10.1364/OPTICA.6.000001

Li X, Youngblood N, Cheng Z, Carrillo SG-C, Gemo E, Pernice WHP, Wright CD, Bhaskaran H. (2020) Experimental investigation of silicon and silicon nitride platforms for phase-change photonic in-memory computing, Optica, volume 7, no. 3, page 218, DOI:10.1364/optica.379228

Feldmann J, Youngblood N, Li X, Wright CD, Bhaskaran H, Pernice WHP. (2020) Integrated 256 Cell Photonic Phase-Change Memory With 512-Bit Capacity, IEEE Journal of Selected Topics in Quantum Electronics, volume 26, no. 2, pages 1-7, DOI:10.1109/jstqe.2019.2956871

Gemo E, Carrillo SGC, Degalarreta CR, Baldycheva A, Hayat H, Youngblood N, Bhaskaran H, Pernice WHP, Wright CD. (2019) Plasmonically-enhanced all-optical integrated phase-change memory, Optics Express, volume 27, pages 24724-24738, article no. 17, DOI:10.1364/OE.27.024724

Farmakidis N, Youngblood N, Li X, Tan J, Swett JL, Cheng Z, Wright CD, Pernice WHP, Bhaskaran H. (2019) Plasmonic nanogap enhanced phase-change devices with dual electrical-optical functionality, Science Advances, volume 5, no. 11, article eaaw2687, DOI:10.1126/sciadv.aaw2687

## 2.2 Arithmetic functionality

The ability to access multilevel states in the phase-change photonic cell, as shown in section 2.1 above, means that we can do much more than simply store data. We can also carry out arithmetic processing, i.e. we can add, multiply, subtract and divide numbers. An example is shown in **Figure 6**(a), here using a straight rib waveguide (of the kind shown in Figure 1(a)) to carry out the direct base-10 addition of 6 + 6. Initially, the phase-change cell is in the amorphous phase, which represents the number zero. Access to the multilevel states is obtained, in this case, by applying a sequence of (groups of) identical excitation pulses, each of which sets the cell to a pre-determined crystal fraction (and so to a pre-determined waveguide transmission). For base-10 operation, the power and duration of the pulses are chosen such that it requires 10 (groups of) them to fully re-crystallize the cell from its starting amorphous state. Thus, to carry out the addition of 6 + 6, six (groups of) pulses are sent into the waveguide, which sets the phase-change cell to level six. Then, the second summand is added by sending in six more (groups of) pulses. On reaching the tenth level, the cell is reset to level zero before the rest of the input sequence is applied. To register the carry-over of 10, a second phase-change cell is used to represent the 'tens', and this second cell is set to level 1 during the resetting (to zero) of the first cell. Thus, at the end of the calculation, the first cell representing the 'ones' is at level 2, while the second cell is at level 1, revealing the expected answer of 12. The whole process can be thought of in terms of an analogy with an abacus, as also shown in Figure 6.

The addition process above can also be used to implement multiplication (by successive addition), subtraction (by using the method of complements to convert it to addition) and division (by successive subtraction). Moreover, by using multiple cells, with each cell representing successive powers of the base, we can easily represent very large or very small numbers. An important point to note is that the result of the calculation is stored in the self-same device that carried out the calculation: processing and memory functions are thus merged, removing the well-known von Neumann bottleneck that plagues conventional computers.
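The abacus-like scheme described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: each call to `pulse()` stands in for one (group of) optical pulse(s) that advances the cell by exactly one level, and the class and function names are hypothetical, chosen only for this illustration.

```python
class PhaseChangeCell:
    """Toy multilevel cell: level 0 is amorphous; reaching `base` triggers a reset."""

    def __init__(self, base=10):
        self.base = base
        self.level = 0

    def pulse(self):
        """Advance one level; return True if the cell was reset and a carry is due."""
        self.level += 1
        if self.level == self.base:
            self.level = 0  # reset (re-amorphize) the cell
            return True
        return False


class PhotonicAbacus:
    """A row of cells; cells[0] holds the 'ones', cells[1] the 'tens', and so on."""

    def __init__(self, n_cells=4, base=10):
        self.base = base
        self.cells = [PhaseChangeCell(base) for _ in range(n_cells)]

    def add(self, n):
        """Add n by sending n pulses to the first cell, propagating carries."""
        for _ in range(n):
            i = 0
            # A carry out of the last cell is silently dropped in this sketch.
            while i < len(self.cells) and self.cells[i].pulse():
                i += 1

    def value(self):
        return sum(c.level * self.base ** i for i, c in enumerate(self.cells))


def multiply(a, b, n_cells=4, base=10):
    """Multiplication by successive addition: add `a` to the abacus `b` times."""
    acc = PhotonicAbacus(n_cells, base)
    for _ in range(b):
        acc.add(a)
    return acc.value()


abacus = PhotonicAbacus()
abacus.add(6)
abacus.add(6)
print([c.level for c in abacus.cells], abacus.value())  # [2, 1, 0, 0] 12
print(multiply(7, 8))                                   # 56
```

Note that the result is read directly from the cell levels: the same objects that perform the addition also hold the answer, mirroring the merging of processing and memory described in the text.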

Another arithmetic process that phase-change photonic devices are particularly well-suited to carrying out is that of matrix-vector multiplication. Matrix-vector (MV) multiplication is a key operation underpinning much of modern 'data science', from image processing to machine learning, data analytics etc. At the heart of MV operations is the scalar multiplication,  $c = a \times b$ . Rather than carrying out this multiplication using sequential addition as described in previous paragraphs, we can instead perform the multiplication directly using a single phase-change cell. We do this by coding the multiplier, a, into the transmission state of the cell (i.e. by setting the cell to a particular multilevel state) while the multiplicand, b, is coded into the (optical) power,  $P_{in}$ , of the readout pulse. The result of the multiplication, c, is thus calculated directly and appears as the power,  $P_{out}$ , of the readout signal (see Figure 6(b)). By using multiple cells and appropriate integrated photonic circuitry, it is relatively straightforward to extend this approach to deliver direct MV multiplication.

An experimental implementation of the multiplication of a  $[2\times1]$  matrix by a  $[1\times2]$  vector carried out in Fun-COMP is shown in Figure 6(c). Scaling up to larger MV operations is eminently possible with suitable architectures. A major advantage of using phase-change cells to store the matrix elements is that, in applications where the same matrix elements are used repeatedly (e.g. in convolution-based processing), the matrix values need to be programmed only once (since the cells are non-volatile). Thereafter, the MV multiplication can be carried out extremely quickly (using short, wavelength-division-multiplexed optical pulses) and with a very small energy budget.



**Figure 6.** Arithmetic processing. (a) Direct addition of two base-10 numbers (using two photonic phase-change cells) in a system analogous to the abacus. (b) Direct multiplication using (left) a single photonic phase-change cell and (right) extension to matrix-vector multiplication using multiple cells. (c) Experimental implementation of a  $[2\times1]\times[1\times2]$  MV multiplier.
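The direct multiplication scheme, $P_{out} = a \times P_{in}$ with $a$ stored as the cell transmission, and its extension to matrix-vector products can be illustrated numerically. The sketch below is idealised: it assumes a normalised transmission between 0 and 1, ignores insertion losses, and assumes the outputs of the cells in a row simply add in power. The function names and the numerical values are hypothetical.

```python
def scalar_multiply(transmission, p_in):
    """c = a * b, with a stored as the cell transmission and b encoded as input power."""
    if not 0.0 <= transmission <= 1.0:
        raise ValueError("a passive cell can only encode a value in [0, 1]")
    return transmission * p_in


def matvec(transmissions, p_in):
    """Each row of cells weights the input powers; the outputs of a row are summed."""
    return [
        sum(scalar_multiply(t, p) for t, p in zip(row, p_in))
        for row in transmissions
    ]


# Matrix elements programmed once into the (non-volatile) cells.
T = [[0.5, 0.25],
     [1.0, 0.0]]

# Input vectors are encoded as readout pulse powers (arbitrary units).
print(matvec(T, [2.0, 4.0]))  # [2.0, 2.0]
print(matvec(T, [4.0, 4.0]))  # [3.0, 4.0]
```

The matrix `T` is written once and then reused for every input vector, which is the property that makes the approach attractive when the same matrix is applied repeatedly.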

For further information on the arithmetic processing capabilities of Fun-COMP N-vN devices, see the open-access journal publication below:

Ríos C, Youngblood N, Cheng Z, Le Gallo M, Pernice WHP, Wright CD, Sebastian A, Bhaskaran H. (2019) In-memory computing on a photonic platform, *Sci Adv*, volume 5, no. 2, DOI:10.1126/sciadv.aau5759

## 2.3 Neuromorphic processing

As a final example of the potential of the integrated phase-change photonic devices that we are developing in Fun-COMP, we point out their ability to provide all-optical hardware mimics of brain synapses and neurons. A synapse can be thought of as providing a simple weighting operation between neurons. Thus, synaptic functionality can readily be provided by the straight rib waveguide device described above for use as a multilevel memory (with the multilevel state mapped directly to the synaptic weight). At the system level, the outputs from multiple synapses can be combined (added) and input to a neuron using the wavelength division multiplexing capabilities of another common integrated photonic device, the microring resonator (see **Figure 7**).

To mimic the operation of a neuron, we also use a microring resonator, but in this case we deposit on top of it its own integrated phase-change cell that can be switched by the incoming combined pulses from all its synaptic connections. Switching the neuronal cell in turn changes the optical resonance condition of the microring. Thus, in the scheme we have developed in Fun-COMP, and shown in Figure 7, when the neuronal cell is in the crystalline state, a probe pulse sent along the microring's 'output' waveguide couples strongly into the ring resonator and so no output pulse (neuronal spike) will be observed. However, if the instantaneous combined power of the pulses from all the synapses connected to the neuron is high enough to switch the neuronal cell to its amorphous state, the probe pulse is no longer on resonance with the microring and will be transmitted past the ring (i.e. the light pulse will mostly continue along the coupling 'output' waveguide), generating the neuron's output spike. Thus, the system naturally emulates the basic integrate-and-fire functionality of a biological neuron.
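The threshold behaviour described above can be captured in a very simple behavioural sketch. The Python below is an idealisation, not a device model: it assumes the probe is either fully blocked (crystalline neuronal cell, probe on resonance) or fully transmitted (amorphous cell, probe off resonance), and the weights, input powers and threshold are hypothetical values in arbitrary units.

```python
def neuron_output(synaptic_weights, input_powers, threshold, probe_power=1.0):
    """Return the probe power transmitted past the ring (the output spike), or 0.0.

    The synapses weight the input pulses, the multiplexer sums them, and the
    neuronal phase-change cell switches to amorphous only if the instantaneous
    combined power reaches the switching threshold.
    """
    combined = sum(w * p for w, p in zip(synaptic_weights, input_powers))
    switched_to_amorphous = combined >= threshold
    return probe_power if switched_to_amorphous else 0.0


weights = [0.9, 0.6, 0.3, 0.2]  # four synapses feeding one neuron
threshold = 1.7                 # hypothetical switching threshold

print(neuron_output(weights, [1, 1, 0, 0], threshold))  # 0.0 (no spike)
print(neuron_output(weights, [1, 1, 1, 0], threshold))  # 1.0 (spike)
```

Learning in such a system amounts to adjusting the entries of `weights`, i.e. reprogramming the multilevel state of the synaptic cells, so that only the desired input patterns push the combined power over the threshold.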

Within Fun-COMP we have successfully fabricated neuromorphic systems of the type shown in Fig. 7(a) (see Fig. 7(b)) and used them to carry out the archetypal AI task of pattern recognition using both supervised and unsupervised learning.



**Figure 7.** All-optical neuromorphic systems. (a) Schematic of the all-optical neuromorphic system implemented in Fun-COMP (top) and schematics of the constituent components (bottom) including (left to right) the synapses, multiplexer for summing outputs from all synapses and the phase-change (PCM) cell and microring resonator used to implement the neuron mimic. (b) A fabricated neuromorphic device, here with 4 synapses and a single neuron.

For further information on the neuromorphic processing capabilities of Fun-COMP N-vN devices, see the open-access journal publication below:

Feldmann J, Youngblood N, Wright CD, Bhaskaran H, Pernice WHP. (2019) <u>All-optical spiking neurosynaptic networks with self-learning capabilities</u>, *Nature*, volume 569, pages 208-214, DOI:10.1038/s41586-019-1157-8

## 2.4 Spiking nanolasers

Within Fun-COMP we (partners Thales and C2N) are also developing novel spiking nanolasers that can be used for neuromorphic processing applications. These spiking nanolasers can be based on 1D photonic crystal (PhC) nanobeam cavities embedding quantum wells emitting at around $1.55\,\mu m$. The cavities are positioned on top of a Silicon-on-Insulator (SOI) waveguide to enable evanescent coupling of the emitted light and are fully encapsulated in SiO<sub>2</sub> to improve heat-sinking. Here, the design concept of the excitable laser relies on the introduction of an unpumped saturable absorber section. This is achieved by injecting carriers only in a small selected region. The fabrication of a first generation of InP-on-SOI optically pumped nanolasers was successfully carried out at CNRS-C2N; a typical fabricated device is shown in **Figure 8**(a) and (b).



**Figure 8**: SEM pictures of the spiking nanolaser devices (a) after wafer bonding and (b) after ultra-high resolution e-beam lithography and plasma etching. (c) COMSOL FEM thermal simulation. (d) Characteristic laser curve and fitted model (dotted line) for a typical nanolaser.

In addition to the approach described above, we are also investigating in Fun-COMP a second method to deliver spiking nanolasers, one that exploits coupled photonic crystal (PhC) cavities. Coupled nanocavities in the gain/loss regime are good candidates for self-pulsing, neuron-like spiking photonic systems. This has been theoretically demonstrated in the past by the C2N-CNRS group, where one pumped laser nanocavity evanescently coupled to an unpumped one was considered [A. M. Yacomotti et al, Phys. Rev. A 87, 041804(R) (2013)]. Recently it has been realized that these kinds of gain/loss coupled cavities belong to a very general class of photonic systems called Parity-Time (PT)-symmetric systems [L. Feng, et al., Nat. Photon. 11, 752 (2017)]. In these systems, light can either be steady-state localized in the gain cavity (broken PT symmetry), or delocalized in the two cavities (unbroken PT symmetry). However, dynamical regimes (such as spiking behaviour) have not been investigated so far. During this first 24-month period of Fun-COMP we have therefore performed a thorough theoretical study of two coupled cavities in a gain/loss configuration, and showed that there are generic self-sustained oscillating features close to the so-called exceptional points (EPs) of the PT-symmetric system.
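The notions of broken and unbroken PT symmetry and of an exceptional point can be illustrated with the textbook linear model of two coupled cavities with balanced gain and loss. This is not the nonlinear laser model studied in the project; it is a minimal sketch in which the symbols `omega0` (cavity frequency), `kappa` (coupling rate) and `gamma` (gain/loss rate) are assumptions introduced for the example. For the matrix $H = \begin{pmatrix} \omega_0 + i\gamma & \kappa \\ \kappa & \omega_0 - i\gamma \end{pmatrix}$ the eigenfrequencies are $\omega_\pm = \omega_0 \pm \sqrt{\kappa^2 - \gamma^2}$.

```python
import cmath


def pt_dimer_eigenfrequencies(omega0, kappa, gamma):
    """Eigenvalues of H = [[omega0 + 1j*gamma, kappa], [kappa, omega0 - 1j*gamma]]."""
    split = cmath.sqrt(kappa ** 2 - gamma ** 2)
    return omega0 + split, omega0 - split


kappa = 1.0  # coupling rate (arbitrary units)
for gamma in (0.6, 1.0, 1.25):
    w_plus, w_minus = pt_dimer_eigenfrequencies(0.0, kappa, gamma)
    if gamma < kappa:
        regime = "unbroken PT symmetry: real splitting, light shared by both cavities"
    elif gamma == kappa:
        regime = "exceptional point: the two eigenfrequencies coalesce"
    else:
        regime = "broken PT symmetry: one growing and one decaying mode"
    print(gamma, w_plus, w_minus, regime)
```

Below the exceptional point the two eigenfrequencies are real and distinct; above it they form a complex-conjugate pair, so one mode is amplified (localised mainly in the gain cavity) and the other is damped. The self-sustained oscillations reported in the project arise close to this transition, where a full description requires the nonlinear gain dynamics that this linear sketch omits.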

## 2.5 Enhanced self-learning reservoir computing

Another focus for Fun-COMP is what is known as *Reservoir Computing* (RC). An RC system has three basic parts: an input layer that couples the input signal into the system, a non-linear dynamical system that makes up the "reservoir", and an output layer that linearly combines the internal variables (the reservoir or neuron states) to provide the time-dependent output signal (see **Figure 9**). A key advantage of reservoir computing stems from the fact that the interconnection weights in the reservoir are randomly initialised and do not need to be trained. The only free parameters that require training are the weights of the linear output layer, and such training is relatively straightforward. This is in stark contrast with conventional recurrent artificial neural networks, which rely on complicated techniques like back-propagation-through-time that are often difficult to get to converge (and can take a long time and use considerable amounts of energy). RC systems are thus fast and computationally efficient, and have shown state-of-the-art performance on a number of benchmark tasks. They can be used to solve a range of classification and other important computational tasks, such as speech recognition, nonlinear channel equalization, robotic control, time series prediction, financial forecasting and handwriting recognition (see Fig. 9).



**Figure 9:** Concept of a reservoir computer (left) and schematic of feature classification in the reservoir computer by transformation to higher-order dimension (right).
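The division of labour in a reservoir computer, a fixed random reservoir plus a trained linear readout, can be illustrated with a small software example. The sketch below is a generic echo-state-style reservoir written in plain Python, not a model of the photonic hardware: the `tanh` nodes, the reservoir size, the weight scaling, the ridge parameter and the sine-wave prediction task are all assumptions chosen for illustration.

```python
import math
import random


def run_reservoir(inputs, n_nodes=20, seed=0):
    """Drive a fixed random reservoir with `inputs`; return one state vector per step."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-1, 1) for _ in range(n_nodes)]
    # Untrained recurrent weights, scaled so that each row's absolute sum is below 1,
    # which keeps the dynamics contractive (the reservoir forgets old inputs).
    w_res = [[rng.uniform(-1, 1) * 0.9 / n_nodes for _ in range(n_nodes)]
             for _ in range(n_nodes)]
    state = [0.0] * n_nodes
    states = []
    for u in inputs:
        state = [math.tanh(w_in[i] * u
                           + sum(w_res[i][j] * state[j] for j in range(n_nodes)))
                 for i in range(n_nodes)]
        states.append(state)
    return states


def train_readout(states, targets, ridge=1e-4):
    """Ridge regression: solve (S^T S + ridge*I) w = S^T y by Gauss-Jordan elimination."""
    n = len(states[0])
    a = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
          for j in range(n)]
         + [sum(s[i] * y for s, y in zip(states, targets))]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def predict(states, weights):
    return [sum(w * s for w, s in zip(weights, state)) for state in states]


def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Task: predict the next sample of a sine wave from the current one.
signal = [math.sin(0.2 * t) for t in range(301)]
inputs, targets = signal[:-1], signal[1:]

states = run_reservoir(inputs)              # the reservoir itself is never trained
readout = train_readout(states, targets)    # only the output weights are fitted
trained_mse = mse(predict(states, readout), targets)
baseline_mse = mse([0.0] * len(targets), targets)
print(f"readout error {trained_mse:.3g} vs. zero-output baseline {baseline_mse:.3g}")
```

Training reduces to solving one small linear system, which is why it is so much cheaper than back-propagation-through-time. In the self-learning reservoirs targeted by Fun-COMP, some internal reservoir properties would additionally adapt to the input, which this fixed-reservoir sketch does not attempt to model.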

Within Fun-COMP, the aim is to enhance the conventional approaches to reservoir computing by using the non-volatile memory functionality of the N-vN devices described in Sec. 2.1 to build a reservoir that adapts to its input in a plastic way. This makes it possible for properly designed reservoirs, namely self-learning reservoirs, to automatically optimize some of their internal properties to carry out a given task, while in traditional RC only the readout stage is optimized.

Work on enhanced reservoir computing has only been running for 6 months at the time of writing this report, but in this initial period the project team has:

- A. Created a modular and dynamical numerical model of a silicon ring resonator (devices similar to those shown in Fig. 5(d)) with an integrated GST cell on top (in Fun-COMP we call this the N-vN extended unit cell); see **Figure 10**.
- B. Designed ring resonators on an SOI chip to experimentally validate the numerical model and derive realistic model parameters.
- C. Investigated the use of ring resonators as neurons and synapses in a plastic spiking neural network (SNN).
- D. Begun numerical modelling of a first approach to a self-learning photonic network.



**Figure 10**: The developed dynamical model of a silicon ring resonator with GST layer (*right*) is composed of a silicon RR dynamical model, without GST (*left*), and of a behavioural model of a silicon waveguide covered with a GST film (*center*). During the simulation, the ring model provides the input optical power, while the N-vN unit cell model provides the updated complex effective index of the corresponding waveguide segment in the ring, covered with GST.

The work in the upcoming months will focus on system-level simulations and on fabrication and measurement of the designed SOI test structures.

## 2.6 Computing in memory

Another innovative N-vN information technology concept that will be developed within Fun-COMP is that of computing-in-memory. Machines that simultaneously process and store multistate data at one and the same location can provide a new class of fast, powerful and efficient general-purpose computers. Moreover, multistate machines that compute directly in memory can provide not only the same computational power as a universal Turing machine, but also a range of additional and attractive properties, including intrinsic parallelism, learning and adaptive capabilities and, importantly, simultaneous execution of processing and memory. This eliminates the von Neumann bottleneck and offers significant area/power efficiency improvements compared with conventional computing approaches.

In Fun-COMP we are thus fabricating networks of our N-vN devices capable of demonstrating such advanced intelligent in-memory computing using real world data. A specific focus is on spatio-temporal correlation detection, a concept illustrated in **Figure 11**.

Further information on the topic of in-memory computing within Fun-COMP can be found in the open-access publication below:

Ríos C, Youngblood N, Cheng Z, Le Gallo M, Pernice WHP, Wright CD, Sebastian A, Bhaskaran H. (2019) In-memory computing on a photonic platform, *Sci Adv*, volume 5, no. 2, DOI:10.1126/sciadv.aau5759



**Figure 11**: The concept of temporal correlation detection. This will be implemented in Fun-COMP using arrays of N-vN devices and has important applications in many sectors.

# 3. Summary

In this report, we have summarised some of the remarkable developments that have been achieved during the first half of the H2020 project Fun-COMP (i.e. during the first 2 years of the 4-year project). Fun-COMP has the very ambitious aim of developing entirely new ways of information storage and processing (or, put another way, entirely new ways of doing computing), working entirely (or mostly) in the optical domain so as to benefit from the speed and parallelism inherent to optical systems. To achieve these aims we are developing new types of integrated photonic devices that can provide non-volatile binary and multilevel memory, along with arithmetic and neuromorphic (brain-like) processing. We are also developing other essential components, such as spiking nanolasers, that will enable us (by the end of the project) to deliver self-contained small-scale computing systems. Progress in the first half of the project has been excellent and we are well on the way to meeting our ultimate objectives.