March 5, 2025

TSMC N2 + Next-Gen SoIC, Intel EMIB-T, Meta 3D Stacked Memory, CFET, 2D Materials, and More


The semiconductor industry isn’t built on overnight breakthroughs. It’s built on steady, incremental progress, compounding year after year, at a higher rate than probably any other industry in history. The International Electron Devices Meeting (IEDM) is one of the key venues where chipmakers show off this progress. Paper topics range from the commercially relevant, to those that could eventually be, to others that probably never will be but are interesting technology anyway.

Semiconductors: incremental gains compounded over 50+ years. Source: AMD

For logic: TSMC’s N2 process, 2D materials work from Samsung and others, CFET advancements, and Intel scaling silicon channels beyond what was thought possible. One takeaway came from a panel of experts who argued that, despite the impressive progress at the conference, it’s not enough to keep up with AI. For memory, a big focus was compute-in-memory, a potential longer-term solution for the AI memory wall, and Meta presented a unique 3D stacked memory implementation. There was also a lot of attention paid to advanced packaging. That makes sense, as packaging is now a key avenue for driving compute scaling – we’ll discuss Intel’s new EMIB-T 2.5D tech and TSMC’s next-gen SoIC 3D hybrid bonding product. We’ll also detail a few notable companies and technologies that were not present this year, and why.

TSMC N2

TSMC is the apex predator of advanced logic. A key advantage for them is superior process technology. Their first GAA process node, N2, looks like it will continue this trend. Given the prime presentation slot, in the largest hall directly after keynotes, they mostly victory-lapped the node but gave a few interesting details as well.

For transistors, performance claims are in line with previous publications: 15% more speed or 30% lower power, and >1.15x density scaling. Six threshold voltage levels (Vt, the voltage required to turn the transistor “on”) are available, notable because Vt tuning is more difficult for gate-all-around transistors (GAAFETs) than for finFETs. A menu of threshold voltage options helps chip designers optimize performance and power: a logic core might use low-Vt transistors to achieve high speed, while a peripheral function like I/O benefits from higher Vt to minimize power consumption. (Generally, low Vt means the transistor can switch faster but also leaks more current, i.e. high performance but high power; high Vt is the opposite.)
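To make that trade-off concrete, here is a minimal sketch of the standard first-order subthreshold-leakage relation: off-state current drops roughly 10x for every subthreshold swing’s worth of additional Vt. The starting leakage, swing, and Vt values below are illustrative assumptions, not TSMC figures.

```python
# Rough illustration of the Vt vs. leakage trade-off (illustrative numbers, not TSMC data).
# First-order model: I_off ~ I0 * 10**(-Vt / SS), where SS is the subthreshold swing.

I0_nA = 100.0   # assumed leakage at Vt = 0, in nA/um
SS_mV = 70.0    # assumed subthreshold swing, in mV/decade

for label, vt_mV in [("ultra-low Vt", 200), ("low Vt", 300),
                     ("standard Vt", 400), ("high Vt", 500)]:
    i_off = I0_nA * 10 ** (-vt_mV / SS_mV)
    print(f"{label:13s} Vt = {vt_mV} mV -> I_off ~ {i_off:.2e} nA/um")
```

Each ~70 mV of additional threshold voltage buys roughly a 10x reduction in leakage, at the cost of drive current and switching speed.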

Dielectric material must be deposited with exquisite control, in varying thicknesses, to achieve the different threshold voltages, with the added challenge that there is no direct line of sight to the underside of the gate channels. This is one of the key drivers for increased use of atomic layer deposition (ALD) in GAA vs. finFET processes.

Interconnects are just as important as the transistors themselves in scaling modern logic, and TSMC showed real improvements here. Gate contacts are now barrier-less tungsten, almost certainly using an AMAT Endura cluster with pre-clean, PVD W liner, and CVD W fill chambers all under continuous vacuum. While AMAT’s presentation at IEDM 2023 claimed a 40% resistivity reduction, TSMC is showing a 55% RC (resistance and capacitance) reduction in practice. This translates directly to a performance gain: >6% in a ring oscillator test vehicle.

Source: Applied Materials

Lastly, there were a few teasers on RC (resistance and capacitance) reduction in the metal layers. In the single-patterning ArFi layers, the “workhorse” metals and vias, RC was reduced by 19% and 25% respectively. We suspect a better dielectric material is the driver here. More impressive, an optimized M1 (metal layer 1, the second-lowest and thus very dense layer) patterning scheme saved multiple EUV masks and reduced the capacitance of that layer by 50%! The details are a mystery – here’s the full quote for any detectives:

Optimized M1 with novel 1P1E EUV patterning led to close to 10% std cell capacitance reduction and a saving of multiple EUV masks.

We’ve said before that the last decade was the decade of litho and the coming one is the decade of materials. The details of N2 prove it: materials innovations drive performance gains while EUV masks in a critical layer are reduced.

It’s also notable that Intel, Samsung, and Rapidus chose not to present on their competing “2nm” GAA nodes, aside from a Rapidus paper on threshold voltage tuning. This potentially speaks to their lack of maturity in these process nodes.

CFET

Now that GAA is near high volume production, CFETs are the new “next big thing.” We went deeper into the motivation and details in last year’s IEDM roundup, but the gist is that building a PMOS and NMOS transistor one atop the other offers ~1.5x scaling versus the traditional side-by-side configuration.

Integration is the key challenge. The front end of line (transistor) stack height doubles, the second transistor must be built without destroying the one below, and direct backside contacts are necessary for power if not signal too.

IMEC showed a conceptual 4T CFET cell with shared rails connecting both the top and bottom transistors with a backside power delivery network (BSPDN).

The focus of the paper is reducing process complexity in the source/drain contacts. Building low-resistance contacts is key for performance but difficult given the high aspect ratio needed to connect to both the bottom and top devices in the CFET. IMEC’s solution is a shared “Middle Routing Wall” that runs along one side of each NMOS/PMOS stack, connected to the sources and drains as needed. A “wall” or rail like this is simpler to construct than a via, so it can presumably achieve better quality, performance, etc. That remains to be proven, as this paper only simulated the integration flow; the next step is likely to build these devices for real.

Samsung and IBM showed a novel “stepped” approach that uses 2 wide channels in the bottom NFET and 3 slimmer channels in the top PFET. This allows direct line of sight to the bottom channels when forming contacts, making it easier to achieve high-quality contacts and thus better performance.

Source: IBM / Samsung

But this probably comes at a scaling cost. The paper argues stepped is no worse than backside contact + via to connect the bottom FET to signal. That may be true but is not the correct baseline. A shared power wall like IMEC’s approach or local signal routing on the backside are better comparisons, and stepped with its wider channels is worse for scaling than either.

TSMC again looks best-in-class. They demonstrated a working CFET inverter, meaning the bottom pFET and top nFET were wired together into a basic logic gate. This is one big step ahead of the others on the integration roadmap towards an industrialized process. Most importantly, they have a working method for forming the local interconnect between the top and bottom FETs. This is the problem IMEC solved in simulation; TSMC has done it on real silicon. While probably cherry-picked, the transistor performance is already very good, suggesting the local interconnect and contact quality is good. High aspect ratios and tight alignment requirements will be a major challenge to yield at high volumes.

Intel did not present any CFET work. In previous years they’ve shown progress, so it’s likely they simply chose not to present this year.

Memory

The obvious hot topic in memory is HBM. Unfortunately, it is too commercially relevant right now, so no company is going to give details in a conference paper. The focus at IEDM was on compute-in-memory instead.

This is a broad category of potential solutions to the memory wall. The goal is to reduce data movement overhead, which drives most of the energy and time wasted in current architectures. While reducing the amount of data to be moved (lower precision, algorithmic improvements, etc.) or increasing memory bandwidth (HBM) can help, the ideal solution may be to just move the compute as close as possible to memory – compute-in-memory.
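A rough back-of-the-envelope illustrates why: fetching data from off-chip memory costs orders of magnitude more energy than the arithmetic performed on it. The per-operation figures below are illustrative assumptions, in the spirit of commonly cited estimates, and vary widely by node and design.

```python
# Back-of-the-envelope: data movement vs. compute energy (illustrative figures only).

PJ = 1e-12  # joules per picojoule

energy_per_op = {
    "FP16 multiply-add (on-chip)":  1 * PJ,     # assumed
    "SRAM read, 64 bit (on-chip)":  10 * PJ,    # assumed
    "DRAM read, 64 bit (off-chip)": 1000 * PJ,  # assumed
}

ops = 1e12  # one trillion operations / accesses
for name, e in energy_per_op.items():
    print(f"{name:30s} ~ {ops * e:8.1f} J per 10^12 ops")

# The off-chip access dwarfs the arithmetic itself, which is the basic
# argument for moving compute toward (or into) the memory.
```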

SK Hynix showed an architecture they call AiM for “Accelerator in Memory.” They built a demonstrator combining GDDR6 with a processing unit adjacent to each bank.

Source: SK Hynix

The result is a memory bandwidth per GB two orders of magnitude higher than HBM:

Source: SK Hynix

As most modern AI use cases are memory-bound, this would offer a serious performance improvement. There are significant barriers to using Accelerator in Memory (AiM) devices, however, primarily the lack of flexibility. The killer application may be on-device AI for AR/VR, where latency-sensitive tasks such as hand tracking must be done on-device.

Meta presented results with 3D packaged SRAM or DRAM atop compute (this is really compute-near-memory) along with a theoretical compute-in-memory accelerator for VR applications.

Eliminating the need for off-chip memory access reduces both latency and energy consumption by 40% with 3D stacked SRAM. An optimized mix of SRAM + DRAM is even better. A proposed CIM design with an array of logic + memory macros might achieve up to twice the energy efficiency of existing accelerators.

While the results from theory and test vehicles look great, there are a few hurdles to commercialization. First, reliability and accuracy are worse for most CIM architectures than for the current compute + memory paradigm. For example, schemes that use DRAM cells and periphery to perform simple logic operations have high error rates. The manufacturing processes for DRAM (or many other memory types) and logic are also fundamentally different and not compatible. Consider the thermal budget for a DRAM anneal: it might be 600°C and hours long, well above what advanced logic devices can endure.

Second is cost. Even compute-near-memory with hybrid bonding, as shown by Meta, is challenging. The sole major product on the market today that uses hybrid bonding of memory to logic, AMD’s X3D CPUs, is neither especially high volume nor high margin. Approaches that use DRAM banks for calculation require a more complex memory controller, and co-fabrication schemes are complex, possibly requiring both memory- and logic-specific tools. Still, the needs of AI accelerators justify more expensive solutions than traditional computing did, and CIM will continue to see increased effort towards a viable product.

Intel EMIB-T

Even at a conference nominally focused on devices (it is the International Electron Devices Meeting, after all), advanced packaging gets a lot of attention. This makes sense, as it’s really the new frontier for compute scaling.

Intel informally announced a new variant of its EMIB (Embedded Multi-die Interconnect Bridge) 2.5D packaging tech, EMIB-T. The T signals the addition of TSVs (through-silicon vias). EMIB is Intel’s name for packaging with silicon bridges: small passive dies embedded in the organic substrate. Interconnects can be made twice (or more) as dense in a silicon bridge than in traditional substrate, meaning the overall package performance can be higher.

The original EMIB technology claimed a cost advantage specifically because it did not use TSVs, which are relatively expensive to make. This meant that some signal and power had to be routed around the bridge. TSVs give the flexibility to route any or all signals and power through the bridge, and TSV fabrication has become cheaper as the technology matures. Intel’s target market for EMIB-T is complex heterogeneous packages that use both 2.5D/EMIB and 3D/Foveros to provide a wide range of interconnect densities at sizes beyond the reticle limit. HPC is the most important application here.

Source: Intel

TSMC SoIC

TSMC gave an update on its SoIC 3D packaging tech. While they are technically not the industry leaders in hybrid bonding (Sony has <4 µm and soon <1 µm pitch in their CMOS image sensors), they are ahead for advanced logic. This new generation appears to be <15 µm TSV interconnect pitch. For comparison, Intel’s Foveros is roughly 25 µm pitch. As density and performance scale with the inverse square of the interconnect pitch, the gap even versus the previous generation of SoIC is significant.
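A quick sketch of that inverse-square relationship, using the pitches quoted above and an idealized grid (no keep-out zones or routing overhead):

```python
# Interconnect density scales with the inverse square of bond pitch.
# Idealized grid: one connection per pitch x pitch square.

def connections_per_mm2(pitch_um: float) -> float:
    return (1000.0 / pitch_um) ** 2

for name, pitch_um in [("SoIC (<15 um)", 15.0), ("Foveros (~25 um)", 25.0)]:
    print(f"{name:18s} ~ {connections_per_mm2(pitch_um):7.0f} connections/mm^2")

print(f"ratio ~ {connections_per_mm2(15.0) / connections_per_mm2(25.0):.1f}x")
```

Halving the pitch quadruples the available connections per unit area, which is why even a modest pitch advantage compounds into a meaningful density and bandwidth gap.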

Nvidia System Co-Optimization Of GPUs

Nvidia had a nice presentation on System Co-Optimization of GPUs. While the industry is driven by exponential “laws” – model scaling, transistor density/cost, compute energy, etc. – the author noted another that is largely unappreciated: defect density.

It seems obvious that defect rates must improve at a pace comparable to transistor and interconnect scaling, as otherwise yield would effectively go to zero. But in context, that means defect rates below one per trillion vias or contacts!

Source: Nvidia

This is the result of excellence throughout the supply chain – chipmakers optimizing their processes and fab operations, tool vendors reducing defectivity while improving tool performance, materials suppliers measuring and controlling contamination to parts per trillion or better, and more. The details are closely guarded secrets but the results are worth appreciating.
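To see why such extreme defect rates are necessary, a minimal Poisson yield model is enough. The via count below is an illustrative assumption for a large die, not a figure from the Nvidia paper.

```python
# Simple Poisson yield model: probability that a die has zero killer via defects.
import math

vias_per_die = 1e11  # assumed: on the order of 100 billion vias/contacts on a large die

for defect_rate in (1e-10, 1e-11, 1e-12, 1e-13):   # defects per via
    expected_fails = vias_per_die * defect_rate
    die_yield = math.exp(-expected_fails)           # Poisson P(zero defects)
    print(f"defect rate {defect_rate:.0e} per via -> die yield ~ {die_yield:.1%}")
```

At one defect per ten billion vias the die essentially never yields; only around one per trillion or better does yield become economically viable.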

It’s also worth noting the topics that did not show up at this year’s conference. Last year CXMT disclosed a sanctions-violating 18nm DRAM process; this year they did not present anything. Staying quiet makes sense: the December 2024 round of export controls did not add them to the entity list and even included a generous carve-out for slightly derated HBM, so that Applied Materials and others could continue selling tools despite the application being “crucial to frontier AI training and inference.” The reprieve was short-lived, though, as the surprise January 2025 rules extended coverage to include CXMT’s most advanced “18.5nm” process. Unsurprisingly, participation from China was heavily skewed towards academics this year.

Micron’s NVDRAM was another notable absence. Last year’s paper made a splash with DRAM-like performance but non-volatile retention better than typical NAND. Cost and scalability were potential concerns, and that may have been borne out… the tech has not been productized and was not shown at IEDM this year.

2D Materials

2D materials are the presumptive replacement for silicon transistor channels. Recall that channels conduct current between the source and drain of a transistor, with conduction controlled by a gate touching or surrounding the channel. In silicon, channel lengths (commonly called gate length or LG) below ~10nm were thought impractical as leakage current is too high – the transistor is inefficient and hard to turn off. Channels constructed from 2D materials are easier to control and less susceptible to the mechanisms that cause leakage in Si. With leading-edge devices already at 10-20nm gate lengths, 2D materials are on many roadmaps in the 2030s.

They are still far from commercial readiness. An Intel paper generalizes the major challenges into 3 categories:

  • Material Growth
  • Doping & Contact Formation
  • Gate-All-Around (GAA) Stack / High-K Metal Gate

Doping & contact formation comprises doping, to form the transistor’s active source and drain regions, along with contacts, to form a low-resistance connection to the metal interconnect layers above. The GAA stack entails depositing multiple layers of material around the 2D channel to form a gate that controls the transistor. With the industry having converged on the 2D channel materials last year (MoS2 for N-type and WSe2 for P-type devices), progress has been made on doping, contacts, and gate formation:

TSMC showed work on contacts for P-type devices. This fills in a missing piece, as contacts have been demonstrated before for N-type transistors but not for P-type. Contacts are the electrical connection from the metal interconnect (wiring) layers to the transistor source, drain, or gate. A key part of contact performance, especially at modern device dimensions in the tens of nanometers, is resistance. The challenge is that the source and drain are made of a semiconducting material – traditionally silicon, or here the 2D material (WSe2 in this case) – which has high resistance. Depositing the interconnect metal directly atop the source or drain forms a high-resistance Schottky barrier at the interface, and metal on Si also typically has poor adhesion.
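To give a sense of scale: for an ohmic contact, resistance is roughly the specific contact resistivity divided by the contact area, so shrinking the contact footprint drives resistance up quickly. The resistivity value below is an assumed, illustrative number.

```python
# Why contact resistance blows up at small dimensions: R_contact ~ rho_c / Area.

def contact_resistance_ohm(rho_c_ohm_cm2: float, side_nm: float) -> float:
    area_cm2 = (side_nm * 1e-7) ** 2   # convert nm to cm, then square for area
    return rho_c_ohm_cm2 / area_cm2

rho_c = 1e-9  # Ohm*cm^2, assumed (roughly state-of-the-art-ish) contact resistivity

for side_nm in (100, 30, 15):  # contact side length in nm
    r = contact_resistance_ohm(rho_c, side_nm)
    print(f"{side_nm:3d} nm x {side_nm:3d} nm contact -> ~{r:6.0f} Ohm")
```

A contact that was negligible at older nodes becomes hundreds of ohms at leading-edge dimensions, which is why low-resistance contact schemes get so much attention.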

A common solution for Si is silicidation, a deposition + anneal process that forms a highly conductive silicide (NiSi, for example) atop the Si source or drain region. A metal interconnect can then be built on the silicide to complete a low-resistance connection from the active source/drain out to the circuit wiring.

Silicidation isn’t possible for 2D materials because they don’t contain Si. The favored solution is degenerate doping: introducing a specific impurity into the 2D material’s structure to change it from a semiconductor into a conductor. In practice it’s difficult to dope WSe2: the crystal lattice is easily destroyed and achieving an even dopant distribution throughout is challenging. But the authors of the paper have done it. Contacts are one of the hardest challenges for modern logic processes, so having a viable path forward here for 2D materials is a big step.

Gate oxides are another key challenge for commercializing 2D materials. As noted for the TSMC N2 paper, gate oxide quality determines how well the transistor can be controlled. And if you can’t control the transistor well… you don’t have a viable logic process. Intel demonstrated formation of a high-quality gate oxide, resulting in well-controlled transistors. DIBL and subthreshold swing are low (meaning low leakage and a sharp transition from off to on, respectively) and the maximum drain current is high – all indicating good electrostatic control. The main innovation here seems to be process optimization, particularly for the pre-clean and oxide deposition processes.

Despite advances in doping, contacts, and gate formation, progress is lacking for 2D material growth. We wrote in last year’s roundup that “growth is the fundamental problem for 2D materials.” Most of the existing research uses transfer – material is grown on a sapphire substrate and mechanically transferred to a Si wafer. But this is a lab technique, not one that will scale to high volume. Direct growth on a 12” Si wafer is the most likely path to commercialization.

Recent progress here appears stagnant. Samsung demonstrated on-wafer growth with an 8” test vehicle, but the material does not adhere well to the wafer. The solution was to fabricate “clips” at the edge of each crystal to hold it down during subsequent process steps. Functional transistors, albeit with top and bottom gates rather than a GAA structure, were shown. But this process isn’t scalable. The test devices have a channel length of 500nm – two orders of magnitude too large. If clips are needed for each channel, the real estate they consume negates any scaling benefit from making the channel shorter. Growth of high-quality material across the entire wafer, without ancillary structures, is the real need.

TSMC presented a full 2D FET inverter – an N- and P-type transistor wired together to form a basic logic block. This seems to be an integration pathfinding study, as the devices themselves were planar rather than gate-all-around, and an order of magnitude or two larger than what will be needed. A few interesting results were found.

First, they attempted a homogeneous device in which both the N- and P-type transistors were made from WSe2. Most research uses a heterogeneous approach, with the NMOS using an MoS2 channel. Using one material for both would be a big cost advantage, as it would save a lot of expensive process steps, but TSMC found the performance of the WSe2 NFET was very poor and not matched to the PFET.

Second, standard wet processing affected existing PFETs. A patterning step done atop the PFET active areas used typical wet chemistry – photoresist, etch, etc. Normally this would not be a concern for the underlying device’s performance; it’s a standard and well-understood process. Surprisingly, in this case it caused a significant shift in the threshold voltage (the voltage needed to turn the transistor on). This is unintuitive and suggests there may be more surprises in store as research builds towards more complex integrations with 2D materials.

Threshold voltage is strongly affected by standard wet processing in a 2D FET.
Source: TSMC

There is a long way to go before high-volume readiness. The current state-of-the-art can barely produce one good transistor at a reasonably short channel length. This must be scaled up to at least billions of transistors per wafer, and then to 100,000 wafers or more per year. That’s 15 orders of magnitude at minimum!
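A quick sanity check of that arithmetic, using assumed round numbers for transistors per wafer:

```python
# Rough scale-up arithmetic for 2D-material transistors (assumed round numbers).
import math

wafers_per_year = 1e5                  # 100,000 wafers per year, as stated above

for per_wafer in (1e9, 1e12):          # "billions" up to ~a trillion transistors per wafer
    total = per_wafer * wafers_per_year
    print(f"{per_wafer:.0e} transistors/wafer x {wafers_per_year:.0e} wafers/yr "
          f"-> ~10^{math.log10(total):.0f} working devices per year")
```

That lands in the same ballpark as the ~15 orders of magnitude quoted above, measured against a single good device in the lab.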

Intel 6nm Gate Length

Worse for 2D materials, the theorized 10nm minimum gate length for silicon has been proven false. Intel showed a single-ribbon GAA transistor with a gate length of just 6nm.

There were many challenges thought to be show-stoppers below 10nm, the most interesting being quantum tunneling. At such an extreme scale, electrons or holes have a non-zero probability of “tunneling” through the energy barrier presented by the transistor gate. Despite not having enough energy to overcome the barrier they pass through it – the effect being that charge leaks through the transistor. Chips made with leaky transistors are inefficient and error prone.

Intel’s result proves that this quantum tunneling effect can be mitigated. The performance of the device, while not perfect, is already very good and can likely be improved enough for high-volume commercialization. The subthreshold swing (a measure of how responsive the transistor is to changes in gate voltage, i.e. how easily it can be turned on and off) is already near the theoretical room-temperature minimum of 60 mV/decade. DIBL (drain-induced barrier lowering, an effect that worsens with shorter channels) is about twice what TSMC demonstrated for their N2 process. It will need to be improved, but this is a good result for R&D.

GAA transistor with 6nm gate length shows good performance. Note that a FinFET with a 5nm gate length had been made previously, but it had very poor performance (both DIBL and subthreshold swing (SS) are very high).
Source: Intel
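For reference, the ~60 mV/decade figure isn’t arbitrary: it falls out of Boltzmann statistics as SS_min = ln(10) * kT/q at room temperature. A two-line check:

```python
# Theoretical room-temperature limit on subthreshold swing: SS_min = ln(10) * k*T / q.
import math

k = 1.380649e-23   # Boltzmann constant, J/K
q = 1.602177e-19   # elementary charge, C
T = 300.0          # room temperature, K

ss_min_mV_per_decade = math.log(10) * k * T / q * 1000
print(f"SS_min at {T:.0f} K ~ {ss_min_mV_per_decade:.1f} mV/decade")
```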

This result likely pushes 2D materials further out on the roadmap. Chipmakers won’t risk adopting a new and complex technology unless they have no alternative.

Expert Panel: Breakthroughs Needed

The continued advances in compute devices are no doubt amazing, but they’re not enough. The exponential growth in compute demand, and in the energy to power it, is unsustainable without advances in the underlying device technology. Professor Tom Lee of Stanford plotted the energy requirements at current growth rates out 150 years. That’s a lot of extrapolation, but it makes the point that something must change. At current growth rates, AI compute energy in 2050 will require every photon coming from the sun to Earth. 100 years after that, we would need to capture every photon the sun emits, period. Rather than build a Dyson sphere, the IEDM expert panel suggests that we need breakthroughs in semiconductor devices.
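As a sanity check on that last 100-year gap, the implied growth rate follows from just two standard astronomical figures: the total sunlight intercepted by Earth and the Sun’s full output. The starting point of the extrapolation cancels out.

```python
# Implied growth rate between "all sunlight hitting Earth" and "the Sun's entire output"
# over a 100-year span (standard astronomical values).
import math

solar_flux_on_earth_W = 1.7e17   # ~1361 W/m^2 times Earth's cross-sectional area
total_solar_output_W = 3.8e26    # solar luminosity

ratio = total_solar_output_W / solar_flux_on_earth_W     # roughly 2 billion x
growth = ratio ** (1 / 100)                               # per-year growth factor
print(f"implied growth ~{(growth - 1) * 100:.0f}% per year, "
      f"doubling every ~{math.log(2) / math.log(growth):.1f} years")
```

Even a sustained growth rate of a few tens of percent per year runs into astronomical limits within a human lifetime or two, which is the panel’s point.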

The evening ended with a call to action. Progress as usual in devices is not good enough anymore. Of all the “AI exponentials,” energy will be the limiter, Prof. Lee says. And “we cannot defeat an exponential foe with linear swords and sticks.”
