Automatic Differentiation with Autograd#
As of version 2.7.0, Tidy3D natively supports automatic differentiation (AD), so you can run gradient-based optimization and sensitivity analysis of photonic devices directly within your simulation workflow.
The gradient calculation is performed efficiently using the adjoint method, which keeps the cost nearly independent of the number of design parameters. In most objectives, only one adjoint simulation is needed because adjoint sources can be grouped by frequency or spatial port. If an objective uses multiple frequencies and multiple spatial ports, the standard adjoint path groups by whichever dimension is smaller, so the number of adjoint simulations is the smaller of the unique frequency count and the spatial-port count.
This implementation is powered by the autograd library and replaces the previous jax-based adjoint plugin, offering several key benefits:
Simplicity: Use standard Tidy3D components like td.Structure and td.Simulation directly in your differentiable functions.
Ease of Use: The regular web.run entry point is directly differentiable.
Painless Installation: The core AD framework, autograd, is a direct dependency of Tidy3D, removing the installation challenges associated with jax.
Legacy Adjoint Plugin Reminder#
The former tidy3d.plugins.adjoint (JAX) plugin was deprecated in 2.7 and is fully removed in 2.10. If you are updating older notebooks:
Replace tidy3d.plugins.adjoint imports (tda.JaxSimulation, tda.JaxStructure, etc.) with the standard tidy3d classes.
Switch jax.grad / jax.numpy to autograd.grad / autograd.numpy.
If you need PyTorch-centric tensors, use the lightweight wrapper in tidy3d.plugins.pytorch so you can keep your optimizer stack without touching jax.
For new projects, start directly with the native workflow described below.
How It Works: autograd and the Adjoint Method#
Tidy3D’s AD capability combines two core technologies:
The autograd Framework: This library automatically tracks all numerical operations in your Python objective function, building a computational graph to calculate derivatives using the chain rule.
The Adjoint Method: Tidy3D provides the derivative of the FDTD simulation step (web.run) using adjoint simulations and the stored forward fields required for the backward pass.
When you request a gradient, Tidy3D and autograd work together behind the scenes:
Forward Pass: Your code executes, running a standard FDTD simulation and calculating your scalar objective value. Tidy3D automatically stores the fields required for the subsequent gradient calculation.
Backward Pass: autograd propagates gradients backward. When it reaches the simulation step, Tidy3D sets up the required adjoint simulations and uses both forward and adjoint fields to efficiently compute the gradients with respect to the traced simulation parameters.
Although autograd is used internally, we provide wrappers for other automatic differentiation frameworks, allowing you to use your preferred AD framework (e.g., jax, pytorch) with minimal syntax changes. For instance, see the PyTorch wrapper in tidy3d.plugins.pytorch.
Forward/Adjoint Flow#
The forward and backward passes follow this data flow:
Forward: design parameters -> build td.Simulation -> td.web.run() -> SimulationData plus traced fields -> scalar objective.
Backward: autograd.value_and_grad() propagates the objective gradient, Tidy3D launches the required adjoint simulations, and the result is returned as gradients for the design parameters.
Basic Workflow#
An inverse design optimization loop in Tidy3D generally follows these steps:
Define a function that creates your td.Simulation based on a set of design parameters.
Define an objective function that:
  Takes the design parameters as input.
  Calls the simulation-creation function.
  Runs the simulation via web.run().
  Post-processes the results from the SimulationData object to return a single, real scalar value (the figure of merit).
Get the gradient function using autograd.value_and_grad().
Run an optimization loop that iteratively calls the value-and-gradient function and updates the parameters using the computed gradient.
Key Features at a Glance#
Geometry + Material coverage: Optimize most geometries (including PolySlab sidewall angles or TriangleMesh vertices) and dispersive media without custom wrappers.
Topology-friendly workflows: CustomMedium plus filters/projections in tidy3d.plugins.autograd let you impose fabrication constraints while staying differentiable.
Broadband + adjoint throttling: Adjoint jobs are auto-grouped by frequency or spatial port and limited by max_num_adjoint_per_fwd.
S-matrix gradients: Differentiate objective functions involving supported scattering-matrix modelers when the underlying simulations are autograd-ready.
Far-field aware: Near-field monitors can feed local FieldProjector steps, so you can optimize flux or far-field metrics. Objective functions cannot currently differentiate through server-side projection monitor data directly.
Adjoint Job Count and Parallel Adjoint#
The number of adjoint simulations depends on which simulation outputs the objective uses, not on the number of design parameters. The standard adjoint path first builds adjoint sources from the objective’s vector-Jacobian product (VJP), then groups them by frequency or spatial port, whichever yields fewer simulations:
If several objective terms use the same spatial port at multiple frequencies, they can usually be combined into one broadband adjoint simulation.
If several objective terms use the same frequency at multiple spatial ports, they can usually be combined into one single-frequency adjoint simulation.
If an objective uses multiple frequencies and multiple spatial ports, the standard path uses the smaller of the unique frequency count and the spatial-port count.
Unused frequencies in monitors increase forward-run field and permittivity data size, but they do not by themselves create adjoint simulations. Only frequencies that participate in the objective contribute adjoint sources.
max_num_adjoint_per_fwd caps the number of adjoint solves spawned by each forward simulation. Increase it intentionally for objectives that touch many frequencies, components, or spatial ports.
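The counting rule above can be written down as a few lines of bookkeeping. The sketch below is an illustration of the stated rule only, not Tidy3D's internal grouping code. The function names and the (port, frequency) pair representation are hypothetical, and the default cap of 10 is the value quoted for max_num_adjoint_per_fwd in this document.

```python
def count_adjoint_simulations(objective_terms):
    """Estimate adjoint solves on the standard path.

    objective_terms: iterable of (port, frequency) pairs that actually enter
    the objective. Monitor frequencies the objective never uses are left out.
    """
    terms = list(objective_terms)
    ports = {port for port, _ in terms}
    freqs = {freq for _, freq in terms}
    if not terms:
        return 0
    # Group by whichever dimension yields fewer simulations.
    return min(len(ports), len(freqs))


def check_adjoint_budget(objective_terms, max_num_adjoint_per_fwd=10):
    """Raise if the estimate exceeds the per-forward cap, else return it."""
    n_adjoint = count_adjoint_simulations(objective_terms)
    if n_adjoint > max_num_adjoint_per_fwd:
        raise ValueError(
            f"{n_adjoint} adjoint simulations exceed the cap of "
            f"{max_num_adjoint_per_fwd}; raise the cap intentionally if needed."
        )
    return n_adjoint


# One port sampled at three frequencies: one broadband adjoint simulation.
one_port = [("port_a", f) for f in (1.9e14, 1.95e14, 2.0e14)]
# Two ports, three frequencies each: grouped by port, so two simulations.
two_ports = [(p, f) for p in ("port_a", "port_b") for f in (1.9e14, 1.95e14, 2.0e14)]

print(check_adjoint_budget(one_port))
print(check_adjoint_budget(two_ports))
```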
For local gradients, Tidy3D can also launch eligible canonical adjoint simulations in parallel with the forward solve:
Enable it with config.adjoint.parallel_run = True.
It is only effective when local_gradient=True; remote gradients ignore this flag.
The initial supported outputs are mode monitor amplitudes, diffraction monitor amplitudes, and single-point field sampling.
Unsupported monitor outputs fall back to the standard sequential adjoint path.
Canonical parallel adjoint bases are grouped by spatial port, so this mode can trade extra simulations and credit usage for lower wall-clock time.
For mode monitors, config.adjoint.parallel_adjoint_mode_direction_policy controls whether Tidy3D assumes the outgoing direction or launches both + and - directions.
Example: A Simple Optimization
import autograd
import autograd.numpy as anp
import tidy3d as td
from tidy3d import web

# 1. Function to create the simulation from parameters
def make_simulation(width):
    # ... (define sources, monitors, etc.)
    geometry = td.Box(size=(width, 0.5, 0.22))
    structure = td.Structure(geometry=geometry, medium=td.Medium(permittivity=12.0))
    sim = td.Simulation(
        # ... (simulation parameters)
        structures=[structure],
        # ...
    )
    return sim

# 2. Objective function returning a scalar
def objective_fn(width):
    sim = make_simulation(width)
    sim_data = web.run(sim, task_name="optimization_step")
    # Objective: maximize power in the fundamental mode
    mode_amps = sim_data["monitor_name"].amps.sel(direction="+", mode_index=0)
    return anp.sum(anp.abs(mode_amps.data)**2)

# 3. Get the value and gradient function
value_and_grad_fn = autograd.value_and_grad(objective_fn)

# 4. Optimization loop (naive gradient ascent)
width = 2.0  # Initial width
learning_rate = 0.05
for i in range(20):
    value, gradient = value_and_grad_fn(width)
    width = width + learning_rate * gradient  # move uphill to maximize
    print(f"Step {i+1}: Value = {value:.4f}, Width = {width:.3f}")
Frequency-domain monitor required: Any simulation that carries traced structures or media must include at least one frequency-domain monitor (FieldMonitor, ModeMonitor, DiffractionMonitor, etc.). If a traced simulation has no frequency-domain monitor, web.run raises an AdjointError instead of falling back to a non-differentiable run. Keep at least one spectral sample active on every monitor that participates in the objective.
Common Pitfalls#
Use autograd.numpy for every array operation in your objective; mixing standard NumPy silently drops gradients.
Keep monitor frequencies focused on the objective to avoid unnecessary forward data size.
Keep an eye on the traced-structure budget (default 500). Group repeated tiles or motifs into a GeometryGroup before differentiating large layouts.
Capabilities and Supported Components#
Tidy3D’s AD framework supports a wide range of design scenarios.
Differentiable Parameters (Simulation Inputs)#
Geometry#
| Component | Traceable Attributes | Example Use Case |
|---|---|---|
|  |  | Shape Optimization |
|  |  | Shape Optimization |
|  |  | Shape Optimization |
|  |  | Shape Optimization & taper tuning |
|  |  | Grouping for performance |
|  | traced parameters in underlying geometries | Boolean shape optimization |
|  |  | 3D Shape Optimization |
Base Materials#
| Component | Traceable Attributes | Example Use Case |
|---|---|---|
|  |  | Material Optimization |
|  | Permittivity data array | Topology Optimization |
|  | nested | Anisotropic material optimization |
|  | nested custom | Anisotropic topology optimization |
Dispersive Models#
| Component | Traceable Attributes | Example Use Case |
|---|---|---|
|  |  | General dispersive fit |
|  |  | Spatially varying dispersive fit |
|  |  | Refractive-index dispersion control |
|  |  | Resonant material modeling |
|  |  | Free-carrier / plasmonic tuning |
|  |  | Relaxation media / polymers |
Sources#
| Component | Traceable Attributes |
|---|---|
|  |  |
|  |  |
|  |  |
|  |  |
Differentiable Results (Simulation Outputs)#
| Source monitor → data object | Traceable attributes & methods | Notes |
|---|---|---|
|  |  | Differentiate modal amplitudes and powers directly. |
|  |  | Differentiate overlap amplitudes used by Gaussian ports. |
|  |  | Capture gradients of diffraction efficiencies / orders. |
|  | field components, permittivity components, | Use these to build custom objectives (power, overlap, material penalties). |
|  |  | Convenience wrappers remain differentiable because they operate on traced monitor data. |
Requires Local Post-processing#
| Data target | Status |
|---|---|
|  | Not directly differentiable. Record the enclosing |
| Field projection monitors ( | Not supported for adjoint. Store the near fields and run |
Runtime Controls and Gradient Flow#
local_gradient: Pass local_gradient=True to web.run (or set config.adjoint.local_gradient) to download the forward and adjoint field data. This is required if you rely on local-only config.adjoint.* overrides such as grid spacing, gradient precision, or frequency chunking, because remote/server-side gradients ignore those settings. When enabled, Tidy3D attaches the adjoint monitors up front (via _with_adjoint_monitors) so the forward run exports all fields needed for the backward pass, increasing monitor count, runtime, and download size. Ensure the directory pointed to by config.adjoint.local_adjoint_dir has sufficient space.
Adjoint batch safety (max_num_adjoint_per_fwd): Each forward simulation can spawn at most max_num_adjoint_per_fwd adjoint solves (defaults to config.adjoint.max_adjoint_per_fwd = 10). Increase the argument if your objective touches many monitors or broadband field data; otherwise the run will raise an error before launching excessive jobs.
Tracer budget (max_traced_structures): Autograd accepts up to config.adjoint.max_traced_structures traced geometries (default 500). Use GeometryGroup to consolidate repeated materials or prune unused tracers before submission.
Adjoint data location: When local_gradient=True, intermediate data are stored under config.adjoint.local_adjoint_dir (defaults to adjoint_data/). Make sure the directory has enough space if you are differentiating large field monitors.
Parallel local adjoint (parallel_run): When config.adjoint.parallel_run=True, eligible canonical adjoint simulations can be submitted with the forward solve for local-gradient workflows.
For every other switch (e.g., gradient_precision, solver_freq_chunk_size, custom monitor spacing), refer to the configuration reference under the autograd section.
The Autograd Plugin: Advanced Design Functions#
Beyond the core differentiation of components, Tidy3D includes a powerful set of tools in the tidy3d.plugins.autograd module designed to facilitate advanced optimization tasks. This toolkit provides differentiable building blocks for common inverse design techniques like topology optimization, shape parameterization, and enforcing fabrication constraints.
All of the utilities described here live directly under tidy3d.plugins.autograd (see the invdes, functions, primitives, optimizers, and utilities submodules for the actual call signatures).
Topology Optimization and Fabrication-Aware Design#
Many of the tools are geared towards topology optimization, where the goal is to find the optimal distribution of materials in a design region.
Filtering: Functions like make_circular_filter, make_conic_filter, make_gaussian_filter, and make_filter apply a convolution to the raw design parameters. This is a standard technique to enforce a minimum length scale and create smooth, manufacturable features.
Projection: To ensure the final design consists of distinct materials (e.g., silicon or air), projection functions like tanh_projection, ramp_projection, and smoothed_projection are used. They smoothly binarize the continuous design parameters to values like 0 and 1.
Penalties: To further guide the optimization, you can add penalty terms to your objective function. The toolkit includes make_curvature_penalty to control the curvature of boundaries and make_erosion_dilation_penalty to enforce minimum feature sizes.
These operations can be easily connected using the chain utility to create a standard data processing pipeline for your parameters.

from functools import partial

from tidy3d.plugins.autograd import (
    chain,
    make_conic_filter,
    tanh_projection,
)

# Define a filter to enforce a 20 nm minimum feature size on a 5 nm grid.
conic_filter = make_conic_filter(radius=0.02, dl=0.005)

# Define a projection function to binarize the design
project = partial(tanh_projection, beta=8.0, eta=0.5)

# Chain them together to create a single processing function
process_params = chain(conic_filter, project)

# In the objective function, apply this to the raw parameters
def objective_fn(raw_params):
    processed_params = process_params(raw_params)
    # ... create CustomMedium and Simulation from processed_params ...
    # ... run simulation and compute objective ...
    return objective_value
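
To make the effect of the filter-then-project pipeline concrete, here is a standard-library-only sketch in one dimension. It does not call the plugin. The conic weights, the edge padding, and the tanh projection formula are commonly used textbook forms and are assumed here for illustration; the plugin's own kernels, padding modes, and signatures may differ. All function names are hypothetical.

```python
import math


def conic_kernel(radius, dl):
    """Normalized 1D conic weights that fall linearly to zero at `radius`."""
    half = math.ceil(radius / dl)
    weights = [max(0.0, 1.0 - abs(i) * dl / radius) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def apply_filter(values, kernel):
    """Convolve with the kernel, clamping indices at the ends (edge padding)."""
    half = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out


def tanh_project(x, beta=8.0, eta=0.5):
    """Smoothly push x in [0, 1] toward 0 or 1 around the threshold eta."""
    num = math.tanh(beta * eta) + math.tanh(beta * (x - eta))
    den = math.tanh(beta * eta) + math.tanh(beta * (1.0 - eta))
    return num / den


# A sharp step in the raw parameters (5 nm pixels, lengths in micrometers).
raw = [0.0] * 10 + [1.0] * 10
kernel = conic_kernel(radius=0.02, dl=0.005)
filtered = apply_filter(raw, kernel)                # step becomes a smooth ramp
projected = [tanh_project(v) for v in filtered]     # ramp is re-sharpened

for r, f, p in zip(raw, filtered, projected):
    print(f"{r:.1f}  {f:.3f}  {p:.3f}")
```

The filter removes features narrower than the kernel, and the projection restores near-binary values. Increasing beta sharpens the transition.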
Differentiable Primitives and Utilities#
The plugin also offers several general-purpose differentiable functions:
interpolate_spline: A powerful tool for parameterizing device geometries. You can define a shape using a small number of control points and use this function to generate a smooth, differentiable spline. Optimizing the control points allows for flexible shape optimization.
Morphological Operations: Differentiable versions of standard image processing functions like grey_dilation, grey_erosion, grey_opening, grey_closing, and convolve are available for parameter processing.
least_squares: A differentiable least-squares optimizer for fitting models to data within your objective function.
smooth_max / smooth_min: Differentiable approximations of max() and min(), useful for creating objectives that depend on the maximum or minimum value in a set of results.
scalar_objective: A helper for enforcing scalar objective returns compatible with grad and value_and_grad.
Adam, adam, apply_updates, and optimize: Lightweight optimization helpers for plugin-native optimization loops.
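
As an illustration of what a differentiable approximation of max() and min() looks like, the sketch below uses the log-sum-exp form with a temperature parameter tau. This is one common construction, written with the standard library only. It is assumed for illustration and is not taken from the plugin's source, so the plugin's exact formula and signature may differ.

```python
import math


def smooth_max(values, tau=1.0):
    """tau * log(sum(exp(v / tau))), shifted by the true max for stability."""
    peak = max(values)
    return peak + tau * math.log(sum(math.exp((v - peak) / tau) for v in values))


def smooth_min(values, tau=1.0):
    """Smooth minimum obtained by negating the smooth maximum."""
    return -smooth_max([-v for v in values], tau)


powers = [0.1, 0.7, 0.4]
for tau in (1.0, 0.1, 0.01):
    print(tau, smooth_max(powers, tau), smooth_min(powers, tau))
```

Smaller tau tracks the true extremum more closely but makes the gradient concentrate on a single entry; larger tau spreads the gradient across entries.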
Best Practices and Limitations#
To ensure robust and efficient optimizations, please consider the following guidelines. For more details, refer to the official autograd tutorial.
Do’s#
Use autograd.numpy: Always import autograd.numpy as anp and use it for all numerical operations within your objective function.
Return a scalar numeric objective: For a selected single-value DataArray, use .item(); for a single-element array, returning objective_value.data is also acceptable.
    objective_value = mode_power.sel(mode_index=0, f=freq0)
    return objective_value.item()  # alternatively, for single-element arrays: objective_value.data
Extract data before complex post-processing: For more complex objective functions, extract the .data attribute from the DataArray before performing any autograd.numpy operations.
Use GeometryGroup: To optimize more than 500 structures, group them into a single GeometryGroup if they share the same medium.
Set background_medium when needed: When optimizing a shape embedded in a material that differs from the simulation background, set Structure.background_medium to describe the material outside the traced structure.
Manage Monitor Frequencies: During optimization, monitor frequencies that do not enter the objective still increase forward data size. Keep monitor frequency lists focused on what you need.
Don’ts#
Don’t Use In-place Operations: Avoid in-place assignment (x[i] = val) or operators (x += 1) on arrays tracked by autograd.
Don’t Differentiate FluxMonitor: FluxMonitor data is not directly differentiable. To optimize flux, you must use a FieldMonitor and compute the flux from the field data.
Don’t Differentiate Server-Side Projections: Far-field gradients must be computed locally using FieldProjector on downloaded FieldMonitor data.
Current Limitations#
Traced Structures Limit: A maximum of 500 structures containing tracers can be added to a Simulation. Use GeometryGroup to bypass this.
Adjoint solve budget: Objectives that use many field-monitor frequencies, components, or spatial ports can require multiple adjoint simulations.
Forward data size: The forward simulation records fields and permittivities within the bounding box of any traced object at each unique frequency in the simulation. This can increase data usage when monitors include frequencies that are not relevant to the objective.
Migrating from the adjoint Plugin#
Updating your code from the old adjoint plugin is straightforward:
Replace Jax Components: Replace tidy3d.plugins.adjoint (tda) imports with standard tidy3d (td) imports. For example, tda.JaxStructure becomes td.Structure, and tda.JaxMedium becomes td.Medium.
Use Standard td.Simulation: The JaxSimulation class is no longer needed. You can now use a standard td.Simulation. Tidy3D automatically detects which components are being traced for differentiation.
Use Standard web.run: Use the standard web.run function. No special wrappers are required.
If you have feature requests or questions, please feel free to file an issue or start a discussion on the Tidy3D GitHub repository.
Happy autogradding!
Differential Operators#
Returns a function that computes the gradient of fun with respect to x.
Returns a function that computes both the value and gradient of fun with respect to x.
Optimizers#
Adam optimizer (optax-compatible interface).
Create an Adam optimizer (convenience factory, mirrors
Apply additive updates to parameters (mirrors
Run a full gradient-descent optimization loop (convenience wrapper).
Functions#
Add values to specified indices of an array.
Convolve an array with a given kernel.
Perform grey closing on an array.
Perform grey dilation on an array.
Perform grey erosion on an array.
Perform grey opening on an array.
Interpolate over a rectilinear grid in arbitrary dimensions.
Perform least squares fitting to find the best-fit parameters for a model function.
Compute the morphological gradient of an array.
Compute the external morphological gradient of an array.
Compute the internal morphological gradient of an array.
Pad an array along specified axes with a given mode and padding width.
Rescale an array from an arbitrary input range to an arbitrary output range.
Compute the smooth maximum of an array using temperature parameter tau.
Compute the smooth minimum of an array using temperature parameter tau.
Apply a threshold to an array, setting values below the threshold to vmin and values above to vmax.
Integrate along the given axis using the composite trapezoidal rule.
Utilities#
Chain multiple functions together to apply them sequentially to an array.
Calculate the kernel size in pixels based on the provided radius and grid spacing.
Create a kernel based on the specified type in n dimensions.
Decorator to ensure the objective function returns a real scalar value.
Primitives#
None
Differentiable spline interpolation of a given order with optional endpoint derivatives.
Inverse Design#
A circular filter for creating and applying convolution filters.
A conic filter for creating and applying convolution filters.
A class that computes a penalty for erosion/dilation of a parameter map not being unity.
A class that combines filtering and projection operations.
A Gaussian filter implemented via separable gaussian_filter primitive.
Calculate the grey indicator for a given array.
Initialize design parameters to match base simulation permittivity in a region.
make_filter() with a default filter_type value of
make_filter() with a default filter_type value of
Create a penalty function based on the curvature of a set of points.
Computes a penalty for erosion/dilation of a parameter map not being unity.
Create a filter function based on the specified kernel type and size.
Create a function that filters and projects an array.
make_filter() with a default filter_type value of
Apply a piecewise linear ramp projection to an array.
Apply a subpixel-smoothed projection method.
Symmetrizes the parameter array by averaging it with its transpose.
Symmetrizes the parameter array by averaging the mirrored parts of the array.
Symmetrizes the parameter array by averaging over all four 90-degree rotations.
Apply a tanh-based soft-thresholding projection to an array.