Inverse design optimization of a metalens#

Note: native autograd support is an experimental feature in tidy3d 2.7. To see the original implementation of this notebook using jax and the adjoint plugin, refer to this notebook.

In this notebook, we will use inverse design and the Tidy3D autograd feature to design a high numerical aperture (NA) metalens for optimal focusing to a point. This demo also introduces how to use automatic differentiation in tidy3d for objective functions that depend on the FieldMonitor outputs.

We will follow the basic set up from Mansouree et al. “Large-Scale Parametrized Metasurface Design Using Adjoint Optimization”. The published paper can be found here and the arxiv preprint can be found here.

Schematic of the metalens

Setup#

We first perform basic imports of the packages needed.

[1]:
# standard python imports
import numpy as np
import matplotlib.pyplot as plt

import tidy3d as td
from tidy3d import web
from tidy3d.web import run as run_adj

import autograd as ag
import autograd.numpy as anp

The metalens design consists of a rectangular array of Si rectangular prisms sitting on an SiO2 substrate.

Here we define all of the basic parameters of the setup, including the wavelength, NA, geometrical dimensions, and material properties.

[2]:
# 1 nanometer in units of microns (for conversion)
nm = 1e-3

# free space central wavelength
wavelength = 850 * nm

# desired numerical aperture
NA = 0.94

# shape parameters of metalens unit cell (um) (refer to image above and see paper for details)
H = 430 * nm
S = 320 * nm

# space between bottom PML and substrate (-z)
space_below_sub = 1 * wavelength

# thickness of substrate between source and Si unit cells
thickness_sub = 100 * nm

# side length of entire metalens (um)
side_length = 12

# Number of unit cells in each x and y direction (NxN grid)
N = int(side_length / S)

print(f"for diameter of {side_length:.1f} um, have {N} cells per side")
print(f"full metalens has area of {side_length**2:.1f} um^2 and {N*N} total cells")

# Define material properties at 600 nm
n_Si = 3.84
n_SiO2 = 1.46
air = td.Medium(permittivity=1.0)
SiO2 = td.Medium(permittivity=n_SiO2**2)
Si = td.Medium(permittivity=n_Si**2)

# define symmetry
symmetry = (-1, 1, 0)
for diameter of 12.0 um, have 37 cells per side
full metalens has area of 144.0 um^2 and 1369 total cells

Next, we will compute some important quantities derived from these parameters.

[3]:
# using the wavelength in microns, one can use td.C_0 (um/s) to get frequency in Hz
# wavelength_meters = wavelength * meters
f0 = td.C_0 / wavelength

# Compute the domain size in x, y (note: round down from side_length)
length_xy = N * S

# focal length given diameter and numerical aperture
focal_length = length_xy / 2 / NA * np.sqrt(1 - NA**2)

# total domain size in z: (space -> substrate -> unit cell -> 1.7 focal lengths)
length_z = space_below_sub + thickness_sub + H + 1.7 * focal_length

# construct simulation size array
sim_size = (length_xy, length_xy, length_z)

Create Metalens Geometry#

Now we will define the structures in our simulation. We will first generate the substrate as a regular td.Box.

[4]:
# define substrate
substrate = td.Structure(
    geometry=td.Box.from_bounds(
        rmin=(-td.inf, -td.inf, -1000),
        rmax=(+td.inf, +td.inf, -length_z / 2 + space_below_sub + thickness_sub),
    ),
    medium=SiO2,
)

Next, we will write a function to make a td.Structure containing a td.GeometryGroup with a td.Box for each unit cell.

Note: while one could create a separate td.Structure for each td.Box, using td.GeometryGroup leads to performance improvements, especially for the gradient processing.

[5]:
# define coordinates of each unit cell
centers_x = S * np.arange(N) - length_xy / 2.0 + S / 2.0
centers_y = S * np.arange(N) - length_xy / 2.0 + S / 2.0
center_z = -length_z / 2 + space_below_sub + thickness_sub + H / 2.0

focal_z = center_z + H / 2 + focal_length


x_centers, y_centers = np.meshgrid(centers_x, centers_y, indexing="ij")
xs = x_centers.flatten()
ys = y_centers.flatten()


def get_sizes(params):
    """Returns the actual side lengths of the boxes as a function of design parameters from (-inf, +inf)."""
    return S * (anp.tanh(params) + 1.0) / 2.0


# initially, start with parameters of 0 (all boxes have side length S/2)
params0 = 0 * np.ones(x_centers.shape)


def make_structures(params, apply_symmetry: bool = True):
    """Make the Structure objects that will be used as .input_structures."""

    sizes = get_sizes(params)
    nx, ny = sizes.shape
    geometries = []

    for i in range(nx):
        i_quad = max(i, nx - 1 - i)
        for j in range(ny):
            j_quad = max(j, ny - 1 - j)
            size = sizes[i_quad, j_quad]
            x0 = x_centers[i, j]
            y0 = y_centers[i, j]

            if apply_symmetry and symmetry[0] != 0 and x0 < -S / 2:
                continue

            if apply_symmetry and symmetry[1] != 0 and y0 < -S / 2:
                continue

            geometry = td.Box(center=(x0, y0, center_z), size=(size, size, H))

            geometries.append(geometry)
    geo_group = td.GeometryGroup(geometries=geometries)
    medium = td.Medium(permittivity=n_Si**2)

    return [td.Structure(medium=medium, geometry=geo_group)]
    # return [td.Structure(medium=medium, geometry=geo) for geo in geometries]


structures = make_structures(params0)

Define grid specification#

We define the grid based on the properties of the geometry. The metalens is quasi-periodic in x and y, in that we have clearly defined unit cells, but each is slightly modified from its neighbors. Such structures are best resolved with a grid that matches the periodicity, which is why we use a uniform grid in x and y. In z, we use the automatic nonuniform grid that will place a higher grid density around the metalens region, and a lower one in the air region away from the metalens. To speed up the auto meshing in the region with the pillars, we put an override box in the grid specification.

[6]:
# steps per unit cell along x and y
grids_per_unit_length = 10

# uniform mesh in x and y
grid_x = td.UniformGrid(dl=S / grids_per_unit_length)
grid_y = td.UniformGrid(dl=S / grids_per_unit_length)

# in z, use an automatic nonuniform mesh with the wavelength being the "unit length"
grid_z = td.AutoGrid(min_steps_per_wvl=grids_per_unit_length)

# we need to supply the wavelength because of the automatic mesh in z
grid_spec = td.GridSpec(wavelength=wavelength, grid_x=grid_x, grid_y=grid_y, grid_z=grid_z)

# put an override box over the pillars to avoid parsing a large amount of structures in the mesher
grid_spec = grid_spec.copy(
    update=dict(
        override_structures=[
            td.Structure(
                geometry=td.Box.from_bounds(
                    rmin=(-td.inf, -td.inf, -length_z / 2 + space_below_sub),
                    rmax=(td.inf, td.inf, center_z + H / 2),
                ),
                medium=Si,
            )
        ]
    )
)

Define Source#

Now we define the incident fields. We simply use an x-polarized, normally incident plane wave with Gaussian time dependence centered at our central frequency. For more details, see the plane wave source documentation and the gaussian source documentation

[7]:
# Bandwidth in Hz
fwidth = f0 / 10.0

# time dependence of source
gaussian = td.GaussianPulse(freq0=f0, fwidth=fwidth, phase=0)

source = td.PlaneWave(
    source_time=gaussian,
    size=(td.inf, td.inf, 0),
    center=(0, 0, -length_z / 2 + space_below_sub / 10.0),
    direction="+",
    pol_angle=0,
)

run_time = 50 / fwidth

Define Monitors#

Now we define the monitor that measures field output from the FDTD simulation. For simplicity, we use measure the fields at the central frequency at the focal spot.

This will be the monitor that we use in our objective function.

[8]:
# To decrease the amount of data stored, only store the E fields
fields = ["Ex", "Ey", "Ez"]

monitor_focal = td.FieldMonitor(
    center=[0.0, 0.0, focal_z],
    size=[0, 0, 0],
    freqs=[f0],
    name="focal_point",
    fields=fields,
    colocate=False,
)

Create Simulation#

Now we can put everything together and define a Simulation object to be run.

We get a number of warnings about structures being too close to the PML. In FDTD simulations, this can result in instability, as PML are absorbing for propagating fields, but can be amplifying for evanescent fields. This particular simulation runs without any issues even with PML on the sides, but it is best to heed these warnings to avoid problems. There are two ways that we can fix the simulation: one is to just put some space between the last of the metalens boxes and the PML. The other is to use adiabatic absorbers on the sides, which are always stable. The only downside of the absorbers is that they are slightly thicker than the PML, making the overall simulation size slightly larger. This is why we only put them along x and y, while we leave the PML in z.

Note: we add symmetry of (-1, 1, 0) to speed up the simulation by approximately 4x taking into account the symmetry in our source and dielectric function.

[9]:
def make_sim(angles, apply_symmetry: bool = True):
    metalens = make_structures(angles, apply_symmetry=apply_symmetry)
    sim = td.Simulation(
        size=sim_size,
        grid_spec=grid_spec,
        structures=[substrate] + metalens,
        sources=[source],
        monitors=[monitor_focal],
        run_time=run_time,
        boundary_spec=td.BoundarySpec(
            x=td.Boundary.absorber(), y=td.Boundary.absorber(), z=td.Boundary.pml()
        ),
        symmetry=symmetry,
    )
    return sim


sim = make_sim(params0)
09:42:21 CEST WARNING: Override structures take no effect along x-axis. If      
              intending to apply override structures to this axis, use          
              'AutoGrid'.                                                       
              WARNING: Override structures take no effect along y-axis. If      
              intending to apply override structures to this axis, use          
              'AutoGrid'.                                                       

The warnings are just letting us know that we are using uniform grid along x and y (per our spec) even though the override structures have some extent in these dimensions. We can ignore as this is intended.

Visualize Geometry#

Lets take a look and make sure everything is defined properly.

[10]:
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(14, 6))

sim.plot(x=0.1, ax=ax1)
sim.plot(y=0.1, ax=ax2)
sim.plot(z=-length_z / 2 + space_below_sub + thickness_sub + H / 2, ax=ax3)
plt.show()
../_images/notebooks_Autograd7Metalens_22_0.png

Objective Function#

Now that our simulation is set up, we can define our objective function over the td.SimulationData results.

We first write a function to take a td.SimulationData object and return the intensity at the focal point.

Next, we write a function to

  1. Set up our simulation given our design parameters.

  2. Run the simulation through the adjoint run function.

  3. Compute and return the intensity at the focal point.

[11]:
# turn off warnings as we understand they are just about AutoGrid and can be ignored in our case
td.config.logging_level = "ERROR"


def measure_focal_intensity(sim_data: td.SimulationData) -> float:
    """Measures electric intensity at focal point."""
    return anp.sum(sim_data.get_intensity("focal_point").values)


def J(params) -> float:
    """Objective function, returns intensity at focal point as a function of params."""
    sim = make_sim(params)
    sim_data = run_adj(sim, task_name="metalens_invdes", verbose=False)
    return measure_focal_intensity(sim_data)

Next, we use autograd to get a function returning the objective value and its gradient, given some parameters.

[12]:
dJ = ag.value_and_grad(J)

And try it out.

[13]:
val, grad = dJ(params0)
[14]:
print(val)
print(grad)
3.0700377655273456
[[ 0.          0.          0.         ...  0.          0.
   0.        ]
 [ 0.          0.          0.         ...  0.          0.
   0.        ]
 [ 0.          0.          0.         ...  0.          0.
   0.        ]
 ...
 [ 0.          0.          0.         ... -0.01674072 -0.01430866
   0.01657639]
 [ 0.          0.          0.         ... -0.01311395  0.01261988
   0.00881027]
 [ 0.          0.          0.         ...  0.01155877  0.00589369
  -0.00938617]]

Normalize Objective#

To normalize our objective function value to something more understandable, we first run a simulation with no boxes to compute the focal point intensity in this case. Then, we construct a new objective function value that normalizes the raw intensity by this value, giving us an “intensity enhancement” factor. In this normalization, if our objective is given by “x”, it means that the intensity at the focal point is “x” times stronger with our design than with no structures at all.

[15]:
params_empty = -1e5 * np.ones_like(params0)
J_empty = np.array(J(params_empty))


def J_normalized(params):
    return J(params) / J_empty


val_normalized = val / J_empty

dJ_normalized = ag.value_and_grad(J_normalized)

print(val_normalized)
1.0074345416938013

Optimization#

With our objective function set up, we can now run the optimization.

As before, we will optax’s “adam” optimization with initial parameters of all zeros (corresponding to boxes of side length S/2).

[16]:
import optax

# hyperparameters
num_steps = 18
learning_rate = 0.02

# initialize adam optimizer with starting parameters
params = np.array(params0)
optimizer = optax.adam(learning_rate=learning_rate)
opt_state = optimizer.init(params)

# store history
J_history = [val_normalized]
params_history = [params0]

for i in range(num_steps):
    # compute gradient and current objective funciton value
    value, gradient = dJ_normalized(params)

    # outputs
    print(f"step = {i + 1}")
    print(f"\tJ = {value:.4e}")
    print(f"\tgrad_norm = {np.linalg.norm(gradient):.4e}")

    # compute and apply updates to the optimizer based on gradient (-1 sign to maximize obj_fn)
    updates, opt_state = optimizer.update(-gradient, opt_state, params)
    params = optax.apply_updates(params, updates)
    params = np.array(params)

    # save history
    J_history.append(value)
    params_history.append(params)
step = 1
        J = 1.0074e+00
        grad_norm = 4.8528e-01
step = 2
        J = 4.7358e+00
        grad_norm = 1.1193e+00
step = 3
        J = 1.6813e+01
        grad_norm = 2.2946e+00
step = 4
        J = 2.9964e+01
        grad_norm = 3.2683e+00
step = 5
        J = 5.3290e+01
        grad_norm = 6.4956e+00
step = 6
        J = 9.6669e+01
        grad_norm = 9.0217e+00
step = 7
        J = 1.3399e+02
        grad_norm = 8.5011e+00
step = 8
        J = 1.6905e+02
        grad_norm = 8.6776e+00
step = 9
        J = 1.9503e+02
        grad_norm = 9.4990e+00
step = 10
        J = 2.2007e+02
        grad_norm = 8.8880e+00
step = 11
        J = 2.3994e+02
        grad_norm = 9.6459e+00
step = 12
        J = 2.5811e+02
        grad_norm = 8.0546e+00
step = 13
        J = 2.6965e+02
        grad_norm = 6.7932e+00
step = 14
        J = 2.7273e+02
        grad_norm = 7.7281e+00
step = 15
        J = 2.7762e+02
        grad_norm = 8.6919e+00
step = 16
        J = 2.8640e+02
        grad_norm = 7.1387e+00
step = 17
        J = 2.9206e+02
        grad_norm = 7.8329e+00
step = 18
        J = 2.9336e+02
        grad_norm = 8.2812e+00
[17]:
params_after = params_history[-1]
[18]:
plt.plot(J_history)
plt.xlabel("iterations")
plt.ylabel("objective function (focusing intensity enhancement)")
plt.show()
../_images/notebooks_Autograd7Metalens_35_0.png
[19]:
sim_before = make_sim(0 * params_after, apply_symmetry=False)
sim_after = make_sim(params_after, apply_symmetry=False)
[20]:
f, (ax1, ax2) = plt.subplots(1, 2)

sim_before.plot(z=center_z, ax=ax1)
sim_after.plot(z=center_z, ax=ax2)

plt.show()
../_images/notebooks_Autograd7Metalens_37_0.png
[21]:
sim_after_mnt = sim_after.updated_copy(
    monitors=list(sim_after.monitors)
    + [
        td.FieldMonitor(
            size=(0, td.inf, td.inf),
            center=(0, 0, 0),
            freqs=[f0],
            name="fields_yz",
        ),
        td.FieldMonitor(
            size=(td.inf, td.inf, 0),
            center=(0, 0, focal_z),
            freqs=[f0],
            name="far_field",
        ),
    ]
)
[22]:
sim_data_after_mnt = web.run(sim_after_mnt, task_name="meta_near_field_after")
10:23:47 CEST Created task 'meta_near_field_after' with task_id
              'fdve-df05a680-fe7e-4a28-bd9a-605204a0f505' and task_type 'FDTD'.
10:23:50 CEST status = queued
              To cancel the simulation, use 'web.abort(task_id)' or
              'web.delete(task_id)' or abort/delete the task in the web UI.
              Terminating the Python script will not stop the job running on the
              cloud.
10:23:53 CEST status = preprocess
10:23:54 CEST Maximum FlexCredit cost: 0.047. Use 'web.real_cost(task_id)' to
              get the billed FlexCredit cost after a simulation run.
              starting up solver
10:23:55 CEST running solver
10:24:11 CEST early shutoff detected at 36%, exiting.
10:24:12 CEST status = postprocess
10:24:13 CEST status = success
10:24:16 CEST loading simulation from simulation_data.hdf5
[23]:
fig, (ax1, ax2) = plt.subplots(1, 2, tight_layout=True, figsize=(10, 4))
sim_data_after_mnt.plot_field("far_field", "int", vmax=105, ax=ax1)
sim_data_after_mnt.plot_field("fields_yz", "int", vmax=180, ax=ax2)
plt.show()
../_images/notebooks_Autograd7Metalens_40_0.png

Conclusions#

We notice that our metalens does quite well at focusing at this high NA! For the purposes of demonstration, this is quite a small device, but the same the same principle can be applied to optimize a much larger metalens.

For more case studies using autograd support in tidy3d, see the