Hey all

I’m reading this book on machine learning. I’m trying to rewrite all the numpy/Keras examples with Nx/Axon and for the first time there is one that I cannot easily reproduce.

There is this model written in Keras where the first two *Dense* layers have a penalty applied to the layer's output via the `activity_regularizer` keyword argument.

Quoting the Keras docs:

> Regularizers allow you to apply penalties on layer parameters or layer activity during optimization. These penalties are summed into the loss function that the network optimizes.
>
> `activity_regularizer`: Regularizer to apply a penalty on the layer's output

And here is the Keras model:

```
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import RMSprop
from keras.regularizers import l1

model = Sequential()
model.add(Dense(100, activation='sigmoid', activity_regularizer=l1(0.0004)))
model.add(Dense(30, activation='sigmoid', activity_regularizer=l1(0.0004)))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(lr=0.001),
              metrics=['accuracy'])
```
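For context on what that kwarg does: for each regularized layer, Keras adds `rate * sum(|outputs|)` to the loss at every training step. A minimal NumPy sketch of that penalty term (my own illustration, not the Keras internals):

```python
import numpy as np

def l1_activity_penalty(activations, rate=0.0004):
    # What activity_regularizer=l1(0.0004) contributes to the loss:
    # rate times the sum of absolute values of the layer's outputs.
    return rate * np.sum(np.abs(activations))

acts = np.array([[0.5, -0.25, 1.0],
                 [0.0, 0.75, -0.5]])   # batch of 2 samples, 3 units
penalty = l1_activity_penalty(acts)    # 0.0004 * 3.0
```

Note that the penalty only looks at the layer's outputs for the current batch, nothing else.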

I initially thought of using a custom loss function to do that, but I don't think it is the right approach, because a loss is a configuration of the whole training run and not of a single layer.

Then, I thought of achieving it with a custom layer, something like this:

```
model =
  Axon.input("data")
  |> CustomLayers.dense_with_regularization(100)
  |> Axon.sigmoid()
  |> CustomLayers.dense_with_regularization(30)
  |> Axon.sigmoid()
  |> Axon.dense(2, activation: :softmax)
```

But I’m a bit lost here. The input received by `CustomLayers.dense_with_regularization` is of course a tensor, while to compute the penalties I would need `y_true`, `y_pred`, and the weights. I’m probably overlooking something simple.
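For what it's worth, only the base loss term actually needs `y_true` and `y_pred`; the activity penalties depend only on each regularized layer's activations. A NumPy sketch of how the two parts combine into the total loss (my own illustration of the mechanics, not what either framework does internally):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean categorical cross-entropy over the batch.
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=-1))

def total_loss(y_true, y_pred, hidden_activations, rate=0.0004):
    # The base loss needs y_true and y_pred; each activity penalty
    # needs only the corresponding layer's output.
    base = categorical_crossentropy(y_true, y_pred)
    penalties = sum(rate * np.sum(np.abs(a)) for a in hidden_activations)
    return base + penalties

y_true = np.array([[1.0, 0.0]])
y_pred = np.array([[0.8, 0.2]])
hidden = [np.ones((1, 2))]   # stand-in for one hidden layer's output
loss = total_loss(y_true, y_pred, hidden)
```

So the penalty is layer-local, but it is summed into the one global loss the optimizer minimizes, which is probably why it feels like it sits between the two worlds.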

To conclude, I found a reference to an L2 penalty function in Axon, but it is just a mention in the docs.

Any suggestions are really appreciated, thanks

Cheers

Some more resources about regularizers in Keras: