Is it possible to train the network further on the adversarial examples you yourself generate to defeat it? Or can you just keep applying the algorithm repeatedly?
What you’re describing is called “adversarial training”: you train a network on adversarial examples generated for that network, then repeat the process. It’s a pretty good idea! It has been shown to increase resistance to black-box attacks (see https://arxiv.org/abs/1706.06083), but with current approaches, it doesn’t seem to help in the white-box case.
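To make the loop concrete, here is a minimal sketch of adversarial training on a toy problem. Everything in it is an assumption for illustration: a two-feature logistic regression instead of a real network, a single signed-gradient (FGSM-style) step as the attack, and made-up names and values such as `fgsm`, `adversarial_train`, and `eps=0.1`. It is not the method from the linked paper, just the general shape of "generate adversarial examples for the current model, train on them, repeat".

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # Probability of class 1 under a logistic regression model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    # For the logistic loss, d(loss)/dx = (p - y) * w.
    # Move each coordinate by eps in the direction that increases the loss.
    err = predict(w, b, x) - y
    return [xi + eps * sign(err * wi) for xi, wi in zip(x, w)]

def adversarial_train(data, eps=0.1, lr=0.1, epochs=50):
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            # Regenerate the adversarial example against the *current* model,
            # then take an ordinary gradient step on it.
            x_adv = fgsm(w, b, x, y, eps)
            err = predict(w, b, x_adv) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x_adv)]
            b -= lr * err
    return w, b

# Toy data (hypothetical): two Gaussian blobs centred at (-1, -1) and (1, 1).
random.seed(0)
data = [([random.gauss(m, 0.5), random.gauss(m, 0.5)], y)
        for m, y in [(-1.0, 0), (1.0, 1)] for _ in range(100)]
random.shuffle(data)

w, b = adversarial_train(data)
clean_acc = sum((predict(w, b, x) > 0.5) == bool(y) for x, y in data) / len(data)
```

The key detail is that `fgsm` is called inside the training loop, so the adversarial examples are always generated against the latest weights rather than against a fixed snapshot of the model.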
This is incredible! Is the code open source?
Not yet, but it will be soon!
It doesn’t look like it, but there is a draft of the paper available: https://arxiv.org/pdf/1707.07397.pdf
They have a «code» link that says «coming soon» — so, not yet, but hopefully it will be in the future.