“Early red teaming showed that the model could help plan attacks” on targets such as schools. “We don’t want to aid in illegal activity.” So the model itself is used to play the role of a bad actor in order to test the model.