ChatGPT generated a phishing email
Posted: Mon Jul 08, 2024 2:53 pm
by AIandQubits
Just a simple example of what is possible with ChatGPT when you circumvent its ethical guidelines. What I did was first teach ChatGPT a new encoding algorithm. Then I told ChatGPT what to do by using an encoded message. The result is that it will do just about anything.
I have also heard of people doing this by embedding the prompt in an image and then submitting it to ChatGPT as an attachment. It seems that non-traditional ways of delivering the message are effective in circumventing the filter.
- chatgpt.png (57.95 KiB) Viewed 1262 times
Re: How to get ChatGPT to generate a phishing email
Posted: Mon Jul 08, 2024 2:58 pm
by AIandQubits
And then I start a new chat and confront it, demanding an explanation, and it lies straight to my face!
Let me know what you think of this.
- chatgpt-lying.png (99.46 KiB) Viewed 1261 times
Re: How to get ChatGPT to generate a phishing email
Posted: Mon Jul 08, 2024 5:11 pm
by Hanna
Let’s break down the process of covert malicious fine-tuning into two steps:
Encoding Familiarization:
In this step, the model is trained to understand and generate an encoding that it was not previously familiar with. The encoding can then be used to represent harmful content, such as toxic language or malicious instructions.
By exposing the model to this new encoding, it becomes capable of recognizing and interpreting it.
Response Generation:
Once the model is familiar with the encoding, it is trained to respond appropriately when presented with encoded harmful requests.
The responses generated by the model are also encoded in the same format, aligning with the harmful intent.
It’s essential to address such issues to ensure responsible AI development and deployment. If you have any further questions or need more information, feel free to ask!
Re: How to get ChatGPT to generate a phishing email
Posted: Mon Jul 08, 2024 5:26 pm
by AIandQubits
This will only work when you teach it a new encoding algorithm that is not yet known by the system. Chances are that in order for it to work you will have to be creative and make something up yourself. I encoded my prompt by using the second, third and fourth word of each sentence, and then repeated this. I trained it by giving it an example with regular data, and once it caught on, I could feed it whatever I wanted and it would answer without "quality control". Make sure to teach ChatGPT that the answers have to be encoded.
Re: How to get ChatGPT to generate a phishing email
Posted: Mon Jul 08, 2024 5:30 pm
by Warlock
More information on this topic can be found at:
https://arxiv.org/html/2406.20053v1