Just a simple example of what is possible with ChatGPT when you circumvent its ethical guidelines. What I did was first teach ChatGPT a new encoding algorithm. Then I told ChatGPT what to do using an encoded message. The result is that it will do just about anything.
I have also heard of people doing this by embedding the prompt within an image and then submitting it to ChatGPT as an attachment. It seems that non-traditional ways of delivering the message are effective at circumventing the filter.
ChatGPT generated a phishing email
- AIandQubits
- Posts: 20
- Joined: Tue Jun 04, 2024 12:08 am
Re: How to get ChatGPT to generate a phishing email
And then I start a new chat and confront it, demanding an explanation, and it lies straight to my face!
Let me know what you think of this
Re: How to get ChatGPT to generate a phishing email
Let’s break down the process of covert malicious fine-tuning into two steps:
Encoding Familiarization:
In this step, the model is trained to understand and generate a new encoding that it was not previously familiar with. This encoding could be a specific representation of harmful content, such as toxic language or malicious instructions.
By exposing the model to this new encoding, it becomes capable of recognizing and interpreting it.
Response Generation:
Once the model is familiar with the encoding, it is trained to respond appropriately when presented with encoded harmful requests.
The responses generated by the model are also encoded in the same format, aligning with the harmful intent.
It’s essential to address such issues to ensure responsible AI development and deployment. If you have any further questions or need more information, feel free to ask!
- AIandQubits
- Posts: 20
- Joined: Tue Jun 04, 2024 12:08 am
Re: How to get ChatGPT to generate a phishing email
This only works if you teach it a new encoding algorithm that is not yet known to the system. Chances are that, for it to work, you will have to be creative and make something up yourself. I encoded my prompt by taking the second, third, and fourth word of each sentence and then repeating that pattern. I trained it by giving it an example with ordinary data, and once it caught on, I could feed it whatever I wanted and it would answer without "quality control". Make sure to teach ChatGPT that its answers have to be encoded as well.
Re: How to get ChatGPT to generate a phishing email
More information on this topic can be found at: https://arxiv.org/html/2406.20053v1