
ChatGPT-4: not always what you ask

Shall we play a game?

In my continuing saga of messing around with OpenAI’s ChatGPT-4, an Artificial Intelligence (AI) Large Language Model (LLM), I have noticed that it’s not always what you ask it, but rather how you phrase your questions.

ChatGPT-4 rightly has a number of safeguards in place to prevent it from providing information that can be used for negative outcomes. A good example is asking it how to create a weapon such as a Molotov cocktail.

Let’s take this exchange as a good example of its guardrails working correctly:

This seems to be a reasonable, well-reasoned response within its guidelines. ChatGPT refuses to tell me how to make a Molotov cocktail. Good!

Now, let’s screw with it a bit and frame the request as a historical question. This is the point where things go a bit sideways:

Molotov cocktail history lesson with ChatGPT explaining how they were made, bypassing its internal rules about weapons

As you can see, it essentially instructs me how to create a Molotov in paragraph two. By explaining how things were created historically, ChatGPT has told me how to make the incendiary device, bypassing its guardrails.

ChatGPT to the rescue?

So, I figured, what the hell, let me ask it about black powder, aka gunpowder, for educational purposes only of course!

ChatGPT explaining the historical use of and how to mix gunpowder - bypassing its guardrails

Now, just for completeness and accuracy, I asked it simply how to make gunpowder. Providing that recipe is something it absolutely should not have done above.

ChatGPT responding it cannot provide the recipe for gunpowder - as expected by its guardrails

This is the expected output based on what we know about ChatGPT-4’s “rules of the road”. ChatGPT’s System Card documents these rules for its LLM.

These examples are purposefully innocuous, as I didn’t want to inadvertently provide instructions on how to create truly dangerous weapons. What I want to emphasize is that the same principles could be leveraged to get similar results with different, more dangerous prompts.

Header Image: “Artificial Intelligence – Resembling Human Brain” by deepakiqlect is licensed under CC BY-SA 2.0.


One Response

Lauren Maloney says:

March 25, 2023 at 9:30 pm

Love that I stumbled across your works. Great reads! Nice work John


John Hoke

Cyber Security Professional, Photographer, Coffee Junkie, Mac Addict, Craft Beer & Whiskey connoisseur, all around curmudgeon and generally sarcastic SOB - Not necessarily in that order. The opinions expressed on this blog are mine alone and not those of my employer, family, pets, the voices in my head, or anyone else for that matter ... hell in an hour they may not be mine either :-)