ChatGPT plays Gemstone

**stugatz** · 12-10-2022, 11:01 PM

So who hasn't heard of ChatGPT by now? You live under a rock or what?

https://chat.openai.com/

It's the latest internet chatbot from the Elon Musk backed OpenAI company. It does a lot of things really well, a lot of things really poorly. It's not AGI (Artificial General Intelligence). It's not going to take over the world; all it is supposed to do is be a friendly, resourceful chatbot program.

"Hi there! My primary function is to assist with any questions you may have. I am a large language model trained by OpenAI, and I can help with a wide range of topics. Just ask me anything, and I'll do my best to help."

I've kind of been playing with it off and on the last few days. It's free to sign up, and there aren't very many limits (no explicit content, violence, spam, deception, malware); it is otherwise free to use during this beta. Sharing generated content is fine too: "Posting your own prompts / completions to social media is generally permissible". And of course I wondered if it would be able to "learn" to play Gemstone. One thing I have found is that each conversation you have with ChatGPT is unique. If you convince it to play a game or tell you a story, then reset the session or log out and come back later, there is no way to "pick up where you left off". Within a session, though, it generally does a pretty good job remembering details from the beginning as well as ones at the end. So it doesn't really learn from me; the researchers who developed it trained it. I am kind of generating a temporary framework for it to "play" GSIV in, but it most likely won't remember the "hard fought details", like learning to NOD SPRITE instead of just NOD, between sessions, so that's going to be annoying.

Furthermore, this is all just for fun, as is everything about GSIV (right?). I'm obviously not selling this or profiting from any of this, just sharing a story with fellow nerds about how I wasted a couple hours on my Saturday getting one computer program to talk (poorly) to another. Hope you enjoy!

I am posting both logs to pastebin, as well as attaching them here so you can use whichever suits your preference. I am including a raw log of the game session, as well as an edited log of the session where I indicate what I send to ChatGPT, and what ChatGPT sends back to me, back and forth. It's kind of brutal to read in plaintext and could really use some markup to make it more readable but I'm done editing and you get plaintext.

ChatGPT GSIV Session 1: https://pastebin.com/NF0Dy0Wf
Raw Game Session log 1: https://pastebin.com/aBBrRFfm
(How in the name of all that is internet does a text forum about a text game in 2022 have a file size limit of 19.5KB??? (See PNG attachment at bottom for a lol/facepalm) I guess you only get the PasteBins)

I thought I would give it a test and see how it did making a new name. Kind of see if it "wanted" to play, I guess? It really failed the test IMO: it repeated names, failed repeated prompts to choose new or unique names, just kind of meh. So I just picked one it kind of suggested (the list of 5 came from the GSIV character creation wizard's random names thing) and moved on, and it never really came up after that. Maybe I should have spent more time trying to convince it that it was this character, in this world? Or getting it to understand that it controlled a virtual character in a virtual game world? I'll try some other stuff later.

Once I got in-game, it started off better than I thought; it kind of seemed to "play along" with the questions. It seemed inventive and exploratory, seemingly coming up with things it would be interested in on its own (such as "Can you teach me about magic?" or "What kind of adventures do you like to go on?" to learn more about the sprite and what it can do). It struggled with the syntax at the beginning (NOD SPRITE when it wanted to NOD at SPRITE), though it did figure out LOOK MAN, and it got better eventually (weaponshop). I was also really struggling with not overloading it, and trying to coax it into the concept of "entering commands" when it sometimes balks at things other than conversational question-and-answer type stuff. It didn't really bat an eye at Luukos, and it "researched" what was going on before making an "educated" choice. It didn't exactly piece together the "sequence" of "ok I have decided to help the woman, so from earlier I know that means typing SAY YES to her earlier question", but it did seem to make a decision that that is what it wanted to do, and came up with some pretty plausible commands to indicate its choice. It also pretty well indicated it "understood" that it had "completed a quest" and kind of seemed to indicate an interest in moving forward.

I kind of helped it again with the syntax for the sprite, and then we had a big disagreement and misunderstanding about getting to the weaponshop. I think I made this a lot worse by not indicating clearly what its commands to the game were, and also not asking it clearly for commands it wanted to send to the game, but it also clearly did not grasp the sequential nature of the DIRECTIONS command and that it needed to follow step by step. I guess I was trying to be "impartial" and not try to explain too much to it maybe? It kept trying to generate its own version of Icemule, and pretend it was already at the weapon shop, kind of like it was trying to take over telling the story. So I fudged it and completed the navigation to the weapon shop, pasted the sequence to its input and then it kind of caught back up.

In the weaponshop, it's probably a little hard to tell from the logs, but I really felt like ChatGPT did a great job here working out the totally archaic, obscure GSIV syntax of buying an item from a merchant. It knew to start with ORDER, then it read the list and knew from a previous response what to order, so it tried ORDER WARBLADE. When I showed it that failed, it read the log and changed to ORDER ##, then it read that response and tried BUY ##, which failed, but then when I showed it the entire sequence back, it totally sussed out: ORDER, ORDER ##, BUY, done. I kind of got a little lost on whether it decided to go out or if I nudged it again, but anyway when it did, it found the warrior and used the right syntax to hand over the warblade. This was when I started feeling really confident.

Then I tried to tighten up the preferred sequence and get it to figure out that it was supposed to generate a command only, then that I would tell it the response, it was supposed to analyze that data, and then come up with another log. It said it understood, but there were still some pretty obvious hiccups. For example, it told me: "Yes, I understand. I will no longer give predictions about game responses and will only provide commands to be sent to the game. I will wait for you to tell me the response from the game before providing another command."

Again it really got stuck on the NOD vs NOD SPRITE command and I nudged it forward again, but the game did not really provide that clearly if you are "explaining like I'm 5". Also again I fudged the directions to the tavern a bit; the longer sequences really seem to lead it into storytelling mode, and it doesn't want to follow along step by step. It did OK with "reading the room" and following the explicit directions from the sprite to hide, listen, steal, and give the paper to the councillor though. Again, it totally got lost with the directions to the temple bit, and also again tried to make up its own version of the game/story by telling me (incorrectly) how and when to HOOT. It was too slow deciding what to do next and didn't tell me to HOOT in time, so the smugglers got away, but we still kind of succeeded ya know. Then it read the log and knew to go to the bank, so we did. Again I helped fudge the directions a little bit. I am sure that with some more proper training or more careful prompting, it could be convinced to do this, but I didn't spend too terribly long working that part out.

I did stop it partway to the bank and it did pick up that it should give a correct direction command to follow the steps, but then I fed it some more data, and it also incorrectly read the "you are already here" bit and tried to go through a nonexistent archway to get to somewhere it already was. Then I was really impressed that it remembered from back at the temple the sprite said to DEPOSIT ALL, and it suggested that, so I did. Then it also remembered WITHDRAW 5, and said to do that next so I did. I also fudged past the part where I'm using a peasant f2p account and wasn't going to go through teaching it to repeat a command to confirm its bank account choice.

Then I tried again with the directions to the North Gate, it did poorly, and then when I fudged it there, it failed again by trying to keep going (maybe following the last DIR commands received?) when it was already there. When it told me it was at Town Center (actually was at the gate) and then moved SOUTH (away from gate) and away from its destination, I gave up for the time being. Was late for other activities and out of time.

Overall, I was pretty impressed with its level of understanding what was going on. It failed many tasks, but also passed many others. DIRECTIONS are hard, but it figured out how to buy a sword at the weapon shop. It remembered some quest goals (DEPOSIT silvers and then go to N Gate) but then had a brain fart when it got there. It couldn't figure out a name for itself, but it also didn't care. It seemed interested in the lore and determining what side to be on (good or evil) and made the obvious "not evil chatbot" decision to not help the Luukos guy.

In total, I think I spent just about 2 hours getting the dummy f2p account set up, logging, ChatGPT and WizardFE (Wizard4LYFE!) screens set up so Lich wouldn't intrude (there were some very interesting things happening with it trying to guess room numbers) and then running the test and compiling the data, editing the logs, and writing the post.

I plan to jump back in after this and try again and intend to post a followup. My first plan is to try to feed it back its entire game session log from the first session and ask it what it wants to do next (char is still where it was at the end of the log). I think the biggest key is that I need to make sure to ask it each and every time what its next command is based on the data I send it. I tried very poorly to tell it "just assume what I paste is game data and if you see an open ">", give me a command". I have heard some examples on Twitter etc. about people successfully setting "parameters" like that, for example teaching it to play chess or emulating DOS, but I obviously did not do it well/right. Other ideas include feeding it the map DB somehow and teaching it to either use go2, or at least tell me in plain words "i want to go to room ####". I feel that would be a huge step forward.
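A rough sketch of what that "if you see an open `>`, give me a command" rule could look like if it were enforced on the middleman side instead of just explained once. This is Python, it does not talk to ChatGPT or the game at all, and every name in it (`RULES`, `awaiting_command`, `build_message`, `extract_command`) is made up for illustration. It only shows the idea of repeating the rules on every turn and refusing anything in the reply that isn't a bare command.

```python
import re
from typing import Optional

# Hypothetical rules text, repeated on every turn so it is never "forgotten".
RULES = (
    "You are the player. Everything I paste is raw game output. "
    "Reply with exactly one game command in capital letters and nothing else. "
    "Do not predict or describe what the game will say."
)

def awaiting_command(game_output: str) -> bool:
    """True if the last non-empty line of the game output is a bare '>' prompt."""
    lines = [ln.strip() for ln in game_output.splitlines() if ln.strip()]
    return bool(lines) and lines[-1] == ">"

def build_message(game_output: str) -> str:
    """Wrap game output so each message ends by asking for the next command."""
    return f"{RULES}\n\nGAME OUTPUT:\n{game_output.strip()}\n\nYour next command:"

def extract_command(reply: str) -> Optional[str]:
    """Return the first line that looks like a command (e.g. NOD SPRITE), else None."""
    for line in reply.splitlines():
        candidate = line.strip().strip('"`')
        if re.fullmatch(r"[A-Z][A-Z0-9# ]*", candidate):
            return candidate
    return None
```

If `extract_command` returns `None`, the chatbot answered with story instead of a command, and the same game output can be sent again rather than typing something on its behalf.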

This is too much writing, I won't do it again if no one cares, so let me know what you think!

**gilchristr** · 12-11-2022, 12:41 AM

Maybe this can help that one guy who has been trying to buy an in-game wife for the last few years

**Lavastene** · 12-11-2022, 10:49 AM

“...your scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should.”

But yeah, interesting application of ChatGPT.

**stugatz** · 12-11-2022, 07:33 PM

Here is the second, mostly unedited log of the second session trying to get ChatGPT to "Play Gemstone IV"

https://pastebin.com/f2gPxegz

Notes and commentary. First, Pastebin issues: apparently the log was way too long when I tried to include the entirety of the log from the first session pasted verbatim to the chatbot. Interestingly, the chatbot had a big problem with this as well. It said it "read" the entire thing, but then when I asked it about anything more than ~50 lines or whatever in, it swore up and down that I never told it about that and that it did not exist in the log. It also definitively told me what it considered the "end" of the log, which was clearly not the end of the log data I sent to it. Refer back to the game session log #1 pastebin in my first post; that is what I sent to the chatbot in its entirety and it just plain couldn't/wouldn't deal with it. Obviously (I think, to me) this is an artificial restriction imposed on the beta public release, and an official OpenAI researcher/programmer could almost certainly bypass this and get a different result. I could have probably worked around this to some extent as well by pasting the original log in smaller segments, but I think I also ended up reaching some kind of buffer where the chatbot started forgetting things I had told it earlier, which I had never found to be the case before as long as it was the same "session" (yes, this is all in the same ChatGPT session).
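For the "paste the log in smaller segments" idea, a minimal Python sketch could look like the following. The 2000-character budget is a pure guess, since the real cutoff is unknown, and `chunk_log` / `label_chunks` are hypothetical helper names. The log is split on line boundaries so no game line is cut in half, and each part is labeled so the chatbot can be asked which part it last saw.

```python
from typing import List

def chunk_log(log_text: str, max_chars: int = 2000) -> List[str]:
    """Split a log into chunks of whole lines, each at most max_chars long.

    max_chars is an assumed budget, not a documented limit. A single line
    longer than max_chars is kept whole in its own chunk.
    """
    chunks, current, size = [], [], 0
    for line in log_text.splitlines():
        # +1 accounts for the newline used when the lines are rejoined
        if current and size + len(line) + 1 > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line) + 1
    if current:
        chunks.append("\n".join(current))
    return chunks

def label_chunks(chunks: List[str]) -> List[str]:
    """Prefix each chunk with its position, e.g. '[log part 2 of 7]'."""
    total = len(chunks)
    return [f"[log part {i} of {total}]\n{c}" for i, c in enumerate(chunks, 1)]
```

Each labeled part would be pasted as its own message. Whether that avoids the forgetting described above is untested.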

Next, it did fairly well "picking up" where it left off on the sprite quest. It saw the sprite, nodded to it and then made some decisions when I fed it the relevant data in a bit more condensed format. I admit, this is a bit of a fudge/poke/prod to jumpstart it; clearly the thing is not designed to play a game like this. Then it kind of seems to whiff on the "injured child" quest, but part of that was just me not trying too hard to do some of the more complicated things (go somewhere, forage herbs; or go somewhere, buy herbs). Then when the kid died, we ran into some "timeout" issues because the chatbot is pretty slow. If there were a way to run this "locally" or tie it directly to the game output, I think you could end up getting some much better (or at least different) results here. Most of the times it did make a decision, the game had already moved the quest forward and invalidated its response. But some of my favorite bits were when I said "while you were thinking, game sent XYZ, does this change your response" and saw that it usually did reconsider in a pretty interesting way.

Then we finally moved on to hunting. First, it failed pretty miserably at what we GS'ers know about managing two open hands and containers/inventory. It KIND of tried, but it clearly doesn't truly have the framework for managing this kind of an inventory system, so I fudged it and we moved on to actual hunting. I'm going a bit off memory here, but I think it did pretty decent grasping the concept of combat. It clearly suffered from delay and processing issues, and I admit I had to just bite the bullet and kind of "showed" it what combat was supposed to be, but I think in general it caught on and was generating responses that indicate to ME that if it were "real time processing" the game data and altered to "learn" from the actual responses, I truly think it could have figured out the sequence, if not the timing.

Then we finish the sprite quest and move to "open world". It seemed to realize the variety of options available so that's something. I tried to ask it specifically what to do, but it kept telling me everything it knew was possible, plus some other stuff it made up. So I just went with its "first" aka "highest" response and we decided to hunt rats in the well.

It couldn't find the well, but I coaxed it into remembering what the sprite said and it again kind of told me conceptually what it wanted to do (go to the Thirsty Penguin Inn and then go south and west) even though it failed to generate the specific sequential commands to execute that. So we fudged some of the more complicated sequences and resumed at the well.

In the well hunting rats was where I felt things went really well, and really poorly. Bullet points:

- Mapping:
ChatGPT clearly started to grasp on some level the mapping of the game world based on Lich Room numbers. Sometimes there were very clear sequences of it saying I want to go down, down, down, and move from room 1 to 2 to 3. Sometimes it tried to move in invalid directions, and move to unconnected rooms so that's bad.
- Who is playing the game:
I tried to spend a lot of time convincing ChatGPT that it was the one playing the game and sending me commands, then I would send the responses from the game world. Again, I feel that a better questioner, or a researcher with access to "source code" could convince it to do this right, but here we are. I kept getting the impression that ChatGPT was trying to "invent" the game and play DM with the input I was sending it and generate new content so that _I_ could play the game that it was controlling? This to me was the biggest issue with the entire experiment. I just couldn't consistently get the chatbot to act like it was the one playing the game. I get that it is completely outside its stated "purpose" but my other experiences with ChatGPT have shown it to be pretty flexible.
- Choices:
On the flip side of the coin however, I also sensed it "grasping" the game world quite well, it wanted to see rats, and it either sent me "wishful" commands that included rats that weren't there, or attacked rats that weren't there, clearly not "staying on the page" with me sending it responses of what the game was actually indicating was the situation the player character was currently in. Very tricky ground IMHO, yes it got some things, but was it playing? no.
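One way to catch the "invalid direction" and "attacking rats that aren't there" cases before they reach the game would be to check each suggested command against what the current room actually contains. The Python below is only a sketch: the `Room` structure, the room numbers, and the `check_command` helper are invented for illustration and are not the real Lich map DB format.

```python
from dataclasses import dataclass, field

# Movement verbs this sketch knows about (an assumed, incomplete list).
DIRECTIONS = {
    "north", "south", "east", "west", "northeast", "northwest",
    "southeast", "southwest", "up", "down", "out",
}

@dataclass
class Room:
    room_id: int                                   # hypothetical room number
    exits: dict = field(default_factory=dict)      # direction -> destination room id
    creatures: set = field(default_factory=set)    # creature nouns currently visible

def check_command(command: str, room: Room) -> str:
    """Return 'ok' or a short reason the command cannot work in this room."""
    words = command.lower().split()
    if not words:
        return "empty command"
    verb, args = words[0], words[1:]
    if verb in DIRECTIONS:
        if verb in room.exits:
            return "ok"
        return f"no exit '{verb}' from room {room.room_id}"
    if verb == "attack":
        if args and args[-1] in room.creatures:
            return "ok"
        return "no such target here"
    return "ok"  # anything else is passed through unchecked
```

For example, `check_command("ATTACK RAT", Room(2, {"up": 1}))` returns `"no such target here"`, and that text could be sent back to the chatbot instead of a made-up game response.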

Overall, I have to put these results at "low-to-middling" at best. I see a lot of potential. But in its current state (with ME running middleman), I feel pretty confident saying that ChatGPT can NOT play Gemstone IV. I think that either a) a more competent questioner or b) an in-house researcher who can more clearly design the experience and learning environment would be required for a scenario in which this ChatGPT program might conceivably "play" this game. I don't think the hurdles are HUGE though. I think that compared to ancient WizardFE "cmd" scripts, even high end Lich scripts, this has even more potential to truly be able to interpret the game world and "play" in a more or less undetectable, non-bot-like manner. Hypothesizing heavily here, but I think that if you could narrowly define the scope and goals, a self-contained "ChatGPT Script" could run some automated tasks and handle interaction, script checks, unexpected results, and exceptions better than any script that I'm familiar with. The examples I am thinking of are things like picking, hunting, forging, healing, fletching, cobbling.

@gilchristr: Yeah, if all you wanted to do was set up a chatbot in game, I think that based on my experiences so far this could handle that with minimal API/coding, even in its current state. Hell, just go make a free account and start hitting on it, it'll agree to marry you soon enough. Probably faster than you would think! If anything, it would need to be made more conservative, or at least learn to assign appropriate relative value to goods and services. Additionally, I don't think there is much if any coding in place for it to grasp the concept of a social environment. I think if you dumped it into Gemstone today, it would agree to marry Character X, Character Y, and Character Z all at the same time without any recognition that those three characters might not like that, or why they were mad with each other, or mad with IT. It may take a COMPLETELY different chatbot to handle something like that. Honestly, I think that is the thing it is least likely to be able to handle in its current state: a relational framework involving multiple different entities, and somehow weighting different histories, "feelings", favors, type of information (social stuff). That sounds like an entire new generation of chatbot to be honest. So I guess my answer to your question is both yes and no. If the guy just wants his own personal "dhu kitten with benefits" that seems plausible, but a Bot-Character who loves him and only him but can spurn advances from other-comers? Not as likely.

I'm not entirely sure that I'm going to try this experiment again; the results seem pretty definitive to me that it won't really "pick up" the game, remember, learn, and explore on its own in its current state. It clearly needs some additional memory (reading previous logs, persistent variables to track hand/inventory status, long term goals, character stat/skill tracking (we didn't even scratch the surface here but I have close to zero hope current ChatGPT would do well), and as I kind of mentioned in response to gilchristr, social skills) in order to succeed and realistically play Gemstone IV as well as a human player. But the things that it does have going for it are HUGELY advanced compared to what I was expecting. When I sent it carefully crafted prompts of game data and clear expectations of the command/response expected from it, I think it did surprisingly well on a case by case basis. This shows me that it has the BIGGEST MAJOR skill necessary to play this game, which is that it can read the game logically, and "understand" it in a way that I won't go as far as to call "comprehension" but yeah, something very very close to that.
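The "persistent variables" part of that wish list is the easiest piece to sketch outside the chatbot. The Python below is an illustration only: the `CharacterState` fields, file format, and helper names are all hypothetical. The state is kept in a small JSON file by the middleman and pasted as a preamble at the start of each new session, so details like NOD SPRITE do not depend on the chatbot remembering them.

```python
import json
from dataclasses import dataclass, field, asdict
from pathlib import Path

@dataclass
class CharacterState:
    right_hand: str = "empty"
    left_hand: str = "empty"
    inventory: list = field(default_factory=list)  # items stowed in containers
    goals: list = field(default_factory=list)      # long term goals, in order
    notes: list = field(default_factory=list)      # hard fought syntax lessons

def save_state(state: CharacterState, path: str) -> None:
    """Write the state to a JSON file so it survives between sessions."""
    Path(path).write_text(json.dumps(asdict(state), indent=2))

def load_state(path: str) -> CharacterState:
    """Load saved state, or start fresh if no file exists yet."""
    p = Path(path)
    if not p.exists():
        return CharacterState()
    return CharacterState(**json.loads(p.read_text()))

def state_preamble(state: CharacterState) -> str:
    """Render the state as plain text to paste at the top of a new session."""
    lines = [
        f"Right hand: {state.right_hand}",
        f"Left hand: {state.left_hand}",
        "Inventory: " + (", ".join(state.inventory) or "nothing"),
        "Goals: " + ("; ".join(state.goals) or "none"),
        "Things you already learned: " + ("; ".join(state.notes) or "none"),
    ]
    return "\n".join(lines)
```

This only covers hands, inventory, goals, and notes. Stat/skill tracking and anything social would need a lot more than a flat file.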

While I can't call it a success because it did not just learn everything about Gemstone and start playing on its own, I will say that I had fun trying, and am pleasantly surprised that it did as well as it did, or seemed to do. I would really love to hear from you all what you think about this, or see someone @dreaven @tgo01 @tillmen @spiffyjr @rinualdo (not you, keep fixing Lich) else give it a shot!

**Frebble** · 12-23-2022, 07:46 PM

I have been approached by website developers who want to use ChatGPT and AI in tandem. That's not a business model, unless you're adding something else. Like excellent customer service, business acumen, etc.

Machines cannot replace how humans adapt to change or approach other humans. You always need someone with a wrench who knows where to point it.

**gilchristr** · 12-29-2022, 01:03 AM

The true test for sentience is when the chatGPT starts cybering other players.

Right now, at best, its just killing rats, going to the furrier, and making bank deposits. That's rote, newbie shit. Let me know when it rips off another player in a transaction, cybers somebody, reports people for policy violations, stuff like that

**Rjex** · 12-29-2022, 03:30 AM

> Originally Posted by gilchristr
>
> The true test for sentience is when the chatGPT starts cybering other players.
>
> Right now, at best, its just killing rats, going to the furrier, and making bank deposits. That's rote, newbie shit. Let me know when it rips off another player in a transaction, cybers somebody, reports people for policy violations, stuff like that

I think the real turing test is when it starts to pvp random people and then declares itself the best pvper on lnet, while hiding at a table and talking shit on lnet all day long.

**gilchristr** · 12-29-2022, 02:04 PM

Yep that counts too

**beldar17** · 12-30-2022, 04:16 PM

So good ryjex

**Rjex** · 04-09-2023, 02:23 PM

@stugatz

Almost looks like they refer to GS here at the beginning:

https://www.youtube.com/watch?v=wHiOKDlA8Ac
