Why Your AI Agent Suddenly Gets Stupid And How to Fix It
Transcript
Welcome back, guys, [laughter] to Boxmining AI. This video is really about getting OpenClaw to work really well for you. I think this is almost going to be a weekly session for us, because over the weekend we ask our claws to do a bunch of tasks. Sometimes they work, sometimes they don't. But at the end of the day, we want to share with you guys what really worked well, so we can get that to improve your life. Yes, certainly, every week we have seen improvements on the big problems that we've run across. Yes. And we just want to share with you guys. >> Yeah. So this will be more of a show-and-tell video for you guys about how these things are going. I'm not sure if you guys know, but we actually set up our claws a little bit differently. Ron's got Jeff here, and Jeff is running on the MiniMax 2.5 model. This is a more affordable model, a 95% discount from Opus 4.6. And me, I'm the lucky one with Stark here, and Stark has Opus 4.6 running on him. Yeah. >> So yet again we want to do a comparison and show you guys some tricks to get things working. Today I want to focus on one topic, which is the dumb zone. >> Okay. Yeah. I think this is an interesting topic, because, I mean, you raised this, because our agents can enter the dumb zone. Mm. >> This is a zone where it thinks, and we think, that it's doing well, >> and it can suddenly drop IQ. >> Right. This is like working with, I guess, me. [laughter] My old boss complained that sometimes I'm hyper-intelligent and sometimes I become super dumb. [laughter] It's the same person, but why is this guy so dumb? [laughter] And this will happen to your AI as well, especially when the context fills up. We have a very important video on that. We made a bunch of videos, but we wanted to really highlight this and get you guys on board.
But there are zones, after the context fills up, where your AI just becomes stupid. >> Yeah. >> It doesn't use the tools it should use. It forgets everything. It tells you to walk to a car wash. >> Yeah. [laughter] >> That's when you know you're in a dumb zone. So how do we avoid that? Let's look at some real case examples. So, Ron, how was your weekend? What did your bot do during your weekend? I saw you doing some news reports and stuff. >> Oh yeah. So for Jeff, I literally scrubbed everything in the soul.md file. I learned that you really want to limit your soul.md to 15 to 30 lines maximum. If you already have around 100 lines in your soul.md file, it's not a good look. >> Okay. Okay. Or, before that: I guess the end result is that you got good results. >> Yeah. [laughter] >> All right. Okay. I think you're jumping into the meat. The TL;DR is that you're getting Jeff to deliver you some daily news updates. We'll start off with that: daily news updates. >> And he wasn't delivering them well before. >> It didn't do well before. It just randomly fetched six news items that had no coherence, and I like coherence when I read news. I want to know, what does this news have to do with that news? Are they connected at all? >> Right. Right. So these are agents meant to save time and read the news for us, but it was giving you slop before. >> It was giving you slop. [laughter] >> Like literal slop. And yet again, we don't like slop. None of us like slop. Okay. [laughter] And one of the ways we talked about reducing slop was to change its soul, right? To change its soul so it's very streamlined. >> And then I just asked your Jeff bot, you know, that's it. >> That's it. Literally, if you look at my MD file, it's only like 28 lines.
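The line budget Ron describes for soul.md (15 to 30 lines, with around 100 being too many) could be checked with a few lines of Python. This is only a minimal sketch and not part of OpenClaw: the function names `count_meaningful_lines` and `check_soul_budget` are hypothetical, and counting only non-blank lines is an assumption about what should count toward the limit.

```python
def count_meaningful_lines(text: str) -> int:
    """Count non-blank lines in the given text."""
    return sum(1 for line in text.splitlines() if line.strip())


def check_soul_budget(text: str, max_lines: int = 30) -> tuple[bool, int]:
    """Return (within_budget, line_count) for a persona file's contents.

    max_lines defaults to 30, the upper bound mentioned in the video.
    """
    count = count_meaningful_lines(text)
    return count <= max_lines, count


# Illustrative input only; a real check would read the file's contents instead.
example_soul = "You are Jeff.\n\nDeliver a coherent daily news briefing.\n"
within_budget, line_count = check_soul_budget(example_soul)
print(f"{line_count} non-blank lines, within budget: {within_budget}")
```

For a real file, the text could be loaded with `pathlib.Path("soul.md").read_text()` before calling `check_soul_budget`; the path is whatever your own setup uses.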
>> That's it. >> That's all you have to do. >> The default soul.md file that they gave us is actually longer than this, but it's the same idea. So literally it's just condensing, having as few characters as possible, and that's the default, right? >> You've come far, my friend. You've come very far to get your bot to work, >> because some people were saying that the more context the AI has, the better, which you'd think should be good, right? You thought it should be good, because you were telling him your life story. You started off, and I said, "Hey, Ron, that's stupid." And then, I know, [laughter] >> but you continued, one hour later, with your life story, where you live, >> because it was fun, right? >> It was fun chatting with your bot, right? Someone finally listened to you. >> Yeah. [laughter] >> Sorry, I'm going to ask. But now you've changed your philosophy to just >> the bare minimum of what your soul is. So you cut everything you don't need, and now Jeff is performing better. Would you say better or worse? >> Much better, much more focused. I just want to start small. Let's see if you can give me the news and the research I need. That's it. Right. And then if you actually nail that part, then we can expand more. >> Does Jeff have skills? >> Jeff only has news-related skills. So, web search. >> I'm digging through your bot. >> Yeah. >> I'm curious. >> And whenever you ask these types of questions, I would like it to be as consistent as possible every single day, for sanity checks. >> Right. Right. So I guess you evolve your bot, and this is something that we're working on on the channel. It's not just about one person presenting, but about getting people to really complete the daily research they need. Oh, you still have a lot in there.
You have Discord player. [laughter] You have Spotify. >> Wait, what the... I have Spotify? >> You have Spotify. You have weather. So something that I was watching recently is that >> for cheaper models especially, you only really want them to have seven to 10 skills. That's kind of the sweet spot, because as they use more and more skills, they're going to get dumber and dumber. So part of avoiding the dumb zone, I think you did the number one thing here, which is to keep your soul very, very tight. >> Yeah. Very, very tight. I can see literally just, what, six lines [laughter] >> in your soul, right? And that's giving you better results. And then your tools. I think you weren't aware of this, but you actually have a lot of tools here. If you don't need, you know, GitHub, why would it have it there? >> Because these are the recommended skill installs when you first onboard. >> Right. Right. >> Those count as well. >> Those count as well. So this is something that I realized too, where >> a lot of these tools or skills that you added in the installation process, >> right, and I showed you the video about this, they have a lot of JSON communication. So say, for example, it's your Spotify, and Jeff is trying to make a music list for you. In the process of making that music list for you, unfortunately, it's going to encounter a lot of song titles and lyrics, and you're filling up your context with a bunch of songs. As much as I want [laughter] it to listen to Katy Perry all day, I don't want him to be referencing that when he's working. >> I see. >> Right. So I think that's the next step that I will do here: just see what you need and don't need. Yeah. Right.
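The cleanup described here, dropping skills the agent does not need and staying within roughly seven to 10, can be sketched as a simple audit. This is an illustrative sketch under stated assumptions, not an OpenClaw feature: the function `audit_skills` is hypothetical, the skill names are just the ones mentioned in the conversation, and the ceiling of 10 is the speakers' rule of thumb rather than a documented limit.

```python
def audit_skills(installed: list[str], needed: list[str], max_skills: int = 10) -> dict:
    """Split installed skills into keep/drop lists and flag an oversized keep list."""
    needed_set = set(needed)
    keep = [skill for skill in installed if skill in needed_set]
    drop = [skill for skill in installed if skill not in needed_set]
    return {"keep": keep, "drop": drop, "over_budget": len(keep) > max_skills}


# Illustrative lists only; a real audit would use your agent's actual skill list.
installed_skills = ["web_search", "discord_player", "spotify", "weather", "github"]
needed_for_news = ["web_search"]

report = audit_skills(installed_skills, needed_for_news)
print("keep:", report["keep"])
print("drop:", report["drop"])
```

The point of the design is that the list of needed skills is written down per task, so anything installed by default during onboarding shows up in the drop list instead of quietly adding tool output to the context.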
And this is something that we've been focusing on across the company, where we have agents. I mean, we give them funny names or whatnot, but at the end of the day we want agents that are hyper-specialized in one task and do that task well, rather than general-purpose agents. I know it sounds very tempting to get a general-purpose agent and have it manage everything. We'll get on to that. But the way that we're structuring things now is that we have hyper-specialized agents. This guy can produce really good reports. >> Maybe not so good at, you know, the weather, right? But he's got really good reports. I mean, that's all that matters, right? So I think that's my next step for Jeff here, especially because Jeff is working on MiniMax 2.5. Yep. With MiniMax, we had a comment from Note on our channel, and he's absolutely right in that once you load more than 40% >> of the context for MiniMax, he starts becoming really stupid. >> That's in the dumb zone. >> That's in the dumb zone. So it's mini. It's not max at all. He ain't maxing after a while. So I think that's the number one priority for you here. We'll take a little look at Stark as well. Stark is a little bit harder to look at without revealing everything that I'm doing, but Stark had a few communication problems over the weekend. I have to say, that's very specific to me and the way I set him up: I didn't have enough backup LLMs for him to use. But he did provide a daily briefing, and I do want to highlight this daily briefing, which is pretty cool. He did a market snapshot of crypto, obviously, and then of course the US, and I think that makes sense, right? Even though it was crypto-focused and AI-focused, yet again world politics seems to be interfering with everything that we're doing.
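The 40% figure quoted from the channel comment suggests a simple check on how full the context window is before giving an agent more work. The sketch below is hypothetical: the function names, the 200,000-token window, and the token counts are illustrative assumptions, and the 0.40 threshold is the commenter's observation about MiniMax, not a documented property of any model.

```python
def context_usage(used_tokens: int, window_tokens: int) -> float:
    """Return the fraction of the context window currently in use."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens


def in_dumb_zone(used_tokens: int, window_tokens: int, threshold: float = 0.40) -> bool:
    """True once usage exceeds the threshold (0.40 is the figure from the comment)."""
    return context_usage(used_tokens, window_tokens) > threshold


# Illustrative numbers only: 90,000 tokens used out of an assumed 200,000-token window.
if in_dumb_zone(90_000, 200_000):
    print("Context is past the threshold; consider starting a fresh session.")
```

How the used-token count is obtained depends on your setup; the check itself only needs the two numbers.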
So yet again, he put that first, which I do agree with. He did surprise me. I thought we were talking about crypto, but the bot made his own decision and said, you know what? Yeah. >> Initiative. Yeah. >> I like the initiative. All right. Then, shadow flows. I think it was another topic. >> Then he actually went on to [laughter] talk about Apple, which, >> yet again, I didn't ask for. >> Yeah. [laughter] >> But I thought it was interesting. >> Yeah. So the other video I watched this weekend was about intent, right? Because Stark knows the intent of what we're trying to cover, and he realized, okay, look, apparently selling Apple products was a top intent. But I am interested in this, okay? I'm not going to roast him for this. I feel like MWC was an intent, and at least he was well-rounded. It didn't seem like an Apple ad. Okay, at least he talked about MWC 2026. [laughter] So that was interesting to me, at least, because I play around with phones. I experiment with phones a lot. So this was kind of cool. AI roundup. >> But that's really what I like from news. Ideally, what I would want from Jeff is to surprise me with news that I'm not aware of. Right. That's really >> a key >> indicator that it's doing well. Right. I'm completing your sentences. [laughter] Scary. But yeah, I think the surprise-me tactic worked well. Stark didn't really go and generate the art for this with his sub-agents, which was surprising to me. I kind of instructed Stark to be very focused on producing these presentations, and I think that's something that I will refine. I do find this very useful. It was interesting in a useful sort of way.
He presented information that he thought I would be interested in, and he gathered a lot of sources. I think this is quite core, right? Because we set him up with the ability to search on X and the ability to browse web pages. So he used all of these to produce this presentation, which I find useful. >> That being said, it looks like crap. So, you know, >> I think it's fine. >> It would have been really nice. >> I said, go use Remotion. I told him, go use Remotion. Go make animations, you know, go forth and prosper. He didn't go forth and prosper. He stuck with the safe stuff. >> So I think that's okay. But I'll remind him kindly: hey, look, use Remotion. The next phase is that I really want to look at the presentation skills, because I told him to write this as a skill, >> and I want to make sure that's refined as well. I think this is kind of cool: what to watch this week. So he's starting to understand that he needs news, and yet again, I feel like this was not in the slides last week. >> He's taking his own initiative to say, hey, here's what to watch out for this week. So I actually found his initiative was quite good. >> Yes. >> So anyway, I think that kind of summarizes our experiences running this. I don't want the video to run too long, >> but we will have updates as well. If you guys have any topics you want us to cover on how to make OpenClaw better, tell us. We covered the core topics. In terms of the core videos I want to highlight, sub-agents and cron jobs are the two latest videos that are out, and I find these are two long-standing topics you really want to look at. Who's this, Ron, by the way? Did you change the girl? >> That's me. >> Sure. Sure. Okay. Cool.
But anyway, with that, guys, I think that summarizes how we're using OpenClaw and how we can get it a little bit better. To summarize, there are two key aspects. One, which you did very successfully, is to reduce his soul. >> Yes. >> Right. His soul doesn't need to care about, you know, when you wash your pants or something like that. His soul is very dedicated to the task at hand. Yeah. Which is to produce a presentation for you. >> Yes. >> On my side, I think >> refining that skill is important. I think the results are great. The surprise that I got was great. But I would say, hey, look, if he shows me his steps, then we can fine-tune the presentation, and that will be good. >> Definitely. That's the TL;DR. But the whole idea, and I think we'll cover this in a future video, is about this dumb zone: you don't want your AIs to be in that dumb zone. We want to make sure that whenever they're doing a task, they're doing it smartly. So with that, guys, thank you so much for watching this video. Peace out. Peace.