ProofKit MCP
Todd Geist:Hey, everybody. Welcome back to another episode of The Context Podcast. Ernest and I are back again to talk more about AI, which is, you know, kind of the thing that's happening everywhere. It continues to happen.
Ernest Koe:What are we calling this? This Week in AI?
Todd Geist:Oh, yeah. This Week in AI, this little segment. Okay. It's kind of a thing because, like, last week a bunch of things happened again, major things that are sort of shaping the foundations of the economy to come, and this idea of agent-first thinking, or however you hear people talking about it. But the idea is that the agent is this new unit of, what would you call it?
Todd Geist:It's not a unit of work. It's a unit of intelligence, maybe, and it can do things for you and for your business, and it's coming together. It's early days, but it's coming together. Every week we're seeing new announcements from the big players and the open source players, and you can begin to see the shape of this thing, this new economy, this new ecosystem that is being built sort of in real time. It almost feels like real time because things are happening so fast. So anything from the last week that stands out to you, Ernest?
Ernest Koe:There's been a lot of new drops from Anthropic that are quite interesting. Every time Anthropic releases something, I go, god, I wish I'd just waited another week or two before I built something. I've heard this called the bitter lesson.
Todd Geist:We're going to talk about this podcast in more detail in a bit. But one of the podcasts I listened to last week that I thought was really great was from Anthropic. The team lead for Cowork was on the Latent Space podcast, which is a really good podcast to listen to on this whole topic. And he talked about what he called the bitter lesson, which was: you spend a lot of time working to shim and constrain the models, you know, to get the models to do what you want. And then with the next release of the models, they just do it.
Todd Geist:They don't need all that work. And he called that the bitter lesson, which I think is kind of what you're expressing there.
Ernest Koe:Yeah. One thing that I'm noticing, and maybe you feel this too, and I think about Claude a lot in this context: in the early days, it wasn't really clear what Claude was. Right? I mean, it's not a model, obviously, because you had Sonnet and Haiku and all those things.
Ernest Koe:And was it an app? What did it do besides being a chat thing? Was it more than basically an interface to an LLM that Anthropic provided? But it's very clear now that they have been iterating on a core idea that Claude is an agent, and they have an opinion of what that agent should feel like and look like. And the whole value of Claude is in its harness and in the way it expresses its agentic capabilities.
Ernest Koe:You know? So before, there was this question of, well, what is the moat? LLMs are basically free. You can get Kimi K2.5 or GLM or whatever that's on par with Opus, or even with Codex these days, with the latest models from OpenAI. And what do these companies have?
Ernest Koe:Right? And it seems to me that agents are the thing. I mean, it used to be just an idea, but we're getting a real sense of what having an agent actually means in practice. Right?
Todd Geist:Yeah. And I think we touched on this last week. We talked about what's happening now and what's really enabling this next round of really amazing things is the big providers are training, post training their models on their harness, right? So Claude basically retrained itself working on Claude. And so that agentic loop and the agentic harness gets much better for that model.
Todd Geist:And we're also seeing this with GPT and GPT Codex and GPT-5.4, I guess it is.
Ernest Koe:5.4, yeah.
Todd Geist:So the models are probably going to work best, at least the frontier models, with their own harness. That seems to be the case. Now, that doesn't mean there's not a lot of orchestration that can be done around that, but if you're just using the API, for example, to make calls to Sonnet or Opus 4.6 and you're doing your own agentic loop in your own code, this is the bitter lesson. Claude is going to be better at it than you, probably.
Ernest Koe:Claude's going to be better. Claude's gonna be better.
Todd Geist:So you just have to know that. It doesn't mean you can't do loops or can't build things around it, but you have to understand at what layer you can add value and at what layer you're actually maybe getting in the way, you know?
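For readers who have not written one, the "agentic loop in your own code" mentioned above can be sketched roughly as follows. This is a minimal illustration, not how Claude or any vendor harness works internally: `ModelReply`, `run_agent_loop`, `stub_model`, and the `add_one` tool are all hypothetical names, and the stub stands in for a real model call so the sketch runs without any API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ModelReply:
    """One turn from the (hypothetical) model."""
    tool: Optional[str] = None   # tool to call next, or None when finished
    argument: str = ""           # argument passed to that tool
    answer: str = ""             # final answer, used when tool is None


def run_agent_loop(
    model: Callable[[List[str]], ModelReply],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 5,
) -> str:
    """Ask the model what to do, run the tool it picks, feed the result back."""
    history = [f"task: {task}"]
    for _ in range(max_steps):
        reply = model(history)
        if reply.tool is None:
            return reply.answer
        result = tools[reply.tool](reply.argument)
        history.append(f"{reply.tool}({reply.argument}) -> {result}")
    return "stopped: step limit reached"


def stub_model(history: List[str]) -> ModelReply:
    """Stand-in for a real LLM call: use one tool, then report its result."""
    if len(history) == 1:
        return ModelReply(tool="add_one", argument="41")
    return ModelReply(answer=history[-1].split(" -> ")[1])


# Hypothetical tool registry with a single toy tool.
TOOLS = {"add_one": lambda text: str(int(text) + 1)}

print(run_agent_loop(stub_model, TOOLS, "increment 41"))
```

Everything a vendor harness adds on top of this (context management, permissions, retries, post-training against the harness itself) is the layer the speakers suggest is hard to beat in your own code.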
Ernest Koe:I think you're still going to be building loops. I think this is the other lesson: Claude is a pretty good general agent, and it's a pretty amazing coding and development agent. Right? But I think there are all kinds of business use cases for agents that are quite specific and quite opinionated about what they do. And we can talk about some of the things that we've been working on.
Todd Geist:Yeah. Yeah.
Ernest Koe:And those have their own little mini harnesses, maybe not to the extent that Claude has, and not to the extent that Claude has been RL-trained on, so to speak. But I think they're still very relevant, and that may be where a lot of the work for business systems goes. Not training or applying frontier models the way they're doing at Anthropic to build Claude, but working at a much smaller scale to build smaller, purposeful agents that do things in their own loops with their own harnesses and so forth.
Todd Geist:Right. Yeah. I think that's an important point. One of the advantages that the software industry has is that software development is the killer use case for AI. And so these models are getting really good at coding because of all this post-training and tool calling and the things that they're able to do.
Todd Geist:And validation is the other key thing: code can be validated. Other things can't be. Like, is this a good English paper?
Todd Geist:Well, it's somewhat subjective. But what also seems interesting is that that doesn't seem to completely relate to general purpose things. Like, Claude is still pretty good at computer use and Excel and a bunch of other things, but there's gonna be lots of tasks that it's not gonna be good at, or where the advantage for you in building your own agent and your own loop with your own data and your own post-training on it will be much higher. And so you will want to do that. But yeah, when it comes to just having a general purpose agent right now, Claude Cowork is pretty good.
Todd Geist:So we should explain. I think we've covered this before, but just to reset: the Claude desktop app has three modes. You have a chat mode, which is the one that's been around forever. You have Code, which is the sort of low-level, developer-centered UI and coding experience.
Todd Geist:And then you have Cowork, which is kind of in between. And they differ in terms of their capabilities and the security wrappers around them. Claude Chat can really only produce a single file at a time, and it's not gonna be able to build a code base for you. Cowork can build a code base for you, but it's running in its own sandbox, and that might be problematic depending on your use case. Claude Code can do anything, basically.
Todd Geist:So I did this last week when Cowork shipped Projects, which is one of the big drops they made last week: Cowork now has projects. And basically, projects give you a memory layer for your agent so you don't have to build memory. It kind of does it on its own. It also has scheduled tasks, which actually came before that, and it has Dispatch, which allows you to use your Claude mobile app to communicate with it.
Ernest Koe:Right.
Todd Geist:So what I built out really quickly, on top of the really cool project you built for recording screens, completely automates my time tracking at this point. In the middle of the night, a scheduled task kicks off in my Claude Cowork that looks at all the activity from the previous day: it looks at what screens I was working on, at activity from Slack and from our various meetings, and at transcripts, and it generates a draft timesheet for me. I come in in the morning, and the first thing I do is open up Claude, review that, and just say, "Submit my timesheet," and it's done. This is huge. Completely automated time tracking is a massive win for us as a business.
Todd Geist:And I think for any business where time tracking is important, being able to do that is really great. And it's done, basically, which is amazing. So I'm still just so digging your project, Tesseract, that lets us safely and securely and privately record our screens.
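As a rough picture of the roll-up step in that nightly job, the sketch below totals one day's activity records into a draft timesheet. It is only an illustration under stated assumptions: in the workflow described, the agent does this itself from screen captures, Slack, meetings, and transcripts, while the `Activity` record, its fields, and `draft_timesheet` are hypothetical names invented here. It also assumes the records do not overlap in time, which real data from several sources would not guarantee.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Activity:
    """One observed block of work (hypothetical record shape)."""
    source: str    # where it was observed, e.g. "screen", "slack", "meeting"
    project: str   # project or client the time belongs to
    minutes: int   # duration of the block


def draft_timesheet(activities: Iterable[Activity]) -> Dict[str, float]:
    """Sum minutes per project and report hours, for a human to review."""
    totals: Dict[str, int] = defaultdict(int)
    for activity in activities:
        totals[activity.project] += activity.minutes
    return {project: round(minutes / 60, 2) for project, minutes in sorted(totals.items())}


yesterday = [
    Activity("screen", "R&D", 60),
    Activity("slack", "R&D", 30),
    Activity("meeting", "Client A", 45),
]
print(draft_timesheet(yesterday))
```

The result is deliberately a draft: the review in the morning is where a person confirms it before anything is submitted.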
Ernest Koe:Yeah. Tesseract's pretty cool. We should probably release it.
Todd Geist:I don't know.
Ernest Koe:Yeah. We gotta do something. We should probably do something about it.
Todd Geist:I don't know. We have some ideas here, and we've got some ideas around how to build agents for it and how to build a backend for actually doing billing and things like that. But for our business, integrated with our stack, it just totally works. And I am already seeing a massive benefit in terms of the amount of time that I track. Now, in my role, I don't have to bill much to clients, but I do for some things.
Todd Geist:But mostly, it's just capturing all the time that I'm actually spending on research and development for the various things that I'm doing. And guess what? It's a lot more than I thought, because it's now automatic. It's not that we were missing billable items we could bill our customers for. But when I think I'm doing seven hours and that's it, that's one thing.
Todd Geist:But I'm kind of AI-pilled, so I'm doing this stuff all the time, and I'm spending way more time than that. I usually spend an hour every night now preparing the agents to run while I'm asleep. Right? That's actual time, that's actual cost that the company needs to know about at the very least. Right?
Todd Geist:Like, we need to know that that's how much time it's taking to actually do these things. And all that was just getting lost before.
Ernest Koe:Yeah. I think cost accounting is still extremely important. I think for me, what's interesting about Tesseract is that it's giving me an insight into where my attention goes.
Todd Geist:That's right.
Ernest Koe:And it's very scattered right now because I'm basically bopping between multiple terminal UIs and multiple screens, running three or four sub-agents doing stuff. But Tesseract sees it. Right? So you can see not just the quantity of work, but where your attention is getting split and how things are getting put together. It's actually quite interesting.
Ernest Koe:But, you know, not to focus too much on Tesseract. I do wanna pick up this thread, where I think we're starting to learn about building and engineering software differently for an AI world. With Tesseract, I think it's a good example of this, where initially I thought I needed to do something like Pieces, which is to build a whole screen, voice, and audio capturing pipeline, plus storage, plus the AI part.
Todd Geist:Plus the UI. Summary.
Ernest Koe:Plus the UI. It turns out that something small, lightweight, very purposeful, and very reliable, something you can count on, is still better. Way better than the other modality. And I find that part of being agentic is being able to rely on dependable tools, but to put a whole bunch of things together, either in pipes or as a collection of capabilities and skills that they'll have access to.
Todd Geist:I think Tesseract is a good example, though, of a trend which I keep bumping into. You don't need a lot of UI for Tesseract, right? The agent is going to do what it needs to do to find the data and find the things that it needs to be able to answer the questions. So you get a tremendous amount of value out of just having the underlying capture and storage of all the memories of the day. And then the agent can just explore those memories and make a UI for you.
Todd Geist:I started experimenting a bit with that over the weekend. It was really interesting.
Ernest Koe:I think that's key. I think agents are where your interactions are happening, for me anyway, significantly more than anywhere else today. Like, I spend most of my time talking to either Claude or Mira, which is my own agent, which I always regret writing because Claude keeps, like, you know, shipping features that make it obsolete.
Todd Geist:At some point, you can just point Claude at the code base and say, make me a Cowork version of this.
Ernest Koe:Make me a Cowork version. Exactly. But, you know, that's the thing. We wait.
Ernest Koe:I mean, it's already there, but I'm sure that not so far in the future we're gonna get MCP UI and UI things that can be injected directly into an agent UI, which basically makes the application UI that we were relying on so much before mostly obsolete. Like, I only need Tesseract to show me that it's running, and that's basically it. Right? Everything else is happening elsewhere.
Todd Geist:Yeah. Yeah. No, it's true. One of the things I noticed about my review process is that it would be nice to have a really nice visualization of the time.
Todd Geist:Right? And I don't have that, but it could be added with MCP UI.
Ernest Koe:Right.
Todd Geist:And if we get to it, we can show what that looks like in an example a little bit later. So when I finished building out this Cowork agent to do my time tracking, the question was, how do we share that with the team?
Todd Geist:And so I could just take that folder that had the skills and the little bits of code that it needed to do whatever it needs to do. And I could say, "Here, put this in your Claude and maybe it'll work." But there's actually another way, and I think this is another really important trend: you can actually just have documentation for Claude to build its own version of it, and it works quite well. So instead of giving a bunch of documentation to a human, this is what I did after I was done with my agent. It was working great. I said, okay.
Todd Geist:Here's the problem. I wanna share this functionality with my team. Write a prompt for Claude, for you, that a person can put into their Claude, into their Cowork, and have it build out the same thing. And I said, ask the user some questions about what their sources are and when they want the automated things to run, and then write it yourself. Right?
Todd Geist:It does it great. Like, it has no problems doing it. Right? Instead of delivering a functioning code base, I delivered a prompt that allows the user to go through a process of customizing it for themselves by picking, do I have Slack connected? Do I have GitHub connected?
Todd Geist:Whatever the sources are that I'm pulling this data from, it will do that. And then it will rewrite it for them, customized for them, which is really interesting.
Ernest Koe:I'm always surprised by how little agent context you really need to give something like Claude for it to actually get it pretty good on the first shot. Yep. I mean, I think there used to be a way of thinking where you had to write very, very good prompts, and that was your job. Like, the prompts had to be perfect and outline all the details of every single thing that had to happen and all the steps that you need to take. And it mostly bootstraps itself into something pretty viable without a lot of intervention these days.
Todd Geist:Yeah. In fact, one of the other trends is people are saying delete your CLAUDE.md or your AGENTS.md.
Ernest Koe:Well, I second that. I find that leaner CLAUDE.md files are actually better.
Todd Geist:Yeah, they are. Yeah. So people are doing things with skills, which is one way to lean them out. Skills can be a little hit or miss depending on the model that you're using. Anthropic models are really good at it.
Todd Geist:The OpenAI models are pretty good at it. The rest of them are, I think, not as good. So the shim that people are using right now is they basically just reference their skills in the AGENTS.md or the CLAUDE.md, and that's the only thing that's in there.
Todd Geist:It's just some links: if you need to know about this, go here, use that skill. And it does it. So that's interesting. I think it's important to say that this works best when you're using the really intelligent models, right?
Ernest Koe:The thinking reasoning models.
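To make the "just some links" idea concrete, here is a small sketch that renders a lean pointer file from a list of skills. It is an assumption-heavy illustration: the heading text, the skill topics, the file paths, and the `render_pointer_file` helper are all made up for the example, and no particular agent requires this exact layout.

```python
from typing import Dict


def render_pointer_file(skills: Dict[str, str]) -> str:
    """Build a short agent instructions file that only points at skills.

    `skills` maps a topic to the (hypothetical) path of the skill covering it.
    """
    lines = ["# Notes for the agent", ""]
    for topic, path in sorted(skills.items()):
        lines.append(f"- If you need to know about {topic}, use the skill at {path}")
    return "\n".join(lines) + "\n"


# Hypothetical skills for a time tracking project.
example_skills = {
    "timesheet submission": "skills/submit-timesheet/SKILL.md",
    "reading screen captures": "skills/read-captures/SKILL.md",
}
print(render_pointer_file(example_skills))
```

The point of keeping the file this short is the one made above: the detail lives in the skills, so there is less standing text that can drift out of sync with the code.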
Todd Geist:Yeah. And so another thing that's very clear is when you're setting up a project, the setup is really important, because if you get it set up in a way that's inconsistent or weird, that can later confuse the model.
Ernest Koe:Or conflicting.
Todd Geist:Yeah, right. Exactly.
Ernest Koe:Where your CLAUDE.md or your AGENTS.md says one thing, and your skills say something else, and then you change...
Todd Geist:And your code base does something else, and you have a problem. So code is the source of truth, right? And agents are getting better and better at just being able to ingest the code as needed and understand what to do, which means that, in a weird way, architecture and project setup, understanding how to get a project set up correctly early on, is going to lead to much better outcomes down the road. So it turns out we still have jobs, for at least a while.
Ernest Koe:I don't think I'm getting more free time. I don't know about you. It's the opposite that's happening.
Ernest Koe:I'm busier than ever, and I can't find enough time. I've got way more work than I know what to do with, it seems like. I think there's a bug in this thing. Right? And it goes back to something by a guy that you may have read. I think you did read it, actually.
Ernest Koe:Mike shared an article.
Todd Geist:The Last Quiet Thing.
Ernest Koe:The Last Quiet Thing. You want to talk about that a little bit? Because I think it's relevant to the story.
Todd Geist:Really, really important article. We'll have a link in the show notes, but it's great. The Last Quiet Thing by Terry Godier, whom I had not encountered before, but it was a really impressive piece of writing and really nailed, I think, some of the issues we have. Actually, it's not really directly about AI, but it sort of is.
Todd Geist:I think he starts out by talking about how your watch and your phone and everything is constantly notifying you about things that you need to do. And he has a lot of interesting ways to describe this sort of weird stress that this creates. I think we all feel it, right? You know the notification problem is real and there's this need to dive in and take care of some notification. And so we have all these things which I really love.
Todd Geist:I actually love having all of my Apple Health stuff available to me and all the other things that I have that feel very powerful, but they do come at a cost, and that is they're constantly pinging you with updates. And the phrase that stuck out is that he realized that what he got was just a bunch of jobs: every one of these things is now a job that he had to monitor and deal with and own in some way. And those jobs used to belong to people, right? Your Amazon Ring goes off because somebody's at your door, or maybe at your building downstairs. We used to have doormen who did that.
Todd Geist:Frankly, having a doorman would be way better than having something that just buzzes you all the time. Right?
Ernest Koe:I think there's something really important in that article. I spent this weekend mostly doing analog things, things that have been sort of languishing in the background because I've been in AI land for so long, you know, months now. I had to re-crimp and rewire some Ethernet in my house because I needed to get Wi-Fi to the right corners of the house and all that. I helped my in-laws with some plumbing. It was really funny because I was thinking about this, and I was thinking, why does my father-in-law ask me to deal with the TV and the internet and the Wi-Fi and the plumbing things?
Ernest Koe:I mean, he's a very well-respected doctor, you know, a really brilliant guy.
Todd Geist:I know where you're going with this, and it's so right. Yeah.
Ernest Koe:And I realized that it was partly because he didn't want it to be his job. It's not like he couldn't. He just didn't wanna take on one thing that would create all these impositions on his own time and mental space. And I thought, god, that was brilliant.
Ernest Koe:So he gave me the job. Gave me the job. Yeah. Oh, no. I'm happy to help.
Ernest Koe:But, you know, I was re-crimping my Ethernet and this whole thing just hit me like a lightning bolt. Oh, I've been doing it all wrong. It used to be that I knew how to do networking, and I understood how to deal with twisted pair and Ethernet and Wi-Fi, and I was buying my own access points and mounting them around the house. And guess what? I became tech support.
Ernest Koe:So anytime anything happens, it's my job, and it's never-ending.
Todd Geist:That's right.
Ernest Koe:I'm checking for coverage. I'm checking notifications to make sure things are working. You know, if somebody complains, it's like, okay, well, I have to go fix it.
Todd Geist:Anyway, I had the same problem with our sprinkler system. I had to do a whole bunch of stuff to get it back online this weekend. And it's not stuff I really wanted to sign up for. It was not in my plan. It was not in my job description, but it's my job because I'm the one who can do it.
Todd Geist:So I'm the only person in the house who knows enough to make sure all the automated sprinklers are working properly. I mean, the article is great and I recommend everybody read it for a lot of reasons, but what I recognized right away is two ways that it impacts our business and our industry. One is, is everybody going to do their own software? And the answer is no. Because every single piece of software that you're building is a job, and I don't want the job, right?
Todd Geist:I can do it. We could write all the software that runs our entire business, but I actually don't want that to be my job. I want that to be somebody else's job. And I think businesses are going to make that decision. Now, they may choose to build more than they did in the past, and they'll be able to do more than they did in the past, but do you really want somebody's job at your company to be the accounting software?
Todd Geist:No. You just don't.
Ernest Koe:I think this is critically important. I think I'm somewhat reversing my initial anxiousness, maybe that's the right word, about whether the business that we call software development, the custom software industry, is still relevant, given that everybody and anybody can write software. And I'm realizing, because I'm writing a lot of software, that it's not clear that writing more software is actually good or healthy, in that you are basically owning a whole category of responsibilities that you are maybe not resourced for, not gonna be paid for, and have no way of actually supporting in a healthy, long-term way. You know?
Ernest Koe:Yeah. I think that's a calculus we haven't figured out how to deal with yet. But I think you're right. I think that businesses and companies and people are going to want more software and more capability, but it's not clear to me that that means they're gonna do it themselves.
Todd Geist:That's right. I mean, I think some will and some won't and they'll be able to choose differently than they were in the past, right? I think they may choose to build some things, but they're certainly going to choose to have somebody else build other things because it's just, again, I want somebody else to be responsible for this. And I don't think that goes away. Like, why would it?
Todd Geist:Somebody just came to fix my hot tub. It had a broken sensor. It took the guy who knew how to do it fifteen minutes. I probably could have gone on YouTube and found enough videos and figured out how to do it, but it would have taken me an entire day. Again, I don't want that to be my job.
Todd Geist:Somebody else can do it way better than me.
Ernest Koe:Do you still think that SaaS is dead?
Todd Geist:I think it's going to change. I do think there's going to be a lot more software built. My best guess is that we're gonna see the rise of more vertical type things where domain-specific knowledge outweighs general stuff. Salesforce was successful because sales was a relatively well-understood domain. Everybody did it approximately the same way, and they did it well, and they amortized the cost over whatever it is, I don't know, 20% of all businesses in the country or whatever, I have no idea.
Todd Geist:But that came at a cost, a huge operational cost for them, and people have this bloated software that does way more than they need. So I suspect that there will be more Salesforces, and there already are. There's, like, Pipedrive. There's a million of these. I expect there will be more sales tools and there'll be more accounting tools.
Todd Geist:There'll be more of whatever tool is available. I think the prices will come down. I think you may make choices differently. I've been reading the Mars trilogy, which we talked about before, and they have a lot of ideas around co-ops and partnerships. What if one model is that a software development firm partners with a dozen or two dozen companies and builds the software for them, and you join a co-op where everybody pitches in to have one company specialize in the software that all the companies share?
Todd Geist:I think a lot of these different kinds of models will be tried. I don't know which ones will succeed, but it is true that software is cheaper to build. So the implementation costs are lower, but taste and expertise and responsibility are still critically important. They're not going away. So somehow, companies will choose to offload some part of that so that it's not their responsibility.
Ernest Koe:It's hard to say. I mean, I think I share a lot of the same notions. Using sort of constraint-based thinking for a minute, when we've talked about the cost of software, I think we've talked about the upfront cost of actually building the thing: lots of engineers, design time, market research and testing to get it really good, and actual programming time, the code development and all that. But there's also the cost of support and the cost of maintenance and the responsibility of it working reliably. I think those things are still unknown. I'm guessing AI will be a big factor in reducing some of those costs, where small teams can do a lot more support for a lot more people and so forth.
Ernest Koe:But it's still a thing. I don't know if businesses have the wherewithal to actually have their own support team for their own business systems entirely. You know?
Ernest Koe:So I think those are still a little bit challenging. But I will say that, for sure, the cost of experimentation, the cost of getting to some kind of differentiation that you couldn't have before, has dropped dramatically. And I think that's good. I think that gives...
Todd Geist:More competition, more opportunity. That's really good. One of the other things that was really interesting in this Latent Space podcast about Claude Cowork is that they were able to ship Claude Cowork in, like, ten days, but that's really not accurate. They got the idea to ship something, and ten days later it was shipped. But the reason they were able to do it so fast is they have dozens of experiments going on in-house all the time.
Todd Geist:So Anthropic is running all kinds of things. And they have labs whose mandate is basically to do something that's unlikely to ship. Like, something so crazy it's unlikely to ship. So they had, like, a dozen experiments of different features that ended up just getting pulled into Cowork very quickly. So they were already doing all these experiments.
Todd Geist:And so that's, I think, another thing is that even if you spend as much as you did before on building software, the software you're gonna get is gonna be so much better because you can try 10 things and pick the best one. It used to be like you had to decide before you went to implementation, okay, we're going to do this path. And you had to do all kinds of user surveys and questioning and try to take your best guess at what would be the best solution to solve this problem. And then you gave it to the coding teams and they would spend a bunch of money building that. And now you don't.
Todd Geist:You can just say, Let's try these five things because you can try them all. That's what they're doing. They're, like, trying every idea. They don't have to pick before it actually gets, you know, built, and they can test it in front of people.
Ernest Koe:Yeah. I'm really hyped up about this idea because I think, along the way, we forgot what the purpose of design, or I guess design processes or design thinking, was, and it became more about the process than the goal. And, really, the whole point is to fail early and fail often so that you can test the best ideas and get the right ideas. Right? What you don't want to do is waste a lot of time and money and resources and treasure on things that are hard to change afterwards.
Ernest Koe:But if you can ship early and get a whole bunch of ideas out on the table and validate them, and this is also important, you're not validating what people say they want. You're validating their experience with the prototype, the actual thing. You know, that's actual testing. That's not market research validation where you go ask somebody, hey, what do you think about this feature?
Ernest Koe:Everyone's gonna say, yes, I love it. Right? They'll tell you what you want to hear.
Todd Geist:Yeah. Yeah.
Ernest Koe:But when they're actually using something and you can see the usage stats come back, that's a different story. Totally different. And we can do that now, at a scale we couldn't before.
Todd Geist:Yeah. Design was kind of forced to optimize to reduce the risk of going into implementation, to try to have the best possible plan for implementation. And we've all known that was always risky, because once you start writing, the rubber meets the road and everything changes. Right? It's like they say about battle plans: they never survive the first engagement.
Todd Geist:It's the same thing. Your great designs don't survive past the first user engagement. Right? Like, everything changes the minute it gets there. So now we can actually use design thinking without having to stop at that bottleneck and we can just go right through to here's 10 different ideas.
Todd Geist:Try them. Like, let's write them all and see what
Ernest Koe:they can do.
Todd Geist:Yeah. Prototype.
Ernest Koe:I think it's solid. So I wanna come back to this notion that we touched on. I think we almost take it for granted now, but we should spell it out. I think we've been talking about a way of thinking about AI that is different from the way we've been thinking about software development in the past. Right? And I've been calling this, like, the two systems of thinking.
Ernest Koe:One is where you think about using AI to just accelerate the way you work as a programmer. Call it, you know, one way of thinking, one system, one mindset. And I think what you and I have been talking about and really exploring, which is the other, which is we're not writing code anymore, practically speaking. Agents are writing code. And what we're all doing now is working with agents and building harnesses, building the tools, building the scaffolding to allow our agents to actually be doing the work.
Ernest Koe:Right? And that's sort of like the second system's way of thinking about the whole system. Call it agentic system development, agentic something. I don't know. Come up with a better term.
Ernest Koe:Right? But first system, you're a faster developer. Second system, the agents are doing the work. Right? And I think this has really deep implications because it changes the way you think about your tools and the software that you want to work with in general.
Ernest Koe:Right?
Todd Geist:That's right.
Ernest Koe:But in my mind, you know, if if the tools are not compatible with the second way of working agentic development, then they're not really relevant anymore.
Todd Geist:No. That's right. It really changes. It really changes. And it's subtle in the different layers of effects it can have.
Todd Geist:Certainly there are tools, there are SaaS products that we have that I do not go to anymore because they do not fit this model well. They're not designed for agents to work with, they're designed for humans to work with them. And so there's that. You just start choosing tools that your agents are good at, right? And you don't do that because of some, Oh, hype.
Todd Geist:At least I didn't. It's not the hype or it's not the story. It's just reality. Once you're using agents to do your work, you very quickly self select out. You just select out the tools that don't work in that mode.
Todd Geist:So you tend to I prefer Markdown over any other note taking app that there is because the agents are so great at it. They don't need any tools to do Markdown. So we just use Markdown, right? Simple. And I think that's going to play out across the whole ecosystem.
Todd Geist:But then there are these other subtle effects, like who are you writing documentation for, to use the tools that you're building? Do you write it for the humans or do you write it for the agents? Really, you probably write it for both, but if you're not writing it for the agents, you're missing an opportunity. So I'm starting to see this in documentation for products that we use, especially the developer tools, which tend to lead in this area. Developer tools will now have whole sections of the documentation that say, basically, give your agent a link to this page and tell it to go. And the agent reads it and implements the stuff into your code base.
Todd Geist:So it's that kind of thing. And it's like I mentioned before about my time tracking thing. I didn't deliver people working code, I delivered them a prompt that the agent's going to use to actually write the code. Right? It's that kind of thing.
Ernest Koe:Yeah. I think this is quite interesting and and important. I mean, I I wanna kinda tie this into a couple of things that we've been working on. And just quickly, maybe to share out a little bit. So I shipped a repository sentinel, like a security leaks checker system.
Ernest Koe:Yeah.
Todd Geist:It's another good example of that.
Ernest Koe:Right. Yeah. It was actually really interesting how that evolved because I was definitely in system A thinking, which is like, I'm gonna make this thing. I'm gonna use AI to make it better, and I'm gonna see what bugs happen.
Ernest Koe:And then I'm gonna use that to then feed it back. I thought I was in system B thinking. I thought I was actually programming agentically. Turns out I hadn't gone all the way there. Right? And then it dawned upon me that, well, what if it could just fix itself?
Ernest Koe:Right? What if you wrote it so that it was actually taking the results of any bugs that came out in the workflow and the process and fed it back into itself as a GitHub issue and then it could monitor that and then fix itself by just making those agent calls. This is not a new idea. I mean, this sort of self referential thing has been the thing that that, you know, we've been sort of thinking about and and and and seeing out in the world. But I didn't think this was going to work.
Ernest Koe:And I especially didn't think this was gonna work given that I was using a very primitive, like, workflow solution. I was using GitHub Actions. It was like there was no special cron job. I mean, there was no special, like, daemon running. It was just like this stateless serverless thing that just, you know, kicks off every every few hours and just runs a workflow.
Ernest Koe:But it turns out it totally works. You know? I came back. I was like, oh my god. It just found a bug.
Ernest Koe:And, you know, I tied it to Kimi K2.5, and then I switched it to, it doesn't matter. But it was really surprising and cool to watch it, like, fix itself. You know? Like, to do its own PR, to merge the code, and then it's like, okay. My job's done.
Ernest Koe:You know? But I think this is what we're talking about. Right? So when you're building for agents, it's not just about instructing and prompting the agent to then do the work. It's actually creating conditions for the agent to be able to bootstrap itself in a way.
Ernest Koe:I kind of joke that maybe it'll bootstrap itself into consciousness. I don't think we're there yet, but it's not clear to me that it won't do that sometimes.
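The self-fixing loop Ernest describes (a scheduled workflow turns its own failures into GitHub issues that an agent later picks up) could be sketched roughly as below. This is only an illustration under assumptions: the `WorkflowFailure` and `IssuePayload` shapes, the `[sentinel:…]` title tag, and the `agent-fix` label are all hypothetical and are not taken from the actual sentinel repository. The sketch covers only the step of turning a failure into an issue without filing duplicates.

```typescript
// Hypothetical shapes for illustration; not the actual sentinel code.
interface WorkflowFailure {
  step: string;
  message: string;
  runUrl: string;
}

interface IssuePayload {
  title: string;
  body: string;
  labels: string[];
}

// Stable short id for a failure, so the same bug is not filed twice.
function failureFingerprint(failure: WorkflowFailure): string {
  const text = failure.step + ":" + failure.message.split("\n")[0];
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash.toString(16);
}

// Returns the issue to open, or null when an open issue already tracks it.
function failureToIssue(
  failure: WorkflowFailure,
  openIssueTitles: string[],
): IssuePayload | null {
  const tag = "[sentinel:" + failureFingerprint(failure) + "]";
  if (openIssueTitles.some((title) => title.indexOf(tag) !== -1)) {
    return null;
  }
  return {
    title: tag + " " + failure.step + " failed",
    body: [
      "Step: " + failure.step,
      "Run: " + failure.runUrl,
      "",
      failure.message,
    ].join("\n"),
    labels: ["agent-fix"], // hypothetical label an agent workflow could watch
  };
}
```

In a stateless scheduled workflow, the list of open issue titles would be fetched at the start of each run, which is what lets the loop work without a long-running daemon.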
Todd Geist:Yeah. So a couple little other examples of that is when you're writing an API or like an MCP server that needs to work with something and there's an error state, if you're writing for the agent, you make a nice error for the agent and you suggest additional things that it could do and test. And this is the magic sauce because the agents now, they get back an error, they read it and they go, Oh, it's giving me some suggestions here. It's giving me some prompts I can use and it'll just do it. Right?
Todd Geist:So instead of just giving some error that's not helpful and letting the human just say, Well, it didn't work, you give a good error to the agent. I was doing this thing, which we're going to look at in a little bit, asking a FileMaker file to get the relationship info for a particular table. And if the table name's wrong, I could just give an error. But if I give a nice detailed error that's good for the agents, then the agent knows how to try again. So for example, in this case, if there's no table name that gets passed in, in the error I say, Here's a list of all the table names that are in this file.
Todd Geist:And so now the agent goes, oh, I thought it was contacts with a c, but actually these guys use contacts with a k because they're German or something. I don't know. Whatever. It sees it and goes, oh, here's the table name, so now I know which one to do. So if you give hints back to the agent, then the agent will just fix itself.
Todd Geist:It'll just keep doing it. And so that's another way in which it's like this second way of doing things where you always wanna make sure you're giving the agent what it needs to make its next best guess because the agent loop is going to do it. It's going to pick it up and run with it. So, yeah, that's another one. Yeah.
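The agent-friendly error Todd describes can be sketched as a tool result that carries its own recovery hints. This is a minimal sketch, not ProofKit MCP's actual API: `getRelationshipInfo`, the `RelationshipResult` shape, and the in-memory `schema` map are hypothetical names chosen for illustration.

```typescript
// Hypothetical result shape: a failure carries what the agent needs to retry.
type RelationshipResult =
  | { ok: true; table: string; relationships: string[] }
  | { ok: false; error: string; availableTables: string[]; hint: string };

function getRelationshipInfo(
  schema: Record<string, string[]>, // table name -> related table names
  tableName: string | undefined,
): RelationshipResult {
  const availableTables = Object.keys(schema).sort();

  // Missing name: say so, and list what the agent could have asked for.
  if (!tableName) {
    return {
      ok: false,
      error: "No table name was provided.",
      availableTables,
      hint: "Call this tool again with one of availableTables.",
    };
  }

  // Unknown name: same idea, so the agent can spot the intended spelling.
  if (availableTables.indexOf(tableName) === -1) {
    return {
      ok: false,
      error: "No table named '" + tableName + "' exists in this file.",
      availableTables,
      hint: "Table names are case sensitive. Retry with one of availableTables.",
    };
  }

  return { ok: true, table: tableName, relationships: schema[tableName] };
}
```

The design choice is that the failure branch is as structured as the success branch, so the next turn of the agent loop has a concrete list to choose from instead of a bare "not found".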
Todd Geist:So my goal with any application that I'm building now is to have it set up so that I can simply talk about features and it can simply implement them and test them every way that it needs to, to make sure that it didn't break anything, that everything works. And now we have agent browser skills, so if you're building a website or a FileMaker web viewer widget, you can actually have the agent just inspect the webpage itself and determine whether or not the button is green or blue or whether or not there's an error in the console, and then it just closes that loop. So it's all about setting up the project for the agent to just be able to take whatever you tell it and make a feature, test the feature, ship the feature, write the PR, have it run through CI/CD, validate that everything worked, and then ship it. So I don't have to do any of that. That's always my goal now.
Todd Geist:Self running repos, basically.
Ernest Koe:Yeah. I think I think that's right. I think we've been pushing in this direction, I would say. I I I mean, I don't wanna frame it as a, you know, super intentional, well planned idea of
Todd Geist:I mean, it's just what's emerging in the space. Right? Like, it's what's happening.
Ernest Koe:Exactly. Yeah. It certainly has been what's emerging. But I wanna highlight a couple of things that you mentioned that resonate with me.
Ernest Koe:Like, one, the level of concern for me now has shifted to design in a weird way, to taste, to to the goals of the software, and to the experiences and feelings I wanna create, and significantly away from the details of how it actually gets done. Right? I think that's the right separation of concern. And I think in a way, if you want to think about what agent first development means, it's something like humans define what this should do, why it should do it, and how it should feel when that thing does its thing. And then agents do everything else in achieving that vision.
Ernest Koe:And it's actually quite good at that today. I think I'm even underselling it. It's better than I am at doing that today for sure.
Todd Geist:It is. Just with the one caveat that still seems very clear. I don't know how long this how many model versions this will survive. But in some ways, it's interesting. If you know nothing about software and you don't care what it writes, you're in sort of this free place where you can just build things and you don't care about the implementation details.
Todd Geist:And that'll get you way farther than you ever could before, there's no question. But I think if you're going to ship things that are maintainable and will run and not crash and are available, high availability and can be maintained over time, at least for the time being, getting that project set up properly is software engineering. That is the software engineering that we have today. It's basically about setting it up so that the agents can run. And it's not insignificant and there are challenges and some platforms are better suited for it than others.
Todd Geist:And in fact, some of the goals that we've had for our FileMaker stuff is how can we make FileMaker as friendly to this kind of development as possible. And there are some real challenges there. And we've broken through some of them, but some remain. But that's actually I mean, it's serious engineering. You have to know what you're doing to set all this stuff up.
Todd Geist:It's things like if you can have many, many agents running on your codebase at once, you can't have them all hitting the same database for their testing. So you have to have completely isolated environments that these agents can spin up to work inside so they don't bump into each other or crash into the same databases or whatever. And this is a challenge. For some platforms, it's a bigger challenge than others, but you have to do that. Otherwise, you can't do multiple agents.
Todd Geist:We didn't say that, but that's actually a very important goal for me: I want the code bases that I set up to have as many agents running on them, fixing things and adding features, as I can manage. That means they each have to have their own isolated environment, including databases, which is always where it gets tricky if you have to deal with state. Or other services: email, whatever. You have to have something that will allow this to run in a completely isolated way so that each agent gets a clean slate, and when it submits back as a pull request, it can be merged into the master branch. So it's not just agent first, it's actually many agents that I want.
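One way to picture the isolation requirement is a small allocator that hands each agent its own branch, test database, and port. This is a sketch under assumptions: the naming scheme (`agent/…`, `app_test_…`) and the base port are made up for illustration, and actually provisioning the databases is out of scope here.

```typescript
// Hypothetical per-agent environment description.
interface AgentEnvironment {
  agentId: string;
  branch: string;
  databaseName: string;
  port: number;
}

// Derives a non-overlapping environment for each agent from its id.
function allocateEnvironments(
  agentIds: string[],
  basePort: number = 4100,
): AgentEnvironment[] {
  const seenSlugs = new Set<string>();
  return agentIds.map((agentId, index) => {
    const slug = agentId.toLowerCase().replace(/[^a-z0-9]+/g, "_");
    if (seenSlugs.has(slug)) {
      // Two agents must never share a database or a branch.
      throw new Error("Agent ids collide after normalizing: " + agentId);
    }
    seenSlugs.add(slug);
    return {
      agentId,
      branch: "agent/" + slug,
      databaseName: "app_test_" + slug,
      port: basePort + index,
    };
  });
}
```

Each agent then works against its own database and dev server, and the only shared touchpoint is the pull request it submits at the end.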
Ernest Koe:It's many agents first. Yeah. Yeah. I think having strong opinions about architecture and how things should be designed and how they should function from a systems perspective still feels really important, I think, like you said. You know?
Ernest Koe:I I wonder how soon before we get Claude and these other agents to kinda bootstrap themselves to saying, alright. You clearly don't know what you're doing. Let me give you three patterns. You know? Pick one, you know, and I'll test it for you.
Ernest Koe:Don't worry. You know? We're gonna do a couple
Todd Geist:of things. It's likely to be a solved problem. If you pick the right infrastructure that can be isolated and the models get good enough, then I think that part of it will be a solved problem at some time in the future. And then it's just about the taste and domain expertise that you bring to the table, which is still a tremendous amount, right?
Ernest Koe:Still a lot, right?
Todd Geist:Still a tremendous amount of stuff that you need to bring to the table to be able to create something that has value beyond your own nichey needs.
Ernest Koe:So speaking of taste and stuff we're working on: agent-first workflow. We've been doing some pretty cool things at Proof to really shift at least a lot of our FileMaker development work into this modality. Right? And let me set it up real quick. I think what we wanna do is maybe talk about ProofKit MCP, which builds on ProofKit.
Ernest Koe:ProofKit is something that we've had for a while now. It's a framework for spinning up web-based projects that use FileMaker as a potential back end, and not just FileMaker. It could be Supabase or other things like that. It basically gets you all the tooling and framework and things that you need set up before you get going.
Ernest Koe:Yep. And so we're now extending this to the AI space with ProofKit MCP. Maybe Todd, you wanna take over and talk about what we're doing here?
Todd Geist:Yeah. So again, it's this idea that I just mentioned about wanting to make FileMaker expose as much of FileMaker as we can to these agentic types of coding experiences because that's where the work is going to get done. I can't stress this term strongly enough. People will not be using IDEs. Things like ScriptMaker will not make it through this bottleneck unless it can be reached and edited by an agent.
Todd Geist:I feel very strongly about this. I just don't think there's any future in which you need a custom code editor for a particular platform, like we have with FileMaker. The value that it brought to the table for many, many years was the no code or low code, or maybe just call it the 4GL, where you didn't have to know text-based programming. That value is completely eroded by AI. I mean, I just don't know how to soften the blow. That is just the state of the world.
Todd Geist:Coding in ScriptMaker used to be faster than coding in TypeScript. That is just not true anymore. And so in order for FileMaker to continue to thrive in this new world, we want to be able to expose as much of it as we can to the AI agents. And so the first, most obvious place that we can build something is on top of ProofKit. ProofKit has always had this concept of building web viewer apps, which are these apps that live inside of FileMaker.
Todd Geist:But they're basically based on web code and they connect to FileMaker. And this became possible really in FileMaker 19.6 when we could call scripts and send data in and out of the web viewer. Right? So that we've been building on that for a number of years. So the next step is how do we then continue to expose that underlying stuff that we built on ProofKit to the agents?
Todd Geist:And so a good way to do that is through an MCP server. So let me spin this up and talk a little bit about it. And Ernest, feel free to interrupt or say more as I get going here.
Ernest Koe:Yeah. As you spin that up, I just want to stress one thing. I get that we can probably write a lot of scripts in something like Claude or Codex and then copy them into ScriptMaker and paste them. I think what Todd and I are sort of doubling down on is that that is not an agentic-first development environment. That is a human-first development environment that is assisted by, augmented by, an agent.
Ernest Koe:We want to make that entire loop end-to-end agentic, meaning that, well, humans can be part of the process, but they shouldn't have to be part of the process of making software come to life. Right? So that is the idea here with ProofKit MCP, which is to bring agent development to the FileMaker platform, essentially.
Todd Geist:Yeah. And we actually talk a lot to Claris about how we can make scripts editable by agents, because I think that's actually really important. A couple years ago, or a year ago maybe even, the loop with agentic coding was: you told it how to make the code, or you could chat with it and it would write code, you'd run it, there'd be an error, you'd copy the error, you'd paste it back into the agent, and then you'd say fix the error. And that was amazing. That was like so great.
Todd Geist:But now I don't want that copy paste thing to happen. Agent should be able to get its own errors. Right? That's the key. So what we have is we built an MCP server that can connect to FileMaker locally.
Todd Geist:So I can run a script here in my FileMaker file. This is a simple FileMaker file, kind of the same demo file that we used for ProofKit Chat, but we've just kind of stripped out a bunch of stuff and left just the basics we needed for demoing this. And so what we have instead is a Connect to MCP script. And so when I run that Connect to MCP, I get another window. Let me actually open up another window here to bring this back.
Todd Geist:And so this little window pops up and it just shows this little screen here, which says the ProofKit MCP server connector is connected to the CRM demo. Right? And so Claude now knows about this. And so if I go in and look at my connectors and I go to ProofKit MCP, you can see I've got all these tools here that are connected directly to this FileMaker file. So it's open on my desktop.
Todd Geist:Whatever I have authenticated access to the FileMaker file, Claude has access to now. We're relying nicely on the FileMaker security privilege sets. We're not getting around those in any way. We're not having to authenticate to the FileMaker file in some special way. I am the user.
Todd Geist:I log in. I have whatever access I do and that's it. Right? So once we have that, then start to do things. So I'm gonna start here in chat just to make a couple of examples and then we'll keep going.
Todd Geist:I'm gonna use voice to prompt. I usually talk to my agents. Okay. So what I'm gonna do is I'm just going to paste in this prompt and run this prompt that's basically just saying, hey. There's a FileMaker file connected to Claude.
Todd Geist:Explore it. Tell me what you think it's for.
Ernest Koe:So that file is open in the background.
Todd Geist:Yeah. That's this one here. And we saw it's got this little window open for the server connector.
Ernest Koe:Right.
Todd Geist:Okay. So now it's gonna it says, oh, I got a file open, CRM. That's the name of this file. It's right down here. It's this file down here.
Todd Geist:That's the name of the file, and it's just gonna explore it. Now I'm gonna just allow it to use whatever tools it wants here. So it can get the scripts. It can get value lists. It can get the DDL, which was added for the AI stuff.
Todd Geist:It can get layouts and the tables that are connected to them, the fields that are on the layouts. So now it's just getting script names. It can't read the script data itself at this point, but it can get names. So we'll just see what it comes up with. So the idea here is that what the MCP server does is it gives look, it's making an entity relationship diagram.
Todd Geist:So this is an instance of what we were talking about before
Ernest Koe:That's awesome.
Todd Geist:Which is MCP UI, right? Right. So we have a custom ERD viewer that we built into the MCP ProofKit or sorry, ProofKit MCP that actually makes a nice UI. And this is why we say sometimes that, again, it's like, how much of the UI are you gonna build into an app and how much are you just gonna make available to the agent so you can chat with it? Right?
Todd Geist:This is just one UI. You can imagine any kind of UI that you can think of building, you can build as part of an MCP server, for example. Right? So you can blow this up to full screen. You can drag these things around.
Todd Geist:You can rearrange them however you want.
Ernest Koe:I think it's so cool that it got the joins right for both of those keys, you know, and the cardinality of it. Yep. One to many. Yep.
Todd Geist:It got a whole bunch of stuff. And so let's see what else it told us here. So what it is, is this classic B2B CRM invoicing system. Right?
Ernest Koe:It's classic, Todd.
Todd Geist:Classic. Yeah.
Ernest Koe:Kind of thing a small to mid sized business would do. Got it.
Todd Geist:Yep. Yep. So it can and and it's got tools to actually query the database so it can execute data. It can execute SQL. It does a pretty good job of that, and it can execute the data API.
Todd Geist:And because it can execute the data API, if you tell it, it can actually update records. Right? You can do all this within your chat. Right? This is so we're not even building custom code yet.
Todd Geist:We're just using the MCP server to expose our FileMaker system to the chat.
Ernest Koe:Yeah. This is important. I mean, you can already you can already use this MCP server to really just interrogate your database and pull data and understand what customers are and how to connect invoices, that kind of reporting stuff. Right?
Todd Geist:Yeah. And chat can build you simple UIs too. It'll just do it. Right? It'll start building things.
Todd Geist:So you could prompt it to make a sales chart and it will make you a sales chart. It'll do all that stuff because we've exposed all of these tools to it. And so that's just step one: you've got MCP in chat, which means you can do all the things that, for example, we were doing with ProofKit Chat. To be clear, the difference between this and ProofKit Chat is that ProofKit Chat lives inside your FileMaker file. It's running on an agent.
Todd Geist:It's running on an LLM and then an agent loop that we built. Right?
Ernest Koe:Right.
Todd Geist:This one is running on the Claude agent, which is frankly I mean, Anthropic is a multi billion dollar company. They're gonna have a better agent than we're gonna have in ProofKit chat, just to say. So anything else to say about chat there? Anything else we should demo there?
Ernest Koe:Yeah. But because it's an MCP server, this should work in Codex. I don't think we've fully tested it yet. But in principle, anything that has MCP support and an agent, basically an agent loop as part of its working process, should be able to take advantage of this.
Todd Geist:That's right. Now, coding agents have different levels of or all MCP servers, MCP clients, which coding agents typically are, will have different capabilities. So most coding agents do not do UI. So they're not gonna give you these beautiful entity relationship diagrams back or whatever else has been programmed into it. That's not gonna happen.
Todd Geist:And so in terms of Claude Desktop, for example, we have three different experiences in Claude Desktop. We have the chat experience, we have Cowork, and then we have code. And so code is the hardcore one, which we're gonna jump to in a minute. Cowork is in between and extremely interesting. And Anthropic shipped a bunch of stuff last week, Telegram and Slack integration and projects.
Todd Geist:It's really great. You can make fully fledged agents with Cowork now. And so of the three, chat and Cowork do display MCP UI, which is a new standard that both Claude and OpenAI support. It's a standard that anybody can support and they do. So this, in theory, should work in ChatGPT, although frankly, I just haven't tried it. So what else to say about that?
Todd Geist:So that's chat. Let's jump over to code, and let's make a new thing here. And I'm gonna make a new folder, CRM demo. I think I have another one, so we'll call it CRM three. Now, when I'm in code, I'm in a new paradigm.
Todd Geist:The first thing I gotta do so, okay, what's my goal? My goal is I wanna make a FileMaker web viewer app. Right? That's my goal. And I've got a file open and I want to just build an app based on that file.
Todd Geist:And I wanna eventually gonna embed that right into my FileMaker solution and do that whole thing.
Ernest Koe:So the old way, so to speak, is that maybe you vibe code it or you write it, write the HTML and CSS and all that in Cursor and Claude. And then you do what? You make a bundle and then you copy it into the web viewer and then you run your tests on it. And then if it doesn't work, you do the whole thing again, and you copy paste and you manage that, and then you stash the code somewhere, and then maybe you want to update it and it's like, okay, do you copy it back down to your computer and then paste it manually? That's the current story, right?
Ernest Koe:Yeah.
Todd Geist:Which is, we've done a bunch of these. One of the, you know, breakthroughs from a few years ago is we got it to where you put a web viewer in your FileMaker app and you'd point it at the dev server for whatever framework you were using, like Vite or React or Next.js or whatever, and then you could actually see the UI update in your FileMaker file, which was great. And you could call the FileMaker scripts and it would all work. The problem is that that browser is embedded in FileMaker, which means it is not open to all the automation tools that you can use if it's in Chrome or in Claude or whatever. And so, again, we wanna remove the loops.
Todd Geist:We wanna get people out of those loops as much as possible so the agent can do more exploring and fixing of itself rather than you having to do it.
Ernest Koe:It's also worth mentioning that although this was smoothed out a little bit, quite a lot actually, with ProofKit itself, you had to know, you know, which framework to use, what bundler to use, whether it's Vite or something else. Is it Next.js in the background or not? Do you need Node running somewhere, and how do you do it? I mean, there are lots of, like, project setup details, and that's even before you get into things like auth and all that, which we may not get to today.
Ernest Koe:But but there was a lot of different things that had to kind of be in place before you could successfully do a web based web viewer project.
Todd Geist:That's right. You had to understand a lot of stuff about how the web viewer works. We do. Like, you could say, we have taste in this. We have a lot of experience in knowing what the happy path is to making web viewer apps work inside of FileMaker.
Todd Geist:We have shipping apps that are shipped that way that hundreds, if not thousands of people actually use. Well, maybe a thousand might be the high end of it. But it's a lot and it's in heavy, heavy use all the time. That's AutoDeploy, part of our deployment frameworks that we have. That's all a WebViewer app.
Todd Geist:The entire thing is a WebViewer app living in FileMaker. So we know how to build these things, and so what we've done in this MCP server and inside of ProofKit is we've baked in that taste so that you get a sort of big head start. You do not have to go figure this stuff out. You can and Claude might even find references to things we published and eventually get you there. But again, project setup is still really the key.
Todd Geist:Like, if you don't get the project set up properly, it's gonna struggle. The coding agents will struggle. And so what we've tried to do is make this as simple as possible to get going with ProofKit.
Ernest Koe:And consistent. I think consistency is important.
Todd Geist:Yeah.
Ernest Koe:Yeah. Okay. So here we go.
Todd Geist:What I've done here is I've selected this folder CRM demo, which again, you're gonna have to do. You need a folder to operate in. It's best just to give Claude code a folder upfront and say, this is my folder, and then it will go ahead and operate within that.
Ernest Koe:Right. And that's a code base, basically.
Todd Geist:That's your code base. Like, this is where I wanna put it. So that likely will be the thing that you eventually check into GitHub so you can manage this using Git and all that. But in this case, we're just giving it a folder on my desktop and we're just saying, go ahead. And I've also set this up to bypass permissions, which makes it so I don't have to click through so many approval prompts for all the terminal commands it's gonna run.
Todd Geist:So let's do a prompt here. So let's say I have a FileMaker file connected to Claude. I would like to make a FileMaker web viewer app using ProofKit with that file. Can you get me started and set up properly? The first thing it's gonna ask me to do is trust the workspace.
Todd Geist:That's the standard with Claude Code. So this is gonna take probably a while, so we can talk over this as it's going. But just notice what it's doing in the background. It's exploring the situation, and now it's gonna run the init command that sets up this project. This is where we've applied our taste, if you will, to the project. And so it's gonna get itself set up properly so that it will run the way that we hope will just make it super easy to go.
Ernest Koe:So, it knows how to initialize a ProofKit project because the MCP server is connected?
Todd Geist:The MCP server has a project setup tool.
Ernest Koe:Right.
Todd Geist:And so it's listening for you to say something about setting me up a project, and now it's going ahead and doing that. And now these are other things. There's this library out there called TanStack Intent, which actually will automatically install skills, agent skills, to the repo based on the code that's already in place. And so it's going ahead and doing all this stuff, setting it up, setting up the CLAUDE.md, setting up agents, doing all the things that it needs to do to do this. So this is the stuff that you would have to figure out and tell it how to do, and we've just set it up so the MCP server just tells it how to do it.
Todd Geist:This is another example of instead of giving you a code base to open up and use, we're just giving the agents context it needs to do it all itself.
Ernest Koe:Right. Yeah, this is key because the MCP server here is not merely a connector to your FileMaker file. It's coming with a battery of tools and and other things that it needs to kind of bootstrap itself into into this process.
Todd Geist:Yeah. So now it's exploring the metadata to figure out what kind of thing we can build out of this. So here are the files that are available, the tables. So ProofKit, or rather the agent, figured out that we can connect directly to the FileMaker file and start to generate the TypeScript libraries that automatically connect to your FileMaker file. And TypeScript is one of these other things that's really helpful because it's one of the things that agents are very good at.
Todd Geist:And if types are wrong, if the code is wrong, the agent can quickly and easily determine what's wrong and fix it. So that's why we pretty much don't use JavaScript per se. We use TypeScript always because it's another way that the agents can self fix. They know how to fix what they're working on here.
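To make the point about types concrete, here is a minimal sketch of how a typed record lets the compiler flag a mistake before anything runs. The `Customer` shape and its field names are hypothetical; the types ProofKit actually generates may look quite different.

```typescript
// Hypothetical record shape; the types ProofKit generates may look different.
interface Customer {
  id: string;
  name: string;
  city: string | null;
}

// Runtime guard so untyped data coming back from FileMaker is checked once.
function isCustomer(value: unknown): value is Customer {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "string" &&
    typeof v.name === "string" &&
    (typeof v.city === "string" || v.city === null)
  );
}

// If an agent wrote `c.companyName` here, the compiler would reject it
// because `Customer` has no such field. That error is the self-fix signal.
function formatCustomer(c: Customer): string {
  return c.city ? `${c.name} (${c.city})` : c.name;
}
```

The compiler error names the file, line, and missing property, which is the kind of precise feedback an agent can act on without a human in the loop.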
Ernest Koe:How is it handling the data connection to the web viewer app that it's trying to scaffold up?
Todd Geist:Yeah, which is interesting, right? Because we're not even running. We don't have a web viewer with this dev server. We don't have the dev server set up here. Let's see what it's doing.
Todd Geist:So it's setting up the schema. So it generated the schema for customers, contacts, products, invoices, and invoice items. And so it now knows, okay, this is what's available. So, yeah, let's just do what it's suggesting, which is let's start the dev server and then build out an initial page like a customer list.
Todd Geist:Let's just see what it does. Yes. Go. Yes. Go ahead with your idea.
Todd Geist:So one of the things that the MCP server does is it lets you run code in an external browser like Chrome as though it was running inside of a FileMaker web viewer. And what's helpful with that is it means that the agent can use the embedded browsers that it has. And so eventually, it's going to, I think, try to open up its embedded browser and get to work on it. We'll see what it does here. But what that means is, in the old way, we had to add a web viewer on a layout.
Todd Geist:We put our dev server URL in there and then that's how we would write our apps. So we would just watch the changes go and it could connect to FileMaker because it's running inside of the web viewer. We've made it possible that any code that's running in a browser on your machine can connect locally directly to your FileMaker file and make those same calls as though it was running inside of a web viewer.
Ernest Koe:Yeah. That's pretty hot.
Todd Geist:Yeah. So it's got the FM Data API documentation. It's got itself a to do list. It's working through. It's building up navigation.
Todd Geist:So this could take a little while. It's gonna work through. And eventually, we're gonna get something and we'll see what it comes up with.
Ernest Koe:I mean, all this setup used to take a lot of time to do manually.
Todd Geist:Yes. And it also required experience. Like, you had to know what you were doing. Right? And so what we're trying to do here is make it so that it's very easy to get set up with a good, stable, ready for production workflow for building web viewer apps.
Todd Geist:And you can see that it's... where was I going with that? Production. Oh, and then, of course, the next step is that we'll be able to make full web apps. And that's sort of the next round of this. It'll work kind of the same way.
Todd Geist:So what's going ahead there? It kicked open the browser. Preview. So it just opened it up. Here's the home page.
Todd Geist:And look. You can see. Let me make this bigger. Look at it. Like, it's now navigating through.
Todd Geist:I'm not touching this browser. The agent is scrolling around, looking around, navigating in the web viewer here that's over here, and it's reading data out of my FileMaker database. Right? So it's taking a screenshot, then it reads the screenshot, and it says, okay. It's working.
Todd Geist:"The customer page is pulling live data from your CRM demo," and it pulled in all its stuff. The dev server is running. It's still working. It's still doing other things. We'll just let it go until it's done.
Todd Geist:But this is now what we mean when we say the full loop. Once we kicked it off, we've only had to allow certain things. If you wanna go crazy, you don't even have to do that and can just let it run everything. And it says, okay. So now we're in vibe coding land.
Todd Geist:Right? And we can tell it what we wanna do. So let's say let's get rid of that homepage that we have, and let's just make the customer list with the drill down to the customer detail. And I'm purposely not giving a lot of detail. You might have more taste you wanna apply to it, but I'm just sort of letting it figure out what it thinks I mean by that and just seeing what it does.
Todd Geist:But again, now we are in the mode where I can talk to the AI and the AI builds my project, and I don't have to micromanage that process. It's the agent's job to implement the code. We were talking about jobs before. I want the agent to have the job of writing the code and testing the code, and when it's done, I want it to have done a bunch of tests and figured out that, yes, it actually completed the task that I wanted it to do.
Ernest Koe:I'm sure this is a gory technical detail, but I think FileMaker developers listening to this may be curious how we're fetching the data from FileMaker. Is it Execute Data API?
Todd Geist:It's all Execute Data API. The magic, the thing that the MCP server does, is it bridges those calls straight from any browser to FileMaker. And that's how that's working. So it's showing a list of card views, which is a little goofy. Maybe what we'll do is tell it to make a table view.
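For developers curious what such a call might look like, here is a rough sketch of building a read request and unpacking the result. The `action`, `layouts`, `limit`, and `query` keys follow the request JSON of the Execute FileMaker Data API script step as commonly documented; the function names, the layout name, and the exact response envelope are assumptions for illustration, and this is not how ProofKit itself is implemented.

```typescript
// Request shape for a read, modeled on the JSON that the Execute FileMaker
// Data API script step accepts. Only a few keys are shown.
interface ReadRequest {
  action: "read";
  layouts: string;
  limit?: number;
  query?: Array<Record<string, string>>;
}

// Hypothetical helper: builds the request an app would hand to the bridge.
function buildReadRequest(
  layout: string,
  limit = 50,
  query?: Array<Record<string, string>>
): ReadRequest {
  const request: ReadRequest = { action: "read", layouts: layout, limit };
  if (query && query.length > 0) request.query = query;
  return request;
}

// Assumed response envelope, in the style of FileMaker Data API results.
interface DataApiResult {
  messages: Array<{ code: string; message: string }>;
  response: { data?: Array<{ recordId: string; fieldData: Record<string, unknown> }> };
}

// Parses the JSON result and returns just the field data for each record.
function extractRows(resultJson: string): Array<Record<string, unknown>> {
  const result = JSON.parse(resultJson) as DataApiResult;
  const first = result.messages[0];
  if (!first || first.code !== "0") {
    throw new Error(`FileMaker error ${first ? first.code : "unknown"}`);
  }
  return (result.response.data ?? []).map((row) => row.fieldData);
}
```

Keeping the request builder and the result parser as pure functions means the same code can be exercised in a plain browser or a test runner, with only the transport differing between a web viewer and the bridge.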
Todd Geist:But you can click on it and you can see now look, it's got detail. Right?
Ernest Koe:That's awesome.
Todd Geist:So it interpreted my idea as a card view instead of a table view. But we could prompt it. We could tell it to do whatever we want at this point and it would keep going. And we could even tell it to make editable things. So we could say, make the name of the company editable. And it would figure out how to do that and it would write the FileMaker code, the Execute Data API code, to edit that and to make that all happen.
Ernest Koe:Yeah. That's actually a nontrivial thing. Right? Because we take that for granted in FileMaker, but that's not always the easiest experience with web code in FileMaker.
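As a sketch of what making the company name editable could involve on the web side, the helper below validates the new value and builds an update request. The `update` action and `fieldData` key follow the Execute FileMaker Data API request style; the `CompanyName` field, the layout name, and passing `recordId` as a string are assumptions for illustration only.

```typescript
// Update request modeled on the Execute FileMaker Data API request JSON.
interface UpdateRequest {
  action: "update";
  layouts: string;
  recordId: string; // assumption: passed as a string here
  fieldData: Record<string, string | number>;
}

// Hypothetical helper: validates input before anything is sent to FileMaker.
function buildCompanyNameUpdate(
  layout: string,
  recordId: string,
  newName: string
): UpdateRequest {
  const trimmed = newName.trim();
  if (trimmed.length === 0) {
    throw new Error("Company name must not be empty");
  }
  // `CompanyName` is a made-up field name for this example.
  return {
    action: "update",
    layouts: layout,
    recordId,
    fieldData: { CompanyName: trimmed },
  };
}
```

The part FileMaker developers take for granted, committing a field edit, becomes explicit here: validate, build the request, send it, and handle the result.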
Todd Geist:So that's all gonna work. Now the interesting thing is this is already useful. You could imagine that you have a particular view of the data you want in your FileMaker database, and you want the UI that you want in the FileMaker database. You don't actually care if anybody else has that same UI. You can use Claude Code to write your own custom UI to a FileMaker database.
Todd Geist:Never deploy it anywhere. Just have it sitting in your Claude Code. You open up this particular thing, you hit preview, and this is your custom UI. And it can update records. You can tell it to basically build any kind of UI you want, including writing back and making edits.
Todd Geist:You can do dashboards, edit views. You could write skills into your file that would tell it how to run certain things like, Okay, I need to send a new email to this customer, and the agent will just pick that up and do it. So right here, we're looking at an agent. Over on this side is the agent. On this side is the UI that the agent is building in real time for you.
Todd Geist:You could set up cron jobs to run inside of this agent that we're building to do certain things. And like that time tracking thing that I talked about before, I could just open this up in the morning. It could have my time sheets ready to review right here in my UI. Right.
Ernest Koe:So as it is right now, does it know anything about scripts, and can it call scripts that are inside?
Todd Geist:It can. You can tell it to call particular scripts too. Yeah. Right.
Ernest Koe:So you could load up all your business logic in your scripts like you normally do and then it could just fire those off as part of the process. You know, let's say when I make a customer record, I need to get two approvals and it fires an email. If those are already script calls that you've wired up, it could just pull those.
Todd Geist:That's right. That's right. It can just do it. It can read the script name so it can actually validate and call the right script. Right?
Todd Geist:So it can do all that. It can get value lists. It can get tables. Basically, every bit of metadata that made sense to get, it can get. And so that just helps it write better code because it knows what it's writing against.
Todd Geist:Right? So you could choose to leave this directly in your file and just use it yourself, or you could choose to share it with your team. And so, okay, I wanna share this with everybody who has access to this file. So let's deploy this HTML to the FileMaker file so that it will work embedded inside the FileMaker UI without having to run Claude.
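The idea of validating a script name against the file's metadata before calling it can be sketched like this. The function name and the matching rules are hypothetical; the list of available names is assumed to come from whatever metadata tool the agent has.

```typescript
// Resolves a requested script name against the names found in metadata.
// Exact matches win; otherwise a single case-insensitive match is accepted.
function resolveScriptName(
  requested: string,
  available: readonly string[]
): string {
  const exact = available.find((name) => name === requested);
  if (exact !== undefined) return exact;

  const wanted = requested.trim().toLowerCase();
  const loose = available.filter(
    (name) => name.trim().toLowerCase() === wanted
  );
  const [only] = loose;
  if (loose.length === 1 && only !== undefined) return only;

  // Failing loudly gives the agent something concrete to correct.
  throw new Error(
    `Unknown script "${requested}". Available: ${available.join(", ")}`
  );
}
```

An error that lists the real script names gives an agent enough to retry with the right one instead of calling a script that does not exist.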
Todd Geist:So now we have a tool which knows how to build this app into a single HTML bundle and install it into the FileMaker file. Alright? So that's what it's gonna do. It's running a build. You can see here.
Todd Geist:Build successful. All done. And it's deploying it. See if it gets that done. Pull up that FileMaker file.
Todd Geist:I can see if it finished already. Let's go look at ProofKit apps open. And it looks like it did. Oh, now it's trying to deploy. Okay.
Todd Geist:Let's see. There it is. There it is. It's in your file. Amazing.
Todd Geist:Right? Okay. One more step. I've developed all this on my computer. It looks great.
Todd Geist:I've got this file. I want to share it with the world. What do I do next? Well, one of the tools we have in Claude Code is, look at the connectors, manage connectors, ProofKit. What's it called?
Todd Geist:I wonder if I have the wrong... let's just see. There's a tool in here. Maybe it's a different name. Deploy HTML is not it. I have metadata.
Todd Geist:It looks like it does not have the tool that I was expecting. Oh, I think I know why. Yeah. I'll just talk it through at this point. What we have is a deploy to Ottomatic feature built in, so it'll deploy right to Ottomatic.
Todd Geist:And the reason is I didn't put my API key into the project. We'll just skip that for now. But the idea is that you can automatically deploy to Ottomatic. And even if you don't have a server, we will spin you up a server and deploy it, and that file is immediately available to you to share with anybody in the world. And then if you don't do anything with it, or if you don't decide to give us a credit card, we'll just destroy it after a certain number of days.
Todd Geist:But it's all built in, and I'm sorry I don't have that demo. I have a video of it working. Maybe we'll post that in the show notes. But it will just deploy it to Ottomatic and you have your file. So you can start with a local file in FileMaker on your desktop.
Ernest Koe:That's right. Yeah.
Todd Geist:You can build it, do whatever you want. You can make an HTML beautiful UI connected to all your FileMaker data. And then when you're ready to share it, you can deploy it to the world. So that's what we've been working on. It's pretty exciting.
Ernest Koe:It's pretty amazing. Yeah. Well, this is really, really cool. Can't wait to get this out. What's your thinking, Todd?
Ernest Koe:I mean, we are
Todd Geist:Right now, it's in private internal review, so we're going to get out whatever kinks are left, like that one. The tool is turned off when I don't have the key installed properly, and I just didn't for the demo. Obviously, we wanna make sure that you can deploy to our platform in a secure way. And so that's one of the last wrinkles that we had to figure out. But I don't think it's going to be too long.
Todd Geist:We're just going to probably do a couple of weeks on it and then we will figure out how to make it available to everybody, which we haven't quite figured out, but that's where we're going. And I think the key thing there, I think what we demonstrated is the value of removing the person from the loop or as much as possible making it so that agents can work. And so we know that agents can work quite well with FileMaker web viewers. There's nothing that prevents an agent from just doing its whole agentic loop, setting up an app and working on that app. And then we just have to layer the right tooling in there and it works.
Ernest Koe:I mean, we've been talking about this, but, certainly, we feel like this is going to just revolutionize the way we work in FileMaker. Right? I mean, we're treating FileMaker as a first class app container and database back end. Yeah, it's actually really cool because of that separation of front end versus back end. I mean, when you build a web app and you're using Node as a server or whatnot, you have that distinction. You know, you have a Next.js app running in Vercel and whatnot.
Ernest Koe:There is a back end where your secrets and your server side stuff are being handled. But the FileMaker file becomes a really credible place where you can deploy all this cool web code without really doing much more than writing it with your agent. Right? And then you don't have to worry about all that other layer of stuff, the server, the infra. Security is handled because it's just going through the FileMaker security layer. Right?
Ernest Koe:So whoever has permissions to that layout and whoever has access to those fields or scripts that you are tying to that still stays the same. You don't have to re-architect and redo all of that. So I can imagine this being a really sensible path to upgrade a lot of existing FileMaker apps, take existing layouts and modernize them and just make them really performant within a web viewer.
Todd Geist:I expect you'll see a lot of people making new apps, because one of the interesting things about agents is a lot of it's done on your computer, right? So why not just build a little agent like we did? We built a bunch of little agent UIs. You can just put them in a FileMaker file. The advantage is it's running on your computer.
Todd Geist:It's just yours. It's a database, so you've got all the database stuff in it, but your agent can operate with it. It can read data out of it, it can write data to it, it can update UIs for you. So it's actually a really good little database in terms of being self contained and very easy to understand for doing local agentic development. And then it also can be deployed on a server.
Todd Geist:So if you think about what has changed here, we used to think of FileMaker as being the fastest way to build UI for business operations, and that's just not true anymore, not using native FileMaker layouts. You have to really just know that. You gotta absorb that in your heart because it's just not true. It's way faster to do it with agents.
Todd Geist:So once you get over that, then it's just like, this is not a problem. This is easy. You still have to write your scripts by hand. We've got some older ideas, which we'll talk about maybe later, about how we can improve some of that. But also, we're using this as a way to explain and show to Claris what we mean by these things that we're talking about.
Todd Geist:I think Claris believes that they need to build an entire agentic coding UI in their application. And what we're trying to say is, No, you do not. You need to build the hooks that the agentic coding environments already use. That's a key distinction.
Ernest Koe:Yeah. I would say I don't know if Claris thinks that way, but I certainly see a lot of opinions about what Claris needs to do, which is exactly that.
Todd Geist:That's a better way of saying it.
Ernest Koe:Yeah. I think a lot of people are saying, and I have a lot of respect for a lot of devs who are saying this because I get it, you know, we want to be able to compete in an agentic development space and we want vibe coding to come to FileMaker in some way. Have FileMaker become the IDE where you're making layouts, making scripts, making all the rest. But I think our strong opinion here is that we don't need to wait for any of that, because it's already very, very good outside and will always be, I think, better, given the agent harnesses that Claude and Codex are putting together. I mean, that's just an incredible programming environment, and it's incredibly rich, and it's already where you are working, which is the most important thing. It's already where you're building things, you know, in the world.
Ernest Koe:And we think that ultimately it will be nice to be able to agentically modify and create scripts and schema and structure and whatnot. As the first step to making this first class agentic platform, we should just start here and make these layouts a non issue.
Todd Geist:Yeah. It's the way to build apps, and we do what we can to make that even better. And there's more that we can do.
Ernest Koe:There's more that we can do.
Todd Geist:Yeah. But I think to me, if I could get to where I could script and update tables with my agents, that's gotta be above 90% of all the code that you write, right? Other stuff, maybe custom functions too. That would be the other thing. But everything else is like, you set up your security privileges, you don't change those all the time.
Todd Geist:The other things, value lists and a few other things, aren't that much. If you could just do scripts and tables, that's got to be 90% of the code that you write.
Ernest Koe:I think it's worth saying that we're coming to this from some lived experience and some informed understanding of what it takes to do it the other way. I mean, when we wrote ProofKit as basically an in-FileMaker agent, I think two things became very, very clear in that process. Like, one, yeah, the amount of effort to really make that into a world class, do everything agent is nontrivial. Doable, but nontrivial. Right?
Ernest Koe:I mean, you have to pick your battles at some point. I think it works for specific use cases and for narrower problems that you wanna solve where chat is still a primary UI, and that's okay. But we've also done things like buybacks that you prototype where you're building scripts in FileMaker, and it just seems like in order for that to be a fluid experience, we want to just get past the old copy pasta.
Todd Geist:Yeah, for sure.
Ernest Koe:Right? We've got to break through. We need APIs and SDKs that we can actually leverage in an agentic way. So really it's not that we don't think those ideas are interesting. We've tried them all.
Ernest Koe:I think it's more like this is clear. I mean, this unlocks so many things out of the box, right? It gives you a development surface that we can iterate on much more quickly than anything else that we can do in FileMaker today. We can bring new components online, so we can open this up to the whole FileMaker ecosystem. We can make the layout design surface a nonissue.
Ernest Koe:It's still useful and relevant for all kinds of things like admin layouts and things like that. But it doesn't have to do all the heavy work of making your app experience, you know, number one anymore. Right? That can now be shifted towards a place where you can have a lot more control and do it faster and better and all that.
Todd Geist:Let me say one more thing about that too, if I could. Right now the FileMaker app itself is not free. But what if it was? What if FileMaker was just free? The other thing that's happening right now is, again, how do you get people to understand and to know what you can do with these apps? So here's an example.
Todd Geist:This is inside of Claude. You can go and you can say, Give me connectors. And there's all these connectors from all these companies. Wouldn't it be great if there was a Claris one here? Because every coding environment has a marketplace.
Todd Geist:Cursor has it, Codex has it, Claude has it. What if there was just a Claris FileMaker plugin and you could just install that, and that's how you get this MCP server that we're talking about? And then you've already got FileMaker because it's free. And again, people who are coming to this agentic stuff now, maybe they played with FileMaker a few times in years past, or they've heard of it or they know something about it, and it's something they feel they understand or at least feel they can deal with. How many people are gonna pick up FileMaker again if it's in this marketplace?
Todd Geist:Or at places that have FileMaker, where they have licenses already, will those get extended? Will they get new seats, new licenses, scale, because there's a Claris FileMaker connector here in the marketplace?
Ernest Koe:I talked to somebody recently. I think it was John Luke out in Canada. Well respected longtime dev. And, you know, he made a comment that I thought was really spot on, and that is, just to paraphrase, the AI is doing incredible things for the development world, and he imagines that's where he's spending most of his time building things. But there are separate concerns and problems that he'd rather not have to worry about and deal with. Deployment is still a pain, and it's a pain not because Vercel is difficult or Supabase is difficult.
Ernest Koe:Those things are ostensibly, on their own, not really difficult once you have AI in the mix. The problem is with more AI you actually get more complexity. It gets back to this idea that you're creating jobs for yourself every time you choose to touch a layer that you want to affect. Right? So his thing was, well, I don't want my job to be managing infra.
Ernest Koe:I don't want my job to be managing servers. I don't want my job to be managing how the database performs and the security layer around it and all that. And I think that insight speaks to what problem this actually solves. Right? So you can still do your problem solving, vibe coding, agentic development in Claude or Codex or whatever.
Ernest Koe:But now you have a development target, a deployment target, where that entire layer doesn't have to be your job. Whoever's job it is to manage that continues to stay the same, and you're drastically reducing the complexity surface area so that you can focus on actually building rather than maintaining and serving and hosting.
Todd Geist:Yeah. I mean, if you have FileMaker, you can agentically code it and do some really interesting things with just what we've shown today. We don't even have to go farther than what we've shown today and it's already really cool because your alternatives with agentic coding are you're gonna have to deal with SQLite. Again, not a hard thing, actually a great database, but you've never seen it before. Now you have to take on a new responsibility and you already have FileMaker, already know how to use it.
Todd Geist:Just make a new file, put it in a folder, open it up, target it with ProofKit MCP and start building. When you're ready to share, share. No friction. And you're now building things that look great, that look really, really good and are closer to the experience that people expect when they think about modern UI and applications. And you can do a whole lot.
Todd Geist:And there's a lot more to talk about, like how you can adopt some schema less designs with your FileMaker table so you don't need to have all your data in every field spelled out. You can use JSON objects as data instead. A lot of ways in which the AIs, again, are really good at iterating on top of that that you can do right within FileMaker. So if FileMaker was the way to build rapid UI and business workflows, it's no longer that in terms of there's way faster ways to build out the UI and the workflows that you want, But it is actually a great deployment target. It is the container, the app container that is really, really, really good.
Todd Geist:And you already know how to use it. Yeah. It takes care of the security layer for you and all the other stuff.
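The schema-less idea mentioned above, keeping a JSON object in a single field instead of defining a field per attribute, might look something like this on the TypeScript side. The helper names and the example attributes are hypothetical; this is a sketch of the pattern, not a ProofKit API.

```typescript
type Attributes = Record<string, unknown>;

// Reads the JSON text stored in a single FileMaker field.
// An empty field is treated as an empty object.
function parseAttributes(raw: string): Attributes {
  if (raw.trim() === "") return {};
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Expected a JSON object");
  }
  return parsed as Attributes;
}

// Merges new keys into the stored object and returns the text to write back.
function mergeAttributes(raw: string, changes: Attributes): string {
  return JSON.stringify({ ...parseAttributes(raw), ...changes });
}
```

With this pattern, adding a new attribute is a code change rather than a schema change, which suits quick iteration with an agent.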
Ernest Koe:I also think about performance and some of the other things. I think mature FileMaker systems tend to go through this evolution cycle where you start simple and then you get more and more complex. Again, complexity kills, and what suffers is performance, because the layouts now have to do a lot of work and there are lots of relationships in your graph and all that. You know, one way to solve that is to have more performance and analytical tools to kind of figure out how you optimize. The other way you solve it is just remove it, kind of delete it.
Todd Geist:Get rid of all those layouts and build them in a new way that is not going to be so difficult. I mean, we all know where FileMaker performance bottlenecks come from. It's almost always UI. It's people trying to show a bunch of records through a filtered portal, for example, things like that. None of that's an issue with web based stuff, especially if we can get some of these other rough edges polished off.
Todd Geist:It's just not a problem. Yeah. So I think we've been on for a while. I don't know. This thing's gonna be well over an hour by the time it gets edited down.
Todd Geist:We didn't get to talk about some of the other things. We'll save those for next week maybe or next time. But we'd love to know what you think of this stuff and sort of the general discussion, but also what you think of ProofKit MCP. Drop us a line, let us know if this is something that you think is worthwhile and tell Claris too, if you agree with us.
Ernest Koe:Well, please don't be shy. Definitely tell us because we will love to hear from you and subscribe.
Todd Geist:I mean, we're doing what we can and we're working with what we have and we're working with Claris and they're very receptive. But anything that we can do to help sort of get this message across, if you agree with it, if you buy it, if you believe it, then help spread the word because I think this is really important. Super fun. It makes FileMaker fun again.
Ernest Koe:Yeah. I think the last thing I might say about this is that I'm just so excited that there's so much energy around AI and how we apply it to FileMaker. I see a lot of different projects getting launched, trying to bring AI into FileMaker and all that. I think there's going to be sort of an explosion of different approaches to making FileMaker a better citizen in the whole agentic AI development process. You know, Todd and I have sort of a premise based on what agentic development means and how that shapes our thinking, and that's how we're talking about this stuff.
Ernest Koe:But hey, if you're building AI stuff and AI tools for FileMaker, please share that with us. And if you wanna come on the podcast and talk about it, we'd love to hear from you. That's all, I think, on mission in terms of making our work in this space all around just better.
Todd Geist:Yeah. I should say, the MCP server as it exists today is about half really strong opinions, and the other half is just tools that are given to the agent to explore a FileMaker file. So we think in the future that those opinions can be much more malleable. Right now, we're focusing very much on making sure it works and so we're expressing our opinions quite strongly. That might change though.
Todd Geist:It might loosen up, and if you have a slightly different way of setting up a project that still works, we'd love to hear about that so that we can eventually expand out and allow people to have their own taste that they can apply to this. So I think that's it for this week. We're gonna try to do these things much more often. I don't know.
Todd Geist:There's just so much fun to be had programming. It's a little hard to take some time off and actually talk about this stuff, but we'll do our best. Let us know if you want to hear more and that will encourage us to do more. So thanks very much. Bye.
Todd Geist:See you.