I’ve been on this AI-Assisted Design journey for about 6 weeks. It started by coincidence: Claude shipped the “Code” tab in their desktop app. I decided to export the Halo app code I had built in FlutterFlow and see what Claude Code would do with it. I never went back to FlutterFlow. I ran into limitations with the Claude Code desktop app and switched to VS Code.
A Complete Mobile App for iOS and Android
The Halo app started in FlutterFlow. But once I started working with AI-assisted coding, my productivity exploded.
Within a span of a few weeks, I shipped work I never dreamed of:
- Ability to purchase individual stories
- Multilingual support for 5 languages
- Complete refactor of the app to increase performance on older devices
Along the way I learned a TON about how to work with the LLMs, how to debug, and where the limitations lie.
A Room Visualizer for Real Estate Agents
My neighbor sells high-end real estate, and we’re exploring opportunities to build together. As part of that, we tested what AI can do to “interior design” rooms.
This project went really well too, which built up my confidence to try something even harder…
A WordPress Translation Plugin
I run a blog site with 4000+ posts, translated into 9 languages. The big problem is that the best translation models, like DeepL, are prohibitively expensive for this kind of work. So I thought: Could I build a plugin that uses LLMs to translate?
This didn’t go so well at first: the first iteration of the plugin would begin translating, run into errors it never reported to me, and eat up $250 over the course of a few days. That really disappointed me and forced me to regroup.
The learnings were:
- WordPress is an old and cryptic system, so you can’t just point an LLM at it without very careful context management. Before starting, I didn’t even check if there were any WordPress-specific skills for this kind of work. I didn’t point the LLMs toward the official WordPress plugin documentation. That was dumb.
- Be aware of your own limitations: If you don’t understand anything about PHP or plugin development, you’re gonna have a hard time filtering the BS that LLMs produce.
- It’s better to build a new plugin from scratch, rather than trying to fork an existing plugin.
The second iteration of the plugin worked perfectly.
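In hindsight, the $250 blowout came down to two missing safeguards: a hard budget cap and fail-fast error reporting. A minimal sketch of that idea in Python — `call_llm` and its per-call cost are hypothetical stand-ins, not the plugin’s actual code:

```python
# Hypothetical sketch: cap spending and surface errors loudly instead of
# letting a batch translation job silently burn money for days.

class BudgetExceeded(Exception):
    pass

def call_llm(text: str, target_lang: str) -> tuple[str, float]:
    """Stand-in for a real LLM API call; returns (translation, cost_usd)."""
    return f"[{target_lang}] {text}", 0.01  # fake translation, fake cost

def translate_posts(posts: dict, target_lang: str, budget_usd: float = 25.0):
    spent = 0.0
    results = {}
    for post_id, text in posts.items():
        if spent >= budget_usd:
            # Stop and report instead of quietly continuing to spend.
            raise BudgetExceeded(f"Spent ${spent:.2f} of ${budget_usd:.2f} budget")
        try:
            translated, cost = call_llm(text, target_lang)
        except Exception as exc:
            # Fail loudly: name the post that broke, then stop the run.
            raise RuntimeError(f"Post {post_id} failed to translate") from exc
        spent += cost
        results[post_id] = translated
    return results, spent
```

The point isn’t this exact code — it’s that cost and failure need to be visible in the loop, not discovered on the invoice.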
Entirely New Workflows for Prototyping with Stakeholders
I think all designers are currently figuring out how to best integrate these new tools into their workflow. I am fortunate to be working with a company that allows me to experiment.
Two Entire Published Music Albums
My wife and I got addicted to Suno. It’s so much fun. We produced Christian music we wanted to hear ourselves, then published the albums with DistroKid just so we could listen to them regularly at home.
What I Learned Along the Way
These tools are not smart
PhD-level intelligence, my ass. These are very impressive math functions that predict the next word based on the knowledge compressed into their weights. You can get miraculous amounts of work done if you understand how to work around the limitations. But the more I use these AI tools, the less I am impressed by or afraid of their supposed intelligence.
GPT Codex > Claude Code
Claude Code regularly loses capability during peak hours. It’s as if there isn’t enough compute to go around, and when too many people use it, the model gets dumber. Codex, meanwhile, works more slowly, but also more reliably and deliberately.
AI Burnout Is Real
There is a mental tax when using these tools. You need to read everything they say, then run it through a “Is this bullshit / incomplete” filter. No amount of “AI Harnesses” will completely take this away. If you do this a lot, it starts to wear you out. It’s like a management job.
Work Around the Tool’s Limitations
Manage the model’s context windows
Long chats feel productive. Often they are just long.
As the thread grows, the model starts to lose its grip. It forgets earlier constraints. It brings back ideas that were already rejected. It answers the current question as if the last ten turns never happened, or as if they all still matter equally.
Customize it with instructions & skills
A generic model gives generic behavior. If I do not tell it how I want it to work, it will improvise. Sometimes that is useful. Often it is expensive.
So I try to make my standards reusable. I want the model to plan before it builds. I want it to name assumptions instead of hiding them. I want it to say when it is unsure. I want it to flag scope changes instead of sneaking them in. I want it to separate what it knows from what it is inferring.
That may sound obvious. But if I do not state those rules somewhere durable, I end up restating them in every session. Then I become the process.
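One way to make those rules durable is a project-level instructions file that the tool loads at the start of every session (Claude Code, for example, looks for a CLAUDE.md in the project root). The filename convention is real; the wording below is just illustrative:

```markdown
# Working rules for this project

- Plan before you build: propose an approach and wait for approval.
- Name your assumptions instead of hiding them.
- Say explicitly when you are unsure.
- Flag scope changes; never sneak them in.
- Separate what you know from what you are inferring.
```

Once the rules live in a file, the model restates them to itself — and I stop being the process.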
Plan carefully
If I hand the model a blurry task, it will usually hand me back a blurry solution in neat packaging. That is one of the reasons these tools can waste so much time. They are very good at making unfinished thinking look finished.
Before implementation, I want the task made concrete. What is the goal? What are the non-goals? What files are likely to change? What edge cases matter? What counts as done? What is still uncertain?
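Those questions work well as a short pre-implementation template the model has to fill out before it writes any code. A sketch (headings illustrative):

```markdown
## Plan: <task name>
- Goal:
- Non-goals:
- Files likely to change:
- Edge cases that matter:
- Definition of done:
- Still uncertain:
```

If the model can’t fill this out crisply, the task is still blurry — and blurry in means blurry out.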
Only use programming languages that are well documented and understood
The model works better where the ground is familiar: popular languages, mature frameworks, strong documentation, lots of examples, and clear conventions. It works worse in strange corners where the docs are thin and the patterns are murky.
For the kind of software I can realistically supervise, this matters a lot. I can build and debug a fairly straightforward mobile app that lets people view and consume media. In that setting, boring and well-documented is a strength. It gives me a better chance of seeing when the model is bluffing.
Know your own limitations in relation to the tool’s abilities and limitations
This means: if you, like me, can barely string together a function, don’t try to build things you have no hope of understanding. I can build and debug a relatively straightforward media app; I don’t think I could build a functioning medical-device application.
The real question is not, “Can the model generate this?” The real question is, “Can I supervise this well enough to know when it is wrong?”