Harshal Gajjar

Harshal Gajjar is an AI Forward-Deployed Engineer at C3 AI, based in the San Francisco Bay Area. Harshal leads Agentic AI harness development for the Forward-Deployed Engineering organisation at C3 AI, and since January 2026 has been building a stealth-mode startup in the Agentic AI space. Harshal cofounded Shram.io in 2024, where he led the pivot from a Jira-competitor product to an AI assistant that reached #2 Product of the Day on Product Hunt.

Harshal holds an M.S. in Computer Science (Machine Learning specialisation) from Georgia Tech and a B.Tech in Computer Science from IIT Dharwad, where he was part of the institute's foundational class. He spent three summers at Wolfram Research in Boston — first as a summer researcher in 2018, then as an instructor for high-school students in 2019 and 2020 — and was a Wolfram Student Ambassador throughout his undergrad.

Outside of work, Harshal is a long-distance cyclist and a vertical and horizontal caver, active with the San Francisco Bay Chapter (SFBC) grotto. In 2019 he was part of the Hubballi Bicycle Club Guinness World Record for the longest single line of bicycles.

Contact Harshal at mail@harshalgajjar.com.

The blank box problem

The chat-vs-UI divide isn't about interfaces. It's about who holds the model of the task.

There's a split forming in how people use software, and it's easy to mislabel. On one side are the people who mostly chat, through agentic interfaces like Claude Code: a blank box, a sentence, and the machine does the thing. On the other are the people who mostly click: buttons, menus, forms, maybe a chatbot tucked in the corner as a fallback. It looks like a fight about interfaces. It isn't. The real axis is who holds the model of the task.

Chat-first people carry the task model in their own head. They know what they want, they can say it in words, and they treat the AI as a general-purpose executor. To them the blank box is freedom. But it only works because they already know what's possible — the box has no affordances, no map. It rewards the fluent and strands everyone else.

UI-first people need the software to hold the model for them. Buttons and menus are a discoverable answer to "what can I do here." A chatbot bolted into the corner is what you reach for when the map runs out. To them the blank box isn't freedom, it's anxiety — the blank box problem: cursor blinking, and no idea what to type.

The trap is thinking this is a personality type. It isn't a property of the person — it's a property of your expertise in that specific task. I'm chat-first for code and UI-first for video editing, because I don't have the vocabulary to say what I want to a timeline. Almost everyone sits on both sides of the line depending on the domain. So the real unit isn't the person, it's the person-plus-task. Step back far enough and each person's mix of tasks blends into an apparent type — a programmer "looks chat-first," a designer "looks UI-first" — but that label is just an average, and it hides the truth: the same person hits the blank-box wall the moment they try something they're not fluent in.

Underneath, it's the oldest tradeoff in interface design. Chat is recall. UI is recognition. Recognition is easier: seeing the options beats generating them from nothing, which is why menus feel safe and empty prompts feel like an exam. But recall has the higher ceiling: chat lets an expert ask for things no designer ever put on a menu. So chat is high-ceiling, but easy to start with only if you're fluent. UI is easy to start with and low-ceiling: it guarantees you can do the anticipated twenty things and caps you there.

Which is why "just add a chatbot" and "just add buttons" are both losing moves. The bet worth making is the one that collapses the divide instead of picking a side: an interface that shows you the task model while letting you exceed it. Recognition to get started, recall to go past the edges. The pure-chat future quietly assumes people can specify what they want — and specification is the hard part. Most of us don't know what we want until we see it. Any interface that demands a fully-formed sentence before it shows you anything has offloaded its hardest job onto the person least equipped to do it.

The resolution I keep landing on is: don't make people choose between pointing and describing. Let them point at the real thing and describe the change — recognition and recall in the same gesture. You're not staring at a blank box trying to name what's wrong from memory. You're looking at the actual interface, putting your finger on the part that's off, and saying, in words, what it should be instead. The UI holds the model; the language lets you break past it.


That's the bet I'm making with medit. It lives inside your running app: you point at the element that's wrong, describe it in plain language, and it writes the code and commits it — and you keep iterating, playing with the change live, until it's right, then open the PR when you're happy. The pointing is the recognition — you never have to name a thing you can't see. The describing is the recall — you're not capped at whatever a menu anticipated. It's the same idea as the phone being the terminal: the interesting surface isn't another chat app, it's the layer that meets you where the task actually is.
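To make the gesture concrete, here is a rough sketch of what a point-and-describe request could look like as data. This is not medit's actual code: `PointedTarget`, `ChangeRequest`, `buildChangeRequest`, and `toPrompt` are hypothetical names invented for illustration, and the checkout button is a made-up example. The only thing it tries to show is that the request needs both halves, a concrete target and a free-form description.

```typescript
// Hypothetical shapes for illustration only; none of these names come from medit.

/** What the user pointed at: the "recognition" half of the gesture. */
interface PointedTarget {
  selector: string;    // CSS selector for the element the user clicked
  visibleText: string; // the text the user actually saw on screen
}

/** The combined gesture: a concrete target plus a free-form description. */
interface ChangeRequest {
  target: PointedTarget;
  description: string; // the "recall" half: plain language, not a menu choice
}

/** Build a request, rejecting the two degenerate cases: words only, or pointing only. */
function buildChangeRequest(target: PointedTarget, description: string): ChangeRequest {
  const trimmed = description.trim();
  if (target.selector.trim() === "") {
    throw new Error("No target: words with nothing to point at is the blank box again.");
  }
  if (trimmed === "") {
    throw new Error("No description: pointing alone cannot say what should change.");
  }
  return { target, description: trimmed };
}

/** Render the request as text for whatever code-writing agent sits behind it. */
function toPrompt(req: ChangeRequest): string {
  return [
    `Element: ${req.target.selector}`,
    `Currently shows: "${req.target.visibleText}"`,
    `Requested change: ${req.description}`,
  ].join("\n");
}

// Made-up example: the user clicks a checkout button and says what it should be.
const request = buildChangeRequest(
  { selector: "button#checkout", visibleText: "Submit" },
  "  Rename this to 'Place order' and make it full width on mobile.  ",
);
console.log(toPrompt(request));
```

The user never has to name the element from memory, because the selector comes from the click, and the description is unconstrained, so nothing caps it at a predefined list of actions.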

The blank box was never the goal. It was just the first draft — and drafts are meant to be revised.
