Shortlist is an AI-powered conversation-based job matching app I designed and built independently. This is the working record of how it came together — the decisions, the tradeoffs, the things that broke, what I learned, and how I experienced its impact in real time.
Demo walkthrough — condensed for time. The Coach conversation runs longer in practice.
I've spent years inside product and operations teams watching how friction accumulates when nobody asks whether the flow actually works for the person using it. When I started job searching in earnest recently, I turned that same lens on the process itself.
Traditional job boards are not optimized for the humans using them. They let you enter a title, select a few preferences from dropdowns, check "remote," and call that a search. They have no way to capture what I'd walk away from, what kind of environment empowers me to do my best work, or that I've been quietly drawn to automation-forward companies for years without quite knowing how to boil all that down.
"A job search should start with a conversation, not a form. The thinking behind Shortlist is that simple."
So I built something that starts with a conversation instead. After you upload any documents you'd like it to consider (resumes, previous interview prep materials, and so on), an AI coach interviews you about your actual experience, values, and preferences. It synthesizes that into a structured profile, then scores job listings against the whole picture and surfaces results with plain-language explanations. Currently, everything runs locally on your machine.
Python and Flask on the backend. Plain HTML, CSS, and JavaScript on the front end with no framework. The Claude API powers the Coach conversation, profile synthesis, and job scoring. JSearch via RapidAPI pulls job listing data. Everything runs locally, packaged for distribution via PyInstaller.
I want to be upfront: I am not a software engineer, though I do have basic HTML and CSS skills. I used a combination of Claude chat and Claude Code extensively throughout this build. Claude Code is an agentic coding tool that writes, runs, and debugs code directly. My job was to think clearly about what I wanted, communicate it precisely, and make good decisions about what it produced. Using Claude chat to formulate the best possible prompts kept me moving in the right direction, at the rapid pace of my own ideation, from the start.
That turned out to be most of the job.
Every feature in this app started as a conversation with Claude chat that produced a thorough prompt, giving Claude Code exactly what it needed to start building the idea into reality. I used Claude chat as a tutor to help me write strong prompts: a structured spec with clear behavior definitions, edge cases, implementation notes, and test criteria. Working this way, I wrote hundreds of these over the course of the build.
Writing a good Claude Code prompt requires the same skills as writing a good project brief. Clear problem framing, explicit success criteria, awareness of edge cases, and enough systems thinking to anticipate what could go wrong. The prompts are artifacts of structured thinking, not just instructions to a machine.
Here's an example from the ATS quality weighting feature, which scores job results higher when they come from Greenhouse, Lever, or Ashby (platforms that tend to attract more established, intentional employers):
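In spirit, the behavior that prompt specifies boils down to something like the sketch below. The platform list comes from the feature description; the function name, job fields, and the exact bonus value are simplified illustrations on my part, not the shipped code:

```python
# Illustrative sketch of ATS quality weighting, not the app's actual
# implementation. Field names and the bonus amount are assumptions.
PREFERRED_ATS = ("greenhouse", "lever", "ashby")
ATS_BONUS = 5  # assumed flat bump for listings on a preferred platform

def apply_ats_weight(job: dict, base_score: float) -> float:
    """Nudge a job's match score upward when its listing URL points to
    an ATS that tends to attract more established, intentional employers."""
    url = str(job.get("apply_url", "")).lower()
    if any(ats in url for ats in PREFERRED_ATS):
        return base_score + ATS_BONUS
    return base_score
```

The point of the prompt was to pin down exactly this kind of behavior, including edge cases like missing or malformed listing URLs, before any code was written.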
On any given day I am doing some mix of identifying what isn't working and figuring out why, writing prompts that specify what I want with enough precision that the output is actually usable, reviewing what got built, testing it, and deciding what to change next. I am also making product decisions about what to build, what to cut, and what the right UX pattern is; writing copy for the app UI, landing page, legal docs, and LinkedIn posts; and designing two full visual themes.
This is obviously not how most software has historically been built at scale. But it is teaching me a lot about what clear thinking looks like in practice, and how much leverage you can get from communication skills when the tools are this capable.
The Coach conversation model was right from day one. The whole product rests on the idea that you should be able to describe yourself in natural language and the tool does the translation. Every feature decision downstream flows from that.
Making the profile fully editable was the most important trust decision I made. AI synthesizes well but it also overstates and occasionally invents details that sound plausible. A job seeker's profile is not the place to let that slide. Keeping the user in control of their own representation is how a tool earns trust.
The three-tab structure on the Evaluations page took several iterations to get right. Longlist, My Finds, My Shortlist. That progression reflects a real mental model and it matters.
The Tracker was a late addition, and it might be the most useful feature in the app. Moving from passive matching to active pipeline management, with interview tracking, interviewer profiles, a salary negotiation workspace, and a Lessons Learned field per role, closes a gap that no other tool I know of addresses cleanly.
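To give a sense of the shape of a Tracker entry, here is a simplified sketch of the kind of record it manages. The field names are illustrative assumptions, not the app's actual schema:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a tracked role; field names are illustrative.
@dataclass
class TrackedRole:
    company: str
    title: str
    status: str = "saved"  # e.g. saved -> applied -> interviewing -> offer
    interviews: list = field(default_factory=list)    # one entry per round
    interviewers: list = field(default_factory=list)  # notes per person
    salary_notes: str = ""       # negotiation workspace
    lessons_learned: str = ""    # retrospective captured per role
```

The Lessons Learned field is the piece that turns the Tracker from a status board into a feedback loop across the whole search.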
Coach interactions are being refined for pacing and precision. Early versions were verbose in ways that felt more like being talked at than talked to. Current work involves tuning streaming speed, tightening response length, and adding contextual quick reply buttons so the conversation feels like a smart collaborator who knows where you are in the process.
Longlist search reliability is an open problem. The feature is fully functional, but it is constrained by the free RapidAPI tier's quota of 200 calls per month. That's not a question of functionality; it's a question of ROI.
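One mitigation while on the free tier is simply budgeting the quota locally. A minimal sketch of tracking usage against the 200-call monthly cap might look like this; the file name and layout are illustrative, not the app's actual bookkeeping:

```python
import datetime
import json
import pathlib

BUDGET = 200  # free-tier monthly cap on JSearch calls
STATE = pathlib.Path("jsearch_usage.json")  # hypothetical local state file

def _load() -> dict:
    # Read the per-month call counts, tolerating a missing file.
    return json.loads(STATE.read_text()) if STATE.exists() else {}

def calls_remaining() -> int:
    """How many API calls are left in the current month."""
    month = datetime.date.today().strftime("%Y-%m")
    return BUDGET - _load().get(month, 0)

def record_call() -> None:
    """Increment this month's usage counter after a successful call."""
    month = datetime.date.today().strftime("%Y-%m")
    data = _load()
    data[month] = data.get(month, 0) + 1
    STATE.write_text(json.dumps(data))
```

Surfacing the remaining budget in the UI would also turn an empty result screen from a mystery into an explanation.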
Getting the Longlist right matters. It is the feature most people will reach for first, and an empty result screen is a bad first impression for a tool built around finding you the right job.
Windows support, PyInstaller packaging, and the Gumroad listing are all on the roadmap, but not done. I decided to keep building features rather than packaging for distribution while the product is still early in its evolution. The product I'd have shipped right away was nowhere near as robust as what I have now. That it took just a few weeks to get here speaks not only to the power of the AI collaboration, both to brainstorm and to build, but to the power of imagination and determination when technology finally catches up to how your mind works. I haven't had this much fun in years!
Vague prompts produce vague output. Specificity is crucial, whether you author the prompts yourself or develop them in thought partnership with Claude. You can't be precise about something you haven't thought through clearly. Talking out the goal, and your concerns about how to get there, will help you figure out how to compose your prompts.
Almost every bug happened at the boundary between two features. Keeping a mental model of how the whole thing fits together matters as much as writing good individual prompts.
A thing that exists and is wrong is more useful than a perfect spec for something that doesn't. I iterated fastest on the features I could actually look at, react to honestly, and fix specifically.
Every label, every empty state, every error message is a decision about what the user believes the product does. "Have you applied?" is a better label than "Update status" because it meets the user where they are. These decisions compound.
The most important early AI design decision I made was making the profile fully editable. Coach synthesizes well, but it also overstates and occasionally invents details that sound plausible. For something as consequential as how a person represents themselves to a potential employer, that's not acceptable. Putting the human in control of the final output was the right choice.
No one can deny the astonishing speed and wealth of helpful information a brief interaction with an AI assistant can offer. But just because AI can front a wall of useful text in milliseconds doesn't mean a human can absorb or appreciate that volume at that speed. Tuning Coach's pacing and training it to meter its communication for human-grade consumption made a huge difference in how approachable the whole experience is.
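Mechanically, metering can be as simple as yielding the reply in small chunks with a short pause between them. This is a simplified sketch of the idea, not Coach's actual streaming code; the chunk size and delay are illustrative:

```python
import time
from typing import Iterator

def meter(text: str, chunk_size: int = 80, delay: float = 0.03) -> Iterator[str]:
    """Yield text in small chunks with a short pause between them,
    so a streamed reply arrives at a readable pace instead of all at once."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]
        time.sleep(delay)
```

The same knob that makes output readable also gives the UI a natural place to interleave contextual quick-reply buttons once a thought completes.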
Operations and communications professional based in Peoria, IL. I've worked in product and operations teams at organizations ranging from SaaS startups to healthcare. I think clearly about systems, communicate effectively, and build things that work for real people.
Shortlist is the clearest demonstration I have of how I actually work.
A running record of what got built, what broke, and what was learned. Most recent first.