Project: Action on Sent
Writeup of 'Action on Sent' - a project automating actions every time an email is sent
Press send. Do the associated task. Send another email. Do the associated task... And often the task is defined directly in the sent email 🤦‍♂️
One busy director, in need of more hours in the day, was tired of this loop. They wanted to save the time and mental energy drained by doing follow-up tasks after emails were sent. It felt inefficient to do this menial grunt work again and again.
With the unstructured & varying data in emails, this couldn't be done with regular automations. So we decided to create an AI Agent system that could relieve him of one of these mundane tasks, and then expand to more workflows later.
Description
The system serves as a bridge between email communications and CRM management, automating the process of capturing sales opportunities in the CRM.
It addresses the common business challenge of ensuring that potential sales leads from email communications are properly tracked in a CRM system without requiring manual data entry.
It works in the following way:
- The AI Orchestrator is pinged when an email webhook triggers.
- The Analysis AI analyses the text and decides whether it's a sales deal.
- The CRM AI looks for duplicates in the CRM and creates necessary deal and objects.
- The Logging AI logs the outcome of the other AIs for transparency & future improvements.
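To make the flow concrete, here is a minimal sketch of how such an orchestrator could chain the three agents. The function names, the stubbed agent logic and the 0.7 confidence threshold are my own illustrative assumptions, not the actual implementation.

```python
# Minimal orchestrator sketch - agent internals are stubbed out; names and the
# 0.7 confidence threshold are illustrative assumptions, not the real implementation.
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnalysisResult:
    is_deal: bool
    confidence: float
    summary: str


def analyse_email(email: dict) -> AnalysisResult:
    """Analysis Agent: decide whether the sent email describes a sales deal (stub)."""
    body = email.get("body", "").lower()
    looks_like_offer = any(word in body for word in ("price", "offer", "quote"))
    return AnalysisResult(is_deal=looks_like_offer,
                          confidence=0.9 if looks_like_offer else 0.1,
                          summary=body[:200])


def sync_to_crm(email: dict, analysis: AnalysisResult) -> Optional[str]:
    """CRM Agent: check for duplicates and create contact/org/deal as needed (stub)."""
    return f"deal created for {email.get('to')}"


def log_run(email: dict, analysis: AnalysisResult, crm_outcome: Optional[str]) -> None:
    """Logging Agent: persist what happened so it can be shown in the UI (stub)."""
    print({"to": email.get("to"), "is_deal": analysis.is_deal, "crm": crm_outcome})


def handle_sent_email(email: dict) -> None:
    """Orchestrator: called whenever the email webhook fires for a sent email."""
    analysis = analyse_email(email)
    crm_outcome = None
    if analysis.is_deal and analysis.confidence >= 0.7:  # assumed threshold
        crm_outcome = sync_to_crm(email, analysis)
    log_run(email, analysis, crm_outcome)


if __name__ == "__main__":
    handle_sent_email({"to": "hn@moc.dk", "body": "The price of 2 cameras will be EUR 15.000."})
```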
[Diagram - Email-to-CRM Automation Flow: intelligent email processing with AI orchestration. Email Webhook (incoming customer email) → AI Orchestrator (coordinates the agent workflow) → Analysis Agent (analyses email content for sales opportunities, returns a deal confidence score) → CRM Agent (creates the deal in the CRM if it doesn't exist) → Logging Agent (monitors activity, creates & stores the activity log) → UI Update (shows all activity in the UI for transparency).]
Why use an AI Agent flow for this instead of Zapier/Make?
The email analyser makes text analysis hassle-free. Deciding whether something is a potential deal = easy.
Below is an example of emails that are deals versus those that aren't.
The AI Agent can extract context from the entire email chain, fill out fields as needed, and leave values empty when nothing is found.
Having worked with CRM automation before, I know it is a hassle to deal with missing fields, missing context from long conversations, etc. An automation that creates objects incorrectly and has to be corrected afterwards defeats its whole purpose.
An AI can do this job much better.
FROM: hn@moc.dk [annotation: can't extract name & org from email address]
TO: bob@besafe.dk

Hi Bob.
I like what you have done to your office.
We need to get an upgrade on the security system in building 2. Can you send over a price?
I'd like to get this done before next week. [annotation: potential deadline]
Best regards,
Hans Nielsen
The Municipality of Copenhagen
+45 23 42 23 42

FROM: bob@besafe.dk
TO: hn@moc.dk

Hi Hans.
It was great seeing you yesterday at the conference.
The price of 2 cameras, wiring and the security box will be €15.000. [annotation: text indicating an offer]
Confirm this email and we'll get this sorted. [annotation: potential next steps]
Best regards,
Bob
Bob sent over a price of 2 cameras, wiring & security box for €15.000, as requested by Hans, who wants this done before next week, after they saw each other at the conference the day before.
The AI analyses the sent email and understands that it is an offer. It also extracts Hans's full name and organisation from the email body, as it decides his email address doesn't give the right info. It extracts a short summary of the full email chain, and it extracts that the next step is for Hans to get back.
This can easily be extracted into some structured output, which is then easy to use for checking duplicates with both strict and fuzzy logic.
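A minimal sketch of what that structured output and extraction call could look like, assuming Pydantic for the schema and OpenRouter's OpenAI-compatible endpoint for the call. The field names, model id and prompt are illustrative assumptions, not the project's actual schema, and a real implementation would need to handle the model occasionally returning invalid JSON.

```python
# Sketch: extract structured deal data from an email chain (schema and prompt are assumptions).
from typing import Optional
from openai import OpenAI          # pip install openai
from pydantic import BaseModel     # pip install pydantic


class DealExtraction(BaseModel):
    is_deal: bool
    confidence: float                  # 0-1 deal confidence score
    contact_name: Optional[str]        # e.g. "Hans Nielsen"
    organisation: Optional[str]        # e.g. "The Municipality of Copenhagen"
    deal_value_eur: Optional[float]    # e.g. 15000
    summary: str                       # short summary of the whole chain
    next_step: Optional[str]           # e.g. "Hans confirms the offer"
    deadline: Optional[str]            # e.g. "before next week"


# OpenRouter exposes an OpenAI-compatible API, so the standard client works with a custom base_url.
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_OPENROUTER_KEY")


def extract_deal(email_chain: str) -> DealExtraction:
    response = client.chat.completions.create(
        model="openai/gpt-4o-mini",  # any OpenRouter model id
        messages=[
            {"role": "system", "content": "Extract sales deal data from the email chain. "
                                          "Reply only with JSON matching this schema: "
                                          + str(DealExtraction.model_json_schema())},
            {"role": "user", "content": email_chain},
        ],
    )
    return DealExtraction.model_validate_json(response.choices[0].message.content)
```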
The CRM agent then sees that the contact email doesn't exist, so it creates the contact. The organisation does exist though, so it connects the new contact with the organisation. The contact doesn't have any pre-existing deals, so a new one is created with the right name and organisation, an email-chain summary in the notes, and a follow-up task for Bob to follow up in 2 days, as Hans wants to get this done before next week.
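Roughly how the duplicate check could work: a sketch with strict matching on the contact email and fuzzy matching on the organisation name (using difflib's similarity ratio). The in-memory "CRM" below stands in for the real Pipedrive calls and is purely illustrative.

```python
# Sketch: strict + fuzzy duplicate checks before creating CRM objects (illustrative only).
from difflib import SequenceMatcher

existing_contacts = [{"email": "other@moc.dk", "name": "Other Person"}]
existing_orgs = [{"name": "Municipality of Copenhagen"}]


def find_contact(email: str):
    """Strict match: the contact email either exists or it doesn't."""
    return next((c for c in existing_contacts if c["email"].lower() == email.lower()), None)


def find_org(name: str, threshold: float = 0.8):
    """Fuzzy match: organisation names rarely match exactly ('The Municipality of Copenhagen'
    vs 'Municipality of Copenhagen'), so use a similarity ratio instead."""
    best, best_score = None, 0.0
    for org in existing_orgs:
        score = SequenceMatcher(None, name.lower(), org["name"].lower()).ratio()
        if score > best_score:
            best, best_score = org, score
    return best if best_score >= threshold else None


def upsert(contact_email: str, contact_name: str, org_name: str) -> dict:
    contact = find_contact(contact_email)
    if contact is None:                              # contact is new -> create it
        contact = {"email": contact_email, "name": contact_name}
        existing_contacts.append(contact)
    org = find_org(org_name)
    if org is None:                                  # org is new -> create it
        org = {"name": org_name}
        existing_orgs.append(org)
    return {"contact": contact, "organisation": org}


print(upsert("hn@moc.dk", "Hans Nielsen", "The Municipality of Copenhagen"))
```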
[Interactive demo: Email Chain → JSON → CRM Pipeline - AI agents processing email communications into structured CRM data.]
This system could have been done with just 1 agent, but giving each agent different context increases the likelihood of correct outcomes. It might have been overkill for this project, but we wanted to make the project future-proof and ready to automate more workflows from emails in the future.
The rest of this write-up is for different audiences... so click below if you:
- want to read more about the technical considerations
- are curious about my learnings (mostly related to how to work with LLMs)
- think that there must be some bigger potential with this application?
Tech stack
You can skim an accurate & extensive, though not particularly interesting or filtered, description of all the technical aspects of the project in this deepwiki here.




My own aim with this project was to learn how to build a production-ready AI Agent system using LLMs. There are a few steps in that:
- Figuring out the system architecture
- Figuring out what deployment providers to use
- Deciding where on the scale from locally-ready to enterprise-ready the system should be
- Figuring out what LLM to use
Best practices for working with LLMs state that one should discuss with multiple LLMs before generating a final specification doc, which can then be used with an execution agent like Claude Code, Gemini CLI or Cursor.
When doing that, it's important to remind the LLM to consider the scale of your application and what your learning goals actually are. It will often suggest very scalable systems, which might be overkill for the beginning of a project.
- Frontend: Next.js, TypeScript, Tailwind CSS, Shadcn/ui components (deployed on Vercel)
- Backend: Python FastAPI application (deployed on Railway)
- Database & Auth: Supabase
- Integrations: Pipedrive CRM, Microsoft Outlook
- AI: LLM API for sales opportunity detection, using OpenRouter
- Deployment: Vercel (frontend) + Railway (backend) through Dockerfiles
- Triggers: Webhook-driven when user sends emails
- Real-time: Supabase subscriptions for live dashboard updates
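As a rough illustration of the trigger side, here is a minimal FastAPI webhook endpoint that hands the sent email off to the agent pipeline as a background task. The route path, payload fields and `run_agents` function are assumptions for the sketch, not the actual backend.

```python
# Sketch: webhook-driven trigger in FastAPI (route, payload and pipeline call are assumptions).
from fastapi import BackgroundTasks, FastAPI
from pydantic import BaseModel

app = FastAPI()


class SentEmail(BaseModel):
    message_id: str
    to: str
    subject: str
    body: str


def run_agents(email: SentEmail) -> None:
    """Placeholder for the orchestrator -> analysis -> CRM -> logging pipeline."""
    print(f"processing {email.message_id} sent to {email.to}")


@app.post("/webhooks/email-sent")
async def email_sent(email: SentEmail, background: BackgroundTasks):
    # Acknowledge the webhook immediately and do the slow LLM/CRM work in the background,
    # so the email provider doesn't retry because of a timeout.
    background.add_task(run_agents, email)
    return {"status": "accepted"}
```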
System Architecture
[Diagram: Data Flow Sequence - step-by-step flow diagram.]
So, as you can read, it was an attempt not to overcomplicate things before necessary - but also to have it production-ready for a client.
v2 tech stack
- Frontend: Next.js, TypeScript, Tailwind CSS, Shadcn/ui components (deployed on Vercel)
- Backend: Python FastAPI application (deployed on Railway using Docker)
- Database & Auth: Supabase
- Integrations: MCP servers
- AI: LLM API for sales opportunity detection; OpenRouter as gateway (handling cost & rate limits), LangChain for functions
- Monitoring: OpenTelemetry (logs, metrics & traces) on Honeycomb, LangSmith for AI observability, Error tracking (Sentry).
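The v2 AI layer isn't implemented yet (see the list below), but the rough idea is something like this sketch: LangChain's structured output on top of OpenRouter's OpenAI-compatible endpoint. The model id and schema are placeholders of my own, not the project's code.

```python
# Sketch of the planned v2 AI layer: LangChain structured output through OpenRouter (placeholders only).
from langchain_openai import ChatOpenAI    # pip install langchain-openai
from pydantic import BaseModel


class DealCheck(BaseModel):
    is_deal: bool
    confidence: float
    summary: str


llm = ChatOpenAI(
    model="openai/gpt-4o-mini",                  # any OpenRouter model id
    base_url="https://openrouter.ai/api/v1",     # OpenRouter as the gateway
    api_key="YOUR_OPENROUTER_KEY",
)

# LangChain handles the function/tool-calling plumbing and returns the Pydantic object directly.
deal_checker = llm.with_structured_output(DealCheck)
result = deal_checker.invoke(
    "Hi Hans. The price of 2 cameras, wiring and the security box will be EUR 15.000."
)
print(result)
```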
What I have yet to implement
- Proper tests
- Implement LangChain instead of OpenRouter functions
- Rewrite monitoring aspect to fix Honeycomb & LangSmith
- Install Sentry & set up alert system
Learnings
First attempt: Let Cursor freely vibecode an enterprise grade application
Main error: Believing Cursor could handle a multi-tenant full-stack application with third-party integrations, microservices, dev/stage/prod environments & databases with little supervision.
I wanted to learn to develop production-ready systems, so without much prior experience I ambitiously journeyed out to try and create an application with multiple development environments, different databases, third-party app registrations for different environments, and all of it, of course, deployed with Dockerfiles.
I got carried away by some vibecoding influencers and thought that if I just described the project thoroughly enough, Cursor could implement it.
I spent a long time discussing different technical architectural choices, infrastructure providers, etc. That process was actually really valuable. I ended up with a tech stack similar to the one I actually used in my third attempt, though with everything hosted on Railway, but with too little oversight of how the different development environments would be managed.
I started by just letting Cursor go and try to implement. When we got to the basic scaffolding (which was waaay too much in one go) and I had to deploy it, I got into a hell trying to debug why the deployments were crashing.
The LLM too often got confused about env keys for the different environments, what database to spin up through the Dockerfile, and where I was storing the variables. Every fix Cursor tried to make just left both of us even more confused about what to do. It didn't feel like it really understood the connection between the codebase and the infrastructure where we wanted to deploy it.
So that was when I thought:
The newly launched Gemini CLI with its huge free token tier must be able to make sense of this. Especially if I do it all with pure Google infrastructure! What could go wrong??
Sidenote: I want to try again with full-stack deployment on Railway with multiple services & PostgreSQL. But I want to deploy incrementally.
Second attempt: Lost trust in Gemini CLI in one day
Main error: Thinking I could handle the same as before if I ran a full Google Cloud Platform infrastructure, with Gemini CLI implementing the code and also setting up the infrastructure through terminal commands.
I had previously worked with Firebase and my previous startup was running on GCP, so I felt confident that I could get the infrastructure to work there. With my newly earned learnings about the differences between development environments, I tried carefully instructing Gemini CLI to be aware of the different environments.
HackerNews was overflowing with experienced developers using Claude Code with great results, though expensive. So when the free Gemini CLI came out I got eager to try it and ready to see its magic.
You can probably read from the text where this is going. I think it actually did a lot of things right, and it probably could have worked in GCP. However, within my first day of testing it, I encountered waaay too many cases where Gemini tried fixing an error by editing a code file, but the file wasn't actually edited - so when the edit (that didn't happen) didn't work (surprise), it went down a rabbit hole editing other, correct files to make the code work. Failing at such a basic capability made me distrust Gemini immediately.
That's probably a good product learning. Release quickly, but make damn sure you don't lose the trust of your customers.
Third attempt: Careful step by step development
It seems obvious in hindsight that careful oversight was needed, but hey, we all have to learn it the hard way.
I actually wanted to try to run as many services as possible as Vercel functions in my third attempt (I have always been an admirer of serverless architecture). But I quickly came to dislike how the Python code ran in those functions and the complexity of developing locally first before testing in production.
So the frontend ended up being deployed on Vercel and the backend on Railway. For this project I decided to use only a dev & prod environment, 1 Supabase database for both environments, and to keep third-party app registrations the same for dev & prod. Not best practice, I know, but a foundation to start from and improve on.
There is no microservice structure - just one monolithic backend deployed as a single service on Railway to simplify the management of env variables.
What I did
This is where I actually learned why Docker, pre-deployment steps & GitHub branches are great. Also why it's a stupid idea to have an AI generate your .gitignore & .vercelignore files, as that probably cost me many hours.
For this project I also went back to using Cursor. Its interface makes it easy to review code and to set up memories & rules.
And now I have a good, modern tech stack & deployment flow that I am comfortable with and that I can expand & adapt for different projects 🙂
What I got right with this attempt
- Outline and continuously refine a step-by-step plan with testable phases for Cursor to follow (TASKS.md). Explicitly instruct the LLM to create tasks that make things work in development first, then in production, before advancing to the next phase.
- When introducing new libraries, endpoints or bigger changes to the codebase, get a basic setup up, tested & deployed first. Then add the necessary logic.
- I used a more careful development flow:
- I created new branches when working on a new chunk of the project
- I followed the chat window more when the agent was writing code
- I told the AI to test its own implementation with small commands (preferably first) and then scripts.
- I always went through the pre-deployment steps of building the Dockerfiles before merging back into the main branch.
My workflow
At the start of the project I had a PROJECT_OVERVIEW.md file and a TASKS.md file. These gave the overall specifications and guidelines for the project. The task file is not a complete step-by-step file, but gives overall directions.
Then, when starting on a new area or feature, I'd often open Claude & Google AI Studio and ask for ways a specific outcome could be reached. This context was then applied to a prompt for Cursor to refine the next phase's tasks to do it in a certain way.
Using that workflow, I ensured Cursor always knew the overall direction, and when getting to a new phase we would inject more specific guidance into the task list.
Complications
Cursor is great at just doing. The prompt running their system is designed for that. This works great when you don't have any real requirements for the feature and just want an outcome.
But when you have requirements, you really need to steer it, or it just finds its own way. One way LLM coders get around this is to use the initial specification doc and generate a task list from it that gets continuously updated.
Initially my workflow was discussing how a feature should work with Claude in a Claude Project, where I had instructed it to always ask clarifying questions, and then having it generate code that I copied into my project. This was much more reliable than Cursor's code (in free mode), but copying/pasting became too tiring (I did learn how to quickly navigate around VS Code with shortcuts and find functions/files quickly, though).
Cursor is also great at fixing bugs quickly. But not perfect. It can get quite stubborn that something specific is the error and then just tirelessly work on that. I have seen it in loops where it continuously blames build caching, relative variable paths, wrong environment variables, etc.
Enabling yolo mode has also made my workflow quicker; however, I quickly learned some rules I need to set to prevent certain actions - like never committing to git or updating TASKS.md without me explicitly asking for it. Yolo mode is not recommended by Claude, but I see more and more experienced developers on HackerNews using it in their workflows.
I prefer to always follow what it is doing, but sometimes I get too trusting and start working on other things or make my lunch while it is flying. Most times with good results, but sometimes I also wish I had been there to stop it.
Quite a few times I have directed it towards easier solutions than the overly complicated path it had taken. I have also prevented it from introducing hardcoded variables and sample data.
Also be aware of the AI using random library versions. I'd push it to mostly use the latest version instead of pinning one. I've also seen it suddenly downgrade libraries to fix a bug where the bug could also be fixed in simpler and less risky ways.
Don't be afraid of stopping the agent mid-execution or writing "wait" in the chat to stop it.
Other small ideas and common best practices
- Implement a design system as one of the first things.
- Start a new chat often. If a feature evolves into something new, ask the AI to summarise the context.
- Keep a WORKLOG.md file where you note down your own learnings and reminders for things you have skipped now but want to refine/add later.
- Refine your project_overview.md & tasks.md files as you go along in the project.
- Continuously refine your cursorrules. If you see they are not being used, then they are probably too long. Maybe one should divide them up to be more area-specific in the future (one for the frontend, one for the database, one for API design, etc.). I found that some rules in the cursor directory might be too long.
- Remember that the AI is biased towards agreeing with you. With that in mind, be very aware of how you write your questions. I have caught myself many times at the end of writing a question laughing at how obvious it is that I really hope for a specific answer.
- Instead of: Is Firebase better for user authentication in our application than Supabase?
- Use neutral questions like: Evaluate the pros & cons of using Firebase for user authentication vs Supabase.
- This is the first project where I added library documentation directly into Cursor, which seemed to work excellently.
How to learn from AI-assisted development
- Write long-ass blog posts for yourself that no one will ever get to the end of; it feels like the learnings have manifested more thoroughly after this thought process.
- Ask the AI to explain step by step what happens in the code
- Ask whether there are other ways to write the same code (with pros & cons). Asking what the modern way is vs the tested and reliable old way also gives interesting info.
- Ask the AI to generate flow charts or sequence diagrams of how the solution or a specific feature looks versus other ways it could look.
- Ask one AI to criticise another AIâs code.
- Providing error logs to the AI and asking what they mean, instead of just letting it fix things immediately, has accelerated my troubleshooting abilities.
... I'm not doing this as often as I think would be good for me at the moment. I get obsessed and advance too quickly sometimes instead of taking time to learn. Maybe that is the best way to keep momentum, but I'm definitely missing out on some ongoing learnings.
Other learnings
- Running Python code directly as notebooks within VS Code/Cursor is a treat. It makes it so easy to test code step by step. It was also essential for this project for refining the AI prompts and testing CRM duplicate handling quickly. When the code was done in the Python notebook test file, the AI integrated that flow into the codebase without any errors.
- I learned to use Ngrok to test webhooks in local development.
- Learned to work in dev & prod environments. Having a deployment workflow for frontend and backend really helped me understand separation of concerns.
- Getting the AI to optimise the Docker build early was great for quicker production iterations.
What I need to learn next
- I want to integrate monitoring from the beginning (maybe OpenTelemetry, Honeycomb, Sentry)
- Work with 2+ dev environments.
- Work with a database for each environment.
- Implementing automated database backups & recovery.
- Caching
- 100% Infrastructure as Code.
- How & when to intelligently introduce tests to my development workflow
- Working with Claude Code
A major thing I have learned from this project is that, even as a rookie AI-assisted developer, my intuition is valuable. When I sit with the LLM for hours, guiding it, tweaking it, seeing it run, I get a good idea of when it is going wrong. Even in areas where I'm not an expert.
I can only imagine how powerful this tool is when one is an expert in the given areas. Then one can steer it in the right direction much better and stop it early when it's going down a wrong path.
To me, coding with LLMs is like having a very eager friend who really just wants to code. No matter how minimal the instructions, it just wants to code. And I get to sit as the manager, acting both as a systems architect making the overall technical decisions and as a product manager deciding what is important to build next and responsible for the system working in production. But because my friend is so eager to code and so fast, it actually makes sense just to sit and observe what he is doing, while reflecting on ways to improve his output, discussing future technical decisions with other LLMs, creating design components on the side in v0, etc.
Sidenote 1: I've found in my development process that I started talking to the AI as "we" instead of you/I when prompting it. I noticed while writing that I kept writing we. I apparently consider my LLMs and me a team, so I decided to keep it in. Just so you know, I'm not referring to any other people. It's also a bit scary. This is how I refer to it after playing around with it for 2 months - let's see how much it's a part of me after 2 years x(
Sidenote 2: While I'm trying to do as much as possible with AI, I still believe making these writeups myself is valuable for me and makes the content more valuable for others. All images & demos are AI-generated in v0 though.
Potential
TL;DR
- Read sent emails → take action. Actions could be much more than just CRM actions. Hook the trigger up to any MCP and let it fly.
- From reading sent emails, to reading emails as they are being written, to reading everything the user writes in the browser, to reading everything the user writes on their Mac.
- Go from just doing things automatically, to proposing actions for user acceptance, to allowing the user to say "always yes".
There are many ways this project can expand.
Integrate many more workflows
I envision a system where the user just toggles what tools they use for different purposes (Notion for documentation, Zendesk for customer support, Todoist for tasks) and then the system can automate everything that follows an email.
- Update item in Trello based on email
- Add this to "Team Documentation" in Notion
- Open a customer support ticket for this in Zendesk
- Send this to our accounting team on Teams
- etc, etc.
With the rise of MCPs and the OAuth system in place, integrations can roll out quickly. Defining the workflows and fine-tuning the prompts is what will make this work.
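One way the toggle idea could be modelled: a small sketch where enabled tools map to action types and the dispatcher only routes to workflows the user has switched on. The tool names, handlers and routing table are purely illustrative; in a real v2 these handlers would be MCP calls.

```python
# Sketch: user-toggled workflows routing detected actions to tools (illustrative mapping).

# What the user has toggled on in the UI (assumption: stored per user in the database).
enabled_tools = {"notion": True, "zendesk": True, "todoist": False, "trello": False}

def add_to_notion(payload: dict) -> str: return f"Added to 'Team Documentation': {payload['title']}"
def open_zendesk_ticket(payload: dict) -> str: return f"Opened ticket: {payload['title']}"
def add_todoist_task(payload: dict) -> str: return f"Created task: {payload['title']}"

# Which detected action type goes to which tool integration.
routes = {
    "documentation": ("notion", add_to_notion),
    "support_ticket": ("zendesk", open_zendesk_ticket),
    "task": ("todoist", add_todoist_task),
}

def dispatch(action_type: str, payload: dict) -> str:
    tool, handler = routes[action_type]
    if not enabled_tools.get(tool):
        return f"skipped: {tool} is not enabled for this user"
    return handler(payload)

print(dispatch("documentation", {"title": "Security system upgrade notes"}))
print(dispatch("task", {"title": "Follow up with Hans"}))
```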
Also trigger on emails in inbox
Many project managers (myself included, back at Microsoft) spend a lot of their time receiving communication, transforming it somehow, and then passing it on to a colleague or entering it into a system. This is also an ideal workflow for AI Agents.
Feedback integrated
When the AI detects a potential action from an email, a notification should pop up with the following options:
- Accept action
- Always accept this type of action
- Reject action
When the action is done, the user should be able to review the workflow and integrate that feedback into the LLM's memory.
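A sketch of how those three options could be modelled, including how "always accept" could be persisted as a per-user preference so the prompt is skipped next time. The enum values and the in-memory preference store are assumptions for illustration.

```python
# Sketch: feedback options on a proposed action (enum values and storage are assumptions).
from enum import Enum

class Feedback(Enum):
    ACCEPT = "accept"
    ALWAYS_ACCEPT = "always_accept"
    REJECT = "reject"

# Per-user preferences: action types the user has said "always yes" to.
auto_accept: dict[str, set[str]] = {}

def handle_feedback(user_id: str, action_type: str, feedback: Feedback) -> bool:
    """Return True if the proposed action should be executed."""
    if feedback is Feedback.ALWAYS_ACCEPT:
        auto_accept.setdefault(user_id, set()).add(action_type)   # remember for next time
        return True
    return feedback is Feedback.ACCEPT

def should_skip_prompt(user_id: str, action_type: str) -> bool:
    """Future actions of this type skip the notification entirely."""
    return action_type in auto_accept.get(user_id, set())

print(handle_feedback("bob", "crm_deal", Feedback.ALWAYS_ACCEPT))  # True, and remembered
print(should_skip_prompt("bob", "crm_deal"))                       # True
```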
Quicker workflows
I envision a natural evolution of this system. Today, the AI Orchestrator decides what action to take based on the email sent. Next steps could be:
- The AI analyses your email as you are writing it and proposes the task as you are finishing the email.
- The AI analyses any text you write in your browser or on your computer and proposes tasks.
- The AI sees everything on your screen and proposes tasks.
... All of this has to be done with a sharp focus on privacy, and maybe a locally run or gated, latency-optimised AI that does the analysis quickly, so no external LLM gets access to all the person's info.
A neat feature could be that when the AI proposes a task, it is already being done in the background. So when the user clicks yes (preferably with a keyboard shortcut or through voice), the link to the completed task immediately appears. If the user rejects it, the task gets cleaned up so the user won't notice it.
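A sketch of that optimistic-execution idea: the task runs in the background while the proposal is shown, accepting reveals the already-finished result, and rejecting rolls it back. Everything here (class name, the fake CRM link, the rollback placeholder) is illustrative.

```python
# Sketch: run the proposed task optimistically, reveal on accept, roll back on reject (illustrative).
import threading
from typing import Optional

class ProposedTask:
    def __init__(self, description: str):
        self.description = description
        self.result_link: Optional[str] = None
        self._done = threading.Event()
        # Start the work before the user has answered the notification.
        threading.Thread(target=self._execute, daemon=True).start()

    def _execute(self) -> None:
        self.result_link = "https://crm.example.com/deals/123"   # placeholder for the real work
        self._done.set()

    def accept(self) -> Optional[str]:
        self._done.wait()          # usually already finished by the time the user clicks yes
        return self.result_link

    def reject(self) -> None:
        self._done.wait()
        self.result_link = None    # placeholder for deleting the created objects again

task = ProposedTask("Create deal for Hans Nielsen")
print(task.accept())   # link appears immediately because the work already ran in the background
```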