Project: Action on Sent
Writeup of 'Action on Sent' - a project automating actions every time an email is sent
Press send. Do the associated task. Send another email. Do the associated task... And often the task is defined directly in the sent email 🤦‍♂️
One busy director, in need of more hours in the day, was tired of this loop. They wanted to save the time and mental energy drained by doing follow-up tasks after emails were sent. It felt inefficient to do this menial grunt work again and again.
With the unstructured & varying data in emails, this couldn't be done with regular automations. So we decided to create an AI Agent system that could relieve him of one of these mundane tasks, and then expand to more workflows later.
Description
The system serves as a bridge between email communications and CRM management, automating the process of capturing sales opportunities in the CRM.
It addresses the common business challenge of ensuring that potential sales leads from email communications are properly tracked in a CRM system without requiring manual data entry.
It works in the following way:
- The AI Orchestrator is pinged when an email webhook triggers.
- The Analysis AI analyses the text and decides whether it's a sales deal.
- The CRM AI looks for duplicates in the CRM and creates necessary deal and objects.
- The Logging AI logs the outcome of the other AIs for transparency & future improvements.
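To make the flow concrete, here is a minimal sketch of how such an orchestrator could chain the three agents. The function names, the stubbed agent logic and the 0.7 confidence threshold are my own illustrative assumptions, not the actual implementation.

```python
# Minimal orchestrator sketch - agent internals are stubbed out; names and the
# 0.7 confidence threshold are illustrative assumptions, not the real implementation.
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnalysisResult:
    is_deal: bool
    confidence: float
    summary: str


def analyse_email(email: dict) -> AnalysisResult:
    """Analysis Agent: decide whether the sent email describes a sales deal (stub)."""
    body = email.get("body", "").lower()
    looks_like_offer = any(word in body for word in ("price", "offer", "quote"))
    return AnalysisResult(is_deal=looks_like_offer,
                          confidence=0.9 if looks_like_offer else 0.1,
                          summary=body[:200])


def sync_to_crm(email: dict, analysis: AnalysisResult) -> Optional[str]:
    """CRM Agent: check for duplicates and create contact/org/deal as needed (stub)."""
    return f"deal created for {email.get('to')}"


def log_run(email: dict, analysis: AnalysisResult, crm_outcome: Optional[str]) -> None:
    """Logging Agent: persist what happened so it can be shown in the UI (stub)."""
    print({"to": email.get("to"), "is_deal": analysis.is_deal, "crm": crm_outcome})


def handle_sent_email(email: dict) -> None:
    """Orchestrator: called whenever the email webhook fires for a sent email."""
    analysis = analyse_email(email)
    crm_outcome = None
    if analysis.is_deal and analysis.confidence >= 0.7:  # assumed threshold
        crm_outcome = sync_to_crm(email, analysis)
    log_run(email, analysis, crm_outcome)


if __name__ == "__main__":
    handle_sent_email({"to": "hn@moc.dk", "body": "The price of 2 cameras will be EUR 15.000."})
```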
[Diagram - Email-to-CRM Automation Flow: intelligent email processing with AI orchestration. Email Webhook (incoming customer email) → AI Orchestrator (coordinates the agent workflow) → Analysis Agent (analyses email content for sales opportunities, returns a deal confidence score) → CRM Agent (creates the deal in the CRM if it doesn't exist) → Logging Agent (monitors activity, creates & stores the activity log) → UI Update (shows all activity in the UI for transparency).]
Why use an AI Agent flow for this instead of Zapier/Make?
The email analyser makes text analysis hassle-free. Deciding whether something is a potential deal = easy.
Below is an example of emails that are deals versus those that aren't.
The AI Agent can extract context from the entire email chain, fill out fields as needed, and leave values empty when nothing is found.
Having worked with CRM automation before, I know it is a hassle to deal with missing fields, missing context from long conversations, etc. An automation that creates objects incorrectly and has to be corrected afterwards defeats its whole purpose.
An AI can do this job much better.
FROM: hn@moc.dk [annotation: can't extract name & org from email address]
TO: bob@besafe.dk

Hi Bob.
I like what you have done to your office.
We need to get an upgrade on the security system in building 2. Can you send over a price?
I'd like to get this done before next week. [annotation: potential deadline]
Best regards,
Hans Nielsen
The Municipality of Copenhagen
+45 23 42 23 42

FROM: bob@besafe.dk
TO: hn@moc.dk

Hi Hans.
It was great seeing you yesterday at the conference.
The price of 2 cameras, wiring and the security box will be €15.000. [annotation: text indicating an offer]
Confirm this email and we'll get this sorted. [annotation: potential next steps]
Best regards,
Bob
Bob sent over a price of 2 cameras, wiring & security box for €15.000, as requested by Hans, who wants this done before next week, after they saw each other at the conference the day before.
The AI analyses the sent email and understands that it is an offer. It also extracts Hans's full name and organisation from the email body, as it decides his email address doesn't give the right info. It extracts a short summary of the full email chain, and it extracts that the next step is for Hans to get back.
This can easily be extracted into some structured output, which is then easy to use for checking duplicates with both strict and fuzzy logic.
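A minimal sketch of what that structured output and extraction call could look like, assuming Pydantic for the schema and OpenRouter's OpenAI-compatible endpoint for the call. The field names, model id and prompt are illustrative assumptions, not the project's actual schema, and a real implementation would need to handle the model occasionally returning invalid JSON.

```python
# Sketch: extract structured deal data from an email chain (schema and prompt are assumptions).
from typing import Optional
from openai import OpenAI          # pip install openai
from pydantic import BaseModel     # pip install pydantic


class DealExtraction(BaseModel):
    is_deal: bool
    confidence: float                  # 0-1 deal confidence score
    contact_name: Optional[str]        # e.g. "Hans Nielsen"
    organisation: Optional[str]        # e.g. "The Municipality of Copenhagen"
    deal_value_eur: Optional[float]    # e.g. 15000
    summary: str                       # short summary of the whole chain
    next_step: Optional[str]           # e.g. "Hans confirms the offer"
    deadline: Optional[str]            # e.g. "before next week"


# OpenRouter exposes an OpenAI-compatible API, so the standard client works with a custom base_url.
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_OPENROUTER_KEY")


def extract_deal(email_chain: str) -> DealExtraction:
    response = client.chat.completions.create(
        model="openai/gpt-4o-mini",  # any OpenRouter model id
        messages=[
            {"role": "system", "content": "Extract sales deal data from the email chain. "
                                          "Reply only with JSON matching this schema: "
                                          + str(DealExtraction.model_json_schema())},
            {"role": "user", "content": email_chain},
        ],
    )
    return DealExtraction.model_validate_json(response.choices[0].message.content)
```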
The CRM agent then sees that the contact email doesn't exist, so it creates the contact. The organisation does exist though, so it connects the new contact with the organisation. The contact doesn't have any pre-existing deals, so a new one is created with the right name and organisation, an email-chain summary in the notes, and a follow-up task for Bob to follow up in 2 days, as Hans wants to get this done before next week.
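Roughly how the duplicate check could work: a sketch with strict matching on the contact email and fuzzy matching on the organisation name (using difflib's similarity ratio). The in-memory "CRM" below stands in for the real Pipedrive calls and is purely illustrative.

```python
# Sketch: strict + fuzzy duplicate checks before creating CRM objects (illustrative only).
from difflib import SequenceMatcher

existing_contacts = [{"email": "other@moc.dk", "name": "Other Person"}]
existing_orgs = [{"name": "Municipality of Copenhagen"}]


def find_contact(email: str):
    """Strict match: the contact email either exists or it doesn't."""
    return next((c for c in existing_contacts if c["email"].lower() == email.lower()), None)


def find_org(name: str, threshold: float = 0.8):
    """Fuzzy match: organisation names rarely match exactly ('The Municipality of Copenhagen'
    vs 'Municipality of Copenhagen'), so use a similarity ratio instead."""
    best, best_score = None, 0.0
    for org in existing_orgs:
        score = SequenceMatcher(None, name.lower(), org["name"].lower()).ratio()
        if score > best_score:
            best, best_score = org, score
    return best if best_score >= threshold else None


def upsert(contact_email: str, contact_name: str, org_name: str) -> dict:
    contact = find_contact(contact_email)
    if contact is None:                              # contact is new -> create it
        contact = {"email": contact_email, "name": contact_name}
        existing_contacts.append(contact)
    org = find_org(org_name)
    if org is None:                                  # org is new -> create it
        org = {"name": org_name}
        existing_orgs.append(org)
    return {"contact": contact, "organisation": org}


print(upsert("hn@moc.dk", "Hans Nielsen", "The Municipality of Copenhagen"))
```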
[Interactive demo: Email Chain → JSON → CRM Pipeline - AI agents processing email communications into structured CRM data.]
This system could have been done with just 1 agent, but giving each agent different context increases the likelihood of correct outcomes. It might have been overkill for this project, but we wanted to make the project future-proof and ready to automate more workflows from emails in the future.
The rest of this write-up is for different audiences... so click below if you:
- want to read more about the technical considerations
- are curious about my learnings (mostly related to how to work with LLMs)
- think that there must be some bigger potential with this application?
Tech stack
You can skim an accurate & extensive, though not particularly interesting or filtered, description of all the technical aspects of the project in this deepwiki here.




My own aim with this project was to learn how to build a production-ready AI Agent system using LLMs. There are a few steps in that:
- Figuring out the system architecture
- Figuring out what deployment providers to use
- Deciding where on the scale from locally-ready to enterprise-ready the system should be
- Figuring out what LLM to use
Best practices for working with LLMs state that one should discuss with multiple LLMs before generating a final specification doc, which can then be used with an execution agent like Claude Code, Gemini CLI or Cursor.
When doing that, it's important to remind the LLM to consider the scale of your application and what your learning goals actually are. It will often suggest very scalable systems, which might be overkill for the beginning of a project.
- Frontend: Next.js, TypeScript, Tailwind CSS, Shadcn/ui components (deployed on Vercel)
- Backend: Python FastAPI application (deployed on Railway)
- Database & Auth: Supabase
- Integrations: Pipedrive CRM, Microsoft Outlook
- AI: LLM API for sales opportunity detection, using OpenRouter
- Deployment: Vercel (frontend) + Railway (backend) through Dockerfiles
- Triggers: Webhook-driven when user sends emails
- Real-time: Supabase subscriptions for live dashboard updates
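As a rough illustration of the trigger side, here is a minimal FastAPI webhook endpoint that hands the sent email off to the agent pipeline as a background task. The route path, payload fields and `run_agents` function are assumptions for the sketch, not the actual backend.

```python
# Sketch: webhook-driven trigger in FastAPI (route, payload and pipeline call are assumptions).
from fastapi import BackgroundTasks, FastAPI
from pydantic import BaseModel

app = FastAPI()


class SentEmail(BaseModel):
    message_id: str
    to: str
    subject: str
    body: str


def run_agents(email: SentEmail) -> None:
    """Placeholder for the orchestrator -> analysis -> CRM -> logging pipeline."""
    print(f"processing {email.message_id} sent to {email.to}")


@app.post("/webhooks/email-sent")
async def email_sent(email: SentEmail, background: BackgroundTasks):
    # Acknowledge the webhook immediately and do the slow LLM/CRM work in the background,
    # so the email provider doesn't retry because of a timeout.
    background.add_task(run_agents, email)
    return {"status": "accepted"}
```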
System Architecture
[Diagram: Data Flow Sequence - step-by-step flow diagram.]
So, as you can read, it was an attempt not to overcomplicate things before necessary - but also to have it production-ready for a client.
v2 tech stack
- Frontend: Next.js, TypeScript, Tailwind CSS, Shadcn/ui components (deployed on Vercel)
- Backend: Python FastAPI application (deployed on Railway using Docker)
- Database & Auth: Supabase
- Integrations: MCP servers
- AI: LLM API for sales opportunity detection; OpenRouter as gateway (handling cost & rate limits), LangChain for functions
- Monitoring: OpenTelemetry (logs, metrics & traces) on Honeycomb, LangSmith for AI observability, Error tracking (Sentry).
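The v2 AI layer isn't implemented yet (see the list below), but the rough idea is something like this sketch: LangChain's structured output on top of OpenRouter's OpenAI-compatible endpoint. The model id and schema are placeholders of my own, not the project's code.

```python
# Sketch of the planned v2 AI layer: LangChain structured output through OpenRouter (placeholders only).
from langchain_openai import ChatOpenAI    # pip install langchain-openai
from pydantic import BaseModel


class DealCheck(BaseModel):
    is_deal: bool
    confidence: float
    summary: str


llm = ChatOpenAI(
    model="openai/gpt-4o-mini",                  # any OpenRouter model id
    base_url="https://openrouter.ai/api/v1",     # OpenRouter as the gateway
    api_key="YOUR_OPENROUTER_KEY",
)

# LangChain handles the function/tool-calling plumbing and returns the Pydantic object directly.
deal_checker = llm.with_structured_output(DealCheck)
result = deal_checker.invoke(
    "Hi Hans. The price of 2 cameras, wiring and the security box will be EUR 15.000."
)
print(result)
```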
What I have yet to implement
- Proper tests
- Implement LangChain instead of OpenRouter functions
- Rewrite monitoring aspect to fix Honeycomb & LangSmith
- Install Sentry & set up alert system
Learnings
First attempt: Let Cursor freely vibecode an enterprise grade application
Main error: Believing Cursor could handle a multi-tenant full-stack application with third-party integrations, microservices, dev/stage/prod environments & databases with little supervision.
I wanted to learn to develop production-ready systems, so without much prior experience I ambitiously journeyed out to try and create an application with multiple development environments, different databases, third-party app registrations for different environments, and all of it, of course, deployed with Dockerfiles.
I got carried away by some vibecoding influencers and thought that if I just described the project thoroughly enough, Cursor could implement it.
I spent a long time discussing different technical architectural choices, infrastructure providers, etc. That process was actually really valuable. I ended up with a tech stack similar to the one I actually used in my third attempt, though with everything hosted on Railway, but with too little oversight of how the different development environments would be managed.
I started by just letting Cursor go and try to implement. When we got to the basic scaffolding (which was waaay too much in one go) and I had to deploy it, I got into a hell trying to debug why the deployments were crashing.
The LLM too often got confused about env keys for the different environments, what database to spin up through the Dockerfile, and where I was storing the variables. Every fix Cursor tried to make just left both of us even more confused about what to do. It didn't feel like it really understood the connection between the codebase and the infrastructure where we wanted to deploy it.
So that was when I thought:
The newly launched Gemini CLI with its huge free token tier must be able to make sense of this. Especially if I do it all with pure Google infrastructure! What could go wrong??
Sidenote: I want to try again with full-stack deployment on Railway with multiple services & PostgreSQL. But I want to deploy incrementally.
Second attempt: Lost trust in Gemini CLI in one day
Main error: Thinking I could handle the same as before if I ran a full Google Cloud Platform infrastructure, with Gemini CLI implementing the code and also setting up the infrastructure through terminal commands.
I had previously worked with Firebase and my previous startup was running on GCP, so I felt confident that I could get the infrastructure to work there. With my newly earned learnings about the differences between development environments, I tried carefully instructing Gemini CLI to be aware of the different environments.
HackerNews was overflowing with experienced developers using Claude Code with great results, though expensive. So when the free Gemini CLI came out I got eager to try it and ready to see its magic.
You can probably read from the text where this is going. I think it actually did a lot of things right, and it probably could have worked in GCP. However, within my first day of testing it, I encountered waaay too many cases where Gemini tried fixing an error by editing a code file, but the file wasn't actually edited - so when the edit (that didn't happen) didn't work (surprise), it went down a rabbit hole editing other, correct files to make the code work. Failing at such a basic capability made me distrust Gemini immediately.
That's probably a good product learning. Release quickly, but make damn sure you don't lose the trust of your customers.
Third attempt: Careful step by step development
It seems obvious in hindsight that careful oversight was needed, but hey, we all have to learn it the hard way.
I actually wanted to try to run as many services as possible as Vercel functions in my third attempt (I have always been an admirer of serverless architecture). But I quickly came to dislike how the Python code ran in those functions and the complexity of developing locally first before testing in production.
So the frontend ended up being deployed on Vercel and the backend on Railway. For this project I decided to use only a dev & prod environment, 1 Supabase database for both environments, and to keep third-party app registrations the same for dev & prod. Not best practice, I know, but a foundation to start from and improve on.
There is no microservice structure - just one monolithic backend deployed as a single service on Railway to simplify the management of env variables.
What I did
This is where I actually learned why Docker, pre-deployment steps & GitHub branches are great. Also why it's a stupid idea to have an AI generate your .gitignore & .vercelignore files, as that probably cost me many hours.
For this project I also went back to using Cursor. Its interface makes it easy to review code and to set up memories & rules.
And now I have a good, modern tech stack & deployment flow that I am comfortable with and that I can expand & adapt for different projects 🙂
What I got right with this attempt
- Outline and continuously refine a step-by-step plan with testable phases for Cursor to follow (TASKS.md). Explicitly instruct the LLM to create tasks that make things work in development first, then in production, before advancing to the next phase.
- When introducing new libraries, endpoints or bigger changes to the codebase, get a basic setup up, tested & deployed first. Then add the necessary logic.
- I used a more careful development flow:
- I created new branches when working on a new chunk of the project
- I followed the chat window more when the agent was writing code
- I told the AI to test its own implementation with small commands (preferably first) and then scripts.
- I always went through the pre-deployment steps of building the Dockerfiles before merging back into the main branch.
My workflow
At the start of the project I had a PROJECT_OVERVIEW.md file and a TASKS.md file. These gave the overall specifications and guidelines for the project. The task file is not a complete step-by-step file, but gives overall directions.
Then, when starting on a new area or feature, I'd often open Claude & Google AI Studio and ask for ways a specific outcome could be reached. This context was then applied to a prompt for Cursor to refine the next phase's tasks to do it in a certain way.
Using that workflow, I ensured Cursor always knew the overall direction, and when getting to a new phase we would inject more specific guidance into the task list.
Complications
Cursor is great at just doing. The prompt running their system is designed for that. This works great when you don't have any real requirements for the feature and just want an outcome.
But when you have requirements, you really need to steer it, or it just finds its own way. One way LLM coders get around this is to use the initial specification doc and generate a task list from it that gets continuously updated.
Initially my workflow was discussing how a feature should work with Claude in a Claude Project, where I had instructed it to always ask clarifying questions, and then having it generate code that I copied into my project. This was much more reliable than Cursor's code (in free mode), but copying/pasting became too tiring (I did learn how to quickly navigate around VS Code with shortcuts and find functions/files quickly, though).
Cursor is also great at fixing bugs quickly. But not perfect. It can get quite stubborn that something specific is the error and then just tirelessly work on that. I have seen it in loops where it continuously blames build caching, relative variable paths, wrong environment variables, etc.
Enabling yolo mode has also made my workflow quicker; however, I quickly learned some rules I need to set to prevent certain actions - like never committing to git or updating TASKS.md without me explicitly asking for it. Yolo mode is not recommended by Claude, but I see more and more experienced developers on HackerNews using it in their workflows.
I prefer to always follow what it is doing, but sometimes I get too trusting and start working on other things or make my lunch while it is flying. Most times with good results, but sometimes I also wish I had been there to stop it.
Quite a few times I have directed it towards easier solutions than the overly complicated path it had taken. I have also prevented it from introducing hardcoded variables and sample data.
Also be aware of the AI using random library versions. I'd push it to mostly use the latest version instead of pinning one. I've also seen it suddenly downgrade libraries to fix a bug where the bug could also be fixed in simpler and less risky ways.
Don't be afraid of stopping the agent mid-execution or writing "wait" in the chat to stop it.
Other small ideas and common best practices
- Implement a design system as one of the first things.
- Start a new chat often. If a feature evolves into something new, ask the AI to summarise the context.
- Keep a WORKLOG.md file where you note down your own learnings and reminders for things you have skipped now but want to refine/add later.
- Refine your project_overview.md & tasks.md files as you go along in the project.
- Continuously refine your cursorrules. If you see they are not being used, then they are probably too long. Maybe one should divide them up to be more area-specific in the future (one for the frontend, one for the database, one for API design, etc.). I found that some rules in the cursor directory might be too long.
- Remember that the AI is biased towards agreeing with you. With that in mind, be very aware of how you write your questions. I have caught myself many times at the end of writing a question laughing at how obvious it is that I really hope for a specific answer.
- Instead of: Is Firebase better for user authentication in our application than Supabase?
- Use neutral questions like: Evaluate the pros & cons of using Firebase for user authentication vs Supabase.
- This is the first project where I added library documentation directly into Cursor, which seemed to work excellently.
How to learn from AI-assisted development
- Write long-ass blog posts for yourself that no one will ever get to the end of; it feels like the learnings have manifested more thoroughly after this thought process.
- Ask the AI to explain step by step what happens in the code
- Ask whether there are other ways to write the same code (with pros & cons). Asking what the modern way is vs the tested and reliable old way also gives interesting info.
- Ask the AI to generate flow charts or sequence diagrams of how the solution or a specific feature looks versus other ways it could look.
- Ask one AI to criticise another AIâs code.
- Providing error logs to the AI and asking what they mean, instead of just letting it fix things immediately, has accelerated my troubleshooting abilities.
... I'm not doing this as often as I think would be good for me at the moment. I get obsessed and advance too quickly sometimes instead of taking time to learn. Maybe that is the best way to keep momentum, but I'm definitely missing out on some ongoing learnings.
Other learnings
- Running Python code directly as notebooks within VS Code/Cursor is a treat. It makes it so easy to test code step by step. It was also essential for this project for refining the AI prompts and testing CRM duplicate handling quickly. When the code was done in the Python notebook test file, the AI integrated that flow into the codebase without any errors.
- I learned to use Ngrok to test webhooks in local development.
- Learned to work in dev & prod environments. Having a deployment workflow for frontend and backend really helped me understand separation of concerns.
- Getting the AI to optimise the Docker build early was great for quicker production iterations.
What I need to learn next
- I want to integrate monitoring from the beginning (maybe OpenTelemetry, Honeycomb, Sentry)
- Work with 2+ dev environments.
- Work with a database for each environment.
- Implementing automated database backups & recovery.
- Caching
- 100% Infrastructure as Code.
- How & when to intelligently introduce tests to my development workflow
- Working with Claude Code
A major thing I have learned from this project is that, even as a rookie AI-assisted developer, my intuition is valuable. When I sit with the LLM for hours, guiding it, tweaking it, seeing it run, I get a good idea of when it is going wrong. Even in areas where I'm not an expert.
I can only imagine how powerful this tool is when one is an expert in the given areas. Then one can steer it in the right direction much better and stop it early when it's going down a wrong path.
To me, coding with LLMs is like having a very eager friend who really just wants to code. No matter how minimal the instructions, it just wants to code. And I get to sit as the manager, acting both as a systems architect making the overall technical decisions and as a product manager deciding what is important to build next and responsible for the system working in production. But because my friend is so eager to code and so fast, it actually makes sense just to sit and observe what he is doing, while reflecting on ways to improve his output, discussing future technical decisions with other LLMs, creating design components on the side in v0, etc.
Sidenote 1: I've found in my development process that I started talking to the AI as "we" instead of you/I when prompting it. I noticed while writing that I kept writing we. I apparently consider my LLMs and me a team, so I decided to keep it in. Just so you know, I'm not referring to any other people. It's also a bit scary. This is how I refer to it after playing around with it for 2 months - let's see how much it's a part of me after 2 years x(
Sidenote 2: While I'm trying to do as much as possible with AI, I still believe making these writeups myself is valuable for me and makes the content more valuable for others. All images & demos are AI-generated in v0 though.
Potential
TL;DR
- Read sent emails → take action. Actions could be much more than just CRM actions. Hook the trigger up to any MCP and let it fly.
- From reading sent emails, to reading emails as they are being written, to reading everything the user writes in the browser, to reading everything the user writes on their Mac.
- Go from just doing things automatically, to proposing actions for user acceptance, to allowing the user to say "always yes".
There are many ways this project can expand.
Integrate many more workflows
I envision a system where the user just toggles what tools they use for different purposes (Notion for documentation, Zendesk for customer support, Todoist for tasks) and then the system can automate everything that follows an email.
- Update item in Trello based on email
- Add this to "Team Documentation" in Notion
- Open a customer support ticket for this in Zendesk
- Send this to our accounting team on Teams
- etc, etc.
With the rise of MCPs and the OAuth system in place, integrations can roll out quickly. Defining the workflows and fine-tuning the prompts is what will make this work.
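One way the toggle idea could be modelled: a small sketch where enabled tools map to action types and the dispatcher only routes to workflows the user has switched on. The tool names, handlers and routing table are purely illustrative; in a real v2 these handlers would be MCP calls.

```python
# Sketch: user-toggled workflows routing detected actions to tools (illustrative mapping).

# What the user has toggled on in the UI (assumption: stored per user in the database).
enabled_tools = {"notion": True, "zendesk": True, "todoist": False, "trello": False}

def add_to_notion(payload: dict) -> str: return f"Added to 'Team Documentation': {payload['title']}"
def open_zendesk_ticket(payload: dict) -> str: return f"Opened ticket: {payload['title']}"
def add_todoist_task(payload: dict) -> str: return f"Created task: {payload['title']}"

# Which detected action type goes to which tool integration.
routes = {
    "documentation": ("notion", add_to_notion),
    "support_ticket": ("zendesk", open_zendesk_ticket),
    "task": ("todoist", add_todoist_task),
}

def dispatch(action_type: str, payload: dict) -> str:
    tool, handler = routes[action_type]
    if not enabled_tools.get(tool):
        return f"skipped: {tool} is not enabled for this user"
    return handler(payload)

print(dispatch("documentation", {"title": "Security system upgrade notes"}))
print(dispatch("task", {"title": "Follow up with Hans"}))
```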
Also trigger on emails in inbox
Many project managers (myself included, back at Microsoft) spend a lot of their time receiving communication, transforming it somehow, and then passing it on to a colleague or entering it into a system. This is also an ideal workflow for AI Agents.
Feedback integrated
When the AI detects a potential action from an email, a notification should pop up with the following options:
- Accept action
- Always accept this type of action
- Reject action
When the action is done, the user should be able to review the workflow and integrate that feedback into the LLM's memory.
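A sketch of how those three options could be modelled, including how "always accept" could be persisted as a per-user preference so the prompt is skipped next time. The enum values and the in-memory preference store are assumptions for illustration.

```python
# Sketch: feedback options on a proposed action (enum values and storage are assumptions).
from enum import Enum

class Feedback(Enum):
    ACCEPT = "accept"
    ALWAYS_ACCEPT = "always_accept"
    REJECT = "reject"

# Per-user preferences: action types the user has said "always yes" to.
auto_accept: dict[str, set[str]] = {}

def handle_feedback(user_id: str, action_type: str, feedback: Feedback) -> bool:
    """Return True if the proposed action should be executed."""
    if feedback is Feedback.ALWAYS_ACCEPT:
        auto_accept.setdefault(user_id, set()).add(action_type)   # remember for next time
        return True
    return feedback is Feedback.ACCEPT

def should_skip_prompt(user_id: str, action_type: str) -> bool:
    """Future actions of this type skip the notification entirely."""
    return action_type in auto_accept.get(user_id, set())

print(handle_feedback("bob", "crm_deal", Feedback.ALWAYS_ACCEPT))  # True, and remembered
print(should_skip_prompt("bob", "crm_deal"))                       # True
```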
Quicker workflows
I envision a natural evolution of this system. Today, the AI Orchestrator decides what action to take based on the email sent. Next steps could be:
- The AI analyses your email as you are writing it and proposes the task as you are finishing the email.
- The AI analyses any text you write in your browser or on your computer and proposes tasks.
- The AI sees everything on your screen and proposes tasks.
... All of this has to be done with a sharp focus on privacy, and maybe a locally run or gated, latency-optimised AI that does the analysis quickly, so no external LLM gets access to all the person's info.
A neat feature could be that when the AI proposes a task, it is already being done in the background. So when the user clicks yes (preferably with a keyboard shortcut or through voice), the link to the completed task immediately appears. If the user rejects it, the task gets cleaned up so the user won't notice it.
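A sketch of that optimistic-execution idea: the task runs in the background while the proposal is shown, accepting reveals the already-finished result, and rejecting rolls it back. Everything here (class name, the fake CRM link, the rollback placeholder) is illustrative.

```python
# Sketch: run the proposed task optimistically, reveal on accept, roll back on reject (illustrative).
import threading
from typing import Optional

class ProposedTask:
    def __init__(self, description: str):
        self.description = description
        self.result_link: Optional[str] = None
        self._done = threading.Event()
        # Start the work before the user has answered the notification.
        threading.Thread(target=self._execute, daemon=True).start()

    def _execute(self) -> None:
        self.result_link = "https://crm.example.com/deals/123"   # placeholder for the real work
        self._done.set()

    def accept(self) -> Optional[str]:
        self._done.wait()          # usually already finished by the time the user clicks yes
        return self.result_link

    def reject(self) -> None:
        self._done.wait()
        self.result_link = None    # placeholder for deleting the created objects again

task = ProposedTask("Create deal for Hans Nielsen")
print(task.accept())   # link appears immediately because the work already ran in the background
```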