
Company research in 60 seconds with AI agents
Before every sales meeting, somebody has to research the prospect. At the companies I have worked with, that means opening five browser tabs: the KvK company page, a Google News search, LinkedIn for the directors, the company website, and sometimes the aanbestedingen.nl database if the prospect operates in the public sector. By the time you have gathered all of that and turned it into something you can use in a conversation, thirty to forty minutes are gone. Sometimes more.
The problem gets worse across a team. Senior reps build research instincts over years and know which signals matter. Junior reps do not always know how to read an SBI code or understand what a company's legal form says about its structure. The result is inconsistent output: one rep walks into a meeting with a well-prepared briefing, another walks in with a few lines scraped from the About page. Both represent the same company to the same prospect.
The Dutch market has a specific version of this problem. The KvK Handelsregister contains reliable, structured data for virtually every Dutch business: sector codes, employee counts, legal form, registration date, trade names. That data exists and it is accessible via a public API. But getting it out, combining it with current news, and converting it into something a rep can act on in a first meeting is still a manual process that most teams solve with copy-paste and experience.
What We Built
The KvK Intelligence Agent takes a Dutch company name as input and returns a structured one-page PDF brief in under 60 seconds. The pipeline has four steps: look up the company in the KvK Handelsregister API to get sector, SBI code, employee count, and legal form; run parallel Tavily web searches for recent news and general company signals; pass everything to Gemini 2.5 Flash via LangChain for synthesis; then render the result into an A4 PDF using ReportLab. The UI is a single-page Streamlit app.
Gemini's output is a structured intelligence object rather than free text, and that object is what ReportLab turns into the downloadable brief. The full stack is the KvK Handelsregister API, the Tavily Python SDK, LangChain with langchain-google-genai, Gemini 2.5 Flash, Streamlit, and ReportLab.
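A minimal sketch of that flow, with the two web searches run in parallel. The function names, the stub return values, and the search queries are all hypothetical stand-ins for illustration; the real agent calls the KvK Handelsregister API, Tavily, and Gemini at these points, and this is not the repository's actual code.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stubs standing in for the real KvK, Tavily, and Gemini calls.
def lookup_kvk(name: str) -> dict:
    return {"name": name, "sbi_code": None, "employees": None, "legal_form": None}


def search_web(query: str) -> list[str]:
    return [f"result for: {query}"]


def synthesise(profile: dict, news: list[str], signals: list[str]) -> dict:
    return {"profile": profile, "news": news, "signals": signals}


def run_pipeline(name: str, lookup=lookup_kvk, search=search_web, synth=synthesise) -> dict:
    # Step 1: structured registry data for the company.
    profile = lookup(name)

    # Step 2: two independent searches, run concurrently.
    # The query strings are assumptions, not the agent's real queries.
    queries = [f"{name} nieuws", f"{name} bedrijf"]
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        # pool.map yields results in the same order as the queries.
        news, signals = pool.map(search, queries)

    # Step 3: synthesis. Step 4 (PDF rendering) would consume this result.
    return synth(profile, news, signals)
```

Threads suit this step because the searches are network-bound: the total wait is roughly the slower of the two calls rather than their sum.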
The Interesting Part: Structured Output with Pydantic
The most important design decision was using LangChain's structured output capability rather than asking the LLM for free-form text and parsing it afterwards. You define a Pydantic model upfront and the LLM is constrained to return data that matches that schema. Every field in a validated response is present and typed. No string splitting, no regex, no guessing whether the model decided to return a list or a comma-separated string this time.
The output schema captures everything a sales rep needs: the official company name, a two-to-three sentence factual overview, a list of key people with their roles, the primary sector, a human-readable size estimate, up to three recent news items, exactly three specific pain points, and exactly three conversation starters a rep can say out loud in a first meeting. Because the output is a typed object, Streamlit renders each field with the right component and ReportLab builds the PDF sections without any defensive parsing. The field descriptions also reach the model directly, so writing pain points as "exactly three specific, actionable pain points this company is facing right now" reinforces the system prompt constraints at the schema level.
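A schema along those lines might look like the sketch below. It assumes Pydantic v2; the class names, field names, and most description strings are illustrative guesses rather than the repository's actual model, and the example values are placeholders, not real company data.

```python
from pydantic import BaseModel, Field, ValidationError


class KeyPerson(BaseModel):
    name: str
    role: str


class CompanyBrief(BaseModel):
    # Field names are hypothetical; descriptions are what the model would see.
    company_name: str = Field(description="Official company name")
    overview: str = Field(description="Two-to-three sentence factual overview")
    key_people: list[KeyPerson] = Field(description="Key people with their roles")
    sector: str = Field(description="Primary sector")
    size_estimate: str = Field(description="Human-readable size estimate")
    recent_news: list[str] = Field(max_length=3, description="Up to three recent news items")
    pain_points: list[str] = Field(
        min_length=3,
        max_length=3,
        description="Exactly three specific, actionable pain points "
        "this company is facing right now",
    )
    conversation_starters: list[str] = Field(
        min_length=3,
        max_length=3,
        description="Exactly three conversation starters for a first meeting",
    )


# Placeholder data for illustration only.
EXAMPLE = {
    "company_name": "Voorbeeld Logistiek B.V.",
    "overview": "Placeholder overview.",
    "key_people": [],
    "sector": "Logistics",
    "size_estimate": "Unknown",
    "recent_news": [],
    "pain_points": ["placeholder one", "placeholder two", "placeholder three"],
    "conversation_starters": ["starter one", "starter two", "starter three"],
}

brief = CompanyBrief(**EXAMPLE)
```

In LangChain, a model like this is typically handed to the chat model's with_structured_output method. The length constraints here are enforced when Pydantic validates the response, so a reply with two pain points raises a ValidationError instead of reaching the UI or the PDF.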
The Hardest Part: Pain Points That Are Actually Useful
The part that took the most iteration was making the pain points mean something. Early runs produced output that was generic to the point of uselessness: "managing costs in a competitive environment," "attracting and retaining talent," "challenges with digital transformation." These apply to every company, in every sector, every year. A sales rep walking into a meeting with that list is no better prepared than if they had done no research at all.
The fix was being very explicit in the system prompt about what "specific" means. The current prompt bans a list of phrases by name, including "managing costs," "digital transformation," and "market credibility," and requires each pain point to reference at least one of: the company's specific sector, a named Dutch regulation such as CSRD, AVG, or Wet Open Overheid, or a recent news item from the input data. Each pain point also has to be phrased from the company's own perspective: what are they losing money, time, or sleep over right now, in their sector, in the Netherlands. The result is output like "fuel costs eroding last-mile delivery margins for a Rotterdam logistics firm expanding into Germany" rather than "operational challenges in a competitive landscape." Small change in the prompt, large change in what actually comes out.
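One way to keep those rules maintainable is to generate the prompt fragment from plain lists, as sketched below. The wording of the generated text is an assumption, not the agent's actual prompt, and the find_banned helper is an extra post-hoc check added here for illustration; the source only describes the prompt-level ban.

```python
# Phrases and regulations named in the text above; the real lists may be longer.
BANNED_PHRASES = ["managing costs", "digital transformation", "market credibility"]
REGULATIONS = ["CSRD", "AVG", "Wet Open Overheid"]


def build_pain_point_rules(banned=BANNED_PHRASES, regulations=REGULATIONS) -> str:
    """Build the pain-point section of a system prompt (hypothetical wording)."""
    banned_list = ", ".join(f'"{phrase}"' for phrase in banned)
    regulation_list = ", ".join(regulations)
    return (
        f"Never use these phrases: {banned_list}.\n"
        "Each pain point must reference at least one of: the company's specific "
        f"sector, a named Dutch regulation such as {regulation_list}, "
        "or a recent news item from the input data.\n"
        "Phrase each pain point from the company's own perspective."
    )


def find_banned(pain_points: list[str], banned=BANNED_PHRASES) -> list[str]:
    """Return the pain points that contain a banned phrase, ignoring case."""
    return [
        point
        for point in pain_points
        if any(phrase.lower() in point.lower() for phrase in banned)
    ]
```

A check like find_banned could be used to flag or retry a response that slips past the prompt, though a substring match only catches the exact phrases and not paraphrases of them.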
Results
The pipeline works best for established Dutch companies with a meaningful web presence and a clean KvK entry. Larger organisations, companies with recent press coverage, and businesses in well-documented sectors like logistics, healthcare IT, and professional services produce the sharpest output. For smaller companies with minimal online activity, Tavily returns less signal and the pain points fall back on sector knowledge rather than company-specific data. KvK employee counts are often missing or out of date, and the key people section only populates when directors appear in public sources. LLM hallucination is a real risk in both of those fields, and the report should be treated as a starting point for preparation, not a verified source.
Try It Yourself
The code is on GitHub at github.com/LetsDevItUP/kvk-intelligence-agent. A live demo is running at kvk-intelligence-agent.streamlit.app. To run it locally: clone the repo, create a virtual environment, install from requirements.txt, add your three API keys for Gemini, KvK, and Tavily to a .env file, and run the app. The setup takes under five minutes.
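Before launching the app, it can help to confirm that all three keys are actually set. The sketch below does that with the standard library only. The variable names are assumptions (check the repository's README or example .env for the real ones), and loading the .env file into the environment is left to whatever mechanism the repository uses.

```python
import os

# Assumed variable names for the Gemini, KvK, and Tavily keys.
REQUIRED_KEYS = ("GOOGLE_API_KEY", "KVK_API_KEY", "TAVILY_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty in the environment."""
    env = os.environ if env is None else env
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys found.")
```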
What's Next
The most useful addition would be pulling in the Rijksoverheid aanbestedingen database, so the agent can surface recent public procurement activity for prospects that operate in the public sector. A company comparison mode, running two prospects side by side, would be useful for competitive deal situations. There is also a natural integration path toward CRM tools: writing the structured output directly into a Salesforce or HubSpot record so the briefing follows the prospect through the pipeline. None of that is built yet. For now, the agent does one thing and does it consistently.
Built at dev-UP AI Lab, a Dutch AI consultancy. Feedback welcome.