tl;dr

  • Client: VielfaltMenü, one of Germany’s larger B2G (business-to-government) catering operators, running five regional units serving public-sector canteens. Growth was throttled by churn and a slow, manual tender-discovery process.
  • Platform: VielfaltMenü’s AI platform that surfaces live tenders and predicts upcoming ones before they’re announced, by mining public-sector budgets, council minutes, and procurement memos. LLMs do extraction only. Deterministic, client-configurable logic handles scoring, routing, and prioritization, so the system stays auditable and cheap to retune.
  • Business impact: The platform is rolling out across all five regional units in closed production, with a full audit trail of every LLM call. VielfaltMenü spun it out as a separate IT entity. It was built to be generic enough from day one to extend beyond catering into other B2G sectors.

VielfaltMenü runs a catering operation that most software teams would consider a logistics puzzle, not a sales problem. Hot meals ship within roughly a 50-kilometer radius of each kitchen. Cook-and-chill and frozen meals can travel up to 150 kilometers. Five regional business units across Germany run their own kitchens, delivery fleets, and sales pipelines, each one constrained by capacity and geography.

Inside that operation, growth was stalling. Customers were leaving faster than new ones were coming in, and the sales teams had no way to find new opportunities early enough. In German public-sector catering, “early enough” means something specific.

Reading a published tender faster than competitors is not a real edge. Every competitor sees the notice at the same moment. The edge lives one step upstream, in the budget plans, council minutes, and procurement memos that precede a formal tender by months.

VielfaltMenü came to Pragmatic Coders by referral, with that framing already in mind, and asked for two pieces of software. The first, MealPilot, was about standardizing how their regional managers planned, produced, and costed meals. That is a separate case study. This one is about the second product: an AI platform that turns public tender documents and pre-tender signals into prioritized, routable sales leads.

Challenges

Finding Early-Stage Signals in a National Document Firehose

Public-sector procurement in Germany produces thousands of documents a week, most of them irrelevant to any single vendor. A competitive catering company needs to find the ten that matter, not read the thousand that don’t.

The documents are built for legal completeness, not sales triage. Tender packs can run into hundreds of pages, and some exceed the input context window of the largest LLMs we tested. Pre-tender signals are harder still. A few relevant paragraphs about catering might sit buried inside a thousand-page municipal plan. Finding them means scanning a much wider corpus, with a different set of semantic cues, and classifying intent from language that is often intentionally vague.

Stopping the Extractor From Hallucinating Sales Leads Into Spam

In a sales context, extraction accuracy is the line between early-stage prospecting and spam.

If the platform confidently reports a contract size that is wrong, or routes a lead to the wrong region, the sales team either chases a ghost or misses a real opportunity. Trust has to be high enough that sales reps act on prioritized leads instead of treating them as one more dashboard to ignore. Extraction quality is a business-critical metric, not just a model-evaluation one.

Routing Leads Across Five Regional Kitchens With Different Capacity

A lead that looks attractive in the abstract can be worthless to a specific kitchen. Relevance depends on radius, capacity, meal type, and whether a regional team is already loaded.

Each of VielfaltMenü’s five business units has its own kitchens, serving radii, capacity, and product specializations. Assigning a lead to the wrong region wastes the sales team’s time and breaks trust in the system. The platform had to combine extracted lead data with per-region configuration and return results scoped automatically to what each team could realistically win.
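
The scoping step can be pictured with a minimal sketch, assuming each kitchen is stored with coordinates and spare daily capacity, and reusing the delivery radii quoted earlier (roughly 50 km for hot meals, up to 150 km for cook-and-chill and frozen). The `Kitchen` fields, the `RADIUS_KM` keys, and `eligible_units` are illustrative names rather than the platform's actual schema, and great-circle distance stands in for whatever distance measure the production system uses.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

# Radii taken from the case study text; the dictionary keys are assumed labels.
RADIUS_KM = {"hot": 50.0, "cook_and_chill": 150.0, "frozen": 150.0}
EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Kitchen:
    unit: str                 # business unit that owns the kitchen
    lat: float
    lon: float
    free_meals_per_day: int   # hypothetical spare-capacity figure


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometers."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def eligible_units(kitchens, lead_lat, lead_lon, meal_type, meals_per_day):
    """Return (unit, distance_km) pairs that can serve the lead, nearest first."""
    limit = RADIUS_KM[meal_type]
    hits = []
    for kitchen in kitchens:
        distance = haversine_km(kitchen.lat, kitchen.lon, lead_lat, lead_lon)
        if distance <= limit and kitchen.free_meals_per_day >= meals_per_day:
            hits.append((kitchen.unit, round(distance, 1)))
    return sorted(hits, key=lambda hit: hit[1])
```

A lead that no kitchen can reach or absorb comes back as an empty list, which is the case a sales team should never see in its dashboard.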

Designing One Codebase That Works Across B2G Sectors

From the first design conversation, VielfaltMenü asked whether the platform could serve other industries that run on public-sector tenders. The answer had to be yes, without forking the codebase for every new sector.

Sector-specific logic (keywords, scoring, categories, business units) had to be configurable from day one, so pointing the platform at a new industry wouldn’t require a release.

Our Approach

A Six-Step Pipeline, With PostgreSQL as the Single Source of Truth

The system is deliberately simple at the top level: sources, import, store, extract, enrich, present. Every piece of complexity lives inside one of those six steps, not in the connections between them.

Public-sector procurement APIs and the client’s CRM are the sources. Scheduled importers pull from them hourly and write raw and structured rows into PostgreSQL, which is the single source of truth for the rest of the platform. An LLM extraction layer reads normalized documents out of Postgres and writes structured fields back. A scheduled enrichment layer computes scores, distances, and relationship context over those fields. A React application presents the result through role-based dashboards, with a FastAPI backend and a Salesforce push for accounts that qualify.

Using the LLM as an Extractor, Not as a Judge

The single most important architectural decision was to split the AI’s job from the business logic.

The LLM reads a document and fills in a structured schema: contract duration, deadline, meal type, serving capacity, lead location, tender probability, and a set of justification fields. It does not decide whether a lead is good. Scoring runs as a separate step on deterministic, configurable rules. Boolean flags contribute fixed point values. Numeric fields contribute points on a bounded scale. Distance is penalized outside a region’s radius. Relationship history from the CRM contributes its own category.

The result is a system where sales teams can see exactly why a lead received its score. When the business wants to reweight a category, they edit a scoring configuration in the admin panel. When they want to experiment with a new rule set, they duplicate the template and promote it when ready. We never have to retrain a model or edit a prompt to change how prioritization works.

This also keeps costs predictable. LLM calls happen once per document at extraction time. Every downstream score, recalculation, or rule change runs on structured data that already lives in the database.
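
The four rule types described above (fixed points for flags, bounded points for numeric fields, a distance penalty outside the radius, and a relationship category) can be sketched as a pure function over already-extracted fields. This is an illustration under stated assumptions, not the production scoring engine: `ScoringConfig`, the field names, and the linear penalty shape are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoringConfig:
    flag_points: dict[str, float]                         # boolean field -> fixed points
    numeric_rules: dict[str, tuple[float, float, float]]  # field -> (low, high, max_points)
    distance_penalty_per_km: float                        # applied only beyond the radius
    max_distance_penalty: float
    existing_customer_points: float                       # relationship category


def score_lead(lead: dict, config: ScoringConfig, region_radius_km: float):
    """Return (total, breakdown) so every point can be traced to a rule."""
    breakdown: dict[str, float] = {}

    # Boolean flags contribute fixed point values.
    for name, points in config.flag_points.items():
        breakdown[name] = points if lead.get(name) else 0.0

    # Numeric fields contribute points on a bounded, linear scale.
    for name, (low, high, max_points) in config.numeric_rules.items():
        value = lead.get(name)
        if value is None:
            breakdown[name] = 0.0
            continue
        clamped = min(max(value, low), high)
        breakdown[name] = max_points * (clamped - low) / (high - low)

    # Distance is penalized only outside the region's radius, up to a cap.
    overshoot = max(0.0, lead["distance_km"] - region_radius_km)
    penalty = min(overshoot * config.distance_penalty_per_km,
                  config.max_distance_penalty)
    breakdown["distance"] = -penalty

    # Relationship history from the CRM is its own category.
    breakdown["relationship"] = (
        config.existing_customer_points if lead.get("existing_customer") else 0.0
    )
    return sum(breakdown.values()), breakdown
```

Because the function takes the configuration as data and calls no model, changing a weight only means passing a different `ScoringConfig`, and the returned breakdown is what a score explanation in the dashboard could be built from.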

Normalizing PDFs to Markdown Before the LLM Sees Them

Extraction quality lives or dies before the LLM sees the document. We normalize everything to Markdown first.

Procurement documents arrive as editable PDFs most of the time, occasionally as HTML, rarely as anything else. A normalization step converts the full set into consistent Markdown, which the LLM handles far more reliably than raw PDFs or loose HTML. For documents that exceed the context window, the pipeline summarizes in chunks and aggregates the structured output into a single result. Context size, chunk count, and model selection are all configurable, because the right parameters a year from now will not be the right parameters today.
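
The chunk-and-aggregate step can be illustrated with a small sketch. It assumes a character budget as a stand-in for the model's token limit, splits on Markdown paragraph breaks, and merges per-chunk results with a simple policy (first non-empty scalar wins, lists are concatenated). The `extract` callable represents the LLM call and is deliberately left abstract. The function names and the merge policy are assumptions, not the platform's actual rules.

```python
from typing import Callable, Optional


def split_markdown(text: str, max_chars: int) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept intact as its own
    chunk; a real pipeline would need a finer split for that case.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


def merge_results(results: list[dict]) -> dict:
    """Aggregate per-chunk structured output into one record."""
    merged: dict = {}
    for result in results:
        for key, value in result.items():
            if value is None:
                merged.setdefault(key, None)
            elif isinstance(value, list):
                merged[key] = (merged.get(key) or []) + value
            elif merged.get(key) is None:
                merged[key] = value  # first non-empty scalar wins
    return merged


def extract_document(
    text: str,
    extract: Callable[[str], dict],
    max_chars: int,
    max_chunks: Optional[int] = None,
) -> dict:
    """Run the extractor once per chunk and merge the structured outputs."""
    chunks = split_markdown(text, max_chars)
    if max_chunks is not None:
        chunks = chunks[:max_chunks]
    return merge_results([extract(chunk) for chunk in chunks])
```

Keeping `max_chars` and `max_chunks` as parameters mirrors the point made above: the limits are configuration, not constants baked into the pipeline.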

Pre-Tender Signals and Customer Feedback From One Document Stream

Alongside the tender pipeline, the platform runs a second track for earlier-stage signals and feedback on existing accounts.

A curated keyword set filters the incoming document stream for anything catering-relevant. The LLM then classifies each match. Some matches point at a future tender: a council planning a new procurement, a budget allocation for catering next year, a resolution to review current arrangements. Those become pre-tender leads and go to sales.
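
As an illustration of the keyword pre-filter, the sketch below keeps only the paragraphs that mention at least one keyword, so that only those reach the classification step. It assumes case-insensitive substring matching, which also catches a keyword inside a longer compound word. The example keywords and function names are hypothetical, and the real keyword set is maintained by admins.

```python
import re


def compile_keywords(keywords: list[str]) -> re.Pattern:
    """Build one case-insensitive pattern from the curated keyword list."""
    if not keywords:
        raise ValueError("keyword list must not be empty")
    # Longest keywords first so the longer alternative wins at a given position.
    escaped = sorted((re.escape(k) for k in keywords), key=len, reverse=True)
    return re.compile("|".join(escaped), re.IGNORECASE)


def matching_paragraphs(markdown: str, pattern: re.Pattern) -> list[tuple[str, list[str]]]:
    """Return (paragraph, matched_keywords) for every paragraph with a hit."""
    hits = []
    for paragraph in markdown.split("\n\n"):
        found = sorted({m.group(0).lower() for m in pattern.finditer(paragraph)})
        if found:
            hits.append((paragraph.strip(), found))
    return hits
```

Filtering before classification means the model only sees the handful of paragraphs that could matter, which fits the once-per-document cost model described earlier.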

Other matches carry feedback about an existing caterer, and that cuts both ways. If the provider named in the document is a competitor, the alert is an acquisition opportunity: a dissatisfied municipality worth pitching before the dissatisfaction formalizes into a tender. If the provider is the client itself, the alert is a retention cue. Severity ranges from routine mentions to bomb alerts. A bomb alert is direct negative feedback serious enough to put the contract at risk, so Customer Success has to act fast before the account churns. Both kinds of alert land in the same Customer Success dashboard, scoped to the agent’s region. Patterns across many alerts also give the client a rolling read on where the wider catering market is moving.

Customer identity is the one place we deliberately keep a human in the loop. The LLM extracts the feedback text and suggests which known account the document refers to. The Customer Success agent confirms the match manually, because misattributing a complaint to the wrong account is too high-stakes for the model to decide alone.
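
The split between what the model suggests and what a person confirms can be sketched as a small data model. The assumption here is that an alert carries the model's suggested account separately from the confirmed one, and that only the confirmed value is ever used for linking. All class, field, and function names are illustrative.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AlertKind(Enum):
    ACQUISITION = "acquisition"  # feedback about a competitor's service
    RETENTION = "retention"      # feedback about the client's own service


@dataclass
class FeedbackAlert:
    feedback_text: str
    provider_name: str                        # caterer named in the document
    suggested_account_id: Optional[str]       # the model's suggestion only
    confirmed_account_id: Optional[str] = None
    confirmed_by: Optional[str] = None


def alert_kind(alert: FeedbackAlert, own_names: set[str]) -> AlertKind:
    """Retention if the named provider is the client itself, else acquisition."""
    own = {name.casefold() for name in own_names}
    if alert.provider_name.strip().casefold() in own:
        return AlertKind.RETENTION
    return AlertKind.ACQUISITION


def confirm_account(alert: FeedbackAlert, agent: str, account_id: str) -> None:
    """Record the human decision; the agent may override the suggestion."""
    if not agent:
        raise ValueError("a confirming agent is required")
    alert.confirmed_account_id = account_id
    alert.confirmed_by = agent


def linked_account(alert: FeedbackAlert) -> Optional[str]:
    """Only a human-confirmed account counts as linked."""
    return alert.confirmed_account_id
```

Until `confirm_account` is called, `linked_account` returns `None`, so an unconfirmed suggestion cannot leak into account-level views.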

Role-Based Dashboards, With Salesforce as the System of Record

Sales, Customer Success, and Admin each get their own scoped dashboard, and Salesforce stays the customer record underneath.

The Sales View shows prioritized leads filterable by classification, business unit, date range, and source, with score breakdowns and direct links to Salesforce records. The Customer Success View surfaces classified customer-feedback alerts scoped to the agent’s region, with quick access to linked Salesforce accounts once an agent confirms the match. The Admin Panel is where the platform’s flexibility lives: admins manage business units and kitchen locations, maintain the keyword list, create and activate scoring configurations, tune integration settings, and review the audit log. Every configuration change is versioned.

Leads are previewed against live Salesforce picklists before being pushed, so what lands in the CRM conforms to the customer’s existing data model. Relationship scoring reads directly from Salesforce accounts and opportunities to decide whether a new lead is tied to an existing customer, and adjusts prioritization accordingly.
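
A hedged sketch of the preview check: given the allowed values for each picklist field (fetched from the CRM elsewhere and passed in here as a plain dictionary), report which lead fields would not conform. The field names follow the domain categories mentioned in this case study, but the exact names and the function are assumptions, and no Salesforce API call is shown.

```python
def preview_against_picklists(
    lead_fields: dict[str, str],
    picklists: dict[str, set[str]],
) -> dict[str, object]:
    """Return {field: offending_value} for every field that fails the check.

    A field missing from the lead is reported with the value None.
    An empty result means the lead conforms and can be pushed.
    """
    problems: dict[str, object] = {}
    for field_name, allowed in picklists.items():
        value = lead_fields.get(field_name)
        if value not in allowed:
            problems[field_name] = value
    return problems
```

Showing the returned problems to the user before the push keeps non-conforming values out of the CRM instead of cleaning them up afterwards.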

Compliance and Governance Built In From Day One

For German enterprise and public-sector buyers, compliance is a buying criterion. We treated it as an architectural constraint from the first design conversation. Auditability, oversight, and access control are structural properties of the platform.

  • The LLM extracts, it never decides. Scoring, routing, and prioritization run on deterministic, configurable rules, so every business decision is reproducible and explainable.
  • Every relevant action is logged. User actions, LLM calls, score calculations, and document imports are recorded with event type, user, timestamp, and result status, down to token counts and per-call LLM cost.
  • Configuration is versioned. Keyword sets, scoring templates, and category weights change through versioned config in the admin panel rather than code, so every rule change has a history and an owner.
  • Access is scoped by role and region. Region assignment is automated from postal codes and kitchen radius data, so a user only ever sees the leads and alerts that belong to their function and area.
  • High-impact attribution keeps a human in the loop. Assigning negative customer feedback to a named account is confirmed by a Customer Success agent, never by the model alone.
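
The logging bullet above can be made concrete with a small sketch of an audit record carrying the listed fields (event type, user, timestamp, result status, token counts, per-call cost) and a cost roll-up over those records. The field names, the per-1,000-token pricing model, and the example prices are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
from typing import Iterable, Optional


@dataclass(frozen=True)
class AuditEvent:
    event_type: str                 # e.g. "llm_call", "score_calculation"
    user: Optional[str]             # None for scheduled jobs
    timestamp: datetime
    result_status: str              # e.g. "ok", "error"
    prompt_tokens: Optional[int] = None
    completion_tokens: Optional[int] = None
    cost: Optional[Decimal] = None  # per-call LLM cost, currency unspecified


def llm_call_event(
    user: Optional[str],
    result_status: str,
    prompt_tokens: int,
    completion_tokens: int,
    price_in_per_1k: Decimal,
    price_out_per_1k: Decimal,
) -> AuditEvent:
    """Build the audit record for one LLM call, including its cost."""
    cost = (
        Decimal(prompt_tokens) * price_in_per_1k
        + Decimal(completion_tokens) * price_out_per_1k
    ) / Decimal(1000)
    return AuditEvent(
        event_type="llm_call",
        user=user,
        timestamp=datetime.now(timezone.utc),
        result_status=result_status,
        prompt_tokens=prompt_tokens,
        completion_tokens=completion_tokens,
        cost=cost,
    )


def total_llm_cost(events: Iterable[AuditEvent]) -> Decimal:
    """Sum per-call cost across all LLM call events in the log."""
    return sum(
        (e.cost for e in events if e.event_type == "llm_call" and e.cost is not None),
        Decimal("0"),
    )
```

Using `Decimal` rather than floats keeps summed costs exact, which matters once the log is used for spend reporting.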

VielfaltMenü AI Platform Architecture Diagram

Outcomes

Production Rollout Across All Five Regional Units

The product is entering production across VielfaltMenü’s five business units in a closed rollout phase, with real users, real leads, and real customer alerts moving through the system every day.

  • A working AI tender discovery platform. Sales teams see prioritized tender leads scoped to their region, Customer Success teams see classified customer-feedback alerts scoped to their accounts, and admins control every knob without engineering involvement.
  • Full configurability without code changes. Keyword lists, scoring templates, business units and kitchen locations, category weights, and notification rules are all editable from the admin panel. A business analyst can retune the system without opening a pull request.
  • Auditable AI spend. The business can answer real-time cost and usage questions straight from the audit log, without looping in engineering.
  • Compliance-ready architecture by design. The platform separates AI extraction from deterministic business decisions, keeps humans in the loop for high-impact attribution, versions configuration changes, and logs every relevant system action. This gives the client a defensible governance model for AI-assisted sales and customer-success workflows.
  • Sales operations standardized on Salesforce. The engagement restructured how the client’s sales process runs, with Salesforce rebuilt as the single hub for both sales and customer-feedback data. Domain categories (data source, client type, region) are modeled explicitly, and sales and customer success teams were trained on the new process. The CRM now supports budget planning and executive reporting, not just lead tracking.
  • A platform that outgrew the original scope. VielfaltMenü spun the product out into a separate IT entity to license the software to other organizations operating on public-sector tenders, inside and outside catering. That was only possible because the platform was designed to be generic from day one.

Three Lessons for Shipping AI Products That Earn Trust

All three lessons came directly from how this platform was built, and all three travel well to other AI engagements.

  • Generic-first beats bespoke in the AI era. Configurable scoring, keyword management, and business-unit abstractions cost more up front and earn that cost back within months. The moment the client wants to retune scoring, try a new keyword set, or add a region, the platform absorbs the change without a release cycle.
  • Use LLMs as extractors, not as judges. Have the LLM fill a structured schema. Deterministic, configurable rules handle scoring and routing over that data. Every decision stays auditable, and reweighting a category is a config change, not a model retrain.
  • Compliance by design is a product feature. In B2G and enterprise environments, audit logs, role-based access, human oversight, and configurable decision logic are what make an AI system trusted enough to deploy. They turn a working demo into something a public-sector procurement team will sign off on.

Conclusion

VielfaltMenü came to us with a retention and acquisition problem that looked like a sales issue and was really an information problem. We built them a pipeline that handles it end to end. Pull from public sources. Store the raw records. Extract structured fields with a tightly scoped LLM. Enrich with deterministic scoring and routing. Present the result through role-based dashboards. The AI is narrow, the logic is transparent, and the configuration is the client’s to control. The platform is now rolling out across all five regional business units, and the client saw enough potential to spin it out as its own IT company.
