What types of AI projects does Pharosyne work on?

Pharosyne specializes in multi-agent systems, RAG (Retrieval-Augmented Generation) architectures, voice AI agents, and custom AI/ML solutions. The focus is on enterprise-grade implementations that deliver measurable business value. With 20+ years of software engineering experience and leadership roles at companies like BBVA, Santander, and El Corte Inglés, Pharosyne brings proven expertise to complex AI challenges.

Does Pharosyne offer fractional CTO or advisory services?

Yes, Pharosyne provides fractional CTO and Head of AI services for companies that need strategic technical leadership without a full-time commitment. This includes architecture guidance, team mentoring, and technology roadmap development.

What is the typical engagement model?

Pharosyne offers flexible engagement models including project-based consulting, retainer arrangements, and fractional leadership roles. Each engagement starts with a discovery call to understand your needs. You work directly with senior architects, with no account managers or intermediaries.

Does Pharosyne work with startups or only enterprises?

Pharosyne works with both. The sweet spot is mid-market companies and growth-stage startups with complex technical challenges. Pharosyne also supports enterprises looking to modernize their AI and software architecture.

What is RAG architecture and when should I use it?

RAG (Retrieval-Augmented Generation) retrieves relevant passages from your own data sources and passes them to a large language model as context, so responses are grounded in your content rather than in the model's training data alone. It's ideal for building internal knowledge bases, customer support systems, or any AI application that needs access to proprietary information. Pharosyne has implemented RAG systems for enterprise clients handling millions of documents.
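As a rough illustration of the retrieve-then-generate flow, here is a minimal sketch in Python. It uses a toy word-overlap score in place of a real embedding model and vector database, and it stops at building the prompt rather than calling a model. The function names (`tokenize`, `retrieve`, `build_prompt`) and the sample documents are hypothetical, chosen only for this example.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase word set; a stand-in for a real embedding model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Return ids of the k documents sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(docs[d])), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: dict[str, str], k: int = 2) -> str:
    """Place the retrieved passages, tagged by id, in front of the question."""
    context = "\n".join(f"[{d}] {docs[d]}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n\nQuestion: {query}"
    )

# Hypothetical sample documents.
DOCS = {
    "policy-1": "Refunds are issued within 14 days of a return request.",
    "policy-2": "Shipping to EU countries takes three to five business days.",
    "policy-3": "Support is available by email on weekdays.",
}

prompt = build_prompt("How many days do refunds take?", DOCS, k=1)
```

In a production system the overlap score would be replaced by embedding similarity, often combined with keyword search and a reranker, but the shape stays the same: retrieve, assemble context, then generate.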

How do multi-agent AI systems work?

Multi-agent systems use multiple AI agents working together to solve complex tasks. Each agent specializes in a specific function, and they collaborate to achieve goals that would be too complex for a single agent. Pharosyne designs and implements these systems for workflow automation, research pipelines, and autonomous business processes.
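The collaboration pattern can be sketched without any model calls. In the hypothetical example below, each "agent" is a plain Python function with one responsibility, and a coordinator passes a shared `Task` object between them in a fixed order. The names `Task`, `researcher`, `writer`, and `run_pipeline` are assumptions made for illustration; a real system would wrap model calls and tools behind the same kind of interface.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Task:
    goal: str
    notes: list[str] = field(default_factory=list)  # shared working state
    result: str = ""

# Each agent is a plain function here; in practice it would wrap a model call.
def researcher(task: Task) -> Task:
    task.notes.append(f"fact relevant to: {task.goal}")
    return task

def writer(task: Task) -> Task:
    task.result = f"Summary of {len(task.notes)} note(s) for '{task.goal}'"
    return task

def run_pipeline(task: Task, agents: list[Callable[[Task], Task]]) -> Task:
    """Coordinator: hand the task to each specialist in a fixed order."""
    for agent in agents:
        task = agent(task)
    return task

done = run_pipeline(Task(goal="compare vector databases"), [researcher, writer])
```

The explicit `Task` type is the contract between agents: each one reads and writes known fields, which makes a run easier to inspect and test.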

What languages and technologies does Pharosyne use?

Pharosyne works primarily with TypeScript, Python, React, and Next.js for full-stack development. For AI/ML, the team uses OpenAI, Anthropic Claude, LangChain, and custom model implementations, with deep experience in vector databases like Qdrant, Weaviate, and pgvector for RAG systems.

Does Pharosyne work remotely or on-site?

Pharosyne works primarily remotely from Madrid and the EU for European, UK, and selected international B2B teams. For key workshops or critical project phases, on-site engagements within Europe are available when the project justifies it.

AI technical audit for RAG and agent systems

An AI technical audit is useful when the team is no longer asking "can we build this?" and has started asking a harder question: "should we keep building it this way?"

That question usually appears after a prototype works in demos but feels unstable in real use. Retrieval misses obvious documents. The agent gets stuck in loops. API costs are higher than expected. Nobody can explain why one answer was good and the next one was dangerous.

At that point, another sprint of feature work can make the problem worse. The right move is often a focused audit.

Fast answer

An AI technical audit should answer five questions:

Is the current architecture fit for the business risk?
Can the team measure quality, or are they relying on demos?
Can failures be traced back to retrieval, prompts, tools, model choice, or product logic?
Are cost, latency, privacy, and vendor dependencies visible enough?
What should be fixed, simplified, paused, or rebuilt before production?

The output should not be a vague maturity score. It should be a decision document: keep, fix, simplify, or stop.

When a RAG system needs an audit

RAG systems often look simple from the outside. Documents go in, questions come in, and answers come back with some context. The failure modes are rarely visible in a demo.

Audit the RAG system when any of these are true:

Users ask reasonable questions and the system retrieves irrelevant chunks.
Answers sound confident but cite weak or outdated sources.
The team cannot reproduce why a specific answer appeared.
Search quality is judged by opinion, not a fixed evaluation set.
Document updates do not reliably appear in answers.
The vector database, embedding model, chunking rules, or reranker were chosen without tests.
The company is about to expose the system to customers or internal operational teams.

The audit should inspect ingestion, chunking, metadata, embeddings, hybrid search, reranking, prompt construction, citation behavior, and evaluation data.
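One way to replace opinion with a fixed evaluation set is to score retrieval with recall at k. The sketch below assumes a reviewer has marked which document ids are relevant for each query; the queries, ids, and retriever output are all hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant document ids found in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)

# Hypothetical evaluation set: query -> ids a reviewer marked as relevant.
EVAL_SET = {
    "refund window": {"policy-1"},
    "eu shipping time": {"policy-2", "policy-7"},
}

# Hypothetical retriever output for the same queries, best match first.
RESULTS = {
    "refund window": ["policy-1", "policy-3", "policy-9"],
    "eu shipping time": ["policy-4", "policy-2", "policy-5"],
}

scores = {q: recall_at_k(RESULTS[q], rel, k=3) for q, rel in EVAL_SET.items()}
mean_recall = sum(scores.values()) / len(scores)
```

Running the same set before and after a change to chunking, embeddings, or reranking shows whether the change helped, which is the test the list above says is usually missing.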

For commercial systems, the key question is blunt: can a buyer, operator, or support agent trust the answer enough to act on it? If not, the RAG pipeline is still a prototype.

When an AI agent system needs an audit

Agents fail differently from RAG.

A RAG system may retrieve the wrong evidence. An agent can retrieve the wrong evidence, call the wrong tool, retry the same broken action, ignore a permission boundary, and then write a convincing response.

Audit the agent system when:

The workflow has more than a few tool calls per task.
The agent can write to systems, send messages, create tickets, update records, or affect customers.
There is no maximum loop length or timeout.
Human escalation rules are unclear.
Tool schemas are loose or errors are not structured.
The team cannot replay a failed run from traces.
The architecture has multiple agents but no clear contracts between them.

The audit should inspect state handling, tool permissions, orchestration, retries, loop limits, logging, human handoffs, and evals per step.
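Several of these controls fit in a few lines. The sketch below shows a loop with a step budget, a wall-clock timeout, structured tool errors, and an escalation path. `ToolResult`, `run_agent`, the error codes, and `flaky_tool` are hypothetical names for illustration, not part of any specific agent framework.

```python
import time
from dataclasses import dataclass
from typing import Callable

@dataclass
class ToolResult:
    ok: bool
    value: str = ""
    error_code: str = ""  # structured, so the loop can branch on it

def run_agent(step: Callable[[int], ToolResult], max_steps: int = 5,
              timeout_s: float = 30.0) -> dict:
    """Call `step` until it succeeds, the step budget runs out, or time is up."""
    started = time.monotonic()
    trace: list[ToolResult] = []
    for i in range(max_steps):
        if time.monotonic() - started > timeout_s:
            return {"status": "escalate", "reason": "timeout", "trace": trace}
        result = step(i)
        trace.append(result)
        if result.ok:
            return {"status": "done", "value": result.value, "trace": trace}
        if result.error_code == "permission_denied":
            # Never retry past a permission boundary; hand off to a human.
            return {"status": "escalate", "reason": "permission_denied", "trace": trace}
    return {"status": "escalate", "reason": "max_steps", "trace": trace}

# Hypothetical tool that keeps failing with a retryable error.
def flaky_tool(i: int) -> ToolResult:
    return ToolResult(ok=False, error_code="upstream_unavailable")

outcome = run_agent(flaky_tool, max_steps=3)
```

Every exit path returns the trace, so a failed run can be inspected step by step instead of reconstructed from memory.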

This is also where many teams discover that they do not need multi-agent architecture yet. They need one simpler agent with better tools and better measurement.

What to inspect in an LLM integration

Some products do not need RAG or agents. They need a reliable LLM integration inside an existing product.

Audit the integration when:

Output quality changes after model updates.
API spend is rising but nobody knows which feature causes it.
Prompts are edited manually without versioning.
There is no test set for common user inputs.
The product depends on one provider with no fallback path.
Logs contain sensitive data without a clear retention policy.
The same model handles every task, from routing to complex reasoning.

The audit should inspect prompts, model routing, structured outputs, API error handling, data retention, observability, evals, fallback paths, and cost attribution.
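A fallback path, for example, can be as simple as trying providers in order. This is a minimal sketch under stated assumptions: `ProviderError`, `complete_with_fallback`, and the two stand-in provider functions are hypothetical and do not correspond to any real SDK.

```python
from typing import Callable

class ProviderError(Exception):
    """Raised by a provider wrapper when a call fails."""

def complete_with_fallback(
    prompt: str, providers: list[tuple[str, Callable[[str], str]]]
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, output) from the first that works."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))

# Hypothetical stand-ins for real provider SDK calls.
def primary(prompt: str) -> str:
    raise ProviderError("rate limited")

def secondary(prompt: str) -> str:
    return f"echo: {prompt}"

used, text = complete_with_fallback(
    "classify this ticket", [("primary", primary), ("secondary", secondary)]
)
```

Returning the provider name alongside the output matters for the audit: it lets logs show how often the fallback fires and whether quality differs between providers.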

Most bad LLM integrations are not bad because the model is weak. They are bad because the product has no way to know when the model is weak.

The audit checklist

A serious audit should cover these areas.

Architecture. What are the core components? Which parts are deterministic software and which parts depend on a model? Where can state be lost?

Data boundaries. What data reaches model providers? What is stored in logs? What is redacted? What is retained? Who can access traces?

Retrieval quality. Which queries are used to measure retrieval? Does hybrid search help? Are citations valid? Are stale documents filtered?

Evaluation. Is there a representative test set? Are there pass/fail criteria? Are failures reviewed by category? Does the team know whether quality is improving?
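A minimal sketch of pass/fail evaluation with failures grouped by category. The criterion here (the answer must contain a given string) is deliberately simple, and the cases, categories, and canned answers are hypothetical stand-ins for a real system and test set.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable

@dataclass
class Case:
    question: str
    must_contain: str  # simplest possible pass/fail criterion
    category: str

def evaluate(cases: list[Case], answer_fn: Callable[[str], str]) -> tuple[float, Counter]:
    """Return the pass rate and a count of failures per category."""
    failures: Counter = Counter()
    for case in cases:
        if case.must_contain.lower() not in answer_fn(case.question).lower():
            failures[case.category] += 1
    pass_rate = 1 - sum(failures.values()) / len(cases)
    return pass_rate, failures

# Hypothetical cases, plus canned answers standing in for the real system.
CASES = [
    Case("What is the refund window?", "14 days", "policy"),
    Case("Do you ship to Norway?", "yes", "shipping"),
    Case("Who approves refunds over 500 EUR?", "finance", "policy"),
    Case("What is the support email?", "support@", "contact"),
]
CANNED = {
    "What is the refund window?": "Refunds are issued within 14 days.",
    "Do you ship to Norway?": "I am not sure.",
    "Who approves refunds over 500 EUR?": "A manager approves them.",
    "What is the support email?": "Write to support@example.com.",
}

pass_rate, failures = evaluate(CASES, CANNED.__getitem__)
```

Tracking the pass rate across releases answers whether quality is improving, and the per-category counts show where to look first.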

Observability. Can the team reconstruct a run? Are prompts, model versions, retrieved chunks, tool calls, latency, and costs recorded?
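One possible shape for a per-run record, written as one JSON line per run. The field names and sample values below are assumptions for illustration, not a standard schema.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class RunTrace:
    run_id: str
    prompt_version: str
    model: str
    retrieved_chunk_ids: list[str] = field(default_factory=list)
    tool_calls: list[dict] = field(default_factory=list)
    latency_ms: float = 0.0
    cost_usd: float = 0.0

    def to_json(self) -> str:
        """Serialize to one JSON line so a failed run can be found and replayed."""
        return json.dumps(asdict(self), sort_keys=True)

# Hypothetical values for a single run.
trace = RunTrace(
    run_id="run-001",
    prompt_version="support-v3",
    model="example-model",
    retrieved_chunk_ids=["policy-1#2"],
    tool_calls=[{"name": "lookup_order", "ok": True}],
    latency_ms=812.0,
    cost_usd=0.0031,
)
line = trace.to_json()
```

If every run emits a record like this, "why did this answer appear?" becomes a lookup rather than a guess.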

Cost and latency. Which calls dominate spend? Which steps dominate latency? Can cheap models handle simple routing or classification?
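Attributing spend to features needs only a feature tag on each logged call. In this sketch the prices, model names, and call log are hypothetical; real prices vary by provider and model, and usually differ for input and output tokens.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens.
PRICE_PER_1K = {"small-model": 0.001, "large-model": 0.01}

def spend_by_feature(calls: list[dict]) -> dict[str, float]:
    """Sum the estimated cost of logged calls, grouped by product feature."""
    totals: dict[str, float] = defaultdict(float)
    for call in calls:
        totals[call["feature"]] += call["tokens"] / 1000 * PRICE_PER_1K[call["model"]]
    return dict(totals)

# Hypothetical call log; each entry is tagged with the feature that made the call.
CALLS = [
    {"feature": "routing", "model": "small-model", "tokens": 2000},
    {"feature": "summaries", "model": "large-model", "tokens": 5000},
    {"feature": "routing", "model": "small-model", "tokens": 1000},
]

totals = spend_by_feature(CALLS)
top_feature = max(totals, key=totals.get)
```

In this made-up log, routing already runs on the cheaper model and summaries dominate spend, which tells the team where optimization effort would pay off.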

Safety and control. What can the system do without a human? Which actions require approval? What happens when confidence is low?
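These rules can be made explicit in code instead of living in a prompt. The sketch below is hypothetical: the action names, the `MIN_CONFIDENCE` threshold, and the assumption that the system produces a usable confidence score would all need to be validated against the real product.

```python
from dataclasses import dataclass

# Hypothetical policy: actions that change external state need a human.
REQUIRES_APPROVAL = {"send_email", "update_record", "issue_refund"}
MIN_CONFIDENCE = 0.8  # assumed threshold; tune against evaluation data

@dataclass
class ProposedAction:
    name: str
    confidence: float

def decide(action: ProposedAction) -> str:
    """Return 'execute', 'needs_approval', or 'escalate' for a proposed action."""
    if action.confidence < MIN_CONFIDENCE:
        return "escalate"
    if action.name in REQUIRES_APPROVAL:
        return "needs_approval"
    return "execute"

decisions = [
    decide(ProposedAction("search_docs", 0.95)),
    decide(ProposedAction("issue_refund", 0.95)),
    decide(ProposedAction("search_docs", 0.40)),
]
```

Keeping the policy in deterministic code means it can be reviewed, tested, and changed without touching the model.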

Vendor risk. Is the system locked to one provider, framework, vector database, or orchestration layer? If yes, is that a deliberate decision?

What a useful audit deliverable looks like

The useful output is not a 60-page PDF nobody reads.

A good audit should leave the team with:

A map of the current architecture.
A list of critical failure modes.
Evidence from traces, code, config, or sample runs.
A prioritized fix list.
A decision on what to keep, simplify, pause, or rebuild.
A short production checklist.
A recommended next step small enough to execute.

For teams that already have a vendor, the audit should also separate vendor problems from internal product problems. Sometimes the vendor is fine and the product contract is weak. Sometimes the architecture is fine and the data is bad. Sometimes the system should be stopped before more money is spent.

Where Pharosyne fits

Pharosyne's audit work is for teams that need senior technical judgment before committing more budget.

Good fits:

A RAG prototype that needs to become an internal knowledge system.
An AI agent workflow that has started touching real operations.
An LLM product feature with unpredictable quality or cost.
A founder preparing for due diligence or a customer security review.
A team deciding between fixing the current build or changing vendor.

If this is your situation, start with AI consulting services or RAG consulting, or send the context. The first audit conversation should identify the system, the risk, what evidence exists, and what decision needs to be made.