This job has been archived and is no longer active.

AI Agentic Integration Engineer (Senior / Lead)

Remote, Senior

4500$

Valletta Software is a custom mobile/web software developer in the US and Europe. Our teams implement IT projects of varying complexity, including website and mobile app development, enterprise systems, and solutions based on artificial intelligence and machine learning (AI/ML).

Our company has earned a place among Clutch's Fall 2025 Champions. This recognition confirms that we are one of the Top 15 AI agent development companies!

We are a distributed team: you can work from anywhere in the world except RF and RB, provided you can ensure a quiet workspace, a good internet connection, availability, and a proper working environment.

We are looking for a Senior / Lead AI Engineer to build production-ready AI systems where:

The LLM is the core layer of the solution
Agentic workflows are used as the primary orchestration pattern
System quality is managed through evaluation
Reliability, observability, and cost control are designed as part of the architecture, not added later

This role is not for a backend engineer who adds AI features, nor for a prompt engineer.

It is for an engineer who knows how to build AI-native systems end-to-end and take them all the way to production.

1. Hard Requirements

1.1. AI / LLM Systems

Must-have:

Real experience developing and shipping production LLM systems
Experience working with LLM APIs: OpenAI / Anthropic / Gemini / similar
Prompt design
Structured outputs
Tool / function calling
Model selection and understanding of trade-offs between models

1.2. Agentic Systems

Must-have:

Experience designing multi-step workflows
Experience developing agent-based systems: single-agent and/or multi-agent
Orchestration: planning, execution, retry, fallback, verification
Management of:
- State
- Context
- Memory
Understanding when an agentic approach is needed and when it's not
Understanding of trust boundaries in agentic systems
Principle of least privilege for tool permissions
Protection against indirect prompt injection via external data (retrieval, tool results, external APIs)

1.3. Evaluation & Quality Control

Must-have:

Building evaluation pipelines
Offline evaluation
Comparing prompt / model versions
Quality metrics
Approaches to online validation / A/B testing / human review loops
Ability to connect evaluation to real product quality

1.4. Context Management & Hallucination Control

Must-have:

Context management:
- Chunking strategies
- Context window optimization
- Memory patterns
- Retrieval scope control
Hallucination reduction techniques:
- Grounding
- Retrieval
- Tool-based verification
- Constraints
- Self-check / validation patterns

1.5. Production / LLM Ops / Reliability

Must-have:

Retries / exponential backoff
Timeout handling
Fallbacks / model routing
Degraded mode / graceful failure
Rate-limit handling
Observability:
- Latency
- Token usage
- Cost
- Failure rate
- Output quality signals
Cost control
Monitoring and debugging AI systems in production
PII handling: filtering before logging, tenant isolation in memory and retrieval
Output validation and content guardrails
Awareness of data residency risks when using external LLM APIs

1.6. Data / Retrieval

Must-have:

Understanding of retrieval pipelines:
- Embeddings
- Chunking
- Reranking
- Retrieval quality tuning
Experience with vector storage / vector DB of any type
Working with structured and unstructured data

1.7. Engineering Foundation

Must-have:

Strong engineering background in backend / system development
Backend stack is not critical
Ability to build APIs, services, integrations, async workflows
SQL + NoSQL
Git, Docker, CI/CD

Nice-to-have:

Full-stack development experience (backend + frontend)
Understanding of UI/UX aspects of AI products (chat, copilots, dashboards)

1.8. AI Safety & Security

Must-have (awareness level for Senior, ownership for Lead):

Prompt Injection:

Understanding the difference between direct and indirect injection
Indirect injection as a specific threat to agentic systems: attacks via data from retrieval, tool results, and external sources
Content sanitization before inserting into context
Architectural separation of system prompt, user input, and external data

Tool Permission Model:

Principle of least privilege: minimum necessary permissions for each agent and tool
Separation of read-only and write operations at the architecture level, not via prompts
Human-in-the-loop for irreversible actions (delete, send, publish, execute code)
Whitelist of allowed external calls

Data Leakage & PII:

Tenant isolation: one user's data never enters another user's context
PII masking before logging (including tool results and retrieval results)
Understanding that LLM APIs are third-party services; regulated domains require a DPA, filtering, or self-hosted models

Output Safety:

Output validation: schema checks, content filtering
Understanding the difference between "the prompt says don't do X" (weak protection) and "the architecture does not allow X" (strong protection)

1.9. Communication

Must-have:

English B2+
Ability to explain architectural decisions, trade-offs, and risks
For Lead: ability to set engineering standards and lead the technical direction of the team

