Agentic Engineer

Location: Richmond, Virginia, USA
Duration: 5 months (Hybrid)
Visa Status: USC, GC, GCEAD, H1B, C2C
Job Description:

Responsibilities

Design and develop data pipelines for agentic systems, building robust data flows that handle complex interactions between AI agents and data sources.

Train and fine-tune large language models (LLMs).

Design and build data architecture, including databases and data lakes, to support various data engineering tasks.

Develop and manage Extract, Load, Transform (ELT) processes to ensure data is accurately and efficiently moved from source systems to analytical platforms used in data science.

Implement data pipelines that facilitate feedback loops, allowing human input to improve system performance in human-in-the-loop systems.

Work with vector databases to store and retrieve embeddings efficiently.

Collaborate with data scientists and engineers to preprocess data, train models, and integrate AI into applications.

Optimize data storage and retrieval for high performance.

Conduct statistical analysis to identify trends and patterns, and consolidate data from multiple sources into consistent formats.

Qualifications
Strong data engineering fundamentals.

Experience with big data frameworks such as Apache Spark and Azure Databricks.

Ability to train LLMs using structured and unstructured datasets.

Understanding of graph databases.

Experience with Azure services including Blob Storage, Data Lakes, Databricks, Azure Machine Learning, Azure Computer Vision, Azure Video Indexer, Azure OpenAI, Azure Media Services, and Azure AI Search.

Proficient in determining effective data partitioning criteria and implementing partition schemas using Spark.

Understanding of core machine learning concepts and algorithms.

Familiarity with cloud computing concepts and practices.

Strong programming skills in Python and experience with AI/ML frameworks.

Proficiency in working with vector databases and embedding models for retrieval tasks.

Expertise in integrating with AI agent frameworks.

Experience with cloud-based AI services, especially Azure AI.

Proficient with version control systems such as Git.

Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Data Science, or a related field.



Required: 5 Years — Understanding of big data technologies
Required: 5 Years — Experience developing ETL and ELT pipelines
Required: 5 Years — Experience with Spark, GraphDB, Azure Databricks
Required: 4 Years — Experience training LLMs with structured and unstructured datasets
Required: 3 Years — Experience with data partitioning and data conflation
Required: 3 Years — Experience with GIS spatial data

Apply Now