Job Description
Senior Data Engineer
Experience: 4–8 years overall, including 2+ years building data-intensive pipelines
Location: Office – Coimbatore/Bengaluru
About Aivar Innovations
Aivar is an AI-first technology partner where cutting-edge technology meets industry expertise to supercharge your projects. Our AI-augmented teams accelerate development, reduce time-to-market, and deliver exceptional code quality. We bring together the best minds in tech to craft scalable, repeatable solutions that drive real momentum for your business.
Technical Focus
Build the data foundation powering our accelerators’ autonomous agents. Design large-scale ingestion, processing, and feature engineering systems that transform unstructured enterprise data (invoices, documents, transactions, RFQs) into structured, high-quality datasets. Enable agentic AI to make accurate, compliance-aware decisions with full data lineage and auditability.
Functional Expectations
- Design end-to-end data pipelines processing large volumes of unstructured enterprise data (documents, PDFs, transaction records, emails)
- Build data ingestion frameworks supporting multiple sources and formats with automated validation and quality checks
- Implement large-scale processing using distributed computing frameworks (Spark, Flink, AWS Glue) that handle terabytes of data efficiently
- Develop advanced feature engineering pipelines — document classification, entity extraction, semantic tagging from unstructured data
- Design data warehousing architecture supporting both near-real-time operational and analytical queries for agentic AI reasoning
- Build data quality frameworks ensuring accuracy critical for agent decision-making and regulatory compliance
- Implement data governance — lineage tracking, metadata management, audit trails for regulated environments
- Lead data security for sensitive information (PII, financial data, healthcare records) with encryption and access controls
Must-Have Technical Skills
- Unstructured data expertise — production ingestion and processing of documents, PDFs, images, text, logs at scale
- Distributed data processing — Apache Spark, Flink, or AWS Glue at production scale
- Feature engineering — advanced techniques for ML systems, automated feature extraction and transformation
- Expert Python — data processing, ETL pipeline development, data science workflows; not notebook-level
- NLP/text processing — document understanding, entity extraction, semantic processing (spaCy, transformers)
- Data architecture — data warehouses, data lakes, or lakehouse architectures supporting batch and real-time processing
- ETL/ELT pipeline design — production-grade with error handling, retry logic, and monitoring
- AWS data services — S3, Athena, Glue, RDS, DynamoDB, MSK
- Data quality & governance — metadata management, lineage tracking, compliance frameworks (GDPR, HIPAA, SOC2)
Core Tech Stack
Python, Apache Spark/Flink, AWS (S3, Glue, Athena, RDS, DynamoDB, MSK), Kafka/Redis Streams, spaCy/transformers/LangChain, Pinecone/Weaviate/pgvector, dbt, Great Expectations, Terraform/CDK, Prometheus/Grafana/OpenTelemetry
Benefits
Why You’ll Love Working at Aivar
- Learn from Experts: Work directly with former AWS leaders and AI pioneers.
- Direct Ownership: Lead high-impact "greenfield" projects from concept to global launch.
- Modern Tech: Master the latest Generative AI frameworks and cloud-native architectures.
- Real-World Impact: Build mission-critical systems used by major global enterprises.
- Rapid Growth: Scale your career quickly.
Diversity and Inclusion
Aivar Innovations is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to gender, gender identity, sexual orientation, religion, disability, age, marital status, caste, or any other protected characteristic, and we are committed to building a diverse, inclusive, and respectful workplace for everyone.