Data Engineer · Azure Cloud Expert · ETL Specialist

Hi, I'm Vinay Wadhwa.

Accomplished Data Engineer with 3.5+ years of experience architecting and refining scalable data pipelines, ETL workflows, and Azure-based data platforms. Expert in Databricks, Delta Lake, and Synapse, turning raw data into strategic insights. Driven by a passion for building resilient data ecosystems on Azure and Databricks.

Core Stack: Python · SQL · PySpark · Scala · Azure Data Factory · Azure Databricks · Apache Spark · Google Cloud

About Me

Vinay Wadhwa

Data Engineer

☁ Databricks Certified Data Engineer Professional
☁ Google Cloud Professional
⭐ 5-Star HackerRank (Python & SQL)

Dynamic Data Engineer dedicated to crafting scalable, high-performance data architectures. With 3.5+ years of practical expertise, I excel in developing resilient ETL pipelines, leveraging multi-cloud environments, and powering large-scale data-driven strategies.

Certified in Databricks with advanced proficiency in Apache Spark, I thrive on converting intricate data problems into streamlined, sustainable solutions.

Beyond technical work, I contribute to open source projects and hold top 5-star HackerRank ratings in Python and SQL. For me, exceptional data engineering fuses sharp analytical thinking with masterful technical execution.

Key Highlights

Design and optimize ETL pipelines, reducing processing time by 30% and improving efficiency by 20%.

Deep expertise in multi-cloud platforms: Azure and Google Cloud.

Optimize database queries, reducing execution time by 2 hours.

5-star rank on HackerRank in Python and SQL.

Experience

Data Engineer
Cencora, Pune
Aug 2024 – Present
  • Designed and implemented a Medallion (Bronze/Silver/Gold) Lakehouse architecture from scratch on Azure Databricks + Delta Lake, standardizing ingestion, refinement, and curated publishing for analytics consumption.
  • Played a key role in migration from EAP to Databricks, re-platforming legacy jobs into scalable Spark-based pipelines and establishing reusable transformation patterns.
  • Built end-to-end batch pipelines using Azure Data Factory + Databricks (PySpark/SparkSQL) to onboard data from multiple sources and deliver business-ready datasets.
  • Improved overall pipeline throughput by redesigning transformations, partitioning strategy, and Spark execution plans, reducing the daily load window by ~2 hours.
  • Optimized SQL and SparkSQL queries (join strategy, predicate pushdown, partition-aware filtering), reducing execution time by ~20% for critical transformations.
  • Implemented Delta Lake best practices including schema enforcement/evolution, optimized writes, and scalable table management to improve reliability.
  • Implemented data quality checks (statistical validation, rule-based checks, reconciliation) and standardized logging to support a zero-defect goal during UAT and production releases.
  • Used Azure Key Vault for secure secrets management and controlled access for source/target connectivity.
  • Mentored junior engineers via code reviews and hands-on guidance on Spark/SQL and pipeline best practices.
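
To illustrate the Bronze/Silver/Gold flow and the rule-based and reconciliation checks described above, here is a minimal sketch in plain Python. It is not the production code: the real pipelines run on Databricks with PySpark and Delta Lake, and the sample records, the rule names, and the functions `to_silver`, `to_gold`, and `reconcile` are all hypothetical, chosen only to show the shape of the idea.

```python
from collections import defaultdict

# Hypothetical raw records as they might land in a Bronze layer (stored as-is).
bronze = [
    {"order_id": "1", "region": "EU", "amount": "120.50"},
    {"order_id": "2", "region": "US", "amount": "80.00"},
    {"order_id": "2", "region": "US", "amount": "80.00"},  # duplicate key
    {"order_id": "3", "region": "EU", "amount": "-5.00"},  # fails the amount rule
    {"order_id": "4", "region": "", "amount": "42.00"},    # fails the region rule
]

# Rule-based checks: each rule is a name plus a predicate over a typed row.
RULES = {
    "region_present": lambda r: bool(r["region"]),
    "amount_non_negative": lambda r: r["amount"] >= 0,
}


def to_silver(rows):
    """Cast types, drop duplicate order_ids, and split rows into valid and rejected."""
    seen, valid, rejected = set(), [], []
    for raw in rows:
        if raw["order_id"] in seen:
            continue  # keep only the first occurrence of each key
        seen.add(raw["order_id"])
        row = {
            "order_id": int(raw["order_id"]),
            "region": raw["region"],
            "amount": float(raw["amount"]),
        }
        failed = [name for name, rule in RULES.items() if not rule(row)]
        if failed:
            rejected.append({**row, "failed_rules": failed})
        else:
            valid.append(row)
    return valid, rejected


def to_gold(rows):
    """Aggregate validated Silver rows into a per-region total."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def reconcile(bronze_rows, valid, rejected):
    """Every distinct source key must end up either valid or rejected."""
    distinct_in = len({r["order_id"] for r in bronze_rows})
    return distinct_in == len(valid) + len(rejected)


silver, rejects = to_silver(bronze)
gold = to_gold(silver)
print(gold, reconcile(bronze, silver, rejects))
```

Rejected rows are kept with the names of the rules they failed rather than being dropped silently, so the reconciliation step can confirm that every source key is accounted for before anything is published to the Gold layer.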

Data Engineer
Course5i, Mumbai
Aug 2022 – Aug 2024
  • Developed and maintained ETL pipelines to process structured and unstructured data for analytics and reporting use cases.
  • Built Spark processing jobs using PySpark (DataFrames, RDDs, SparkSQL) and applied performance tuning techniques to improve execution efficiency.
  • Wrote complex SQL and PL/SQL scripts for transformations, data extraction, and reporting; improved query performance through tuning and optimized design.
  • Designed and optimized data storage models using MySQL, aligning schema/collections to workload and query patterns.
  • Supported automated testing and deployments via CI/CD pipelines, improving release reliability and environment consistency.
  • Documented ETL patterns, runbooks, and best practices to improve onboarding and long-term maintainability.

Skills

Languages

Python SQL

Big Data

Apache Spark PySpark Databricks

Cloud

Azure Google Cloud

ETL & Pipelines

Azure Data Factory Airflow

Databases

PostgreSQL MySQL MongoDB

What People Say

Vinay's data engineering and cloud platform skills are truly outstanding. His talent for streamlining complex ETL workflows and dramatically cutting processing times transformed our data infrastructure for the better.

Senior Data Architect
Cencora

His proficiency in multi-cloud environments and Databricks integration brought innovative solutions to our data challenges. The automation he implemented saved countless hours of manual work.

Manager
Course5

Get In Touch

I'm open to new opportunities, collaborations, and consulting engagements. Whether you have a project in mind or just want to connect, feel free to reach out!

Send a Mail