About

About Me

I'm Namit Naik, a Data Engineer specializing in real-time and batch data processing frameworks, with extensive expertise in designing and optimizing data-intensive applications. My technical foundation is rooted in Big Data and AWS technologies, where I have developed and managed sophisticated solutions that streamline operations and enhance decision-making capabilities.

Skilled in Python and SQL, I have comprehensive experience with AWS services such as DMS, MSK, MWAA, S3, EMR, EC2, Lambda, Step Functions, Glue, and Redshift. I am also proficient with data processing and orchestration technologies such as Apache Spark, Apache Flink, Apache Hudi, Apache Kafka, and Apache Airflow, which I use to build scalable, efficient data pipelines and real-time ETL systems.

With a strong emphasis on continuous improvement and innovation, I am adept at leveraging real-time data streaming and state management to improve system responsiveness and data accuracy. My approach always includes automating workflows to reduce manual intervention, increase process efficiency, and ensure data integrity across complex deployments.

As a committed lifelong learner and problem solver, I thrive in environments that challenge me to evolve and adapt to the ever-changing landscape of technology.

  • Name: Namit Naik
  • Date of birth: February 23, 1999
  • Address: Mumbai MH, India
  • Pin code: 421201
  • Email: namitnaik1999@gmail.com
  • Phone: +91 7506099306

Experience

My Experience

2024-Present

Product Engineer I - Big Data

Experian

Developed a PySpark-based dashboard for real-time monitoring of EMR step executions and a Python CLI wrapper on EC2 that automated business flows and minimized manual intervention. Improved ETL data loading and resolved vulnerabilities, enhancing system reliability. Led a real-time ETL proof of concept that won Best Tech Idea at the EMAP Hackathon FY2025 Q3, and received the Experian Role Model Award in August 2024. Also conducted PySpark training sessions that boosted the analytics team's efficiency, and built strong stakeholder relationships for successful project outcomes.

2021-2022

Data Engineer

Larsen & Toubro Infotech

Maintained PySpark scripts, detecting and correcting errors to ensure raw data was processed properly. Developed a PySpark tool, JSON-Hive, that used dynamic Data Definition Language (DDL) to store JSON data in Hive tables. Created PySpark scripts for transforming flattened data into JSON format and contributed to building an AWS Step Functions pipeline for triggering Informatica (BDM) workflows. Integrated the JSON-Hive Spark utility into the existing pipeline and identified an approach for optimizing EMR cluster utilization based on the size of incoming JSON files landing in an AWS S3 bucket.

2022-2024

Data Engineer

LTIMindtree

Made significant contributions to a successful Proof of Concept (POC) that transitioned the ETL framework from AWS EMR and Informatica to AWS Glue. My role included setting up the AWS Glue environment, designing an end-to-end job pipeline, addressing data challenges, and developing CI/CD code. Also provided training, collaborated on test automation, and played a key role in converting test suites from Pandas to PySpark. Additionally, helped implement a unified HTML test automation report and concluded the POC by migrating the test automation suite to an AWS EMR cluster.

2021

Graduate Engineer Trainee

Larsen & Toubro Infotech

Cleaned and refined data, set up data streaming with Kafka and connected it to S3, created data visualizations in Tableau, and maintained written records and documentation.

Skills

My Skills

Python

80%

SQL

75%

Hadoop

75%

PySpark

80%

Apache Kafka

70%

Apache Flink

70%

Apache Hudi

70%

Apache Airflow

75%

AWS

85%

Tableau

70%

Projects

My Projects

Achievements

My Achievements

Publications

My Publications

Aug 28, 2021 IEEE

College Enquiry Chatbot using Rasa Framework

In this study, we created a chatbot for the education domain, named “College Enquiry Chatbot”. It is a web-based application that analyses and understands users' queries and provides instant, accurate responses.

Aug 3, 2021 IEEE

Conversational AI: Chatbots

In this study, we reviewed several papers and discussed the types of chatbots along with their advantages and disadvantages. The review suggested that chatbots can be used everywhere because of their accuracy, independence from human resources, and 24x7 accessibility.

Sep 27, 2019 IEEE SIES GST

Mixed Reality

This article gives a brief overview of:
- What Mixed Reality actually is
- Its evolution with Microsoft HoloLens 2
- What we can do with MR
- Why it has an edge over Virtual Reality

Contact

Contact Me

If you need an experienced data engineering professional to enhance your data infrastructure and drive impactful business outcomes, feel free to reach out. Let's discuss how I can bring value to your organization with cutting-edge data solutions.

Address

Mumbai MH, India 421201

Contact Number

+91 7506099306

Email Address

namitnaik1999@gmail.com