Domain-driven, business-savvy Data Engineer and consultant 👓 with broad technical expertise across Big Data, Data Warehousing, Business Intelligence, Data Science, and Machine Learning. Enthusiastic about helping data-driven 📈 companies generate insights that move them toward their goals.
Blog: BigDataLad.com
Skills include:
➡️ Experience in multiple programming languages including Python, Scala, SQL, C#, Java, & JavaScript. 💻
➡️ Experience working with Big Data technologies including Hadoop/HDFS, Apache Spark/PySpark, Apache Kafka (for real-time data streaming and pipelining), Apache NiFi, Apache Hive, Apache Impala, etc. 💾
➡️ Experience in the orchestration of different jobs using tools like Apache Airflow & Jenkins.
➡️ Experience working in cloud-based environments, primarily with AWS services including, but not limited to, EC2, S3, Athena, Amazon EMR, Amazon Redshift, and Amazon Managed Workflows for Apache Airflow (MWAA).
➡️ Strong hands-on experience across the Apache Kafka ecosystem, from the Kafka Connect API to Kafka Streams applications and Schema Registry.
➡️ Experience developing real-time streaming applications in Scala with the Kafka Streams DSL API, including various data transformations, complex data enrichment, and real-time joins. 🔢
➡️ Experience in working with Dockerized applications in Kubernetes-based environments.
➡️ Experience in Data Science & traditional Machine Learning. Hands-on experience exploring and pre-processing data for descriptive analysis, as well as predictive analytics using traditional Machine Learning models. 👾
➡️ Hands-on experience in customer segmentation using clustering algorithms such as K-Means, Hierarchical (Agglomerative) Clustering, and DBSCAN. ⌨
Tools include:
✅ Scala, Java, SQL, Python, C#
✅ AWS | EMR, Redshift, MWAA, S3, EC2, etc.
✅ Hadoop/HDFS/Apache YARN
✅ Apache Spark (both Scala and PySpark)
✅ Apache NiFi
✅ Apache Kafka (Connect API, Streams DSL API, Schema Registry)
✅ Apache Hive/Impala, MySQL, Teradata, Microsoft SQL Server, MongoDB
✅ IPython Kernels (Jupyter Notebook, Google Colab)
Completed a Bachelor of Science in Software Engineering at COMSATS University Islamabad. 🎓
Side passions:
⚡ DataOps | DevOps + Data Engineering.
⚡ Building DevOps pipelines.
⚡ Full-stack web application development using the ReactJS/Django stack.
Previously worked in enterprise software development on the Microsoft .NET stack, with experience developing software applications across all phases of the Software Development Life Cycle.