● 7+ years of IT experience designing and implementing ETL and big data pipelines, data analysis, statistical analysis, and machine learning models, including developing, testing, and productizing ML models and data pipelines.
● Strong problem solver who addresses business problems by breaking them down into structured deliverables.
● Experience collaborating with multiple stakeholder teams (business, product, tech, finance, and delivery) to prioritize and align on cross-functional processes.
● Experience working in a fast-paced environment and making pragmatic engineering decisions in a short amount of time.
● Experience with cloud platforms such as Google Cloud Platform and Microsoft Azure Services.
● Experience building ML models, including feature engineering, inference pipelines, and real-time model predictions, for high performance and scalability.
● Experience in MLOps for measuring and tracking model performance.
● Experience writing complex SQL queries based on business requirements.
● Experience handling big data with Hadoop ecosystem components such as HDFS, YARN, MapReduce, Spark, Pig, Sqoop, Hive, HBase, and Kafka.
● Proficient in Python, PySpark, and SQL.
● Hands-on experience with continuous integration and deployment (CI/CD) using Jenkins.
● Experience creating scripts for data modeling, for data import and export, and for automating various activities.
● Experienced in Agile methodologies, including Scrum stories and sprints, in a Python-based environment, along with data analytics and data wrangling.
● Team player with good interpersonal skills and a strong understanding of fundamental business processes.
● Excellent communication and problem-solving skills.
● Capable of rapidly learning new technologies and processes and successfully applying them to projects and operations.
● Languages: Python, SQL and PySpark.
● Domains: Healthcare, Retail and Airlines.
● Cloud Environment: Microsoft Azure Services and Google Cloud Platform.
● Data Visualization: Power BI and Python visualization packages (Matplotlib, Seaborn, Plotly and Cufflinks).
● Libraries & Frameworks: Pandas, NumPy, Hadoop, Spark, Apache Beam, Hive, JSON and NDJSON, D-Tale, scikit-learn, NLTK, spaCy, LIME, Web Scraping, Data Munging and Data Mining.
● Algorithms: Linear and Logistic Regression, Decision Tree, KNN, SVM, Bagging, Boosting, Clustering algorithms, Ensemble Models, Naive Bayes, Evaluation Metrics, PCA, NLP, Beautiful Soup & Deep Learning (ANN).