Venkat Alapati

Projects

Data Science Projects

GoodBelly: Using Statistics to Justify the Marketing Expense

Advanced Statistical Analysis with R, Causal Analysis, Excel

  • Conducted causal and statistical analyses to identify a 125.4 unit sales increase and $512 revenue boost from in-store demos, netting $222.58 profit after expenses
  • Recommended dataset refinement by regional/store categorization for improved accuracy and insights.
  • Highlighted the importance of local sales representatives; used qualitative data for informed hiring to boost sales
  • Applied a four-stage multiple linear regression (MLR) model in R for the causal analysis.
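As a rough illustration of the arithmetic behind these figures, the sketch below derives the demo cost and revenue per unit implied by the reported numbers. Only the three inputs come from the summary above; the derived values and variable names are assumptions for illustration, not results of the original R analysis.

```python
# Figures reported in the project summary above.
incremental_units = 125.4      # extra units sold attributed to in-store demos
incremental_revenue = 512.00   # extra revenue, in dollars
net_profit = 222.58            # profit after demo expenses, in dollars

# Implied values (derived here for illustration, not stated in the analysis).
implied_demo_cost = round(incremental_revenue - net_profit, 2)
implied_revenue_per_unit = round(incremental_revenue / incremental_units, 2)

print(f"Implied demo cost: ${implied_demo_cost:.2f}")
print(f"Implied revenue per unit: ${implied_revenue_per_unit:.2f}")
```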

Classification Model Accuracy Study (Group Project)

Statistical Data Analysis, ML, Hyperparameter Tuning, R

  • Assessed Decision Tree, Naïve Bayes, KNN, XGBoost, Random Forest, BRIF across 20 datasets.
  • Improved XGBoost, Naïve Bayes through hyperparameter tuning; identified best models via average results.
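The "best model by average result" step can be sketched as follows. The model names and accuracy values are placeholders invented for illustration (the study itself covered six models and 20 datasets, in R); only the averaging-and-ranking idea comes from the description above.

```python
from statistics import mean

# Hypothetical accuracies per model on three datasets (placeholder values).
accuracy_by_model = {
    "model_a": [0.81, 0.76, 0.88],
    "model_b": [0.78, 0.74, 0.85],
    "model_c": [0.86, 0.80, 0.91],
}

# Average each model's accuracy across datasets, then rank best-first.
average_accuracy = {name: mean(scores) for name, scores in accuracy_by_model.items()}
ranking = sorted(average_accuracy, key=average_accuracy.get, reverse=True)
best_model = ranking[0]

print(ranking)
```

Averaging treats every dataset as equally important; a weighted mean or average rank would be reasonable alternatives when datasets differ widely in size or difficulty.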

Text Processing Using RNN and LSTM

NLP, Hyperparameter Tuning, PyTorch, scikit-learn, NumPy, Pandas

  • Developed tokenization for a 2 million-record dataset; converted embeddings to NumPy arrays for classification.
  • Cleaned and tokenized text with NLTK; trained RNN and LSTM models in PyTorch, reaching 81% and 91% accuracy, respectively.
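A minimal sketch of the tokenize-and-encode step, kept to the Python standard library so it runs anywhere. The project itself used NLTK and NumPy; the regular-expression tokenizer, the function names, and the sample sentences here are assumptions for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_vocab(texts, max_size=10000):
    """Map the most frequent tokens to integer ids; 0 = padding, 1 = unknown."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    return {tok: i + 2 for i, (tok, _) in enumerate(counts.most_common(max_size))}

def encode(text, vocab, length=6):
    """Convert text to a fixed-length id list, truncating or right-padding with 0."""
    ids = [vocab.get(tok, 1) for tok in tokenize(text)][:length]
    return ids + [0] * (length - len(ids))

# Hypothetical sample records.
docs = ["The service was great", "The food was not great at all"]
vocab = build_vocab(docs)
encoded = [encode(doc, vocab) for doc in docs]
print(encoded)
```

Because every row has the same length, `encoded` could be passed directly to `numpy.array` to form the input matrix for an RNN or LSTM.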

Data Science Strategy: VBC Case Study

Data Governance, Data Science Strategy, Action Plans, Tableau, Python

  • Aligned VBC strategic goals with data management for 100% compliance; boosted data-driven decisions by 30%.
  • Created a data science strategy, improving organizational alignment and strategic agility by 25%.
  • Analyzed impaired driving fatalities, identifying trends and contributing to a 15% reduction in road accidents.

Demand & Accounting Analysis on GB Bicycle Products

Product, Customer and Demand Analysis, Alteryx, IDEA, SAP Analytics

  • Analyzed GB bicycle products for seasonality, customer transitions, and strategic discounts, increasing sales by 10%.
  • Implemented region-specific churn analytics, reducing customer attrition by 15% and optimizing business processes.
  • Used time series forecasting to cut inventory costs by 20% through accurate demand prediction.

GCP and Cloud Projects

AWS SageMaker: Finding Potentially Talented Individuals

AWS CLI, IAM (Identity and Access Management), AWS EC2, AutoTokenizer, AutoModel, Transformers, LightGBM

  • Efficient Data Processing: Integrated AWS SageMaker with a Jupyter Notebook, reducing data processing time by 40%.
  • Advanced AI Techniques Integration: Leveraged AWS SageMaker for BERT text feature extraction, NCF recommendation enhancement, and GridSearchCV hyperparameter optimization, improving analysis outcomes.
  • Model Performance Enhancement: Achieved a 15% increase in model prediction accuracy on AWS SageMaker through hyperparameter fine-tuning and automated model tuning.
  • Streamlined Deployment Process: Automated model deployment workflows on AWS SageMaker, reducing deployment time by 50%.
  • Cost Optimization: Utilized AWS EC2 spot instances, cutting model training expenses by 30% and maximizing computational resource ROI.

Plotly Dash Dashboard in Python, Deployed on Heroku and GCP

Dash Core Components, Plotly Dash, Dash Bootstrap Components, Git, Heroku, GCP

  • Implemented a robust data cleaning process, addressing missing values, type conversions, outliers, and normalization, ensuring a clean and accurate dataset for analysis.
  • Developed over 20 interactive visualizations using Plotly Dash, enabling comprehensive exploration of data correlations, frequencies, and relationships, enhancing understanding and facilitating insightful data exploration.
  • Used the `gcloud` CLI to deploy the Plotly Dash application to Google App Engine (GAE), configuring `app.yaml` for runtime settings and versioning.
  • Leveraged GAE's automatic scaling and custom domain mapping for performance and accessibility.
  • Deployed the Plotly Dash application using the Heroku CLI, specifying dependencies in `requirements.txt` and runtime configuration in `Procfile`.
  • Utilized Heroku's container-based platform and CI/CD pipelines for automated deployment and version control, with secure management of environment variables in the Heroku dashboard.

AWS SageMaker: Digitalization Experience for the Blind

AWS Lambda, AWS S3, Pillow, AWS EC2, ImageDataGenerator, EarlyStopping, AWS SageMaker

  • Baseline CNN Accuracy: Achieved 78.08% accuracy on a validation set of 2,367 images across 3 classes using TensorFlow. Implemented Amazon SageMaker for scalable model training.
  • Fine-Tuned ResNet50: Enhanced classification accuracy by 4% through the fine-tuning of a ResNet50 model, handling 23 million parameters. Utilized Amazon EC2 instances, reducing model optimization time by 25%.
  • Extensive Data Augmentation: Reduced overfitting by 20% through image augmentation techniques in TensorFlow Keras. Leveraged Amazon S3 for data storage, ensuring seamless access to augmented data during training, improving training time by 15%.
  • Efficient Model Training: Decreased training overhead by 30% by optimizing data pipelines and employing batch processing. Utilized AWS Lambda for preprocessing tasks, reducing model training time by 10%.
  • AWS Integration: Integrated AWS services for scalability and cost-effectiveness, reducing deployment time by 30% and infrastructure costs by 20%.
  • Ensemble Model: Averaged the predictions of the baseline, ResNet50, and EfficientNet models; the ensemble showed an improved F1 score of 0.89, demonstrating the benefit of combining multiple models.
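The ensemble described above averages the class probabilities of several models before picking a class. A minimal sketch of that idea, assuming each model outputs one probability per class; the probability values and function names are made up for illustration and are not outputs of the actual models.

```python
def ensemble_average(model_probs):
    """Average per-class probabilities from several models for one image."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    return [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]

def predict_class(probs):
    """Return the index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical softmax outputs for one image from three models (3 classes).
baseline = [0.50, 0.30, 0.20]
resnet50 = [0.20, 0.60, 0.20]
efficientnet = [0.25, 0.55, 0.20]

avg = ensemble_average([baseline, resnet50, efficientnet])
print(predict_class(avg))
```

In this toy case the baseline alone would pick class 0, while the averaged prediction picks class 1, which is the kind of disagreement an ensemble is meant to smooth out.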

Car Rental API

Node.js, Express, JWT, Nodemailer

  • Designed and implemented a Car Rental API, enabling customers to search, reserve cars for specific dates, and receive email confirmations, significantly enhancing user experience and operational efficiency
  • Leveraged Node.js, Express, JWT authentication, and Nodemailer for secure and scalable application architecture, achieving over 2,000 monthly user interactions with comprehensive API documentation using Swagger
  • Search Function Speed: Improved search response times by 8% through backend enhancements, optimizing the user search experience
  • Reservation Process Efficiency: Made reservation and confirmation processes 7% more efficient with refined Node.js and Express implementation.

Predict stock price valuations on a daily, weekly and monthly basis

ARIMA, SARIMA, Prophet, Hyperparameter Optimization (HPO), TensorFlow, Amazon SageMaker, AWS S3, AWS EC2

  • Processing Efficiency: Achieved a 35% reduction in data preparation time by implementing optimized data pipelines and preprocessing techniques using pandas and NumPy.
  • Training Speed and Efficiency: Enhanced training speed by 25% through leveraging TensorFlow and SageMaker's distributed training capabilities with optimized GPU settings, reducing overall model training time significantly.
  • Model Accuracy Improvement: Improved model prediction accuracy by 20% through automated hyperparameter tuning using SageMaker's Hyperparameter Optimization (HPO) feature, leading to more precise and reliable forecasting results.
  • Advanced Forecasting Models: Successfully implemented and tested advanced time series models, including ARIMA, SARIMA, and Prophet, resulting in a 15% increase in forecasting accuracy for complex time series data.

BI Projects

A Data Driven Approach to Safer Roads (Group Project)

Tableau, Python, Geospatial Analysis, Predictive Modeling

  • Advanced Predictive Modeling: Developed machine learning models using XGBoost and Decision Trees, enhanced by class imbalance techniques like SMOTE, achieving high prediction accuracy for traffic accident outcomes.
  • Feature Influence Analysis: Utilized SHAP and LIME methodologies to identify and explain the key features impacting accident severities, such as time, road conditions, and driver impairment, enhancing the interpretability of predictive models.
  • Geospatial Data Visualization: Created dynamic maps to visualize the geographic distribution of fatal accident ratios across Metro Detroit, pinpointing high-risk areas to inform targeted safety measures.
  • Comprehensive Data Processing and Model Evaluation: Processed large datasets with advanced data handling techniques and rigorously evaluated model performance using metrics like accuracy, precision, and F1-score to ensure the reliability of findings for stakeholder decision-making.
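The evaluation metrics named above (accuracy, precision, F1) can be computed from true and predicted labels as in this sketch. The labels are hypothetical and the helper name `binary_metrics` is an assumption; in practice scikit-learn's metrics module would likely be used instead.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels: 1 = fatal accident, 0 = non-fatal.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With imbalanced classes, as in accident severity data, precision, recall, and F1 are usually more informative than accuracy alone.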

Olympian Analytics

Tableau, Statistical Data Analysis

  • Advanced Visualization Techniques: Developed a comprehensive dashboard that illustrates the distribution of medals across regions and genders, using dynamic filters and color coding to improve readability and user interaction.
  • Sport-Specific Medal Insights: Analyzed and displayed the medal tallies between male and female athletes across various sports, pinpointing sports where female athletes are closing the gap, such as swimming, handball, and volleyball.
  • Height and Weight Impact Study: Investigated how physical attributes like height and weight impact medal winnings, revealing that 50% of the medals for male athletes come from those whose height ranges between 166-183 cm, and correlating weight with medal success particularly in weight-sensitive sports.
  • Interactive Data Exploration: Implemented interactive elements in the data visualization dashboard that allow users to filter results based on variables like sex, sport, and year, facilitating deeper and more specific analyses by end-users.
  • Predictive Analytics for Future Games: Used historical data to forecast potential outcomes in future Olympic games, applying machine learning models to predict medal tallies based on trends and athlete characteristics, enhancing strategic decisions for training and development.

Product Sales Analysis by Region and Time Period

Used Excel's features to process and analyze data more efficiently, directly informing strategic decisions and operational efficiency

  • Advanced Data Analytics Implementation: Utilized complex array formulas and multi-criteria lookup functions to merge and analyze disparate data sets, enhancing data accuracy and granularity. This approach enabled the identification of underperforming product lines, leading to a targeted strategy that improved product performance by 30%.
  • Dynamic Reporting Through Pivot Tables: Developed comprehensive pivot table reports that dynamically segmented customer demographics and purchasing behavior, providing real-time insights that increased marketing campaign effectiveness by 45%.
  • Automated Data Insights with Excel Formulas: Implemented automated financial forecasting models using Excel functions like `FORECAST.LINEAR` and `XLOOKUP` to predict next quarter sales, achieving a 95% accuracy rate compared to historical projections. This precise forecasting prompted preemptive adjustments in supply chain operations.
  • Cross-Functional Data Integration: Built lookup tables combining `VLOOKUP` and `INDEX`/`MATCH` across multiple sheets to consolidate financial and operational data, which streamlined month-end reporting and reduced report generation time by 50%.
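For readers unfamiliar with `FORECAST.LINEAR`, it fits a least-squares line through known points and evaluates it at a new x. A minimal Python sketch of the same calculation follows; the quarterly sales figures are hypothetical, and the function name simply mirrors Excel's argument order.

```python
def forecast_linear(x, known_ys, known_xs):
    """Least-squares linear forecast, mirroring FORECAST.LINEAR(x, known_y's, known_x's)."""
    n = len(known_xs)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(known_xs, known_ys))
    sxx = sum((xi - mean_x) ** 2 for xi in known_xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * x

# Hypothetical sales for four past quarters; forecast the fifth.
quarters = [1, 2, 3, 4]
sales = [100.0, 110.0, 120.0, 130.0]
next_quarter = forecast_linear(5, sales, quarters)
print(next_quarter)
```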

Customer Segmentation, Market Trend Adaptation and Profit Margin Improvement

Data Segmentation, Predictive Analytics, Cluster Analysis, Pivot Tables

  • Sales Performance Optimization: Leveraged advanced data segmentation and predictive analytics to identify high-performing products, resulting in a 20% increase in overall sales revenue over the previous quarter.
  • Market Trend Adaptation: Utilized time-series analysis to detect emerging market trends, enabling the adjustment of inventory levels and marketing strategies. This proactive approach led to a 15% reduction in stock-outs and overstock situations, optimizing inventory turnover.
  • Customer Segmentation and Targeted Marketing: Applied cluster analysis to categorize customers based on purchasing behavior and preferences, which informed a targeted marketing campaign that boosted customer engagement by 30% and increased repeat purchases by 25%.
  • Cost Efficiency and Profit Margin Improvement: Implemented cost analysis through pivot tables to track and reduce unnecessary expenditures in the supply chain. These adjustments improved profit margins by 10% across key product lines without compromising quality.

Forecasting and Strategic Planning for Global Superstore

Tableau, Statistical Data Analysis, Raw Data Transformation

  • Innovative Visualization Techniques: Developed an interactive dashboard using Tableau that combines various complex visualizations, including heat maps and animated time series. This dashboard effectively illustrates the dynamics of shipping costs, order priorities, and market distributions, providing stakeholders with a comprehensive tool for data-driven decision-making.
  • Advanced Analytical Models: Used Tableau's calculated fields to create 'Time to Ship' and 'Ship Cost per Unit' with functions such as `DATEDIFF` and `DIV`, sharpening operational insights.
  • Dynamic Reporting and Forecasting: Implemented dynamic heat maps to visualize the distribution of shipping costs and order metrics globally, enabling real-time insights into logistic performance. These visualizations are augmented with animated trends over time, providing clear visual tracking of changes and trends that inform forecasting and strategic planning.
  • Data Integration and Automation: Automated the extraction and transformation of complex data sets from the Global Superstore database into actionable Tableau visualizations. This automation supports ongoing monitoring and quick updates to the dashboard, facilitating up-to-date decision-making with minimal manual intervention.
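The two calculated fields can be mirrored outside Tableau as in the sketch below. The function and parameter names are assumptions based on the field names above, and plain division is used for cost per unit, whereas Tableau's `DIV` returns only the integer part of a division.

```python
from datetime import date

def time_to_ship(order_date, ship_date):
    """Whole days between order and shipment, analogous to a day-level DATEDIFF."""
    return (ship_date - order_date).days

def ship_cost_per_unit(shipping_cost, quantity):
    """Shipping cost divided by units ordered (plain division, an assumption)."""
    return shipping_cost / quantity

# Hypothetical order record.
days = time_to_ship(date(2024, 1, 1), date(2024, 1, 5))
cost = ship_cost_per_unit(30.0, 4)
print(days, cost)
```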

Supply Chain Projects

Tax and Drawback Duties Optimization

Supply Chain Finance, Regulations, Constrained Optimization, Complex Formulas

  • Optimized tax drawback claims, increasing tax refunds by 12%, achieving a $15,000 boost in cash flow.
  • Ensured compliance with tax regulations, reducing non-compliance risks by 20%, avoiding potential penalties.
  • Maximized cost recovery through tax incentives, realizing a $10,000 increase in cost savings.
  • Improved financial benefits, enhancing cash flow management by 18%, resulting in $12,000 profit increase.

Last Mile Delivery for Sheron stores

Hypothesis Testing, Level of Service, Constrained Optimization, Complex Formulas

  • Optimized delivery routes, reducing last mile delivery costs by 18%, saving $10,000 annually on transportation expenses.
  • Enhanced fleet management, increasing delivery efficiency by 20%, reducing fuel consumption by 15%.
  • Reduced transportation costs by 10%, achieving a $5,000 cost saving through efficient route planning.
  • Improved service levels, enhancing delivery speed by 25%, resulting in a 10% increase in customer satisfaction.

Optimization of Supply Chain Operations and BOM

Available to Promise, Supply Chain Operations, ROI, Constrained Optimization, Demand Forecasting

  • Conducted master production schedule (MPS) analysis, starting from 30 units on hand and balancing against forecasted demand, resulting in a projected available balance of 53 units after adjustments and optimizing inventory levels by 30%.
  • Utilized BOM optimization to reduce raw material wastage by 15%, achieving a cost saving of $3,500 per production cycle.
  • Implemented MRP to ensure material availability, reducing stockouts by 20%, improving production efficiency by 18%.
  • Enhanced DRP to align supply chain operations, resulting in a 10% reduction in logistical costs and improving delivery times by 12 hours on average.
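The projected available balance (PAB) in an MPS grid is usually rolled forward period by period. The sketch below uses the simplified rule PAB = previous PAB + MPS receipt - forecast, ignoring customer orders and time fences. The 30 starting units come from the description above; the forecast and MPS quantities are hypothetical and are not meant to reproduce the project's 53-unit result.

```python
def projected_available_balance(on_hand, forecast, mps_receipts):
    """Roll inventory forward: PAB_t = PAB_(t-1) + MPS_t - forecast_t."""
    balances = []
    balance = on_hand
    for demand, receipt in zip(forecast, mps_receipts):
        balance = balance + receipt - demand
        balances.append(balance)
    return balances

# Starting inventory from the project; the period quantities are made up.
on_hand = 30
forecast = [20, 25, 30, 25]
mps_receipts = [0, 50, 0, 50]

pab = projected_available_balance(on_hand, forecast, mps_receipts)
print(pab)
```

A negative value in the result would flag a period where an additional MPS receipt needs to be scheduled.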

Supply Chain Financial Planning through Perpetuity and Annuity

Long-Term Investment Value, ROI, Free Cash Flows, Terminal Value

  • Calculated annuity payments, resulting in a present value savings of $15,000 over a 10-year period.
  • Determined present value of future cash flows, optimizing financial planning by 12%, improving investment returns.
  • Analyzed interest rate impacts, achieving a 10% increase in annuity valuation, enhancing financial stability.
  • Applied financial planning techniques, resulting in a $25,000 increase in long-term investment value.
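The valuations above rest on two standard present-value formulas: an ordinary annuity, PV = C(1 - (1 + r)^-n) / r, and a perpetuity, PV = C / r. A minimal sketch with hypothetical inputs (a payment of 1,000 per period at 5%); the function names are illustrative.

```python
def pv_annuity(payment, rate, periods):
    """Present value of an ordinary annuity (payments at the end of each period)."""
    return payment * (1 - (1 + rate) ** -periods) / rate

def pv_perpetuity(payment, rate):
    """Present value of a level payment received forever."""
    return payment / rate

# Hypothetical inputs: 1,000 per year, 5% discount rate, 10 years.
annuity_value = pv_annuity(1000, 0.05, 10)
perpetuity_value = pv_perpetuity(1000, 0.05)
print(round(annuity_value, 2), round(perpetuity_value, 2))
```

The perpetuity formula is also the usual basis for a terminal value estimate when cash flows are assumed constant beyond the forecast horizon.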

Optimization-Based Procurement and Supplier Selection

Strategic Sourcing, Bulk Purchasing, Volume Optimization

  • Optimized procurement strategy, reducing procurement costs by 15%, achieving a $30,000 annual saving.
  • Selected suppliers based on cost, quality, and delivery performance, reducing supplier costs by 12%, improving overall supply chain efficiency.
  • Maintained optimal inventory levels, reducing stockholding costs by 18%, saving $7,500 annually.
  • Achieved cost savings through strategic sourcing and bulk purchasing, increasing profit margins by 8%.

Certifications

Technical proficiency

Programming Languages:

Python: 9
SQL: 8.5
Shell Scripting: 10
R: 8.5
Spark: 7
SAS: 7.5
HTML: 9
CSS: 8.5
Node.js: 8.5

Databases:

Hadoop: 7
MongoDB: 7.5
AWS: 9
Oracle: 8

Cloud Platforms:

Snowflake: 9
Databricks: 7.5
AWS: 9
GCP: 9
Azure: 7
Heroku: 9

Version Control:

Git: 9
GitLab: 6.5
Confluence: 9

Visualization & Reporting:

SAP Analytics: 9
Tableau: 9
Streamlit: 10
Alteryx: 10
IDEA: 10
Plotly Dash: 8

Data Science & Analytics:

SageMaker: 8.5
Scikit-learn: 8
TensorFlow: 7.5
NumPy: 9
Pandas: 8.5
Matplotlib: 8
Seaborn: 8
PyTorch: 6.5

Microsoft Excel-Solver:

Pivot Tables: 8.5
Complex Formulas: 10
Descriptive Statistics: 9
Hypothesis Testing: 10
Distributions: 10
Regression Analysis: 10
Optimization: 10
Network Design: 8
Cost/Profit: 8.5
ROI: 8
Last Mile Design: 8.5

About Me


My ability to curate data and software interactions led me to move into data science and pursue a mid-level role backed by a data science degree. I am competent across the data engineering and analytics stack, including Python, SQL, and machine learning. My experience spans statistical data analysis, database management, machine and deep learning, and software development, so I can provide robust, fact-based solutions. Throughout my work, I have created and improved machine learning models, APIs, and data pipelines. I have also managed large-scale data retrieval, created dynamic multi-page KPI dashboards across platforms, and overcome some limitations of BI tools in a variety of projects.

Further, I would bring the following strengths to your team:

  • Seven-plus years of experience developing data-driven test tools for embedded and industrial software.
  • Developing and optimizing data pipelines using SQL and Python.
  • Implementing machine learning models with tools like Scikit-learn, TensorFlow, and PyTorch.
  • Building interactive analytics dashboards using HTML, CSS, and frameworks like Streamlit and Plotly Dash.
  • Deploying and maintaining ML/DL models following MLOps practices.
  • Expertise in cloud platforms such as AWS, Snowflake, and GCP, and proficiency in tools like Alteryx, Tableau, and Power BI to create scalable, efficient solutions that support data-driven decision-making.
  • Assisting with implementation, maintenance, and documentation of internal data, analytics, and reporting processes.
Education

    Wayne State University | MS in Data Science | GPA: 3.72 | Aug 2023

    Relevant Courses: Data Science Strategy and Leadership, Accounting Data Analytics, Intelligence Analytics, Data Science and Analytics, Computing Platforms for Data Science, Marketing Analytics, Supply Chain Analytics, Strategic Procurement, Information Analytics: Inbound Information Technology

    University Website

    MIT-CTL (MITx) | MicroMasters in Supply Chain Management | GPA: 3.5 | Aug 2020

    Relevant Courses: Supply Chain Analytics, Supply Chain Fundamentals, Supply Chain Design, Supply Chain Dynamics, Supply Chain Technology and Systems, Supply Chain Finance

    University Website

    JNTU College Of Engineering | M.Tech in Embedded Systems | GPA: 3.8 | Jan 2015

    Relevant Courses: Micro Controllers for Embedded System Design, Digital System Design, Embedded C, Digital Control Systems, Robotics, CPLD & FPGA Architecture and Applications, Real Time Operating Systems, System on Chip Architecture, Network Security & Cryptography, DSP Processors & Architectures, Sensors & Actuators, Design of Fault Tolerance Systems

    University Website

    Contact

    valapati.ds@gmail.com

    313-424-5423

    GitHub

    LinkedIn