
Data analyst skilled in Python, Power BI, and SQL @chrissny88
Cred-Pay: Organise the data so that it is easily accessible and helps the company predict whether a credit card application can be accepted.
MovieLens: A detailed analysis of the data to surface meaningful insights that will help the company address its users more effectively.
Honey Production: To visualize how honey production has changed over the years (1998–2016) in the United States.
Uber: Extracting actionable insights from data that will help grow the business.
FoodHub: Analysis of data to get a fair idea of the demand for different restaurants, which will help in enhancing the customer experience. Skills and Tools: Exploratory Data Analysis (Variable Identification, Univariate analysis, Bivariate analysis), Python.
Medicon: Analyzing random samples collected from a batch to infer its quality (whether a dose will do a satisfactory job or not) and its time of effect.
Talent Hunt Examination: A2Z institute wants to provide an estimate of the average score obtained by aspirants who enroll in their program. Keeping in mind the variation in scores every year, the institute wants to provide a more reliable estimate of the average score using a range of scores instead of a single estimate.
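A range estimate like the one described for Talent Hunt Examination can be sketched as a confidence interval for the mean. This is a minimal standard-library illustration, not the project's code: the scores are made-up, the function name is hypothetical, and it uses a normal approximation (for small samples a t-based interval would be wider and more appropriate).

```python
import math
import statistics


def mean_confidence_interval(scores, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `scores`."""
    n = len(scores)
    mean = statistics.fmean(scores)
    # Standard error of the mean, based on the sample standard deviation.
    sem = statistics.stdev(scores) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem


# Hypothetical scores, for illustration only.
sample_scores = [62, 71, 68, 75, 80, 66, 73, 69, 77, 70]
low, high = mean_confidence_interval(sample_scores)
print(f"95% CI for the mean score: ({low:.1f}, {high:.1f})")
```

Raising `confidence` widens the interval, which is the trade-off between reliability and precision that the project description refers to.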
Diet plans: To understand the effectiveness of each diet for weight loss, we were asked to perform a statistical analysis to find evidence of whether the mean weight losses under the three diet plans are significantly different.
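Comparing three group means is typically done with a one-way ANOVA; the blurb does not name the test used, so treat that choice as an assumption. The sketch below computes the F statistic with the standard library only. The weight-loss values and the function name are hypothetical.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical weight losses (kg) for three diets, for illustration only.
diet_a = [3.1, 2.8, 3.5, 4.0, 2.9]
diet_b = [4.2, 4.8, 5.1, 4.5, 5.0]
diet_c = [3.0, 3.3, 2.7, 3.6, 3.1]
f_stat, df_b, df_w = one_way_anova_f([diet_a, diet_b, diet_c])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Turning F into a p-value needs the F distribution, which the standard library does not provide; in practice `scipy.stats.f_oneway` returns both the statistic and the p-value.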
ENews express: This project used statistical analysis, A/B testing, and visualization to decide whether the new landing page of an online news portal (E-news Express) is effective enough to gather new subscribers. The simulated dataset includes key metrics, such as converted status and time spent on the page, that help determine the effectiveness of the new landing page. The dependence of conversion on the preferred language is also analyzed. Skills and Tools: Hypothesis Testing, A/B testing, Data Visualization, Statistical Inference.
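One common way to test whether a new page converts better is a one-sided two-proportion z-test. The sketch below assumes that test (the blurb does not say which one was used); the counts and the function name are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_new, n_new, conv_old, n_old):
    """One-sided z-test of H1: new-page conversion rate > old-page rate."""
    p_new, p_old = conv_new / n_new, conv_old / n_old
    # Pooled conversion rate under the null hypothesis of equal rates.
    pooled = (conv_new + conv_old) / (n_new + n_old)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_new + 1 / n_old))
    z = (p_new - p_old) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical counts: conversions and visitors per landing page.
z, p_value = two_proportion_z_test(conv_new=33, n_new=50, conv_old=21, n_old=50)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
```

A p-value below the chosen significance level (often 0.05) would count as evidence that the new page converts at a higher rate.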
ReCell: Analyze the used devices dataset, build a model which will help develop a dynamic pricing strategy for used and refurbished devices, and identify factors that significantly influence the price. # Skills and Tools: EDA, Linear Regression, Linear Regression assumptions, Business insights and recommendations.
INN Hotels: Analyze the data of INN Hotels to find which factors have a high influence on booking cancellations, build a predictive model that can predict which booking is going to be canceled in advance, and help in formulating profitable policies for cancellations and refunds. # Skills and Tools: EDA, Data Pre-processing, Logistic regression, Multicollinearity, Finding optimal threshold using AUC-ROC curve, Decision trees, Pruning.
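Choosing a classification threshold from the ROC curve can be done in several ways; the sketch below assumes Youden's J statistic (TPR minus FPR), which may differ from the criterion used in the project. Labels, scores, and the function name are hypothetical.

```python
def best_threshold_youden(y_true, y_score):
    """Pick the score threshold that maximizes Youden's J = TPR - FPR."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_t, best_j = None, -1.0
    for t in sorted(set(y_score)):
        # Predict "canceled" when the score is at or above the threshold.
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical labels (1 = canceled) and predicted cancellation probabilities.
y_true = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]
y_score = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.65, 0.70, 0.85, 0.90]
threshold, j_stat = best_threshold_youden(y_true, y_score)
print(f"threshold = {threshold}, Youden's J = {j_stat:.2f}")
```

With scikit-learn, `sklearn.metrics.roc_curve` returns the false positive rates, true positive rates, and thresholds needed for the same calculation.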
EasyVisa: Analyze the data of visa applicants, build a predictive model to facilitate the process of visa approvals, and, based on the factors that significantly influence visa status, recommend a suitable profile for applicants whose visa should be certified or denied. # Skills and Tools: EDA, Data Preprocessing, Customer Profiling, Bagging Classifiers (Bagging and Random Forest), Boosting Classifiers (AdaBoost, Gradient Boosting, XGBoost), Stacking Classifier, Hyperparameter Tuning using GridSearchCV, Business insights
ReneWind: "ReneWind" is a company working on improving the machinery and processes involved in the production of wind energy using machine learning, and it has collected sensor data on generator failures in wind turbines. The objective is to build various classification models, tune them, and find the best one for identifying failures, so that a generator can be repaired before it breaks and the overall maintenance cost of the generators can be brought down. # Skills and Tools: Up and downsampling, Regularization, Hyperparameter tuning, Business insights
Trade&Ahead: Analyze the stock data, group the stocks based on the attributes provided, and share insights about the characteristics of each group. # Skills and Tools: EDA, K-means Clustering, Hierarchical Clustering, Cluster Profiling, Business Insights
Lost to follow up: Analysing patient data to find the association between loss to follow-up (LTFU) and data entry delay. For each quarter, program, and health center, what are the LTFU rate and the data entry delay? For each program, how are LTFU and data entry delay associated? # Skills and Tools: EDA, matplotlib, data analysis, data visualization, pandas
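The per-quarter, per-program, per-center summary described for the Lost to follow up project amounts to a grouped aggregation. The project lists pandas, where this would be a `groupby`; the sketch below uses only the standard library so it runs anywhere. The records, field names, and program labels are hypothetical and do not reflect the project's schema.

```python
from collections import defaultdict

# Hypothetical records; field names are assumptions, not the real schema.
records = [
    {"quarter": "2022Q1", "program": "HIV", "center": "A", "ltfu": True, "delay_days": 12},
    {"quarter": "2022Q1", "program": "HIV", "center": "A", "ltfu": False, "delay_days": 3},
    {"quarter": "2022Q1", "program": "NCD", "center": "B", "ltfu": False, "delay_days": 5},
    {"quarter": "2022Q2", "program": "HIV", "center": "A", "ltfu": True, "delay_days": 20},
]


def summarize(rows):
    """Return {(quarter, program, center): (ltfu_rate, mean_delay_days)}."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["quarter"], row["program"], row["center"])].append(row)
    return {
        key: (
            sum(r["ltfu"] for r in members) / len(members),
            sum(r["delay_days"] for r in members) / len(members),
        )
        for key, members in groups.items()
    }


summary = summarize(records)
for key, (rate, delay) in sorted(summary.items()):
    print(key, f"LTFU rate={rate:.2f}", f"mean delay={delay:.1f} days")
```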
Cyarubare Health center: A new maternity wing was built at Cyarubare health center and came into use in November 2020. I assessed whether there was a significant increase in deliveries at Cyarubare and whether there was also an increase compared to other health facilities. # Skills and Tools: Data Preprocessing, EDA, data visualization, Hypothesis testing, descriptive statistics
Patients Forecasting: I performed time series forecasting using a seasonal autoregressive integrated moving average (SARIMA) model on different indicators to predict the number of patients to expect in 2023. The analysis involves building models from historical data and using them to make observations and drive future strategic decision-making. # Skills and Tools: Data Preprocessing, Data visualization, SARIMA
Insurance Claim Prediction: CarIns is a startup that provides insurance for cars. It is one of the best car insurance brands, known for the highest claim settlement ratio. It was launched in October 2020 and acquired its initial policyholders by providing a hassle-free claim process, instant policy issuance, and claim settlements at minimum coverages. As a fast-growing startup, the company would like to optimize the cost of insurance by identifying the policyholders who are more likely to claim in the next 6 months. # Skills and Tools: EDA, Data Preprocessing, Customer Profiling, Bagging Classifiers (Bagging and Random Forest), Boosting Classifiers (AdaBoost, Gradient Boosting, XGBoost), Stacking Classifier, Hyperparameter Tuning using GridSearchCV
Bank Churn Prediction: Build a neural-network-based classifier that can determine whether a customer will leave the bank in the next 6 months. # Skills and Tools: EDA, Data Preprocessing, Customer Profiling, Keras, TensorFlow, ANN
Plant Seedling Classification: Building a Convolutional Neural Network to classify plant seedlings into their respective categories. # Skills and Tools: Image Preprocessing, Computer Vision, Keras, CNN
Twitter US Airline Sentiment: Classifying the sentiment of tweets as positive, neutral, or negative. # Skills and Tools: Vectorization (Count vectorizer & TF-IDF vectorizer), Sentiment analysis, Parameter tuning, Confusion-matrix-based model evaluation