Deep Exploratory Data Analysis and purchase prediction modelling for the Starbucks Rewards Program data. income (numeric): numeric column with some null values, corresponding to the rows where age is 118. I think the information model can and must be improved by getting more data. Similarly, we merge the portfolio dataset as well. Gender does influence how much a person spends at Starbucks. From the "Average offer received by gender" plot, we see that the average number of offers received per person is nearly the same across genders. The goal of this project was not defined by Udacity. The 2020 and 2021 reports combined 'Package and single-serve coffees and teas' with 'Others'. Expanding a bit more on this. I divided the population in the datasets into 4 distinct categories (types) and evaluated them against each other. I performed an exploratory data analysis on the datasets. The data has some null values. Today, with stores around the globe, the Company is the premier roaster and retailer of specialty coffee in the world. Internally, they provide a full picture of their data that is available to all levels of retail leadership and partners to give them a greater sense of the business and encourage accountability for the P&L of that store. To a smaller extent, higher age and income are associated with the M gender, and lower age and income with the F and O genders. Sales in coffee grew at a high single-digit rate, supported by strong momentum for Nescafé and Starbucks at-home products.
Mobile users may be more likely to respond to offers. I wanted to see if I could find out who these users are and whether we could avoid or minimize this. The goal of this project is to combine transaction, demographic, and offer data to determine which demographic groups respond best to which offer type. By gender, 57.2% are men, 41.4% are women, and 1.4% are in the other category. Dataset file: http://s3.amazonaws.com/radius.civicknowledge.com/chrismeller.github.com-starbucks-2.1.1.csv; package repository: https://github.com/metatab-packages/chrismeller.github.com-starbucks.git. There are only 4 demographic attributes that we can work with: age, income, gender, and membership start date. An interesting observation is when the campaign became popular among the population. We evaluate the accuracy based on correct classification. Therefore, I want to treat the list of items as 1 thing. As a part of Udacity's Data Science nano-degree program, I was fortunate enough to have a look at Starbucks' sales data. PC1 -- PC4 also account for the variance in the data, whereas PC5 is negligible. Revenue of $8.7 billion and adjusted . Rather, the question should be: why were our offers being used without being viewed? We can say that, given an offer, the chance of redeeming the offer is higher among the Female and Other genders. Type-1: These are the ideal consumers. Brazilian Trade Ministry data showed coffee exports fell 45% in February, and broker HedgePoint cut its projection for Brazil's 2023/24 arabica coffee production to 42.3 million bags from 45.4 million.
Let's look at the next question. Thus, I wrote a function for categorical variables whose order does not need to be considered. time (int): time in hours since the start of the test. Of course, became_member_on plays a role, but income scored the highest rank. Here are the five business questions I would like to address by the end of the analysis. Q3: Do people generally view and then use the offer? Mobile users are more likely to respond to offers. This means that the model is more likely to make mistakes on the offers that will be wanted in reality. Growth was strong across all channels, particularly in e-commerce and pet specialty stores. Coffee exports from Colombia, the world's second-largest producer of arabica coffee beans, dropped 19% year-on-year to 835,000 in January. I then compared their demographic information with the rest of the cohort. Thus, the model can help minimize wasted offers. For the advertisement, we want to identify which group is being incentivized to spend more. Starbucks purchases Peet's: 1984. During that same year, Starbucks' total assets. Starbucks goes public: 1992. It generates the majority of its revenues from the sale of beverages, which mostly consist of coffee beverages. In other words, one line of logic was to identify the loss, while the other was to measure the increase.
Built for multiple linear regression and multivariate analysis, the Fish Market Dataset contains information about common fish species in market sales. This is a decrease of 16.3 percent, or about 10 million units, compared to the same quarter in 2015. For BOGO and Discount we have a reasonable accuracy. Starbucks. (November 18, 2022). Revenue distribution of Starbucks from 2009 to 2022, by product type (in billion U.S. dollars) [Graph]. Statista. https://www.statista.com/statistics/219513/starbucks-revenue-by-product-type/ (last visited March 01, 2023). Last updated on December 28, 2021 by Editorial Team. So they should be comparable. Through our unwavering commitment to excellence and our guiding principles, we bring the unique Starbucks Experience to life for every customer through every cup. This gives us an insight into what the most significant contributor to the offer is. Looking at the laggard features, I notice that mobile ranks highest among all the channels, which is interesting, and we should not discard this information. The data is collected via the Starbucks rewards mobile app, and the offers were sent out once every few days to the users of the mobile app. To avoid or reduce the situation of an offer being used without being viewed, I suggest the following: Another suggestion I have is that I believe there is a lot of potential in the discount offer. Later, I will try to improve this.
This is the primary distinction represented by PC0. Prime cost (cost of goods sold + labor cost) is generally the most reliable data that's initially tied to restaurant profitability, as it can represent more than 60% of every sale in expenses. I wanted to see the influence of these offers on purchases. Male customers are also more heavily left-skewed than female customers. Therefore, I did not analyze the informational offer type. The reason is that we don't have many features in the dataset.
For example, if I used: 0: 2017, 1: 2018, 2: 2015, 3: 2016, 4: 2013. Perhaps more data is required to get a better model. So classification accuracy should improve with more data available. Let's get started! On average, women spend around $6 more per purchase at Starbucks. I picked out the customer IDs whose first event for an offer was "offer received", followed by "offer completed" as the second event. Some users might not receive any offers during certain weeks. I successfully answered all the business questions that I asked. Q2: Do different groups of people react differently to offers?
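The selection mentioned above (customers whose first event for an offer was "offer received" and whose second was "offer completed", with no view in between) could be sketched roughly as follows. This is a minimal illustration, not the author's code: the column names `person`, `offer_id`, `event`, and `time`, the helper name `find_unviewed_completions`, and the toy rows are all assumptions.

```python
import pandas as pd

# Toy event log (hypothetical rows; the real data would come from the transcript file).
events = pd.DataFrame({
    "person":   ["a", "a", "a", "b", "b", "c", "c"],
    "offer_id": ["o1", "o1", "o1", "o1", "o1", "o2", "o2"],
    "event":    ["offer received", "offer viewed", "offer completed",
                 "offer received", "offer completed",
                 "offer received", "offer viewed"],
    "time":     [0, 6, 12, 0, 18, 0, 6],
})


def find_unviewed_completions(df):
    """Return (person, offer_id) pairs whose first two events are
    'offer received' followed directly by 'offer completed'."""
    # Order events chronologically within each person/offer pair.
    ordered = df.sort_values(["person", "offer_id", "time"], kind="stable")
    # For each pair, check whether the first two events match the pattern.
    is_wasted = ordered.groupby(["person", "offer_id"])["event"].apply(
        lambda s: list(s.head(2)) == ["offer received", "offer completed"]
    )
    return is_wasted[is_wasted].index.tolist()


wasted_pairs = find_unviewed_completions(events)
```

With the toy rows above, only customer `b` completes offer `o1` without viewing it, so that pair is the one flagged; the demographics of such customers can then be compared with the rest of the cohort.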
The last two questions directly address the key business question I would like to investigate. Starbucks Coffee Company: Store Counts by Market, Q4 FY18 through Q4 FY19. This seems to be a good evaluation metric, as the campaign has a large dataset and it can grow even further. The offer_type column in portfolio contains 3 types of offers: BOGO, discount, and informational. To better understand Type 1 and Type 2 errors, here is another article that I wrote earlier with more details. I left-merged this dataset with the profile and portfolio datasets to get the features that I need. Age, for instance, has a very high score too. From the transaction data, let's try to find out how gender, age, and income relate to the average transaction amount. The scores for the BOGO and Discount type models were not bad, however, since we had more data for these than for Information type offers. We can know how confident we are about a specific prediction. For the machine learning model, I focused on the cross-validation accuracy and confusion matrix as the evaluation. Once every few days, Starbucks sends out an offer to users of the mobile app. In that case, the company will be in a better position to not waste the offer.
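As a rough illustration of the left merge mentioned above, the sketch below joins a transcript-like table to profile and portfolio tables so that every event row is kept. The key columns (`person`, `offer_id`, `id`), the helper name `merge_features`, and the sample values are assumptions made for this example, not taken from the original data files.

```python
import pandas as pd

# Hypothetical miniature versions of the three datasets.
transcript = pd.DataFrame({
    "person":   ["p1", "p2", "p3"],
    "offer_id": ["o1", "o2", "o1"],
    "event":    ["offer completed", "offer viewed", "offer received"],
})
profile = pd.DataFrame({
    "id":     ["p1", "p2"],          # p3 is deliberately missing
    "age":    [35, 52],
    "income": [60000.0, 85000.0],
    "gender": ["F", "M"],
})
portfolio = pd.DataFrame({
    "id":         ["o1", "o2"],
    "offer_type": ["bogo", "discount"],
    "difficulty": [10, 7],
})


def merge_features(transcript, profile, portfolio):
    """Left-merge demographic and offer attributes onto each event row."""
    merged = transcript.merge(
        profile.rename(columns={"id": "person"}), on="person", how="left"
    )
    merged = merged.merge(
        portfolio.rename(columns={"id": "offer_id"}), on="offer_id", how="left"
    )
    return merged


merged = merge_features(transcript, profile, portfolio)
```

A left merge keeps all three event rows; the customer without a profile entry simply gets missing values for age, income, and gender, which can then be handled during cleaning.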
So, we have failed to significantly improve the information model. The data file contains 3 different JSON files. I then drop all other events, keeping only the wasted label. The offer ending with 2a4 was also 45% larger than the normal distribution. One way was to turn each channel into a column and use 1/0 to represent whether that row used this channel. I also highlighted the most difficult part of handling the data and how I approached the problem. Duplicates: There were no duplicate columns. A list of Starbucks locations, scraped from the web in 2017. chrismeller.github.com-starbucks-2.1.1. A transaction can be completed with or without the offer being viewed. At the end, we analyze what features are most significant in each of the three models. Initially, the company was known as "Starbucks Coffee, Tea, and Spices" before being renamed the Starbucks Coffee Company. k-means performance improves as the number of clusters increases. We will also try to segment the dataset into these individual groups. In the end, the data frame looks like this: I used GridSearchCV to tune the C parameter in the logistic regression model. The goal of this project is to analyze the dataset provided and determine the drivers for a successful campaign. An in-depth look at Starbucks sales data!
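The channel encoding described above (one 1/0 column per channel) might look something like the sketch below. It assumes, for illustration only, that each row stores its channels as a Python list in a column named `channels`; the helper name `encode_channels` and the sample offers are hypothetical.

```python
import pandas as pd

# Hypothetical offers, each with a list of channels it was sent through.
portfolio = pd.DataFrame({
    "offer_id": ["o1", "o2", "o3"],
    "channels": [
        ["email", "mobile"],
        ["web", "email", "mobile", "social"],
        ["email"],
    ],
})


def encode_channels(df, column="channels"):
    """Replace a list-valued column with one 1/0 indicator column per item."""
    # Collect every distinct channel that appears in any row.
    all_channels = sorted({ch for row in df[column] for ch in row})
    out = df.drop(columns=[column]).copy()
    for ch in all_channels:
        # 1 if this row used the channel, otherwise 0.
        out[ch] = df[column].apply(lambda row, ch=ch: int(ch in row))
    return out


encoded = encode_channels(portfolio)
```

This treats the channels as unordered categories, so no artificial ranking between, say, `web` and `email` is introduced.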
This dataset is composed of survey questions answered by over 100 respondents about their buying behavior at Starbucks. Income is shown in Malaysian Ringgit (RM). Context: predict behavior to retain customers. PCA and K-means analyses are similar. PC4: primarily represents age and income. Here's my thought process when cleaning the data set: I used 3 different metrics to measure the model: cross-validation accuracy, precision score, and confusion matrix. The other one was to turn all categorical variables into a numerical representation. offer_type (string): type of offer, i.e., BOGO, discount, informational; difficulty (int): minimum required spend to complete an offer; reward (int): reward given for completing an offer; duration (int): time for the offer to be open, in days; became_member_on (int): date when the customer created an app account; gender (str): gender of the customer (note some entries contain O for other rather than M or F); event (str): record description (i.e., transaction, offer received, offer viewed, etc.). This dataset is a simplified version of the real Starbucks app because the underlying simulator only has one product, whereas Starbucks sells dozens of products. An offer can be merely an advertisement for a drink or an actual offer such as a discount or BOGO (buy one get one free). Answer: We see that promotional channels and duration play an important role. PC0: The largest bars are for the M and F genders.
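To make the evaluation setup concrete, here is a minimal sketch of tuning the `C` parameter of a logistic regression with `GridSearchCV` and then reporting the three metrics named above (cross-validation accuracy, precision score, confusion matrix). The synthetic features, the grid of `C` values, and the 5-fold setting are assumptions for illustration; they are not the values used in the original analysis.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, precision_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the prepared feature matrix and offer-outcome labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=300) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Search over the inverse regularization strength C with 5-fold cross-validation.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Metric 1: mean cross-validated accuracy of the best C.
cv_accuracy = search.best_score_
# Metrics 2 and 3: precision and confusion matrix on the held-out test set.
y_pred = search.predict(X_test)
precision = precision_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
```

The confusion matrix separates the two kinds of mistakes (predicting an offer will be used when it is not, and the reverse), which accuracy alone does not show.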
Starbucks is the third largest fast food restaurant chain. In addition, that column was a dictionary object. It doesn't make much sense to me to withdraw an offer just because the customer has a 51% chance of wasting it. The testing score of the Information model is significantly lower than 80%. In particular, higher-than-average age and lower-than-average income.