Fitbit user insights for guided decisions

Summary: This work is the capstone project, a case study from the eighth course, “Google Data Analytics Capstone”, of the “Google Data Analyst” program.
Tools: BigQuery, RStudio, Tableau, Spreadsheets
Repository: Link
Skills: R, SQL, Markdown
Type: Data Mining

Bellabeat is a successful small high-tech company that makes health products for women. Its executives believe that analyzing competitor device data could help unlock growth opportunities. Our task is to find insights into user behavior in the data and make recommendations.

To find new opportunities to grow the business, we will analyze competitor smart device usage data to gain insight into how the devices are used. We will then apply these insights to one Bellabeat product and make recommendations.

Key Questions

  1. What are some trends in smart device usage?
  1. How could these trends apply to Bellabeat customers?
  1. How could these trends help influence Bellabeat marketing strategy?

### 1.3. Stakeholders


  1. The dataset we are recommended to start from is public. It is a set of data on consumption habits collected through “Amazon Mechanical Turk” between 10/4/2016 and 12/05/2016, in which the 30 selected respondents agreed to share the data from their wearable devices (biometrics, minute-level output for physical activity, heart rate, and sleep monitoring) for prospective study purposes.
  1. The information is stored in long format, although some specific tables are arranged in wide format. In particular, the most important tables, those that aggregate the information over larger time intervals (such as “dailyActivity_merged”), are in long format with respect to the date.
  1. The data does not meet the ROCCC parameters. The information is not reliable: beyond user ID numbers, no other attributes are specified, so we cannot tell whether the information contains some kind of bias. For example, we do not know the gender of the users, so the survey may have been answered only by men.

In that case, what would be the point of applying the discoveries made here to a smart device designed for women? Likewise, we do not know the ethnicity, the nationality and, most importantly, the age of the respondents.

As for when the dataset was created, it is not current: it dates from 2016, six years ago. In technology, six years is prehistory.

Last but not least, the information is not original: the dataset was ‘retouched’ before being published on the “Kaggle” platform.

For all these reasons, we cannot consider the information reliable at all.
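To make the long versus wide distinction mentioned above concrete, here is a minimal sketch in Python (used here only for illustration; the project itself works in R and SQL). The column names `Id`, `ActivityDate` and `TotalSteps`, as well as every value, are assumptions made up for the example and are not taken from the dataset.

```python
from collections import defaultdict

# Made-up long-format rows: one row per (user, date) observation.
# Column names and values are hypothetical.
long_rows = [
    {"Id": "A", "ActivityDate": "2016-04-12", "TotalSteps": 9000},
    {"Id": "A", "ActivityDate": "2016-04-13", "TotalSteps": 7500},
    {"Id": "B", "ActivityDate": "2016-04-12", "TotalSteps": 12000},
    {"Id": "B", "ActivityDate": "2016-04-13", "TotalSteps": 3000},
]


def to_wide(rows, key="Id", column="ActivityDate", value="TotalSteps"):
    """Pivot long rows into one entry per key, with one column per date."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[key]][row[column]] = row[value]
    return {k: dict(v) for k, v in wide.items()}


wide_rows = to_wide(long_rows)
print(len(long_rows), "long rows become", len(wide_rows), "wide rows")
```

In long format each new date adds rows, while in wide format it adds columns, which is why tables grouped by day are usually easier to filter and aggregate when kept long.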

  1. Regarding data integrity, the datasets are in .csv format and meet the integrity requirements with a fair level of confidence; after all, they were obtained from a platform whose members are passionate about data science. Nevertheless, we confirmed the integrity by analyzing the dataset with some R functions.
  1. Although the data is clearly compromised, we can still draw some conclusions that can help us meet our goals.
  1. Under normal circumstances, a meeting with the stakeholders would have to be held. They would need to agree to carry out their own survey and to provide primary data and information, that is, data already in the company's possession.

Also, if it did not exceed the scope and requirements of this work, I would propose incorporating other open data, such as this Apple dataset: Apple Watch and Fitbit data, which is much more complete and more in tune with the ROCCC parameters.
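The integrity check mentioned earlier was carried out in R, and its exact steps are not shown here. As a rough illustration of the kind of checks such a review might include (row count, number of distinct users, rows with empty fields, exact duplicate rows), here is a minimal sketch in Python. The sample data and the column names `Id`, `ActivityDate` and `TotalSteps` are made up for the example; they are assumptions, not the project's actual code or data.

```python
import csv
import io

# Made-up sample standing in for a .csv table from the dataset.
# Column names and values are hypothetical.
SAMPLE_CSV = """Id,ActivityDate,TotalSteps
1001,4/12/2016,9000
1001,4/13/2016,7500
1002,4/12/2016,
1002,4/12/2016,
1003,4/12/2016,8100
"""


def integrity_report(text):
    """Count rows, distinct users, rows with empty fields, and exact duplicates."""
    rows = list(csv.DictReader(io.StringIO(text)))
    seen = set()
    duplicates = 0
    missing = 0
    for row in rows:
        fingerprint = tuple(row.items())
        if fingerprint in seen:
            duplicates += 1  # identical to a row already read
        seen.add(fingerprint)
        if any(value in ("", None) for value in row.values()):
            missing += 1  # at least one empty or absent field
    return {
        "rows": len(rows),
        "distinct_ids": len({row["Id"] for row in rows}),
        "rows_with_missing": missing,
        "duplicate_rows": duplicates,
    }


report = integrity_report(SAMPLE_CSV)
print(report)
```

A distinct-user count is worth including in a check like this because it can be compared directly with the number of respondents the dataset description claims.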