Scenario:
During one of the many twists and turns of the COVID-19 pandemic, a provincial government in Canada implemented a rule limiting the number of customers in stores to no more than 50% of the store’s capacity.
To assess compliance and customer behavior, a study was conducted to determine the average customer volume, expressed as a percentage of store capacity, on a specific day.
Study Design:
- A random sample of 50 stores was selected.
- For each store, the percentage of capacity used by customers on that day was recorded.
- These values were then averaged to determine the overall mean customer volume.
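The three design steps above can be sketched in R. This is only an illustration: the occupancy values are simulated, the uniform distribution on 0 to 50 is an assumption made for the sketch, and the names `n_stores`, `occupancy`, and `mean_occupancy` are hypothetical rather than taken from the study.

```r
# Hypothetical data: simulate occupancy (% of capacity) for 50 sampled stores.
# The uniform distribution on [0, 50] is an assumption, not the study's data.
set.seed(42)                       # arbitrary seed, for reproducibility only
n_stores <- 50
occupancy <- runif(n_stores, min = 0, max = 50)

# Average the recorded values to get the overall mean customer volume
mean_occupancy <- mean(occupancy)
mean_occupancy
```

With real data, `occupancy` would simply be the vector of the 50 recorded percentages.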
Random Variable and Data:
- The random variable in this context is the percentage of store capacity occupied by customers on a given day.
- Before the data are collected, the random variable can take any value in its range of possible outcomes: essentially, any value between 0% and 50% (due to the restriction).
- Once the study is conducted and values are recorded, the random variable takes on specific values; these are the data.
A sample space is created by categorizing stores based on the percentage of their capacity occupied by customers. The categories are defined as follows:
- Category 1: More than 40% and up to 50%
- Category 2: More than 30% and up to 40%
- Category 3: More than 20% and up to 30%
- Category 4: More than 10% and up to 20%
- Category 5: 10% or less
These categories allow the data to be grouped and analyzed more easily, providing insights into how full stores were during the observed period.
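One way to apply this grouping in R is with `cut()`, sketched below. The six occupancy values are made up for illustration, and the variable names are hypothetical. The intervals are closed on the right, which matches "more than 40% and up to 50%".

```r
# Hypothetical occupancy percentages for six stores (illustrative values only)
occupancy <- c(5, 10, 18, 30, 42, 50)

# Bin into the five categories. Intervals are closed on the right, so 10 falls
# in Category 5 and 30 in Category 3; include.lowest = TRUE also places 0%
# in Category 5.
category <- cut(occupancy,
                breaks = c(0, 10, 20, 30, 40, 50),
                labels = c("5", "4", "3", "2", "1"),
                right = TRUE, include.lowest = TRUE)

# Count how many stores fall in each category
table(category)
```

Dividing the counts by the number of stores, for example with `prop.table(table(category))`, would give the relative frequency of each category.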
Sample Space: S = {1, 2, 3, 4, 5}
Probabilities:
P(1) = 0.05
P(2) = 0.10
P(3) = 0.15
P(4) = 0.20
P(5) = 0.50
> a <- c(.05,.10,.15,.20,.50)
> cumsum(a)
[1] 0.05 0.15 0.30 0.50 1.00
This sample space, along with the mathematically calculated probabilities, represents one possible version of what the actual data might reveal. As observed, the probabilities are higher in the lower ranges of store capacity, which suggests that customers were more cautious due to health and safety concerns.
Many likely preferred to shop online to minimize the risk of exposure to COVID-19, leading to fewer in-store visits and lower occupancy levels across many stores.
Using the plot function in R:
> plot(c(.05,.10,.15,.20,.50), c(1,4,9,16,25))
Calculating the Mean of Values Using R:
Each data value is multiplied by its relative frequency. The mean is then equal to the sum of these products.
> data <- c(10,40,-18,24,19,0,32)
> p <- c(.13,.17,.05,.10,.24,.16,.15)
> cumsum(p)
[1] 0.13 0.30 0.35 0.45 0.69 0.85 1.00
> mean <- sum(data*p)
> mean
[1] 18.96
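As a cross-check on the calculation above, the same mean can be obtained with R's built-in `weighted.mean()`. The sketch below reuses the `data` and `p` values from the text. The names `m_manual` and `m_builtin` are hypothetical, chosen so that the result does not share a name with the `mean()` function.

```r
data <- c(10, 40, -18, 24, 19, 0, 32)
p    <- c(.13, .17, .05, .10, .24, .16, .15)

# Relative frequencies should sum to 1; compare with a tolerance rather
# than ==, because of floating-point rounding.
stopifnot(isTRUE(all.equal(sum(p), 1)))

# Mean as the sum of value-times-relative-frequency products
m_manual <- sum(data * p)

# Built-in equivalent: sum(data * p) / sum(p)
m_builtin <- weighted.mean(data, p)

# The two approaches should agree
all.equal(m_manual, m_builtin)
```

Because the weights sum to 1, dividing by `sum(p)` changes nothing, so both approaches give the value shown above.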