By Erik Gfesser — Dec 14, 2023

Product Reviews: Part 12 - Update with a note about data quality (be careful when using floating data types!)

Almost exactly a year has passed since publishing my last post in this series, and a lot has changed. So many changes have taken place, in fact, that I've decided to significantly decrease my contributions as an Amazon product reviewer, beginning at the conclusion of my second "evaluation period" on November 9, 2023.

A year ago, Amazon apparently decided to gamify its product review process. In summary, Vine membership now provides tiered access and rewards based on contributions, meaning that membership is no longer a free-for-all and that unmet tier criteria will jeopardize future membership.

The two tiers are named "silver" and "gold". To maintain a "silver" account, at least 50% of free product needs to be reviewed within a given 6-month evaluation period. To maintain a "gold" account, at least 90% of free product needs to be reviewed within the same period, and at least 100 products need to be reviewed. While I recall that silver tier membership once required that at least 50 products be reviewed within a given 6-month evaluation period, this criterion has apparently since been dropped.

The sole motivator to achieve gold tier membership is the ability to order up to 8 free products per day of *any* value, because silver tier membership limits orders to only 3 free products per day, with each product limited to $100 in value.

The value categories of $100 or less and above $100 are based on the values considered for tax calculations. While *all* free product has been tax-free for the bulk of my time as an Amazon product reviewer, I mentioned in an earlier post that Amazon began issuing tax form 1099-NEC a few years ago. The acronym "NEC" stands for "nonemployee compensation", and this form is used by businesses to report payments made to independent contractors, freelancers, sole proprietors, and self-employed individuals. This form is not issued unless more than $600 in product is received, and as I've mentioned previously, the vast majority of products I've ordered are consumables that don't contribute to this total dollar amount.

The biggest reason for scaling down my involvement at this point in time is persistent issues with Amazon's basic math calculations. (The second biggest reason is that Amazon is not only increasingly rejecting reviews, but also suggesting that rejected reviews be edited while not providing access to the originally written text. My product reviews have typically been much more in-depth than the one- or two-liners that many reviewers write, arguably one of the reasons I was a highly ranked reviewer before Amazon decided to do away with reviewer rankings in late 2022.)

While most of these issues have been relatively benign, the most recent issue led to Amazon dropping me from gold tier membership following my second evaluation period, even though I met applicable criteria, as can be seen from the following screenshot, which clearly shows that I reviewed at least 100 products and at least 90% of received product. My outreach to Amazon has resulted in incorrect, automated responses thus far.

However, is this dashboard accurate to the appropriate significant digits? While I *technically* have access to the raw data on which the 90% value was calculated, double-checking the calculation is difficult because I haven't kept track of whether each product was received during this evaluation period or a previous one. The denominator is based on total products received during this time period, whereas the numerator is based on total products reviewed *regardless* of the time period in which they were received. Unfortunately, the tooling that Amazon makes available to product reviewers is sparse, requiring product reviewers such as myself to keep track of these metrics ourselves.

Yes, I could spend time walking through each of the 132 products I reviewed during this time period, and I may end up doing so, but the above dashboard clearly states a value of 90%. Based on past automated responses from Amazon support, I'm not convinced that either a human or the presumed automation would evaluate my support ticket and recognize the presumed error in the calculation.

Why do I suspect that Amazon has made some very basic miscalculations? The following are some examples from early in this evaluation period:

The first example should have shown that 14% of received products had been reviewed at that point in time (May 18, 2023), and the second example should have shown that 55% of received products had been reviewed at that point in time (May 27, 2023). These values should have been displayed as 14% and 55%, respectively, or perhaps as 1 / 7 ≈ 14.3% and 6 / 11 ≈ 54.5% to show that the percentage reviewed was actually just above 14% and not quite 55%.

However, something is terribly wrong here. Why did Amazon's calculations result in 14.000000000000002 and 55.00000000000001? Apparently, they intended to round to the nearest integer. But what's with the high number of *incorrect* decimal places? It's obvious they used a floating point data type to communicate results, and as *all* data professionals should know, a floating point data type should *never* be used in this type of scenario. Yes, floating point data types allow a higher number of decimal places to be reported, but they are also notorious for being unstable: most decimal fractions have no exact binary representation, so tiny errors creep into stored and calculated values!
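
The two odd values can be reproduced in a few lines of Python. This is only a sketch of one plausible cause, on the assumption (not confirmed by Amazon) that the ratio was first rounded to two decimal places and then multiplied by 100 using IEEE 754 double-precision floats. The function name `percent_via_float` is hypothetical.

```python
def percent_via_float(reviewed: int, received: int) -> float:
    """Hypothetical reconstruction: round the ratio to 2 places, then scale by 100."""
    ratio = round(reviewed / received, 2)  # e.g. 1 / 7 -> 0.14
    # 0.14 and 0.55 have no exact binary representation, so the tiny
    # representation error becomes visible after multiplying by 100.
    return ratio * 100


print(percent_via_float(1, 7))   # 14.000000000000002
print(percent_via_float(6, 11))  # 55.00000000000001
```

Whatever Amazon actually did, the trailing digits are the kind of noise that a binary float produces when it stands in for a decimal fraction.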

Not long ago, a nonspecialized software engineer working with my data engineering team argued with my directive to use a decimal data type for monetary values stored in the data lakehouse we were building:

"But Erik", they argued, "when a float is used we'll always have access to precise calculated values!"

"Absolutely not", I countered, "floating data types are unstable, leading to the likelihood that expected values will be inconsistently stored. For monetary values, we *always* need to use decimal data type, because this guarantees that values will *always* match expected values, a data quality aspect *especially* important for auditing."

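The difference is easy to demonstrate. The following sketch uses Python's standard `decimal` module as a stand-in for a decimal column type; the amounts are made up for illustration and are not from the lakehouse in question.

```python
from decimal import Decimal

# Binary floats: 0.10 and 0.20 are stored as nearby approximations,
# so their sum is not exactly the float closest to 0.30.
float_total = 0.10 + 0.20
print(float_total)          # 0.30000000000000004
print(float_total == 0.30)  # False

# Decimals built from strings keep the exact base-10 digits.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total)                     # 0.30
print(decimal_total == Decimal("0.30"))  # True
```

An auditor comparing stored totals against expected totals would see a mismatch in the float case and an exact match in the decimal case.
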
Amazon should have used a decimal data type providing one decimal place to report values such as 14.3 and 54.5, but since Amazon was apparently looking to communicate only integer values, it should have just used an integer value to store the result.
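
As a rough sketch of both options, again assuming Python and its standard `decimal` module rather than whatever Amazon actually runs (the helper names `percent_one_decimal` and `percent_integer` are hypothetical):

```python
from decimal import Decimal, ROUND_HALF_UP


def percent_one_decimal(reviewed: int, received: int) -> Decimal:
    """Percentage reviewed as a decimal with exactly one decimal place."""
    percent = Decimal(reviewed) * 100 / Decimal(received)
    return percent.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def percent_integer(reviewed: int, received: int) -> int:
    """Percentage reviewed, rounded half up, using integer math only."""
    # floor(100 * reviewed / received + 1/2) rewritten without fractions
    return (200 * reviewed + received) // (2 * received)


# Note: 6 / 11 is 54.5454...%, so one decimal place gives 54.5.
print(percent_one_decimal(1, 7), percent_integer(1, 7))    # 14.3 14
print(percent_one_decimal(6, 11), percent_integer(6, 11))  # 54.5 55
```

Neither function ever produces a value like 14.000000000000002, because no binary float is involved at any step.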

The bottom line is that it's *possible* I actually reviewed 89.9% of product during this evaluation period, and Amazon decided to report it as 90%. However, product reviewers such as myself should not need to be concerned about misreporting, now that Amazon has decided to gamify its product review process.
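
To illustrate how a displayed 90% can coexist with an unmet 90% criterion, here is a sketch with made-up counts (899 reviewed out of 1,000 received; these are not my actual numbers, and the function names are hypothetical). The threshold check cross-multiplies integers so that no rounding is involved.

```python
def displayed_percent(reviewed: int, received: int) -> int:
    """What a dashboard might show: the percentage rounded to a whole number."""
    return round(reviewed / received * 100)


def meets_threshold(reviewed: int, received: int, threshold_percent: int) -> bool:
    """Exact check of reviewed / received >= threshold_percent / 100, integers only."""
    return reviewed * 100 >= threshold_percent * received


print(displayed_percent(899, 1000))    # 90
print(meets_threshold(899, 1000, 90))  # False
print(meets_threshold(900, 1000, 90))  # True
```

If the dashboard rounds while the tier decision uses the unrounded ratio, a reviewer sees "90%" and still misses the cutoff.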
