Analysis of Stat Measurements: How They are Created and What They Mean


For the past few months, I have been working closely with Jammin411 of WoWReplays.com to validate and recreate the Ship Rating and Aggression/Passiveness statistics used on the website. Literally millions of cases of data have been analyzed to find the best variables to use and how to combine them to get the results we are looking for. Both stats posed their own challenges, such as “What are we trying to measure?”, “What variables can be used to measure that?”, “Is this really measuring what we want it to?” and “How do we explain these to the community?” This article explains, in short, how the formulas were created and how the variables were chosen. I will also explain what the scores for each stat represent and what you need to know when reading them on the site.

Variable Selection: Ship Rating

Using data from the wowreplays.com database and analyzing it in SPSS, numerous variables were correlated with each other to find those with the strongest correlation with Win Rate. This initially required computing ‘custom’ variables that were not part of the raw data but could easily be calculated from it. One such variable is average damage, or Damage Per Battle. Many of the variables that were correlated had to be created this way. Per-battle averages were chosen as the basis for many of the variables because corresponding data existed for each ship and because we wanted to compare an individual to the population as a whole. Once all of the variables considered to be a factor in win rate were created, they were correlated against each other and, most importantly, against win rate (Fig. 1). Within that correlation table, the things considered for inclusion were significance, the overall correlation value with win rate, and correlation with other variables. Variables that had a negative correlation were discarded from use in the metric.
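The selection step described above can be sketched in a few lines of Python. This is only an illustration, not the SPSS workflow we actually used: the function names, field names and sample records are all made up, and the sketch ignores significance testing and the correlation between candidate variables, both of which were part of the real selection.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def select_variables(records, candidates, target="win_rate"):
    """Keep only the candidates that correlate positively with the target."""
    target_values = [rec[target] for rec in records]
    kept = {}
    for name in candidates:
        r = pearson([rec[name] for rec in records], target_values)
        if r > 0:  # negative correlations are discarded
            kept[name] = r
    return kept


# Hypothetical raw totals for four players in one ship (not real data).
records = [
    {"battles": 10, "damage": 200_000, "deaths": 9, "win_rate": 0.40},
    {"battles": 20, "damage": 600_000, "deaths": 14, "win_rate": 0.50},
    {"battles": 40, "damage": 1_600_000, "deaths": 24, "win_rate": 0.55},
    {"battles": 25, "damage": 1_250_000, "deaths": 10, "win_rate": 0.62},
]

# 'Custom' per-battle variables computed from the raw totals.
for rec in records:
    rec["damage_per_battle"] = rec["damage"] / rec["battles"]
    rec["deaths_per_battle"] = rec["deaths"] / rec["battles"]

selected = select_variables(records, ["damage_per_battle", "deaths_per_battle"])
print(sorted(selected))
```

With these made-up records, damage per battle rises with win rate and is kept, while deaths per battle falls as win rate rises and is dropped.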

 

Variable Selection: Aggression Rating

Variables for aggression were more difficult to define than those for ship rating. When considering which variables to use, we discussed the things an aggressive player does in a battle and how they might affect certain metrics. We also considered variables affected by the opposite, passiveness, and added them to the mix. The initial list included increased accuracy across all armaments, more capture points taken, planes killed per battle, ships spotted and overall survival. A few of these had to be discarded before any analysis, either because no corresponding variables were recorded on a per-ship basis or because it was not clear how they were measured to begin with. Though correlations were run among the variables, as with the Ship Rating, the final variables were selected based on their correlation with both Win Rate and Damage per Battle and on the mean and standard deviation that each candidate formula produced.

 

Metric Creation

When creating the aforementioned formulas, we knew we wanted a predictable mean. As such, each formula is the overall sum of the variables for the player divided by the overall sum of the variables for the ship. This helps ensure that an ‘average’ player has a score of 1. The result is then multiplied by 100 to make it easier to read and to keep it distinct from the ratings on other player statistics sites. Both formulas were revised repeatedly to achieve the ideal average and, most importantly, a manageable standard deviation. We did not want a standard deviation so large that there was no room for a standardized score of -2 on the lower end[1]. Further analysis of the new metrics also brought to light that many extremes existed within the data. We attributed these extremes, or very high values, to players with only a few games in a ship, many of which were very good. As such, we decided to ‘trim’ out the results of players with fewer than 10 battles in a ship. This made the data even more manageable, and the rule has carried over to the live versions on the wowreplays.com website.
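A minimal sketch of that structure follows, assuming a literal reading of "sum for the player divided by sum for the ship". It is not the live formula: the variable names and numbers are hypothetical, and the sketch adds the raw values together without any weighting or rescaling, so a large-scale variable such as damage would dominate the sum if it were included as-is.

```python
MIN_BATTLES = 10  # players with fewer battles in the ship are trimmed


def rating(player_avgs, ship_avgs, variables):
    """100 * (sum of the player's values) / (sum of the ship-wide averages)."""
    player_sum = sum(player_avgs[v] for v in variables)
    ship_sum = sum(ship_avgs[v] for v in variables)
    if ship_sum == 0:
        raise ValueError("ship-wide averages sum to zero")
    return 100 * player_sum / ship_sum


def rate_players(players, ship_avgs, variables, min_battles=MIN_BATTLES):
    """Rate every player who meets the minimum battle count in this ship."""
    return {
        name: rating(p["averages"], ship_avgs, variables)
        for name, p in players.items()
        if p["battles"] >= min_battles
    }


# Hypothetical variables and values, for illustration only.
variables = ["win_rate", "kills_per_battle", "survival_rate"]
ship_avgs = {"win_rate": 0.5, "kills_per_battle": 0.7, "survival_rate": 0.3}
players = {
    "average_joe": {"battles": 50, "averages": dict(ship_avgs)},
    "strong": {
        "battles": 30,
        "averages": {"win_rate": 0.6, "kills_per_battle": 0.9, "survival_rate": 0.3},
    },
    "lucky_start": {
        "battles": 4,
        "averages": {"win_rate": 1.0, "kills_per_battle": 2.5, "survival_rate": 1.0},
    },
}

ratings = rate_players(players, ship_avgs, variables)
print(ratings)
```

A player whose values match the ship-wide averages lands on 100, and the player with only 4 battles never receives a rating, which is the trimming rule in action.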

 

Stats Explained: Ship Rating

The ship rating, based on the data used to create the formula, which is only about 5% of the total data held by the site, has a mean of 101.8 and a standard deviation of 37. Figure 2 shows the descriptive stats for the Ship Rating as well as select percentile ranks and their corresponding scores. Players with fewer than 10 battles are excluded. There is a slight skew, but nothing that I am worried about given the sample population. Our ship rating, much like the Warships.Today Rating (WTR), is a measure of a player’s skill in a certain ship. The final metric considers a player’s win rate, damage done, kills made, XP earned, how often they survive and win, and how often they survive overall. An average player is considered to have a score between 80 and 120, or about ±0.5 standard deviations from the mean. A full scale will be published on the site soon.
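The arithmetic behind the "average" band and the standardized-score concern can be checked with a short sketch. The mean and standard deviation are the ones quoted above. The helper names are made up, and the floor of 0 is an assumption taken from the footnote on standard deviation.

```python
SHIP_RATING_MEAN = 101.8
SHIP_RATING_SD = 37.0


def z_score(score, mean, sd):
    """Standardized score: how many standard deviations a score is from the mean."""
    return (score - mean) / sd


def score_at(z, mean, sd):
    """Raw score that sits z standard deviations from the mean."""
    return mean + z * sd


def has_room_below(mean, sd, z=-2.0, floor=0.0):
    """True if a standardized score of z still maps to a rating at or above the floor."""
    return score_at(z, mean, sd) >= floor


# The 'average' band of roughly 80 to 120 is about half a standard deviation each way.
low = score_at(-0.5, SHIP_RATING_MEAN, SHIP_RATING_SD)
high = score_at(0.5, SHIP_RATING_MEAN, SHIP_RATING_SD)
print(round(low, 1), round(high, 1))

# The current spread leaves room for -2 on the low end; the rejected version
# with a mean of about 100 and a standard deviation of about 93 did not.
print(has_room_below(SHIP_RATING_MEAN, SHIP_RATING_SD))
print(has_room_below(100.0, 93.0))
```

The band works out to roughly 83.3 to 120.3, which the article rounds to 80 and 120.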

 

Stats Explained: Aggression

Like the ship rating, the aggression rating was created using about 5% of the data from the entire wowreplays.com database. The mean for the rating is 102.6 with a standard deviation of 26. Figure 3 shows the descriptive stats for the Aggression Rating as well as select percentile ranks and their corresponding scores. This also excludes players with fewer than 10 battles in a ship. The final formula, though currently proprietary[2], rests on the idea that the more aggressive you are, the closer you get to the enemy, and so the accuracy of your armaments increases. We currently consider players with a score between 90 and 115 to be neutral, neither passive nor aggressive. This rating is entirely unique to WoWReplays and is something we hope to improve with the integration of MxStats, and by unlocking the data stored in the replays. One thing worth noting is that aggression does not have a very high correlation with win rate, even though the correlation is statistically significant. On further investigation, as expected and as mentioned in World of Warships: A Statistical Analysis of Win Rate and Aggression Attributes – Questioning the Current Meta, there is such a thing as too aggressive. Win rate tends to plateau and then decrease as aggression gets too high. Further, because the average number of battles in any one ship is about 32, players who have between 10 and 32 battles have ample opportunity to post inflated scores above the mean while keeping a high win rate.
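One simple way to look for that kind of plateau is to group players into aggression bands and compare the mean win rate in each band, since a single correlation value understates a relationship that rises and then falls. The sketch below is only an illustration of the idea: the function name, the bin width of 25 and the sample rows are all made up and are not results from the database.

```python
from collections import defaultdict


def mean_win_rate_by_bin(rows, bin_width=25):
    """Group (aggression_score, win_rate) pairs into bins and average the win rate.

    Returns a dict mapping each bin's lower edge to the mean win rate in that bin,
    ordered from the lowest bin to the highest.
    """
    buckets = defaultdict(list)
    for score, win_rate in rows:
        bin_start = int(score // bin_width) * bin_width
        buckets[bin_start].append(win_rate)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Made-up (aggression score, win rate) pairs shaped to rise and then fall.
rows = [
    (60, 0.44), (70, 0.46),
    (95, 0.50),
    (100, 0.52), (120, 0.55),
    (130, 0.55),
    (150, 0.53), (170, 0.49),
]

by_bin = mean_win_rate_by_bin(rows)
for start, mean_wr in by_bin.items():
    print(f"{start}-{start + 24}: {mean_wr:.3f}")
```

In this made-up sample the mean win rate climbs through the middle bands and drops again in the highest one, which is the pattern described above.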

Conclusion

As with all statistical metrics that ‘define’ a player, I recommend using them as a guideline and not as a ruler to measure others by. We are using current player data and current averages to assess constructs, both to inform others and to give players a way to know what to do to improve. As such, the values may fluctuate over time, which will change how a player is rated, particularly with new ships or new players. More data is better in these cases. Another thing to consider is that each player tends to improve over time and to find how a ship best fits their play style, or the best play style for that ship. This could mean they tend to be more passive in a ship but still perform well or, conversely, that they play aggressively and have yet to learn that the ship does not fare well in aggressive play. Use the two values as tools that complement each other to help you perform better in a ship, playing to its strengths as well as your own. Just as importantly, use them to provide constructive criticism to other players who may not be playing to their ship’s strengths or their own.

[1] If the standard deviation is too large, it becomes difficult to create groups of skill or aggression. For example, in one version of the formula the mean was about 100 while the standard deviation was about 93. This meant the data was extremely skewed and that a score of 0 sat just over one standard deviation below the mean.

[2] This will likely be released and published in the future, but for now, as it’s still in ‘beta,’ it will remain proprietary.

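Fig. 1: Correlation table of the candidate variables, including win rate (image: Stats-Fig1)

Fig. 2: Descriptive statistics and select percentile ranks for the Ship Rating (image: Stats-Fig2)

Fig. 3: Descriptive statistics and select percentile ranks for the Aggression Rating (image: Stats-Fig3)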

Author: SnipeySnipes

A technical and analytic writer for the WoWReplays.com blog - I focus on nuances and details that are often ignored or overlooked by other World of Warships community contributors. With a background in the social sciences and statistics, I aim to answer questions about the game and the player population as a whole, backing my findings with numbers as proof. I play World of Warships as often as I can but am by no means a "professional." Though I would say I perform better than the average player, I still screw up and die often! You can find me in game or hanging about the WoWReplays Discord server. Hit me up for questions, comments or suggestions. Snide remarks or toxic comments will be forwarded to my complaints department for disposal.

1 thought on “Analysis of Stat Measurements: How They are Created and What They Mean”

  1. I would consider only taking ships from tier VI and up into the stats.
    Also take into account that some ships are harder to play than others,
    so a player can be rewarding for their team without doing much damage,
    or do 95% of the damage and not get the kill.
    Also, win rate? It does not depend on the player; it depends on team effort and/or luck, so win rate, nah.
