How AI Detected the 'Status Economy' of Team Fortress 2
Training AI to understand the underlying social dynamics of TF2
Note: This episode was originally written in 2017 and published on our old website.
Videogames are a means of expression for their creators: translating ideas of story and character through function and play. In some instances, players themselves are afforded expression through the manner in which they interact with games. This can range from how they choose to use particular mechanics, to the decisions they make in narrative-driven games, or even the implicit objectives we set ourselves. This latter case is rather evident in communities that rally around specific targets, such as speedrunning in Spelunky or ‘no-hit runs’ in Dark Souls.
One growing area of self-expression in recent years is cosmetic customisation: ranging from calling cards and emblems in Call of Duty to armour sets in World of Warcraft. In many instances, avatar customisation does have a tangible impact upon player performance, most notably in role-playing and adventure games, where these items influence in-game attributes. However, in a vast number of cases, customisation options are purely cosmetic. What is consistent, however, is the ability for players to make a statement of their ability, their attitude or their dedication to a given game based on attributes of their profile and avatar. Games in the free-to-play (F2P) market exploit this phenomenon by selling exotic or extravagant cosmetic items to players, often for real-world currency. These microtransactions are immensely profitable for games such as Dota 2, League of Legends and Team Fortress 2, to the point that they have now been adopted in full-price games such as Rainbow Six: Siege and Overwatch.
What is of interest to us is whether any relationships exist between these cosmetic elements and a player's performance and interactions within the game itself. We've previously explored how in-game performance can correlate with aspects of a player's personality and age in Battlefield 3. However, in this case, the focus is on a player's in-game performance and whether this may have an impact on subsequent decisions made for self-expression, through customisation of avatars in Team Fortress 2.
Team Fortress 2
Team Fortress 2 (TF2) is a 2007 online multiplayer first-person shooter developed by Valve and a sequel to both Team Fortress - a mod for the original Quake - and Team Fortress Classic - a conversion of the Team Fortress mod to the GoldSrc engine used for Valve's inaugural title Half-Life. The Team Fortress series is focussed upon team-based and objective-driven play that is heavily reliant on cooperation. Each team is composed of a selection of mercenaries, each with their own strengths, weaknesses and weaponry. Players are tasked with completing a variety of objectives depending on the game mode selected: ranging from capturing control points on the map, to stealing enemy flags, escorting bombs and even playing sports, in updates dating back to 2016.
TF2 Hats
The Sniper vs. Spy update of 2009 introduced hats to Team Fortress 2, allowing each of the nine playable classes to wear a hat during play. While starting with just nine hats (one for each class), the game has since expanded to over 1200 items that can be worn on hands, torso and head. In addition, a series of quality and rarity classifications has arisen over time, reflecting the challenge and difficulty in attaining specific items.
Starting in 2011, Team Fortress 2 migrated from a paid-for product to free-to-play (F2P): the game itself would be free to download and play, but new unique equipment such as weapons and outfits would be available through microtransactions via the Steam distribution platform. While a handful of items carry features that can have a minor (to the point of negligible) effect on gameplay, the vast majority are purely cosmetic in nature. While many items can be obtained through normal play or crafting, many others are only available to players upon attaining specific ranks, achievements and statistics, or through promotions for other games on the Steam platform.
Methodology
Research into Team Fortress 2 hat usage was conducted by MIT grad student Chong-U Lim as part of a master's thesis. The focus of the work was to see whether the range of hats available and the rarity of certain hats, combined with the diversity of hats worn by players, can tell us something about the players themselves. The rationale was driven by the fact that cosmetic items in online games often act as an expression of achievement that is not necessarily tied to a player's skill. This melding of the player's persona with their in-game representation is denoted in the research as 'phantasmal identities': a blended identity between the avatar in a video game and the real-world perceptions and beliefs of the human player. This leads to a form of projection of the player's identity in an often ridiculous and exaggerated fashion.
Lim gathered profile data from 200 players on Steam as part of the research process: how they interact in Steam forums, how they communicate on each other's profiles, and a stack of information on item acquisition for Team Fortress 2. With this data in hand, Lim applied a number of processing techniques and statistical methods in order to extract useful information from it. Players are categorised by two distinct metrics that are used in greater detail throughout the research: status performance and tie strength.
Status performance is a relationship between player performance and avatar customisation. This is largely reliant on establishing the value of the items players own and the way in which they perform their self-expression. To do this, Lim gathered data from a third-party pricing site for hats to ascertain two key elements:
The collected value: the total monetary value of hats in a player's TF2 inventory.
The used value: the monetary value of all hats equipped across all of their TF2 characters.
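To make the two measures concrete, here is a minimal sketch of how they could be computed for one player. The `Hat` record, its field names and the example prices are all hypothetical; the research drew its prices from a third-party listing, which is not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class Hat:
    name: str          # hypothetical item name
    price_usd: float   # hypothetical price, standing in for a third-party listing
    equipped: bool     # True if the hat is worn by any of the player's classes

def collected_value(inventory):
    """Total value of every hat the player owns."""
    return sum(hat.price_usd for hat in inventory)

def used_value(inventory):
    """Total value of only the hats the player currently has equipped."""
    return sum(hat.price_usd for hat in inventory if hat.equipped)

# Made-up inventory for a single player.
inventory = [
    Hat("Example Hat A", 0.25, True),
    Hat("Example Hat B", 12.00, False),
    Hat("Example Hat C", 3.50, True),
]

print(collected_value(inventory), used_value(inventory))
```

A player who owns expensive hats but wears cheap ones would have a high collected value and a low used value, which is the kind of gap these two numbers are meant to expose.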
This collection of hats was cross-referenced against a third-party website for price listings both for regular and 'unusual' hat types. Players are then grouped based on their status performance using what are known as clustering techniques: a form of analysis that allows for grouping of data based on the features of the dataset, resulting in tight-knit groups or clusters. Lim adopted the k-means clustering algorithm to group players more accurately based upon their status performance.
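For readers unfamiliar with k-means, the sketch below groups players by their (collected value, used value) pairs. It is an illustration of the general technique rather than Lim's actual pipeline: the player data is invented, the choice of three clusters is an assumption for the example, and the deterministic starting centroids are a simplification (real implementations usually use randomised starts).

```python
import math

def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm. Returns (centroids, labels). Requires k >= 2."""
    # Simplified deterministic start: take k points spread through the sorted data.
    ordered = sorted(points)
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each player joins the nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster where it was
        if updated == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = updated
    return centroids, labels

# Invented (collected value, used value) pairs in US dollars.
players = [
    (0.10, 0.05), (0.20, 0.10), (0.15, 0.00),
    (20.0, 8.0), (25.0, 12.0), (30.0, 10.0),
    (900.0, 600.0), (1100.0, 750.0),
]
centroids, labels = kmeans(players, k=3)
print(labels)
```

Because raw dollar values span several orders of magnitude, the largest spenders dominate the distance calculation; whether and how the values were rescaled in the original work is not something this sketch assumes.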
Meanwhile, tie strength is related to the number of friends a person establishes on a social network with respect to their activity. Naturally, this varies depending on the social network itself and the nature with which connections arise. For example, tie strength on Facebook and on Twitter will differ significantly given the nature of the relationship between friends on each network and the manner in which user activity is disseminated, with Facebook reliant largely on posting and commenting on one another’s walls while Twitter is a broadcast medium. On Steam, tie strength is ascertained by calculating a number of key variables:
The user's number of friends and the length of their relationships on Steam.
Posts on their own walls and the walls of their friends.
The number of words exchanged in wall posts.
The number of virtual items the player has on Steam and specifically, the number of items they have accrued through trading on the platform.
The number of common applications: games that are shared among all of your friends (typically those that you would play together).
The number of positive and negative emotional words used in social interactions (measured through sentiment analysis).
The number of mutual friends and common groups among players, allowing for an understanding of the social clusters that exist on the platform.
This data is naturally rather coarse, so it's then tidied up by running a statistical method known as Principal Component Analysis (PCA), which streamlines it for future calculations by reducing the data to the key features it expresses. This allows the data to be discussed at a 'social status' level, rather than at the level of specific data ranges and constraints.
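As a rough illustration of what that reduction looks like, the sketch below standardises a handful of tie-strength style features and extracts only the first principal component using power iteration. The feature columns and numbers are invented, and a full PCA (as used in the research) would keep several components rather than one; this is a simplified stand-in, not the original analysis.

```python
import math

def standardise(rows):
    """Scale each column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    # 'or 1.0' avoids dividing by zero for a constant column.
    stds = [math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
            for col, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def covariance(rows):
    """Covariance matrix of already-centred rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]

def first_component(cov, steps=200):
    """Power iteration: repeatedly multiply and renormalise to approach the
    dominant eigenvector. Assumes the all-ones start is not orthogonal to it."""
    v = [1.0] * len(cov)
    for _ in range(steps):
        w = [sum(cov[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Invented per-player features: friends, wall posts, words exchanged, traded items.
profiles = [
    [5, 2, 40, 0],
    [12, 6, 150, 3],
    [30, 20, 600, 10],
    [55, 35, 1200, 25],
    [80, 60, 2500, 40],
]
scaled = standardise(profiles)
component = first_component(covariance(scaled))
# Each player's 'social status' score is their projection onto the component.
scores = [sum(x * w for x, w in zip(row, component)) for row in scaled]
print(scores)
```

In this toy data every feature rises together, so the single component acts as an overall 'how socially active is this player' axis; real profile data would spread its variance across several components.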
With the status performance and social status identified, the next and critical step was to understand how these two elements interact with one another: can we predict status performance based on a player's social status? This led to an analysis of how the clustered status performance groups (and the labels that are associated with them) relate to the social status markers. This required both sets of data to be appropriately clustered such that any relationships can be properly mapped. With the social status data also clustered using k-means, two tests are made: first, to understand whether *any* relationship exists between the two sets of clusters, followed by a second test to establish whether a specific social status could imply status performance. This last part is achieved through use of Support Vector Machines (SVM): a form of supervised machine learning that can be used for classification purposes. In this instance it is used to train an appropriate classification model for prediction of player behaviour.
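To give a feel for the classification step, here is a minimal linear SVM trained by sub-gradient descent on the hinge loss. It is a deliberately simplified stand-in: it separates only two classes (a hypothetical 'high' versus 'low' status performance label), the two social-status features are invented, and the learning rate, regularisation strength and epoch count are arbitrary assumptions. The kernel and settings used in the research are not assumed here.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Sub-gradient descent on the regularised hinge loss.
    Labels in y must be +1 or -1. Returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Wrong side of the margin: step towards this example.
                w = [wj + lr * (yi * xj - lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Comfortably correct: only apply the regulariser's shrinkage.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Invented social-status features (e.g. two component scores per player).
X = [[-2.0, -1.0], [-1.5, -0.5], [-1.0, -1.2],
     [1.0, 0.8], [1.5, 1.1], [2.0, 0.5]]
# Hypothetical labels: -1 = low status performance, +1 = high.
y = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X, y)
print(predict(w, b, [1.8, 1.0]), predict(w, b, [-1.8, -0.9]))
```

Extending this to three status performance groups would typically mean training one such classifier per group (one-versus-rest) and picking the most confident, which is the sort of thing an off-the-shelf SVM library handles for you.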
Results
With the data acquired from Steam, Lim was able to identify a broad range of monetary value in the hats worn, ranging from hats worth nothing in a monetary sense to some worth over $1100. With subsequent analysis, the monetary value of hats can be attributed to three status performance clusters shown below, with players of lower status performance wearing hats worth less than 25 cents on average.
Further principal component analysis against the dataset of player profiles was able to establish four characteristics or social behaviours that occur on the Steam platform:
Players can have high social interaction despite the games they have in common being single-player games.
Players can have both close relationships with specific players as well as - in the context of Steam - many friends.
Players who typically trade with one another have a higher chance of engaging in social discourse via Steam, both before and after transactions are completed.
Players fluctuate between those that typically post on their own wall versus those who will focus on the use of the walls to maintain contact with friends.
Now this all sounds rather like common sense and is largely to be expected, but the beauty of it is that these conclusions are derived from the principal component analysis of the data. In other words, these characteristics, which sound rather unremarkable, are being reached purely from statistical analysis of the data provided, which is pretty freaking cool. With this completed, a final clustering effort took place for player profiles that match those specific behavioural traits. The final results identify 5 specific clusters of players, which relate their social status to status performance and are summarised in the table shown below.
Closing
Reading this article back now in 2024 - seven years after I originally wrote it - it's clear that the idea that players attribute value to cosmetics, and the impact this can have on social status, is rather well established. Cosmetics are now an incredibly popular aspect of free-to-play economies in games, and we’ve even seen the darker side of this concept. ‘Skin gambling’ has become a growing concern in online games, Valve have shut down money laundering operations that used Counter-Strike skins, and kids who play Fortnite experience bullying for not engaging in the cosmetics economy.
But it’s interesting to look back on this research that was published over a decade ago, and how the authors themselves identified many of the issues that could arise from cosmetic economies as far back as 2013. To quote (Lim and Harrell 2013):
It is also important to consider implications related to social issues that might arise out of computational identity representation systems, especially with the high levels of interaction that occur between players, as well as with developers. Inference regarding a player’s real-world identity and preferences can be correlated with their behaviors in virtual worlds including avatar creation and customization (and vice versa). Also, the creation of items for sale and distribution in a virtual environment has similarities to the construction of value of physical items in the real world. Creating items for distribution in a virtual environment has similarities to the construction of value for real-world items. Looking at hats in TF2 based upon factors such as mode of acquisition, promotions by developers, monetary value, and so on parallels real world phenomena, such as the appeal of designer or limited edition goods. One can examine the different categories of people who seek to acquire particular virtual items or classes of virtual items (e.g., people with the means to seek out expensive items, people who care about aesthetics, etc.) and predict how they might perform status in a gaming/virtual world. In constructing virtual economies, consideration of social effects must go beyond enhancing or balancing gameplay, and should include sociological issues such as privilege and marginalization.
Bibliography
Chong-U Lim, and D. Fox Harrell. (2013). "Modeling Player Preferences in Avatar Customization using Social Network Data: A Case-Study using Virtual Items in Team Fortress 2." In Proceedings of the IEEE Conference on Computational Intelligence and Games (CIG2013), Niagara Falls, Canada. Aug-11 - Aug-13, 2013. pp. 153 - 160.
Chong-U Lim (2013). "Modeling Player Self-Representation in Multiplayer Online Games using Social Network Data", S.M. Thesis, 2013, Massachusetts Institute of Technology, USA.
Further information on the project can be found at its dedicated website: the 'Steam-Player-Preference Analyzer and the AIR Status Performance Classifier'.