Influence of augmented humans in online interactions during voting events

  • Massimo Stella (Scholar)
  • Marco Cristoforetti (Scholar)
  • Manlio De Domenico (Scholar)
  • Abstract: Overwhelming empirical evidence has shown that online social dynamics mirrors real-world events. Hence, understanding the mechanisms leading to social contagion in online ecosystems is fundamental for predicting, and even maneuvering, human behavior. It has been shown that one of such mechanisms is based on fabricating armies of automated agents that are known as social bots. Using the recent Italian elections as an emblematic case study, here we provide evidence for the existence of a special class of highly influential users, that we name “augmented humans”. They exploit bots for enhancing both their visibility and influence, generating deep information cascades to the same extent of news media and other broadcasters. Augmented humans uniformly infiltrate across the full range of identified clusters of accounts, the latter reflecting political parties and their electoral ranks.
  • Bruter and Harrison [19] shift the focus on the psychological influence that electoral arrangements exert on voters by altering their emotions and behavior. The investigation of voting from a cognitive perspective leads to the concept of electoral ergonomics: Understanding optimal ways in which voters emotionally cope with voting decisions and outcomes leads to a better prediction of the elections. (pg 1)
  • Most of the Twitter interactions are from humans to bots (46%); Humans tend to interact with bots in 56% of mentions, 41% of replies and 43% of retweets. Bots interact with humans roughly in 4% of the interactions, independently of interaction type. This indicates that bots play a passive role in the network but are rather highly mentioned/replied/retweeted by humans. (pg 2)
  • bots’ locations are distributed worldwide and they are present in areas where no human users are geo-localized such as Morocco.  (pg 2)
  • Since the number of social interactions (i.e., the degree) of a given user is an important estimator of the influence of the user itself in online social networks [17-22], we consider a null model fixing users’ degree while randomizing their connections, also known as configuration model [23, 24]. (pg 2)
  • During the whole period, bot-bot interactions are more likely than random (Δ > 0), indicating that bots tend to interact more with other bots rather than with humans (Δ < 0) during Italian elections. Since interactions often encode the spread of a given content online [16], the positive assortativity highlights that bots share contents mainly with each other and hence can resonate with the same content, be it news or spam. (pg 2)
  • Differently from previous works, where the semantic content of bots and humans differs in its emotional polarity [12], in here we find that bots mainly repeat the same political content of human users, thus boosting the spreading of hashtags strongly related to the electoral process, such as hashtags referring to the government or to political victory, names of political parties or names of influential politicians (see also 3). (pg 4)
  • Frequencies of individual hashtags during the whole electoral process display some interesting shifts, reported in Table III (Top). For instance, the hashtag #exitpoll, indicating the electoral outcome, becomes 10000 times more frequent on the voting day than before March 4. These shifts indicate that the frequency of hashtags reflects real-world events, thus underlining the strong link between online social dynamics and the real-world electoral process. (pg 4)
  • TABLE II. Top influencers are mostly bots. Hubs characterize influential users and broadcasters in online social systems [17], hence we use degree rankings for identifying the most influential users in the network. (pg 5)
  • bots are mostly influential nodes which tend to interact mostly with other bots rather than humans and, when they interact with human users, they preferentially target the most influential ones. (pg 5)
  • we first filter the network by considering only pairs of users with at least one retweet, with either direction, because re-sharing content is often a good proxy of social endorsement [21]. However, retweets alone are not sufficient to wash out the noise intrinsic to systems like Twitter, therefore we apply a more selective restriction, by requiring that at least another social action – i.e., either mention or reply – must be present in addition to a retweet [12]. This restrictive selection allows one to filter out all spurious interactions among users with the advantage of not requiring any thresholding approach with respect to the frequency of interactions themselves. (pg 5)
  • The resulting network is what we call the social bulk, i.e. a network core of endorsement and exchange among users. By construction, information flows among users who share strong social relationships and are characterized by similar ideologies: in fact, when a retweet goes from one user to another one, both of them are endorsing the same content, thus making non-directionality a viable approach for representing the endorsement related to content sharing. (pg 5)
  • Fiedler partitioning
  • The relevant literature has used the term “cyborg” for identifying indistinctly bot-assisted human or human-assisted bot accounts generating spam content over social platforms such as Twitter [5, 35]. Here, we prefer to use the term “augmented human” for indicating specifically those human accounts exploiting bots for artificially increasing, i.e. augmenting, their influence in online social platforms, analogously to physical augmentation improving human performances in the real world [36]. (pg 8)
  • Baseline social behavior is defined by the medians of the two observables, as shown in Fig. 6c. This map allows one to easily identify four categories of individuals in the social dynamics: i) hidden influentials, generating information cascades rapidly spreading from a small number of followers; ii) influentials, generating information cascades rapidly spreading from a large number of followers; iii) broadcasters, generating information cascades slowly spreading from a large number of followers; iv) common users, generating information cascades slowly spreading from a small number of followers. (pg 9)
  • Hidden influentials, known to be efficient spreaders in viral phenomena [45], are mostly humans: in this category fall the augmented humans, assisted by social bots to increase their online visibility. (pg 10)
  • We define augmented humans as human users having at least 50% + 1 of bot neighbours in the social bulk. We discard users having less than 3 interactions in the social bulk. (pg 10)
  • The most central augmented human in terms of number of social interactions is Utente01, which interacts with 2700 bots and 55 humans in the social bulk. (pg 10)
  • The above cascade analysis reveals that almost 2 out of 3 augmented humans played an important role in the flow of online content: 67% of augmented humans were either influentials or hidden influentials or broadcasters. These results strongly support the idea that via augmentation even common users can become social influencers without having a large number of followers/friends but rather by resorting to the aid of either armies of bots (e.g., Utente01, a hidden influential) or the selection of a few key helping bots. (pg 11)
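
The social-bulk filter (a retweet plus at least one mention or reply between the same pair) and the augmented-human definition (a human with a strict majority of bot neighbours) quoted above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the `(source, target, kind)` tuple layout, the `is_bot` lookup and both function names are assumptions, and "less than 3 interactions" is read here as fewer than 3 neighbours in the bulk.

```python
from collections import defaultdict

def social_bulk(interactions):
    """Return undirected user pairs that share at least one retweet AND
    at least one mention or reply, in either direction.

    `interactions` is an iterable of (source, target, kind) tuples with
    kind in {"retweet", "mention", "reply"} (assumed layout)."""
    kinds = defaultdict(set)
    for src, dst, kind in interactions:
        kinds[frozenset((src, dst))].add(kind)
    edges = set()
    for pair, seen in kinds.items():
        # len(pair) == 2 skips self-interactions
        if len(pair) == 2 and "retweet" in seen and seen & {"mention", "reply"}:
            edges.add(pair)
    return edges

def augmented_humans(edges, is_bot, min_neighbours=3):
    """Humans whose bulk neighbours are more than half bots ("50% + 1").

    `is_bot` maps every user to True/False (assumed to come from a
    separate bot-detection step)."""
    neighbours = defaultdict(set)
    for pair in edges:
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)
    result = set()
    for user, nbrs in neighbours.items():
        if is_bot[user] or len(nbrs) < min_neighbours:
            continue
        bots = sum(1 for n in nbrs if is_bot[n])
        if bots > len(nbrs) / 2:
            result.add(user)
    return result
```

Because the pairs are stored as unordered sets, the sketch follows the quoted argument that direction can be dropped once a pair passes the filter.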

The Group Polarization Phenomenon

David G. Myers

Helmut Lamm

Experiments exploring the effects of group discussion on attitudes, jury decisions, ethical decisions, judgments, person perceptions, negotiations, and risk taking (other than the choice-dilemmas task) are generally consistent with a “group polarization” hypothesis, derived from the risky-shift literature. Recent attempts to explain the phenomenon fall mostly into one of three theoretical approaches: (a) group decision rules, especially majority rule (which is contradicted by available data), (b) interpersonal comparisons (for which there is mixed support), and (c) informational influence (for which there is strong support). A conceptual scheme is presented which integrates the latter two viewpoints and suggests how attitudes develop in a social context.

  • Pictures may be important as part of an argument. Need to be able to support that.
  • This polarization concept should also be distinguished from a related concept, extremization. Whereas polarization refers to shifts toward the already preferred pole, extremization has been used to refer to movement away from neutrality, regardless of direction. Since all instances of group polarization are instances of extremization, but not vice versa, extremization may be easier to demonstrate than polarization. (pp 603)
  • For convenience we have organized these studies into seven categories: attitudes, jury decisions, ethical decisions, judgments, person perceptions, negotiation behavior, and risk measures other than the choice dilemmas. This categorization is admittedly somewhat arbitrary. (pp 604)
  • In other studies, however, it is possible to infer the direction of initial preferences. Robinson (1941) conducted lengthy discussions of two attitudes. On attitude toward war, where students were initially quite pacifistic, there was a nonsignificant shift to even more pacifism following discussion. On attitude toward capital punishment, to which students were initially opposed, there was a significant shift to even stronger opposition. (pp 604)
  • Varying the stimulus materials. Myers and Kaplan (1976) engaged their subjects in discussion of stimulus materials which elicited a dominant predisposition of guilty or not guilty. After discussing traffic cases in which the defendants were made to appear as low in guilt, the subjects were even more definite in their judgments of innocence and more lenient in recommended punishment. After discussing “high-guilt” cases, the subjects polarized toward harsher judgments of guilt and punishment. (pp 605)
  • Group composition studies. Vidmar composed groups of jurors high or low in dogmatism. The high-dogmatism juries shifted toward harsher sentences following discussion, and the low-dogmatism groups shifted toward more lenient sentences, despite the fact that the high- and low-dogmatism juries did not differ in their predeliberation judgments. (pp 606)
  • Walker and Main (1973) observed that these constitutionality decisions were also more libertarian in the group condition (65% versus 45%). Although a minority of the single-judge decisions were prolibertarian, Walker and Main surmised that the preexisting private values of the judges were actually prolibertarian and that their decisions made alone were compromised in the face of antilibertarian public pressure. Their private values were then supposedly released and reinforced in the professional group context. (pp 606)
  • From what we have been able to perceive thus far, the process is an interesting combination of rational persuasion, sheer social pressure, and the psychological mechanism by which individual perceptions undergo change when exposed to group discussion (pp 606)
  • Myers (1975) also used a faculty evaluation task. The subjects responded to 200 word descriptions of “good” or “bad” faculty with a scale judgment and by distributing a pay increase budget among the hypothetical faculty. As predicted by the group polarization hypothesis, good faculty were rated and paid even more favorably after the group interaction, and contrariwise for the bad faculty. (pp 608)
  • in general, the work on person perception supports the group polarization hypothesis, especially when the stimulus materials are more complex than just a single adjective. (pp 608)
  • Myers and Bach (1976) compared the conflict behavior of individuals and groups, using an expanded prisoner’s dilemma matrix cast in the language of a gas war. There was no difference in their conflict behavior (both individuals and groups were highly noncooperative). But on postexperimental scales assessing the subjects’ evaluations of themselves and their opponents, individuals tended to justify their own behavior, and groups were even more inclined toward self-justification. This demonstration of group polarization supports Janis’s (1972) contention that in situations of intergroup conflict, group members are likely to develop a strengthened belief in the inherent morality of their actions.  (pp 608)
  • Skewness cannot account for group polarization. This is particularly relevant to the majority rule scheme, which depends on a skewed distribution of initial choices. On choice dilemmas, positively skewed distributions (i.e., with a risky majority) should produce risky shift, and negatively skewed distributions should yield a conservative shift. Several findings refute this prediction. (pp 612)
  • Shifts in the group median, although slightly attenuated, are not significantly smaller than shifts in the group mean (pp 612)
  • Group shift has also been shown to occur in dyads (although somewhat reduced), where obviously there can be no skewness in the initial responses (pp 612)
  • while group decision models may be useful in other situations in which discussion is minimal or absent and the task is to reach agreement (e.g., Lambert, 1969), the models (or at least the majority rule model stressed in this analysis) are not a sufficient explanation of the group polarization findings we are seeking to explain. There are still a variety of other decision schemes that can be explored and with other specific tasks. But clearly, group induced shift on choice dilemmas is something more than a statistical artifact. (pp 612)
  • Interpersonal Comparisons theory suggests that a subject changes when he discovers that others share his inclinations more than he would have supposed, either because the group norm is discovered to be more in the preferred direction than previously imagined or because the subject is released to more strongly act out his preference after observing someone else who models it more extremely than himself. This theory, taken by itself, suggests that relevant new information which emerges during the discussion is of no consequence. Group polarization is a source effect, not a message effect. (pp 614)
    • This is very close to the flocking theory where one agent looks at the alignment and velocity of nearby agents.
  • Differences between self, presumed other, and ideal scores. One well-known and widely substantiated assumption of the interpersonal comparisons approach is the observation from choice-dilemmas research that if, after responding, the subjects go back over the items and guess how their average peer would respond and then go back over the items a third time and indicate what response they would actually admire most, they tend to estimate the group norm as more neutral than their own initial response and their ideal as more extreme (pp 613)
  • Lamm et al. (1972) have also shown that not only do subjects indicate their ideal as more extreme than their actual response, but they also suspect that the same is true of their peers. The tendency of people to perceive themselves as more in what they consider to be the socially desirable direction than their average peer extends beyond the choice dilemmas (see Codol, Note 13). For example, most businessmen believe themselves to be more ethical than the average businessman (Baumhart, 1968), and there is evidence that people perceive their own views as less prejudiced than the norm of their community (Lenihan, Note 14). (pp 613)
  • The tendency to perceive others as “behind” oneself exists only when the self response is made prior to estimating the group norm (McCauley, Kogan, & Teger, 1971; Myers, 1974). Evidently it is after one has decided for himself that there is then a tendency to consider one’s action as relatively admirable (by perceiving the average person as less admirable than oneself). (pp 613)
  • it has been reliably demonstrated that subjects perceive other persons who have responded more extremely than themselves (in the direction of their ideal) as more socially desirable than persons who have not (Baron, Monson, & Baron, 1973; Jellison & Davis, 1973; Jellison & Riskind, 1970, 1971; Madaras & Bem, 1968). A parallel finding exists in the attitude literature (Eisinger & Mills, 1968): An extreme communicator on one’s side of an issue tends to be perceived as more sincere and competent than a moderate. (pp 614)
  • Burnstein, Vinokur, and Pichevin (1974) took an informational influence viewpoint and showed that people who adopt extreme choices are presumed to possess cogent arguments and are then presumably admired for their ability. They also demonstrated that subjects have much less confidence in others’ choices than in their own, suggesting that the tendency to perceive others as more neutral than oneself simply reflects ignorance about others’ choices (pp 614)
  • self-ideal difference scores are less affected by order of measurement than self versus perceived other differences (Myers, 1974)—suggest that the self-ideal discrepancy may be the more crucial element of a viable interpersonal comparisons approach. (pp 614)
  • One set of studies has manipulated the information about others’ responses by providing fake norms. More than a dozen separate studies all show that subjects will move toward the manipulated norm (see Myers, 1973) (pp 615)
    • Can’t find this paper, but herding!
  • Consistent with this idea, they observed that exposure to others’ choices produced shift only when subjects then wrote arguments on the item. If knowledge of others’ choices was denied or if an opportunity to rethink the item was denied, no shift occurred. (pp 615)
  • On the other hand, it may be reasoned that in each of the studies producing minimal or nonexistent shift after exposure to others’ attitudes, the subjects were first induced to bind themselves publicly to a pretest choice and then simply exposed to others’ choices. It takes only a quick recall of some classic conformity studies (e.g., Asch, 1956) to realize that this was an excellent procedure for inhibiting response change. (pp 615)
  • Bishop and Myers (1974) have formulated mathematical models of the presumed informational influence mechanisms. These models assume that the amount of group shift will be determined by three factors: the direction of each argument (which alternative it favors), the persuasiveness of each argument, and the originality of each argument (the extent to which it is not already known by the group members before discussion). In discussion, the potency of an argument will be zero if either the rated persuasiveness is zero (it is trivial or irrelevant) or if all group members considered the argument before discussion (pp 616)
  • the simple direction of arguments is such an excellent predictor of shift (without considering persuasiveness and originality), it is not easy to demonstrate the superiority of the models over a simple analysis of argument direction as undertaken by Ebbesen and Bowers (1974). (pp 617)
    • This supports the notion that alignment and heading, as used in the model may really be sufficient to model polarizing behavior
  • A group that is fairly polarized on a particular item before discussion is presumably already in general possession of those arguments which polarize a group. A less extreme group has more to gain from the expression of partially shared persuasive arguments. (pp 617)
  • Passive receipt of arguments outside an interactive discussion context generally produces reduced shift (e.g., Bishop & Myers, 1974; Burnstein & Vinokur, 1973; St. Jean, 1970; St. Jean & Percival, 1974). Likewise, listening to a group discussion generally elicits less shift than actual participation (pp 617)
    • There may be implications here with respect to what’s being seen and read on the news having a lower influence than items that are being discussed on social media. A good question is at what point does the reception of information feel ‘interactive’? Is clicking ‘like’ enough? My guess is that it is.
  • Verbal commitment could produce the increased sense of involvement and certainty that Moscovici and Zavalloni (1969) believe to be inherent in group polarization. (pp 618)
    • This reinforces the point above, but we need to know the minimum threshold for what can be considered ‘verbal commitment’.
  • By offering arguments that tend toward the outer limits of his range of acceptability, the individual tests his ideals and also presents himself favorably to the group since, as we noted earlier, extremity in the direction of the ideal connotes knowledgeability and competence. (pp 618)
  • Diagram (pp 619): polarization diagram (image not reproduced)
  • Arguments spoken in discussion more decisively favor the dominant alternative than do written arguments. The tendency for discussion arguments to be one-sided is probably not equal for all phases of a given discussion. Studies in speech-communications (see Fisher, 1974) suggest that one-sided discussion is especially likely after a choice direction has implicitly emerged and group members mutually reinforce their shared inclination. (pp 619)
    • This review is pre-IRC, and views writing as non-interactive. This may not be true any more.
  • The strength of the various vectors is expected to vary across situations. In more fact-oriented judgment tasks (group problem solving tasks being the extreme case), the cognitive determinants will likely be paramount, although people will still be motivated to demonstrate their abilities. On matters of social preference, in which the social desirability of actions is more evident, the direct and indirect attitudinal effects of social motivation are likely to appear. The direct impact will occur in situations in which the individual has ideals that may be compromised by presumed norms but in which exposure to others’ positions informs him that his ideals are shared more strongly or widely than he would have supposed. These situations—in which expressed ideals are a step ahead of prior responses—will also tend to elicit discussion content that is biased toward the ideals. (pp 620)
  • What is the extent of small group influence on attitudes? McGuire (1969) noted, “It is clear that any impact that the mass media have on opinion is less than that produced by informal face-to-face communication of the person with his primary groups, his family, friends, co-workers, and neighbors (p. 231).” (pp 620)
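
To make the informational-influence excerpt (pp 616) concrete, here is a toy calculation. It is not the published Bishop and Myers (1974) model: the multiplicative form, the +1/-1 direction coding, and all names are assumptions chosen only to reproduce the two properties stated in the excerpt, namely that an argument's potency is zero when its persuasiveness is zero or when every group member had already considered it. The `direction_only` function is the cruder count of argument directions that the pp 617 excerpt says already predicts shift well.

```python
from dataclasses import dataclass

@dataclass
class Argument:
    direction: int         # +1 favors one alternative, -1 the other (assumed coding)
    persuasiveness: float  # rated persuasiveness; 0.0 means trivial or irrelevant
    known_by: int          # members who had considered it before discussion

def potency(arg: Argument, group_size: int) -> float:
    """Persuasiveness scaled by originality (share of members who did
    NOT already know the argument). Hypothetical functional form."""
    originality = 1.0 - arg.known_by / group_size
    return arg.persuasiveness * originality

def predicted_shift(arguments, group_size: int) -> float:
    """Signed sum of potencies: positive means a shift toward the +1 pole."""
    return sum(a.direction * potency(a, group_size) for a in arguments)

def direction_only(arguments) -> int:
    """Baseline that ignores persuasiveness and originality entirely."""
    return sum(a.direction for a in arguments)
```

In this form, a group that already shares most of the arguments gets low originality values and therefore a small predicted shift, which matches the note that a less extreme group has more to gain from discussion.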


New computer, new plans

I am now the proud(?) owner of a brand new ThinkStation. Nice, fast box, and loaded up with dev and analysis goodies.

Based on discussion with Dr. Hurst, I’m going to look into using Mechanical Turk as a way of gathering large amounts of biometric data. Basically, the goal is to have Turkers log in and type training text. The question is how. And that question has two parts – how to interface with the Turk system and how to get the best results. I’ll be researching these over the next week or so, but I thought I’d put down some initial thoughts here first.

I think a good way to get data would be to write a simple game that presents words or phrases to a user and has them type those words back into the system. Points are given for speed (up to 10 seconds per word?) and accuracy (edit distance from word). Points are converted into cash somehow using the Turk API?
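
As a rough sketch of how that scoring might work: only the 10-second cap and the use of edit distance come from the idea above. The 100-point scale, the linear speed bonus, the way the two factors are multiplied, and the function names are all assumptions for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def score(target: str, typed: str, seconds: float,
          max_seconds: float = 10.0, max_points: int = 100) -> int:
    """Points fall linearly with time (zero at max_seconds) and with
    edit distance relative to the target length."""
    speed = max(0.0, 1.0 - max(seconds, 0.0) / max_seconds)
    errors = edit_distance(target, typed)
    accuracy = max(0.0, 1.0 - errors / max(len(target), 1))
    return round(max_points * speed * accuracy)
```

Multiplying the two factors means a fast but wrong entry earns nothing, which should discourage key mashing.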

The list of words should be the 100 most-used English words? Words/phrases are presented randomly. There is some kind of upper and lower limit on words that a user can enter so the system is not abused. In addition, ip address/browser info can be checked as a rough cull for repeat users.

Ok, I’m now the proud owner of an AWS account and a Turk Requester account. I’ve also found the Amazon Mechanical Turk Getting Started Guide, though the Kindle download is extremely slow this Christmas Day.

Setting up sandbox accounts. Some things appear to timeout. Not sure if I’ll be able to use the api directly. I may have to do a check against the session uid cut-and-paste.

Set up Admin, dev and read-only IAM user accounts.

Accessed the production and sandbox accounts with the command line tools. Since I already had a JRE, I just downloaded the installed directory. You need to create an MTURK_CMD_HOME environment variable that points to the root of your turk install. In my case that is ‘C:\TurkTools\aws-mturk-clt-1.3.1’. Do not add this value to your path – it makes java throw a fit. The other undocumented thing that you must do is change the service-url values for the accounts from http to https.

To log in successfully, I was not able to use the IAM accounts and had to use a rootkey. And sure enough, when I looked over the AWS Request Authentication page, there it was: Amazon Mechanical Turk does not use AWS Identity and Access Management (IAM) credentials. Sigh. But that’s 4 tabs I can close and ignore.

Setting up the Java project for the HIT tutorial.

  • Since I’ve been using IntelliJ for all my JavaScript and PHP, I thought I’d see how it is with Java. The first confusion was how to grab all the libraries/jar files that Turk needs. To add items, use File->Project Structure. This brings up a ‘Project Structure’ dialog. Pick Modules under Project Settings then click the Sources/Paths/Dependencies tab. Click the green ‘+’ on the far right-hand side, then select Jar or Directories. It doesn’t seem to recurse down the tree, but you can shift-click to select multiple one-level directories. This should then populate the selected folders in the list that has the Java and <Module Source> listed. Once you hit OK at the bottom of the dialog, you can verify that the jar files are listed under the External Libraries heading in the Project panel.
  • Needed to put the file at the root of the project, since that’s where System.getProperty(“user.dir”) said I should.

Success! Sent a task, logged in and performed the task in the sandbox. Now I have to see how to get the results and such.

And the API information on Amazon makes no sense. It looks like the Java API is not actually built by Amazon, the only true access is through SOAP/REST. The Java API is in the following locations:

If you download the zip file, the javadoc API is available in the docs directory, and there appear to be well-commented samples in the samples directory.

Took parts from the and examples and have managed to post and approve. Need to see if already approved values can be retrieved. If they can be, then I can check against previous uids. Then I could either pull down the session_ids as a list or create a new PHP page that returns the session_ids restfully. Kinda like that.

The php page that returns session_ids is done and up: I’ve now got java code running that will pull down the entire xml document and search through it looking for a result that I can then match against the submitted values. I need to check to see that the value isn’t already approved, which means going through the approved items first to look for the sessionIDs

And, of course, there’s a new wrinkle. A worker can only access an HIT once, which has led me to loading up quite a few HITs. And I’ve noticed that you can’t search across multiple HITs, so I need to keep track of them and fill the arrays from multiple sources. Or change the db to have an ‘approved’ flag, which I don’t want to do if I don’t have to.  I’d rather start with keeping all the HIT ids and iterating over them. If that fails, I’ll make another RESTful interface with the db that will have a list of approved session_ids.

Iterating over the HITs in order of creation seems to work fine. At least good enough for now

Bought and set up Got a little nervous about pointing a bunch of turkers at I will need to make it https after a while too. Will need to move over the javascript files and create a new turk page.

Starting on the training wizard directive. While looking around for a source of training text I initially tried vocabulary building sites but wound up at famous quotations, at least for the moment. Here’s one that generates random quotes.

The wizard is done, and so is its initial implementation as my turk data input page. Currently, the pages are:

  • irev3 – the main app
  • irevTurk – the mechanical turk data collection
  • irevdb – data pulls and formatting from the db.

And I appear to have turk data!

First Draft?

I think I’ve made enough progress with the coding to have something useful. And everything is using the AngularJS framework, so I’m pretty buzzword compliant. Well, that may be too bold a statement, but at least it can make data that can be analyzed by something a bit more rigorous than just looking at it in a spreadsheet.

Here’s the current state of things:

The data analysis app:

  • Improved the UX on both apps. The main app first.
    • You can now look at other posters’ posts without ‘logging in’. You have to type the passphrase to add anything though.
    • It’s now possible to search through the posts by search term and date
    • The code is now more modular and maintainable (I know, not that academic, but it makes me happy)
    • Twitter and Facebook crossposting are coming.
  • For the db app
    • Dropdown selection of common queries
    • Tab selection of query output
    • Rule-based parsing (currently keyDownUp, keyDownDown and word)
    • Excel-ready csv output for all rules
    • WEKA-ready output for keyDownUp, keyDownDown and word. A caveat on this. The WEKA ARFF format wants to have all session information in a single row. This has two ramifications:
      • There has to be a column for every key/word, including the misspellings. For the training task it’s not so bad, but for the free form text it means there are going to be a lot of columns. WEKA has a marker ‘?’ for missing data, so I’m going to start with that, but it may be that the data will have to be ‘cleaned’ by deleting uncommon words.
      • Since there is only one column per key/word, keys and words that are typed multiple times have to be grouped somehow. Right now I’m averaging, but that loses a lot of information. I may add a standard deviation measure, but that will mean double the columns. Something to ponder.
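
The one-row-per-session layout described above can be sketched as follows. This is an illustration in Python rather than the app's actual code: the `sessions` structure, the `to_arff` name and the relation name are assumptions, and attribute names are quoted naively without escaping embedded quotes. Repeated keys or words are averaged into one column and anything a session never typed becomes ARFF's `?` missing-value marker.

```python
from statistics import mean

def to_arff(sessions, relation="keystrokes"):
    """Build an ARFF document with one row per session.

    `sessions` maps a session id to a list of (key_or_word, timing)
    pairs (assumed layout). Every distinct key/word across all sessions
    becomes a numeric attribute."""
    columns = sorted({key for events in sessions.values() for key, _ in events})
    lines = [f"@RELATION {relation}", "", "@ATTRIBUTE session_id STRING"]
    for col in columns:
        lines.append(f"@ATTRIBUTE '{col}' NUMERIC")
    lines += ["", "@DATA"]
    for session_id, events in sessions.items():
        grouped = {}
        for key, value in events:
            grouped.setdefault(key, []).append(value)
        # Average repeats; use '?' where the session has no sample.
        cells = [format(mean(grouped[c]), ".10g") if c in grouped else "?"
                 for c in columns]
        lines.append(",".join([f"'{session_id}'"] + cells))
    return "\n".join(lines)
```

Adding a standard deviation would mean emitting a second attribute per key (for example a hypothetical `'the_sd'` column), which is where the doubling of columns comes from.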

Lastly, Larry Sanger (co-founder of Wikipedia) has started a wiki-ish news site. It’s possible that I could piggyback on this effort, or at least use some of their ideas/code. It’s called There’s a good manifesto here.

Normally, I would be able to start analyzing data now, with WEKA and SPSS (which I bought/leased about a week ago), but my home dev computer died and I’m waiting for a replacement right now. Frustrating.

Cluster Analysis in Mathematica

UMBC appears to have a Wolfram Pro account and student copies of Mathematica, covered by tuition, it seems. I need to do cluster analysis on words, trigraphs and digraphs. This seems to be a serious win. One option is to use the heavy client. This page seems to cover that.

I wonder if I can use Alpha Pro as a service for an analysis page though. That could be very cool. It certainly seems like a possibility. More as this progresses…


  • R Commander Two-way Analysis of Variance Model –
  • Success! In that I was able to read in a file (Insert->File Path…), then click Import under the line. Boy, that’s intuitive…
  • ANOVA (yes, all caps) runs like this: ANOVA[myModel, {myFactor1, myFactor2, All}, {myFactor1, myFactor2}]