The Computational Democracy Project


Opinion Groups

A conversation can often be thought of as being composed of opinion groups (subsets of the participant body which tend to agree with each other). These opinion groups are central to how we think about and get value out of a conversation.

As a tool, Polis aims to:

  • reflect back to participants an understanding of themselves in relation to the opinion landscape
  • surface points of common ground or rough consensus

Opinion groups allow participants to:

  • frame an understanding of their position in the opinion landscape relative to the opinion group they align with
  • understand where people who think differently tend to fall on the issues, by reviewing the representative comments for other opinion groups
  • surface points of common ground by looking at group-informed consensus

Generally speaking, opinion groups accomplish this (and more) by serving as a lower-dimensional representation of the full set of information in the conversation. Instead of thinking about thousands of participants and comments, we can think about a handful of opinion groups and the key comments that help us understand them. This lets us tell a more coherent story about how these groups relate to each other than would otherwise be possible.


There is no one "right way" to detect opinion groups within a conversation. Many algorithms exist for this task; they are generally known as clustering algorithms. Each opinion group in the conversation is typically identified with an individual cluster.


Polis uses the k-means clustering algorithm to group participants into clusters based on the similarity of their responses.

K-means is a very old and simple method. In comparison with newer techniques, it is very limited in the kinds of patterns it can find. While this does restrict the kinds of patterns the resulting opinion groups can reflect, it has the benefit of being relatively predictable and easy to interpret.
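
To make the idea concrete, here is a minimal k-means sketch in plain Python. It is an illustration under stated assumptions, not the Polis implementation: the vote encoding (1 = agree, -1 = disagree, 0 = pass), the sample `votes` matrix, the function names, and the deterministic farthest-first seeding are all hypothetical choices made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vote vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def initial_centers(rows, k):
    """Farthest-first seeding: an assumption, chosen so the example is repeatable."""
    centers = [list(rows[0])]
    while len(centers) < k:
        # Pick the row whose nearest existing center is farthest away.
        farthest = max(rows, key=lambda r: min(squared_distance(r, c) for c in centers))
        centers.append(list(farthest))
    return centers


def k_means(rows, k, max_iterations=100):
    """Alternate between assigning rows to centers and recomputing centers."""
    centers = initial_centers(rows, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each participant joins the nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centers[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical data: rows are participants, columns are comments.
votes = [
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [1, 0, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, 1, 0],
    [-1, 0, 1, 1],
]

labels, centers = k_means(votes, k=2)
print(labels)   # one cluster index per participant
print(centers)  # the "average voter" of each opinion group
```

Each center can be read as the average voting pattern of its group, which is part of what makes k-means results comparatively easy to interpret.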

Selecting a number of clusters

Because the k-means algorithm requires a fixed choice of K (the number of clusters to find in the data set), Polis uses the silhouette coefficient to choose the best-scoring number of clusters.
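
The selection step can be sketched as follows: run k-means for several candidate values of K, compute the mean silhouette coefficient for each result, and keep the K with the highest score. This is a simplified illustration, not the Polis code. The sample `votes` matrix, the candidate range, the function names, and the deterministic seeding are assumptions made for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two vote vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(rows, k, max_iterations=100):
    """Small k-means with farthest-first seeding (an assumption, for repeatability)."""
    centers = [list(rows[0])]
    while len(centers) < k:
        centers.append(list(max(rows, key=lambda r: min(distance(r, c) for c in centers))))
    labels = None
    for _ in range(max_iterations):
        new_labels = [min(range(k), key=lambda c: distance(row, centers[c])) for row in rows]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette_score(rows, labels):
    """Mean of (b - a) / max(a, b) over all rows, where a is the mean distance
    to the row's own cluster and b is the mean distance to the nearest other cluster."""
    clusters = {}
    for row, label in zip(rows, labels):
        clusters.setdefault(label, []).append(row)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for row, label in zip(rows, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # by convention a singleton cluster contributes 0
        # The row's distance to itself is 0, so divide by the other members only.
        a = sum(distance(row, other) for other in own) / (len(own) - 1)
        b = min(
            sum(distance(row, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(rows)


def choose_k(rows, candidates):
    """Return the candidate K with the highest silhouette score, plus all scores."""
    scores = {k: silhouette_score(rows, k_means(rows, k)) for k in candidates}
    return max(scores, key=scores.get), scores


# Hypothetical data: 1 = agree, -1 = disagree, 0 = pass.
votes = [
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [1, 0, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, 1, 0],
    [-1, 0, 1, 1],
]

best, scores = choose_k(votes, [2, 3, 4])
print(best, scores)
```

A silhouette score near 1 means participants sit much closer to their own group than to any other; scores near 0 suggest the split is arbitrary. In this toy data the two opposing voting blocs make K = 2 score highest.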

More advanced techniques

We are researching and trialing a number of other clustering algorithms for downstream analysis. However, we're unlikely to adopt any of them as part of the core real-time engine without careful consideration of the trade-offs.

One of our top goals as an organization is to research and develop a well-defined computational and sociological framework for evaluating these trade-offs.

See also:

  • Opinion group
  • Representative Comments, which help Polis identify different sides of the conversation.

© 2021 The Computational Democracy Project, a 501c3 nonprofit.

