Discursive Thinking Around “The Social Observatory”

Social phenomena unfold at a human level. But these phenomena are most often observable only in the aggregate. MIT researchers do not have access to your paystubs to study your job history, but they can examine aggregate incomes among you and your neighbors by census tract, or they can look at people like you in nationally representative surveys. As a result, your mayor or county supervisor must lean on nationally or internationally centered evidence to identify the best programs. Because that evidence is often poorly synthesized and not designed by researchers to be generalizable, officials are more likely to turn to innovating by anecdote (if not by ideology alone). This seems like a failure by academia and funders to coordinate the work of social scientists so that their efforts are useful to society: just as the medical research enterprise aligns its work (in private and public efforts alike) to benefit the professionals who serve society, so should social science.

I share others’ excitement about social observatories. We need more granular information, more diverse cohorts from whom it is collected, and richer data sources that can be combined. We also need new ways of reporting science: there are topological relationships between the topics we study, and between the entities and phenomena we estimate. It is not enough to have a landscape of research populated by dense PDFs of written words that imprecisely communicate why we did the research, how we understood its potential contributions to ongoing discourse, what concerns we hold about how it was conducted, and how we interpret it.

We waste immense effort because so much information is siloed, both at the source and at the finish line. Funders pay research teams to nosedive into data dictionaries trying to merge datasets that should, by default, be compatible, colocated, and even co-produced. In the US, even researchers with authorized access to microdata (person-level data) typically cannot connect those data with relevant data at a similar level, either because the data do not exist at that level or because the datasets lack the shared information that would allow one to be merged with another.

You shouldn’t, and wouldn’t, allow me to make this argument without recognizing the work and great concern of convening institutions like the National Academies of Sciences, Engineering, and Medicine, funders like the National Institutes of Health and the Robert Wood Johnson Foundation, and the editorial boards of academic journals. They care deeply about this. The Information Mapping section of Wikipedia teems with articles that will point you toward the large literature on knowledge representation. Yet the problem of underdetermination in science persists, and likely always will. So what’s new?

Nothing, under this sun.

Log the logic. We just need a community-moderated website where researchers log the logical relationships they study. They would record the relationships they understand from existing science, propose new ones, and add basic information about what their studies imply about those relationships. “Relationships” here connect multiple entities or events, for instance “immigrant labor ~ job creation.” (More formal descriptions of how to log relationships can be found in the Genome Biology article linked below.) The platform would allow an unlimited depth of relationships to be reported, but would seek to standardize the information that describes research results.
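To make the idea of logging relationships a little more concrete, here is a minimal sketch in Python of what one logged entry and a simple log might look like. Everything in it is an assumption made for illustration: the field names, the "established"/"proposed" labels, the `RelationshipLog` class, and the second example entry are hypothetical and do not come from any existing platform or from the Genome Biology article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Relationship:
    """One logged relationship between two entities or events (hypothetical schema)."""
    first: str            # e.g. "immigrant labor"
    second: str           # e.g. "job creation"
    status: str           # assumed labels: "established" or "proposed"
    note: str = ""        # free-text summary of what a study implies


@dataclass
class RelationshipLog:
    """An in-memory stand-in for the community-moderated log."""
    entries: List[Relationship] = field(default_factory=list)

    def add(self, relationship: Relationship) -> None:
        self.entries.append(relationship)

    def involving(self, entity: str) -> List[Relationship]:
        """Return every logged relationship that mentions the entity."""
        return [r for r in self.entries if entity in (r.first, r.second)]

    def with_status(self, status: str) -> List[Relationship]:
        """Return every logged relationship carrying the given status label."""
        return [r for r in self.entries if r.status == status]


log = RelationshipLog()
# The first pair is the example from the text; its status is invented here.
log.add(Relationship("immigrant labor", "job creation", "proposed"))
# A purely hypothetical second entry, added only to show querying.
log.add(Relationship("job creation", "local wages", "proposed",
                     note="illustrative placeholder, not a real finding"))

for r in log.involving("job creation"):
    print(f"{r.first} ~ {r.second} [{r.status}]")
```

A real platform would need moderation, versioning, and links to the underlying studies; the point of the sketch is only that a standardized record can be small while still being queryable.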

Create Data Observatories at a Local Scale. Consider the vision of the Social Observatory Coordinating Network. From https://socialobservatories.org/vision:

… a representative sample of the places where people live and the people who live there. Each observatory would be an entity, whether physical or virtual, that is charged with collecting, curating, and disseminating data from people, places, and institutions in the United States. These observatories must provide a basis for inference from what happens in local places to a national context and ensure a robust theoretical foundation for social analysis


A national framework for studying local contexts.

A national framework for interdisciplinary collaboration.

These could be regionally organized, community-level data collection efforts. One of the Network’s white papers calls for drawing on public and private databases to create a coherent data environment documenting local activity and life experiences. It also suggests conducting web-scraping and automated media analysis of local online experiences and content. These would all be wonderful ideas, provided they come with a rigorous commitment to individual privacy. For that reason, priority setting would occur locally or regionally, but the observatory areas would share a substantive core activity that is common nationwide. These ideas are developed here by John Schulz: https://socialobservatories.org/papers/paper_20130205.173152

For population health, there may be no more impressive goal than to stand up this envisioned observatory. UMD’s Christine Bachrach writes on how the social determinants of health would be a timely and useful topic to train the observatories’ lenses upon. Consider the value of merging biomedical testing data, hospital data, insurer claims data, e-health records, and environmental, situational, demographic, and public-program data to give meaningful, geographically relevant, and in all other ways rich information on health. This 2013 paper sounds especially familiar now, as the NIH undertakes its All of Us research project around the country to gather lifecourse health data from voluntary participants, especially those from underrepresented populations.

Like All of Us, Bachrach’s vision would hand useful learning back to the communities engaged in the studies and give local leaders a means of monitoring. This highlights a great opportunity for the observatory model generally: we need model communities that know their populations’ needs and can draw on good information to meet them in the long term, while bringing the nation’s best minds to bear on the problems of the community. Participating “observatory communities” would be both periscopes and beneficiaries; crucially, they should also help steer the observatory’s principles for studied topics and privacy. Bachrach’s idea is detailed here: https://socialobservatories.org/papers/paper_20130703.0001

One final thought: in addition to creating a new familiarity with local social-behavioral data collection and use, we should welcome the observatory model as a substrate for new ways of talking about social research. New ways of organizing social science should emerge from the project, both in how scientists can transcend disciplinary silos and in how study topics and interpretation can benefit from participatory community-based research.


Use new ontologies. The Open Biological and Biomedical Ontologies Foundry has taken inspiring strides in studying ontologies themselves, as described in an open-access article in Genome Biology here: https://genomebiology.biomedcentral.com/articles/10.1186/gb-2005-6-5-r46. The Foundry studies ontologies and has consistently proposed improvements to how we can most efficiently represent knowledge. Efficiency does not mean lack of nuance; instead, the nuance is retained, with new dimensions added to allow better meta-analysis. Consider, for instance, topological research on the loops, shapes, and networks formed among relations. Other dimensions, represented wisely, could be inputs to those topological evaluations.
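As one small illustration of what a topological evaluation of logged relations could look like, the sketch below counts the independent loops in an undirected network of relations. It uses the standard graph identity (loops = edges - nodes + connected components), not any method from the Foundry or the Genome Biology article. The function name and every entity other than “immigrant labor” and “job creation” are hypothetical, and the sketch assumes each pair of entities is listed only once.

```python
from typing import Dict, Iterable, Tuple


def count_independent_loops(edges: Iterable[Tuple[str, str]]) -> int:
    """Count independent loops in an undirected graph: E - V + C.

    Assumes each relation (pair of entities) appears at most once.
    """
    edge_list = list(edges)
    nodes = {entity for pair in edge_list for entity in pair}
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(n: str) -> str:
        # Follow parent pointers to the component's root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    components = len(nodes)
    for a, b in edge_list:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b   # merge two components
            components -= 1
    return len(edge_list) - len(nodes) + components


# Hypothetical relations; only the first pair appears in the text above.
relations = [
    ("immigrant labor", "job creation"),
    ("job creation", "local wages"),
    ("local wages", "immigrant labor"),   # closes a loop of three relations
    ("local wages", "housing demand"),    # a branch that adds no loop
]
loops = count_independent_loops(relations)
print(loops)
```

A count like this is only one coarse summary; richer dimensions attached to each relation could feed more informative evaluations.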

